Google Search
Google presents the most relevant results among the major players such as MSN, Yahoo!, and Ask. Google did not have a profitable business model until the third iteration of their popular AdWords advertising program in February of 2002, and was worth over 100 billion dollars by the end of 2005.
If you repeat the same on-page phrase in locations such as inbound links, internal links, the start of the page title, the beginning of your first page header, and the start of your page copy, Google may filter the document out of the search results. Other search engines follow the same kinds of techniques, but their algorithms are not as sophisticated or as aggressively deployed as those used by Google.
Your copy should look unique and natural. You also want to sprinkle modifiers and semantically related text in your pages that you want to rank well in Google. Duplicate content detection is not just based on some magical percentage of similar content on a page, but is based on a variety of factors. Both Bill Slawski and Todd Malicoat offer great posts about duplicate content detection. This shingles PDF explains some duplicate content detection techniques.
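As a rough illustration of shingle-based duplicate detection, the sketch below splits two passages into overlapping word shingles and compares the two sets with Jaccard similarity. The shingle size of 4 and the function names are assumptions made for illustration. This is not Google's implementation, which, as noted above, relies on a variety of factors rather than a single percentage.

```python
import re


def shingles(text, k=4):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b, k=4):
    """Jaccard similarity of the two shingle sets: |A & B| / |A | B|."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Two passages that differ only in the last word share one of three shingles.
score = resemblance("the quick brown fox jumps", "the quick brown fox sleeps")
```

A score near 1.0 suggests near-duplicate copy, while a score near 0.0 suggests the passages share little exact phrasing.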
Here is a blog post about natural SEO copywriting which expounds on the points of writing unique natural content that will rank well in Google.
While Google is more efficient at crawling than competing engines, it appears as though with Google's BigDaddy update they are looking at both inbound and outbound link quality to help set crawl priority, crawl depth, and whether or not a site even gets crawled at all. To quote Matt Cutts:
“The sites that fit ‘no pages in Bigdaddy’ criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling.”
Query Processing: While I mentioned above that Yahoo! seemed to have a bit of a bias toward commercial search results, it is also worth noting that Google's organic search results are heavily biased toward informational websites and web pages.
Google is much better than Yahoo! or MSN at determining the true intent of a query and trying to match that instead of doing direct text matching. Common words like "how to" may be significantly deweighted compared to other terms in the search query that provide better discrimination value.
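One common way to model this kind of deweighting is inverse document frequency (IDF), where terms that appear in most documents receive little weight. The sketch below uses a tiny made-up corpus and a smoothed IDF formula. Both are assumptions for illustration and say nothing definite about how Google actually processes queries.

```python
import math


def idf_weights(query, documents):
    """Map each query term to a smoothed IDF: log((N + 1) / (df + 1)) + 1."""
    doc_terms = [set(doc.lower().split()) for doc in documents]
    n = len(doc_terms)
    weights = {}
    for term in query.lower().split():
        # df = number of documents that contain the term at least once
        df = sum(1 for terms in doc_terms if term in terms)
        weights[term] = math.log((n + 1) / (df + 1)) + 1
    return weights


# Hypothetical mini corpus: "how" and "to" appear everywhere, "refinance" rarely.
corpus = [
    "how to bake bread",
    "how to change a tire",
    "how to refinance a mortgage",
    "how to learn to paint",
]
weights = idf_weights("how to refinance", corpus)
```

In this toy corpus "how" and "to" get the minimum weight because every document contains them, while "refinance" gets a noticeably higher weight and so does most of the work of discriminating between documents.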
Google and some of the other major search engines may try to answer many common questions related to the concept being searched for. For example, in a given set of search results you may see any of the following:
• a relevant .gov and/or .edu document
• a recent news article about the topic
• a page from a well known directory such as DMOZ or the Yahoo! Directory
• a page from Wikipedia
• an archived page from an authority site about the topic
• the authoritative document about the history of the field and recent changes
• a smaller hyper focused authority site on the topic
• a PDF report on the topic
• a relevant Amazon, eBay, or shopping comparison page on the topic
• one of the most well branded and well known niche retailers catering to that market
• product manufacturer or wholesaler sites
• a blog post / review from a popular community or blog site about a slightly broader field
Some of the top results may answer specific relevant queries or be hard to beat, while others might be easy to compete with. You just have to think of how and why each result was chosen to be in the top 10 to learn which ones you will be competing against and which ones may perhaps fall away over time.
Link Reputation: PageRank is a weighted measure of link popularity, but Google's search algorithms have moved far beyond just looking at PageRank. As mentioned above, gaining an excessive number of low quality links may hurt your ability to get indexed in Google, so stay away from known spammy link exchange hubs and other sources of junk links. I still sometimes get a few junk links, but I make sure that I try to offset any junky link by getting a greater number of good links.
If your site ranks well, some garbage automated links will end up linking to you whether you like it or not. Don't worry about those links. Just worry about trying to get a few real high quality editorial links.
Google is much better than the other engines at determining the difference between real editorial citations and low quality, spammy, bought, or artificial links.
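For reference, the basic PageRank idea mentioned above can be sketched as a power iteration over a link graph. The tiny graph, the damping factor of 0.85, and the iteration count are assumptions for illustration. As noted above, Google's algorithms have moved far beyond PageRank alone, so treat this as a sketch of the concept rather than of their ranking system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: three pages link to "hub", which links back to one of them.
graph = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(graph)
```

In this toy graph "hub" ends up with the highest score because three pages link to it, and "a" outranks "b" and "c" because its single inbound link comes from a high-scoring page. That is the sense in which PageRank is a weighted measure of link popularity.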
When determining link reputation Google (and other engines) may look at:
• link age
• rate of link acquisition
• anchor text diversity
• deep link ratio
• link source quality (based on who links to them and who else they link to)
• whether links are editorial citations in real content (or if they are on spammy pages or near other obviously non-editorial links)
• whether anybody actually clicks on the link
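A couple of the signals listed above are easy to measure on your own backlink data. The sketch below computes a deep link ratio and a simple anchor text diversity score from a hypothetical list of (target URL, anchor text) pairs. The sample data, the function names, and the choice of "share of links using the most common anchor" as a diversity measure are all assumptions. How search engines actually weigh these signals is not public.

```python
from collections import Counter
from urllib.parse import urlparse


def deep_link_ratio(backlinks):
    """Fraction of backlinks that point at a page other than the home page."""
    if not backlinks:
        return 0.0
    deep = sum(1 for url, _ in backlinks if urlparse(url).path not in ("", "/"))
    return deep / len(backlinks)


def top_anchor_share(backlinks):
    """Share of backlinks using the most common anchor text (lower = more diverse)."""
    if not backlinks:
        return 0.0
    counts = Counter(anchor.strip().lower() for _, anchor in backlinks)
    return counts.most_common(1)[0][1] / len(backlinks)


# Hypothetical backlink sample: (target URL, anchor text).
sample = [
    ("http://www.site.com/", "Site.com"),
    ("http://www.site.com/", "cheap widgets"),
    ("http://www.site.com/guides/widgets.html", "widget buying guide"),
    ("http://www.site.com/blog/review.html", "cheap widgets"),
]
deep_ratio = deep_link_ratio(sample)
anchor_share = top_anchor_share(sample)
```

A profile where nearly every link points at the home page with the same anchor text would score close to 0.0 on the first measure and close to 1.0 on the second, which is the kind of pattern that tends to look artificial.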
It is generally believed that .edu and .gov links are trusted highly in Google because they are generally harder to influence than the average .com link, but keep in mind that there are some junky .edu links too (I have seen stuff like .edu casino link exchange directories). While the TrustRank research paper had some names from Yahoo! on it, I think it is worth reading the TrustRank research paper (PDF) and the link spam mass estimation paper (PDF), or at least my condensed versions of them here and here, to understand how Google is looking at links.
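The core idea of the TrustRank paper, propagating trust outward from a small set of hand-picked trusted seed pages, can be sketched as a seed-biased variant of PageRank. The example graph, the `.example` host names, the damping factor, and the handling of pages without outbound links are assumptions for illustration. This is a simplified reading of the published idea, not a description of what any search engine runs in production.

```python
def trustrank(links, seeds, damping=0.85, iterations=50):
    """Propagate trust from hand-picked seed pages along outbound links."""
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one trusted seed page is required")
    pages = set(links) | seeds | {t for ts in links.values() for t in ts}
    # Unlike plain PageRank, the baseline share goes only to the trusted seeds.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)
    for _ in range(iterations):
        new_trust = {p: (1.0 - damping) * seed_share[p] for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes its damped trust evenly to the pages it links to.
                share = damping * trust[page] / len(targets)
                for t in targets:
                    new_trust[t] += share
        # Simplification: trust held by pages with no outbound links is dropped.
        trust = new_trust
    return trust


# Hypothetical web: one trusted seed, a chain of sites it vouches for,
# and two spam sites that only link to each other.
web = {
    "university.example": ["goodsite.example"],
    "goodsite.example": ["shop.example"],
    "spam1.example": ["spam2.example"],
    "spam2.example": ["spam1.example"],
}
trust = trustrank(web, seeds={"university.example"})
```

Trust decays with each hop away from the seed, and the two spam sites score zero no matter how heavily they link to each other, because no trusted page links into their cluster.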
When getting links for Google it is best to look in virgin lands that have not been combed over heavily by other SEOs. Either get real editorial citations or get citations from quality sites that have not yet been abused by others. Google may strip the ability to pass link authority (even from quality sites) if those sites are known obvious link sellers or other types of link manipulators. Make sure you mix up your anchor text and get some links with semantically related text.
Google likely collects usage data via Google search, Google Analytics, Google AdWords, Google AdSense, Google News, Google Accounts, Google Notebook, Google Calendar, Google Talk, Google's feed reader, Google search history annotations, and Gmail. They also created a Firefox browser bookmark sync tool and an anti-phishing tool which is built into Firefox, and they have a relationship with Opera (another web browser company). Most likely they can lay some of this data over the top of the link graph to record a corroborating source of the legitimacy of the linkage data. Other search engines may also look at usage data.
Page vs Site: Sites need to earn a certain amount of trust before they can rank for competitive search queries in Google. If you put up a new page on a new site and expect it to rank right away for competitive terms you are probably going to be disappointed.
If you put that exact same content on an old trusted domain and link to it from another page on that domain it can leverage the domain trust to quickly rank and bypass the concept many people call the Google Sandbox.
Many people have been exploiting this algorithmic hole by throwing up spammy subdomains on free hosting sites or other authoritative sites that allow users to sign up for a cheap or free publishing account. This is polluting Google's SERPs pretty badly, so they are going to have to make some major changes on this front pretty soon.
Site Age: Older trusted sites may also be given a pass on many things that would cause newer, less trusted sites to be demoted or de-indexed.
The Google Sandbox is a concept many SEOs mention frequently. The idea of the 'box is that new sites that should be relevant struggle to rank for some queries they would be expected to rank for. While some people have dismissed the sandbox as garbage, Google's Matt Cutts said in an interview that they did not intentionally create the sandbox effect, but that it was created as a side effect of their algorithms:
"I think a lot of what's perceived as the sandbox is artefacts where, in our indexing, some data may take longer to be computed than other data."
Paid Search: Google AdWords factors max bid price and clickthrough rate into its ad algorithm. In addition, they automate reviewing landing page quality and use that as another factor in their ad relevancy algorithm to reduce the amount of arbitrage and other noisy signals in the AdWords program.
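To see why clickthrough rate matters as much as bid price, here is a minimal sketch that orders ads by max bid multiplied by clickthrough rate, with landing page quality as an optional multiplier. The simple product, the `landing_quality` field, and the sample numbers are all assumptions for illustration, since the exact formula Google uses is not published in this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    max_bid: float                # maximum cost per click, in dollars
    ctr: float                    # clickthrough rate, from 0.0 to 1.0
    landing_quality: float = 1.0  # hypothetical landing page quality multiplier


def ad_score(ad):
    """Assumed scoring rule: bid x clickthrough rate x landing page quality."""
    return ad.max_bid * ad.ctr * ad.landing_quality


def rank_ads(ads):
    """Return ads ordered from highest to lowest score."""
    return sorted(ads, key=ad_score, reverse=True)


ads = [
    Ad("high_bid_low_ctr", max_bid=2.00, ctr=0.01),
    Ad("low_bid_high_ctr", max_bid=0.80, ctr=0.05),
    Ad("arbitrage_page", max_bid=1.50, ctr=0.04, landing_quality=0.3),
]
ordered = [ad.name for ad in rank_ads(ads)]
```

Under these assumptions the ad with the lower bid but the higher clickthrough rate takes the top position, and the ad with a poor landing page drops to the bottom despite a healthy bid and clickthrough rate.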
The Google AdSense program is an extension of Google AdWords which offers a vast ad network across many content websites that distribute contextually relevant Google ads. These ads are sold on a cost per click or flat rate CPM basis.
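Because ads can be priced per click or per thousand impressions, it helps to convert between the two when comparing offers. The sketch below uses the standard effective CPM arithmetic (cost per click times clickthrough rate times 1,000). The function names and the sample figures are made up for illustration.

```python
def effective_cpm(cpc, ctr):
    """Earnings per 1,000 impressions for a cost-per-click ad."""
    return cpc * ctr * 1000


def break_even_ctr(cpc, flat_cpm):
    """Clickthrough rate at which a CPC ad earns the same as a flat CPM ad."""
    return flat_cpm / (cpc * 1000)


# Hypothetical numbers: a $0.50 click at a 1% clickthrough rate.
cpc_ad_ecpm = effective_cpm(cpc=0.50, ctr=0.01)
# Clickthrough rate that same ad would need to match a $10 flat rate CPM ad.
needed_ctr = break_even_ctr(cpc=0.50, flat_cpm=10.00)
```

With these numbers the per-click ad earns roughly $5 per thousand impressions, so it would need about a 2% clickthrough rate to match a $10 flat rate CPM ad.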
Editorial: Google is known to be far more aggressive with their filters and algorithms than the other search engines are.
Google published their official webmaster guidelines and their thoughts on SEO. Matt Cutts is also known to publish SEO tips on his personal blog. Keep in mind that Matt's job as Google's search quality leader may bias his perspective a bit.
A site by the name of Search Bistro uncovered a couple of internal Google documents which have been used to teach remote quality raters what to look for when evaluating search quality since at least 2003:
• Google Spam Recognition Guide for Raters (doc) - discusses the types of sites Google considers spam, generally sites which do not add any direct value to the search or commerce experience.
• General Guidelines on Random-Query Evaluation (PDF) - shows how sites can be classified based on their value, from vital to useful to relevant to not relevant to off topic to offensive
These raters may be used to:
• help train the search algorithms, or
• flag low quality sites for internal reviews, or
• manually review suspected spam sites
If Google bans or penalizes your site due to an automated filter and it is your first infraction, the site will usually return to the index within about 60 days of you fixing the problem. If Google manually bans your site, you have to clean up your site and plead your case to get reincluded. To do so, their webmaster guidelines state that you have to click a request reinclusion link from within the Google Sitemaps program.
Google Sitemaps gives you a bit of useful information from Google about what keywords your site is ranking for and which keywords lead people to click on your listing.
Social Aspects: Google allows people to write notes about different websites they visit using Google Notebook. Google also allows you to mark and share your favorite feeds and posts. Google also lets you flavorize search boxes on your site to be biased towards the topics your website covers.
Google is not as entrenched in the social aspects of search as Yahoo! is, but Google seems to throw out many more small tests hoping that one will perhaps stick. They are trying to make software more collaborative and trying to get people to share things like spreadsheets and calendars, while also integrating chat into email. If they can create a framework where things mesh well they may be able to gain further marketshare by offering free productivity tools.
Google SEO Tools
• Google Sitemaps - helps you determine if Google is having problems indexing your site.
• AdWords Keyword Tool - shows keywords related to an entered keyword, web page, or web site
• AdWords Traffic Estimator - estimates the bid price required to rank #1 on 85% of Google AdWords ads near searches on Google, and how much traffic an AdWords ad would drive
• Google Suggest - auto completes search queries based on the most common searches starting with the characters or words you have entered
• Google Trends - shows multi-year search trends
• Google Sets - creates semantically related keyword sets based on keyword(s) you enter
• Google Zeitgeist - shows quickly rising and falling search queries
• Google related sites - shows sites that Google thinks are related to your site (search for related:www.site.com)
• Google related word search - shows terms semantically related to a keyword (search for ~term -term)
Business Perspectives
Google has the largest search distribution, the largest ad network, and by far the most efficient search ad auction. They have aggressively extended their brand and amazing search distribution network through partnerships with small web publishers, traditional media companies, portals like AOL, computer and other hardware manufacturers such as Dell, and popular web browsers such as Firefox and Opera.
As they throw out bits of their relevancy in an attempt to keep their algorithm hard to manipulate they create holes where competing search businesses can become more efficient.
Search Marketing Perspective: If you are new to a market and are trying to compete for generic competitive terms it can take a year or more to rank well in Google. Buying older established sites with aged trusted quality citations might also be a good way to enter competitive marketplaces.
If you have better products than the competition, are a strong viral marketer, or can afford to combine your SEO efforts with traditional marketing it is much easier to get natural citations than if you try to force your way into the index.
Creating a small site with high quality unique content and focusing on getting a few exceptionally high quality links can help a new site rank quickly. In the past I believed that a link was a link and that there was just about no such thing as a bad link, but Google has changed that significantly over the last few years. With Google sometimes less is more.
At this point, buying links that seem relatively expensive at first glance compared to cheaper alternatives (like paying $299 a year for a Yahoo! Directory listing) can sometimes be a great buy. Owners of the most spammy sites would not want to have their sites manually reviewed by any of the major search companies, so both Yahoo! and Google are likely to place more than average weight on a Yahoo! Directory listing.
Also getting a few citations from high quality relevant related resources can go a long way to improving your overall Google search relevancy.
Right now I think Google is doing a junky job with some of their search relevancy by placing too much trust in older domains and favoring pages that have only one or a few occurrences of certain modifiers. In doing this they are ranking many cloaked pages for terms other than the terms they are targeting, and I have seen many instances of things like Google ranking real content home mortgage pages for student loan searches, largely because "student loans" was in the global site navigation on the home mortgage page.
Learn More
• Google on SEO
• Google's Webmaster Guidelines
• Google Spam Recognition Guide for Raters (doc)
• General Guidelines on Random-Query Evaluation (PDF)
• Google Blog
• Google AdWords
• Google AdWords Blog
• Google AdSense
• Google AdSense Blog
• Google Sitemaps
• Google Sitemaps Blog
• Papers Written by Googlers
• patent about information retrieval based on historical data
Worker Blogs
• Matt Cutts - Matt is an amazingly friendly and absurdly accessible guy given his position as the head of Google's search quality team.
• Adam Lasnik - a sharded version of Matt Cutts. A Cuttlet, if you will.