Search Engine Optimization News, Updates, Tips: 2008

Monday, September 8, 2008

Ranking Factors of Major players - Part 5

Google
• has been in the search game a long time, and saw the web graph when it was much cleaner than the current web graph
• is much better than the other engines at determining if a link is a true editorial citation or an artificial link
• looks for natural link growth over time
• heavily biases search results toward informational resources
• trusts old sites way too much
• a page on a site or subdomain of a site with significant age or link related trust can rank much better than it should, even with no external citations
• has aggressive duplicate content filters that filter out many pages with similar content
• if a page is obviously focused on a term they may filter the document out for that term. On page variation and link anchor text variation are important. A page with a single reference or a few references of a modifier will frequently outrank pages that are heavily focused on a search phrase containing that modifier
• crawl depth is determined not only by link quantity, but also by link quality. Excessive low quality links may make your site less likely to be crawled deep or even included in the index.
• things like cheesy off topic reciprocal links are generally ineffective in Google when you consider the associated opportunity cost

Ask
• looks at topical communities
• due to their heavy emphasis on topical communities they are slow to rank sites until they are heavily cited from within their topical community
• due to their limited market share they probably are not worth paying much attention to unless you are in a vertical where they have a strong brand that drives significant search traffic

Thursday, September 4, 2008

Ranking Factors of Major players - Part 4

Google Search

Google presents the most relevant results among the major players like MSN, Yahoo, and Ask. Google did not have a profitable business model until the third iteration of their popular AdWords advertising program in February of 2002, and was worth over 100 billion dollars by the end of 2005.

If the same phrase from your on page content is repeated in most of the obvious locations (inbound links, internal links, the start of the page title, the beginning of your first page header, the start of your page), then Google may filter the document out of the search results for that phrase. Other search engines follow similar techniques, but their algorithms are not as sophisticated or as aggressively deployed as those used by Google.

Your copy should look unique and natural. You also want to sprinkle modifiers and semantically related text in your pages that you want to rank well in Google. Duplicate content detection is not just based on some magical percentage of similar content on a page, but is based on a variety of factors. Both Bill Slawski and Todd Malicoat offer great posts about duplicate content detection. This shingles PDF explains some duplicate content detection techniques.
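To make the shingling idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the general textbook technique (word shingles compared with Jaccard similarity), not Google's actual filter; the function names, the window size of 4 words, and the sample pages are all assumptions made up for this example.

```python
def shingles(text, w=4):
    """Return the set of w-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < w:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b, w=4):
    """Jaccard similarity of the two shingle sets: |A & B| / |A | B|."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical page copy, invented for illustration.
page_a = "cheap hotels in new york city near central park"
page_b = "cheap hotels in new york city near times square"
page_c = "a history of the brooklyn bridge and its builders"

print(resemblance(page_a, page_b))  # pages sharing most of their wording
print(resemblance(page_a, page_c))  # pages with no wording in common
```

A higher score means more shared wording. As the paragraph above notes, a real duplicate content filter would combine a signal like this with a variety of other factors rather than one magic percentage.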

Here is a blog post about natural SEO copywriting which expounds on the points of writing unique natural content that will rank well in Google.

While Google is more efficient at crawling than competing engines, it appears as though with Google's BigDaddy update they are looking at both inbound and outbound link quality to help set crawl priority, crawl depth, and whether or not a site even gets crawled at all. To quote Matt Cutts:

“The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling.”

Query Processing: While I mentioned above that Yahoo! seemed to have a bit of a bias toward commercial search results it is also worth noting that Google's organic search results are heavily biased toward informational websites and web pages.

Google is much better than Yahoo! or MSN at determining the true intent of a query and trying to match that instead of doing direct text matching. Common words like how to may be significantly deweighted compared to other terms in the search query that provide a better discrimination value.
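One textbook way to express "discrimination value" is inverse document frequency (IDF): the more documents a word appears in, the less it helps tell them apart. The sketch below is only that textbook idea applied to a made-up four-document corpus. It is not Google's query processing, and the function name, smoothing formula, and sample documents are assumptions for illustration.

```python
import math


def idf_weights(query, documents):
    """Weight each query term by a smoothed inverse document frequency."""
    docs = [set(d.lower().split()) for d in documents]
    n = len(docs)
    weights = {}
    for term in query.lower().split():
        # Number of documents that contain the term at least once.
        df = sum(1 for d in docs if term in d)
        # Smoothed IDF: terms found in many documents get a lower weight.
        weights[term] = math.log((n + 1) / (df + 1)) + 1.0
    return weights


# Hypothetical corpus, invented for illustration.
corpus = [
    "how to bake bread at home",
    "how to find a hotel in paris",
    "how to change a tire",
    "hotel reviews and booking tips",
]

weights = idf_weights("how to find hotel", corpus)
for term, weight in weights.items():
    print(term, round(weight, 3))
```

In this toy corpus "how" and "to" appear in three of the four documents, so they end up with the lowest weights, while "find" is the most discriminating term.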

Google and some of the other major search engines may try to answer many common related questions to the concept being searched for. For example, in a given set of search results you may see any of the following:

• a relevant .gov and/or .edu document

• a recent news article about the topic

• a page from a well known directory such as DMOZ or the Yahoo! Directory

• a page from the Wikipedia

• an archived page from an authority site about the topic

• the authoritative document about the history of the field and recent changes

• a smaller hyper focused authority site on the topic

• a PDF report on the topic

• a relevant Amazon, eBay, or shopping comparison page on the topic

• one of the most well branded and well known niche retailers catering to that market

• product manufacturer or wholesaler sites

• a blog post / review from a popular community or blog site about a slightly broader field

Some of the top results may answer specific relevant queries or be hard to beat, while others might be easy to compete with. You just have to think of how and why each result was chosen to be in the top 10 to learn which one you will be competing against and which ones may perhaps fall away over time.

Link Reputation: PageRank is a weighted measure of link popularity, but Google's search algorithms have moved far beyond just looking at PageRank. As mentioned above, gaining an excessive number of low quality links may hurt your ability to get indexed in Google, so stay away from known spammy link exchange hubs and other sources of junk links. I still sometimes get a few junk links, but I make sure that I try to offset any junky link by getting a greater number of good links.

If your site ranks well some garbage automated links will end up linking to you whether you like it or not. Don't worry about those links, just worry about trying to get a few real high quality editorial links.

Google is much better at being able to determine the difference between real editorial citations and low quality, spammy, bought, or artificial links.
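For readers who want to see what "a weighted measure of link popularity" means in practice, here is a minimal sketch of the classic PageRank power iteration as it is described publicly. It is a simplification for illustration only: the damping factor of 0.85, the iteration count, and the tiny four-page link graph are assumptions, and as noted above Google's real algorithms have moved far beyond this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores. links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outlinks): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical site structure, invented for illustration.
graph = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(page, round(score, 4))
```

The page with the most (and best ranked) inbound links scores highest, and a page nobody links to keeps only the baseline score, which is why a link from a well linked page is worth more than a link from an unlinked one.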

When determining link reputation Google (and other engines) may look at

• link age

• rate of link acquisition

• anchor text diversity

• deep link ratio

• link source quality (based on who links to them and who else they link at)

• whether links are editorial citations in real content (or if they are on spammy pages or near other obviously non-editorial links)

• does anybody actually click on the link?

It is generally believed that .edu and .gov links are trusted highly in Google because they are generally harder to influence than the average .com link, but keep in mind that there are some junky .edu links too (I have seen stuff like .edu casino link exchange directories). While the TrustRank research paper had some names from Yahoo! on it, I think it is worth reading the TrustRank research paper (PDF) and the link spam mass estimation paper (PDF), or at least my condensed version of them here and here, to understand how Google is looking at links.

When getting links for Google it is best to look in virgin lands that have not been combed over heavily by other SEOs. Either get real editorial citations or get citations from quality sites that have not yet been abused by others. Google may strip the ability to pass link authority (even from quality sites) if those sites are known obvious link sellers or other types of link manipulators. Make sure you mix up your anchor text and get some links with semantically related text.

Google likely collects usage data via Google search, Google Analytics, Google AdWords, Google AdSense, Google news, Google accounts, Google notebook, Google calendar, Google talk, Google's feed reader, Google search history annotations, and Gmail. They also created a Firefox browser bookmark synch tool and an anti-phishing tool which is built into Firefox, and they have a relationship with Opera (another web browser company). Most likely they can lay some of this data over the top of the link graph to record a corroborating source of the legitimacy of the linkage data. Other search engines may also look at usage data.

Page vs Site: Sites need to earn a certain amount of trust before they can rank for competitive search queries in Google. If you put up a new page on a new site and expect it to rank right away for competitive terms you are probably going to be disappointed.

If you put that exact same content on an old trusted domain and link to it from another page on that domain it can leverage the domain trust to quickly rank and bypass the concept many people call the Google Sandbox.

Many people have been exploiting this algorithmic hole by throwing up spammy subdomains on free hosting sites or other authoritative sites that allow users to sign up for a cheap or free publishing account. This is polluting Google's SERPs pretty badly, so they are going to have to make some major changes on this front pretty soon.

Site Age: Older trusted sites may also be given a pass on many things that would cause newer lesser trusted sites to be demoted or de-indexed.

The Google Sandbox is a concept many SEOs mention frequently. The idea of the 'box is that new sites that should be relevant struggle to rank for some queries they would be expected to rank for. While some people have debunked the existence of the sandbox as garbage, Google's Matt Cutts said in an interview that they did not intentionally create the sandbox effect, but that it was created as a side effect of their algorithms:

"I think a lot of what's perceived as the sandbox is artefacts where, in our indexing, some data may take longer to be computed than other data."

Paid Search: Google AdWords factors in max bid price and clickthrough rate into their ad algorithm. In addition they automate reviewing landing page quality to use that as another factor in their ad relevancy algorithm to reduce the amount of arbitrage and other noisy signals in the AdWords program.
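As a rough illustration of how bid price and clickthrough rate can combine, the sketch below orders ads by max bid multiplied by clickthrough rate. This is a commonly described simplification and an assumption on my part, not Google's published formula; it leaves out the landing page quality factor entirely, and the ad names and numbers are made up.

```python
def rank_ads(ads):
    """Sort ads by max bid multiplied by clickthrough rate, highest first."""
    return sorted(ads, key=lambda ad: ad["max_bid"] * ad["ctr"], reverse=True)


# Hypothetical advertisers, invented for illustration.
ads = [
    {"name": "A", "max_bid": 2.00, "ctr": 0.01},  # highest bid, rarely clicked
    {"name": "B", "max_bid": 1.00, "ctr": 0.05},  # lowest bid, clicked often
    {"name": "C", "max_bid": 1.50, "ctr": 0.02},
]

order = [ad["name"] for ad in rank_ads(ads)]
print(order)  # ['B', 'C', 'A']
```

The point of the example is that the advertiser with the lowest bid can still take the top slot if searchers click on the ad often enough, which is why relevant ad copy matters as much as budget.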

The Google AdSense program is an extension of Google AdWords which offers a vast ad network across many content websites that distribute contextually relevant Google ads. These ads are sold on a cost per click or flat rate CPM basis.

Editorial: Google is known to be far more aggressive with their filters and algorithms than the other search engines are.

Google published their official webmaster guidelines and their thoughts on SEO. Matt Cutts is also known to publish SEO tips on his personal blog. Keep in mind that Matt's job as Google's search quality leader may bias his perspective a bit.

A site by the name of Search Bistro uncovered a couple internal Google documents which have been used to teach remote quality raters what to look for when evaluating search quality since at least 2003

• Google Spam Recognition Guide for Raters (doc) - discusses the types of sites Google considers spam. Generally sites which do not add any direct value to the search or commerce experience.

• General Guidelines on Random-Query Evaluation (PDF) - shows how sites can be classified based on their value, from vital to useful to relevant to not relevant to off topic to offensive

These raters may be used to

• help train the search algorithms, or

• flag low quality sites for internal reviews, or

• human review suspected spam sites

If Google bans or penalizes your site due to an automated filter and it is your first infraction usually the site may return to the index within about 60 days of you fixing the problem. If Google manually bans your site you have to clean up your site and plead your case to get reincluded. To do so their webmaster guidelines state that you have to click a request reinclusion link from within the Google Sitemaps program.

Google Sitemaps gives you a bit of useful information from Google about what keywords your site is ranking for and which keywords people are clicking on your listing.

Social Aspects: Google allows people to write notes about different websites they visit using Google Notebook. Google also allows you to mark and share your favorite feeds and posts. Google also lets you flavorize search boxes on your site to be biased towards the topics your website covers.

Google is not as entrenched in the social aspects of search as much as Yahoo! is, but Google seems to throw out many more small tests hoping that one will perhaps stick. They are trying to make software more collaborative and trying to get people to share things like spreadsheets and calendars, while also integrating chat into email. If they can create a framework where things mesh well they may be able to gain further marketshare by offering free productivity tools.

Google SEO Tools

• Google Sitemaps - helps you determine if Google is having problems indexing your site.

• AdWords Keyword Tool - shows keywords related to an entered keyword, web page, or web site

• AdWords Traffic Estimator - estimates the bid price required to rank #1 on 85% of Google AdWords ads near searches on Google, and how much traffic an AdWords ad would drive

• Google Suggest - auto completes search queries based on the most common searches starting with the characters or words you have entered

• Google Trends - shows multi-year search trends

• Google Sets - creates semantically related keyword sets based on keyword(s) you enter

• Google Zeitgeist - shows quickly rising and falling search queries

• Google related sites - shows sites that Google thinks are related to your site related:www.site.com

• Google related word search - shows terms semantically related to a keyword ~term -term

Business Perspectives

Google has the largest search distribution, the largest ad network, and by far the most efficient search ad auction. They have aggressively extended their brand and amazing search distribution network through partnerships with small web publishers, traditional media companies, portals like AOL, computer and other hardware manufacturers such as Dell, and popular web browsers such as Firefox and Opera.

As they throw out bits of their relevancy in an attempt to keep their algorithm hard to manipulate they create holes where competing search businesses can become more efficient.

Search Marketing Perspective: If you are new to a market and are trying to compete for generic competitive terms it can take a year or more to rank well in Google. Buying older established sites with aged trusted quality citations might also be a good way to enter competitive marketplaces.

If you have better products than the competition, are a strong viral marketer, or can afford to combine your SEO efforts with traditional marketing it is much easier to get natural citations than if you try to force your way into the index.

Creating a small site with high quality unique content and focusing on getting a few exceptionally high quality links can help a new site rank quickly. In the past I believed that a link was a link and that there was just about no such thing as a bad link, but Google has changed that significantly over the last few years. With Google sometimes less is more.

At this point sometimes buying links that may seem relatively expensive at first glance when compared to cheaper alternatives (like paying $299 a year for a Yahoo! Directory listing) can be a great buy because owners of the most spammy sites would not want to have their sites manually reviewed by any of the major search companies, so Yahoo! and Google are both likely to place more than average weight on a Yahoo! Directory listing.

Also getting a few citations from high quality relevant related resources can go a long way to improving your overall Google search relevancy.

Right now I think Google is doing a junky job with some of their search relevancy, by placing too much trust on older domains and favoring pages that have only one or few occurrences of certain modifiers on their pages. In doing this they are ranking many cloaked pages for terms other than the terms they are targeting, and I have seen many instances of things like Google ranking real content home mortgage pages for student loan searches, largely because student loans was in the global site navigation on the home mortgage page.

Learn More

• Google on SEO

• Google's Webmaster Guidelines

• Google Spam Recognition Guide for Raters (doc)

• General Guidelines on Random-Query Evaluation (PDF)

• Google Blog

• Google AdWords

• Google AdWords Blog

• Google AdSense

• Google AdSense Blog

• Google Sitemaps

• Google Sitemaps Blog

• Papers Written by Googlers

• patent about information retrieval based on historical data

Worker Blogs

• Matt Cutts - Matt is an amazingly friendly and absurdly accessible guy given his position as the head of Google's search quality team.

• Adam Lasnik - a sharded version of Matt Cutts. A Cuttlet, if you will.

Google Chrome - A Review

The day Google released its new browser, it became a hot trend in itself. Many posts are now available on the topic, which is no surprise. Some folks have started to take advantage by posting under titles like "google chrome toolbar" and "Google chrome features" so that they show up in the results when someone searches for those phrases, but believe me, this post is not meant for that purpose.

Some things I collected in favour of Chrome

1. Chrome = IE + Mozilla + Opera + extra security

2. Different tab positioning

3. Auto-fill address bar

4. 9 thumbnails of your most visited sites

5. Drag and drop facility

6. Open source browser

7. V8 JavaScript virtual machine

8. “Incognito” feature like IE 8's "InPrivate" mode

9. 57 times faster than IE

10. Simple look

11. Zooming in

12. A star near the address input bar lets you bookmark a page, apparently

13. Download status bar

14. No hard core Google ads

15. Option to import data from the default browser, plus options for the default URL and search engine

16. Google Chrome uses WebKit for rendering, which is the same rendering engine as Apple’s Safari browser

17. No data hijacking in the background

Things Chrome needs to take care of

---------------------------

1. Google Chrome currently doesn’t support browser extensions (it does support plug-ins, such as Flash).

2. No toolbar, which won't help professionals

3. No Mac/Linux support, only Windows

Tuesday, July 29, 2008

Cuil - The World's Biggest Search Engine

Cuil.com is rocking now.

Cuil (pronounced [kuːl], "cool") is a search engine that organizes web pages by content and displays relatively long entries along with thumbnail pictures for many results. It claims to have a larger index than any other search engine, with about 120 billion web pages. It went live on July 28, 2008.^[1]^[2]

Unlike other search engines,^[3] Cuil's privacy policy states that it does not store records of users’ search activity or IP addresses.^[4]

Cuil is managed and developed largely by former employees of Google: Anna Patterson, Russell Power and Louis Monier.^[5] Another founder, Tom Costello, has worked for IBM and others.^[6] The company raised $33 million in venture capital from Greylock and others.^[7]

Here are some of the references

^ ^a ^b Liedtke, Michael, Ex-Google engineers debut 'Cuil' way to search, Associated Press, 28 July 2008, retrieved 28 July 2008
^ http://biz.yahoo.com/ap/080728/google_challenger.html
^ Liedtke, Michael (December 11, 2007). "Ask.com will purge search info in hours", Journal Gazette, Fort Wayne Newspapers. Retrieved on 2007-12-11.
^ http://www.cuil.com/info/privacy/
^ "Former Employees of Google Prepare Rival Search Engine - NYTimes.com". nytimes.com. Retrieved on 2008-07-28.
^ news.bbc.co.uk, Search site aims to rival Google
^ Crunchbase: Cuil Profile
^ http://www.cuil.com/info/faqs/#faq4
^ Needleman, Rafe (July 28, 2008). "Cuil shows us how not to launch a search engine", CNET news, CNET. Retrieved on 2008-07-28.
^ Hamilton, Anita (July 28, 2008). "Why Cuil is No Threat to Google", Time.com, Time Magazine Online. Retrieved on 2008-07-28.
^ Burdick, Dave (July 28, 2008). "Cuil Review: Really? No Dave Burdicks? This Search Engine Is Stupid", huffingtonpost.com, Huffington Post. Retrieved on 2008-07-28.
^ Metz, Cade (July 29, 2008). "Ex-Googlers reinvent web search", www.theregister.co.uk, The Register. Retrieved on 2008-07-29.
^ Sullivan, Danny (July 28, 2008). "Cuil Launches -- Can This Search Start-Up Really Best Google?", search engine land blog, Search Engine Land. Retrieved on 2008-07-28

Source- http://en.wikipedia.org/wiki/Cuil

Monday, July 21, 2008

Ranking Factors of Major players - Part 3

MSN search.

MSN used to show results from Inktomi and LookSmart, but once Yahoo! bought Inktomi it was obvious that Microsoft would have to develop its own search.

Descriptive page titles and page content play a vital role in MSN's search results. Internal pages seem to do well compared to the home page.
MSN's crawling is rather poor compared to Google and Yahoo. It is nowhere near as comprehensive as Yahoo or Google when crawling big sites like eBay and Amazon.
MSN still lags behind in distinguishing quality backlinks from low quality backlinks. A few more backlinks can bias the results no matter what their quality. It does somewhat better than Yahoo at query processing, since it tries to process queries by meaning instead of matching them literally as Yahoo does, but it is still far behind Google.
Because of Microsoft's limited crawling history, MSN is not as good as the other major search engines at differentiating between real organic citations and low quality links. Links also affect rankings more quickly than in the other engines. Sites with relatively few quality links that gain
All the major search engines except MSN consider site authority when evaluating pages. MSN is also not as good as the other engines at determining age related trust scores. New sites doing general textbook SEO and acquiring a few descriptive inbound links (perhaps even low quality links) can rank well in MSN within a month.
Microsoft Content Ads is the most advanced paid search ad platform on the web. On the editorial side, MSN seems to be lacking when it comes to its internal relevancy measurement team. They also pay little attention to the social aspects of search.
MSN SEO Tools
MSN has a wide array of new and interesting search marketing tools. The biggest limiting factor with them is that MSN has limited search market share.
Some of the more interesting tools are
• Keyword Search Funnel Tool - shows terms that people search for before or after they search for a particular keyword
• Demographic Prediction Tool - predicts the demographics of searchers by keyword or site visitors by website
• Online Commercial Intention Detection Tool - estimates the probability of a search query or web page being commercial, informational-transactional, or
• Search Result Clustering Tool - clusters search results based on related topics
You can view more of their tools under the demo section at Microsoft's Adlab.
They have MSN Search, Microsoft AdCenter, and Windows Live Search. All these things are pretty much the same thing and are meshed together, the only difference between them is that Microsoft does not know what brand they want to push.
From a search marketing perspective, if you do standard textbook SEO practices and actively build links it is reasonable to expect to be able to rank well in MSN within about a month. If you are trying to rank for highly spammed keyword phrases keep in mind that many of the top results will have thousands and thousands of spammy links. The biggest benefit to new webmasters trying to rank in Microsoft is how quickly they rank new sites which have shown inbound link bursts.
One note of caution with Microsoft Search is that they are so new to the market that they are rapidly changing their relevancy algorithms as they try to play catch up with Yahoo! and Google, both of which had many years of a head start on them. Having said that, expect that sometimes you will rank where your site does not belong, and over time some of those rankings may go away. Additionally sometimes they may not rank you where you do belong, and the rankings will continue to shift to and fro as they keep testing new technologies.
Microsoft has a small market share, but the biggest things a search marketer has to consider with Microsoft are their vast vats of cash and the dominance on the operating system front.
So far they have lost many distribution battles to Google, but they picked up Amazon.com as a partner, and they can use their operating system software pricing to gain influence over computer manufacturer related distribution partnerships.
The next version of Internet Explorer will integrate search into the browser. This may increase the overall size of the search market by making search more convenient, and boost Microsoft's share of the search pie. This will also require search engines to bid for placement as the default search provider, and nobody is sitting on as much cash as Microsoft is.
Microsoft has one of the largest email user bases. They have been testing integrating search and showing contextually relevant ads in desktop email software. Microsoft also purchased Massive, Inc., a firm which places ads in video games.
Microsoft users tend to be default users who are less advertisement averse than a typical Google user. Even though Microsoft has a small marketshare they should not be overlooked due to their primitive search algorithms (and thus ease of relevancy manipulation), defaultish users, and potential market growth opportunity associated with the launch of their next web browser.
Learn More
• MSN Guidelines for Successful Indexing
• MSN Site Owner Help
• MSN Search Blog
• MSN AdCenter Blog
• Microsoft AdLab
• Microsoft Research
Worker Blogs
• Robert Scoble - he is probably known as one of the top 10 bloggers, but after working for Microsoft for years he left on June 10th, 2006.

Thursday, July 10, 2008

Restrict Your Google Search

You can now add another bit of functionality to Google search: limiting the results to a certain date range. Run a normal search in Google by typing a search string in the Google search box and pressing Enter.

1. Add “&as_qdr=d” to the end of the URL displayed in the address bar after your search

2. A drop-down will appear beside the search box, showing the date range

3. Now you can choose a date range and limit the search query according to your need
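If you would rather build the restricted URL in a script than edit the address bar by hand, the steps above can be sketched roughly like this in Python. The helper name restrict_by_date and the sample search URL are made up for illustration, and the sketch assumes the parameter value is the bare letter d; any other value you pass in is your own assumption, since only d comes from this post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def restrict_by_date(search_url, period="d"):
    """Return search_url with an as_qdr parameter set to the given period."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except an old as_qdr, if there is one.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "as_qdr"
    ]
    params.append(("as_qdr", period))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL, used only as an example.
url = restrict_by_date("http://www.google.com/search?q=seo+tips", "d")
print(url)  # http://www.google.com/search?q=seo+tips&as_qdr=d
```

Replacing an existing as_qdr value instead of appending a second one keeps the URL clean if you run the helper on an already restricted search.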

Enjoy :)

Ranking Factors of Major players - Part 2

Yahoo!

Yahoo! has been in the market since 1994, when David Filo and Jerry Yang founded it as a directory. In the beginning their search results were supplied by third parties, but they acquired search companies of their own once they realized the importance of the search market.

Yahoo! places a lot of weight on on-page content: if your meta elements are well optimized you will rank high. Yahoo! does this to help its paid results look like organic ones.

Yahoo! can crawl a site deeply as long as it has enough link popularity, but it may have difficulty when URLs contain two or more special characters.

Yahoo! considers every word in the search query. For example, if you search for “how to find hotel”, it will try to match “how” + “to” + “find” + “hotel”, unlike Google, which gives priority to the meaning of the words and discounts common stop words. Yahoo! puts quite a bit of weight even on common words that occur in the search query.

Yahoo! still pays attention to a large number of backlinks regardless of their quality, to focused anchor text, and to trust scores.

Yahoo! considers both the inbound links to a particular page and the links to the site as a whole. Pages on a new site have a chance to rank well if they manage to collect a good number of descriptive inbound links.

Site age does not matter as much to Yahoo! as it does to Google. A page that is 2 to 3 months old can rank well in Yahoo!'s organic results.

Inclusion in the Yahoo! Directory and Yahoo!'s paid results are manually edited. Yahoo! also reviews the search results for some competitive queries in many industries, and some of the top search results may be hand coded. Yahoo! also manually reviews some of the most spammed categories.

Yahoo! bought del.icio.us, a social bookmarking site. They also have a similar product of their own, My Yahoo!, and Yahoo! Answers, a question answering service.
Yahoo! has a number of useful SEO tools.
• Overture Keyword Selector Tool - shows prior month search volumes across Yahoo! and their search network.
• Overture View Bids Tool - displays the top ads and bid prices by keyword in the Yahoo! Search Marketing ad network.
• Yahoo! Site Explorer - shows which pages Yahoo! has indexed from a site and which pages they know of that link at pages on your site.
• Yahoo! Mindset - shows you how Yahoo! can bias search results more toward informational or commercial search results.
• Yahoo! Advanced Search Page - makes it easy to look for .edu and .gov backlinks
o while doing link:http://www.site.com/page.html searches (links to an individual page)
o while doing linkdomain:www.site.com/ searches (links to any page on a particular domain)
• Yahoo! Buzz - shows current popular searches

Ranking Factors of Major players- Part 1

Are you the owner of an online business? You must be dreaming about the no. 1 position in major players like Google, Yahoo, and MSN. If you already have some idea about search engine optimization and know how effective SEO can be for your business, then you should be aware of the important ranking factors and of how to stand out in this highly competitive field.

The first step, though most of us ignore it, is to understand the behavior of the major search engines. A recent market study shows that 61.6% of the search market is occupied by Google, 20.4% by Yahoo, 9.1% by MSN, and the rest by other search engines.

The question is whether each search engine has a different strategy for ranking sites or all of them follow the same one. A few days ago I was working on one of our in-house sites for a keyword like “web development company”. It was ranking well in Google, poorly in Yahoo, and was nowhere to be found in MSN. Many of us assume that because we are optimizing our site for Google, it will rank well in the others too. I proved myself completely wrong. No, each of them has its own style of ranking results, and most of them have their own algorithms. Don't be confused, and don't assume that you have been banned from a search engine just because you are not ranking in it.

Let’s discuss some of the important aspects that the engines rely upon.
The factors we will discuss here:
1. On page content
2. Crawling
3. Query processing
4. Link reputation
5. Page vs site
6. Site age
7. Paid search
8. Editorial
9. Social aspects
10. SEO tools
11. Business perspective
12. Marketing perspective

To be frank, I have collected all of this information from the seobook.com article written by Aaron Wall on June 13, 2006. I have just divided it up by search engine and tried to put it in my own style.

So friends, I will post what Aaron Wall says about Yahoo in my next post. Stay tuned.

Wednesday, July 9, 2008

Google showing numeric data

It's no longer a headache to get the exact search volume of a keyword. Those who have been using Wordtracker, SEO Book, or Digital Point to find out how many times a keyword is searched on the web can now use the Google AdWords tool. Instead of the green bar, which showed only low, average, or high search volume, it now shows numbers. I don't know how reliable it is or how effective it will be to use, so let's hope for the best.

Google's tool can be used both by those who have an AdWords account and by those who don't.

Here are the tools one can use along with the Google AdWords external keyword tool

https://adwords.google.com/select/KeywordToolExternal

http://freekeywords.wordtracker.com/

http://tools.seobook.com/keyword-tools/seobook/

http://www.digitalpoint.com/tools/keywords/

Thursday, June 26, 2008

Why Wikipedia result is coming in most of the cases in Google SERP

All of us have seen that a Wikipedia result comes up at the top whenever you search for something, however competitive the key phrase is. Wikipedia.org is a multilingual, Web-based, free content encyclopedia project. Wikipedia is written collaboratively by volunteers from all around the world. Since its creation in 2001, it has grown to 2,428,787 articles in English. Read more about Wikipedia at http://en.wikipedia.org/wiki/Wikipedia:About

The possible reasons for their top position are:
1. Is there a tie-up between Google and Wikipedia, with Google taking responsibility for promoting Wikipedia?
2. Have they optimized their pages well for SEO?

Content is king, and Wikipedia sits at the head of the content. Wikipedia is rich in semantically related content, and it has a good number of backlinks from its own domain. Suppose you are searching for “hotel”; you will probably get a URL at the top of the SERP like “en.wikipedia.org/wiki/Hotel”. It seems they are naturally optimized for SEO, as they have “Hotel” in the URL. Wikipedia's articles provide links to guide the user to related pages with additional information. The page also has a good number of backlinks from pages like “en.wikipedia.org/wiki/Hotel_Chelsea”. These are natural, content-rich, quality backlinks, which Google values the most.
One interesting thing I found is that if I search for a single word, Wikipedia is at the top, but if I add another word or search for a long-tail key phrase, it moves down. I am still not clear about how Google handles this. If anyone has additions to the above, please comment here. Suggestions and feedback are all welcome :)

Wednesday, June 4, 2008

How Google defines IP delivery, geolocation, and cloaking

Geolocation: Serving targeted/different content to users based on their location. As a webmaster, you may be able to determine a user's location from preferences you've stored in their cookie, information pertaining to their login, or their IP address. For example, if your site is about baseball, you may use geolocation techniques to highlight the Yankees to your users in New York.

The key is to treat Googlebot as you would a typical user from a similar location, IP range, etc. (i.e. don't treat Googlebot as if it came from its own separate country—that's cloaking).

IP delivery: Serving targeted/different content to users based on their IP address, often because the IP address provides geographic information. Because IP delivery can be viewed as a specific type of geolocation, similar rules apply. Googlebot should see the same content a typical user from the same IP address would see.

Cloaking: Serving different content to users than to Googlebot. This is a violation of our webmaster guidelines. If the file that Googlebot sees is not identical to the file that a typical user sees, then you're in a high-risk category. A program such as md5sum or diff can compute a hash to verify that two different files are identical.
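As a rough illustration of the md5sum-style check mentioned above, here is a small Python sketch that hashes two saved copies of a page and compares the digests. The helper names file_digest and same_content are made up for this example, and how you capture the "Googlebot" copy and the "typical user" copy is left to you.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files produce the same digest."""
    return file_digest(path_a) == file_digest(path_b)
```

If same_content returns False for the two copies, the pages differ somewhere, and it is worth looking at what differs before assuming the site is safe.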

First click free: Implementing Google News' First click free policy for your content allows you to include your premium or subscription-based content in Google's websearch index without violating our quality guidelines. You allow all users who find your page using Google search to see the full text of the document, even if they have not registered or subscribed. The user's first click to your content area is free. However, you can block the user with a login or payment request when he clicks away from that page to another section of your site.

If you're using First click free, the page displayed to users who visit from Google must be identical to the content that is shown to the Googlebot.

Still have questions? See the related thread at the Google Webmaster Help Group.

How Google defines IP delivery, geolocation, and cloaking:

Matt Cutts Discusses Webmaster Tools

Wednesday, May 14, 2008

Sitemaps offer better coverage for your Custom Search Engine

If you're a webmaster or site owner, you realize the importance of providing high quality search on your site so that users easily find the right information.

We just announced today that AdSense for Search is now powered by Custom Search. Custom Search (a Google-powered search box that you can install on your website in minutes) helps your users quickly find what they're looking for. As a webmaster, Custom Search gives you advanced customization options to improve the accuracy of your site's search results. You can also choose to monetize your traffic with ads tuned to the topic of your site. If you don't want ads, you can use Custom Search Business Edition.

Now, we're also looking to index more of your site's content for inclusion in your Custom Search Engine (CSE) used for search on your site. We figure out what sites and URLs are included in your CSE, and -- if you've provided Sitemaps for the relevant sites -- we use that information to create a more comprehensive experience for your site's visitors. You don't have to do anything specific, besides submitting a Sitemap (via Webmaster Tools) for your site if you haven't already done so. Note that this change will not result in more pages indexed on Google.com and your search rankings on Google.com won't change. However, you will be able to get much better results coverage in your CSE.

Custom Search is built on top of the Google index. This means that all pages that are available on Google.com are also available to your search engine. We're now maintaining a CSE-specific index in addition to the Google.com index for enhancing the performance of search on your site. If you submit a Sitemap, it's likely that we will crawl those pages and include them in the additional index we build.

In order for us to index these additional pages, our crawlers must be able to crawl them. Your Sitemap will also help us identify the URLs that are important. Please ensure you are not blocking us from crawling any pages you want indexed. Improved index coverage is not instantaneous, as it takes some time for the pages to be crawled and indexed.

So what are you waiting for? Submit your Sitemap!
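If you do not have a Sitemap yet, a minimal one is just an XML list of your URLs. The Python sketch below builds such a file using the standard sitemaps.org namespace. The function name build_sitemap and the example.com URLs are placeholders, and a real Sitemap may also carry optional fields such as lastmod that are left out here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap XML document (as a string) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/about.html",
    ]))
```

Save the output as sitemap.xml at the root of your site and submit it through Webmaster Tools.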

Webmaster Tools now in 26 languages

Webmasters come from all corners of the world and we are working hard to reach each and everyone of you. A few months back we introduced you to Googlers who help monitor our Webmaster Help Groups in fifteen languages. Since then, that number has grown to sixteen with the addition of the Chinese Help Group. Today, we're happy to announce that Webmaster Tools is now available in four more languages:

Arabic
Hebrew
Hindi
Thai

Webmaster Tools is already available in 22 other languages: British English, Czech, Danish, Dutch, Finnish, French, German, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Simplified Chinese, Spanish, Swedish, Traditional Chinese, Turkish, and US English.

http://googlewebmastercentral.blogspot.com/2008/05/webmaster-tools-now-in-26-languages.html

Add Social Features to Your Site

Adding social features to your site can generate buzz and traffic to your pages. Make your site like other social media sites on the web, so that your visitors will be able to see, invite, and interact with their friends from existing sources of friends, including Facebook, Google Talk, hi5, LinkedIn, orkut, Plaxo, and others. And you'll be able to more actively engage your visitors by adding social features from a growing gallery of social applications.

Make your site social-feature enabled with the help of Google Friend Connect.

Google Friend Connect is a service that helps you grow traffic by enabling you to easily provide social features for your visitors. Just add a snippet of code, and, voilà, you can add social functionality -- picking and choosing from built-in functionality like user registration, invitations, members gallery, message posting, and reviews, as well as third-party applications built by the OpenSocial developer community.

Social gadgets created by Google. Google Friend Connect provides a core set of social gadgets such as member management, message board, reviews, and picture-sharing. The key gadget is the members gadget. This gadget provides core social features for your visitors:

* sign-in with their existing Google, Yahoo, AIM, or OpenID account
* invite and show activity to existing friends from social networks such as Facebook, Google Talk, hi5, orkut, Plaxo, and more
* browse member profiles across social networks
* connect with new friends on your site

Social gadgets created by other developers. For the past year, the developer community has been creating hundreds of applications for OpenSocial, an open standard for social applications. Once you have added Friend Connect to your site, you can offer many of these applications to your users, simply by pasting the relevant snippet into your site.

Once you have added the members gadget, and the additional social gadgets, your visitors can start inviting their friends and more deeply engage with your site.

URL Re-Write Codes

If your pages are dynamically generated and the URL contains more than two special characters (?, #, @, &, etc.), that can be a problem for indexing of the page. To avoid this, create a new file at the root of your website and name it ".htaccess". In it you need to write some rules like the ones below, collected from different sources.

RewriteEngine on
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)\.html$ index.php?cat=$1&page=$2

Explanation of the code

The first line enables rewriting. The second line is the rewrite rule.
RewriteRule has two parts. The first part, the "from" part, “^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)\.html$”, tells Apache that if it gets a URL that looks like this, it should rewrite it to the second part, the "to" part, which here is “index.php?cat=$1&page=$2”. Note that in a .htaccess file the pattern is matched against the path without its leading slash, so the pattern must not begin with “^/”.

The technique is to put static-looking links on your pages; when someone clicks them, or a search engine follows them, the server maps them to the actual dynamic page.

In the from part there are two variables, each defined as “([a-zA-Z0-9_-]+)”: one or more characters from a-z, A-Z, 0-9, underscore, or hyphen. A variable ends as soon as a character is reached that is not in the specified range. In this case that is the ‘/’, after which the second variable starts, which is for the page. As you can see, this lets you use your keywords in the URLs for better ranking.
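To see what the two variables capture, here is a small Python sketch that applies the same regular expression with the re module. It only imitates the substitution; Apache does the real work. The pattern is written without a leading slash, because mod_rewrite strips it before matching rules in a .htaccess file. The function name rewrite and the sample path shoes/nike.html are made up for illustration.

```python
import re

# Same character classes as the rule above, without a leading slash
PATTERN = re.compile(r"^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)\.html$")


def rewrite(path):
    """Map a static-looking path to the dynamic URL, or return None if it does not match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    category, page = match.groups()   # $1 and $2 in the RewriteRule
    return "index.php?cat={}&page={}".format(category, page)


if __name__ == "__main__":
    print(rewrite("shoes/nike.html"))
```

A path such as shoes/nike.html is split at the ‘/’ into the cat and page values, while a path with a different extension or an extra directory level is left alone.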

If you have moved a file to a new location, you will want all links to the old location to be forwarded to the new location. Though you shouldn’t really ever move a file once it has been placed on the web, when you simply have to, you can at least do your best to stop any old links from breaking.

RewriteEngine on
RewriteRule ^old\.html$ new.html

Though this is the simplest example possible, it may throw a few people off. The structure of the ‘old’ URL is the only difficult part in this RewriteRule. There are three special characters in there.

* The caret, ^, signifies the start of an URL, under the current directory. This directory is whatever directory the .htaccess file is in. You’ll start almost all matches with a caret.
* The dollar sign, $, signifies the end of the string to be matched. You should add this in to stop your rules matching the first part of longer URLs.
* The period or dot before the file extension is a special character in regular expressions, and would mean something special if we didn’t escape it with the backslash, which tells Apache to treat it as a normal character.

So, this rule will make your server transparently redirect from old.html to the new.html page. Your reader will have no idea that it happened, and it’s pretty much instantaneous.

Forcing New Requests

Sometimes you do want your readers to know a redirect has occurred, and can do this by forcing a new HTTP request for the new page. This will make the browser load up the new page as if it was the page originally requested, and the location bar will change to show the URL of the new page. All you need to do is turn on the [R] flag, by appending it to the rule:

RewriteRule ^old\.html$ new.html [R]

Using Regular Expressions

Now we get on to the really useful stuff. The power of mod_rewrite comes at the expense of complexity. If this is your first encounter with regular expressions, you may find them to be a tough nut to crack, but the options they afford you are well worth the slog. I’ll be providing plenty of examples to guide you through the basics here.

Using regular expressions you can have your rules matching a set of URLs at a time, and mass-redirect them to their actual pages. Take this rule;

RewriteRule ^products/([0-9][0-9])/$ /productinfo.php?prodID=$1

This will match any URLs that start with ‘products/’, followed by any two digits, followed by a forward slash. For example, this rule will match an URL like products/12/ or products/99/, and redirect it to the PHP page.

The parts in square brackets are called ranges. In this case we’re allowing anything in the range 0-9, which is any digit. Other ranges would be [A-Z], which is any uppercase letter; [a-z], any lowercase letter; and [A-Za-z], any letter in either case.

We have encased the regular expression part of the URL in parentheses, because we want to store whatever value was found here for later use. In this case we’re sending this value to a PHP page as an argument. Once we have a value in parentheses we can use it through what’s called a back-reference. Each of the parts you’ve placed in parentheses are given an index, starting with one. So, the first back-reference is $1, the third is $3 etc.

Thus, once the redirect is done, the page loaded in the readers’ browser will be something like productinfo.php?prodID=12 or something similar. Of course, we’re keeping this true URL secret from the reader, because it likely ain’t the prettiest thing they’ll see all day.

Multiple Redirects

If your site visitor had entered something like products/12, the rule above won’t do a redirect, as the slash at the end is missing. To promote good URL writing, we’ll take care of this by doing a direct redirect to the same URL with the slash appended.

RewriteRule ^products/([0-9][0-9])$ /products/$1/ [R]

Multiple redirects in the same .htaccess file can be applied in sequence, which is what we’re doing here. This rule is added before the one we did above, like so:

RewriteRule ^products/([0-9][0-9])$ /products/$1/ [R]
RewriteRule ^products/([0-9][0-9])/$ /productinfo.php?prodID=$1

Thus, if the user types in the URL products/12, our first rule kicks in, rewriting the URL to include the trailing slash, and doing a new request for products/12/ so the user can see that we likes our trailing slashes around here. Then the second rule has something to match, and transparently redirects this URL to productinfo.php?prodID=12. Slick.
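The order of events can be imitated in a few lines of Python. This is a simplified model of the two rules, not of how Apache processes them internally: an external rule (the one with [R]) makes the browser request the new URL, and the rules are then tried again from the top. The names RULES and resolve are invented for this sketch.

```python
import re

# (pattern, substitution, external) in the order the rules appear in .htaccess
RULES = [
    (re.compile(r"^products/([0-9][0-9])$"), r"/products/\1/", True),
    (re.compile(r"^products/([0-9][0-9])/$"), r"/productinfo.php?prodID=\1", False),
]


def resolve(path):
    """Follow the rules until an internal rewrite is reached.

    Returns (final_target, list_of_external_redirects_seen).
    """
    redirects = []
    while True:
        for pattern, substitution, external in RULES:
            match = pattern.match(path)
            if match:
                target = match.expand(substitution)
                if external:
                    redirects.append(target)
                    path = target.lstrip("/")   # the browser re-requests the new URL
                    break                       # start again from the first rule
                return target, redirects
        else:
            # no rule matched at all
            return None, redirects


if __name__ == "__main__":
    print(resolve("products/12"))
```

For products/12 the first rule produces one visible redirect to /products/12/, and the second rule then quietly maps that to the PHP page.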

Match Modifiers

You can expand your regular expression patterns by adding some modifier characters, which allow you to match URLs with an indefinite number of characters. In our examples above, we were only allowing two numbers after products. This isn’t the most expandable solution, as if the shop ever grew beyond these initial confines of 99 products and created the URL productinfo.php?prodID=100, our rules would cease to match this URL.

So, instead of hard-coding a set number of digits to look for, we’ll work in some room to grow by allowing any number of digits to be entered. The rule below does just that:

RewriteRule ^products/([0-9]+)$ /products/$1/ [R]

Note the plus sign (+) that has snuck in there. This modifier changes whatever comes directly before it, by saying ‘one or more of the preceding character or range.’ In this case it means that the rule will match any URL that starts with products/ and ends with at least one digit. So this’ll match both products/1 and products/1000.

Other match modifiers that can be used in the same way are the asterisk, *, which means ‘zero or more of the preceding character or range’, and the question mark, ?, which means ‘zero or only one of the preceding character or range.’
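The three modifiers behave the same way in Python's re module, so they are easy to try out before putting a rule on a live server. The patterns below are illustrative variants of the products rule, not rules taken from the text above.

```python
import re

one_or_more = re.compile(r"^products/([0-9]+)$")      # + : at least one digit
zero_or_more = re.compile(r"^products/([0-9]*)$")     # * : digits are optional
optional_slash = re.compile(r"^products/([0-9]+)/?$")  # ? : trailing slash is optional


def matches(pattern, path):
    """True when the whole path satisfies the pattern."""
    return pattern.match(path) is not None


if __name__ == "__main__":
    for path in ("products/", "products/1", "products/1000", "products/12/"):
        print(path,
              matches(one_or_more, path),
              matches(zero_or_more, path),
              matches(optional_slash, path))
```

Note that the asterisk version also accepts products/ with no digits at all, which is usually not what you want for a product ID.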

Adding Guessable URLs

Using these simple commands you can set up a slew of ‘shortcut URLs’ that you think visitors will likely try to enter to get to pages they know exist on your site. For example, I’d imagine a lot of visitors try jumping straight into our stylesheets section by typing the URL http://www.yourhtmlsource.com/css/. We can catch these cases, and hopefully alert the reader to the correct address by updating their location bar once the redirect is done with these lines:

RewriteRule ^css(/)?$ /stylesheets/ [R]

The simple regular expression in this rule allows it to match the css URL with or without a trailing slash. The question mark means ‘zero or one of the preceding character or range’ — in other words either yourhtmlsource.com/css or yourhtmlsource.com/css/ will both be taken care of by this one rule.

This approach means less confusing 404 errors for your readers, and a site that seems to run a whole lot smoother all ’round.

Canonical Hostnames

Description:

The goal of this rule is to force the use of a particular hostname, in preference to other hostnames which may be used to reach the same site. For example, if you wish to force the use of www.example.com instead of example.com, you might use a variant of the following recipe.

Solution:

# For sites running on a port other than 80
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{SERVER_PORT} !^80$
RewriteRule ^/(.*) http://www.example.com:%{SERVER_PORT}/$1 [L,R]

# And for a site running on port 80
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/(.*) http://www.example.com/$1 [L,R]
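The decision these conditions make can be sketched in Python as follows. It is only an approximation of the recipe for experimenting with inputs: the function name canonical_redirect is made up, and the host check is a case-insensitive prefix test, which is roughly what the [NC] condition does.

```python
def canonical_redirect(host, path, port=80, canonical="www.example.com"):
    """Return the URL to redirect to, or None when no redirect is needed."""
    if not host:
        # mirrors RewriteCond %{HTTP_HOST} !^$ (no Host header, nothing to do)
        return None
    if host.lower().startswith(canonical):
        # mirrors RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
        return None
    # only mention the port when the site is not on port 80
    port_part = "" if port == 80 else ":{}".format(port)
    return "http://{}{}/{}".format(canonical, port_part, path.lstrip("/"))


if __name__ == "__main__":
    print(canonical_redirect("example.com", "/about.html"))
    print(canonical_redirect("example.com", "/about.html", port=8080))
```

A request that already uses the preferred hostname returns None, so no redirect loop is created.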

Trailing Slash Problem
Description:

Every webmaster can sing a song about the problem of the trailing slash on URLs referencing directories. If they are missing, the server dumps an error, because if you say /~quux/foo instead of /~quux/foo/ then the server searches for a file named foo. And because this file is a directory it complains. Actually it tries to fix it itself in most of the cases, but sometimes this mechanism needs to be emulated by you, for instance after you have done a lot of complicated URL rewritings to CGI scripts etc.

Solution:

The solution to this subtle problem is to let the server add the trailing slash automatically. To do this correctly we have to use an external redirect, so the browser correctly requests subsequent images etc. If we only did an internal rewrite, this would only work for the directory page, but would go wrong when any images are included into this page with relative URLs, because the browser would request an in-lined object. For instance, a request for image.gif in /~quux/foo/index.html would become /~quux/image.gif without the external redirect!

So, to do this trick we write

RewriteEngine on
RewriteBase /~quux/
RewriteRule ^foo$ foo/ [R]

The crazy and lazy can even do the following in the top-level .htaccess file of their homedir. But notice that this creates some processing overhead.

RewriteEngine on
RewriteBase /~quux/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
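The check performed by the last two lines can be imitated in Python: if the requested path has no trailing slash and points at a directory on disk, answer with the same path plus a slash. The function name add_trailing_slash and the doc_root argument are assumptions made for this sketch.

```python
import os


def add_trailing_slash(doc_root, url_path):
    """Return the redirect target when url_path names a directory without
    a trailing slash; otherwise return None."""
    if url_path.endswith("/"):
        return None
    fs_path = os.path.join(doc_root, url_path.lstrip("/"))
    if os.path.isdir(fs_path):    # plays the role of RewriteCond %{REQUEST_FILENAME} -d
        return url_path + "/"
    return None
```

The directory test on every request is the processing overhead mentioned above; the single-rule version for foo avoids it.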

301 Redirect

301 redirect is the most efficient and Search Engine Friendly method for webpage redirection. It's not that hard to implement and it should preserve your search engine rankings for that particular page. If you have to change file names or move pages around, it's the safest option. The code "301" is interpreted as "moved permanently".

You can test your redirection with the Search Engine Friendly Redirect Checker.

Below are several methods to implement URL redirection.

IIS Redirect

* In internet services manager, right click on the file or folder you wish to redirect
* Select the radio button titled "a redirection to a URL".
* Enter the redirection page
* Check "The exact url entered above" and the "A permanent redirection for this resource"
* Click on 'Apply'

ColdFusion Redirect

<cfheader statuscode="301" statustext="Moved permanently">
<cfheader name="Location" value="http://www.new-url.com">

PHP Redirect

<?php
Header( "HTTP/1.1 301 Moved Permanently" );
Header( "Location: http://www.new-url.com" );
?>

ASP Redirect

<%@ Language=VBScript %>
<%
Response.Status="301 Moved Permanently"
Response.AddHeader "Location","http://www.new-url.com/"
%>

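ASP .NET Redirect

<script runat="server">
private void Page_Load(object sender, System.EventArgs e)
{
Response.Status = "301 Moved Permanently";
Response.AddHeader("Location","http://www.new-url.com");
}
</script>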

JSP (Java) Redirect

<%
response.setStatus(301);
response.setHeader( "Location", "http://www.new-url.com/" );
response.setHeader( "Connection", "close" );
%>

CGI PERL Redirect

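use CGI;
$q = new CGI;
# redirect() sends a 302 by default, so the 301 status is set explicitly
print $q->redirect(-uri => "http://www.new-url.com/", -status => "301 Moved Permanently");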

Ruby on Rails Redirect

def old_action
  headers["Status"] = "301 Moved Permanently"
  redirect_to "http://www.new-url.com/"
end
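For comparison, the same kind of redirect can be written in Python as a bare WSGI application. This is a minimal sketch and not taken from the sources of this post. The name redirect_app is invented, and the target URL is the same placeholder used in the examples above.

```python
def redirect_app(environ, start_response):
    """Minimal WSGI application that answers every request with a 301."""
    start_response("301 Moved Permanently", [
        ("Location", "http://www.new-url.com/"),
        ("Content-Length", "0"),
    ])
    return [b""]


if __name__ == "__main__":
    # Serve locally for a quick test; wsgiref ships with Python
    from wsgiref.simple_server import make_server
    make_server("", 8000, redirect_app).serve_forever()
```

Running the file and requesting http://localhost:8000/ in a browser should land on the new URL, and a redirect checker should report the 301 status.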

Redirect Old domain to New domain (htaccess redirect)

Create a .htaccess file with the below code, it will ensure that all your directories and pages of your old domain will get correctly redirected to your new domain.

The .htaccess file needs to be placed in the root directory of your old website (i.e the same directory where your index file is placed)

Options +FollowSymLinks
RewriteEngine on
RewriteRule (.*) http://www.newdomain.com/$1 [R=301,L]

Please REPLACE www.newdomain.com in the above code with your actual domain name.

In addition to the redirect I would suggest that you contact every backlinking site to modify their backlink to point to your new website.

Note* This .htaccess method of redirection works ONLY on servers running Apache with the mod_rewrite module enabled (typically Linux hosts).

Redirect to www (htaccess redirect)

Create a .htaccess file with the below code, it will ensure that all requests coming in to domain.com will get redirected to www.domain.com

The .htaccess file needs to be placed in the root directory of your website (i.e. the same directory where your index file is placed)

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

Please REPLACE domain.com and www.domain.com with your actual domain name.

Note* This .htaccess method of redirection works ONLY on servers running Apache with the mod_rewrite module enabled (typically Linux hosts).

Sources
----------
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
http://www.yourhtmlsource.com/sitemanagement/urlrewriting.html
http://www.webconfs.com/how-to-redirect-a-webpage.php
http://www.johny.org/2008/05/13/apache-url-rewrite-for-seo/