SEO Chapter 3

Search Engine Spam

     Search engine spam is the manipulation of web pages to improve their rankings in search engine results. The leading search engines have published guidelines that outline the practices they consider abuse. They are available at:

     Google           http://www.google.com/webmasters/guidelines.html
     Yahoo! Search    help.yahoo.com/help/us/ysearch/basics/basics-18.html
     MSN Search       search.msn.com/docs/siteowner.aspx

Consequences of spamming

     Spammers continually reinvent their techniques to get around the spam controls that search engines put in place. Search engines, in turn, keep updating their spam policies and modifying their algorithms. Because the algorithms are proprietary, there is no definite way of knowing what a search engine considers spam. When a search engine identifies a website as an offender, it may penalize the website or remove it from the index altogether. Once a website is blacklisted as a spammer, the spider no longer crawls it. The owner must then contact the search engine staff to get the website back into the index. This process is time consuming and costs the owner valuable traffic and new clients.

Spamming techniques

     Below are the different types of spamming methods that have been used to improve rankings.

  1. Hidden text
  2. IP Cloaking
  3. Doorway pages
  4. Pagejacking
  5. Domain duplication
  6. Excessive popups
  7. Inflating link popularity
  8. ALT stuffing
  9. Link farming
  10. FFA
  11. Mousetrapping

Hidden text

     Hidden text, a form of keyword stuffing, is the practice of overloading a web page with keywords and key phrases that are invisible to the visitor but present in the body of the page. Because search engines read the HTML source code of web pages, this text is visible to the spider. The spider is led to treat the page as highly relevant to a keyword because that keyword occurs so often in the content, and so it assigns the page a higher ranking. Several techniques can be used to inflate keyword density. The most prominent are:

  1. Hidden input tag
    <input type="hidden" name="keyword1" value="list of keywords">
  2. Invisible text
    The font color is set to match the background color of the web page, so the characters cannot be seen by the visitor.
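
     From the point of view of someone auditing a page, both ideas can be checked mechanically. The following is a minimal sketch in Python, not the method of any search engine: the function names, the word pattern and the five-word limit are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser


def keyword_density(text, keyword):
    """Fraction of the words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


class HiddenInputFinder(HTMLParser):
    """Collects the value of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.hidden_values = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if (attributes.get("type") or "").lower() == "hidden":
            self.hidden_values.append(attributes.get("value") or "")


def suspicious_hidden_inputs(html, max_words=5):
    """Return hidden input values that hold more than `max_words` words.

    The limit is an arbitrary assumption: legitimate hidden fields usually
    carry a single token or id rather than a list of phrases.
    """
    finder = HiddenInputFinder()
    finder.feed(html)
    return [v for v in finder.hidden_values if len(v.split()) > max_words]
```

     A page would be passed to suspicious_hidden_inputs() as a string of HTML, and its visible text to keyword_density() together with the keyword being checked. Invisible text that relies on matching font and background colors needs the page's styles to be resolved and is not covered by this sketch.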

IP Cloaking

     IP cloaking is the practice of creating specialized web pages that are served only to search engine spiders and are invisible to normal visitors. The server detects whether a request comes from a regular browser or a search engine spider, for example by its IP address or User-Agent header, and serves different page content to each. The end result is that the spider sees a highly optimized web page with a heavy keyword density while the visitor is served the regular page.
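
     One way to spot cloaking is to obtain the same URL twice, once as a browser and once as a spider, and compare what comes back. The sketch below covers only the comparison step and assumes the two HTML strings have already been fetched. The function names and the 0.5 threshold are illustrative assumptions, and the tag stripping is deliberately crude.

```python
import re


def visible_words(html):
    """Return the set of lowercase words left after crudely stripping tags."""
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def content_similarity(browser_html, spider_html):
    """Jaccard similarity of the two word sets (1.0 means identical sets)."""
    a = visible_words(browser_html)
    b = visible_words(spider_html)
    if not a and not b:
        return 1.0  # two empty pages are treated as identical
    return len(a & b) / len(a | b)


def looks_cloaked(browser_html, spider_html, threshold=0.5):
    """Flag the page when the two responses share too little vocabulary."""
    return content_similarity(browser_html, spider_html) < threshold
```

     Pages that vary between requests for legitimate reasons, such as rotating content, will also lower the score, so a low similarity is a reason to look closer rather than proof of cloaking. Cloaking keyed to the spider's IP address cannot be reproduced simply by changing the User-Agent header of a test request.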

Doorway pages

     Doorway pages serve as a bridge for the spider. They are created for the same purpose as cloaked pages, except that they are served to all incoming requests, spiders and visitors alike. A doorway page contains either a meta refresh tag that redirects the visitor to the intended page, or a link that the visitor has to click to reach the destination. Doorway pages are also used to inflate link popularity.
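
     The meta refresh tag is easy to find in a page's source. The following sketch, using only Python's standard library, lists the refresh directives on a page. The class and function names are assumptions made for illustration, and finding a refresh does not by itself prove that a page is a doorway page.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Records (delay_seconds, target_url) for each meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/"
        content = attributes.get("content") or ""
        delay_part, _, url_part = content.partition(";")
        try:
            delay = float(delay_part.strip())
        except ValueError:
            return  # malformed delay, ignore the tag
        url_part = url_part.strip()
        target = None  # None means the page simply reloads itself
        if url_part.lower().startswith("url="):
            target = url_part[4:].strip().strip("'\"")
        self.refreshes.append((delay, target))


def find_meta_refresh(html):
    """Return the list of (delay, target) pairs found in `html`."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.refreshes
```

     A refresh with a delay of zero or a few seconds that points to a different page is the pattern described above. Redirects performed in JavaScript are not detected by this sketch.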

Pagejacking

     Pagejacking, or content duplication, is the practice of copying content (HTML source code) from another site and publishing duplicate copies of its web pages on one’s own site. These illegitimate web pages are indexed by spiders and show up in search engine results. The spammer uses these pages to attract visitors, who are tricked into thinking that the illegitimate site is the site they are looking for. Once on the site, the visitors may become victims of mousetrapping.

Domain duplication

     Domain duplication is the practice of creating identical websites that differ only in their domain names. This lets the websites occupy multiple listings on the same page of search engine results. Since the web pages are identical, their rankings will be more or less the same. The visitor is thus tricked into visiting the same content, because adjoining listings in the results point to it.
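
     Identical content under different URLs, whether it comes from pagejacking or from domain duplication, can be grouped by comparing a fingerprint of each page's text. The sketch below is one simple way to do this, assuming the pages have already been downloaded into a dictionary. The function names are illustrative, and an exact hash only catches copies whose text is the same after whitespace and case are normalized.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(html):
    """SHA-256 of the page text with tags removed and whitespace collapsed."""
    text = re.sub(r"<[^>]+>", " ", html)
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(pages):
    """Group URLs whose pages share a fingerprint.

    `pages` maps each URL to its HTML. Only groups containing two or
    more URLs are returned, each as a sorted list.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

     Copies that have been lightly edited will produce different hashes, so a real duplicate detector would need a fuzzier comparison than the one assumed here.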

Excessive popups

     Yahoo! specifies that it considers excessive popups to be spam. This is related to mousetrapping. A website should therefore have a maximum of 1 to 2 popups per page.

Inflating link popularity

     Internal link popularity can be inflated by dynamically generating a practically unlimited number of web pages with content of little use, each linking to popular web pages within the site. This inflates the internal inbound link count of those pages and tends to increase their PageRank.
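
     The effect of extra inbound links can be seen on a toy link graph. The sketch below uses the simplified, published form of the PageRank iteration and is not the algorithm any search engine runs in production. The function names, the damping factor of 0.85 and the page names are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration.

    `links` maps every page to the list of pages it links to. Every
    link target must also appear as a key. Pages with no outgoing
    links are simply skipped in this sketch.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


def add_filler_pages(links, target, count):
    """Return a copy of `links` with `count` extra pages linking to `target`."""
    inflated = {page: list(outlinks) for page, outlinks in links.items()}
    for i in range(count):
        inflated[f"filler{i}"] = [target]
    return inflated
```

     In a site where the home page links to two pages, "target" and "other", and each links back, the two pages start with equal scores. After add_filler_pages() attaches filler pages that link only to "target", its score rises above that of "other", which is the inflation described above.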

ALT stuffing

     ALT stuffing is a special case of keyword stuffing. Like the hidden input tag, the ALT attribute of an image is almost invisible to the visitor, who sees its content only when the mouse is over the image. The attribute can be filled with a very long string of keywords that have no relevance to the image or the web page. This increases the keyword density of the web page.
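
     Overlong ALT attributes can be listed with a few lines of Python. This is a minimal sketch: the names are illustrative and the twelve-word limit is an arbitrary assumption, not a figure published by any search engine.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collects the alt text of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.alts.append(dict(attrs).get("alt") or "")


def stuffed_alts(html, max_words=12):
    """Return the alt texts that contain more than `max_words` words."""
    collector = AltCollector()
    collector.feed(html)
    return [alt for alt in collector.alts if len(alt.split()) > max_words]
```

     A short description such as "Company logo" passes, while an attribute holding a long run of repeated key phrases is returned for review.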

Link farming

     Link farming is the process of artificially inflating the number of inbound links to a website through the organized exchange of links. A reciprocal linking program is abused in this way when links are exchanged with websites that are unrelated to the content or theme of the website.

FFA

     Free-for-all (FFA) pages are web pages that have hardly any content apart from links to other websites. FFA is a malicious technique for inflating link popularity.
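
     A page that is almost nothing but links can be recognized by comparing the number of links with the amount of text outside them. The sketch below is one rough way to do that. The names and both thresholds are assumptions for illustration, and text inside script or style elements is counted as ordinary words.

```python
from html.parser import HTMLParser


class LinkTextCounter(HTMLParser):
    """Counts links and the words that appear outside of any link."""

    def __init__(self):
        super().__init__()
        self.link_count = 0
        self.word_count = 0
        self._anchor_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._anchor_depth += 1
            if dict(attrs).get("href"):
                self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._anchor_depth > 0:
            self._anchor_depth -= 1

    def handle_data(self, data):
        # Anchor text is skipped so that only surrounding content is counted.
        if self._anchor_depth == 0:
            self.word_count += len(data.split())


def looks_like_link_list(html, min_links=10, max_words_per_link=3):
    """True when a page has many links and very little text around them."""
    counter = LinkTextCounter()
    counter.feed(html)
    if counter.link_count < min_links:
        return False
    return counter.word_count / counter.link_count < max_words_per_link
```

     Legitimate directory and sitemap pages can trip the same test, so the result is a hint for manual review and not a verdict.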

Mousetrapping

     Mousetrapping uses JavaScript event handlers to open new windows with content that is of no interest to the visitor, preventing the visitor from leaving the site. Whenever the visitor tries to close a window, another window opens. Sometimes mousetrapping is programmed to end after a finite number of new browser windows. Otherwise, the visitor has to close the browser program using the Task Manager, thus losing all other open windows.

     Search engine spam evolves together with search engine algorithms. Spammers come up with new strategies every day to get around the restrictions imposed by search engines, and search engines try to identify these strategies and penalize the websites that use them. It is best not to use any spamming methods to increase the popularity of one’s website.
