

Analyzing the Content

Now that you have seen how spiders find pages on the Web, it's time to look at what search engines do with all those pages. The first thing to notice is that not every document in the search index is an HTML-coded Web page.

Converting Different Types of Documents

Up until now, we have assumed that all Web pages are made of HTML, but many are not. Modern search engines can analyze Adobe Acrobat (PDF) files and many other kinds of documents. Trusted feeds, in particular, tend to use their own formats.
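
As a rough illustration of what "converting" can mean, the sketch below reduces documents of different types to plain text before analysis. It is a minimal sketch under stated assumptions, not how any particular search engine does it: the names `CONVERTERS`, `html_to_text`, `plain_to_text`, and `to_indexable_text` are hypothetical, and only HTML and plain text are handled because the Python standard library has no PDF reader. A real system would register further converters for PDF and other formats.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(raw: bytes) -> str:
    """Strip markup from an HTML document and return its visible text."""
    parser = _TextExtractor()
    parser.feed(raw.decode("utf-8", errors="replace"))
    parser.close()
    return parser.text()


def plain_to_text(raw: bytes) -> str:
    """Decode plain text and collapse runs of whitespace."""
    return " ".join(raw.decode("utf-8", errors="replace").split())


# Hypothetical registry: one converter per document type (assumption, not
# taken from the book). PDF or feed formats would be added here.
CONVERTERS = {
    "text/html": html_to_text,
    "text/plain": plain_to_text,
}


def to_indexable_text(raw: bytes, content_type: str) -> str:
    """Pick a converter from the content type and return plain text."""
    mime = content_type.split(";")[0].strip().lower()
    if mime not in CONVERTERS:
        raise ValueError(f"no converter registered for {mime!r}")
    return CONVERTERS[mime](raw)
```

Keeping the converters in a lookup table means that support for a new document type is added by registering one more function, while the analysis that follows always receives the same plain-text form.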

