

9.8. A Note on Spidering and Scraping

A small share of the hacks in this book involve spidering (meandering through sites) and scraping (pulling data from their web pages to be used outside of its intended context). Given that we have the Google API at our disposal, why do we resort at times to spidering and scraping?

The main reason is simply that you can't gain access to everything Google through the API. While it nicely serves the purposes of searching the Web programmatically, the API (at the time of this writing) doesn't go any further than Google's main web search index. And it's even limited in what you can pull from the index. You can't do a phonebook search, trawl Google News, leaf through Google Catalogs, or interact in any way with any of Google's other specialty search properties.
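
To make the contrast with an API concrete: scraping means taking a page's raw HTML and picking the data out of the markup yourself, rather than receiving structured results. What follows is only a minimal sketch in Python using the standard library's `html.parser`. The `LinkScraper` class, the `scrape_links` helper, and the `SAMPLE_PAGE` fragment are all made up for illustration; the fragment is not Google's actual markup, and nothing is fetched over the network.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Hypothetical helper: collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Only anchors that carry an href are treated as links.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Accumulate text only while inside a link, including nested tags.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html):
    """Return every (href, text) pair found in the given HTML string."""
    parser = LinkScraper()
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample page, standing in for HTML a spider would have fetched.
SAMPLE_PAGE = """
<html><body>
  <p class="result"><a href="http://example.com/one">First <b>result</b></a></p>
  <p class="result"><a href="http://example.com/two">Second result</a></p>
  <a name="no-href">ignored</a>
</body></html>
"""

if __name__ == "__main__":
    for href, text in scrape_links(SAMPLE_PAGE):
        print(href, "->", text)
```

Because a scraper like this depends on the page's markup rather than on a published interface, it can stop working whenever the site changes its HTML.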

