

Using Regular Expressions

Using regular expressions to parse feeds may seem a little brutish, but it has two advantages. First, it sidesteps the differences between the feed standards, because a pattern matches only the elements it is looking for. Second, it is much easier to install: it requires no XML parsing modules or any of their dependencies.

Regular expressions, however, aren’t pretty. Consider Example 8-7, which is a section from Rael Dornfest’s lightweight RSS aggregator, Blagg.

Example 8-7. A section of code from Blagg
# Feed's title and link
my($f_title, $f_link) = ($rss =~ m#<title>(.*?)</title>.*?<link>(.*?)</link>#ms);

   
# RSS items' title, link, and description
while ( $rss =~ m{<item(?!s).*?>.*?(?:<title>(.*?)</title>.*?)?(?:<link>(.*?)</link>.*?)?(?:<description>(.*?)</description>.*?)?</item>}mgis ) {
     my($i_title, $i_link, $i_desc, $i_fn) = ($1||'', $2||'', $3||'', undef);
   
     # Unescape &amp; &lt; &gt; to produce useful HTML
     my %unescape = ('&lt;'=>'<', '&gt;'=>'>', '&amp;'=>'&', '&quot;'=>'"');

     my $unescape_re = join '|' => keys %unescape;
     $i_title && $i_title =~ s/($unescape_re)/$unescape{$1}/g;
     $i_desc && $i_desc =~ s/($unescape_re)/$unescape{$1}/g;
   
     # If no title, use the first 50 non-markup ch....
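
The technique in Example 8-7 can be tried in isolation. What follows is a minimal sketch of the same idea, not Blagg's actual code: the inline sample feed, its example.org URLs, and the `parse_items` helper are all hypothetical names introduced for illustration. It uses two simpler patterns (one to find each item, one per field) instead of Blagg's single expression, and it shares the usual limits of a regex approach: elements that carry attributes and CDATA sections are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample feed, inlined so the sketch runs without network access.
my $rss = <<'XML';
<rss version="2.0">
<channel>
<title>Example Feed</title>
<link>http://example.org/</link>
<item>
<title>First &amp; foremost</title>
<link>http://example.org/1</link>
<description>&lt;p&gt;Hello&lt;/p&gt;</description>
</item>
<item>
<link>http://example.org/2</link>
</item>
</channel>
</rss>
XML

# Entity table and alternation pattern, built once outside the loop.
my %unescape = ('&lt;' => '<', '&gt;' => '>', '&amp;' => '&', '&quot;' => '"');
my $unescape_re = join '|', map { quotemeta } keys %unescape;

# parse_items (hypothetical helper): returns one hash ref per <item>,
# with title, link, and description defaulting to the empty string.
sub parse_items {
    my ($xml) = @_;
    my @items;

    # (?!s) keeps the pattern from matching an <items> element.
    while ($xml =~ m{<item(?!s).*?>(.*?)</item>}gis) {
        my $body = $1;    # copy before the inner matches overwrite $1
        my %item;
        for my $field (qw(title link description)) {
            my ($value) = $body =~ m{<$field>(.*?)</$field>}is;
            $value = '' unless defined $value;
            # Single pass, so '&amp;lt;' becomes '&lt;' and is not unescaped twice.
            $value =~ s/($unescape_re)/$unescape{$1}/g;
            $item{$field} = $value;
        }
        push @items, \%item;
    }
    return @items;
}

my @items = parse_items($rss);
printf "%s | %s | %s\n", @{$_}{qw(title link description)} for @items;
```

Run against the sample feed, the first item's title comes out as `First & foremost` and its description as `<p>Hello</p>`; the second item, which has only a link, gets empty strings for the missing fields.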
