Parsing a large local HTML file with Perl or PHP
I have a large HTML document that I need to parse so I can pull out one part of it: schule.php?schulnr=80287&lschb=
How can I parse this out of markup like the following?
<td> <a href="schule.php?schulnr=80287&lschb=" target="_blank"> <center><img border=0 height=16 width=15 src="sh_info.gif"></center> </a> </td>
I would love to hear from you.
You could do it this way (it's not Perl, but more "visual"):
- Load the document in your browser, if possible
- Install the Firebug extension/add-on
- Install the FirePath extension
Then copy and paste this XPath expression into the text field labeled "XPath:"
//a[contains(@href, "schule")]/@href
and click the "Eval" button.
There are also tools for the command line, e.g. "xmllint" (for Unix):
xmllint --html --xpath '//a[contains(@href, "schule")]/@href' myfile.php.or.html
You can then do any further processing on that output.
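If you would rather script the same idea than click through the browser, here is a minimal sketch. It is in Python, not Perl or PHP, and uses only the standard library, so treat it as an illustration of the approach rather than a drop-in answer. It does roughly what the XPath expression above does (collect every href on an <a> tag that contains "schule") and then splits out the schulnr value. The names SchuleLinkParser, extract_links, extract_links_from_file and school_number are made up for this example, and the UTF-8 encoding and the chunk size are assumptions about your file.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs


class SchuleLinkParser(HTMLParser):
    """Collect href values of <a> tags whose href contains a substring."""

    def __init__(self, needle="schule"):
        super().__init__()
        self.needle = needle
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.needle in value:
                self.links.append(value)


def extract_links(html_text, needle="schule"):
    """Return matching hrefs from an HTML string."""
    parser = SchuleLinkParser(needle)
    parser.feed(html_text)
    parser.close()
    return parser.links


def extract_links_from_file(path, needle="schule", chunk_size=65536):
    """Same as extract_links, but feeds a large file to the parser in chunks.

    Assumes the file is UTF-8; undecodable bytes are replaced.
    """
    parser = SchuleLinkParser(needle)
    with open(path, encoding="utf-8", errors="replace") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            parser.feed(chunk)
    parser.close()
    return parser.links


def school_number(href):
    """Return the schulnr query parameter of a link, or None if absent."""
    query = parse_qs(urlsplit(href).query, keep_blank_values=True)
    values = query.get("schulnr")
    return values[0] if values else None


# The table cell from the question, shortened to the part that matters.
sample = '<td> <a href="schule.php?schulnr=80287&lschb=" target="_blank"> x </a> </td>'

if __name__ == "__main__":
    for link in extract_links(sample):
        print(link, school_number(link))
```

For the real document you would call extract_links_from_file("myfile.php.or.html") instead of passing a string. Feeding the parser in chunks keeps memory use flat, which matters more as the file grows.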