php - Using Zend_Dom as a screen scraper
How? More to the point, this:
$url = 'http://php.net/manual/en/class.domelement.php';
$client = new Zend_Http_Client($url);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);
$result = $dom->query('div.note');
Zend_Debug::dump($result);
gives me this:
object(Zend_Dom_Query_Result)#867 (7) {
  ["_count":protected] => NULL
  ["_cssQuery":protected] => string(8) "div.note"
  ["_document":protected] => object(DOMDocument)#79 (0) {
  }
  ["_nodeList":protected] => object(DOMNodeList)#864 (0) {
  }
  ["_position":protected] => int(0)
  ["_xpath":protected] => NULL
  ["_xpathQuery":protected] => string(33) "//div[contains(@class, ' note ')]"
}
And I cannot for the life of me figure out how to do anything with this.
I want to extract various parts of the retrieved data (that being the div with the class "note" and any of the elements inside it, such as the text and URLs) but cannot get anything working.
Someone pointed me to the DOMElement class over at php.net, but when I try using some of the methods mentioned there, I can't get things to work. How would I grab a chunk of HTML from a page and go through it grabbing the various parts? How do I inspect the object I am getting back so I can at least figure out what is in it?
Help?
The Iterator implementation of Zend_Dom_Query_Result returns a DOMElement object for each iteration:
foreach ($result as $element) {
    var_dump($element instanceof DOMElement); // true
}
From the $element variable, you can use any DOMElement method:
foreach ($result as $element) {
    echo 'element id: ' . $element->getAttribute('id') . PHP_EOL;
    if ($element->hasChildNodes()) {
        echo 'element has child nodes' . PHP_EOL;
    }
    $aNodes = $element->getElementsByTagName('a');
    // etc
}
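To make the "text and URLs" part concrete, here is a minimal sketch that uses only PHP's built-in DOM extension (DOMDocument and DOMXPath) instead of Zend_Dom, so it runs without the framework. The sample markup, the helper name extractNotes, and the XPath class test are assumptions made for illustration, not something from the question. Inside a Zend_Dom_Query_Result loop, the same DOMElement calls should apply to $element.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = <<<HTML
<html><body>
<div class="note" id="n1">First note <a href="http://example.com/a">A</a></div>
<div class="other">ignored</div>
<div class="note big" id="n2">Second <a href="/b">B</a> <a href="/c">C</a></div>
</body></html>
HTML;

/**
 * Hypothetical helper: collect the id, text, and link URLs of every
 * div whose class attribute contains the word "note".
 */
function extractNotes($html)
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumed class test: matches "note" as a whole word in @class.
    $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' note ')]";

    $notes = array();
    foreach ($xpath->query($query) as $element) {
        $urls = array();
        foreach ($element->getElementsByTagName('a') as $a) {
            $urls[] = $a->getAttribute('href');
        }
        $notes[] = array(
            'id'   => $element->getAttribute('id'),
            'text' => trim($element->textContent),
            'urls' => $urls,
        );
    }

    return $notes;
}

$notes = extractNotes($html);
print_r($notes);
```

textContent returns the concatenated text of the element and all of its descendants, which is usually what "the text of the div" means when scraping.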
You can also access the owning document from the element, or you can use Zend_Dom_Query_Result to do so:
$document1 = $element->ownerDocument;
$document2 = $result->getDocument();
var_dump($document1 === $document2); // true
echo $document1->saveHTML();
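On the "how do I inspect the object" part: dumping DOM objects tends to show very little, as the empty DOMDocument and DOMNodeList in the dump above suggest, so printing a node's name, attributes, and markup is often more informative. The following is a sketch using the built-in DOM extension only. The sample HTML and the helper name describeNode are hypothetical.

```php
<?php
// Hypothetical sample document.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div class="note" id="n1">Hello <b>world</b></div></body></html>');

/**
 * Hypothetical helper: summarize a node as its tag name, attributes,
 * and serialized markup.
 */
function describeNode(DOMNode $node)
{
    $attributes = array();
    if ($node->hasAttributes()) {
        foreach ($node->attributes as $attr) {
            $attributes[$attr->name] = $attr->value;
        }
    }

    return array(
        'name'       => $node->nodeName,
        'attributes' => $attributes,
        // saveHTML() accepts a node argument as of PHP 5.3.6.
        'html'       => $node->ownerDocument->saveHTML($node),
    );
}

$div = $doc->getElementsByTagName('div')->item(0);
$info = describeNode($div);
print_r($info);
```

With a Zend_Dom_Query_Result, count($result) should also report how many elements matched, assuming the class implements Countable as it does in Zend Framework 1.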