python - How to retrieve these elements from a webpage?
I have a webpage whose HTML contains these elements:
<div class="content_page">
    <a href="/earth" class="nametessera">earth</a>
</div>
<div class="content_page">
    <a href="/world" class="nametessera">world</a>
</div>
<div class="content_page">
    <a href="/planet" class="nametessera">planet</a>
</div>
...
I need to retrieve /earth, /world, /planet, etc. In other words, I need to retrieve the links of every tag with class "nametessera".
How can I do this with Python?
Short answer:
Use BeautifulSoup to parse the page and extract the URLs, then use urllib2 (urllib.request in Python 3) or pycurl to download the mentioned URLs.
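The download half of that advice could look roughly like the sketch below. It is written for Python 3, where urllib2's functionality lives in urllib.request, and uses only the standard library. The base URL is a made-up placeholder, and the helper names absolute_urls and download are my own, not part of any library. Since the hrefs in the question are relative (/earth and so on), they have to be resolved against the address of the page they came from before they can be fetched.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical base URL: replace with the address of the page being scraped.
BASE_URL = "http://example.com/index.html"

# Relative links as they appear in the question's HTML.
hrefs = ["/earth", "/world", "/planet"]


def absolute_urls(base_url, links):
    """Resolve each relative link against the URL of the page it came from."""
    return [urljoin(base_url, link) for link in links]


def download(url, timeout=10):
    """Fetch one URL and return the raw response body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


urls = absolute_urls(BASE_URL, hrefs)
print(urls)
```

Calling download(url) for each entry in urls would then fetch the pages; it is left out of the run above because the base URL is only a placeholder.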
[Edit:]
Adding examples below that use the href contained in each div:
>>> # soup is assumed to be a BeautifulSoup object built from the page's HTML
>>> alldiv = soup.findAll('div', {"class": "content_page"})
>>> for div in alldiv:
...     print(div.a)
...
<a href="/earth" class="nametessera">earth</a>
<a href="/world" class="nametessera">world</a>
<a href="/planet" class="nametessera">planet</a>
>>> for div in alldiv:
...     print(div.a['href'])
...
/earth
/world
/planet
Similarly, to select the links directly by their class:
allhref = soup.findAll('a', {"class": "nametessera"})
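For anyone on Python 3, a self-contained version of the same idea might look like the following. This is only a sketch: it assumes the bs4 package (BeautifulSoup 4) is installed, and it uses the HTML snippet from the question as a string in place of a real downloaded page. In bs4, find_all is the current spelling of findAll, and the class_ keyword filters on the CSS class.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real script would use the downloaded page.
html = """
<div class="content_page"><a href="/earth" class="nametessera">earth</a></div>
<div class="content_page"><a href="/world" class="nametessera">world</a></div>
<div class="content_page"><a href="/planet" class="nametessera">planet</a></div>
"""

# "html.parser" is the parser bundled with Python, so no extra parser is required.
soup = BeautifulSoup(html, "html.parser")

# class_ matches the CSS class; href=True skips any anchor that lacks an href attribute.
links = [a["href"] for a in soup.find_all("a", class_="nametessera", href=True)]
print(links)
```

With the sample markup above, links holds the three relative paths in document order.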