
I once wrote a scraper for a Yellow Pages site in Python. It pulled down the business category, name, telephone number and email address for every entry and wrote them to a nicely formatted spreadsheet. The hours I spent learning the ElementTree API and XPath expressions have paid for themselves several times over: I now have a spreadsheet segmented by business category, with email addresses that I target via email marketing.
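For anyone curious what the ElementTree and XPath part looks like, here is a rough sketch, not the original scraper. The markup, the class names, the `data-category` attribute and the helper names `parse_listings` and `to_csv` are all made up for illustration; a real listing page will be structured differently and is often not well-formed XML, in which case the standard-library parser will refuse it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup. Real pages will differ.
SAMPLE = """
<html><body>
  <div class="listing" data-category="Plumbers">
    <span class="name">Acme Plumbing</span>
    <span class="phone">555-0100</span>
    <span class="email">info@acme.example</span>
  </div>
  <div class="listing" data-category="Bakers">
    <span class="name">Rise Bakery</span>
    <span class="phone">555-0101</span>
  </div>
</body></html>
"""

FIELDS = ["category", "name", "phone", "email"]


def parse_listings(markup):
    """Return one dict per listing, with empty strings for missing fields."""
    root = ET.fromstring(markup.strip())
    rows = []
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    for node in root.findall(".//div[@class='listing']"):
        row = {"category": node.get("data-category", "")}
        for field in ("name", "phone", "email"):
            text = node.findtext(f"span[@class='{field}']")
            row[field] = (text or "").strip()
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_listings(SAMPLE)))
```

The category lives on the listing element in this sketch, so segmenting the spreadsheet by category is just a sort or filter on the first column.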


As someone responsible for search at a yellow pages company, I can confirm that most YP websites have little to no protection against this. Company information is usually public anyway. We just make it very easy for you to get it :)



