
I’m looking for ugly URLs that break normal scrapers

DEV Community·Zee·about 1 month ago
#webscraping #api #llm #automation #pages #docs

Most scraper demos use friendly pages. A blog post. A docs page. A fake ecommerce product. Something clean enough that BeautifulSoup could probably manage it after a coffee.

That is not where web extraction gets annoying. The annoying cases are the ugly ones:

- JavaScript-rendered pages
- pages with no stable CSS selectors
- pages where the useful data is mixed into layout sludge
- Cloudflare / bot-wall weirdness
- vendor pages where the table changes every week
- docs pages where the answer is spread across several sections
- pages that look simple in a browser but return nonsense to curl

Those are the URLs I actually care about.…
