Next.js job board: reliable scrapes with leases

DEV Community · Sathish · about 1 month ago
#nextjs #typescript #postgres #webdev #error #cron

I stopped double-scraping jobs with a Postgres lease table. I made cron runs idempotent with a single SQL function. I got rid of “why is this job missing?” by logging per-source runs. Works on Next.js 14 + Supabase + Vercel cron.

Context

I’m building a job board for Psychiatric Mental Health Nurse Practitioners. It scrapes 200+ jobs daily. Multiple sources. Different HTML. Different APIs. The usual pain.

At ~8,000 active listings and ~2,000 companies, one bad cron day hurts. Duplicates show up. Or worse: jobs silently don’t.

My first version was “just run scraping in a cron route.” Brutal. Vercel cron can overlap. Deploys can overlap. And if a source gets slow, the next run starts anyway.

I spent 4 hours “fixing dedupe” and most of it was wrong because the real issue was concurrency.

So I moved coordination into Postgres. Not Redis. Not queues. Just SQL.

1) I stopped trusting cron timing. I added a lease.

Two cron invocations can run at the same minute. And if you redeploy while a scrape is running?…
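The lease idea can be sketched roughly as follows. This is a minimal illustration, not the article's actual schema or SQL function: the table name `scrape_leases`, its columns, and every TypeScript name here are assumptions made up for the example. The SQL relies on Postgres `INSERT ... ON CONFLICT ... DO UPDATE ... WHERE`, which returns a row only when the lease was free or already expired. The in-memory class mirrors the same rule so the behavior can be checked without a database.

```typescript
// Hypothetical lease query. Assumes a table like:
//   scrape_leases(source text primary key, holder text, expires_at timestamptz)
// A returned row means this run holds the lease; zero rows means another run does.
const ACQUIRE_LEASE_SQL = `
  insert into scrape_leases (source, holder, expires_at)
  values ($1, $2, now() + make_interval(secs => $3))
  on conflict (source) do update
    set holder = excluded.holder,
        expires_at = excluded.expires_at
    where scrape_leases.expires_at <= now()
  returning holder
`;

// Builds a parameterized query object (text + values) for the statement above.
function acquireLeaseQuery(source: string, holder: string, ttlSeconds: number) {
  return { text: ACQUIRE_LEASE_SQL, values: [source, holder, ttlSeconds] };
}

type Lease = { holder: string; expiresAt: number };

// In-memory model of the same semantics, useful for reasoning and tests.
class InMemoryLeases {
  private leases = new Map<string, Lease>();

  // Returns true if `holder` now owns the lease for `source`.
  tryAcquire(source: string, holder: string, ttlMs: number, now: number): boolean {
    const current = this.leases.get(source);
    // A live lease blocks every caller, which is what stops overlapping runs.
    if (current && current.expiresAt > now) return false;
    // Free or expired: take it over, so a crashed run cannot block forever.
    this.leases.set(source, { holder, expiresAt: now + ttlMs });
    return true;
  }

  // Only the current holder may release; a stale run's release is a no-op.
  release(source: string, holder: string): void {
    const current = this.leases.get(source);
    if (current && current.holder === holder) this.leases.delete(source);
  }
}
```

In a cron route, a run would try to acquire the lease for each source, skip the source when acquisition fails, and release the lease when done. The TTL should be longer than a normal scrape of that source, so a slow run is not taken over mid-flight.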
