What’s your rule for when a CronJob problem deserves a page?

Reddit r/kubernetes·u/HrvoslavJankovic_·about 1 month ago

I’m dealing with a few K8s CronJobs that are important, but not all of them are “wake someone up at 3 a.m.” important.

Some fail once and recover on the next run, some get delayed, some quietly stop being useful long before they technically fail. I’m trying to find a sane line between “ignore it” and “page for every hiccup.”

If you run a lot of CronJobs, how do you decide what becomes a ticket, what becomes an alert, and what becomes a page?
