-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an error counter for internal errors in the HTTP handler #594
Conversation
errCnt.WithLabelValues("encoding") | ||
if err := opts.Registry.Register(errCnt); err != nil { | ||
if are, ok := err.(prometheus.AlreadyRegisteredError); ok { | ||
errCnt = are.ExistingCollector.(*prometheus.CounterVec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a global would seem cleaner here, rather than poking around error types to the same effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A global CounterVec wouldn't help here. We would still need the same error type exercise. Even if you re-register the same collector, you'll get an error.
However, if we used a global here, you couldn't use separate counters when passing different registries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that had just occurred to me.
prometheus/promhttp/http.go
Outdated
prometheus.DefaultRegisterer, HandlerFor(prometheus.DefaultGatherer, HandlerOpts{}), | ||
prometheus.DefaultRegisterer, HandlerFor( | ||
prometheus.DefaultGatherer, | ||
HandlerOpts{Registry: prometheus.DefaultRegisterer}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this makes sense for the default handler, as gathering will fail the whole scrape anyway and once we've OpenMetrics it has an explicit EOF that'll catch encoding issues. Given that a little wary of adding two more series for ~every target, for cases that don't/won't happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. 99% of the usage for the new metric will be with ContinueOnError
. However, the incident that triggered #542 would happen even with the default handler.
But yeah, adding metrics for the default case might be confusing. I'll remove it here.
The doc comments explain the rationale in a quite detailed way. Fixes #543 and #542 Signed-off-by: beorn7 <[email protected]>
👍 |
Thanks. Working on a test now. |
Signed-off-by: beorn7 <[email protected]>
Test added, will merge on green. |
The doc comments explain the rationale in a quite detailed way.
Fixes #543 and #542
I'll add a test once we have decided this is the way to go.