I'm submitting this PR as a proposal to provide a mechanism for tracking the duration any given health check implementation took, providing the information in the details of the Health object.
In reality, this tracks the nanoseconds between the Health.Builder construction time and the moment the builder's build() method is invoked. Though this may differ very slightly from the execution time of the doHealthCheck() method itself, the difference would be negligible.
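To illustrate the idea, here is a minimal sketch of measuring the time between builder construction and build(). It is not the code in this PR: TimedHealthBuilder, its injectable clock, and the durationNanos detail key are hypothetical names chosen for illustration, and the class is a plain stand-in rather than Spring Boot's actual Health.Builder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Hypothetical stand-in for a health builder (NOT Spring Boot's Health.Builder).
 * It records a start time when constructed and adds the elapsed nanoseconds
 * to the details when build() is called.
 */
final class TimedHealthBuilder {

    private final LongSupplier clock;   // injectable so the timing can be tested
    private final long startNanos;      // captured at construction time
    private final Map<String, Object> details = new LinkedHashMap<>();
    private String status = "UNKNOWN";

    TimedHealthBuilder() {
        this(System::nanoTime);
    }

    TimedHealthBuilder(LongSupplier clock) {
        this.clock = clock;
        this.startNanos = clock.getAsLong();
    }

    TimedHealthBuilder up() {
        this.status = "UP";
        return this;
    }

    TimedHealthBuilder withDetail(String key, Object value) {
        this.details.put(key, value);
        return this;
    }

    /** Returns the status and details, plus the elapsed time since construction. */
    Map<String, Object> build() {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("status", status);
        result.putAll(details);
        // "durationNanos" is an assumed key name, used here only for illustration.
        result.put("durationNanos", clock.getAsLong() - startNanos);
        return result;
    }
}
```

Because the start time is taken in the constructor, anything an indicator does between creating the builder and calling build() is included in the reported duration, which is the slight blurring mentioned above.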
The reason timing would be beneficial (at least for me) is that in some Boot applications the time the overall health check takes to return (based on monitoring metrics on the actuator endpoint) keeps increasing until Kubernetes begins restarting our services. I am trying to identify which individual health check may be responsible, and I also feel this is useful information for anyone showing the details of the health endpoint.
This may not be a desired solution, so please let me know your thoughts. I did attempt to capture these timings with an @Around aspect, but could not get granular enough for items like the DataSourceHealthIndicator, which was (for me) just 1 of 4 such health indicators wrapped in a CompositeHealthIndicator because my particular app has 4 DataSources.
I look forward to any and all feedback. If this is accepted, I would backport it to at least 2.1.x, and could go further back based on guidance and suggestions from contributing members.
Comment From: snicoll
@dugshnay thanks for the PR but this was already suggested and we've declined the request, see #9238 for some background.
Comment From: dugshnay
@snicoll thank you for the feedback! Having read the notes on the referenced issue: we do monitor the underlying resources, yet they report healthy despite the growing slowness of the health check from a node in the cluster. I'll keep searching for a way to pinpoint this. Have a great day!