feat(#5352): healthgroups endpoint #5416
 * specific data in {@link HealthGroupsCache} on receiving an
 * {@link InstanceDeregisteredEvent}.
 */
public class HealthGroupsCacheCleanupTrigger extends AbstractEventHandler<InstanceDeregisteredEvent> {
This seems like a bit of overkill to me.
It would be enough to invoke the deletion in the StatusUpdater with a doOnNext, as is already done to persist the groups (actually, it's enough to add an if there to check the event type and call either the save or the delete method).
We're scheduling a thread to constantly check for a single event type, which rarely occurs, while the StatusUpdater is already handling all of them anyway.
StatusUpdater polls the /health endpoint; it does not receive InstanceDeregisteredEvent. A deregistered instance simply stops being polled (doUpdateStatus returns early if !instance.isRegistered()), so eviction would never fire there.
I'll investigate a little further and will probably either
(a) keep a lightweight event subscription but reuse an existing handler/scheduler, or
(b) accept the small overhead, since cleanup-on-deregister is the established pattern in SBA.
> StatusUpdater polls the /health endpoint; it does not receive InstanceDeregisteredEvent.

Indeed. For a moment I had the impression it was handled there because the journal view shows all of them.

> I'll investigate a little further and probably will either
> (a) keep a lightweight event subscription but reuse an existing handler/scheduler, or
> (b) accept the small overhead since cleanup-on-deregister is the established pattern in SBA

Wouldn't it be enough to have a Spring @EventListener for the specific application event and remove the entry from the cache when it's received?
SBA's InstanceDeregisteredEvent is not a Spring ApplicationEvent. It's a domain event emitted through a Reactor Sinks.Many (InstanceEventPublisher) and consumed via Flux.from(Publisher). These events are never published through Spring's ApplicationEventPublisher, so a plain @EventListener method would never fire.
To still fulfill your request, I implemented a lightweight reactive subscriber bean with no dedicated scheduler and a self-contained lifecycle.
> To still fulfill your request

You make me look like the bad guy here :)
There are already performance issues around memory, so we should try not to add more now at the CPU level.
 describe('SSE reactive updates', () => {
-  it('should call fetchHealth once on mount, not on SSE version changes', async () => {
+  it('should call fetchCachedHealthGroups once on mount, not on SSE version changes', async () => {
Actually, I think we should now call fetchHealthGroups when an SSE is received; otherwise stale data may be used.
This comment is not specific to this test case; it's a more general statement.
Good catch, and agreed in principle. The current else branch only invalidates the per-group details (group.data = null) but keeps the previously fetched group list, so a group added/removed at runtime would indeed show stale entries until the instance id changes or the view is remounted (although this seems to be an edge case).
I'll update onInstanceChanged so the same-instance branch also calls fetchHealthGroups() (which already resets healthGroupOpenStatus/healthGroupLoadingMap and clears stale data), making the displayed group list self-correcting on SSE updates.
Yep.
Also because I learned from @SteKoe, in a previous PR about a Thymeleaf fix, that configuration properties can be changed at runtime from SBA.
Probably, to cover this case as well, we would need SSE for the health groups too, but I don't know how worthwhile that really is, unless a STATUS_CHANGED event is also created when the health response body changes in general and not just the status (UP, DOWN and so on).
In terms of performance and code clarity, it would then indeed be better to update the instance model so that it also includes the health groups.
        this.cache.remove(instanceId);
    }
    else {
        this.cache.put(instanceId, List.copyOf(groups));
I think we should ensure that groups are unique, either here by using a LinkedHashSet instead of a List or in the UI to avoid potentially duplicated entries.
We're now deduplicating directly in StatusUpdater.extractAndCacheHealthGroups
In terms of design, I would have done it here (it's a responsibility of the repository to decide what to store and how), but it's fine also like this for me.
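For the UI-side alternative mentioned above, a minimal sketch of a defensive deduplication step could look like the following. The function name `uniqueGroupNames` is hypothetical and not part of the PR; it only assumes the endpoint returns a plain array of group names.

```typescript
// Hypothetical helper (not from the PR): removes duplicate group names
// while keeping the order in which each name first appears.
function uniqueGroupNames(names: string[]): string[] {
  // A Set iterates in insertion order, so the first occurrence wins.
  return Array.from(new Set(names));
}
```

This mirrors what a LinkedHashSet would do on the server side; doing it in both places would be redundant but harmless.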
-if (Array.isArray(res.data.groups)) {
-  this.healthGroups = res.data.groups.map((name: string) => ({
+if (Array.isArray(res.data)) {
+  this.healthGroups = res.data.map((name: string) => ({
Hi @ulischulte.
Will this assignment trigger a UI re-rendering?
If so, we should maybe check the content of the response first and set the new state (health groups plus a reset of the accordion state) only if the collection of returned group names has actually changed.
Otherwise, on every fetch, we'll be re-rendering the UI even though nothing has changed for the health groups.
Yes, it will trigger a re-render of the UI. I will add a check to prevent this for UX reasons, since it would also reset the collapsed state. Changing groups in the backend will not happen that often, though.
> Changing groups in the backend will not happen that often, though.

I was asking exactly because they will be updated very rarely.
fetchHealthGroups() now compares incoming vs. current group names and returns early when unchanged, preserving accordion open/loading state.
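The early return described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's actual code: the `HealthGroup` shape and the helper names `sameGroupNames` and `applyGroupNames` are hypothetical, and the comparison is deliberately order-sensitive.

```typescript
// Hypothetical shape of one entry in the component's group list.
interface HealthGroup {
  name: string;
  data: unknown; // lazily loaded details; null until fetched
}

// True when both lists hold the same names in the same order.
function sameGroupNames(current: HealthGroup[], incoming: string[]): boolean {
  return (
    current.length === incoming.length &&
    current.every((group, i) => group.name === incoming[i])
  );
}

// Returns the existing array untouched when nothing changed, so no
// reactive update fires and open/loading state can be preserved.
function applyGroupNames(current: HealthGroup[], incoming: string[]): HealthGroup[] {
  if (sameGroupNames(current, incoming)) {
    return current;
  }
  return incoming.map((name) => ({ name, data: null }));
}
```

Returning the same array reference is what keeps a reactive framework from treating the fetch as a state change.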
Hi @ulischulte. What do you think about the idea of adding a boolean metadata property that the monitored applications can populate to indicate whether they use health groups?
Moreover, another optimization (to be verified if it's indeed possible; for now it's just an idea) would be to return not just the group names, but also the names of the health indicators they include. For example:
I'm still in favour of adding them to the
  this.healthGroupLoadingMap = {};
}
// Re-fetch the (server-cached) group list on every instance change
this.fetchHealthGroups();
We should do this only if the instance has changed in some meaningful way regarding its status, which means not doing it if the event sent is INFO_UPDATED.
If the process InfoContributor is enabled, the data returned by https://docs.spring.io/spring-boot/api/rest/actuator/info.html changes on every poll, which would mean we're re-fetching the health groups over and over for no real reason.
Also, if only the details changed from one event to another but not the status code, I think it's pointless to fetch them again.
Just replaced watch: instance with a watch on a computed key = id:status:statusTimestamp. It deliberately excludes version, so INFO_UPDATED events (which bump version every poll with the process InfoContributor) no longer trigger refetches, while genuine status changes still do.
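A rough sketch of the key idea, under assumptions: the `InstanceSnapshot` interface and its field names (`statusInfo.status`, `statusTimestamp`, `version`) are guesses at the relevant subset of the instance model, and `healthWatchKey` / `shouldRefetchHealthGroups` are hypothetical names. In the component, the key would be a computed value that the watcher observes.

```typescript
// Hypothetical subset of the instance model used to build the key.
interface InstanceSnapshot {
  id: string;
  version: number; // bumped on every event, including INFO_UPDATED
  statusInfo: { status: string };
  statusTimestamp: string;
}

// The key deliberately leaves out `version`.
function healthWatchKey(instance: InstanceSnapshot): string {
  return `${instance.id}:${instance.statusInfo.status}:${instance.statusTimestamp}`;
}

// A refetch is only warranted when the key differs between two snapshots.
function shouldRefetchHealthGroups(prev: InstanceSnapshot, next: InstanceSnapshot): boolean {
  return healthWatchKey(prev) !== healthWatchKey(next);
}
```

Because a string key is compared by value, two snapshots that differ only in `version` produce the same key and the watcher stays quiet.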
Probably even id:status is enough.
I don't know if using a less complex key could bring any benefit in terms of perf to the UI or not, though.
@ulischulte @SteKoe given that what is being watched changes here, would this change maybe fix #5482 as well?
Or is the whole refresh triggered at a higher level?
-.exchangeToMono(this::convertStatusInfo)
+.exchangeToMono((response) -> this.convertStatusInfo(response, instance.getId()))
 .log(log.getName(), Level.FINEST)
 .timeout(getTimeoutWithMargin())
Can we maybe move this up after the uri "operator"?
Given that the cache is a ConcurrentMap and changing the timeout is not really possible as of today, it would be great not to have Reactor count the locking and the context switching between threads as part of the overall timeout.
Yes, it may be an option.
On Fri, 3 Jul 2026 at 17:07, Ulrich Schulte wrote:

> In spring-boot-admin-server/src/main/java/de/codecentric/boot/admin/server/services/StatusUpdater.java:
>
> @@ -80,7 +83,7 @@ protected Mono<Instance> doUpdateStatus(Instance instance) {
>  return this.instanceWebClient.instance(instance)
>  .get()
>  .uri(Endpoint.HEALTH)
> - .exchangeToMono(this::convertStatusInfo)
> + .exchangeToMono((response) -> this.convertStatusInfo(response, instance.getId()))
>  .log(log.getName(), Level.FINEST)
>  .timeout(getTimeoutWithMargin())
>
> Good point. We could move the ConcurrentMap write outside the timeout window by making convertStatusInfo return the parsed groups alongside the StatusInfo, and performing healthGroupsCache.updateGroups(...) in a .doOnNext placed after .timeout().
Closes #5352:
Added caching and a dedicated REST endpoint for health groups.
When SBA polls an application's /health endpoint, the response may include a groups field (e.g., ["liveness", "readiness"]). Previously this data was fetched on-the-fly from the client. Now:
Key design decisions