Describe the bug: Using Spring Cloud 3.1.3, we ran a scenario in which the git repository was reachable when Spring Cloud started and we then removed access to it (so as not to impact others, we ran it behind tcpproxy and then stopped the proxy). We expected the healthcheck to trigger.

We saw a stack trace occur:

2022-08-23 17:53:38.498 DEBUG 867680 --- [alth-1-thread-3] .s.c.c.s.s.GitCredentialsProviderFactory : Constructing UsernamePasswordCredentialsProvider for URI https://localhost:8433/SHY/shy-cloud-configuration.git
2022-08-23 17:53:38.499 DEBUG 867680 --- [alth-1-thread-3] .s.c.c.s.s.GitCredentialsProviderFactory : Constructing GitSkipSslValidationCredentialsProvider for URI https://localhost:8433/SHY/shy-cloud-configuration.git
2022-08-23 17:53:38.526  WARN 867680 --- [alth-1-thread-3] .c.s.e.MultipleJGitEnvironmentRepository : Could not fetch remote for local remote: https://localhost:8433/SHY/shy-cloud-configuration.git
2022-08-23 17:53:38.528 DEBUG 867680 --- [alth-1-thread-3] .c.s.e.MultipleJGitEnvironmentRepository : Stacktrace for: Could not fetch remote for local remote: https://localhost:8433/SHY/shy-cloud-configuration.git

org.eclipse.jgit.api.errors.TransportException: https://localhost:8433/SHY/shy-cloud-configuration.git: connection failed
    at org.eclipse.jgit.api.FetchCommand.call(FetchCommand.java:224)
    at org.springframework.cloud.config.server.environment.JGitEnvironmentRepository.fetch(JGitEnvironmentRepository.java:551)
    at org.springframework.cloud.config.server.environment.JGitEnvironmentRepository.refresh(JGitEnvironmentRepository.java:298)
    at org.springframework.cloud.config.server.environment.JGitEnvironmentRepository.getLocations(JGitEnvironmentRepository.java:262)
    at org.springframework.cloud.config.server.environment.MultipleJGitEnvironmentRepository.getLocations(MultipleJGitEnvironmentRepository.java:139)
    at org.springframework.cloud.config.server.environment.AbstractScmEnvironmentRepository.findOne(AbstractScmEnvironmentRepository.java:55)
    at org.springframework.cloud.config.server.environment.MultipleJGitEnvironmentRepository.findOneFromCandidate(MultipleJGitEnvironmentRepository.java:188)
    at org.springframework.cloud.config.server.environment.MultipleJGitEnvironmentRepository.findOne(MultipleJGitEnvironmentRepository.java:173)
    at org.springframework.cloud.config.server.environment.CompositeEnvironmentRepository.findOne(CompositeEnvironmentRepository.java:64)
    at org.springframework.cloud.config.server.config.ConfigServerHealthIndicator.doHealthCheck(ConfigServerHealthIndicator.java:72)
    at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
    at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37)
    at org.springframework.boot.actuate.health.HealthEndpoint.getHealth(HealthEndpoint.java:94)
    at org.springframework.boot.actuate.health.HealthEndpoint.getHealth(HealthEndpoint.java:41)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getLoggedHealth(HealthEndpointSupport.java:172)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:145)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateContribution(HealthEndpointSupport.java:156)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:141)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:110)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:81)
    at org.springframework.boot.actuate.health.HealthEndpoint.health(HealthEndpoint.java:88)
    at org.springframework.boot.actuate.health.HealthEndpoint.health(HealthEndpoint.java:78)
    at com.cba.cas.shy.health.starter.ShyHealthAutoConfiguration$HealthRecorderConfiguration$healthRecorder$1.invoke(ShyHealthAutoConfiguration.kt:172)
    at com.cba.cas.shy.health.starter.ShyHealthAutoConfiguration$HealthRecorderConfiguration$healthRecorder$1.invoke(ShyHealthAutoConfiguration.kt:171)
    at com.cba.cas.shy.health.starter.HealthRecorder.determineOverallHealth(HealthRecorder.kt:68)
    at com.cba.cas.shy.health.starter.HealthRecorder.access$determineOverallHealth(HealthRecorder.kt:22)
    at com.cba.cas.shy.health.starter.HealthRecorder$start$1.invoke(HealthRecorder.kt:54)
    at com.cba.cas.shy.health.starter.HealthRecorder$start$1.invoke(HealthRecorder.kt:52)
    at com.cba.cas.shy.health.starter.ShyHealthAutoConfiguration$healthFixedDelayScheduler$1$1.invoke(ShyHealthAutoConfiguration.kt:147)
    at com.cba.cas.shy.health.starter.ShyHealthAutoConfiguration$healthFixedDelayScheduler$1$1.invoke(ShyHealthAutoConfiguration.kt:144)
    at com.cba.cas.shy.extensions.lang.LangExtensionsKt.scheduleWithFixedDelayFunc$lambda-11(LangExtensions.kt:503)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.eclipse.jgit.errors.TransportException: https://localhost:8433/SHY/shy-cloud-configuration.git: connection failed
    at org.eclipse.jgit.transport.TransportHttp.connect(TransportHttp.java:729)
    at org.eclipse.jgit.transport.TransportHttp.openFetch(TransportHttp.java:465)
    at org.eclipse.jgit.transport.FetchProcess.executeImp(FetchProcess.java:142)
    at org.eclipse.jgit.transport.FetchProcess.execute(FetchProcess.java:94)
    at org.eclipse.jgit.transport.Transport.fetch(Transport.java:1309)
    at org.eclipse.jgit.api.FetchCommand.call(FetchCommand.java:213)
    ... 37 common frames omitted
Caused by: java.net.ConnectException: Connect to localhost:8433 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused) localhost
    at org.eclipse.jgit.util.HttpSupport.response(HttpSupport.java:213)
    at org.eclipse.jgit.transport.TransportHttp.connect(TransportHttp.java:654)
    ... 42 common frames omitted

Tracing through, it seems that whilst the source expects to catch an exception (ConfigServerHealthIndicator, line 91), that exception never occurs, because JGitEnvironmentRepository swallows it:

protected void doHealthCheck(Health.Builder builder) throws Exception {
        builder.up();
        List<Map<String, Object>> details = new ArrayList<>();
        for (String name : this.repositories.keySet()) {
            Repository repository = this.repositories.get(name);
            String application = (repository.getName() == null) ? name : repository.getName();
            String profiles = repository.getProfiles();

            try {
                Environment environment = this.environmentRepository.findOne(application, profiles,
                        repository.getLabel(), false);

                HashMap<String, Object> detail = new HashMap<>();
                detail.put("name", environment.getName());
                detail.put("label", environment.getLabel());
                if (environment.getProfiles() != null && environment.getProfiles().length > 0) {
                    detail.put("profiles", Arrays.asList(environment.getProfiles()));
                }

                if (!CollectionUtils.isEmpty(environment.getPropertySources())) {
                    List<String> sources = new ArrayList<>();
                    for (PropertySource source : environment.getPropertySources()) {
                        sources.add(source.getName());
                    }
                    detail.put("sources", sources);
                }
                details.add(detail);
            }
            catch (Exception e) { // line 91 of ConfigServerHealthIndicator
                logger.debug("Could not read repository: " + application, e);
                HashMap<String, String> map = new HashMap<>();
                map.put("application", application);
                map.put("profiles", profiles);
                builder.withDetail("repository", map);
                builder.down(e);
                return;
            }
        }
        builder.withDetail("repositories", details);

    }

In JGitEnvironmentRepository.fetch you can see that the exception is logged and not thrown:

protected FetchResult fetch(Git git, String label) {
        FetchCommand fetch = git.fetch();
        fetch.setRemote("origin");
        fetch.setTagOpt(TagOpt.FETCH_TAGS);
        fetch.setRemoveDeletedRefs(this.deleteUntrackedBranches);
        if (this.refreshRate > 0) {
            this.setLastRefresh(System.currentTimeMillis());
        }

        configureCommand(fetch);
        try {
            FetchResult result = fetch.call();
            if (result.getTrackingRefUpdates() != null && result.getTrackingRefUpdates().size() > 0) {
                this.logger.info("Fetched for remote " + label + " and found " + result.getTrackingRefUpdates().size()
                        + " updates");
            }
            return result;
        }
        catch (Exception ex) {
            String message = "Could not fetch remote for " + label + " remote: "
                    + git.getRepository().getConfig().getString("remote", "origin", "url");
            warn(message, ex);
            return null;
        }
    }

Can someone please clarify whether this is the intended behaviour? I would have expected the health to indicate DOWN.
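To make the interaction concrete, here is a minimal stdlib-only model of the two methods above. It is a sketch rather than the Spring Cloud Config source: the class `SwallowedFetchDemo`, the nested `Repo` type and the string values are hypothetical stand-ins, and only the control flow (fetch logs and returns null, the health check reports DOWN only when an exception reaches it) is taken from the code quoted above.

```java
import java.io.IOException;

// Stdlib-only model of the reported behaviour. All names here are hypothetical;
// none of them are Spring Cloud Config or JGit APIs.
final class SwallowedFetchDemo {

    enum Status { UP, DOWN }

    /** Stands in for a git-backed repository with a local clone. */
    static final class Repo {
        private final boolean remoteReachable;
        private String cachedConfig = "cached-config";

        Repo(boolean remoteReachable) {
            this.remoteReachable = remoteReachable;
        }

        // Same shape as fetch() above: the failure is logged and null is returned.
        String fetch() {
            try {
                if (!remoteReachable) {
                    throw new IOException("connection failed");
                }
                return "fresh-config";
            } catch (IOException ex) {
                System.err.println("WARN Could not fetch remote: " + ex.getMessage());
                return null;
            }
        }

        // Refresh when the fetch worked, otherwise keep serving the local copy.
        String findOne() {
            String fetched = fetch();
            if (fetched != null) {
                cachedConfig = fetched;
            }
            return cachedConfig;
        }
    }

    // Same shape as doHealthCheck() above: DOWN only if an exception propagates.
    static Status doHealthCheck(Repo repo) {
        try {
            repo.findOne();
            return Status.UP;
        } catch (Exception e) {
            return Status.DOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(doHealthCheck(new Repo(true)));  // remote reachable
        System.out.println(doHealthCheck(new Repo(false))); // remote unreachable
    }
}
```

In this model both calls report UP, because the catch block in the health check is never reached once the fetch failure has been swallowed.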

Comment From: ryanjbaxter

This seems similar to https://github.com/spring-cloud/spring-cloud-config/issues/1934. Let me know what you think.

Comment From: greenwayb

> This seems similar to #1934. Let me know what you think.

I don't think so. My expectation was that if cloud config was running and the git repository was then taken offline (we use our own on-prem solution, which is occasionally down for maintenance, updates, etc.), the health check would indicate that something was wrong. In this case the exception that would have flowed back up seems to have been swallowed, and therefore the health check doesn't fire.

Having said that, perhaps if the git repo is down it's better that the service is just marked as CRITICAL, i.e. it keeps running and still functions with the data it had before.

For now we have written our own healthcheck with a status appropriate to our situation, but I wasn't sure we were interpreting correctly how the cloud config healthcheck is expected to behave in this situation, i.e. started and then git repo unavailable.
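For illustration only, a check of this kind could probe the git host directly instead of relying on an exception from the fetch. The sketch below is not the healthcheck mentioned above: the class `GitRemoteProbe`, the `DEGRADED` status name and the timeout are assumptions, and it uses only the JDK. In a Spring Boot application the same logic would presumably sit inside an implementation of the actuator `HealthIndicator` interface.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical reachability probe; the class and status names are illustrative
// and are not part of Spring Boot or Spring Cloud Config.
final class GitRemoteProbe {

    // DEGRADED is an assumed custom status: config can still be served from
    // the local clone, but the remote cannot be reached for updates.
    enum Status { UP, DEGRADED }

    /** Attempts a TCP connection to the git host within the given timeout. */
    static Status probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Status.UP;
        } catch (IOException ex) {
            // Connection refused or timed out: report it instead of swallowing it.
            return Status.DEGRADED;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        int port;
        // Simulate the proxy being up, then stopped, using a local listener.
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            port = server.getLocalPort();
            System.out.println("listener up:      " + probe("127.0.0.1", port, 500));
        }
        System.out.println("listener stopped: " + probe("127.0.0.1", port, 500));
    }
}
```

A plain TCP connect only shows that something is listening on the port. It does not show that the repository can actually be fetched, so a stricter check might run a lightweight remote git operation instead.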

Comment From: ryanjbaxter

When the git repository is taken offline, can the config server still serve configuration?

Comment From: greenwayb

Yes, I hit it with curl commands and got results. We were basically trying to check that our microservices could still function if our internal git repo was down. In this case the healthcheck would just make us aware of potential issues, i.e. that it is best not to restart the config server. We have since changed it to not clone on startup, so if the server had started before and the repo had been cloned at some point, we can still restart after the repo goes offline. That means our clients can come up. Obviously we can't change config at that point, but we can operate off the most recent config until our repo comes back online.

Comment From: ryanjbaxter

OK, so to me this seems very similar to the other issue. It basically comes down to whether the health indicator should report some kind of degraded functionality because it can't reach the git repo, even though it can still do its job and serve configuration data in this state.