Looking back on my early professional years, I’m somewhat startled by how I troubleshot production issues. I was working on a project where I handled everything: interfacing with customers, coding, deployment, and, naturally, resolving issues. Whenever an issue arose and I lacked a clear solution, my approach was to take a dump of the production database, run the application on my machine against that database, then get on a call with the customer and attempt to replicate what they were doing, using breakpoints and print statements to pinpoint the problem. This method proved effective at the time. The application was small, with a limited user base, and entirely manageable for a single developer. However, I had yet to grasp the importance of logging.
Fast forward a few years, and now I work on projects involving multiple developers, a product owner, a scrum master, an operations team, an infrastructure team, and, of course, millions of users. Importantly, the production environment is mostly inaccessible to us developers. Additionally, customers cannot directly communicate with developers, as that would drive us crazy.
The operations team serves as a bridge between these two realms. Their primary task is to investigate issues by examining the system, the application logs, and, if necessary, the database. If they can’t find a solution, or if they have insights for improvement, they approach the developers with the relevant data.
Working in this environment, I began to appreciate the importance of logs. Logs help us trace back through the events in the life of an application or a transaction. With proper analysis, they help derive patterns and even forecast anomalies.
However, I still didn’t know much about distributed tracing (I still know very little). I used to include identifiers like phone numbers, user IDs, and so on in the logs, so that our ops team could properly investigate issues for a customer.
But this approach had its problems too. The same customer might have had various interactions with the system, generating multiple streams of logs with their identifier, so pinpointing the exact logs for the issue was still cumbersome. And don’t even get me started on what a nightmare it was when multiple applications were involved. I used to manually add UUIDs to the payloads between service calls and log them.
Eventually, it became evident that this approach was inadequate. Some of my applications served as middleware between applications maintained by other teams. Why would those teams put up with my inconvenient homemade scheme?
Then, I discovered Sleuth. It was amazing; adding just one dependency and configuring logging would enable application-wide tracing. Even with multiple microservices in play, there was no need for additional measures to propagate the trace ID among services.
Before Spring Boot 3, whenever I started a project, I always included Spring Cloud Sleuth to enable distributed tracing.
Then came Spring Boot 3, and Sleuth was migrated to Micrometer. In this article, we’ll explore how to achieve Sleuth-like functionality in Spring Boot 3 with Micrometer tracing.
Right now I have a controller and a service with some logs. Here is a minimal sketch of what they could look like (the endpoint path and the Message fields are my reconstruction from the log output; the real classes are in the repo linked at the end):
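// Sketch only: imports omitted, classes shown together for brevity
@Data
public class Message {
    private String header;
    private String content;
}

public interface MessageService {
    void handleMessage(Message message);
}

@RestController
@Slf4j
public class MessageController {

    private final MessageService messageService;

    public MessageController(MessageService messageService) {
        this.messageService = messageService;
    }

    // "/message" is a hypothetical mapping
    @PostMapping("/message")
    public void receiveMessage(@RequestBody Message message) {
        log.info("Received message: {}", message);
        messageService.handleMessage(message);
    }
}

@Service
@Slf4j
public class MessageServiceImpl implements MessageService {

    @Override
    public void handleMessage(Message message) {
        log.info("Handling message {}", message);
    }
}

If I hit the controller endpoint with Postman, I get logs like this: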
2024-02-23T23:28:18.043+06:00 INFO 14443 --- [rest-api] [nio-8080-exec-2] x.r.t.restapi.web.MessageController : Received message: Message(header=some header, content=some content)
2024-02-23T23:28:18.046+06:00 INFO 14443 --- [rest-api] [nio-8080-exec-2] x.r.t.r.service.MessageServiceImpl : Handling message Message(header=some header, content=some content)
It’s clearly discernible that there is no trace information in these logs. Now I will add the following dependency:
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-brave</artifactId>
    <scope>compile</scope>
</dependency>
I hit the same endpoint once again, but still no luck: the logs look exactly the same.
A bit of internet searching revealed that you also need to add the Spring Boot Actuator starter to make it work.
It contains an ObservationAutoConfiguration that provides an injectable instance of ObservationRegistry (if it doesn’t already exist), and configures ObservationHandlers for collecting metrics and traces (baeldung.com).
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
    <scope>compile</scope>
</dependency>
Hitting the same endpoint again produces logs like this:
2024-02-24T00:11:06.050+06:00 INFO 18463 --- [rest-api] [nio-8080-exec-1] [65d8dfb96a137925ba56091d26f33e80-ba56091d26f33e80] x.r.t.restapi.web.MessageController : Received message: Message(header=some header, content=some content)
2024-02-24T00:11:06.053+06:00 INFO 18463 --- [rest-api] [nio-8080-exec-1] [65d8dfb96a137925ba56091d26f33e80-ba56091d26f33e80] x.r.t.r.service.MessageServiceImpl : Handling message Message(header=some header, content=some content)
See? This part, [65d8dfb96a137925ba56091d26f33e80-ba56091d26f33e80], contains the traceId and the spanId, in that order.
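As an aside, this bracketed correlation block is added by Spring Boot itself once Micrometer Tracing is on the classpath (I’m assuming Spring Boot 3.2+ here, where correlation IDs are logged by default). If you want a different format, you can override the pattern; traceId and spanId are the MDC keys that the Brave bridge populates. A hypothetical override in application.properties:

# Render the correlation block as [trace=... span=...] instead of the default
logging.pattern.correlation=[trace=%X{traceId:-} span=%X{spanId:-}]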
Now let’s see if we can call another service, and whether the traceId gets propagated there.
I have created another service, rest-api-2, which has the necessary controller and service, as well as the required dependencies.
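Its classes mirror the ones in rest-api. A rough sketch, reconstructed from the log output further down (the interface name MessageProcessorService is my guess; the real code is in the repo):

// rest-api-2, listening on port 8081; imports omitted for brevity
@RestController
@Slf4j
public class MessageProcessorController {

    private final MessageProcessorService messageProcessorService;

    public MessageProcessorController(MessageProcessorService messageProcessorService) {
        this.messageProcessorService = messageProcessorService;
    }

    @PostMapping("/process-message")
    public void processMessage(@RequestBody Message message) {
        log.info("Received message for processing {}", message);
        messageProcessorService.processMessage(message);
    }
}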
Now we will use the shiny new RestClient to make an HTTP call from rest-api to rest-api-2.
I have modified the MessageServiceImpl in rest-api to look like this:
@Service
@Slf4j
public class MessageServiceImpl implements MessageService {

    private final RestClient restClient;

    public MessageServiceImpl(@Value("${rest-api-2.url}") String restApi2Url) {
        this.restClient = RestClient.builder()
                .baseUrl(restApi2Url)
                .defaultHeader("Content-Type", MediaType.APPLICATION_JSON_VALUE)
                .build();
    }

    @Override
    public void handleMessage(Message message) {
        log.info("Handling message {}", message);
        this.restClient.post()
                .uri("/process-message")
                .body(message)
                .retrieve();
    }
}
Here, in the constructor, I declared an instance of RestClient. Then, in the service method, I made the call to rest-api-2.
Log for rest-api:
2024-02-24T14:31:13.060+06:00 INFO 29880 --- [rest-api] [nio-8080-exec-5] [65d9a95114390c055da56f5c4e138be3-5da56f5c4e138be3] x.r.t.restapi.web.MessageController : Received message: Message(header=some header, content=some content)
2024-02-24T14:31:13.061+06:00 INFO 29880 --- [rest-api] [nio-8080-exec-5] [65d9a95114390c055da56f5c4e138be3-5da56f5c4e138be3] x.r.t.r.service.MessageServiceImpl : Handling message Message(header=some header, content=some content)
Log for rest-api-2:
2024-02-24T14:31:13.072+06:00 INFO 29563 --- [rest-api-2] [nio-8081-exec-7] [65d9a9513ed86cb5ea2477e08166b9d2-ea2477e08166b9d2] x.r.t.r.web.MessageProcessorController : Received message for processing Message(header=some header, content=some content)
2024-02-24T14:31:13.072+06:00 INFO 29563 --- [rest-api-2] [nio-8081-exec-7] [65d9a9513ed86cb5ea2477e08166b9d2-ea2477e08166b9d2] x.r.t.r.s.MessageProcessorServiceImpl : Processing Message Message(header=some header, content=some content)
We can see that rest-api-2 also logs a traceId and a spanId, but they don’t match the ones from rest-api! That’s because rest-api-2 is starting a new trace context for each request. In other words, the traceId from the first application is not propagating to the second application.
Let’s verify this.
I stopped the rest-api-2 application and started listening on the port it was running on with a tool called netcat:
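Something like this, assuming rest-api-2 was on port 8081, which matches the nio-8081 thread names in the logs above (use nc -l -p 8081 if your netcat flavor needs the -p flag):

nc -l 8081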
Now if I hit the endpoint, I get the following:
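Roughly this raw HTTP request (a hand-made reconstruction; the exact headers and values will differ):

POST /process-message HTTP/1.1
Content-Type: application/json
Content-Length: 49
Host: localhost:8081

{"header":"some header","content":"some content"}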
We can see that although there is supposed to be a header carrying the traceId, it isn’t there. We need to make sure the proper headers with the trace context are present in the request.
This is not that tough. We need to use the default RestClient.Builder that Spring Boot auto-configures for us, which already knows how to propagate the trace context. We can do it like this:
public MessageServiceImpl(@Value("${rest-api-2.url}") String restApi2Url,
                          RestClient.Builder restClientBuilder) {
    this.restClient = restClientBuilder
            .baseUrl(restApi2Url)
            .defaultHeader("Content-Type", MediaType.APPLICATION_JSON_VALUE)
            .build();
}
We receive the default RestClient.Builder through the constructor of the service where we want to use it; the rest of the code stays the same. Now if we initiate a request, we see something like this:
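Again a reconstruction; the traceparent value below reuses the traceId and spanId from the earlier log purely for illustration:

POST /process-message HTTP/1.1
Content-Type: application/json
traceparent: 00-65d9a95114390c055da56f5c4e138be3-5da56f5c4e138be3-00
Content-Length: 49
Host: localhost:8081

{"header":"some header","content":"some content"}

The header follows the W3C Trace Context format, 00-<traceId>-<parentSpanId>-<flags>, where the last field indicates whether the trace is sampled.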
Now we have a header called traceparent, and it seems to carry the traceId from the log. Let’s stop netcat and start the rest-api-2 application again.
Log from first application:
Log from second application:
And bingo!
We can see that the traceId matches between the two applications. The spanId doesn’t match, which is expected: each application creates its own spans within the same trace.
I think this is a good time to refactor the service. Instead of building the RestClient inside it, we will declare it as a reusable bean in a config class.
Rest client config class looks like this:
@Configuration
public class RestClientConfig {

    @Bean("restApi2Client")
    RestClient restApi2Client(@Value("${rest-api-2.url}") String restApi2Url,
                              RestClient.Builder restClientBuilder) {
        return restClientBuilder
                .baseUrl(restApi2Url)
                .defaultHeader("Content-Type", MediaType.APPLICATION_JSON_VALUE)
                .build();
    }
}
And the refactored MessageServiceImpl class:
@Service
@Slf4j
public class MessageServiceImpl implements MessageService {

    private final RestClient restClient;

    public MessageServiceImpl(@Qualifier("restApi2Client") RestClient restClient) {
        this.restClient = restClient;
    }

    @Override
    public void handleMessage(Message message) {
        log.info("Handling message {}", message);
        this.restClient.post()
                .uri("/process-message")
                .body(message)
                .retrieve();
    }
}
Cleaner, right?
This is just a basic demo. I plan to write more on how to leverage Micrometer for other observability use cases.
Here is the repo with all the code.
Happy coding!