OpenTelemetry has become a key tool for observability across microservices, distributed systems, and language boundaries. In this post, we’ll walk through improving an OpenTelemetry tracing demo that spans multiple programming languages: Kotlin, Python (Flask), and Rust. We’ll also cover PostgreSQL integration, MQTT messaging, and Dockerfile optimization for multi-service applications.
Original Demo Overview
The demo involved tracing with OpenTelemetry across three different services:
- Kotlin with Spring Boot
- Python with Flask
- Rust with Axum
Initially, these services used simple, embedded databases (H2 for Kotlin, SQLite for Python, and a Rust HashMap). This setup has been enhanced to use a unified PostgreSQL database for all services, making the system more robust and scalable.
OpenTelemetry Integration with PostgreSQL
In the new setup, PostgreSQL replaces the embedded databases across all services. In both the JVM and Python environments, OpenTelemetry automatically generates spans whenever a service talks to the PostgreSQL database. On the JVM this is handled by the Java agent, whereas Python requires manually installing the opentelemetry-distro package:
pip install opentelemetry-distro
opentelemetry-bootstrap -a install
For Python, integrating Flask with OpenTelemetry required adding the following packages:
pip install opentelemetry-instrumentation-flask opentelemetry-instrumentation-sqlalchemy
With these packages installed, every call the Flask service makes to the PostgreSQL database is traced, improving monitoring and observability across services.
Optimizing Python with Gunicorn
In the original demo, the Python Flask service used the default Flask development server, which is not suitable for production. For the improved version, we integrated Gunicorn as the HTTP server for Flask to make the setup more production-ready.
# Run the Flask app under Gunicorn (4 workers), wrapped by the OpenTelemetry auto-instrumentation launcher
RUN pip install gunicorn
ENTRYPOINT ["opentelemetry-instrument", "gunicorn", "-b", "0.0.0.0", "-w", "4", "app:app"]
By making this simple change in the Dockerfile, we ensure the Flask service is production-ready without disrupting the OpenTelemetry instrumentation.
Dockerfile Optimization with Heredocs
In Docker, keeping the number of image layers down is essential for efficient builds. In the original demo, each command created a new layer, which resulted in a bloated image. The updated Dockerfile uses heredocs (a BuildKit feature) to group related commands into a single layer, improving both readability and efficiency.
RUN <<EOF
# Pin dependencies with pip-tools (pip-compile reads requirements.in by default)
pip install pip-tools
pip-compile
# Install the application's dependencies plus the production HTTP server
pip install -r requirements.txt
pip install gunicorn
# Install OpenTelemetry instrumentation for the packages it detects
opentelemetry-bootstrap -a install
EOF
This reduces unnecessary layers and speeds up Docker image builds, keeping the image lean for deployment.
Explicit API Calls on the JVM with OpenTelemetry
Another significant improvement was adding explicit OpenTelemetry API calls for tracing analytics. In the improved version, we manually create spans via the OpenTelemetry API in the Kotlin service, which gives finer control over tracing than automatic instrumentation alone. This is particularly useful for scenarios involving message queues like MQTT.
Here’s how you can manually create a span in Kotlin:
// Get the OpenTelemetry instance registered globally by the Java agent
val otel = GlobalOpenTelemetry.get()
val tracer = otel.tracerBuilder("ch.frankel.catalog").build()
// Start a span as a child of the current trace context, and end it when the filter's work is done
val span = tracer.spanBuilder("AnalyticsFilter.filter")
    .setParent(Context.current())
    .startSpan()
span.end()
This creates an explicit span around the analytics logic, so it shows up as its own step in the trace and provides deeper insight into application performance.
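In practice the span is only useful if it wraps the filter’s actual work. Here is a rough sketch of that pattern, reusing the tracer and the req variable from the surrounding snippets; the attribute names and the use of makeCurrent() are illustrative additions, not code from the original demo:
val span = tracer.spanBuilder("AnalyticsFilter.filter")
    .setParent(Context.current())
    .startSpan()
try {
    // Make the span current so that auto-instrumented calls below become its children
    span.makeCurrent().use {
        // Illustrative attributes; the real filter may record different data
        span.setAttribute("path", req.path())
        span.setAttribute("clientIp", req.remoteAddress().get().address.hostAddress)
        // ... perform the filter's actual work here ...
    }
} finally {
    span.end()
}
Ending the span in a finally block guarantees it is reported even if the filter throws.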
MQTT Messaging with OpenTelemetry
To make the demo more comprehensive, we added MQTT messaging for cross-service communication. The Kotlin service now publishes messages to an MQTT topic, while a NodeJS service subscribes to these messages. OpenTelemetry traces the entire flow, ensuring end-to-end observability, even across message queues.
To add trace context to an MQTT message, we reconstruct the traceparent HTTP header from the spanContext and include it in the message properties.
val spanContext = span.spanContext
val message = MqttMessage().apply {
    properties = MqttProperties().apply {
        // Rebuild the W3C traceparent header: version-traceId-spanId-traceFlags
        val traceparent = "00-${spanContext.traceId}-${spanContext.spanId}-${spanContext.traceFlags.asHex()}"
        userProperties = listOf(UserProperty("traceparent", traceparent))
    }
    qos = options.qos
    payload = Json.encodeToString(Payload(req.path(), req.remoteAddress().get().address.hostAddress)).toByteArray()
}
This approach ensures that the trace data follows the message across different services and systems.
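As a variation, the same context can be injected through the propagator the SDK already configures, instead of assembling the header by hand. Here is a minimal sketch assuming the Paho MQTT v5 client and the default W3C trace-context propagator registered by the Java agent; the helper name injectTraceContext is only for illustration:
import io.opentelemetry.api.GlobalOpenTelemetry
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapSetter
import org.eclipse.paho.mqttv5.common.packet.MqttProperties
import org.eclipse.paho.mqttv5.common.packet.UserProperty

// Writes each propagated entry (traceparent, tracestate, ...) as an MQTT v5 user property
private val mqttSetter = TextMapSetter<MqttProperties> { carrier, key, value ->
    carrier?.let { it.userProperties = (it.userProperties ?: emptyList()) + UserProperty(key, value) }
}

// Hypothetical helper: inject the current trace context into the message properties before publishing
fun injectTraceContext(properties: MqttProperties) {
    GlobalOpenTelemetry.getPropagators()
        .textMapPropagator
        .inject(Context.current(), properties, mqttSetter)
}
Calling injectTraceContext(properties) inside the apply block would replace the hand-built traceparent string, and the message is then published as usual with the MQTT client.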
OpenTelemetry Subscriber in NodeJS
On the receiving end, a NodeJS service subscribes to the MQTT topic and extracts the traceparent data from the message metadata, allowing it to create a new span in the trace.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { context, propagation, trace } = require('@opentelemetry/api');

const sdk = new NodeSDK({
  resource: new Resource({[SemanticResourceAttributes.SERVICE_NAME]: 'analytics'}),
  traceExporter: new OTLPTraceExporter({url: `${collectorUri}/v1/traces`})
});
sdk.start();
const tracer = trace.getTracer('analytics');

client.on('message', (aTopic, payload, packet) => {
  const data = JSON.parse(payload);                           // payload published by the Kotlin service
  const userProperties = packet.properties['userProperties']; // carries the traceparent user property
  const activeContext = propagation.extract(context.active(), userProperties);
  const span = tracer.startSpan('Read message', {attributes: {path: data['path'], clientIp: data['clientIp']}}, activeContext);
  span.end();
});
This ensures that the entire message flow is traced, giving us a complete picture of the system’s health and performance.
Conclusion
By enhancing the OpenTelemetry tracing demo, we have achieved better observability across multiple services written in Kotlin, Python, and Rust. The introduction of PostgreSQL, MQTT messaging, and Gunicorn has made the application more production-ready, while Dockerfile optimization has improved efficiency.
For anyone looking to improve observability in multi-language, distributed systems with OpenTelemetry tracing, these practices provide more robust monitoring and easier debugging.