Thursday, May 12, 2022

Pattern: Log aggregation

Problem

How to understand the behavior of an application and troubleshoot problems?

Forces

  • Any solution should have minimal runtime overhead

Solution

Use a centralized logging service that aggregates logs from each service instance. Users can search and analyze the logs, and they can configure alerts that are triggered when certain messages appear in them.


Log service activity and write logs into a centralized logging server, which provides searching and alerting.

Logs are a valuable troubleshooting tool. If you want to know what’s wrong with your application, a good place to start is the log files. But using logs in a microservice architecture is challenging.

Most of the time, the log entries you need are scattered across the log files of the API gateway and several services. The solution is to use log aggregation.



The log aggregation pipeline sends the logs of all of the service instances to a centralized logging server. Once the logs are stored by the logging server, you can view, search, and analyze them. You can also configure alerts that are triggered when certain messages appear in the logs.
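On the service side, the pipeline is easier to feed when each instance emits structured log lines that identify where they came from. The following is a minimal sketch using Python's standard logging module. The names JsonFormatter and build_logger, and the service and instance labels, are hypothetical, and an in-memory buffer stands in for the stdout stream that a log shipper would normally collect.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service: str, instance: str) -> None:
        super().__init__()
        self.service = service
        self.instance = instance

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": self.service,    # which service wrote the entry
            "instance": self.instance,  # which instance of that service
            "message": record.getMessage(),
        })


def build_logger(service: str, instance: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(f"{service}.{instance}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service, instance))
    logger.handlers = [handler]
    return logger


# Stand-in for stdout; a real deployment would let the pipeline ship it.
buffer = io.StringIO()
log = build_logger("order-service", "instance-1", buffer)
log.info("Order %s created", "42")
entry = json.loads(buffer.getvalue())
```

Because every entry carries the service and instance names, the logging server can filter and search across all instances once the lines arrive.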


A Simple Use-Case Using Observability Patterns in Microservices Architecture.




Distributed Tracing

Assign each external request a unique ID and trace requests as they flow between services.

A good way to get insight into what your application is doing is to use distributed tracing. Distributed tracing is analogous to a performance profiler in a monolithic application. It records information (for example, the start time and end time) about the tree of service calls that are made when handling a request.
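The idea can be sketched in a few lines of Python. This is an illustrative, in-process model only: Span, Trace, and span are hypothetical names, the service names are made up, and a real system would propagate the trace ID between services (for example, in request headers) and report spans to a tracing server.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    trace_id: str
    service: str
    parent: Optional[str]  # name of the calling service, None for the root
    start: float
    end: Optional[float] = None


@dataclass
class Trace:
    # One unique ID per external request, shared by every span.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: List[Span] = field(default_factory=list)


@contextmanager
def span(trace: Trace, service: str,
         parent: Optional[Span] = None) -> Iterator[Span]:
    """Record the start and end time of one service call."""
    current = Span(trace.trace_id, service,
                   parent.service if parent else None, time.monotonic())
    trace.spans.append(current)
    try:
        yield current
    finally:
        current.end = time.monotonic()


trace = Trace()
with span(trace, "api-gateway") as root:
    with span(trace, "order-service", parent=root) as order:
        with span(trace, "payment-service", parent=order):
            pass  # the actual work of the innermost call would go here
```

Each span shares the same trace ID and points at its caller, which is enough to reconstruct the tree of service calls for one request.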

Exception Tracking

Report exceptions to an exception tracking service, which de-duplicates exceptions, alerts developers, and tracks the resolution of each exception.

A service should rarely log an exception, and when it does, it’s important that you identify the root cause. The exception might be a symptom of a failure or a programming bug. The traditional way to view exceptions is to look in the logs. 

You might even configure the logging server to alert you if an exception appears in the log file. A better approach is to use an exception tracking service.
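To illustrate what de-duplication means here, the sketch below groups exceptions by a fingerprint and alerts only on the first occurrence. It is an assumption-laden toy: ExceptionTracker and parse_quantity are hypothetical names, and fingerprinting by exception type plus the innermost source location is one simple choice among many, not how any particular tracking product works.

```python
import hashlib
import traceback
from collections import Counter
from typing import List


class ExceptionTracker:
    """Toy in-memory tracker that de-duplicates reported exceptions."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.alerts: List[str] = []

    def fingerprint(self, exc: BaseException) -> str:
        # Assumption: same type raised at the same line is the same problem.
        frames = traceback.extract_tb(exc.__traceback__)
        location = (f"{frames[-1].filename}:{frames[-1].lineno}"
                    if frames else "unknown")
        key = f"{type(exc).__name__}|{location}"
        return hashlib.sha256(key.encode()).hexdigest()[:12]

    def report(self, exc: BaseException) -> str:
        fp = self.fingerprint(exc)
        self.counts[fp] += 1
        if self.counts[fp] == 1:
            # Alert developers only the first time a problem is seen.
            self.alerts.append(f"New exception {type(exc).__name__}: {exc}")
        return fp


def parse_quantity(value: str) -> int:
    return int(value)


tracker = ExceptionTracker()
for raw in ["x", "y", "z"]:
    try:
        parse_quantity(raw)
    except ValueError as exc:
        tracker.report(exc)
```

The three failures have different messages but the same type and origin, so they collapse into one tracked exception with a count of three and a single alert.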

Application Metrics

Services maintain metrics, such as counters and gauges, and expose them to a metrics server.

Monitoring and alerting are a key part of the production environment. It is therefore important to have a monitoring system that gathers metrics from every part of the technology stack, because they provide critical information about the health of an application. Metrics range across two levels:

  • Infrastructure-level metrics, such as CPU, memory, and disk utilization.
  • Application-level metrics, such as service request latency and the number of requests executed.

Examples of monitoring services: New Relic, Datadog.
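A minimal sketch of how a service might maintain counters and gauges in memory is shown below. MetricsRegistry and the metric names are hypothetical; this is not the client API of any monitoring product, and in practice the snapshot would be scraped by, or pushed to, the metrics server.

```python
import threading
from typing import Dict


class MetricsRegistry:
    """Toy registry holding counters and gauges for one service."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters: Dict[str, int] = {}
        self._gauges: Dict[str, float] = {}

    def increment(self, name: str, amount: int = 1) -> None:
        # Counters only ever go up, e.g. number of requests executed.
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + amount

    def set_gauge(self, name: str, value: float) -> None:
        # Gauges hold the latest value, e.g. requests currently in flight.
        with self._lock:
            self._gauges[name] = value

    def snapshot(self) -> Dict[str, float]:
        # What would be exposed to the metrics server.
        with self._lock:
            return {**self._counters, **self._gauges}


registry = MetricsRegistry()
registry.increment("requests_total")
registry.increment("requests_total")
registry.set_gauge("in_flight_requests", 5)
registry.set_gauge("in_flight_requests", 3)
metrics = registry.snapshot()
```

The counter accumulates across calls while the gauge keeps only its most recent value, which is the practical difference between the two metric types.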

Audit Logging

Log user actions

The purpose of audit logging is to record each user’s actions. An audit log is typically used to help customer support, ensure compliance, and detect suspicious behavior.

Each audit log entry records the identity of the user, the action they performed, and the business objects affected.
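The shape of such an entry can be sketched as a small data structure. AuditEntry, AuditLog, and the field names are hypothetical, the timestamp field is an added assumption, and a real audit log would be written to durable storage rather than kept in a list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AuditEntry:
    user_id: str                       # identity of the user
    action: str                        # the action they performed
    business_objects: Tuple[str, ...]  # the business objects involved
    timestamp: str                     # assumption: when it happened (UTC)


class AuditLog:
    """Toy append-only audit log kept in memory."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, user_id: str, action: str,
               business_objects: Iterable[str]) -> AuditEntry:
        entry = AuditEntry(user_id, action, tuple(business_objects),
                           datetime.now(timezone.utc).isoformat())
        self._entries.append(entry)
        return entry

    def by_user(self, user_id: str) -> List[AuditEntry]:
        # The kind of lookup customer support or a compliance check needs.
        return [e for e in self._entries if e.user_id == user_id]


audit = AuditLog()
audit.record("alice", "CANCEL_ORDER", ["order:42"])
audit.record("bob", "UPDATE_ADDRESS", ["customer:7"])
audit.record("alice", "REFUND_PAYMENT", ["order:42", "payment:9"])
alice_actions = [e.action for e in audit.by_user("alice")]
```

Making the entry immutable (frozen) reflects the idea that audit records are appended and never edited afterwards.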


