The Developer's Guide to Logging

Most developers only think about logging when something has already gone wrong. A production incident hits, you open the logs, and you find either a wall of noise or, worse, nothing useful at all. Good logging is a skill that separates reactive debugging from proactive observability.

In my experience working across multiple production systems over the past decade, the teams that invest in logging early are the ones that sleep through the night. I have personally spent countless hours sifting through poorly structured logs at 3am, and those experiences shaped every recommendation in this guide. What follows are practical logging strategies that will help you debug faster, monitor your systems effectively, and avoid the common pitfalls that make logs useless when you need them most.

Why Logging Matters More Than You Think

Logging is your primary window into what your application is doing in production. Unlike local development, you cannot attach a debugger to a live server serving thousands of requests. Your logs are often the only evidence you have when diagnosing a 3am incident.

Beyond debugging, well-structured logs enable capacity planning, security auditing, and business analytics. They feed into alerting systems that catch problems before your users do. According to the Splunk State of Observability report, organisations with mature observability practices resolve incidents 69% faster than those without. A separate New Relic Observability Forecast found that teams practising full-stack observability experience 60% fewer outages annually. Investing in logging early saves you from expensive firefighting later.

Choosing the Right Log Levels

Every logging framework supports severity levels, but surprisingly few teams use them consistently. Here is how to think about each level.

Level | Purpose                          | Production Default | Example
----- | -------------------------------- | ------------------ | -------
TRACE | Extremely granular detail        | Off                | Variable contents inside a loop
DEBUG | Diagnostic information           | Off                | Method entry/exit, intermediate calculations
INFO  | Routine operations               | On                 | Server startup, job completed, user authenticated
WARN  | Unexpected but handled           | On                 | Retry succeeded, cache miss, deprecated endpoint hit
ERROR | Failures needing investigation   | On                 | Database timeout, API error, file write failure
FATAL | Unrecoverable, app shutting down | On                 | Out of memory, critical config missing

TRACE and DEBUG

These are for development and deep diagnostics. TRACE captures extremely granular detail, such as the contents of every variable in a loop. DEBUG records diagnostic information like method entry and exit points or intermediate calculation results.

Keep these turned off in production by default. The volume they generate can overwhelm your storage and make it harder to find the signals that matter. I have seen teams generate over 500GB of logs per day simply because DEBUG was left on in production after a debugging session. One team I worked with discovered their monthly logging bill had ballooned to over £4,000 before they realised the root cause.

INFO

INFO is your workhorse level. Use it for events that confirm your application is behaving normally: server startup, configuration loaded, scheduled job completed, user authenticated. These entries form a timeline of your application’s life.

A good INFO log tells you what happened and when, without requiring you to read surrounding context.

WARN

WARN signals something unexpected that your application handled gracefully. A retry that succeeded, a cache miss that fell back to the database, or a deprecated API endpoint that is still receiving traffic. These are not emergencies, but they deserve attention during regular review.

ERROR and FATAL

ERROR means something failed and needs investigation. A database query timed out, an external API returned an unexpected response, or a file could not be written. FATAL means the application cannot continue and is shutting down.

Always include the exception or error message, a stack trace where available, and enough context to reproduce the problem. Working with teams over the years, I have found that the single biggest logging mistake is logging an error without the context needed to reproduce it.
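To make that concrete, here is a minimal sketch of an ERROR entry that carries enough context to reproduce the failure. The field names (requestId, userId, input) are illustrative choices, not a standard schema:

```javascript
// Sketch: log an error together with the message, stack trace, and the
// context needed to reproduce it. Field names here are illustrative.
function logError(err, context) {
  const entry = {
    timestamp: new Date().toISOString(),
    level: 'ERROR',
    message: err.message,
    stack: err.stack,
    ...context, // requestId, userId, input parameters, etc.
  };
  console.error(JSON.stringify(entry));
  return entry; // returned only to make the sketch easy to inspect
}

// Usage: capture the failing input alongside the error itself.
try {
  JSON.parse('{not valid json');
} catch (err) {
  logError(err, { requestId: 'req-123', userId: 12345, input: '{not valid json' });
}
```

With the offending input recorded in the entry, reproducing the bug is a copy-paste job rather than an archaeology project.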

Common Logging Mistake         | Impact                                 | Fix
------------------------------ | -------------------------------------- | ---
Logging errors without context | Cannot reproduce bugs                  | Always include request ID, user ID, input parameters
Leaving DEBUG on in production | Storage costs, signal buried in noise  | Use runtime-configurable log levels
Logging PII in plain text      | GDPR violations, security risk         | Redact or hash sensitive fields
Inconsistent log formats       | Breaks aggregation and search          | Adopt structured logging with a shared schema
No correlation IDs             | Cannot trace requests across services  | Generate and propagate request IDs

Structured Logging: Stop Writing Plain Text

If you are still logging strings like "User 12345 placed order for £50.00", you are making your future self’s job harder. Structured logging outputs each entry as a set of key-value pairs, typically in JSON format.

{
  "timestamp": "2026-02-09T14:23:01Z",
  "level": "INFO",
  "message": "Order placed",
  "userId": 12345,
  "orderId": "ORD-98765",
  "amount": 50.00,
  "currency": "GBP"
}

This format lets you filter, aggregate, and search across millions of log entries. Want to find all errors for a specific user? That is a simple query. Want to calculate average order value from your logs? Also straightforward. With plain text, both of those tasks require fragile regex parsing.
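Producing entries like the JSON above takes very little code. In a real project you would use pino or winston (see the table below), but this dependency-free sketch shows the shape of a structured logger:

```javascript
// Minimal structured logger sketch: one JSON object per line, with shared
// base fields merged into every entry. Real projects should use pino/winston.
function createLogger(baseFields = {}) {
  const emit = (level) => (fields, message) => {
    const entry = {
      timestamp: new Date().toISOString(),
      level,
      message,
      ...baseFields,
      ...fields,
    };
    process.stdout.write(JSON.stringify(entry) + '\n');
    return entry;
  };
  return { info: emit('INFO'), warn: emit('WARN'), error: emit('ERROR') };
}

const log = createLogger({ service: 'orders' }); // "orders" is a made-up service name
log.info(
  { userId: 12345, orderId: 'ORD-98765', amount: 50.0, currency: 'GBP' },
  'Order placed'
);
```

Because every entry shares the same keys, your aggregation tool can index them once and query them forever.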

Structured Logging Libraries by Language

Language | Recommended Library            | Notes
-------- | ------------------------------ | -----
Node.js  | pino, winston                  | pino is faster; winston has more transports
Python   | structlog, built-in logging    | structlog offers cleaner API
Java     | SLF4J + Logback (JSON encoder) | Industry standard
Go       | zerolog, zap                   | Both offer high-performance structured output
Ruby     | Semantic Logger                | Integrates well with Rails
.NET     | Serilog                        | Excellent structured logging with enrichers

The pino documentation provides excellent examples of structured logging patterns in Node.js if you want to see this in practice.

What to Log (and What Not To)

Always Log

  • Request and response metadata: HTTP method, path, status code, response time, and correlation ID.
  • Business events: User registration, payment processed, subscription cancelled. These are invaluable for both debugging and analytics.
  • State transitions: Order status changes, deployment events, feature flag toggles.
  • Errors with context: The error message alone is rarely enough. Include the input that caused the failure, the state of the system, and any relevant identifiers.

Never Log

  • Secrets: Passwords, API keys, tokens, and connection strings. Even in DEBUG mode, these should be redacted.
  • PII without justification: Email addresses, phone numbers, and other personal data create GDPR liability. If you must log an identifier, use a hashed or tokenised version.
  • High-frequency noise: Logging every iteration of a tight loop or every healthcheck response will bury the useful information.

Correlation IDs: Tracing Requests Across Services

In a distributed system, a single user action might trigger calls across five or ten services. Without a way to link those calls together, debugging becomes guesswork. If you are working with microservices, correlation IDs are not optional; they are essential.

The solution is a correlation ID. Generate a UUID when a request enters your system at the API gateway or load balancer. Pass it downstream via an HTTP header (commonly X-Request-ID or X-Correlation-ID). Every service includes this ID in its log entries.

// Express middleware example (uses Node's built-in crypto module)
const crypto = require('node:crypto');

app.use((req, res, next) => {
  // Reuse an incoming correlation ID if a caller supplied one; otherwise mint one
  req.correlationId = req.headers['x-correlation-id'] || crypto.randomUUID();
  res.setHeader('x-correlation-id', req.correlationId);
  next();
});

When an incident occurs, you search for that single ID and get the complete picture across every service. This practice ties closely into the broader discipline of observability. Building robust, traceable APIs depends on getting this right from the start.
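To guarantee no call site forgets the ID, derive a per-request logger that stamps it on every entry. This follows the "child logger" idea from libraries like pino, implemented inline here to keep the example dependency-free:

```javascript
// Sketch: a child logger bound to one request's correlation ID, so every
// entry it emits carries the ID automatically.
function childLogger(correlationId) {
  return {
    info(fields, message) {
      const entry = { level: 'INFO', correlationId, message, ...fields };
      console.log(JSON.stringify(entry));
      return entry;
    },
  };
}

// In the Express middleware above you would attach this to the request:
//   req.log = childLogger(req.correlationId);
const log = childLogger('3f1c9a2e'); // hypothetical correlation ID
log.info({ orderId: 'ORD-98765' }, 'Order placed');
```

Handlers then call `req.log.info(...)` instead of a global logger, and the correlation ID rides along for free.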

Log Aggregation and Centralisation

Logs sitting on individual servers are useful only if you know which server to check. For any system with more than one instance, centralise your logs using a dedicated platform.

Popular options include the ELK stack (Elasticsearch, Logstash, Kibana), Grafana Loki (lightweight and cost-effective), Datadog, and AWS CloudWatch. The choice depends on your budget, scale, and existing infrastructure.

Whichever tool you choose, ensure your logs are searchable within seconds of being emitted. A log aggregation system with a 15-minute delay is significantly less useful during an active incident.

[Diagram: centralised log aggregation flow — app instances → log shipper (Fluentd/Vector) → aggregation platform (ELK / Loki / Datadog) → dashboards and alerts]

Alerting on Log Patterns

Centralised logs become even more powerful when you build alerts on top of them. Configure your monitoring tool to notify you when specific patterns emerge.

  • Error rate exceeds a threshold (for example, more than 5% of requests returning 500 errors)
  • A specific error message appears for the first time
  • A critical business event stops occurring (for example, zero orders processed in the last 30 minutes)
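The first pattern, an error-rate threshold, is something your log platform (Datadog, Grafana, etc.) provides natively. A sliding-window sketch of the underlying idea, with an illustrative window size and the 5% threshold from the example above:

```javascript
// Sketch of an error-rate alert: track the last N request outcomes and fire
// when the error fraction exceeds a threshold. Window size and threshold
// values here are illustrative.
class ErrorRateMonitor {
  constructor(windowSize = 100, threshold = 0.05) {
    this.windowSize = windowSize;
    this.threshold = threshold;
    this.results = []; // true = request errored
  }
  record(isError) {
    this.results.push(isError);
    if (this.results.length > this.windowSize) this.results.shift();
  }
  shouldAlert() {
    const errors = this.results.filter(Boolean).length;
    return this.results.length > 0 && errors / this.results.length > this.threshold;
  }
}

const monitor = new ErrorRateMonitor(100, 0.05);
for (let i = 0; i < 90; i++) monitor.record(false);
for (let i = 0; i < 10; i++) monitor.record(true); // 10% of the window errored
console.log(monitor.shouldAlert()); // true
```

Prefer your platform's built-in alert rules over hand-rolled monitors; the sketch just shows what they compute.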

Alerting turns your logs from a passive record into an active early warning system. If you are running a mature CI/CD pipeline, your alerting should be as well-tested as your deployment process.

Performance Considerations

Logging is not free. Every log statement involves string formatting, I/O operations, and potentially network calls if you are shipping logs to a remote service. At high throughput, careless logging can measurably impact your application’s latency.

Write logs asynchronously wherever possible. Buffer entries and flush them in batches rather than writing each one individually. Use sampling for extremely high-volume events, logging one in every hundred healthcheck responses rather than all of them.
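The sampling idea can be sketched in a few lines. Deterministic counter-based sampling is shown here; random sampling works equally well:

```javascript
// Sketch: log roughly one in every N high-volume events (e.g. healthchecks).
function makeSampler(rate) {
  let count = 0;
  return () => ++count % rate === 1; // passes the 1st, (rate+1)th, ... event
}

const sampleHealthcheck = makeSampler(100);
for (let i = 0; i < 250; i++) {
  if (sampleHealthcheck()) {
    console.log(JSON.stringify({ level: 'INFO', message: 'healthcheck ok', seq: i }));
  }
}
// Emits only events 0, 100 and 200 out of 250.
```

You keep a representative trace of the event stream at one percent of the storage cost.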

Most importantly, make your log levels configurable at runtime. The ability to temporarily increase verbosity on a single service without redeploying is invaluable for diagnosing production issues. Feature flags can be an effective mechanism for toggling log verbosity in production.
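Real frameworks expose this directly (pino, for instance, lets you reassign the logger's level at runtime). A dependency-free sketch of the mechanism:

```javascript
// Sketch of a runtime-adjustable minimum level: entries below the current
// threshold are dropped, and the threshold can be changed without redeploying.
const LEVELS = { TRACE: 10, DEBUG: 20, INFO: 30, WARN: 40, ERROR: 50 };
let minLevel = LEVELS.INFO;

function log(level, message) {
  if (LEVELS[level] < minLevel) return false; // filtered out
  console.log(JSON.stringify({ level, message }));
  return true;
}

log('DEBUG', 'not emitted while the minimum is INFO'); // filtered
minLevel = LEVELS.DEBUG; // e.g. flipped by a feature flag or admin endpoint
log('DEBUG', 'now emitted'); // logged
```

Wire `minLevel` to a feature flag or an admin endpoint and you can turn a single service verbose mid-incident, then quiet it again.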

Getting Started

If your current logging is inconsistent or minimal, start with three changes. First, adopt structured logging with a consistent schema across all your services. Second, add correlation IDs to every request that crosses a service boundary. Third, centralise your logs into a searchable platform with basic alerting.

These three steps will transform your ability to understand and debug your systems. For guidance on the broader monitoring picture, see our guide to observability vs monitoring. If you want to strengthen how your applications handle failures gracefully before they become log entries, our guide to building resilient APIs is a natural next step. Everything else is refinement.

Frequently asked questions

What are the standard logging levels and when should I use each one?

The standard levels are TRACE (granular detail), DEBUG (diagnostic info for development), INFO (routine operations like startup and shutdown), WARN (unexpected but recoverable situations), ERROR (failures that need attention), and FATAL (unrecoverable errors). In production, you typically set the minimum level to INFO or WARN to avoid excessive noise.

Should I use structured logging or plain text logging?

Structured logging (outputting JSON or key-value pairs) is almost always the better choice for production systems. It makes logs machine-parseable, which is essential for log aggregation tools like the ELK stack, Datadog, or Grafana Loki. Plain text is fine for local development, but structured logs pay for themselves the moment you need to search or filter across thousands of entries.

How much logging is too much logging?

If your logs are generating so much volume that they become expensive to store or impossible to search, you have too much. A good rule of thumb is to log at INFO level for business-significant events, WARN for things that might need human attention, and ERROR for actual failures. Reserve DEBUG and TRACE for development. You can always increase verbosity temporarily when diagnosing an issue.

What should I never log?

Never log passwords, API keys, authentication tokens, credit card numbers, or any personally identifiable information (PII) such as email addresses or national insurance numbers. Beyond the security risk, logging PII can put you in breach of GDPR and other data protection regulations. Use redaction or masking if you need to reference sensitive values.

How do I correlate logs across microservices?

Use a correlation ID (also called a trace ID or request ID). Generate a unique identifier when a request enters your system and pass it through every service call via HTTP headers. Include this ID in every log entry so you can trace a single request across all services using your log aggregation tool.
