Log Levels
There are six generally accepted log levels: FATAL, ERROR, WARN, INFO, DEBUG, and TRACE. I think we can all imagine use cases for ERROR, INFO, and DEBUG, but what about the rest? When is something TRACE and not DEBUG, or WARN instead of ERROR?
Here is where I've landed:
- When something goes wrong:
  - FATAL: an error so bad the entire process must exit(1)
  - ERROR: an error which caused the current task to fail completely
  - WARN: an error that happened, but the current task can recover and carry on
- What's going on, these are separated by their intended audience:
  - INFO: broad runtime information for someone running the application under normal circumstances
  - DEBUG: detailed information useful to a user trying to debug a problem with their instance
  - TRACE: extremely detailed information useful to a developer of the application
Note that I don't take credit for these definitions. I owe that to random people on Reddit and Stack Overflow.
FATAL
Indicates a critical failure that makes the application unable to continue operating.
Log at a FATAL level when an error occurs that the application cannot possibly recover from.
After logging at this level, the process should exit with a non-zero status code.
Examples:
FATAL: Failed to bind to port 8080
FATAL: Database connection pool exhausted
FATAL: Configuration file missing at /etc/app/config.yaml
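As a minimal sketch of the "log, then exit non-zero" rule, here is how it might look with Python's standard `logging` module. Python has no separate FATAL method; `logging.FATAL` is an alias for `logging.CRITICAL`, so the sketch uses `logger.critical`. The function name `load_config` and the logger name `app` are made up for illustration.

```python
import logging
import sys

logger = logging.getLogger("app")  # hypothetical logger name


def load_config(path: str) -> str:
    """Read the config file, or log at FATAL level and exit non-zero."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError as exc:
        # The application cannot run without its config, so this is FATAL.
        # logging.FATAL is an alias for logging.CRITICAL in the standard library.
        logger.critical("Configuration file missing at %s: %s", path, exc)
        sys.exit(1)
```

Keeping the exit right next to the log call makes it hard to log FATAL and then accidentally carry on.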
ERROR
Indicates that a specific operation or task has failed, but the application can continue running.
Use ERROR when a user request or internal task fails completely and cannot be retried.
The application as a whole continues to run, but that particular task never completed.
Examples:
ERROR: Failed to process payment for order #12345
ERROR: User registration failed: duplicate email in database
ERROR: Failed to send notification email to [email protected]
ERROR: Unable to route request to '/api/user/123', route unknown
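The "task failed, application keeps running" idea can be sketched as a loop that logs each failure at ERROR and moves on to the next task. This uses Python's standard `logging` module; `run_tasks` and its `(name, callable)` input shape are assumptions for illustration, not something from a particular framework.

```python
import logging

logger = logging.getLogger("app")  # hypothetical logger name


def run_tasks(tasks):
    """Run each (name, callable) pair; a failed task is logged, not fatal."""
    failed = []
    for name, task in tasks:
        try:
            task()
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Task '%s' failed", name)
            failed.append(name)
    return failed
```

The process survives every individual failure, which is exactly what separates ERROR from FATAL here.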
WARN
Indicates something unexpected or concerning happened that requires attention, but doesn't prevent the application from continuing operation.
Log at WARN level for two categories of issues:
- Recoverable errors: A specific task failed but the application recovered and can continue
- Serious operational issues: System health concerns that don't immediately stop the application, but need investigation
Examples:
WARN: Database connection failed, retrying
WARN: Disk usage at 95%
WARN: API response time exceeded 1s threshold, took 3.2s
WARN: Deprecated config option 'legacyAuth' used, this will be removed in the future
WARN: Retrying failed API call to payment service, attempt 2/3, backoff: 1000ms
WARN: Selected database replica-2 for read operation, replica-1 unhealthy
WARN: User lookup executed in 236ms, exceeded threshold
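One way to picture the "recoverable error" category is a retry helper: each failed attempt that will be retried is a WARN, and only the final, unrecovered failure becomes an ERROR. This is a rough sketch using Python's standard `logging` module; `call_with_retry` and its parameters are hypothetical names, and the fixed backoff is a simplification.

```python
import logging
import time

logger = logging.getLogger("app")  # hypothetical logger name


def call_with_retry(operation, attempts=3, backoff_s=0.1):
    """Call operation(); WARN on each recoverable failure, ERROR on the last."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == attempts:
                # No retries left: the task has now failed completely.
                logger.error("Giving up after %d attempts: %s", attempts, exc)
                raise
            # Still recoverable: the task carries on with another attempt.
            logger.warning(
                "Attempt %d/%d failed (%s), retrying in %.0fms",
                attempt, attempts, exc, backoff_s * 1000,
            )
            time.sleep(backoff_s)
```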
INFO
Provides visibility into the normal operation of the application.
Log at INFO level for successful operations, state changes, and important lifecycle events.
This is what a system operator should see in production.
Examples:
INFO: Server started on port 8080
INFO: User '[email protected]' logged in successfully
INFO: Processed 1500 items in batch job in 5.3s
INFO: Database migration from v5 to v6 completed
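If INFO is what an operator sees in production, it makes sense as the default threshold, with more detail available on request. Here is a small sketch of that using Python's standard `logging` module. The `LOG_LEVEL` environment variable and the `configure_logging` helper are assumptions for illustration; note that Python spells WARN as `WARNING` and treats FATAL as `CRITICAL`.

```python
import logging
import os

# Map this post's level names onto the constants Python's logging provides.
LEVELS = {
    "FATAL": logging.CRITICAL,
    "ERROR": logging.ERROR,
    "WARN": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def configure_logging(default="INFO"):
    """Set the root threshold from LOG_LEVEL (hypothetical variable name)."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    # Unknown names fall back to INFO, the normal production level.
    level = LEVELS.get(name, logging.INFO)
    # force=True (Python 3.8+) replaces any handlers configured earlier.
    logging.basicConfig(level=level, format="%(levelname)s: %(message)s", force=True)
    return level
```

An operator chasing a problem could then run the application with `LOG_LEVEL=debug` and get the DEBUG messages without a code change.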
DEBUG
Provides detailed diagnostic information for troubleshooting operational issues.
Use DEBUG when a user or operator needs to understand what the application is doing to diagnose a problem.
This could be high-level task inputs, request routing, major decision branches taken, API calls to external services, disk reads, etc.
Examples:
DEBUG: Loaded config from /etc/app/config.yaml
DEBUG: Request headers: {Authorization: Bearer xxx, Content-Type: application/json}
DEBUG: Routed request POST '/api/orders' to OrderController.create()
DEBUG: Requesting payment for order #55685 user '[email protected]'
DEBUG: Rate limit check high for user '[email protected]': 95/100 requests used
DEBUG: Task 'send_weekly_digest' skipped: feature flag disabled
DEBUG: Validation failed for field 'email': must match pattern /^...$/
DEBUG: Configuration reloaded from file (changes: log_level, max_connections)
DEBUG: Queue 'send_order_email' processed 10 items
DEBUG: Opening file '/tmp/order.csv' for import
DEBUG: Cache miss for key '[email protected]', fetching from database
DEBUG: User lookup executed in 32ms
TRACE
Provides extremely detailed information about application internals intended for developers debugging issues.
Use TRACE only for information that is useful during development, or that would help a future user debug something very specific.
Examples:
TRACE: Rate limit check for user '[email protected]': 12/100 requests used
TRACE: Enter order_create(), user: [email protected], items: {id:36, quantity: 3}, ...
TRACE: User lookup query: SELECT * FROM users WHERE email = '[email protected]'
TRACE: User '[email protected]' found
TRACE: Loop iteration 1/17: processing item at index 1, id:36, quantity: 3
TRACE: Exiting order_create(), finished in 67ms
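Not every logging library ships a TRACE level. Python's standard `logging` module, for example, stops at DEBUG, but lets you register a custom level below it. The sketch below does that; the numeric value 5 is an assumption (anything below `logging.DEBUG`, which is 10, would work), and `order_create` is a toy stand-in for the function in the examples above.

```python
import logging

TRACE = 5  # assumption: any value below logging.DEBUG (10) works
logging.addLevelName(TRACE, "TRACE")

logger = logging.getLogger("app.orders")  # hypothetical logger name


def order_create(user, items):
    """Toy function showing TRACE for internals and DEBUG for the outcome."""
    logger.log(TRACE, "Enter order_create(), user: %s, items: %s", user, items)
    for index, item in enumerate(items, start=1):
        # Per-iteration detail is developer-level noise, so it goes to TRACE.
        logger.log(TRACE, "Loop iteration %d/%d: processing item %s",
                   index, len(items), item)
    logger.debug("Created order with %d items for user '%s'", len(items), user)
    return len(items)
```

With the logger's threshold at DEBUG, the TRACE calls are filtered out and only the summary line is emitted.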
DEBUG vs TRACE
The distinction between DEBUG and TRACE is inherently fuzzy.
Ask three developers where the line is, and you'll get four different answers.
The truth is: there's no universal rule that works for every application.
The examples in this post show one way to think about it: DEBUG for operators debugging issues, TRACE for developers tracing internals.
Your application's users and use case might demand a different split.
Some teams log:
- Individual metric readings at DEBUG, aggregated summaries at INFO
- SQL queries at TRACE, but high-level query decisions at DEBUG
- Function entry/exit at TRACE for internal library code, but DEBUG for application code
- Everything at DEBUG and ignore TRACE entirely
What matters more than precision is consistency. Decide where your team draws the line, document it, and stick with it. The examples above are meant to inspire, not dictate. Your logging scheme should ultimately serve your actual debugging workflow, not philosophy.