Jump to content

Logstash/MicroHOWTO

From Wikitech

Logging micro-HOWTO

This page describes how to generate logs from internal services. Logs are usually captured by journald and then sent to Logstash.

Logs should be ECS compliant.

Use key-value pairs to AKA structured logging to allow better searching, filtering and visualization.

For example, logging mytool: auth.connection_pool: connected to host is not machine parsable.

Instead, the same log could could be: span=auth.connection_pool event.outcome=success organization.name=wmf.myteam transaction.id=123456 msg=connected to host.

Here is a set of useful key-value pairs:

Generic

Tracing

Allows tracking events across multiple components and multiple hosts.

  • span - describes one or few activities
  • trace.id - trace a series of logs across multiple systems
  • span.id - a single operation nested under trace.id
  • transaction.id - the highest level of work measured within a service, such as a request to a server

See https://www.elastic.co/guide/en/ecs/current/ecs-tracing.html

Errors

  • error.code - usually a short unique code
  • error.message - full message
  • error.stack_trace - full stack trace in plain text

See https://www.elastic.co/guide/en/ecs/current/ecs-error.html

For a full list of supported keywords see https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html

Python example

The structlog library helps making tracing easier.

import structlog
import logging
import sys

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer()
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(file=sys.stdout),
)
# Global, persistent logger
logger = structlog.get_logger().bind(
    organization_name="wmf.myteam",
    app_name="myservice"
)

# Typically a temporary logger is created for each incoming request. Sets trace_id from the request (or create a new unique one).

log = logger.bind(trace_id="abc123")

log.bind(span="auth")  # binds span=auth to further logs

# now everything we log here has "span=auth" set to help track we are doing authentication stuff:
log.info(
    "user_logged_in",
    user_id="123",
    method="oauth",
    event_outcome="success",
)

# Authentication is finished, so we override span=auth with span=math
log.bind(span="math")

try:
    1 / 0
except ZeroDivisionError:
    # Logs the error including the full stack trace
    log.error(
        "division_by_zero",
        error_message="division by zero",
        exc_info=True
    )