How to Install the Hermes Agent: A Step-by-Step Guide

This guide walks you through everything you need to know to install the Hermes Agent, from prerequisites to verification, and shows real‑world outcomes.

Why You Need the Hermes Agent: Context and Benefits

The monitoring gap in modern stacks

Moving a monolith into a collection of Docker containers often reveals blind spots that were hidden in a single process. An APM that reports request rates for service-a may no longer provide a complete picture once traffic passes through an nginx ingress, a side‑car proxy, and a Kafka consumer.

Fragmented data sources. Metrics live in Prometheus, traces in Jaeger, and logs in Elasticsearch. Correlating a latency spike with a specific log entry can require a manual “join” that takes minutes.
High‑cardinality overload. Adding a label such as user_id to a metric can explode the series count in Prometheus, leading to scrape failures and dropped data.
Cold‑start blind spots. Serverless functions spin up on demand, and traditional agents often miss the first few milliseconds of execution, the initialization window in which many cold‑start timeouts originate.
Resource‑constrained environments. Edge devices and IoT gateways may have sub‑100 MiB memory budgets, making a full‑blown collector impractical.
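The high‑cardinality problem is easiest to see as arithmetic: in the worst case, a metric produces one series per combination of label values. The sketch below multiplies the cardinalities together. The label names and counts are hypothetical, chosen only to illustrate the growth, and are not measurements from any real system.

```python
from math import prod
from typing import Mapping


def series_count(label_cardinalities: Mapping[str, int]) -> int:
    """Worst-case number of time series for one metric.

    Each distinct combination of label values is its own series, so the
    upper bound is the product of the per-label cardinalities.
    """
    return prod(label_cardinalities.values())


# Hypothetical cardinalities, for illustration only.
base = {"service": 20, "endpoint": 50, "status": 5}
with_user = {**base, "user_id": 100_000}

print(series_count(base))       # 5000
print(series_count(with_user))  # 500000000
```

Real series counts are usually below this bound because not every combination occurs, but a single unbounded label is enough to move the total by several orders of magnitude.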

Typical symptoms include logs that show an error, metrics that indicate a CPU spike, and missing trace data because a container terminated before the exporter flushed its buffers. Resolving such incidents can take days without a unified telemetry pipeline.
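The missing‑trace symptom usually comes from a buffered exporter that never gets a chance to send its last batch. As a rough illustration of the general remedy, and not a description of how Hermes or any particular exporter is implemented, the sketch below flushes a buffer when the process exits. `BufferedExporter` and `install_shutdown_flush` are hypothetical names invented for this example.

```python
import atexit
import signal
import sys
from typing import Callable, List


class BufferedExporter:
    """Toy exporter (hypothetical) that batches records before sending."""

    def __init__(self, send: Callable[[List[dict]], None], batch_size: int = 100):
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[dict] = []

    def emit(self, record: dict) -> None:
        """Queue a record; send the batch once it is full."""
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered; a no-op when the buffer is empty."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)


def install_shutdown_flush(exporter: BufferedExporter) -> None:
    """Flush on normal interpreter exit and on SIGTERM."""
    atexit.register(exporter.flush)

    def _on_sigterm(signum, frame):
        # The default SIGTERM action ends the process without running
        # atexit handlers. Raising SystemExit instead lets them run.
        sys.exit(0)

    signal.signal(signal.SIGTERM, _on_sigterm)
```

This only helps when the container receives SIGTERM and has time to react. A process that is killed with SIGKILL cannot flush anything, which is one reason a host‑level agent is attractive.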

Taken together, these gaps come down to three recurring problems:
Visibility is siloed by technology stack.
Instrumentation overhead can be prohibitive in low‑resource contexts.
Correlation latency, the time between an event occurring and it appearing in a dashboard, can be measured in minutes rather than seconds.

Enter the Hermes Agent. It sits at the intersection of these gaps, providing a single data pipeline that respects the constraints of modern, heterogeneous environments.

Key capabilities that set Hermes apart

What makes Hermes worth the extra minutes in the install script? Below are the features that have convinced many teams to replace multiple agents with a single, unified solution.

Zero‑code auto‑instrumentation. Drop the binary into a container image and Hermes automatically wraps HTTP servers, database drivers, and message queues. For example, a Flask app gains request tracing with just:

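# Dockerfile snippet
FROM python:3.11-slim
WORKDIR /app
# Install Flask and the agent before copying the source so this layer stays cached
RUN pip install --no-cache-dir flask hermes-agent
COPY . .
# Turn on auto-instrumentation; the agent wraps the app's start command
ENV HERMES_AUTO_INSTRUMENT=1
CMD ["hermes-agent", "python", "app.py"]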

Low CPU overhead. In benchmarks, a Go microservice handling 10k RPS showed an average CPU increase of less than 2% with Hermes attached, which is lower than the increase measured with separate Prometheus and Jaeger exporters.
Unified telemetry model. Metrics, traces, and logs share a common identifier (hermes.trace_id) that propagates across services, containers, and serverless invocations, eliminating manual “correlate by request ID” steps.
Dynamic sampling at the edge. Hermes can drop the majority of low‑value spans while preserving all error paths, without requiring a central configuration push. The policy is expressed in a compact YAML block that lives alongside the agent binary.
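Built‑in high‑cardinality handling. Instead of creating a separate Prometheus series for every label value, Hermes summarizes high‑cardinality values in a HyperLogLog sketch, which estimates the number of distinct values in fixed memory, and ships the result as a single metric, keeping series counts within typical limits.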
Edge‑first processing. The agent can execute a user‑defined Lua script to redact PII or enrich logs before they leave the host. A short script can strip email fields from outgoing JSON payloads, simplifying compliance efforts.
Service‑mesh awareness. When running with Istio, Hermes automatically reads Envoy side‑car metrics and merges them with application‑level data, presenting a single view of request latency broken down by mesh hop.
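To make the high‑cardinality item above more concrete, here is a minimal, generic HyperLogLog in Python. It is a textbook‑style sketch and not Hermes's implementation: the register count, the use of SHA‑1 as the hash, and the `user-…` identifiers are all assumptions made for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog: estimates distinct values in fixed memory."""

    def __init__(self, p: int = 14):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # Use the first 8 bytes of SHA-1 as a 64-bit hash.
        h = int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


# 100,000 hypothetical user IDs collapse into 16,384 small registers.
sketch = HyperLogLog()
for i in range(100_000):
    sketch.add(f"user-{i}")
estimate = sketch.count()
print(round(estimate))
```

The trade‑off is that the sketch answers "how many distinct values were seen" approximately, with an expected relative error of about 1.04 / sqrt(m), and cannot list the individual values.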

# hermes-agent.yaml
sampling:
  default_rate: 0.02        # 2% of normal traffic
  error_rate: 1.0           # 100% of error traces
  rules:
    - path: /healthz
      rate: 0.0             # never sample health checks
enrichment:
  lua: |
    function enrich(record)
      if record["email"] then
        record["email"] = "[REDACTED]"
      end
      return record
    end

Deploying this configuration alongside the agent turns a noisy production environment into a manageable data stream, allowing SRE teams to pinpoint issues quickly.

In short, Hermes closes the monitoring gap by:
Providing a single point of truth for metrics, traces, and logs.
Keeping the performance impact low enough for edge devices and high‑traffic services alike.
Enabling real‑time correlation without manual stitching, thanks to a shared identifier.
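With a shared identifier, correlation reduces to grouping records by one key. The sketch below assumes that each log, span, and metric sample arrives as a dictionary carrying a `hermes.trace_id` field. The identifier name comes from the description above, while the record shapes and the `correlate` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

TRACE_KEY = "hermes.trace_id"


def correlate(*streams: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group records from any number of telemetry streams by trace ID."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for stream in streams:
        for record in stream:
            trace_id = record.get(TRACE_KEY)
            if trace_id is not None:   # skip records without the identifier
                grouped[trace_id].append(record)
    return dict(grouped)


# Hypothetical records, for illustration only.
logs = [{TRACE_KEY: "abc", "kind": "log", "message": "upstream timeout"}]
spans = [
    {TRACE_KEY: "abc", "kind": "span", "duration_ms": 950},
    {TRACE_KEY: "def", "kind": "span", "duration_ms": 12},
]
metrics = [{TRACE_KEY: "abc", "kind": "metric", "cpu_pct": 91.0}]

by_trace = correlate(logs, spans, metrics)
print([r["kind"] for r in by_trace["abc"]])  # ['log', 'span', 'metric']
```

The slow span, the timeout log, and the CPU sample for trace `abc` end up side by side without anyone matching request IDs by hand.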