Berserk Docs

Service Metrics

OpenTelemetry metrics emitted by Berserk services

All Berserk services emit metrics via OpenTelemetry. Metrics are exported over OTLP to the configured collector endpoint and can be queried in Berserk itself.

Each metric name is prefixed with bzrk. followed by the service scope (e.g. bzrk.ui.query_duration).
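
As a small illustration of the naming convention, the sketch below splits a metric name into its service scope and local metric name. The helper `split_metric_name` is hypothetical and not part of Berserk. It assumes the scope is the single dot-separated token that follows the `bzrk.` prefix, as in the example above.

```python
from typing import Tuple


def split_metric_name(name: str) -> Tuple[str, str]:
    """Split 'bzrk.<scope>.<metric>' into (scope, metric).

    Hypothetical helper for illustration; assumes a one-token scope.
    """
    prefix = "bzrk."
    if not name.startswith(prefix):
        raise ValueError(f"not a Berserk metric name: {name!r}")
    # Everything up to the next dot is the scope, the rest is the metric.
    scope, sep, metric = name[len(prefix):].partition(".")
    if not sep or not scope or not metric:
        raise ValueError(f"missing scope or metric in {name!r}")
    return scope, metric
```

Grouping by the returned scope is one way to build per-service dashboard rows from a flat list of metric names.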

A pre-built Grafana dashboard is available for download: bzrk-service-metrics.json. Import it into Grafana and select your Berserk datasource to visualize all metrics below.

Ingest

Simplified OTLP ingest service that receives traces, metrics, and logs over HTTP/gRPC and uploads to S3 via ingest_client.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| bzrk.ingest.queue_rejections | counter | — | Total requests rejected due to admission control (semaphore exhaustion or dead stream actor) |
| bzrk.ingest.batch_flush_duration | histogram | ms | Duration of batch flush operations (S3 upload latency) |
| bzrk.ingest.batch_inputs | histogram | items | Incoming OTLP requests coalesced into one S3 batch flush. p50/p99 sizes the fan-in expected at the batch upload span links. |
| bzrk.ingest.data_dropped | counter | — | Total requests dropped before reaching a stream: missing/unresolvable ingest token, or routing failed because the target stream actor died |
| bzrk.ingest.time_since_last_upload_seconds | gauge | s | Worst-silent stream's seconds since last successful S3 upload |
| bzrk.ingest.inflight_requests | gauge | — | Admission permits in use across HTTP/gRPC/Loki transports |
| bzrk.ingest.buffer_bytes | gauge | bytes | Bytes buffered across stream actors (pod-level SUM) |
| bzrk.ingest.inflight_bytes | gauge | bytes | In-flight bytes reserved against the admission byte budget, the primary memory gate. Leading indicator of byte-budget pressure before throttling starts. |
| bzrk.ingest.inflight_bytes_limit | gauge | bytes | Total admission byte budget (auto-sized to a fraction of the cgroup memory limit). Constant; lets dashboards compute utilization without joining config. |
| bzrk.ingest.process_rss_bytes | gauge | bytes | Sampled process resident set size feeding the memory-ceiling admission gate |
| bzrk.ingest.memory_ceiling_bytes | gauge | bytes | RSS ceiling above which admission sheds (0 when the memory gate is disabled) |
| bzrk.ingest.phantom_write_retries | counter | — | Total retry attempts after an uncertain upload (held by PendingRetry) |
| bzrk.ingest.phantom_write_retry_budget_exhausted | counter | — | Total times a PendingRetry exhausted its budget without resolving (fell back to retryable error) |
| bzrk.ingest.invalid_otlp_total | counter | — | OTLP payloads rejected by the fast-path validator. Attributes: signal (traces |
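
The paired gauges above (bzrk.ingest.inflight_bytes with bzrk.ingest.inflight_bytes_limit, and bzrk.ingest.process_rss_bytes with bzrk.ingest.memory_ceiling_bytes) are meant to be divided to get a utilization figure. A minimal sketch of that arithmetic follows. The function name `utilization` is hypothetical, and treating a zero limit as "gate disabled" is taken from the memory_ceiling_bytes description; applying the same rule to the byte budget is an assumption.

```python
from typing import Optional


def utilization(used_bytes: float, limit_bytes: float) -> Optional[float]:
    """Return used/limit as a fraction, or None when there is no limit.

    A limit of 0 is treated as "gate disabled", so no ratio is reported
    instead of dividing by zero.
    """
    if limit_bytes <= 0:
        return None
    return used_bytes / limit_bytes


# Example: 768 MiB reserved against a 1 GiB admission byte budget.
budget_use = utilization(768 * 2**20, 1024 * 2**20)
```

Returning `None` rather than 0 keeps a disabled gate distinguishable from an idle one on a dashboard.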

Janitor

Background service responsible for segment lifecycle management: merging small segments into larger ones, deleting tombstoned segments from cloud storage, and running probe queries to monitor query service health.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| bzrk.janitor.segment_count | gauge | — | Current number of segments in the cluster |
| bzrk.janitor.total_data_size | gauge | bytes | Total size of all segment data in cloud storage |
| bzrk.janitor.segments_deleted | counter | — | Total segments deleted from cloud storage |
| bzrk.janitor.merge_cycle_duration | histogram | ms | Duration of segment merge cycles |
| bzrk.janitor.merge_failures | counter | — | Total failed merge cycles |
| bzrk.janitor.probe_duration | histogram | ms | Duration of probe query executions |
| bzrk.janitor.vsearch_merger_artifacts_emitted | counter | — | Merged segments where the merger rebuilt VCEN/VTPH/VTPC/VIDF. See docs/dev/vidx-vsearch-impl-plan.md PR 8. |
| bzrk.janitor.vsearch_merger_pre_feature_inputs | counter | — | Input segments to a merge that had no vsearch artifacts (pre-feature). Each increment indicates one input was skipped during VXXX rebuild. |
| bzrk.janitor.vsearch_merger_unstamped_rows | counter | — | Rows seen during merge that lacked a template_id.FIELD stamp. Persistent nonzero rate indicates ingest-side stamping isn't keeping up with merger fan-in. |
| bzrk.janitor.vsearch_merger_template_index_bytes | gauge | By | Peak transient JanitorTemplateIndex memory during a merge (sum of input VTPC embedding tables). docs/dev/vidx-vsearch.md 8.2 caps the risk; this metric flags the cap binding before OOM. |
| bzrk.janitor.vsearch_merger_duration_ms | histogram | ms | Wall-clock added to a merge by VXXX rebuild (loading input VTPCs + tier selection + writing output VCEN/VTPH/VTPC/VIDF). Excludes the base ROWS-merger time. |
| bzrk.janitor.probes_completed | counter | — | Total probe queries that completed successfully. Used as the canonical 'query service is reachable' signal: the rate_below alert below fires when the count stops arriving, which only happens if the query service is genuinely unavailable or the janitor itself is stuck. See .claude/skills/berserk-observability/references/alert-framework.md for the canary design. |
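
The canary idea behind bzrk.janitor.probes_completed can be sketched as a rate check over counter samples. This is only an illustration under stated assumptions: the samples are cumulative counter values, and the names `counter_rate` and `rate_below` as Python functions, along with the threshold, are hypothetical. The actual alert definition lives in the alert framework referenced above.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix_seconds, cumulative_counter_value)


def counter_rate(samples: List[Sample]) -> float:
    """Per-second increase between the first and last sample.

    Fewer than two samples means nothing arrived in the window, so the
    rate is reported as 0. A drop in value is treated as a counter reset.
    """
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    increase = v1 - v0 if v1 >= v0 else v1  # after a reset, count from zero
    return increase / (t1 - t0)


def rate_below(samples: List[Sample], min_rate: float) -> bool:
    """True when probes complete more slowly than min_rate per second."""
    return counter_rate(samples) < min_rate
```

With this shape, a stalled janitor and an unreachable query service look the same (no new completions), which matches the two causes named in the description.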

Nursery

Ingestion service that receives OpenTelemetry data from the collector, converts it into segments, and manages segment merging for optimal query performance.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| bzrk.nursery.streams_active | up_down_counter | — | Number of currently active stream followers |
| bzrk.nursery.ingest_lag_seconds | gauge | s | Lag of the most-stale active stream (seconds since its last ingest_time) |
| bzrk.nursery.download_duration_ms | histogram | ms | S3 segment download duration |
| bzrk.nursery.conversion_duration_ms | histogram | ms | Protobuf to segment conversion duration |
| bzrk.nursery.total_duration_ms | histogram | ms | Total segment processing duration (download + conversion) |
| bzrk.nursery.bytes_ingested | counter | By | Total compressed bytes downloaded from S3 (use rate() for throughput) |
| bzrk.nursery.bytes_ingested_uncompressed | counter | By | Total uncompressed proto bytes ingested (use rate() for throughput) |
| bzrk.nursery.segment_output_bytes | counter | By | Total bytes of segment files produced (use rate() for throughput) |
| bzrk.nursery.data_errors | counter | — | Data errors (malformed protobuf, conversion failures) |
| bzrk.nursery.infra_errors | counter | — | Infrastructure errors (S3 failures, I/O errors) |
| bzrk.nursery.active_streams | gauge | — | Number of active streams reported by Meta |
| bzrk.nursery.closed_streams | gauge | — | Number of closed streams reported by Meta |
| bzrk.nursery.merge_count | counter | — | Total number of completed merges |
| bzrk.nursery.merge_inputs | histogram | segments | Ingest segments consumed by one baby-segment merge. p50/p99 sizes the fan-in expected at the nursery merge span links. |
| bzrk.nursery.merge_output_size_mb | histogram | MB | Compressed output size of merged segments |
| bzrk.nursery.merge_duration | histogram | ms | Duration of segment merge operations |
| bzrk.nursery.merge_speed_mbps | histogram | MB/s | Merge throughput in megabytes per second |
| bzrk.nursery.oldest_unmerged_data_age_seconds | gauge | s | Age of the oldest unmerged baby segment in seconds |
| bzrk.nursery.events_ingested | counter | — | Total events ingested across all streams |
| bzrk.nursery.forward_dated_events_clamped | counter | — | Events whose OTLP timestamp was in the future relative to ingest_time and got clamped to ingest_time on write. Never drops the row. Non-zero indicates clock skew at the source; the map-reduce-state cache's monotonic-ingest invariant still holds because the row's timestamp is now ≤ its ingest_time. |
| bzrk.nursery.ingest_delay | histogram | ms | Delay between event timestamp and ingest time |
| bzrk.nursery.routing_unknown_table | counter | — | Dropped segments where the routing key did not match any table in the token's database |
| bzrk.nursery.vsearch_seal_artifacts_emitted | counter | — | Segments whose seal wrote VCEN/VTPH/VTPC/VIDF. Increments by 1 per sealed segment that produced vsearch artifacts. |
| bzrk.nursery.vsearch_seal_artifacts_skipped | counter | — | Segments where seal skipped vsearch artifact emission (no model configured, or no vsearch_fields). Increments by 1 per such segment. |
| bzrk.nursery.vsearch_embedding_cache_hits | counter | — | Template-hash cache hits in the seal-time embedding cache. High hit rate = log data is template-clustered as expected. |
| bzrk.nursery.vsearch_embedding_cache_misses | counter | — | Template-hash cache misses, i.e. model.encode() invocations at seal time. |
| bzrk.nursery.vsearch_merger_artifacts_emitted | counter | — | Merged segments where the merger rebuilt VCEN/VTPH/VTPC/VIDF. See docs/dev/vidx-vsearch-impl-plan.md PR 8. |
| bzrk.nursery.vsearch_merger_pre_feature_inputs | counter | — | Input segments to a merge that had no vsearch artifacts (pre-feature). Each increment indicates one input was skipped during VXXX rebuild. |
| bzrk.nursery.vsearch_merger_unstamped_rows | counter | — | Rows seen during merge that lacked a template_id.FIELD stamp. Persistent nonzero rate indicates ingest-side stamping isn't keeping up with merger fan-in. |
| bzrk.nursery.vsearch_merger_template_index_bytes | gauge | By | Peak transient JanitorTemplateIndex memory during a merge (sum of input VTPC embedding tables). docs/dev/vidx-vsearch.md 8.2 caps the risk; this metric flags the cap binding before OOM. |
| bzrk.nursery.vsearch_merger_duration_ms | histogram | ms | Wall-clock added to a merge by VXXX rebuild (loading input VTPCs + tier selection + writing output VCEN/VTPH/VTPC/VIDF). Excludes the base ROWS-merger time. |
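
The clamping behaviour counted by bzrk.nursery.forward_dated_events_clamped can be sketched as below. This is an illustration of the rule as described (clamp to ingest_time, never drop the row), not the nursery's actual code. The function name, the nanosecond integer timestamps, and the plain-integer counter are assumptions.

```python
from typing import List, Tuple


def clamp_forward_dated(
    event_timestamps_ns: List[int], ingest_time_ns: int
) -> Tuple[List[int], int]:
    """Clamp future-dated timestamps to ingest_time.

    Returns the adjusted timestamps (same length as the input, since no
    row is dropped) and how many were clamped, which is what the counter
    would be incremented by.
    """
    clamped = 0
    adjusted = []
    for ts in event_timestamps_ns:
        if ts > ingest_time_ns:
            ts = ingest_time_ns  # forward-dated: pull back to ingest_time
            clamped += 1
        adjusted.append(ts)
    return adjusted, clamped
```

After clamping, every timestamp is at most ingest_time, which is the invariant the description says downstream caching relies on.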

Query

Query execution service that receives KQL queries over HTTP and gRPC, plans and executes them against segments, and streams results back to clients.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| bzrk.query.execution_duration | histogram | ms | End-to-end query execution duration |
| bzrk.query.requests | counter | — | Total query requests received |
| bzrk.query.result_rows | histogram | — | Number of rows returned per query |
| bzrk.query.errors | counter | — | Total query errors by error type |
| bzrk.query.open_fds | gauge | — | bzrk_lib::count_open_fds() periodic sample (10s interval). Pair with bzrk.query.fd_limit to compute open_fds / fd_limit on dashboards/alerts without joining against startup logs. apps/query in cache_mode=remote holds a UDS connection per worker task plus the SCM_RIGHTS cache_fd + shm_fd passed by cache_server, so the count tracks engine concurrency directly. Symmetric with bzrk.cache_server.open_fds. |
| bzrk.query.fd_limit | gauge | — | Current RLIMIT_NOFILE soft cap. Companion to open_fds sampled on the same 10s tick so dashboards can show "fds: N / LIMIT (X%)" and alerts can fire on open_fds / fd_limit > 0.8 before saturation. Production binaries raise the soft limit to the hard cap at startup, so this is effectively static; emitting it as a gauge keeps the query simple. |
| bzrk.query.routing_decisions | counter | — | Sessions opened by PoolBackedQwsTransport, attributed by the routing decision taken (mode): • sticky: QC supplied a target_node_id and the live member was found in the pool snapshot. Ring routing held end-to-end. • fallback_walk: target_node_id supplied, but the targeted member was gone from the live snapshot. Fell back to the partition ring's next-priority node for the batch's first segment. • fallback_round_robin: target_node_id supplied, member was gone, and the priority walk failed too (snapshot didn't cover the segment). Degraded to round-robin: NOT ring-aware. A non-zero rate here is the signature of seed-vs-read disagreement surviving the ring's protections. • no_target_round_robin: QC didn't supply a target_node_id at all (no partition snapshot at coordinator time → bootstrap / in-process / non-sticky). NOT ring-aware. Sustained non-zero on a fully-bootstrapped cluster means the coordinator never saw a snapshot. • pool_empty: no QwsCloud members visible at session-open. Session will fail on first send; pool likely re-populating after a pod rollout. The counter exists to spot the two "NOT ring-aware" rows showing up at non-trivial rates, the failure mode that would let a query land on a non-owner pod and cold-fetch a freshly-seeded segment. |
| bzrk.query.vsearch_queries | counter | — | vsearch queries handled by the coordinator (one increment per query containing a vsearch operator). |
| bzrk.query.vsearch_query_latency_ms | histogram | ms | End-to-end vsearch query latency (coordinator encode + worker scatter/gather + reducer merge). |
| bzrk.query.vsearch_segment_ctx_built | counter | — | Successful per-segment SegmentVsearchContext builds during query execution (segment had VCEN+VTPH for the queried field). |
| bzrk.query.vsearch_segment_ctx_skipped | counter | — | Per-segment context builds that returned None (segment lacked VCEN or VTPH for the queried field: pre-feature or wrong field_id). Worker degrades to BM25-only for these. |
| bzrk.query.vsearch_chunk_gate_admits | counter | — | ROWS chunks admitted by the vsearch tier-1 gate (composite alpha*template_sim_ub + (1-alpha)*bm25_max >= tau_chunk). |
| bzrk.query.vsearch_chunk_gate_drops | counter | — | ROWS chunks dropped by the vsearch tier-1 gate before row scan. |
| bzrk.query.vsearch_degraded_to_bm25 | counter | — | Queries where the binder set degraded_to_bm25=true on the VSearchScore op (lineage-modifying upstream op: parse/mv-expand). Per-query, not per-row. |
| bzrk.query.vsearch_precompute_duration_us | histogram | us | Per-(segment, query) SegmentPrecompute build time. docs/dev/vidx-vsearch.md 5.6 estimates ~10-30 us at defaults. |
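
The tier-1 gate condition quoted for bzrk.query.vsearch_chunk_gate_admits, alpha*template_sim_ub + (1-alpha)*bm25_max >= tau_chunk, can be written out as a short sketch. Only the inequality comes from the description; the function name and the example values are hypothetical, and how the two inputs are scaled or normalized is not specified here.

```python
def chunk_gate_admits(
    template_sim_ub: float, bm25_max: float, alpha: float, tau_chunk: float
) -> bool:
    """Return True when a chunk passes the composite tier-1 gate.

    The composite score blends an upper bound on template similarity with
    the chunk's maximum BM25 score, weighted by alpha.
    """
    composite = alpha * template_sim_ub + (1.0 - alpha) * bm25_max
    return composite >= tau_chunk


# Hypothetical values: an admitted chunk would count toward
# vsearch_chunk_gate_admits, a rejected one toward vsearch_chunk_gate_drops.
admitted = chunk_gate_admits(template_sim_ub=0.8, bm25_max=0.4, alpha=0.5, tau_chunk=0.5)
```

Comparing the admits and drops counters then gives the fraction of chunks the gate lets through to the row scan.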

UI

Web UI for querying Berserk.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| bzrk.ui.query_duration | histogram | ms | Duration of proxied queries from start to stream completion |
| bzrk.ui.site_visits | counter | — | Number of page visits to the UI |
| bzrk.ui.browser_span_duration | histogram | ms | Duration of spans reported by the browser via /api/telemetry/spans |
