[Bug]: Adaptive sampling doesn't work as expected #6550

Open
@beyimjan

Description

What happened?

I have set up Jaeger Collector to receive traces from more than 70 services in production. The trace throughput is high enough to fill 400 GiB of storage every day.

I followed these instructions to set up adaptive sampling on the collector:
Adaptive Sampling Setup

I use ScyllaDB 5.1.5.

Even though I followed those instructions, prod-jaeger-collector:14268/api/sampling returns the same default sampling configuration for every service, with no calculated probabilities:

{
  "strategyType": "PROBABILISTIC",
  "operationSampling": {
    "defaultSamplingProbability": 0.001,
    "defaultLowerBoundTracesPerSecond": 0.016666666666666666,
    "perOperationStrategies": [],
    "defaultUpperBoundTracesPerSecond": 0
  }
}
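
For readers triaging this, here is a rough sketch of how the response above can be read. The default probability of 0.001 matches the `--sampling.initial-sampling-probability=0.001` argument shown in the deployment configs, the lower bound of 0.0166... is 1/60 (one trace per minute), and `perOperationStrategies` is empty. The helper name `looks_uncomputed` is hypothetical and not part of Jaeger; it only encodes that reading of the payload.

```python
import json
import math

# Response body copied from the report above.
REPORTED = json.loads("""
{
  "strategyType": "PROBABILISTIC",
  "operationSampling": {
    "defaultSamplingProbability": 0.001,
    "defaultLowerBoundTracesPerSecond": 0.016666666666666666,
    "perOperationStrategies": [],
    "defaultUpperBoundTracesPerSecond": 0
  }
}
""")


def looks_uncomputed(strategy: dict, initial_probability: float) -> bool:
    """Hypothetical helper: True when the response carries no per-operation
    probabilities and its default still equals the configured initial value."""
    ops = strategy.get("operationSampling", {})
    has_per_operation = bool(ops.get("perOperationStrategies"))
    default_prob = ops.get("defaultSamplingProbability", -1.0)
    return (not has_per_operation) and math.isclose(default_prob, initial_probability)


ops = REPORTED["operationSampling"]
# The lower bound is one trace per minute, expressed per second.
print(math.isclose(ops["defaultLowerBoundTracesPerSecond"], 1 / 60))
# The payload still looks like the untouched initial configuration.
print(looks_uncomputed(REPORTED, initial_probability=0.001))
```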

Steps to reproduce

  1. Set up Jaeger Collector with adaptive sampling using the provided instructions.
  2. Configure the collector to connect to ScyllaDB 5.1.5.
  3. Query the sampling endpoint prod-jaeger-collector:14268/api/sampling.
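
Step 3 can be scripted to compare the responses across services. This is a minimal sketch, assuming the endpoint takes the service name as a `service` query parameter and is reachable over plain HTTP from where the script runs. The function names and the service names in the usage note are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable

# Host and port taken from the report; the http:// scheme is an assumption.
BASE_URL = "http://prod-jaeger-collector:14268/api/sampling"


def sampling_url(service: str, base_url: str = BASE_URL) -> str:
    """Build the sampling URL for one service (assumes a `service` query parameter)."""
    return f"{base_url}?{urllib.parse.urlencode({'service': service})}"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def fetch_strategies(
    services: Iterable[str],
    fetch: Callable[[str], dict] = http_get_json,
) -> Dict[str, dict]:
    """Return the sampling strategy reported for each service."""
    return {service: fetch(sampling_url(service)) for service in services}


def all_identical(strategies: Dict[str, dict]) -> bool:
    """True when every service received exactly the same strategy body."""
    bodies = {json.dumps(body, sort_keys=True) for body in strategies.values()}
    return len(bodies) <= 1
```

For example, `all_identical(fetch_strategies(["checkout", "cart"]))` returning `True` would reproduce the behavior described here. The `fetch` argument is injectable so the comparison can be tried against saved responses without network access.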

Expected behavior

I expected adaptive sampling to calculate probabilities and return different sampling configurations for different services, rather than returning the default configuration for all of them.

Relevant log output

Screenshot

No response

Additional context

No response

Jaeger backend version

v1.53.0

SDK

OpenTelemetry SDKs

Pipeline

OTEL SDK -> OTEL collector -> Jaeger Collector -> ScyllaDB

Storage backend

ScyllaDB 5.1.5

Operating system

Linux

Deployment model

Kubernetes

Deployment configs

Helm chart 0.70.0

ingester:
  enabled: false
agent:
  enabled: false
spark:
  enabled: false
esIndexCleaner:
  enabled: false
esRollover:
  enabled: false
esLookback:
  enabled: false
hotrod:
  enabled: false
collector:
  enabled: true
  image: jaegertracing/jaeger-collector
  tag: 1.53.0
  replicaCount: 1
  podSecurityContext: {}
  securityContext: {}
  resources:
    limits:
      cpu: 1
      memory: 1Gi
    requests:
      cpu: 500m
      memory: 512Mi
  service:
    otlp:
      grpc:
        port: 4317
      http:
        port: 4318


Resulting container spec for the collector pod:


containers:
  - args:
      - '--sampling.initial-sampling-probability=0.001'
      - '--sampling.target-samples-per-second=1'
    env:
      - name: SAMPLING_STORAGE_TYPE
        value: cassandra
      - name: SAMPLING_CONFIG_TYPE
        value: adaptive
      - name: COLLECTOR_OTLP_ENABLED
        value: 'true'
      - name: SPAN_STORAGE_TYPE
        value: cassandra
      - name: CASSANDRA_SERVERS
        value: <CASSANDRA_SERVERS>
      - name: CASSANDRA_PORT
        value: '9042'
      - name: CASSANDRA_KEYSPACE
        value: jaeger_v1_scylla_prod
      - name: CASSANDRA_USERNAME
        value: jaeger
      - name: CASSANDRA_PASSWORD
        valueFrom:
          secretKeyRef:
            key: password
            name: prod-jaeger-cassandra
    image: jaegertracing/jaeger-collector:1.53.0
