The Problem

Observing distributed systems is not a trivial task. We have several machines running, each hosting workloads that mind their own business, make network calls, and print logs. That makes one think: “how on earth am I going to keep it all together?”

One opinionated solution

Well, there’s a widely adopted tool that works as the central piece, gracefully turning chaos into an ordered flow. It is called the OpenTelemetry Collector, and this article approaches it in an opinionated way.

Architecture overview

  • One otel-collector runs as a DaemonSet to capture and process metrics, logs, and traces
  • One trace-collector runs as a Deployment to reconstruct traces across different nodes
  • Each node gets a subset of app metrics to scrape
  • Metrics are scraped from Apps, Kubernetes, and the Host itself, and then forwarded to Victoriametrics
  • Log files are tailed, and forwarded to Victorialogs
  • Traces are received from app instrumentation, glued together in trace-collector, and forwarded to Victoriatraces.
    • They are also parsed to calls_total and duration_milliseconds_.* metrics

Introduction

The collector has three main pieces responsible for orchestrating the telemetry flow: Receivers, Processors, and Exporters. The opentelemetry-collector-contrib repo offers an abundance of options for all three.

These three pieces are then orchestrated by a Pipeline, and for transforming between signals (e.g. traces into metrics), there is the Connector.
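
As a minimal sketch (component names here are illustrative, not the full config used later), the wiring looks like this: a connector is declared once, then acts as an exporter in one pipeline and a receiver in another.

```yaml
# Hypothetical minimal collector config showing the four concepts.
receivers:
  otlp:
    protocols:
      grpc:
processors:
  batch:
exporters:
  debug:
connectors:
  spanmetrics:        # bridges the traces pipeline into the metrics pipeline
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [spanmetrics]   # connector used as an exporter
    metrics:
      receivers: [spanmetrics]   # same connector used as a receiver
      processors: [batch]
      exporters: [debug]
```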

Installing the collector

The easiest way is via the operator, which installs several CustomResourceDefinitions. The one we are mainly interested in is OpenTelemetryCollector. If you $ kubectl explain opentelemetrycollector --recursive, the sheer number of settings can be a bit scary. But we’ll approach the most relevant ones in this article.

Telemetry pipelines

metrics:
  receivers: 
    - hostmetrics
    - hostmetrics/disk # Get host metrics
    - kubeletstats # Get Kubernetes metrics
    - spanmetrics # Get parsed metrics from Traces
    - prometheus # Get scraped metrics from the Cluster's service monitors
  processors:
    - memory_limiter # Keep things cool
    - resource/instance # Set "instance" label so we know, in which machine, the daemonset pod has processed the telemetry
    - k8sattributes # Label metrics with infos from the Kubernetes workload
    - transform/metrics # Sanitize non-needed labels
    - batch # Batch stuff so we don't DDoS the timeseries backend
  exporters: 
    - prometheusremotewrite # Write to the telemetry backend

logs:
  receivers:
    - filelog # Tail container log files in the host
    - otlp # Possibly receive other logs from apps' OpenTelemetry instrumentation
  processors:
    - memory_limiter # Keep things cool
    - transform/logs # Drop unwanted fields
    - batch # Don't DDoS the log storage backend
  exporters: 
    - otlphttp/victoriametrics # Write to the log backend

traces:
  receivers: 
    - otlp # Receive traces from apps' instrumentation
  processors: 
    - memory_limiter # Keep things cool
    - k8sattributes # Enrich trace metadata
    - resource/instance # Say which node has processed the span
    - transform/spanmetrics # Parse spans to metrics
    - batch # Don't DDoS the trace storage backend
  exporters: 
    - spanmetrics # Send newly-parsed metrics
    - loadbalancing # Send spans to trace-collector for gathering cross-node spans

The OpenTelemetry Collector has the features to make this happen; the pipelines above are the pieces that achieve such a telemetry flow.

Metrics

These come from four different sources: the host itself (hostmetrics and hostmetrics/disk), Kubernetes (kubeletstats), traces parsed into metrics (spanmetrics), and the cluster’s scrape targets (prometheus)

Are processed with

  • memory_limiter processor
    • Ensures the collector doesn’t exceed its memory limit and crash, at the expense of losing data
  • resource processor
    • Adds arbitrary labels to the metric
  • k8sattributes processor
    • Labels the metric with metadata from the Kubernetes workload
  • transform processor
    • Adds and removes metric labels, aiming to keep cardinality as low as possible
  • batch processor
    • As the name suggests, gathers metrics before sending them, to save some network calls

And forwarded to

  • prometheusremotewrite exporter
    • Sends metrics to Victoriametrics’ insert workload
      • In my cluster, prometheusremotewrite was noticeably more memory-friendly than otlphttp

Logs

These can come from two origins

  • filelog receiver
    • Tails pod log files (assuming containerd as the CRI), parses them to JSON when applicable, and extracts resource metadata out of them
  • otlp receiver
    • Lets apps send arbitrary logs that aren’t written to stdout/stderr

Are processed with

  • memory_limiter processor
    • Ensures the collector doesn’t exceed its memory limit and crash, at the expense of losing data
  • transform processor
    • Aiming to keep storage as low as possible, drops everything we don’t need as a filterable field
  • batch processor
    • Same batching as in the metrics pipeline, to save network calls

And forwarded to

  • otlphttp exporter
    • Sends to Victorialogs insert workload

Traces

These can come from a single origin

  • otlp receiver
    • App instrumentation reports traces via OTLP to its node’s Collector, on the hostPort opened by the DaemonSet
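
On the app side, one common way to point the SDK at the node-local Collector is the standard OTEL_EXPORTER_OTLP_ENDPOINT variable, resolving the node IP via the downward API. A sketch of the relevant container spec snippet (variable names other than the standard ones are illustrative):

```yaml
# Fragment of an app Deployment's container spec (assumed names)
env:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP        # the node the pod landed on
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://$(NODE_IP):4317"     # the DaemonSet's gRPC hostPort
```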

Are processed with

  • memory_limiter processor
    • Ensures the collector doesn’t exceed its memory limit and crash
  • k8sattributes processor
    • Enriches span metadata with infos from the Kubernetes workload
  • resource/instance processor
    • Says which node has processed the span
  • transform/spanmetrics processor
    • Copies the namespace onto span attributes before they are parsed into metrics
  • batch processor

And forwarded to

  • spanmetrics connector
    • Which is also one of the metrics pipeline’s receivers
  • loadbalancing exporter
    • Routes spans by traceID, so all spans of a trace end up in the same trace-collector pod, which is necessary for tail sampling
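
For the loadbalancing exporter’s dns resolver to see individual pod IPs rather than a single ClusterIP, the trace-collector hostname needs to resolve to every backend pod. The operator already creates a headless Service for its collectors you can point at; if you were wiring this by hand, it would look roughly like this (the selector label is an assumption, check what your pods actually carry):

```yaml
# Illustrative headless Service: DNS returns one A record per pod,
# letting the loadbalancing exporter hash traceIDs across backends
apiVersion: v1
kind: Service
metadata:
  name: trace-collector
spec:
  clusterIP: None   # headless
  selector:
    app.kubernetes.io/component: opentelemetry-collector   # assumed label
  ports:
    - name: grpc
      port: 4317
      targetPort: 4317
```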

Appendix: Full manifests

otel-collector DaemonSet

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  mode: daemonset
  image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.140.1
  resources:
    requests:
      cpu: 100m
      memory: 448Mi
    limits:
      cpu: "2"
      memory: 1536Mi
  ports:
    - name: grpc
      port: 4317
      hostPort: 4317
      targetPort: 4317
    - name: http
      port: 4318
      hostPort: 4318
      targetPort: 4318

  observability:
    metrics:
      enableMetrics: true
  env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName

  targetAllocator:
    enabled: true
    image: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.138.0
    resources:
      requests:
        cpu: 25m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
    allocationStrategy: per-node
    prometheusCR:
      enabled: true
      scrapeInterval: 30s
      serviceMonitorSelector: {}

  volumes:
    - name: varlogpods
      hostPath:
        path: /var/log/pods

  volumeMounts:
    - name: varlogpods
      mountPath: /var/log/pods

  config:
    extensions:
      health_check:
        endpoint: ${env:POD_IP}:13133

    # https://opentelemetry.io/docs/collector/components/receiver/
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      filelog:
        include:
        - /var/log/pods/*/*/*.log
        include_file_name: false
        include_file_path: true
        retry_on_failure:
          enabled: true
        start_at: beginning
        operators:
        - id: parser-containerd
          type: regex_parser 
          regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time

        - id: parser-pod-info
          parse_from: attributes["log.file.path"]
          regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
          type: regex_parser

        # Handle line breaks
        - type: recombine
          is_last_entry: attributes.logtag == 'F'
          combine_field: attributes.log
          combine_with: ""
          max_batch_size: 1000
          max_log_size: 1048576
          output: handle_empty_log
          source_identifier: attributes["log.file.path"]
        - field: attributes.log
          id: handle_empty_log
          if: attributes.log == nil
          type: add
          value: ""

        - type: json_parser
          parse_from: attributes.log
          if: attributes.log matches "^\\{"

        - type: add
          field: attributes.instance
          value: ${env:K8S_NODE_NAME}

        - id: export
          type: noop

      hostmetrics:
        collection_interval: 30s
        root_path: /
        scrapers:
          cpu:
            metrics:
              system.cpu.time:
                enabled: true
              system.cpu.utilization:
                enabled: true
              system.cpu.physical.count:
                enabled: true
          memory:
            metrics:
              system.memory.usage:
                enabled: true
              system.memory.utilization:
                enabled: true
              system.memory.limit:
                enabled: true
          load:
            cpu_average: true
            metrics:
              system.cpu.load_average.1m:
                enabled: true
              system.cpu.load_average.5m:
                enabled: true
              system.cpu.load_average.15m:
                enabled: true
          network:
            metrics:
              system.network.connections:
                enabled: true
              system.network.dropped:
                enabled: true
              system.network.errors:
                enabled: true
              system.network.io:
                enabled: true
              system.network.packets:
                enabled: true
              system.network.conntrack.count:
                enabled: true
              system.network.conntrack.max:
                enabled: true

      hostmetrics/disk:
        collection_interval: 1m
        root_path: /
        scrapers:
          disk:
            metrics:
              system.disk.io:
                enabled: true
              system.disk.operations:
                enabled: true
          filesystem:
            metrics:
              system.filesystem.usage:
                enabled: true
              system.filesystem.utilization:
                enabled: true

      kubeletstats:
        collection_interval: 30s
        auth_type: "serviceAccount"
        endpoint: "https://${env:K8S_NODE_NAME}:10250"
        insecure_skip_verify: true
        collect_all_network_interfaces:
          node: true
          pod: true

      prometheus:
        target_allocator:
          collector_id: ${env:POD_NAME}
          endpoint: http://otel-targetallocator
          interval: 30s
        config:
          scrape_configs:
          - job_name: otel-collector
            scrape_interval: 30s
            static_configs:
              - targets:
                  - ${env:POD_IP}:8888

    # https://opentelemetry.io/docs/collector/components/processor/
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_max_size: 2048
        send_batch_size: 1024
        timeout: 1s
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        filter:
          node_from_env_var: K8S_NODE_NAME
        extract:
          metadata:
            - k8s.namespace.name
            - k8s.deployment.name
            - k8s.replicaset.name
            - k8s.statefulset.name
            - k8s.daemonset.name
            - k8s.cronjob.name
            - k8s.job.name
            - k8s.node.name
            - k8s.pod.name
            - k8s.pod.ip
            - k8s.container.name
            - container.id
          labels:
            - tag_name: owner
              key: app.kubernetes.io/owner
              from: pod
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: k8s.pod.name
          - from: resource_attribute
            name: k8s.namespace.name
        - sources:
          - from: connection
      resource/instance:
        attributes:
        # Sets "instance" label on metrics
        - action: upsert
          key: service.instance.id
          value: ${env:K8S_NODE_NAME}
      transform/logs:
        error_mode: ignore
        log_statements:
          # Keep only essential fields
          - statements:
            - set(log.attributes["namespace"], resource.attributes["namespace"])
            - keep_matching_keys(log.attributes, "^(_.*|@.*|filename|log|service|job|agent|k8s\\.|container_name|instance|level|msg|message|namespace|pod_name|severity|severity_text|stream)")
          - conditions: IsMap(log.body)
            statements:
              - keep_matching_keys(log.body, "^(level|msg|message|severity|severity_text)$")

      transform/metrics:
        error_mode: ignore
        metric_statements:
        - statements:
            - set(datapoint.attributes["env"], resource.attributes["k8s.cluster.name"])
            - set(datapoint.attributes["owner"], resource.attributes["owner"]) where resource.attributes["owner"] != nil
            - set(datapoint.attributes["namespace"], resource.attributes["k8s.namespace.name"]) where resource.attributes["k8s.namespace.name"] != nil and resource.attributes["k8s.namespace.name"] != "kube-system"
            - set(datapoint.attributes["pod"], resource.attributes["k8s.pod.name"]) where resource.attributes["k8s.pod.name"] != nil
            - set(datapoint.attributes["container"], resource.attributes["k8s.container.name"]) where resource.attributes["k8s.container.name"] != nil

            # Normalize label names for kube-state-metrics, ingress-nginx, etc.
            - set(datapoint.attributes["namespace"], datapoint.attributes["exported_namespace"]) where datapoint.attributes["exported_namespace"] != nil and resource.attributes["k8s.namespace.name"] != "kube-system"
            - set(datapoint.attributes["service"], datapoint.attributes["exported_service"]) where datapoint.attributes["exported_service"] != nil
            - set(datapoint.attributes["pod"], datapoint.attributes["exported_pod"]) where datapoint.attributes["exported_pod"] != nil
            - set(datapoint.attributes["container"], datapoint.attributes["exported_container"]) where datapoint.attributes["exported_container"] != nil

        - statements:
            - delete_key(datapoint.attributes, "exported_namespace") where datapoint.attributes["exported_namespace"] != nil
            - delete_key(datapoint.attributes, "exported_service") where datapoint.attributes["exported_service"] != nil
            - delete_key(datapoint.attributes, "exported_pod") where datapoint.attributes["exported_pod"] != nil
            - delete_key(datapoint.attributes, "exported_container") where datapoint.attributes["exported_container"] != nil

      transform/spanmetrics:
        error_mode: silent
        trace_statements:
        - statements:
            - set(span.attributes["namespace"], resource.attributes["k8s.namespace.name"]) where resource.attributes["k8s.namespace.name"] != nil
            - set(span.attributes["namespace"], resource.attributes["service.namespace"]) where resource.attributes["service.namespace"] != nil

    connectors:
      spanmetrics:
        aggregation_cardinality_limit: 100000
        dimensions:
          - name: namespace
          - name: http.route
          - name: http.method
          - name: http.status_code
        exclude_dimensions:
          - status.code
          - span.name
          - span.kind
          - service.name # The "job" label usually carries the same value
        histogram:
          explicit:
            buckets:
            - 10ms
            - 50ms
            - 100ms
            - 250ms
            - 500ms
            - 1s
            - 2s
            - 5s
        metrics_expiration: 1m
        metrics_flush_interval: 30s
        namespace: ""

    # https://opentelemetry.io/docs/collector/components/exporter/
    exporters:
      debug: {}
      prometheusremotewrite:
        endpoint: http://vmmetrics-insert.victoriametrics:8480/insert/0/prometheus
        timeout: 30s
        retry_on_failure:
          enabled: true
          initial_interval: 10s
          max_interval: 60s
          max_elapsed_time: 300s
      otlphttp/victoriametrics:
        compression: gzip
        encoding: proto
        logs_endpoint: http://vmlogs-insert.victoriametrics:9481/insert/opentelemetry/v1/logs
        tls:
          insecure: true
      loadbalancing:
        routing_key: traceID
        resolver:
          dns:
            hostname: trace-collector
        protocol:
          otlp:
            tls:
              insecure: true

    # https://opentelemetry.io/docs/collector/configuration/#service
    service:
      telemetry:
        logs:
          encoding: json
          level: info

      extensions:
        - health_check

      # https://opentelemetry.io/docs/collector/configuration/#pipelines
      pipelines:
        logs:
          receivers: [filelog, otlp]
          processors:
            - memory_limiter
            - transform/logs
            - batch
          exporters: [otlphttp/victoriametrics]
        metrics:
          receivers: 
            - hostmetrics
            - hostmetrics/disk
            - kubeletstats
            - spanmetrics
            - prometheus
          processors:
            - memory_limiter
            - resource/instance
            - k8sattributes
            - transform/metrics
            - batch
          exporters: 
            - prometheusremotewrite
        traces:
          receivers: [otlp]
          processors: 
            - memory_limiter
            - k8sattributes
            - resource/instance
            - transform/spanmetrics
            - batch
          exporters: 
            - spanmetrics
            - loadbalancing

otel-collector RBAC

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources:
      - pods
      - namespaces
      - nodes
      - nodes/metrics
      - nodes/stats
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - replicasets
      - deployments
      - statefulsets
      - daemonsets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: otel-collector # Controller provisions the SA but not the ClusterRole
    namespace: open-telemetry

trace-collector Deployment

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: trace
spec:
  mode: deployment
  image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.140.1
  autoscaler:
    minReplicas: 2
    maxReplicas: 6
    targetCPUUtilization: 100
  resources:
    requests:
      cpu: 100m
      memory: 384Mi
    limits:
      cpu: 500m
      memory: 1Gi
  ports:
    - name: grpc
      port: 4317
      targetPort: 4317

  observability:
    metrics:
      enableMetrics: true
  env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName

  config:
    extensions:
      health_check:
        endpoint: ${env:POD_IP}:13133

    # https://opentelemetry.io/docs/collector/components/receiver/
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317

      prometheus:
        config:
          scrape_configs:
          - job_name: trace-collector
            scrape_interval: 30s
            static_configs:
              - targets:
                  - ${env:POD_IP}:8888

    # https://opentelemetry.io/docs/collector/components/processor/
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_max_size: 2048
        send_batch_size: 1024
        timeout: 1s
      tail_sampling:
        policies:
          - name: drop_spans
            type: drop
            drop:
              drop_sub_policy:
                - type: ottl_condition
                  name: sub-policy-0
                  ottl_condition:
                    error_mode: ignore
                    span:
                      - IsMatch(attributes["http.target"], "^(/health|/metrics|/ping|/ready)")

          - name: keep_slow_requests
            type: latency
            latency:
              threshold_ms: 1000

          - name: keep_error_requests
            type: numeric_attribute
            numeric_attribute:
              key: http.status_code
              min_value: 400
              max_value: 599

          - name: keep_user_spans
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              span:
                - attributes["user.id"] != nil and attributes["user.id"] != ""

          - name: keep_1_percent_of_the_rest
            type: probabilistic
            probabilistic:
              sampling_percentage: 1

    # https://opentelemetry.io/docs/collector/components/exporter/
    exporters:
      debug: {}
      otlphttp/victoriametrics:
        compression: gzip
        encoding: proto
        traces_endpoint: http://vmtraces-insert.victoriametrics:10481/insert/opentelemetry/v1/traces
        tls:
          insecure: true

    # https://opentelemetry.io/docs/collector/configuration/#service
    service:
      telemetry:
        logs:
          encoding: json
          level: info

      extensions:
        - health_check

      # https://opentelemetry.io/docs/collector/configuration/#pipelines
      pipelines:
        logs:
          receivers: [otlp]
          processors: []
          exporters: [debug]
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [debug]
        traces:
          receivers: [otlp]
          processors:
            - memory_limiter
            - tail_sampling
            - batch
          exporters: [otlphttp/victoriametrics]

trace-collector RBAC

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-targetallocator
rules:
  - apiGroups: [""]
    resources:
      - pods
      - services
      - endpoints
      - nodes
      - nodes/metrics
      - namespaces
      - configmaps
    verbs: ["get", "list", "watch"]
  - apiGroups: ["discovery.k8s.io"]
    resources:
      - endpointslices
    verbs: ["get", "list", "watch"]
  - apiGroups: ["monitoring.coreos.com"]
    resources:
      - probes
      - scrapeconfigs
    verbs: ["get", "list", "watch"]
  - apiGroups: ["monitoring.coreos.com"]
    resources:
      - servicemonitors
      - podmonitors
    verbs: ["*"] # targetAllocator throws a warning if this isnt permissive
  - apiGroups: ["opentelemetry.io"]
    resources:
      - opentelemetrycollectors
    verbs: ["get", "list", "watch"]
  - apiGroups: ["networking.k8s.io"]
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs:
      - /apis
      - /apis/*
      - /api
      - /api/*
      - /metrics
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-targetallocator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-targetallocator
subjects:
  - kind: ServiceAccount
    name: otel-targetallocator # Controller provisions the SA but not the ClusterRole
    namespace: open-telemetry