sumologic-otelcol-logs stuck crashing on panic: assertion failed. #4037

@Aaron-ML

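Describe the bug

Essentially the same bug as mentioned in #3206: the sumologic-otelcol-logs pod is stuck crashing at startup with a bbolt "assertion failed" panic, raised while the file storage extension opens the database behind the exporter's persistent queue.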

Logs

2025-11-25T18:34:30.903Z	info	sumologicexporter@v0.130.0/exporter.go:286	setting data urls	{"resource": {"service.instance.id": "9379e5eb-a034-4a7a-aba9-91ecb2d1de99", "service.name": "otelcol-sumo", "service.version": "0.130.1-sumo-0-9cd0cff3fd63f4db526097a355ef9672f0956027"}, "otelcol.component.id": "sumologic", "otelcol.component.kind": "exporter", "otelcol.signal": "logs", "logs_url": "https://endpoint3.collection.us2.sumologic.com/receiver/v1/otlp/************************************************************************************************************************/v1/logs", "metrics_url": "https://endpoint3.collection.us2.sumologic.com/receiver/v1/otlp/************************************************************************************************************************/v1/metrics", "traces_url": "https://endpoint3.collection.us2.sumologic.com/receiver/v1/otlp/************************************************************************************************************************/v1/traces"}
panic: assertion failed: Page expected to be: 87, but self identifies as 509593322935055205

goroutine 1 [running]:
go.etcd.io/bbolt/internal/common.Assert(...)
	go.etcd.io/bbolt@v1.4.2/internal/common/verify.go:65
go.etcd.io/bbolt/internal/common.(*Page).FastCheck(0x7feddc077000, 0x57)
	go.etcd.io/bbolt@v1.4.2/internal/common/page.go:83 +0x1d9
go.etcd.io/bbolt.(*Tx).page(0xc000cfb180?, 0xc00140e0f8?)
	go.etcd.io/bbolt@v1.4.2/tx.go:598 +0x7b
go.etcd.io/bbolt.(*Tx).forEachPageInternal(0xc000000620, {0xc000cfb180, 0x1, 0xa}, 0xc00140e1b0)
	go.etcd.io/bbolt@v1.4.2/tx.go:610 +0x5a
go.etcd.io/bbolt.(*Tx).forEachPage(...)
	go.etcd.io/bbolt@v1.4.2/tx.go:606
go.etcd.io/bbolt.(*Tx).checkInvariantProperties(0xc000000620, 0x57, 0x138ec5c0?, 0xc001b4bb90?, {0xc81e358, 0x1391d800}, 0xc000c316c0)
	go.etcd.io/bbolt@v1.4.2/tx_check.go:143 +0xb2
go.etcd.io/bbolt.(*Tx).recursivelyCheckBucket(0xc000000620, 0xc000000638, 0xc00140e450, 0xc00140e398, {0xc81e358, 0x1391d800}, 0xc000c316c0)
	go.etcd.io/bbolt@v1.4.2/tx_check.go:130 +0x73
go.etcd.io/bbolt.(*DB).freepages(0xc001ae8d88)
	go.etcd.io/bbolt@v1.4.2/db.go:1251 +0x21b
go.etcd.io/bbolt.(*DB).loadFreelist.func1()
	go.etcd.io/bbolt@v1.4.2/db.go:422 +0xc9
sync.(*Once).doSlow(0xc00140e518?, 0xc001ae8d88?)
	sync/once.go:78 +0xab
sync.(*Once).Do(...)
	sync/once.go:69
go.etcd.io/bbolt.(*DB).loadFreelist(0xc001ae8d88?)
	go.etcd.io/bbolt@v1.4.2/db.go:418 +0x3b
go.etcd.io/bbolt.Open({0xc001b54fc0, 0x2d}, 0x180, 0xc000793810)
	go.etcd.io/bbolt@v1.4.2/db.go:299 +0xb48
github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage.newClient(0xc001b4e900, {0xc001b54fc0, 0x2d}, 0x2540be400, 0xc000ff2e40, 0x1)
	github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage@v0.130.0/client.go:53 +0x7b
github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage.(*localFileStorage).GetClient(0xc001b4a420, {0x0?, 0x0?}, {{0xb502e85?, 0x0?}}, {{{0xc000f1a7f5?, 0x0?}}, {0x0?, 0x0?}}, {0xb4f183e, ...})
	github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage@v0.130.0/extension.go:74 +0x6bf
go.opentelemetry.io/collector/exporter/exporterhelper/internal/queue.toStorageClient({0xc87b830, 0x1391d800}, {{{0xc001517ec0, 0xc}}, {0x0, 0x0}}, {0xc7bc5c0?, 0xc001b4b980?}, {{{0xc000f1a7f5, 0x9}}, ...}, ...)
	go.opentelemetry.io/collector/exporter@v0.130.1/exporterhelper/internal/queue/persistent_queue.go:561 +0x113
go.opentelemetry.io/collector/exporter/exporterhelper/internal/queue.(*persistentQueue[...]).Start(0xc9697a0, {0xc87b830, 0x1391d800}, {0xc7bc5c0, 0xc001b4b980?})
	go.opentelemetry.io/collector/exporter@v0.130.1/exporterhelper/internal/queue/persistent_queue.go:117 +0xc5
go.opentelemetry.io/collector/exporter/exporterhelper/internal/queue.(*asyncQueue[...]).Start(0xc828220, {0xc87b830, 0x1391d800?}, {0xc7bc5c0?, 0xc001b4b980?})
	go.opentelemetry.io/collector/exporter@v0.130.1/exporterhelper/internal/queue/async_queue.go:32 +0x49
go.opentelemetry.io/collector/exporter/exporterhelper/internal/queuebatch.(*QueueBatch).Start(0xc0018b50a0, {0xc87b830, 0x1391d800}, {0xc7bc5c0, 0xc001b4b980})
	go.opentelemetry.io/collector/exporter@v0.130.1/exporterhelper/internal/queuebatch/queue_batch.go:80 +0x8e
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*BaseExporter).Start(0xc0016d9000?, {0xc87b830?, 0x1391d800?}, {0xc7bc5c0?, 0xc001b4b980?})
	go.opentelemetry.io/collector/exporter@v0.130.1/exporterhelper/internal/base_exporter.go:131 +0xa2
go.opentelemetry.io/collector/service/internal/graph.(*Graph).StartAll(0xc001730f60, {0xc87b830, 0x1391d800}, 0xc00095cf20)
	go.opentelemetry.io/collector/service@v0.130.1/internal/graph/graph.go:431 +0x24b
go.opentelemetry.io/collector/service.(*Service).Start(0xc001727e00, {0xc87b830, 0x1391d800})
	go.opentelemetry.io/collector/service@v0.130.1/service.go:272 +0x2e8
go.opentelemetry.io/collector/otelcol.(*Collector).setupConfigurationComponents(0xc001574420, {0xc87b830, 0x1391d800})
	go.opentelemetry.io/collector/otelcol@v0.130.1/collector.go:242 +0xb9d
go.opentelemetry.io/collector/otelcol.(*Collector).Run(0xc001574420, {0xc87b830, 0x1391d800})
	go.opentelemetry.io/collector/otelcol@v0.130.1/collector.go:312 +0x55
go.opentelemetry.io/collector/otelcol.NewCommand.func1(0xc00156a908, {0xb4fd7c8?, 0x7?, 0xb4f1a62?})
	go.opentelemetry.io/collector/otelcol@v0.130.1/command.go:39 +0x94
github.com/spf13/cobra.(*Command).execute(0xc00156a908, {0xc0000b23d0, 0x1, 0x1})
	github.com/spf13/cobra@v1.9.1/command.go:1015 +0xaaa
github.com/spf13/cobra.(*Command).ExecuteC(0xc00156a908)
	github.com/spf13/cobra@v1.9.1/command.go:1148 +0x46f
github.com/spf13/cobra.(*Command).Execute(0xb9afe88?)
	github.com/spf13/cobra@v1.9.1/command.go:1071 +0x13
main.runInteractive({0xb9afe88, {{0xb521b94, 0xc}, {0xb6cb972, 0x2f}, {0xb71496f, 0x37}, {}}, 0x0, {{{0x0, ...}, ...}, ...}, ...})
	github.com/SumoLogic/sumologic-otel-collector/main.go:71 +0x9c
main.run(...)
	github.com/SumoLogic/sumologic-otel-collector/main_others.go:10
main.main()
	github.com/SumoLogic/sumologic-otel-collector/main.go:57 +0x498
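
As a side check on the panic message, the bogus page id can be decoded back into the bytes that would sit at the start of the page. This is a minimal sketch, assuming the host is little-endian (as amd64 and arm64 are); the helper names `leBytes` and `printable` are made up for illustration and are not part of bbolt.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// leBytes returns the 8 bytes a little-endian host would have stored
// for the given page id.
func leBytes(id uint64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, id)
	return b
}

// printable replaces every byte outside the printable ASCII range with '.'.
func printable(b []byte) string {
	out := make([]byte, len(b))
	for i, c := range b {
		if c >= 0x20 && c <= 0x7e {
			out[i] = c
		} else {
			out[i] = '.'
		}
	}
	return string(out)
}

func main() {
	const expected uint64 = 87              // page id bbolt asked for (0x57 in the trace)
	const found uint64 = 509593322935055205 // id the page reported about itself

	b := leBytes(found)
	fmt.Printf("expected page id: %d (0x%x)\n", expected, expected)
	fmt.Printf("found page id:    0x%016x\n", found)
	fmt.Printf("raw bytes:        % x\n", b)
	fmt.Printf("as ASCII:         %q\n", printable(b))
}
```

The bytes come out as 65 73 74 61 6d 70 12 07, which is the ASCII text "estamp" followed by two non-printable bytes. That looks more like a fragment of queued payload (for example the tail of a key such as "timestamp") than a page header, so page 87 may hold data where bbolt expected metadata. This is only a reading of eight bytes, not a confirmed diagnosis.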

Configuration

❯ helm get values -n monitoring sumologic
USER-SUPPLIED VALUES:
fluent-bit:
  enabled: false
resources:
  limits:
    cpu: 256m
    memory: 356Mi
sumologic:
  accessId: X
  accessKey: X
  clusterName: X
  logs:
    collector:
      otelcol:
        enabled: true
    container:
      excludeContainerRegex: istio-proxy
      excludePodRegex: remotewrite-prometheus-internal-server|loki-distributed-gateway|ingress-nginx-controller|loki-distributed-ingester|thanos-receiver|ingress-nginx-external-controller|taskmanager|citus-upsert-citus|^.*-connect-\d
      sourceCategoryPrefix: X/kubernetes/
    kubelet:
      sourceCategoryPrefix: X/kubernetes/
    metadata:
      provider: otelcol
    multiline:
      enabled: true
      first_line_regex: ^\[?\d{4}-\d{1,2}-\d{1,2}.\d{2}:\d{2}:\d{2}
    systemd:
      sourceCategoryPrefix: X/kubernetes/
  metrics:
    enabled: false
  setupEnabled: true
  traces:
    enabled: false
  • Collection version (e.g. helm ls -n sumologic):
    NAME       NAMESPACE   REVISION  UPDATED                              STATUS    CHART             APP VERSION
    sumologic  monitoring  1         2025-10-23 09:21:58.77046 -0700 PDT  deployed  sumologic-4.17.0  4.17.0
  • Kubernetes version (e.g. kubectl version):
    v1.32.7
  • Cloud provider:
    Azure AKS
  • Others:
    The otelcol logs pods have a PVC attached, so I'm unsure how the data could have been corrupted by anything other than the collector itself.

Anything else do we need to know
It seems that when this happens the pod cannot recover on its own. Not sure what's going on here.
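
To make "recover on its own" concrete, here is a rough sketch of the kind of behaviour that would avoid the crash loop: turn the panic raised during open into an error, move the damaged file aside, and retry with a fresh one. Everything in it is hypothetical (`openFunc`, `safeOpen`, `openOrQuarantine`, `errOpenPanicked`, the ".corrupt" suffix and the demo file name); `openFunc` merely stands in for the database open call seen in the trace, and this is not how the collector or the file storage extension is known to behave.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// openFunc stands in for the real database open call (hypothetical signature).
type openFunc func(path string) error

// errOpenPanicked marks errors that were converted from a panic.
var errOpenPanicked = errors.New("open panicked")

// safeOpen runs open and converts a panic into an ordinary error.
func safeOpen(path string, open openFunc) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%w: %v", errOpenPanicked, r)
		}
	}()
	return open(path)
}

// openOrQuarantine opens path. Only if the open panicked does it rename the
// file to path+".corrupt" and retry once; other errors are returned as-is.
// It returns the quarantine path, or "" if nothing was moved.
func openOrQuarantine(path string, open openFunc) (string, error) {
	err := safeOpen(path, open)
	if !errors.Is(err, errOpenPanicked) {
		return "", err
	}
	moved := path + ".corrupt"
	if rerr := os.Rename(path, moved); rerr != nil {
		return "", errors.Join(err, rerr)
	}
	return moved, safeOpen(path, open)
}

func main() {
	dir, err := os.MkdirTemp("", "queue-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "exporter_queue") // hypothetical file name
	if err := os.WriteFile(path, []byte("estamp"), 0o600); err != nil {
		panic(err)
	}

	// Fake opener: panics while the "corrupt" file exists, succeeds otherwise.
	open := func(p string) error {
		if _, statErr := os.Stat(p); statErr == nil {
			panic("assertion failed: simulated corrupt page")
		}
		return nil
	}

	moved, err := openOrQuarantine(path, open)
	fmt.Println("quarantined:", moved != "", "error:", err)
}
```

The trade-off is that whatever was queued in the quarantined file is not delivered unless someone recovers it by hand, so this exchanges possible data loss for availability.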

Labels: bug, stale