TimeSeries InfluxDB Line Protocol: silent data loss, no gzip support, doc/grammar mismatch #3821

@robfrank

Context

Reported in #3819 - a user integrating Telegraf with ArcadeDB's TimeSeries InfluxDB Line Protocol endpoint experiences silent data loss and protocol incompatibilities. Code analysis confirms multiple bugs in the write handler, payload parsing, and documentation.


Bug 1 (Critical): Silent data loss on unknown measurement type

File: server/src/main/java/com/arcadedb/server/http/handler/PostTimeSeriesWriteHandler.java (lines 99-144)

When a line protocol measurement name (e.g. cpu, mem) has no matching TIMESERIES TYPE in the schema, the handler silently skips it and still returns 204 No Content. Telegraf interprets 204 as success and keeps shipping data that is never persisted.

if (!database.getSchema().existsType(measurement))
  continue; // silently dropped

// ...later...
if (inserted == 0)
  return new ExecutionResponse(204, ""); // success even though nothing was inserted

Impact: Complete silent data loss. The user sees "Wrote batch of N metrics" in Telegraf logs while zero rows land in the database. This is the root cause of the symptom described in #3819.

Suggested fix: Return an error response (400 or 422) when measurements are skipped due to missing types, including the list of unknown measurement names in the response body. Also consider logging a warning server-side. The inserted == 0 path should not return 204.

Additionally, when a tag/field key from the line protocol doesn't match any column in the TIMESERIES TYPE, the value is silently dropped and the sample is appended with nulls - at minimum this should warn.
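A minimal sketch of the suggested behaviour, written without ArcadeDB dependencies so it can run on its own. The class `UnknownMeasurementCheck`, the `WriteResult` holder, and the `knownTypes` set (standing in for `database.getSchema().existsType(...)`) are hypothetical names for illustration, and 422 is just one of the two status codes proposed above. Measurement-name parsing is simplified to "everything before the first unescaped comma or space".

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UnknownMeasurementCheck {

  /** Hypothetical response holder: HTTP status code plus response body. */
  static final class WriteResult {
    final int status;
    final String body;

    WriteResult(int status, String body) {
      this.status = status;
      this.body = body;
    }
  }

  /** Returns the measurement name: text before the first unescaped ',' or ' '. */
  static String measurementOf(String line) {
    StringBuilder name = new StringBuilder();
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == '\\' && i + 1 < line.length()) {
        name.append(line.charAt(++i)); // keep the escaped character literally
        continue;
      }
      if (c == ',' || c == ' ')
        break;
      name.append(c);
    }
    return name.toString();
  }

  /** knownTypes is an assumption standing in for the schema lookup. */
  static WriteResult write(List<String> lines, Set<String> knownTypes) {
    Set<String> unknown = new LinkedHashSet<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#"))
        continue; // blank lines and comments carry no data
      String measurement = measurementOf(trimmed);
      if (!knownTypes.contains(measurement))
        unknown.add(measurement); // remember it instead of dropping it silently
      // a real handler would append the sample here when the type exists
    }
    if (!unknown.isEmpty())
      return new WriteResult(422, "Unknown TimeSeries type(s): " + String.join(", ", unknown));
    return new WriteResult(204, "");
  }
}
```

Whether samples for known types in the same batch should still be persisted when another measurement is rejected is a separate design decision that this sketch does not settle.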


Bug 2 (Critical): No gzip decompression for request bodies

File: server/src/main/java/com/arcadedb/server/http/handler/AbstractServerHttpHandler.java (lines 55-74)

The payload parser reads raw bytes and decodes as UTF-8 without checking Content-Encoding. Telegraf's outputs.influxdb plugin sends Content-Encoding: gzip by default. The gzipped bytes are interpreted as text, producing the exact error from #3819:

Skipping malformed line protocol line: '�???????�T�QN�0'

Suggested fix: In PostTimeSeriesWriteHandler.parseRequestPayload (or in AbstractServerHttpHandler for all handlers), inspect Content-Encoding and decompress through GZIPInputStream when gzip is specified. Reject unknown encodings with 415 Unsupported Media Type.
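The decoding step could look roughly like the following, using only `java.util.zip` from the JDK. `RequestBodyDecoder`, `UnsupportedEncoding`, and the `gzip` helper are hypothetical names introduced for this sketch; how the raw bytes and the `Content-Encoding` header are obtained from the HTTP exchange is left out, since that depends on the existing handler code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class RequestBodyDecoder {

  /** Hypothetical marker exception; the caller would map it to 415. */
  static final class UnsupportedEncoding extends RuntimeException {
    UnsupportedEncoding(String encoding) {
      super("Unsupported Content-Encoding: " + encoding);
    }
  }

  /** Decodes the raw body to text, honouring the Content-Encoding header value. */
  static String decode(byte[] rawBody, String contentEncoding) {
    String encoding = contentEncoding == null ? "" : contentEncoding.trim().toLowerCase(Locale.ROOT);
    if (encoding.isEmpty() || encoding.equals("identity"))
      return new String(rawBody, StandardCharsets.UTF_8); // body is already plain text
    if (!encoding.equals("gzip"))
      throw new UnsupportedEncoding(contentEncoding);
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(rawBody))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e); // corrupt or truncated gzip stream
    }
  }

  /** Demo-only helper that compresses text the way a gzip-enabled client would. */
  static byte[] gzip(String text) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

For example, `RequestBodyDecoder.decode(RequestBodyDecoder.gzip("cpu usage=1 1"), "gzip")` returns the original line. A production version would probably also cap the decompressed size to guard against oversized payloads.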


Bug 3 (Feature gap): No InfluxDB v1 protocol compatibility endpoints

Telegraf's native outputs.influxdb plugin issues a POST /query with form-urlencoded body q=CREATE DATABASE "telegraf" as a handshake before writing. ArcadeDB has no matching route, so this hits PostTimeSeriesQueryHandler which expects JSON, producing:

Error parsing request payload: Invalid JSON object format: q=CREATE+DATABASE+%22telegraf%22

Suggested fix: Either implement a thin InfluxDB v1 shim (POST /query returning {"results":[{"statement_id":0}]} for CREATE DATABASE/SHOW DATABASES, POST /write?db=... delegating to the line protocol handler), or clearly document that only [[outputs.http]] pointed at /api/v1/ts/{db}/write is supported.
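If the shim route is chosen, the handshake part could be sketched as below. `InfluxV1QueryShim` and its methods are hypothetical names; the sketch only covers parsing the form-urlencoded `q` parameter and answering CREATE DATABASE / SHOW DATABASES with the stub response mentioned above. Routing, authentication, and the `/write?db=...` delegation are not shown.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class InfluxV1QueryShim {

  /** Stub reply taken from the suggestion above. */
  static final String EMPTY_RESULT = "{\"results\":[{\"statement_id\":0}]}";

  /** Returns the decoded value of the "q" form parameter, or null if absent. */
  static String extractQuery(String formBody) {
    for (String pair : formBody.split("&")) {
      int eq = pair.indexOf('=');
      String key = eq < 0 ? pair : pair.substring(0, eq);
      if ("q".equals(URLDecoder.decode(key, StandardCharsets.UTF_8)))
        return eq < 0 ? "" : URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
    }
    return null;
  }

  /**
   * Returns the stub JSON for handshake statements, or null when the statement
   * is missing or not handled (the caller would answer with an error status).
   */
  static String handle(String formBody) {
    String query = extractQuery(formBody);
    if (query == null)
      return null;
    String upper = query.trim().toUpperCase(Locale.ROOT);
    if (upper.startsWith("CREATE DATABASE") || upper.startsWith("SHOW DATABASES"))
      return EMPTY_RESULT;
    return null;
  }
}
```

With the body from the error above, `InfluxV1QueryShim.extractQuery("q=CREATE+DATABASE+%22telegraf%22")` yields `CREATE DATABASE "telegraf"`, and `handle` returns the stub result.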


Bug 4: Documentation/grammar mismatch for CREATE TIMESERIES TYPE

File: engine/src/main/antlr4/com/arcadedb/query/sql/grammar/SQLParser.g4 (lines 447-456)

The public docs show syntax that the grammar does not accept:

Documented syntax                   Actual grammar                                              Status
TIMESTAMP ts PRECISION NANOSECOND   TIMESTAMP identifier (no PRECISION clause)                  Parse error
COMPACTION INTERVAL 30s             COMPACTION_INTERVAL INTEGER_LITERAL (DAYS|HOURS|MINUTES)?   Parse error (two words vs underscore, s not a valid unit)

Suggested fix: Update the public documentation to match the implemented grammar. Optionally add a negative test in CreateTimeSeriesTypeStatementTest that asserts these invalid forms produce clear error messages.


Reproduction

All issues are reproducible with ArcadeDB 26.3.2 + Telegraf (latest) as described in #3819.

Checklist

  • Bug 1: Return error on unknown measurement types instead of silent 204
  • Bug 2: Support Content-Encoding: gzip in the write handler
  • Bug 3: Decide on InfluxDB v1 compatibility (implement shim or document limitation)
  • Bug 4: Fix public docs to match grammar (PRECISION, COMPACTION_INTERVAL, units)
