Summary
System tests are supposed to fail when indexed documents contain values in the _ignored metadata field, but they don't. The Painless runtime script that feeds the ignored_fields terms aggregation reads _ignored through the stored-fields lookup (params['_fields']._ignored.values), which returns nothing on at least Elasticsearch 9.3.4. The terms aggregation produces zero buckets, validateIgnoredFields sees an empty list, and the test passes — even when every document in the data stream has _ignored populated.
This defeats the entire _ignored-fields safety net that was added in #1738 ("Extend system test to validate absence of _ignored", b33490b).
How it was detected
While triaging an unrelated bug in the netflow integration, elastic-package test system --data-streams log against packages/netflow v2.25.0 passed on a 9.3.4 stack despite the indexed documents containing:
{
  "_index": ".ds-logs-netflow.log-79901-2026.05.05-000001",
  "_ignored": ["event.created", "netflow.exporter.timestamp"],
  "_source": {
    "event": { "created": {} },
    "netflow": { "exporter": { "timestamp": {} } }
  },
  "ignored_field_values": {
    "event.created": [{}],
    "netflow.exporter.timestamp": [{}]
  }
}
All 29 documents in the data stream had _ignored set on both fields. The package does not configure skip_ignored_fields, both fields are declared as date (event.created via ECS import, netflow.exporter.timestamp in package-fields.yml), and the test config does not set skip_reason/skip_link. The validation should have fired but didn't.
Why it's failing
The _ignored metadata field is exposed via doc-values, not stored fields. On Elasticsearch 9.3.4, params['_fields']['_ignored'] resolves to a FieldLookup whose getValues() returns an empty list for _ignored, even on documents whose search response clearly includes "_ignored": [...].
Reproduction against the same cluster, same index, same query body as elastic-package:
# 1. Confirm 29 docs have _ignored populated
$ curl -sk -u elastic:changeme \
    'https://localhost:9200/logs-netflow.log-*/_search?size=0' \
    -H 'Content-Type: application/json' \
    -d '{"query":{"exists":{"field":"_ignored"}}}'
# → "hits": { "total": { "value": 29, "relation": "eq" } }

# 2. Run the exact aggregation from internal/testrunner/runners/system/tester.go
$ curl -sk -u elastic:changeme \
    'https://localhost:9200/logs-netflow.log-*/_search?size=0' \
    -H 'Content-Type: application/json' \
    -d '{
      "runtime_mappings": {
        "my_ignored": {
          "type": "keyword",
          "script": { "source": "for (def v : params[\"_fields\"]._ignored.values) { emit(v); }" }
        }
      },
      "aggs": {
        "all_ignored": {
          "filter": { "exists": { "field": "_ignored" } },
          "aggs": { "ignored_fields": { "terms": { "size": 100, "field": "my_ignored" } } }
        }
      }
    }'
# → all_ignored.doc_count = 29
# → all_ignored.ignored_fields.buckets = [] ← BUG: no field names emitted

# 3. Same aggregation with doc['_ignored'] instead
#    "source": "for (def v : doc['_ignored']) { emit(v); }"
# → buckets: [
#     { "key": "event.created", "doc_count": 29 },
#     { "key": "netflow.exporter.timestamp", "doc_count": 29 }
#   ]
Direct introspection confirms params['_fields']['_ignored'].getValues().size() == 0 on documents whose response includes _ignored: ["event.created", "netflow.exporter.timestamp"].
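To compare the two lookups side by side, the request body from step 2 can be generated for any script source. The following is only a sketch: buildIgnoredQuery is a hypothetical helper, not something that exists in elastic-package, and the script strings are the two variants quoted in this report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildIgnoredQuery is a hypothetical helper (not part of elastic-package).
// It returns the aggregation body used in the reproduction, with the given
// Painless source plugged into the my_ignored runtime field.
func buildIgnoredQuery(scriptSource string) (string, error) {
	body := map[string]any{
		"runtime_mappings": map[string]any{
			"my_ignored": map[string]any{
				"type":   "keyword",
				"script": map[string]any{"source": scriptSource},
			},
		},
		"aggs": map[string]any{
			"all_ignored": map[string]any{
				"filter": map[string]any{"exists": map[string]any{"field": "_ignored"}},
				"aggs": map[string]any{
					"ignored_fields": map[string]any{
						"terms": map[string]any{"size": 100, "field": "my_ignored"},
					},
				},
			},
		},
	}
	b, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	scripts := []string{
		"for (def v : params['_fields']._ignored.values) { emit(v); }", // stored-fields lookup
		"for (def v : doc['_ignored']) { emit(v); }",                   // doc-values lookup
	}
	for _, src := range scripts {
		q, err := buildIgnoredQuery(src)
		if err != nil {
			panic(err)
		}
		// Each printed body can be passed to curl -d against the same index.
		fmt.Println(q)
	}
}
```

Sending both bodies to the same index keeps everything except the script identical, which is what isolates the lookup as the cause.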
Resulting code path
getDocs() runs the search — 29 hits, all_ignored filter agg counts 29, but the inner ignored_fields terms agg returns zero buckets because the my_ignored runtime field never emits.
// internal/testrunner/runners/system/tester.go, lines 838-886 at 5b266c6
func (r *tester) getDocs(ctx context.Context, dataStream string) (*hits, error) {
	resp, err := r.esAPI.Search(
		r.esAPI.Search.WithContext(ctx),
		r.esAPI.Search.WithIndex(dataStream),
		r.esAPI.Search.WithSort("@timestamp:asc"),
		r.esAPI.Search.WithSize(elasticsearchQuerySize),
		r.esAPI.Search.WithSource("true"),
		r.esAPI.Search.WithBody(strings.NewReader(FieldsQuery)),
		r.esAPI.Search.WithIgnoreUnavailable(true),
	)
	if err != nil {
		return nil, fmt.Errorf("could not search data stream: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusServiceUnavailable && strings.Contains(resp.String(), "no_shard_available_action_exception") {
		// Index is being created, but no shards are available yet.
		// See https://github.com/elastic/elasticsearch/issues/65846
		return &hits{}, nil
	}
	if resp.IsError() {
		return nil, fmt.Errorf("failed to search docs for data stream %s: %s", dataStream, resp.String())
	}

	var results FieldsQueryResult
	if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
		return nil, fmt.Errorf("could not decode search results response: %w", err)
	}

	numHits := results.Hits.Total.Value
	if results.Error != nil {
		logger.Debugf("found %d hits in %s data stream: %s: %s Status=%d",
			numHits, dataStream, results.Error.Type, results.Error.Reason, results.Status)
	} else {
		logger.Debugf("found %d hits in %s data stream", numHits, dataStream)
	}

	var hits hits
	for _, hit := range results.Hits.Hits {
		hits.Source = append(hits.Source, hit.Source)
		hits.Fields = append(hits.Fields, hit.Fields)
	}
	for _, bucket := range results.Aggregations.AllIgnored.IgnoredFields.Buckets {
		hits.IgnoredFields = append(hits.IgnoredFields, bucket.Key)
	}
	hits.DegradedDocs = results.Aggregations.AllIgnored.IgnoredDocs.Hits.Hits

	return &hits, nil
}
tester.go:880-881 writes nothing into hits.IgnoredFields.
	for _, bucket := range results.Aggregations.AllIgnored.IgnoredFields.Buckets {
		hits.IgnoredFields = append(hits.IgnoredFields, bucket.Key)
	}
	hits.DegradedDocs = results.Aggregations.AllIgnored.IgnoredDocs.Hits.Hits
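The effect of an empty buckets array on this loop can be checked in isolation. The sketch below assumes a trimmed-down response type; aggResult and ignoredFieldNames are hypothetical stand-ins, since the real FieldsQueryResult definition is not shown in this report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aggResult is a hypothetical, reduced stand-in for FieldsQueryResult.
// Only the parts of the aggregation response used below are modelled.
type aggResult struct {
	Aggregations struct {
		AllIgnored struct {
			DocCount      int `json:"doc_count"`
			IgnoredFields struct {
				Buckets []struct {
					Key string `json:"key"`
				} `json:"buckets"`
			} `json:"ignored_fields"`
		} `json:"all_ignored"`
	} `json:"aggregations"`
}

// ignoredFieldNames collects bucket keys the same way the loop in getDocs
// does, and also returns the doc_count of the enclosing filter aggregation.
func ignoredFieldNames(body string) ([]string, int, error) {
	var res aggResult
	if err := json.Unmarshal([]byte(body), &res); err != nil {
		return nil, 0, err
	}
	var names []string
	for _, b := range res.Aggregations.AllIgnored.IgnoredFields.Buckets {
		names = append(names, b.Key)
	}
	return names, res.Aggregations.AllIgnored.DocCount, nil
}

func main() {
	// Response shape observed in step 2 of the reproduction.
	broken := `{"aggregations":{"all_ignored":{"doc_count":29,"ignored_fields":{"buckets":[]}}}}`
	names, docs, err := ignoredFieldNames(broken)
	if err != nil {
		panic(err)
	}
	fmt.Printf("docs with _ignored: %d, field names collected: %d\n", docs, len(names))
	// prints "docs with _ignored: 29, field names collected: 0"
}
```

A non-zero doc_count combined with zero collected names is the signature of this bug: the filter aggregation matched documents, but the runtime field emitted nothing for them.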
tester.go:1207 sets sds.ignoredFields = [].
	sds.ignoredFields = hits.IgnoredFields
validateIgnoredFields (called unconditionally at tester.go:1990) sees len(ds.ignoredFields) == 0 and returns nil.
// internal/testrunner/runners/system/tester.go, lines 2594-2636 at 5b266c6
func validateIgnoredFields(stackVersion *semver.Version, ds scenarioDataStream, config *testConfig) error {
	skipIgnoredFields := append([]string(nil), config.SkipIgnoredFields...)
	if stackVersion.LessThan(semver.MustParse("8.14.0")) {
		// Pre 8.14 Elasticsearch commonly has event.original not mapped correctly, exclude from check: https://github.com/elastic/elasticsearch/pull/106714
		skipIgnoredFields = append(skipIgnoredFields, "event.original")
	}

	ignoredFields := make([]string, 0, len(ds.ignoredFields))

	for _, field := range ds.ignoredFields {
		if !slices.Contains(skipIgnoredFields, field) {
			ignoredFields = append(ignoredFields, field)
		}
	}

	if len(ignoredFields) > 0 {
		issues := make([]struct {
			ID            any `json:"_id"`
			Timestamp     any `json:"@timestamp,omitempty"`
			IgnoredFields any `json:"ignored_field_values"`
		}, len(ds.degradedDocs))
		for i, d := range ds.degradedDocs {
			issues[i].ID = d["_id"]
			if source, ok := d["_source"].(map[string]any); ok {
				if ts, ok := source["@timestamp"]; ok {
					issues[i].Timestamp = ts
				}
			}
			issues[i].IgnoredFields = d["ignored_field_values"]
		}
		degradedDocsJSON, err := json.MarshalIndent(issues, "", " ")
		if err != nil {
			return fmt.Errorf("failed to marshal degraded docs to JSON: %w", err)
		}

		return testrunner.ErrTestCaseFailed{
			Reason:  "found ignored fields in data stream",
			Details: fmt.Sprintf("found ignored fields in data stream %s: %v. Affected documents: %s", ds.dataStream, ignoredFields, degradedDocsJSON),
		}
	}

	return nil
}
The test passes.
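The filtering step in validateIgnoredFields is itself correct; it simply has nothing to work on. A simplified sketch of that step, where remainingIgnored is a hypothetical name used only for this illustration, shows that an empty input can never produce a failure:

```go
package main

import (
	"fmt"
	"slices"
)

// remainingIgnored mirrors the filtering loop of validateIgnoredFields in
// simplified form: it drops every ignored field that is on the skip list.
// The name is hypothetical and not part of elastic-package.
func remainingIgnored(ignored, skip []string) []string {
	out := make([]string, 0, len(ignored))
	for _, f := range ignored {
		if !slices.Contains(skip, f) {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	skip := []string{"event.original"}

	// What the validation receives today: no field names at all.
	fmt.Println(len(remainingIgnored(nil, skip))) // prints 0, so the test passes

	// What it should receive for the netflow data stream.
	got := remainingIgnored([]string{"event.created", "netflow.exporter.timestamp"}, skip)
	fmt.Println(len(got)) // prints 2, so the test would fail as intended
}
```

This is why the fix belongs in the query rather than in the validation function.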
Affected source
FieldsQuery constant containing the broken Painless script:
// internal/testrunner/runners/system/tester.go, lines 46-80 at 5b266c6
const FieldsQuery = `{
  "fields": [
    "*"
  ],
  "runtime_mappings": {
    "my_ignored": {
      "type": "keyword",
      "script": {
        "source": "for (def v : params['_fields']._ignored.values) { emit(v); }"
      }
    }
  },
  "aggs": {
    "all_ignored": {
      "filter": {
        "exists": {
          "field": "_ignored"
        }
      },
      "aggs": {
        "ignored_fields": {
          "terms": {
            "size": 100,
            "field": "my_ignored"
          }
        },
        "ignored_docs": {
          "top_hits": {
            "size": 5
          }
        }
      }
    }
  }
}`
The single offending line (line 54 at 5b266c6):
        "source": "for (def v : params['_fields']._ignored.values) { emit(v); }"
Proposed fix
Switch the runtime field to read _ignored from doc-values:
--- a/internal/testrunner/runners/system/tester.go
+++ b/internal/testrunner/runners/system/tester.go
@@ -50,7 +50,7 @@ const FieldsQuery = `{
     "my_ignored": {
       "type": "keyword",
       "script": {
-        "source": "for (def v : params['_fields']._ignored.values) { emit(v); }"
+        "source": "for (def v : doc['_ignored']) { emit(v); }"
       }
     }
   },
doc['_ignored'] returns the field names correctly; the rest of the aggregation, getDocs, and validateIgnoredFields are unchanged. Verified against Elasticsearch 9.3.4 (build hash 69a3e6c50ebb57a1fdbf3f235be9f11061ac7d86).
Suggested follow-up
- Add a regression test (or integration test) that asserts a system test fails when the indexed doc has _ignored populated. The current unit tests cover validateIgnoredFields's post-aggregation behaviour but not the search query, so this regression slipped past CI.
- Audit how widely the bug has masked failures by re-running system tests for packages that rely on ignore_malformed: true from logs@settings (most logs integrations). It is likely that other packages have been silently shipping schema drift.
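As a cheap complement to the first follow-up, a static guard could at least assert that the query constant no longer uses the stored-fields lookup. The sketch below is an assumption-laden illustration: fixedQuery is a trimmed copy of the runtime mapping after the proposed fix, and runtimeScript and usesStoredFieldsLookup are hypothetical helpers. It does not replace a test against a live cluster, which is the only way to prove that the script actually emits values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fixedQuery is a trimmed copy of the runtime mapping after the proposed
// fix. In a real test the FieldsQuery constant would be checked instead.
const fixedQuery = `{
  "runtime_mappings": {
    "my_ignored": {
      "type": "keyword",
      "script": { "source": "for (def v : doc['_ignored']) { emit(v); }" }
    }
  }
}`

// runtimeScript extracts the Painless source of one runtime field from a
// search body. Hypothetical helper, not part of elastic-package.
func runtimeScript(query, field string) (string, error) {
	var q struct {
		RuntimeMappings map[string]struct {
			Script struct {
				Source string `json:"source"`
			} `json:"script"`
		} `json:"runtime_mappings"`
	}
	if err := json.Unmarshal([]byte(query), &q); err != nil {
		return "", err
	}
	m, ok := q.RuntimeMappings[field]
	if !ok {
		return "", fmt.Errorf("runtime field %q not found", field)
	}
	return m.Script.Source, nil
}

// usesStoredFieldsLookup reports whether a script reads through
// params['_fields'], the access path that returned nothing for _ignored.
func usesStoredFieldsLookup(src string) bool {
	return strings.Contains(src, "params['_fields']") ||
		strings.Contains(src, `params["_fields"]`)
}

func main() {
	src, err := runtimeScript(fixedQuery, "my_ignored")
	if err != nil {
		panic(err)
	}
	fmt.Println(usesStoredFieldsLookup(src)) // prints false
}
```

A guard like this only catches a reintroduction of the exact pattern described in this report; other scripts that silently emit nothing would still need the end-to-end regression test.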
Impact
Every system test relying on _ignored validation has been a no-op on stacks where _ignored is not exposed via the stored-fields lookup. For data streams using the logs@settings defaults (which set ignore_malformed: true on date/numeric fields), this means real schema mismatches — a date field receiving an object, a long receiving a string — are silently absorbed by the index instead of surfacing as test failures.
Environment
- elastic-package: main@5b266c6 (also affects all releases since "Extend system test to validate absence of _ignored", #1738)
- Elasticsearch: 9.3.4 (build hash 69a3e6c50ebb57a1fdbf3f235be9f11061ac7d86)
- Package: elastic/integrations packages/netflow v2.25.0