Handle Avro fixed field type in AvroInputCodec #6603

Open
p1ck wants to merge 2 commits into opensearch-project:main from p1ck:main

p1ck commented Mar 3, 2026

Added a case in AvroInputCodec's convertRecordToMap to handle GenericData.Fixed, the value type produced by the Avro "fixed" field type.

Description

Previously, using the "fixed" field type in an Avro schema caused an exception on read. With this change, the value is converted to a byte array and added to the map as expected.

For example, reading an Avro file with the following valid schema results in an exception:

{
  "type": "record",
  "name": "TestRecord",
  "fields": [
    {
      "name": "fixedField",
      "type": {
        "type": "fixed",
        "name": "FixedType",
        "size": 16
      }
    }
  ]
}

Resulting log output:
data-prepper-ingest  | 2026-03-03T20:55:35,494 [scanning-fingerprint-pipeline-sink-worker-2-thread-1] INFO  org.opensearch.dataprepper.plugins.source.file.FileSource - Starting file source with /usr/share/data-prepper/data/test_fixed.avro path.
data-prepper-ingest  | 2026-03-03T20:55:35,858 [file-source] ERROR org.opensearch.dataprepper.plugins.codec.avro.AvroInputCodec - An exception has occurred while parsing avro InputStream 
data-prepper-ingest  | java.lang.IllegalArgumentException: Not an enum: {"type":"fixed","name":"FixedType","size":16} (through reference chain: java.util.HashMap["fixedField"]->org.apache.avro.generic.GenericData$Fixed["schema"]->org.apache.avro.Schema$FixedSchema["enumSymbols"])
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ObjectMapper.valueToTree(ObjectMapper.java:3633)
data-prepper-ingest  |  at org.opensearch.dataprepper.model.event.JacksonEvent.getInitialJsonNode(JacksonEvent.java:149)
data-prepper-ingest  |  at org.opensearch.dataprepper.model.event.JacksonEvent.<init>(JacksonEvent.java:114)
data-prepper-ingest  |  at org.opensearch.dataprepper.model.log.JacksonLog.<init>(JacksonLog.java:23)
data-prepper-ingest  |  at org.opensearch.dataprepper.model.log.JacksonLog$Builder.build(JacksonLog.java:55)
data-prepper-ingest  |  at org.opensearch.dataprepper.model.log.JacksonLog$Builder.build(JacksonLog.java:41)
data-prepper-ingest  |  at org.opensearch.dataprepper.event.DefaultLogEventBuilderFactory$DefaultLogEventBuilder.build(DefaultLogEventBuilderFactory.java:39)
data-prepper-ingest  |  at org.opensearch.dataprepper.event.DefaultLogEventBuilderFactory$DefaultLogEventBuilder.build(DefaultLogEventBuilderFactory.java:25)
data-prepper-ingest  |  at org.opensearch.dataprepper.plugins.codec.avro.AvroInputCodec.parseAvroStream(AvroInputCodec.java:74)
data-prepper-ingest  |  at org.opensearch.dataprepper.plugins.codec.avro.AvroInputCodec.parse(AvroInputCodec.java:56)
data-prepper-ingest  |  at org.opensearch.dataprepper.plugins.source.file.FileSource$CodecFileStrategy.start(FileSource.java:182)
data-prepper-ingest  |  at org.opensearch.dataprepper.plugins.source.file.FileSource.lambda$start$0(FileSource.java:88)
data-prepper-ingest  |  at java.base/java.lang.Thread.run(Thread.java:840)
data-prepper-ingest  | Caused by: com.fasterxml.jackson.databind.JsonMappingException: Not an enum: {"type":"fixed","name":"FixedType","size":16} (through reference chain: java.util.HashMap["fixedField"]->org.apache.avro.generic.GenericData$Fixed["schema"]->org.apache.avro.Schema$FixedSchema["enumSymbols"])
data-prepper-ingest  |  at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:400)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:359)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:324)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:765)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:732)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:760)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:807)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeWithoutTypeInfo(MapSerializer.java:763)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:719)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:34)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:503)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:342)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ObjectMapper.valueToTree(ObjectMapper.java:3628)
data-prepper-ingest  |  ... 12 more
data-prepper-ingest  | Caused by: org.apache.avro.AvroRuntimeException: Not an enum: {"type":"fixed","name":"FixedType","size":16}
data-prepper-ingest  |  at org.apache.avro.Schema.getEnumSymbols(Schema.java:303)
data-prepper-ingest  |  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
data-prepper-ingest  |  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
data-prepper-ingest  |  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
data-prepper-ingest  |  at java.base/java.lang.reflect.Method.invoke(Method.java:569)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:688)
data-prepper-ingest  |  at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:760)
data-prepper-ingest  |  ... 23 more
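
The reference chain in the trace (HashMap["fixedField"] -> GenericData$Fixed["schema"] -> FixedSchema["enumSymbols"]) suggests that Jackson treated the fixed value as a bean and walked its getters until one threw. The remedy described above is to unwrap the value to its raw bytes before the map reaches the serializer. The following is a minimal sketch of that idea only, not the code in this PR: `FixedValue`, `FixedUnwrapSketch`, and `toPlainMap` are hypothetical names, and `FixedValue` is a stand-in for GenericData.Fixed so that the example runs without Avro on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.avro.generic.GenericData.Fixed.
// It models only the bytes() accessor that the real class also provides.
final class FixedValue {
    private final byte[] bytes;

    FixedValue(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    byte[] bytes() {
        return bytes;
    }
}

final class FixedUnwrapSketch {
    // Copies a decoded record into a plain map, replacing each fixed wrapper
    // with its raw byte[] so a serializer never introspects the wrapper.
    static Map<String, Object> toPlainMap(Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof FixedValue) {
                value = ((FixedValue) value).bytes();
            }
            out.put(entry.getKey(), value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("fixedField", new FixedValue(new byte[16]));

        Object converted = toPlainMap(record).get("fixedField");
        System.out.println("is byte[]: " + (converted instanceof byte[]));
    }
}
```

In the actual codec the check would be against GenericData.Fixed inside convertRecordToMap, as the review snippet further down shows.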

Issues Resolved

Resolves #6602

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Added a case in convertRecordToMap to handle GenericData.Fixed.

Previously, the use of the fixed field type in Avro schemas caused an
exception.  Now it is converted to a byte array and added to the map as
expected.

Signed-off-by: Jack Pickett <jackpick@gmail.com>

acidjazz commented Mar 3, 2026

Would love to see this fixed!

value = new String(utf8Bytes, "UTF-8");
}

else if(value instanceof GenericData.Fixed){

Member

@p1ck, thank you for the contribution. Please add a unit test that fails without this change but passes with it. That way we know the fix works and that it won't break again if somebody changes the code.

Author

p1ck commented Mar 12, 2026

This is somewhat related to #4096 "Create a model for binary data".

Even when I am able to read the bytes field, there are no functions or processors within Data Prepper that can do anything with the contents except pass them on as a Base64-encoded string.

For example, I read an Avro file with an IPv4 address stored as 4 bytes. I would like to convert it to dotted decimal notation to ingest into an OpenSearch ip field, but this does not seem to be possible.
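
To make the requested conversion concrete, here is a small sketch in plain Java using only the standard library. It is not a Data Prepper processor or function; `Ipv4BytesExample`, `toAddressString`, and `fromBase64` are hypothetical names chosen for illustration. It shows both starting points mentioned above: the raw bytes, and the Base64 string they become when passed along.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Base64;

final class Ipv4BytesExample {
    // Converts a raw 4-byte (IPv4) or 16-byte (IPv6) address to its textual form.
    // Any other length is rejected with IllegalArgumentException.
    static String toAddressString(byte[] raw) {
        try {
            return InetAddress.getByAddress(raw).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Expected 4 or 16 address bytes, got " + raw.length, e);
        }
    }

    // Decodes a Base64 string back to bytes, then converts as above.
    static String fromBase64(String encoded) {
        return toAddressString(Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 192, (byte) 168, 0, 1};
        System.out.println(toAddressString(raw));

        String encoded = Base64.getEncoder().encodeToString(raw);
        System.out.println(fromBase64(encoded));
    }
}
```

Both calls in `main` yield the same dotted decimal string for the sample bytes. Where such a conversion would live in a pipeline is exactly the open question this comment raises.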

@dlvenable
Member

@p1ck, are you able to add a unit test that demonstrates the failure this code change fixes? Having a unit test lets us be sure that the behavior doesn't regress.

Development

Successfully merging this pull request may close these issues.

[BUG] AvroInputCodec exception on fixed type Avro fields
