| title | sidebar_position | id | license |
|---|---|---|---|
Row Format |
2 |
row_format_spec |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
|
Apache Fory Row Format is a cache-friendly, random-access binary format designed for high-performance data processing. Unlike traditional serialization formats that require full deserialization, the row format enables:
- Random Field Access: Read individual fields without deserializing the entire row
- Zero-Copy Operations: Direct memory access without data transformation
- Cache-Friendly Layout: Optimized memory layout for CPU cache efficiency
- Cross-Language Support: Consistent binary format across Java, C++, and Python
Fory provides two row format variants:
| Format | Languages | Use Case |
|---|---|---|
| Standard Format | Java, C++, Python | Cross-language compatibility |
| Compact Format | Java only | Space efficiency, smaller rows |
| Feature | Standard Format | Compact Format |
|---|---|---|
| Field Slot Size | Fixed 8 bytes | Natural width (1, 2, 4, or 8 bytes) |
| Null Bitmap Size | 8-byte aligned | Byte-aligned, can borrow padding |
| Null Bitmap Position | Before field slots | After field slots (at end) |
| Fixed-Size Structs | Variable region (offset+size) | Inline in fixed region |
| Field Ordering | Schema-defined order | Sorted by alignment |
| All Non-Nullable | Bitmap still present | Bitmap skipped entirely |
| Alignment | Strict 8-byte | Relaxed (2, 4, or 8-byte) |
The standard format prioritizes cross-language compatibility and simplicity with uniform 8-byte field slots.
- 8-Byte Alignment: All major structures are aligned to 8-byte boundaries for optimal memory access
- Fixed-Width Field Slots: Every field uses an 8-byte slot for uniform offset calculation
- Null Bitmap: Compact null tracking using bit vectors
- Relative Offsets: Variable-length data uses relative offsets for sub-buffer navigation
A row stores structured data with the following layout:
+----------------+------------------+------------------+-----+------------------+------------------+
| Null Bitmap | Field 0 Slot | Field 1 Slot | ... | Field N-1 Slot | Variable Data |
+----------------+------------------+------------------+-----+------------------+------------------+
| B bytes | 8 bytes | 8 bytes | | 8 bytes | Variable size |
The null bitmap tracks which fields contain null values:
- Size:
((num_fields + 63) / 64) * 8bytes (rounded up to nearest 8-byte word) - Encoding: Each bit corresponds to a field index
- Bit value
1= field is null - Bit value
0= field is not null
- Bit value
- Bit Order: Bit 0 of the first byte corresponds to field 0
Example: For 10 fields, bitmap size = ((10 + 63) / 64) * 8 = 8 bytes
Each field occupies a fixed 8-byte slot regardless of its actual data type:
- Slot Offset:
bitmap_size + field_index * 8 - Total Fixed Region:
bitmap_size + num_fields * 8bytes
Field Slot Contents by Type:
| Type Category | Slot Contents |
|---|---|
| Fixed-width | Value stored directly (zero-padded) |
| Variable-width | Offset + Size encoding (see below) |
Variable-length fields (strings, arrays, maps, nested structs) store an offset-size pair in their slot:
+---------------------------+---------------------------+
| Relative Offset | Size |
| (32 bits) | (32 bits) |
+---------------------------+---------------------------+
|<-------------- 64-bit field slot value -------------->|
- Relative Offset (upper 32 bits): Offset from the row's base address
- Size (lower 32 bits): Size of the variable-width data in bytes
Encoding:
offset_and_size = (relative_offset << 32) | size
Decoding:
relative_offset = (offset_and_size >> 32) & 0xFFFFFFFF
size = offset_and_size & 0xFFFFFFFF
Variable-length data is stored after the fixed region:
- Data is written sequentially as fields are set
- Each variable-length value is padded to 8-byte alignment
- Padding bytes are zeroed for deterministic output
Arrays store homogeneous sequences of elements:
+------------------+------------------+------------------+
| Element Count | Null Bitmap | Element Data |
+------------------+------------------+------------------+
| 8 bytes | B bytes | Variable size |
| Field | Size | Description |
|---|---|---|
| Element Count | 8 bytes | Number of elements (uint64) |
| Null Bitmap | ((count + 63) / 64) * 8 bytes |
Per-element null flags |
Header Size: 8 + ((num_elements + 63) / 64) * 8 bytes
Elements are stored contiguously after the header:
- Fixed-width elements: Stored with their natural width (1, 2, 4, or 8 bytes)
- Variable-width elements: Stored as 8-byte offset+size pairs
Element Offset: header_size + element_index * element_size
Data Region Size: Rounded up to nearest 8-byte boundary
| Element Type | Element Size |
|---|---|
| bool | 1 byte |
| int8 | 1 byte |
| int16 | 2 bytes |
| int32 | 4 bytes |
| int64 | 8 bytes |
| float32 | 4 bytes |
| float64 | 8 bytes |
| string/binary | 8 bytes (offset+size) |
| array/map/struct | 8 bytes (offset+size) |
Maps store key-value pairs as two separate arrays:
+------------------+------------------+------------------+
| Keys Array Size | Keys Array | Values Array |
+------------------+------------------+------------------+
| 8 bytes | Variable size | Variable size |
| Field | Size | Description |
|---|---|---|
| Keys Array Size | 8 bytes | Total size of keys array in bytes |
| Keys Array | Variable | Full array structure for keys |
| Values Array | Variable | Full array structure for values |
Keys Array Offset: base_offset + 8
Values Array Offset: base_offset + 8 + keys_array_size
Both keys and values arrays follow the standard array binary layout.
Nested structs are stored as complete row structures within the variable data region:
- Parent field slot contains offset+size pointing to nested row
- Nested row has its own null bitmap and field slots
- Supports arbitrary nesting depth
Parent Row:
+----------------+------------------+------------------+
| Null Bitmap | ... Slots ... | Nested Row Data |
+----------------+------------------+------------------+
| ^
| offset+size |
+------------------->+
Nested Row:
+----------------+------------------+------------------+
| Null Bitmap | Field Slots | Variable Data |
+----------------+------------------+------------------+
The compact format provides space-efficient encoding with additional optimizations. It is currently implemented in Java only.
Note: The compact format is still under development and may not be stable yet.
- Natural Width Storage: Fixed-size fields use their natural byte width instead of 8 bytes
- Alignment-Based Field Sorting: Fields are sorted by alignment requirements to minimize padding
- Conditional Null Bitmap: Null bitmap is omitted when all fields are non-nullable
- Inline Fixed-Size Structs: Nested structs with all fixed-size fields are stored inline
+------------------+------------------+-----+------------------+----------------+------------------+
| Field 0 Value | Field 1 Value | ... | Field N-1 Value | Null Bitmap | Variable Data |
+------------------+------------------+-----+------------------+----------------+------------------+
| W0 bytes | W1 bytes | | WN-1 bytes | B bytes (opt) | Variable size |
- Field Slot Sizes: Each field uses its natural width (Wi = type width or 8 for variable)
- Null Bitmap Position: Placed after field slots, can borrow alignment padding
- Field Order: Fields are sorted by alignment (8-byte → 4-byte → 2-byte → 1-byte → variable)
- Conditional Bitmap: Skipped entirely if all fields are non-nullable
- Size:
(num_nullable_fields + 7) / 8bytes (byte-aligned, not 8-byte aligned) - Skipped: When all fields are primitive/non-nullable
- Position: After all fixed-size field slots, can use alignment padding space
Fields are sorted to minimize padding and optimize alignment:
Priority order (highest to lowest):
1. Fields with 8-byte alignment (int64, float64, variable-width)
2. Fields with 4-byte alignment (int32, float32)
3. Fields with 2-byte alignment (int16)
4. Fields with 1-byte alignment (int8, bool)
Within each alignment group, larger fields come first.
Nested structs with all fixed-size fields are stored inline in the parent row:
Standard Format (nested struct with 2 int32 fields):
Parent slot: [offset (4 bytes) | size (4 bytes)] → Points to nested row (8+ bytes elsewhere)
Compact Format (same nested struct):
Parent slot: [int32 field 0 | int32 field 1] → 8 bytes total, inline
This eliminates the offset+size indirection for fixed-size nested structures.
A field's fixed width is determined recursively:
- Primitive types: Natural byte width (1, 2, 4, or 8)
- Struct types: Sum of all child fixed widths (if all children are fixed-width)
- Variable types (string, array, map): Returns -1 (uses 8-byte offset+size slot)
fixed_width(field) =
if primitive: type_width
if struct and all_children_fixed: header_bytes + sum(fixed_width(child) for each child)
else: -1 (variable, uses 8-byte slot)
+------------------+------------------+------------------+
| Element Count | Null Bitmap | Element Data |
+------------------+------------------+------------------+
| 4 bytes | B bytes (opt) | Variable size |
| Field | Size | Description |
|---|---|---|
| Element Count | 4 bytes | Number of elements (int32) |
| Null Bitmap | (count + 7) / 8 bytes (opt) |
Per-element null flags |
Header Size Calculation:
header_size = 4 + (element_nullable ? (num_elements + 7) / 8 : 0)
// Round to 8-byte boundary only if element width is 8-byte aligned
if (fixed_width % 8 == 0):
header_size = round_to_8_bytes(header_size)
- Element Count: 4 bytes instead of 8 bytes
- Null Bitmap: Byte-aligned, skipped if elements are non-nullable
- Fixed-Size Structs: Inline storage for fixed-width struct elements
The following specifications apply to both standard and compact formats.
| Type | Width | Encoding |
|---|---|---|
| bool | 1 byte | 0x00 (false) or 0x01 (true) |
| int8 | 1 byte | Two's complement |
| int16 | 2 bytes | Two's complement, little-endian |
| int32 | 4 bytes | Two's complement, little-endian |
| int64 | 8 bytes | Two's complement, little-endian |
| float32 | 4 bytes | IEEE 754 single precision |
| float64 | 8 bytes | IEEE 754 double precision |
| Type | Width | Encoding |
|---|---|---|
| timestamp | 8 bytes | Microseconds since Unix epoch (int64) |
| date32 | 4 bytes | Days since Unix epoch (int32) |
| duration | 8 bytes | Duration in microseconds (int64) |
- Encoding: UTF-8 for strings, raw bytes for binary
- Storage: Offset+size pair in field slot, data in variable region
- Padding: Data padded to 8-byte alignment (standard) or natural alignment (compact)
- Null fields have their corresponding bit set to 1 in the null bitmap
- Field slot contents are undefined for null fields (standard) or zeroed (compact)
- Reading a null field returns a null/empty value indicator
- Null elements have their corresponding bit set to 1 in the array's null bitmap
- Element data is undefined for null elements
- Compact format: Bitmap skipped if elements are non-nullable
When reading variable-width data from a null field:
- Returns size of -1 or equivalent null indicator
- No data access is performed
- Null Bitmap: Size rounded to 8-byte boundary
- Field Slots: Always 8 bytes each
- Variable Data: Each value padded to 8-byte boundary
- Array Data: Total data region padded to 8-byte boundary
- Field Slots: Natural width (1, 2, 4, or 8 bytes)
- Null Bitmap: Byte-aligned, placed after fields
- Variable Data: Padded to 8-byte boundary only when needed
- Header: May use relaxed alignment for smaller overhead
- All padding bytes must be set to zero
- Ensures deterministic serialization output
- Prevents information leakage from uninitialized memory
row_size = bitmap_size + num_fields * 8 + variable_data_size
where:
bitmap_size = ((num_fields + 63) / 64) * 8
variable_data_size = sum of (padded_size for each variable field)
padded_size = ((size + 7) / 8) * 8
row_size = fixed_region_size + bitmap_size + variable_data_size
where:
fixed_region_size = sum of (fixed_width(field) or 8 for each field)
bitmap_size = all_non_nullable ? 0 : (num_nullable_fields + 7) / 8
// May be rounded to 8-byte boundary if has variable fields
array_size = header_size + data_size
where:
header_size = 8 + ((num_elements + 63) / 64) * 8
data_size = ((num_elements * element_size + 7) / 8) * 8
array_size = header_size + data_size
where:
header_size = 4 + (element_nullable ? (num_elements + 7) / 8 : 0)
// header_size rounded to 8 if element_width % 8 == 0
data_size = num_elements * element_width
map_size = 8 + keys_array_size + values_array_size
| Component | Standard Format | Compact Format |
|---|---|---|
| Row Header | ((N + 63) / 64) * 8 bytes |
0 or (N + 7) / 8 bytes (at end) |
| Row Field Slots | N * 8 bytes |
sum(field_widths) bytes |
| Array Header | 8 + ((E + 63) / 64) * 8 bytes |
4 + (E + 7) / 8 bytes (if nullable) |
| Array Elements | E * element_size (8-aligned) |
E * element_width |
| Map Header | 8 bytes | 8 bytes |
| Offset+Size Pair | 8 bytes (32-bit offset + size) | 8 bytes (same) |
Where N = number of fields, E = number of elements
| Category | Storage Width | Standard Slot | Compact Slot |
|---|---|---|---|
| bool | 1 byte | 8 bytes | 1 byte |
| int8 | 1 byte | 8 bytes | 1 byte |
| int16 | 2 bytes | 8 bytes | 2 bytes |
| int32 | 4 bytes | 8 bytes | 4 bytes |
| int64 | 8 bytes | 8 bytes | 8 bytes |
| float32 | 4 bytes | 8 bytes | 4 bytes |
| float64 | 8 bytes | 8 bytes | 8 bytes |
| string/binary | Variable | 8 bytes | 8 bytes |
| array | Variable | 8 bytes | 8 bytes |
| map | Variable | 8 bytes | 8 bytes |
| struct | Variable | 8 bytes | inline or 8 |
- All multi-byte integers are stored in little-endian format
- Floating-point values use native IEEE 754 representation
- Writers must zero padding bytes to prevent information leakage
- Readers must validate offsets and sizes before accessing data
- Buffer bounds checking is required for untrusted input
Standard Format:
- Fixed 8-byte slots enable O(1) field access with simple arithmetic
- 8-byte alignment optimizes CPU cache line usage
- Best for cross-language interoperability
Compact Format:
- Smaller row sizes reduce memory bandwidth
- Field sorting minimizes padding waste
- Inline structs eliminate pointer chasing
- Relaxed alignment may have slight CPU overhead on some architectures
| Scenario | Recommended Format |
|---|---|
| Cross-language data exchange | Standard |
| Java-only, memory-constrained | Compact |
| Many small primitive fields | Compact |
| Many nested fixed-size structs | Compact |
| Maximum read performance | Standard |
| Interoperability with C++/Python | Standard |