pa.Table.from_struct_array fails for an empty array #48344

@dangotbanned

Description

Describe the bug, including details regarding any error messages, version, and platform.

This is the inverse of #46355, so I'll continue from there 😅

Repro

import pyarrow as pa

table = pa.table({"a": pa.array([]), "b": pa.array([], type=pa.float64())})
array = table.to_struct_array()

print(array)
print(array.type.fields)
pa.Table.from_struct_array(array)

Output

<pyarrow.lib.ChunkedArray object at 0x...>
[

]
[pyarrow.Field<a: null>, pyarrow.Field<b: double>]

./pyarrow/table.pxi:4945, in pyarrow.lib.Table.from_struct_array()

./pyarrow/table.pxi:5032, in pyarrow.lib.Table.from_batches()

ValueError: Must pass schema, or at least one RecordBatch

Workaround

I'm already doing quite a lot of version branching here. In short, the workaround looks like:

import pyarrow as pa

table = pa.table({"a": pa.array([]), "b": pa.array([], type=pa.float64())})
array = table.to_struct_array()

from_struct_array = pa.schema(array.type.fields).empty_table()
print(from_struct_array)

Output

pyarrow.Table
a: null
b: double
----
a: [0 nulls]
b: [[]]

Component(s)

Python
