Describe the bug, including details regarding any error messages, version, and platform.
I started writing this as a comment on #40423, but the warning only started for me after I updated to arrow_22.0.0 (and R-4.5), and that other issue is from 2024, so I suspect a different cause. It may also be related to #32729. A common theme is the presence of attributes in the columns (as shown in the second link).
I can trigger the issue with a named vector as one of the columns:
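library(arrow)
library(dplyr)
# setNames(nm = cyl) turns cyl into a named vector (names are the values themselves)
mutate(mtcars, cyl = setNames(nm = cyl)) |>
  arrow_table() |>
  rename_with(.fn = toupper)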
# Warning: Invalid metadata$r
# Warning: Invalid metadata$r
# Table (query)
# MPG: double
# CYL: double
# DISP: double
# HP: double
# DRAT: double
# WT: double
# QSEC: double
# VS: double
# AM: double
# GEAR: double
# CARB: double
# See $.data for the source Arrow object
With open_dataset(), this can be hacked around by removing the "names" component of the attributes, but that does not work with a table created with arrow_table().
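A possible user-side workaround, which I have not verified against arrow 22.0.0, is to drop the names before the data ever reaches arrow_table(), so that no "names" attribute ends up in the R metadata. The sketch below uses only base R; strip_column_names() is a hypothetical helper name, not part of arrow or dplyr.

```r
# Hypothetical helper (not an arrow function): remove the "names"
# attribute from every column of a data frame, leaving values intact.
strip_column_names <- function(df) {
  df[] <- lapply(df, unname)
  df
}

# Build a data frame whose cyl column is a named vector, as in the reprex.
named <- mtcars
named$cyl <- setNames(nm = named$cyl)

# Columns of `clean` carry no names; the values are unchanged.
clean <- strip_column_names(named)
```

The cleaned data frame could then be passed to arrow_table() as usual. This loses the names, so it only helps when they are not needed downstream.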
The failure occurs at arrow/r/R/metadata.R (line 211 in 29586f4):
attributes(x)[names(r_metadata$attributes)] <- r_metadata$attributes
# Error in attributes(x)[names(r_metadata$attributes)] <- r_metadata$attributes :
# 'names' attribute [32] must be the same length as the vector [0]
### for context
x
# numeric(0)
attributes(x)
# NULL
r_metadata$attributes
# $names
# [1] "6" "6" "4" "6" "8" "6" "8" "4" "4" "6" "6" "8" "8" "8" "8" "8" "8" "4" "4" "4" "4" "8" "8" "8" "8" "4" "4" "4" "8" "6" "8" "4"
The error occurs because base R itself requires the "names" attribute to be the same length as the data, and at this point in the call x has length 0 (numeric(0)).
The underlying issue is that x here is still lazy, with numeric(0) as a placeholder. Is it possible to have the data already realized at this point for this specific code path?
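The same failure can be reproduced without arrow at all, which suggests the problem is purely the length mismatch. This is a minimal base R sketch; attrs stands in for r_metadata$attributes and is my own variable name.

```r
# Zero-length placeholder, like the lazy x seen in the debugger.
x <- numeric(0)

# Stand-in for r_metadata$attributes: 32 names for a 32-row column.
attrs <- list(names = as.character(mtcars$cyl))

# Same assignment as in metadata.R; capture the error message
# instead of stopping the script.
res <- tryCatch(
  {
    attributes(x)[names(attrs)] <- attrs
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(res)
```

With a vector of the matching length (for example x <- mtcars$cyl) the same assignment succeeds, so the attributes themselves look valid.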
Component(s)
R