
fails for datasets with base dataset that get auto-converted to parquet #93

@ArneBinder

Since datasets>=2.16.0, Hugging Face datasets are auto-converted to Parquet where possible (see huggingface/datasets#6448). This breaks the corresponding PIE datasets: a PIE dataset builder must derive from the same builder parent class (ArrowBasedBuilder or GeneratorBasedBuilder) as its base dataset builder, and auto-conversion may have changed that parent class from GeneratorBasedBuilder to ArrowBasedBuilder.

To fix this, the respective PIE dataset builders need to be derived from ArrowBasedBuilder when using datasets>=2.16.0.
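The rule above can be written down as a small decision function. This is only a sketch under stated assumptions: `required_pie_base_class`, `parse_version`, and the `auto_converted` flag are hypothetical names used for illustration and are not part of datasets or of the PIE code base. The class names are returned as strings so the snippet runs without datasets installed.

```python
import re

# Version from which, per this issue, Hub datasets may be auto-converted to Parquet.
AUTO_CONVERSION_MIN_VERSION = (2, 16, 0)


def parse_version(version: str) -> tuple:
    """Turn '2.16.0' or '2.16.1.dev0' into a comparable tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component (e.g. '2.16') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def required_pie_base_class(
    datasets_version: str, original_base_class: str, auto_converted: bool
) -> str:
    """Hypothetical helper: name the class a PIE dataset builder should derive from.

    `auto_converted` is an assumption of this sketch: the caller has to know
    whether the base dataset was converted to Parquet on the Hub.
    """
    if auto_converted and parse_version(datasets_version) >= AUTO_CONVERSION_MIN_VERSION:
        return "ArrowBasedBuilder"
    # Older datasets versions, or datasets that were not converted, keep their class.
    return original_base_class


if __name__ == "__main__":
    print(required_pie_base_class("2.16.0", "GeneratorBasedBuilder", True))
    print(required_pie_base_class("2.15.0", "GeneratorBasedBuilder", True))
```

In a real builder the returned name would correspond to deriving from `datasets.ArrowBasedBuilder` or `datasets.GeneratorBasedBuilder`. How a PIE builder would detect the conversion is left open here.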

Affected PIE datasets:

  • conll2003
  • imdb
  • squad_v2

Edit: For now, we just apply the workaround in #94. The real fix needs to go together with raising the minimum datasets version to >=2.16.0.

Labels: bug
