Impossible to pass multiple ZIM languages as CSV #449

@benoit74

Probably since the upgrade to zimscraperlib 5, it is no longer possible to pass multiple ZIM languages as a comma-separated (CSV) value:

Traceback (most recent call last):
  File "/usr/bin/zimit", line 8, in <module>
    sys.exit(zimit.zimit())
             ~~~~~~~~~~~^^
  File "/app/zimit/lib/python3.13/site-packages/zimit/zimit.py", line 1247, in zimit
    sys.exit(run(sys.argv[1:]))
             ~~~^^^^^^^^^^^^^^
  File "/app/zimit/lib/python3.13/site-packages/zimit/zimit.py", line 852, in run
    res = warc2zim(warc2zim_args)
  File "/app/zimit/lib/python3.13/site-packages/warc2zim/main.py", line 168, in main
    return converter.run()
           ~~~~~~~~~~~~~^^
  File "/app/zimit/lib/python3.13/site-packages/warc2zim/converter.py", line 278, in run
    metadata.LanguageMetadata(self.language)
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "<@beartype(zimscraperlib.zim.metadata.TextListBasedMetadata.__init__) at 0x7ff107e04900>", line 59, in __init__
  File "/app/zimit/lib/python3.13/site-packages/zimscraperlib/zim/metadata.py", line 297, in __init__
    super().__init__(value=value, name=name)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(zimscraperlib.zim.metadata.MetadataBase.__init__) at 0x7ff107da6340>", line 57, in __init__
  File "/app/zimit/lib/python3.13/site-packages/zimscraperlib/zim/metadata.py", line 93, in __init__
    self.value = self.get_cleaned_value(value)
                 ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "<@beartype(zimscraperlib.zim.metadata.TextListBasedMetadata.get_cleaned_value) at 0x7ff107e04a40>", line 40, in get_cleaned_value
  File "/app/zimit/lib/python3.13/site-packages/zimscraperlib/zim/metadata.py", line 316, in get_cleaned_value
    raise ValueError(
        f"Following code(s) are not ISO-639-3: {','.join(invalid_codes)}"
    )
ValueError: Following code(s) are not ISO-639-3: eng,nld,spa

The problem is probably that we should not join the array of language codes back into a comma-separated string before passing it on:

return ",".join(langs)

This should probably be covered by some kind of unit / e2e test, either here or in zimit.
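
As a rough illustration of the suspected cause and of what such a test could assert, here is a minimal, self-contained sketch. It does not use zimscraperlib: `parse_languages` and `invalid_codes` are hypothetical stand-ins, and the sketch assumes (as the traceback suggests, but not verified here) that the metadata validator treats a plain string as a single language code.

```python
from __future__ import annotations

import re

# Stand-in for a real ISO-639-3 lookup: only checks the three-letter shape.
# The actual validation lives in zimscraperlib and is not reproduced here.
_ISO_639_3_SHAPE = re.compile(r"^[a-z]{3}$")


def parse_languages(raw: str) -> list[str]:
    """Split a CSV language argument into a list of trimmed, non-empty codes."""
    return [code.strip() for code in raw.split(",") if code.strip()]


def invalid_codes(value: str | list[str]) -> list[str]:
    """Hypothetical validator: a bare string is treated as ONE code."""
    codes = [value] if isinstance(value, str) else value
    return [code for code in codes if not _ISO_639_3_SHAPE.match(code)]


langs = parse_languages("eng, nld,spa")

# Joining back with commas yields one bogus "code", which fails validation.
assert invalid_codes(",".join(langs)) == ["eng,nld,spa"]

# Passing the list through unchanged validates each code separately.
assert invalid_codes(langs) == []
```

A real regression test would presumably pass a multi-language value through the actual conversion entry point rather than these stand-ins.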
