Error loading German model #103

@tannonk

Hi,

I'd like to use benepar to parse German. However, when I try to add the German model to the spaCy pipeline with nlp.add_pipe, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../lib/python3.10/site-packages/spacy/language.py", line 814, in add_pipe
    pipe_component = self.create_pipe(
  File ".../lib/python3.10/site-packages/spacy/language.py", line 702, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 756, in resolve
    resolved, _ = cls._make(
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 805, in _make
    filled, _, resolved = cls._fill(
  File ".../lib/python3.10/site-packages/confection/__init__.py", line 877, in _fill
    getter_result = getter(*args, **kwargs)
  File ".../lib/python3.10/site-packages/benepar/integrations/spacy_plugin.py", line 176, in create_benepar_component
    return BeneparComponent(
  File ".../lib/python3.10/site-packages/benepar/integrations/spacy_plugin.py", line 116, in __init__
    self._parser = load_trained_model(name)
  File ".../lib/python3.10/site-packages/benepar/integrations/downloader.py", line 34, in load_trained_model
    parser = ChartParser.from_trained(model_path)
  File ".../lib/python3.10/site-packages/benepar/parse_chart.py", line 186, in from_trained
    parser.load_state_dict(state_dict)
  File ".../lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ChartParser:
        Unexpected key(s) in state_dict: "pretrained_model.embeddings.position_ids".

To reproduce:

import spacy
import benepar

nlp = spacy.load('de_core_news_md')
nlp.add_pipe("benepar", config={"model": "benepar_de2"})

Libraries:

torch                    2.0.1
torch-struct             0.5
spacy                    3.6.1
benepar                  0.2.0
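
A version list like the one above can be collected programmatically, which makes it easier to compare a working and a failing environment. The following is a minimal sketch using only the standard library. The helper name `installed_versions` is hypothetical, and `transformers` is included on the assumption that the pretrained encoder named in the error key comes from that package; it is not part of the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages from the list above, plus "transformers" (an assumption: it is
# not in the original list, but may be relevant to the pretrained encoder).
PACKAGES = ["torch", "torch-struct", "spacy", "benepar", "transformers"]


def installed_versions(names):
    """Map each distribution name to its installed version string,
    or to "not installed" when no such distribution is found."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    # Print one "name  version" row per package.
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:<24} {ver}")
```

Running this in both environments and comparing the rows shows whether any dependency differs between them.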

If I swap out the models for their English counterparts (en_core_web_md, benepar_en3), it runs fine. Any idea why the German model fails to load?
