[Bug]: merge_nodes_and_edges calls missing entity_chunks_storage and relation_chunks_storage during multimodal processing #241

@ashah1992

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

During multimodal content ingestion, three merge_nodes_and_edges call sites in processor.py and modalprocessors.py do not pass entity_chunks_storage or relation_chunks_storage to lightrag.operate.merge_nodes_and_edges. The call site in modalprocessors.py also omits full_entities_storage and full_relations_storage.

Since these parameters default to None, the calls succeed silently — but entity-to-chunk and relation-to-chunk mappings are never persisted. This means multimodal entities are created in the knowledge graph and vector DB, but their chunk association metadata is lost, degrading retrieval quality for multimodal content.
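
The silent failure mode can be illustrated with a minimal, self-contained sketch. The `merge` function below is a hypothetical stand-in, not LightRAG's actual `merge_nodes_and_edges`; it only assumes, as described above, that an optional storage argument defaulting to None is skipped rather than rejected.

```python
import asyncio
from typing import Optional


async def merge(
    entities: dict[str, list[str]],
    graph: dict[str, bool],
    entity_chunks_storage: Optional[dict[str, list[str]]] = None,
) -> None:
    """Hypothetical stand-in for a merge step with an optional storage argument."""
    for name, chunk_ids in entities.items():
        graph[name] = True  # the entity itself is always written
        if entity_chunks_storage is not None:
            # The chunk association is persisted only when a storage is supplied.
            entity_chunks_storage.setdefault(name, []).extend(chunk_ids)


def demo() -> tuple[dict, dict, dict]:
    entities = {"Figure 1": ["chunk-a"]}
    graph_without: dict[str, bool] = {}
    graph_with: dict[str, bool] = {}
    chunks: dict[str, list[str]] = {}
    asyncio.run(merge(entities, graph_without))       # storage omitted: no error raised
    asyncio.run(merge(entities, graph_with, chunks))  # storage passed: mapping recorded
    return graph_without, graph_with, chunks
```

Both calls produce the same graph contents; only the second leaves a chunk mapping behind, which is why the omission is easy to miss.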

Affected Files (v1.2.10)

| File | Method | Missing Parameters |
|---|---|---|
| processor.py | _process_multimodal_content_individual (line ~756) | entity_chunks_storage, relation_chunks_storage |
| processor.py | _batch_merge_lightrag_style_type_aware (line ~1360) | entity_chunks_storage, relation_chunks_storage |
| modalprocessors.py | BaseModalProcessor._process_chunk_for_extraction (line ~780) | full_entities_storage, full_relations_storage, entity_chunks_storage, relation_chunks_storage |

Steps to reproduce

  1. Ingest a PDF containing images/tables via process_document_complete()
  2. Inspect kv_store_entity_chunks.json and kv_store_relation_chunks.json — multimodal entity/chunk mappings are absent
  3. Compare with text-only ingestion where mappings are correctly populated by LightRAG's internal pipeline
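
Step 2 can be scripted roughly as follows. This is a sketch under assumptions: it treats kv_store_entity_chunks.json as a JSON object whose top-level keys are entity names, and `working_dir` and `expected_entities` are hypothetical inputs you would supply from your own ingestion run. Adjust the key lookup if the store is laid out differently in your version.

```python
import json
from pathlib import Path


def load_keys(path: Path) -> set[str]:
    """Return the top-level keys of a JSON KV store file, or an empty set if absent."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    return set(data) if isinstance(data, dict) else set()


def missing_mappings(working_dir: str, expected_entities: list[str]) -> list[str]:
    """List expected entity names that have no entry in kv_store_entity_chunks.json."""
    keys = load_keys(Path(working_dir) / "kv_store_entity_chunks.json")
    return [name for name in expected_entities if name not in keys]
```

Running `missing_mappings` with the names of entities extracted from images or tables should return an empty list once the fix is applied; before the fix, those names are expected to be reported as missing.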

Expected Behavior

All merge_nodes_and_edges calls should pass the full set of storage instances, consistent with how lightrag itself calls the function internally during text ingestion:

await merge_nodes_and_edges(
    chunk_results=...,
    knowledge_graph_inst=self.lightrag.chunk_entity_relation_graph,
    entity_vdb=self.lightrag.entities_vdb,
    relationships_vdb=self.lightrag.relationships_vdb,
    global_config=self.lightrag.__dict__,
    full_entities_storage=self.lightrag.full_entities,       # missing in modalprocessors.py
    full_relations_storage=self.lightrag.full_relations,     # missing in modalprocessors.py
    pipeline_status=pipeline_status,
    pipeline_status_lock=pipeline_status_lock,
    llm_response_cache=self.lightrag.llm_response_cache,
    entity_chunks_storage=self.lightrag.entity_chunks,       # missing in all 3 call sites
    relation_chunks_storage=self.lightrag.relation_chunks,   # missing in all 3 call sites
    current_file_number=1,
    total_files=1,
    file_path=file_path,
)

Suggested Fix

processor.py — add two lines to both _process_multimodal_content_individual and _batch_merge_lightrag_style_type_aware:

             llm_response_cache=self.lightrag.llm_response_cache,
+            entity_chunks_storage=self.lightrag.entity_chunks,
+            relation_chunks_storage=self.lightrag.relation_chunks,
             current_file_number=1,

modalprocessors.py — add four lines to BaseModalProcessor._process_chunk_for_extraction:

             global_config=self.global_config,
+            full_entities_storage=self.lightrag.full_entities,
+            full_relations_storage=self.lightrag.full_relations,
             pipeline_status=pipeline_status,
             pipeline_status_lock=pipeline_status_lock,
             llm_response_cache=self.hashing_kv,
+            entity_chunks_storage=self.lightrag.entity_chunks,
+            relation_chunks_storage=self.lightrag.relation_chunks,
             current_file_number=1,
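
To keep these call sites from regressing, a test could record the keyword arguments passed to a mocked merge function and check them. The helper below is a hypothetical sketch and not part of either library; only the four parameter names come from this report. `AsyncMock` stands in for `merge_nodes_and_edges`, so nothing here exercises LightRAG itself.

```python
import asyncio
from unittest.mock import AsyncMock

# Storage parameters named in this report that every call site should supply.
REQUIRED_STORAGE_KWARGS = (
    "full_entities_storage",
    "full_relations_storage",
    "entity_chunks_storage",
    "relation_chunks_storage",
)


def missing_storage_kwargs(call_kwargs: dict) -> list[str]:
    """Return required storage kwargs that are absent or None in a recorded call."""
    return [k for k in REQUIRED_STORAGE_KWARGS if call_kwargs.get(k) is None]


def demo() -> list[str]:
    mock_merge = AsyncMock()  # stand-in for the patched merge function
    # Simulate a call site that passes only the two chunk storages.
    asyncio.run(
        mock_merge(entity_chunks_storage=object(), relation_chunks_storage=object())
    )
    return missing_storage_kwargs(mock_merge.call_args.kwargs)
```

In a real test, the mock would be patched over the name each module imports, and the assertion would be that `missing_storage_kwargs` returns an empty list for every recorded call.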

LightRAG Config Used

Paste your config here

Logs and screenshots

No response

Additional Information

  • LightRAG Version: 1.4.13
  • Operating System: 26.3.1
  • Python Version: 3.14
  • Related Issues:
