refactor: extract models.py, ui.py, docx_utils.py from index.py by Chetic · Pull Request #40 · Chetic/chunksilo

Chetic · 2026-02-26T05:30:34Z

Summary

Extract shared model utilities into models.py, eliminating duplication between index.py and search.py
Extract terminal UI classes (IndexingUI, FileProcessingContext, GracefulAbort) into ui.py (~560 lines)
Extract DOCX processing functions into docx_utils.py (~250 lines)
Modernize type annotations in index.py to Python 3.11+ syntax (list[], dict[], X | None)
Add ruff linter configuration to pyproject.toml

index.py drops from 2,655 to 1,720 lines (-35%).

Test plan

CI passes: all functional tests
ruff check src/chunksilo/ passes on new files
Verify no behavioral changes — purely mechanical extraction + import updates

🤖 Generated with Claude Code

…x.py

- Extract shared model utilities (_get_cached_model_path, resolve_flashrank_model_name, configure_offline_mode) into models.py, eliminating duplication between index.py and search.py
- Extract IndexingUI, FileProcessingContext, FileProcessingTimeoutError, GracefulAbort into ui.py (~560 lines)
- Extract DOCX processing (_parse_heading_level, _get_doc_temp_dir, _convert_doc_to_docx, split_docx_into_heading_documents) into docx_utils.py (~250 lines)
- Modernize type annotations in index.py to Python 3.11+ syntax (list[], dict[], X | None)
- Add ruff linter configuration to pyproject.toml

index.py drops from 2,655 to 1,720 lines (-35%).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move the import of EXCLUDED_EMBED_METADATA_KEYS, EXCLUDED_LLM_METADATA_KEYS, and get_heading_store to inside split_docx_into_heading_documents() to avoid the index -> docx_utils -> index circular import at module load time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
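
The function-local import described in this commit can be sketched with two throwaway modules. Everything here is an assumption made for illustration: `demo_index` and `demo_docx_utils` stand in for index.py and docx_utils.py, and `EXCLUDED_KEYS` and `split` stand in for the real constants and function.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for index.py: imports the helper module first, defines the constant after.
INDEX_SRC = '''\
from demo_docx_utils import split
EXCLUDED_KEYS = ["file_path"]
'''

# Stand-in for docx_utils.py: the import back into the index module is deferred
# to call time, when demo_index has finished initialising.
DOCX_SRC = '''\
def split(text):
    from demo_index import EXCLUDED_KEYS
    return {"text": text, "excluded": EXCLUDED_KEYS}
'''

def run_demo():
    """Write both modules to a temp dir, import the index stand-in, call split()."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_index.py").write_text(INDEX_SRC, encoding="utf-8")
        Path(tmp, "demo_docx_utils.py").write_text(DOCX_SRC, encoding="utf-8")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            index = importlib.import_module("demo_index")
            return index.split("hello")
        finally:
            # Leave the interpreter state as it was before the demo.
            sys.path.remove(tmp)
            sys.modules.pop("demo_index", None)
            sys.modules.pop("demo_docx_utils", None)

result = run_demo()
```

The cycle between the two modules still exists in this arrangement; it is only resolved later, at the first call.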

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4ee99ddb9

chatgpt-codex-connector · 2026-02-26T05:33:35Z

src/chunksilo/docx_utils.py

+from llama_index.core import Document as LlamaIndexDocument
+
+from . import cfgload
+from .index import EXCLUDED_EMBED_METADATA_KEYS, EXCLUDED_LLM_METADATA_KEYS, get_heading_store


Remove circular import between index and DOCX utils

Importing chunksilo.index now fails at module import time because index.py imports docx_utils, and docx_utils.py immediately imports EXCLUDED_EMBED_METADATA_KEYS, EXCLUDED_LLM_METADATA_KEYS, and get_heading_store back from index.py; those names are not defined yet when the first import is in progress, so Python raises ImportError from a partially initialized module. This blocks any workflow that loads chunksilo.index (including CLI indexing entrypoints) before runtime logic can execute.
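
The failure mode described above can be reproduced in isolation. This is a sketch with made-up module names (`circ_index`, `circ_docx_utils`) and a made-up constant (`EXCLUDED_KEYS`), not the actual chunksilo sources.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for index.py: the import of the helper module runs before the
# constant on the next line has been defined.
INDEX_SRC = '''\
from circ_docx_utils import split
EXCLUDED_KEYS = ["file_path"]
'''

# Stand-in for docx_utils.py: a module-level import back into the index module,
# which is only partially initialised at this point.
DOCX_SRC = '''\
from circ_index import EXCLUDED_KEYS

def split(text):
    return {"text": text, "excluded": EXCLUDED_KEYS}
'''

def import_error_message():
    """Return the ImportError text raised by the cycle, or None if it imports."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "circ_index.py").write_text(INDEX_SRC, encoding="utf-8")
        Path(tmp, "circ_docx_utils.py").write_text(DOCX_SRC, encoding="utf-8")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("circ_index")
        except ImportError as exc:
            return str(exc)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("circ_index", None)
            sys.modules.pop("circ_docx_utils", None)
    return None

message = import_error_message()
```

Swapping the two lines in `INDEX_SRC` would make this particular demo import cleanly, which shows how fragile ordering-based workarounds are.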

Pass heading_store, excluded_embed_metadata_keys, and excluded_llm_metadata_keys as parameters to split_docx_into_heading_documents() instead of importing them from index.py. This cleanly breaks the circular dependency without any runtime import tricks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
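
A rough sketch of the parameter-passing approach follows. The three keyword names come from the commit message; everything else is an assumption made for illustration: `Doc` stands in for the LlamaIndex document class, the heading store is modelled as a plain dict, and the function takes pre-split `(heading, body)` pairs rather than a real DOCX file. The actual signature in docx_utils.py may differ.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Hypothetical stand-in for the LlamaIndex document class."""
    text: str
    metadata: dict[str, str]
    excluded_embed_metadata_keys: list[str] = field(default_factory=list)
    excluded_llm_metadata_keys: list[str] = field(default_factory=list)

def split_into_heading_documents(
    source: str,
    sections: list[tuple[str, str]],
    *,
    heading_store: dict[str, list[str]],
    excluded_embed_metadata_keys: list[str],
    excluded_llm_metadata_keys: list[str],
) -> list[Doc]:
    """Build one Doc per heading; all collaborators arrive as arguments,
    so this module never needs to import from the caller's module."""
    heading_store[source] = [heading for heading, _ in sections]
    return [
        Doc(
            text=body,
            metadata={"file_path": source, "heading": heading},
            excluded_embed_metadata_keys=list(excluded_embed_metadata_keys),
            excluded_llm_metadata_keys=list(excluded_llm_metadata_keys),
        )
        for heading, body in sections
    ]

# The caller (index.py in the PR) owns the constants and the store and passes them in.
store: dict[str, list[str]] = {}
docs = split_into_heading_documents(
    "manual.docx",
    [("Intro", "Welcome."), ("Setup", "Install it.")],
    heading_store=store,
    excluded_embed_metadata_keys=["file_path"],
    excluded_llm_metadata_keys=["file_path", "heading"],
)
```

With this shape the dependency runs one way only, from the caller to the utility module, and the function can be tested with fake stores and key lists.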

Chetic and others added 2 commits February 26, 2026 06:30

chatgpt-codex-connector bot reviewed Feb 26, 2026

Chetic closed this Feb 26, 2026

Chetic wants to merge 3 commits into main from refactor/codebase-cleanup
