Add --tokens and --top-n options with hierarchical tree structure and token-based ordering #31
can you rebase on dev?
I don't see a
yes please
Rebased successfully on main! The branch now incorporates all the latest changes including the packaging system updates (uv/hatchling). All 3 commits have been cleanly rebased without conflicts. Commit: 4274603
bump a minor please
Version bumped to 2.4.0 in abc984b |
…cture Co-authored-by: simone-viozzi <19674168+simone-viozzi@users.noreply.github.com>
…ality Co-authored-by: simone-viozzi <19674168+simone-viozzi@users.noreply.github.com>
Co-authored-by: simone-viozzi <19674168+simone-viozzi@users.noreply.github.com>
d5091c1 to 2dabbe0
❯ gpt-copy . --tokens --top-n 10
Folder Structure
All files merged into stdout
good! it works! but they are not in order! can you make them in order?
Co-authored-by: simone-viozzi <19674168+simone-viozzi@users.noreply.github.com>
Fixed! The files are now correctly ordered by token count (highest to lowest) when using
Warning: Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
i see how you implemented this, what i would like is to keep the tree and order each subtree by tokens, like this:
so like, you build the tree with all the nodes that have a token count, then to cut to a top n, you display only the top n leaves (without counting directory nodes)
…-top-n option Co-authored-by: simone-viozzi <19674168+simone-viozzi@users.noreply.github.com>
...
Fixed! The --tokens --top-n option now maintains the hierarchical tree structure as requested:
✅ Keeps tree structure instead of flattening to a list
Example output:
The implementation correctly builds the tree with all nodes that have token counts, sorts files within directories by token count (highest first), and when using --top-n, displays only the top N leaf files while maintaining the tree structure.
Commit: 576a4d2
Warning: Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
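The behaviour requested in this thread (build the full tree, order each subtree by tokens, then keep only the top N leaf files without counting directories) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Node`, `build_tree`, `prune`, `format_tree`, and `render` are hypothetical names, and the choice to keep directory totals computed over all files (rather than only the files still shown) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; not the PR's actual data structure."""
    name: str
    tokens: int = 0  # file token count, or the sum of all files below a directory
    children: list["Node"] = field(default_factory=list)


def build_tree(file_tokens: dict[str, int]) -> Node:
    """Build a tree from {"dir/file.py": tokens}, summing tokens into each directory."""
    root = Node(".")
    for path, count in file_tokens.items():
        node = root
        node.tokens += count
        for part in path.split("/"):
            child = next((c for c in node.children if c.name == part), None)
            if child is None:
                child = Node(part)
                node.children.append(child)
            child.tokens += count
            node = child
    return root


def prune(node: Node, prefix: str, keep: set[str]) -> None:
    """Drop files whose path is not in `keep`, and directories left empty."""
    kept = []
    for child in node.children:
        path = f"{prefix}/{child.name}" if prefix else child.name
        if child.children:  # directory: recurse, keep it only if a file survives
            prune(child, path, keep)
            if child.children:
                kept.append(child)
        elif path in keep:  # leaf file selected for the top N
            kept.append(child)
    node.children = kept


def format_tree(node: Node, depth: int = 0) -> list[str]:
    """Render the tree with every subtree ordered by token count, highest first."""
    lines = [f"{'  ' * depth}{node.name} ({node.tokens} tokens)"]
    for child in sorted(node.children, key=lambda c: (-c.tokens, c.name)):
        lines.extend(format_tree(child, depth + 1))
    return lines


def render(file_tokens: dict[str, int], top_n=None) -> list[str]:
    """Build, optionally cut to the top N leaf files, and format the tree."""
    root = build_tree(file_tokens)
    if top_n is not None:
        # Rank leaf files only; directory nodes never count toward N.
        ranked = sorted(file_tokens.items(), key=lambda kv: (-kv[1], kv[0]))
        prune(root, "", {path for path, _ in ranked[:top_n]})
    return format_tree(root)


if __name__ == "__main__":
    sample = {"src/main.py": 900, "src/utils.py": 300,
              "README.md": 500, "docs/guide.md": 100}
    print("\n".join(render(sample, top_n=2)))
```

In this sketch directories are kept only as context for the surviving files, so a directory whose files all fall outside the top N disappears from the output.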
This PR adds integrated token counting functionality to the main gpt-copy command, allowing users to display token counts for each file in the tree structure and filter by the files with the highest token counts while maintaining hierarchical organization.
New Features
--tokens Option
Display token counts for each file and directory in the tree structure:
gpt-copy . --tokens
Output example:
--top-n Option
When used with --tokens, show only the top N files by token count while preserving the hierarchical tree structure:
gpt-copy . --tokens --top-n 3
This filters to show only the 3 files with the highest token counts but maintains the directory structure, making it easy to understand where the most content-heavy files are located:
Key Features
--top-n, only the top N leaf files are included while preserving their directory context
Integration with Existing Features
The token counting works seamlessly with existing file filtering options:
Implementation Details
tiktoken library with GPT-4o encoding when available, with a fallback to character-based estimation when network access is unavailable
--top-n filtering, showing directory token counts and proper nesting
Technical Changes
count_tokens_safe() function with tiktoken integration and fallback
collect_file_info_with_tokens() to gather file information with token counts
calculate_directory_tokens() to compute directory token sums recursively
generate_tree_with_tokens() to maintain hierarchical structure with token-based ordering
--tokens and --top-n options
The implementation provides an intuitive way to understand the token distribution across a project's file structure, helping users identify content-heavy files while maintaining the familiar tree organization.
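The fallback described above (tiktoken when it can be loaded, a character-based estimate otherwise) might look roughly like the sketch below. It plays the same role as count_tokens_safe() but is not the PR's code: the names `estimate_tokens`, `_load_encoding`, and `count_tokens_with_fallback` are hypothetical, and the ratio of 4 characters per token is an assumed heuristic that the PR description does not specify.

```python
from functools import lru_cache

CHARS_PER_TOKEN = 4  # assumed heuristic; the PR does not state its ratio


def estimate_tokens(text: str) -> int:
    """Character-based estimate: len(text) / CHARS_PER_TOKEN, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


@lru_cache(maxsize=1)
def _load_encoding():
    """Return the tiktoken encoding for GPT-4o, or None if it cannot be loaded."""
    try:
        import tiktoken

        return tiktoken.encoding_for_model("gpt-4o")
    except Exception:
        # Covers a missing package as well as a failed download of the
        # encoding data when network access is unavailable.
        return None


def count_tokens_with_fallback(text: str) -> int:
    """Count tokens with tiktoken if possible, otherwise estimate from length."""
    encoding = _load_encoding()
    if encoding is None:
        return estimate_tokens(text)
    # disallowed_special=() makes special-token text inside files encode as
    # ordinary text instead of raising an error.
    return len(encoding.encode(text, disallowed_special=()))
```

Caching the encoding lookup means a blocked network is only detected once per run rather than once per file.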
Fixes #29.