Add Inclusion/Exclusion File Filtering via Glob Patterns #9

Merged
simone-viozzi merged 5 commits into main from 1-add-parameters-for-inclusion-and-exlusion
Feb 16, 2025

Conversation

@simone-viozzi
Owner

Summary

This PR introduces two new, repeatable CLI options to allow users to filter files by inclusion and exclusion patterns. The options accept glob-style patterns (with optional brace expansion) so that users can intuitively select which files should be processed without having to provide full, exact paths. This feature provides more granular control over file selection and helps avoid redundant specification of similar paths.

Proposed Changes

  1. New CLI Options:

    • --include (-i):
      • Accepts one or more glob patterns.
      • When provided, only files whose relative paths match at least one include pattern will be processed.
    • --exclude (-e):
      • Accepts one or more glob patterns.
      • Any file matching an exclude pattern is omitted—even if it matches an include pattern.
  2. Pattern Matching:

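    • Use Python’s standard glob matching (fnmatch), or extend the existing use of the pathspec library, to support wildcards (e.g., "src/*.py"). Brace expansion (e.g., "src/{module1,module2}.py") is not built into fnmatch, so brace patterns need to be expanded into plain patterns before matching.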
    • Patterns will be applied against the file paths relative to the user-specified root directory.
  3. Processing Order:

    • Step 1: If include patterns are provided, filter the list of candidate files to those that match at least one of the patterns.
    • Step 2: Apply exclusion patterns to remove any files that match.
    • Note: Exclusion patterns take precedence over inclusion patterns.
  4. User Feedback (Optional):

    • When running in a verbose or “dry-run” mode, print the list of files matching the include and exclude filters so users can verify the selection.
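
The processing order above can be sketched in a few lines of Python. This is a minimal illustration, not the code merged in this PR: it assumes matching is done with the standard-library `fnmatch` module, that paths are already relative to the root and use `/` separators, and the names `expand_braces`, `matches_any`, and `filter_paths` are hypothetical.

```python
import fnmatch
import re
from typing import List, Sequence

# Matches the innermost "{a,b,c}" group (no nested braces inside it).
_BRACE = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> List[str]:
    """Expand 'src/{a,b}.py' into ['src/a.py', 'src/b.py'] (simplified, hypothetical helper)."""
    match = _BRACE.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded: List[str] = []
    for option in match.group(1).split(","):
        # Recurse so that patterns with several brace groups are fully expanded.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches_any(rel_path: str, patterns: Sequence[str]) -> bool:
    """True if rel_path matches at least one pattern after brace expansion."""
    return any(
        fnmatch.fnmatchcase(rel_path, expanded)
        for pattern in patterns
        for expanded in expand_braces(pattern)
    )


def filter_paths(
    rel_paths: Sequence[str],
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> List[str]:
    """Step 1: keep include matches (all files if no include). Step 2: drop exclude matches."""
    selected = [p for p in rel_paths if not include or matches_any(p, include)]
    # Exclusion runs last, so it always wins over inclusion.
    return [p for p in selected if not matches_any(p, exclude)]
```

One design point to be aware of with this sketch: in `fnmatch`, `*` also matches `/`, so `"src/*.py"` would select `src/tests/test_main.py` as well as `src/main.py`. A pathspec-based matcher would treat `*` differently, so the choice of matcher changes which files the example patterns select.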

Rationale

  • Intuitive Syntax:
    Glob patterns (e.g., "src/*.py") are widely recognized and understood by developers. This avoids the need for users to specify lengthy or exact file paths.

  • Flexibility:
    The ability to combine multiple patterns lets users select broad classes of files (e.g., all Python files) and then exclude specific ones (e.g., src/__init__.py), all in one command.

  • Consistency:
    The new options align with the .gitignore syntax already in use, so users familiar with that system will quickly understand how to use the new parameters.

  • Implementation Ease:
    Leveraging existing libraries (like pathspec or Python’s fnmatch) minimizes the development effort while providing robust pattern matching.

Examples

  • Include all Python files in the src directory:

    python gpt-copy /project/root --include "src/*.py"
  • Include multiple specific modules (using brace expansion):

    python gpt-copy /project/root --include "src/{module1,module2}.py"
  • Include all .py files in src but exclude the package initialization file:

    python gpt-copy /project/root --include "src/*.py" --exclude "src/__init__.py"
  • Exclude test files under src/tests:

    python gpt-copy /project/root --exclude "src/tests/*"

Acceptance Criteria

  1. CLI Interface:

    • The CLI must accept --include (alias -i) and --exclude (alias -e) as repeatable options.
    • The options accept one or more glob patterns.
  2. File Selection:

    • When --include is used, only files matching at least one include pattern are processed.
    • When --exclude is used, any file matching an exclude pattern is omitted—even if it would otherwise be included.
    • Exclusion patterns must override inclusion patterns.
  3. Pattern Matching:

    • The matching should support standard glob wildcards (e.g., *, ?) and brace expansion (e.g., {file1,file2}).
    • Patterns are applied to file paths relative to the root provided by the user.
  4. Testing & Documentation:

    • Unit tests must cover:
      • Matching files using include patterns.
      • Excluding files with exclude patterns.
      • Combined behavior when both options are provided.
    • Documentation (README and help text in the CLI) must explain the syntax and examples of usage.
    • A “dry-run” or verbose mode (if available) should output the matched file list for user verification.
  5. Backward Compatibility:

    • The new options must be optional.
    • Existing behavior (without specifying any include/exclude patterns) should remain unchanged.
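
The CLI criteria above (repeatable options, short aliases, optional with unchanged defaults) could be met along the following lines. The PR description does not say which argument parser the project uses, so the use of `argparse` and the `build_parser` function are assumptions made for illustration. Only the option names come from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser sketch; not the project's actual CLI definition."""
    parser = argparse.ArgumentParser(prog="gpt-copy")
    parser.add_argument("root", help="root directory to scan")
    # action="append" makes each option repeatable: every use adds one pattern.
    # default=[] means "no filtering", which keeps the existing behavior
    # when neither option is given.
    parser.add_argument(
        "-i", "--include", action="append", default=[], metavar="PATTERN",
        help="glob pattern of files to include (repeatable)",
    )
    parser.add_argument(
        "-e", "--exclude", action="append", default=[], metavar="PATTERN",
        help="glob pattern of files to exclude (repeatable, overrides --include)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("include:", args.include)
    print("exclude:", args.exclude)
```

With this setup, `-i "src/*.py" -i "docs/*.md"` yields a two-element `include` list, and omitting both options yields two empty lists, which a filter step can treat as "process everything".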

@simone-viozzi simone-viozzi linked an issue Feb 16, 2025 that may be closed by this pull request
@simone-viozzi simone-viozzi merged commit 292908e into main Feb 16, 2025
@simone-viozzi simone-viozzi deleted the 1-add-parameters-for-inclusion-and-exlusion branch February 16, 2025 17:33

Development

Successfully merging this pull request may close these issues.

add parameters for inclusion and exlusion
