File collection rules
Darren Li edited this page Apr 11, 2025 · 2 revisions
- First set of files is collected based on:
  - Extensions from `--exts`, `--add-exts`, `--single-line-exts`, `--single-line-add-exts`
    - `--exts` defaults to `txt,md,pdf,epub,odt,docx,fb2,ipynb,html,htm`
    - `--single-line-exts` defaults to `log,csv,tsv`
    - `--add-exts` and `--single-line-add-exts` both default to empty strings
  - `PATH`s provided as command line arguments, e.g. `dir0`, `dir1`, `file0` in `docfd dir0 dir1 file0`
    - `PATH`s default to `.` only when none of `--paths-from`, `--glob`, `--single-line-glob` are specified
  - Paths specified in `FILE` from `--paths-from FILE`
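The rules for the first set can be modeled roughly as follows. This is a minimal Python sketch, not docfd's actual implementation. The names `parse_exts` and `first_set` are hypothetical, and three behaviors are assumptions made for illustration: extensions are matched case-insensitively, directories given as `PATH`s are scanned recursively, and files listed explicitly are filtered by extension like any other file.

```python
import os

DEFAULT_EXTS = "txt,md,pdf,epub,odt,docx,fb2,ipynb,html,htm"
DEFAULT_SINGLE_LINE_EXTS = "log,csv,tsv"


def parse_exts(spec):
    """Turn a comma-separated list such as "log,csv" into {".log", ".csv"}."""
    return {"." + e.strip().lstrip(".").lower() for e in spec.split(",") if e.strip()}


def first_set(paths, paths_from=None, glob_given=False,
              exts=DEFAULT_EXTS, add_exts="",
              single_line_exts=DEFAULT_SINGLE_LINE_EXTS, single_line_add_exts=""):
    """Collect files reachable from PATHs and --paths-from entries, by extension.

    paths_from is the list of paths read from FILE, or None when
    --paths-from was not given. glob_given is True when --glob or
    --single-line-glob was given.
    """
    # All four extension options contribute to the set of accepted extensions.
    wanted = set()
    for spec in (exts, add_exts, single_line_exts, single_line_add_exts):
        wanted |= parse_exts(spec)

    roots = list(paths)
    # PATHs fall back to "." only when none of --paths-from, --glob,
    # --single-line-glob was specified.
    if not roots and paths_from is None and not glob_given:
        roots = ["."]
    roots += list(paths_from or [])

    found = set()
    for root in roots:
        if os.path.isfile(root):
            candidates = [root]
        else:
            # Assumption: directories are walked recursively.
            candidates = [os.path.join(d, f) for d, _, fs in os.walk(root) for f in fs]
        for path in candidates:
            if os.path.splitext(path)[1].lower() in wanted:
                found.add(path)
    return found
```

Under these assumptions, `first_set(["dir0", "dir1", "file0"])` mirrors `docfd dir0 dir1 file0`, and `first_set([])` scans the current directory.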
- Second set of files is collected based on `--single-line-glob`
- Third set of files is collected based on `--glob`
- Directories captured by globs are not recursively scanned, i.e. a file must be matched directly by a glob to be included in the second or third set of files
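The "no recursive scan" rule for globs can be illustrated with a short Python sketch. Python's `glob` module is used here only as a stand-in; docfd's own glob syntax may differ, and the helper name `files_from_glob` is hypothetical.

```python
import glob
import os


def files_from_glob(pattern):
    """Return only the files that the pattern matches directly.

    A directory that matches the pattern is dropped rather than walked,
    so files inside it are picked up only if the pattern also matches them.
    """
    return {p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p)}
```

With this model, a pattern such as `notes/*` yields the files directly under `notes` but nothing from `notes/archive/`, while `notes/**/*.txt` reaches into subdirectories because the pattern itself matches those files.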
- Files are categorized for single line search mode and default search mode
  - Default search mode is multiline search mode, unless `--single-line` is used
- A file falls into the single line search mode category if it satisfies any of the following:
  - File is in `PATH`s or in `FILE` from `--paths-from FILE`, and the extension falls into `--single-line-exts` or `--single-line-add-exts`
  - File is captured by `--single-line-glob`
  - File is captured by `--glob`, and the extension falls into `--single-line-exts` or `--single-line-add-exts`
- Otherwise, the file falls into the default search mode category
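The categorization rules can be summarized as a small decision function. This is a hedged Python sketch rather than docfd's code: the function name `search_mode`, the `sources` labels (`"path"`, `"single-line-glob"`, `"glob"`), and case-insensitive extension matching are all assumptions introduced for illustration.

```python
import os


def search_mode(path, sources, single_line_exts="log,csv,tsv",
                single_line_add_exts="", single_line_flag=False):
    """Return "single-line" or "multiline" for one collected file.

    sources is the set of ways the file was collected:
      "path"             - via PATHs or FILE from --paths-from FILE
      "single-line-glob" - via --single-line-glob
      "glob"             - via --glob
    """
    specs = single_line_exts.split(",") + single_line_add_exts.split(",")
    sl_exts = {e.strip().lstrip(".").lower() for e in specs if e.strip()}
    ext = os.path.splitext(path)[1].lstrip(".").lower()

    # Any one of the three conditions is enough for single line search mode.
    if "single-line-glob" in sources:
        return "single-line"
    if "path" in sources and ext in sl_exts:
        return "single-line"
    if "glob" in sources and ext in sl_exts:
        return "single-line"

    # Default search mode: multiline unless --single-line is used.
    return "single-line" if single_line_flag else "multiline"
```

For example, under these assumptions `search_mode("app.log", {"path"})` lands in single line search mode because `log` is in the default `--single-line-exts`, while `search_mode("notes.txt", {"glob"})` falls through to the default search mode.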