
Advanced usage

Darren Li edited this page Apr 29, 2025 · 4 revisions

Handling a large collection of files

With a large collection of files, the default cache limit might not be enough, or you might want to keep a stable cache dedicated to the collection.

The following script template may be handy in this situation for creating a collection-specific cache:

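#!/usr/bin/env bash

# Keep the cache inside the collection itself and raise the soft limit
# so that the default cache limit is not a constraint for this collection.
docfd --cache-dir /large/collection/.cache --cache-soft-limit 20000 /large/collection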
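If you have several such collections, the template can be turned into a small reusable function. This is only a sketch: the function name `docfd_collection` is made up for this example, and it assumes each collection keeps its cache in a `.cache` subdirectory as in the template above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: open a collection with its own cache directory.
# $1 is the collection directory; any further arguments are passed to docfd as-is.
docfd_collection() {
  local collection="${1:?usage: docfd_collection DIR [DOCFD_ARGS...]}"
  shift
  docfd --cache-dir "$collection/.cache" --cache-soft-limit 20000 "$@" "$collection"
}
```

For example, `docfd_collection /large/collection` runs the same command as the template.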

Search scope narrowing

Docfd allows you to restrict the next search to a range surrounding the current search results by typing nN, where N is the "level" to narrow to and ranges from 0 to 9. N=0 resets the search scope to cover the full document; any other value of N narrows the search scope to:

N * tokens_per_search_scope_level

tokens around the existing search results. You can adjust the multiplier via --tokens-per-search-scope-level.
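To make the arithmetic concrete, the sketch below prints the scope size for a few levels. The value 100 is a placeholder chosen for illustration, not necessarily Docfd's default; substitute whatever you pass via --tokens-per-search-scope-level.

```shell
#!/usr/bin/env bash

# Placeholder multiplier for illustration only; not necessarily Docfd's default.
tokens_per_search_scope_level=100

# Print N * tokens_per_search_scope_level for the given level N.
scope_tokens() {
  local level="$1"
  echo $(( level * tokens_per_search_scope_level ))
}

for level in 1 2 3; do
  echo "n${level}: $(scope_tokens "$level") tokens around the existing search results"
done
```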

Except for N=0, search scope narrowing is applied on top of the existing restrictions. For example, suppose we do the following steps:

  • search "example"
  • type n1
  • search "hello world"
  • type n1
  • search "document"
  • type n0
  • search "file"

The third search ("document") is then restricted to a range close to at least one search result of "example" and at least one search result of "hello world".

The fourth search ("file") is no longer restricted to any range as n0 resets the search scopes of all documents.

Editing/viewing command history

Undoing and redoing suffice for simple adjustments to the last one or two steps of your search, but sometimes it is handy to adjust the entire history of your interaction with Docfd in one go, e.g. to change a very early search term.

In Docfd, most interactions map to a command. To see this in action, try searching for, say, "fuzzy search" and then dropping all unlisted files (dL). Then press h, which makes Docfd open the command history in your default text editor:

search: fuzzy search
drop unlisted

# You are viewing/editing Docfd command history.
# If any change is made to this file, Docfd will replay the commands from the start.
#
# If a line is not blank and does not start with #,
# then the line should contain exactly one command.
# A command cannot be written across multiple lines.
#
# Starting point is v0, the full document store.
# Each command adds one to the version number.
# Command at the top is oldest, command at bottom is the newest.
#
# Note that for commands that accept text, all trailing text is trimmed and then used in full.
# This means " and ' are treated literally and are not used to delimit strings.
#
# Possible commands:
# - search: search phrase
# - clear search
# - filter: file.*pattern
# - clear filter
# - narrow level: 1
# - mark: /path/to/document
# - unmark: /path/to/document
# - unmark all
# - drop: /path/to/document
# - drop all except: /path/to/document
# - drop marked
# - drop unmarked
# - drop listed
# - drop unlisted

From here, you can make any adjustments you want, e.g. clear the entire history, reorder steps, or add new ones. If any adjustment is made, Docfd will replay the history from the start.

It is also handy to save the command history as a text file so that you can repeat the same search in the future, either by pasting it back in when editing the command history or by passing it to Docfd via the --commands-from argument.

Some later examples make use of this feature.
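A command history file can also be written by hand or from a script. The sketch below writes the two commands from the example above into a file; the function name and the file name fuzzy_search.docfd_commands are arbitrary choices for this example.

```shell
#!/usr/bin/env bash
set -eu

# Write a small command history to the given file:
# one command per line, oldest first.
write_command_history() {
  cat > "$1" <<'EOF'
search: fuzzy search
drop unlisted
EOF
}

# Hypothetical file name, chosen for this example.
write_command_history fuzzy_search.docfd_commands
```

The resulting file can then be replayed with `docfd --commands-from fuzzy_search.docfd_commands`.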

Using a saved search as part of your workflow

Let's say you work at XYZ Corp and want to find all your payslip PDFs, e.g. to pass them to another command.

From an earlier interactive search, you gathered that you can narrow down to your payslip PDFs by searching "XYZ corp", dropping the unrelated files, and then searching "payslip", and you saved the command history into the file xyz_payslips.docfd_commands:

search: XYZ corp
drop unlisted
search: payslip

To gather the list of files that would have been listed in Docfd interactive mode, simply combine --commands-from and -l/--files-with-match:

$ docfd --commands-from xyz_payslips.docfd_commands -l
/home/.../XYZ_corp-payslips/2025-04-04.pdf
/home/.../XYZ_corp-payslips/2025-04-11.pdf
...
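Since the paths are printed one per line, the list can be piped into other tools. As one possible sketch, the helper below copies every listed file into a single directory; the function name `copy_listed_files` and the destination ./payslips are made up for this example, and the loop assumes no path contains a newline.

```shell
#!/usr/bin/env bash
set -eu

# Copy every path read from stdin (one per line) into the destination directory.
copy_listed_files() {
  local dest="$1"
  mkdir -p "$dest"
  while IFS= read -r path; do
    # Skip blank lines.
    [ -n "$path" ] || continue
    cp -- "$path" "$dest/"
  done
}

# Usage (requires docfd and the saved command file from above):
#   docfd --commands-from xyz_payslips.docfd_commands -l | copy_listed_files ./payslips
```

Reading line by line rather than relying on word splitting keeps paths with spaces intact.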
