Commit 8b88e63

updated documentation and taskfile
1 parent be0cec0 commit 8b88e63

4 files changed

Lines changed: 76 additions & 21 deletions

samples/python/.gitignore

Lines changed: 5 additions & 1 deletion
@@ -13,6 +13,10 @@ __pycache__/
 .Python
 *.egg-info/

-# Virtual environment
+# uv
 .venv/
+uv.lock
+
+# Ruff cache
+.ruff_cache/

samples/python/.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+3.14

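Note on the new `.python-version` file: it pins the interpreter to `3.14`, and uv reads this file when it selects a Python for the project. A minimal sketch of checking the running interpreter against such a pin follows. The helper names `parse_pin`, `matches_pin`, and `check_pin` are hypothetical and not part of this commit, and the sketch assumes a plain dotted numeric pin like the one added here.

```python
import sys
from pathlib import Path


def parse_pin(text: str) -> tuple[int, ...]:
    """Turn a plain dotted pin such as '3.14' into (3, 14)."""
    return tuple(int(part) for part in text.strip().split("."))


def matches_pin(pin: tuple[int, ...], running=None) -> bool:
    """True when the interpreter's leading version fields equal the pin."""
    running = tuple(sys.version_info) if running is None else tuple(running)
    return running[: len(pin)] == pin


def check_pin(path: str = ".python-version") -> bool:
    """Compare the pin stored at `path` with the running interpreter."""
    return matches_pin(parse_pin(Path(path).read_text(encoding="utf-8")))
```

Calling `check_pin()` from `samples/python/` would report whether the current interpreter is a 3.14 release; under `uv run` that should normally hold, since uv resolves the interpreter from the same file.
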
samples/python/README.md

Lines changed: 60 additions & 10 deletions
@@ -51,9 +51,59 @@ This project uses [uv](https://docs.astral.sh/uv/) for fast, reliable Python pac

 4. Run the quickstart example:
 ```bash
-python quickstart.py
+uv run python quickstart.py
 ```

+## Development
+
+### Running Scripts with uv
+
+All Python scripts should be run using `uv run` to ensure they use the correct dependencies:
+
+```bash
+uv run python script.py
+```
+
+### Code Quality Tools
+
+This project uses multiple linting and type checking tools configured in `pyproject.toml`:
+
+#### Ruff (Fast Python linter)
+```bash
+# Check for linting issues
+uvx ruff check
+
+# Auto-fix issues
+uvx ruff check --fix
+
+# Format code
+uvx ruff format
+```
+
+#### Pylint (Comprehensive linter)
+```bash
+# Check all Python files
+uv run pylint *.py api/*.py helper_functions/*.py
+
+# Check specific file
+uv run pylint convert_cli.py
+```
+
+#### Pyright (Type checker)
+```bash
+# Type check all files
+uv run pyright
+
+# Type check specific file
+uv run pyright convert_cli.py
+```
+
+#### Run All Quality Checks
+```bash
+# Run all three tools
+uvx ruff check && uv run pylint *.py api/*.py helper_functions/*.py && uv run pyright
+```
+
 ## Architecture

 ### API Client Structure
@@ -101,43 +151,43 @@ api/

 ### Authentication
 ```bash
-python quickstart.py
+uv run python quickstart.py
 ```

 ### Convert Documents
 ```bash
 # Convert DOCX to PDF
-python convert_cli.py input.docx output.pdf pdf
+uv run python convert_cli.py input.docx output.pdf pdf

 # Convert PDF to DOCX
-python convert_cli.py input.pdf output.docx docx
+uv run python convert_cli.py input.pdf output.docx docx
 ```

 ### Extract Data
 ```bash
 # Extract tables
-python extract_data.py tables input.pdf output.json
+uv run python extract_data.py tables input.pdf output.json

 # Extract forms
-python extract_data.py forms input.pdf output.json
+uv run python extract_data.py forms input.pdf output.json
 ```

 ### Redact Content
 ```bash
 # Auto-detect and redact PII
-python smart_redact_pii.py input.pdf output.pdf
+uv run python smart_redact_pii.py input_folder output_folder

 # Redact specific keywords
-python redact_by_keyword.py input.pdf output.pdf "confidential" "secret"
+uv run python redact_by_keyword.py input_folder output_folder "confidential" "secret"
 ```

 ### Batch Operations
 ```bash
 # Convert all DOCX files to PDF
-python batch_process.py ./input ./output pdf "*.docx"
+uv run python batch_process.py ./input ./output pdf "*.docx"

 # Password protect all PDFs
-python bulk_password_protect.py ./input ./output "MyPassword123"
+uv run python bulk_password_protect.py ./input ./output "MyPassword123"
 ```

 ## Using the API Client

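Note on the README's "Run All Quality Checks" one-liner: the commands are chained with `&&`, so the first tool that fails stops the chain and the later tools never run. A minimal Python sketch that runs every tool and reports all results is given here. The `run_checks`, `project_checks`, and `main` helpers are hypothetical and not part of this commit; the command lines are the ones from the README, with the globs expanded in Python because no shell is involved.

```python
import glob
import subprocess
import sys


def run_checks(checks):
    """Run every (name, argv) pair and return {name: exit_code}."""
    results = {}
    for name, argv in checks:
        # check=False: a failing tool must not stop the remaining tools.
        results[name] = subprocess.run(argv, check=False).returncode
    return results


def project_checks():
    """Commands taken from the README, with globs expanded here."""
    py_files = sorted(
        glob.glob("*.py") + glob.glob("api/*.py") + glob.glob("helper_functions/*.py")
    )
    return [
        ("ruff", ["uvx", "ruff", "check"]),
        ("pylint", ["uv", "run", "pylint", *py_files]),
        ("pyright", ["uv", "run", "pyright"]),
    ]


def main() -> int:
    """Print one status line per tool; return 1 if any tool failed."""
    results = run_checks(project_checks())
    for name, code in results.items():
        status = "ok" if code == 0 else f"failed ({code})"
        print(f"{name}: {status}", file=sys.stderr if code else sys.stdout)
    return 1 if any(results.values()) else 0
```

Saved under a hypothetical name such as `check_all.py` with a final `sys.exit(main())`, this could be invoked as `uv run python check_all.py`. The `&&` form in the README remains the better fit when a fast stop on the first failure is wanted.
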
samples/python/Taskfile.yml

Lines changed: 10 additions & 10 deletions
@@ -4,49 +4,49 @@ tasks:
   smart-redact:
     desc: "Smart redact PII from documents in a folder (e.g., task smart-redact INPUT_DIR=./documents OUTPUT_DIR=./redacted)"
     cmds:
-      - python smart_redact_pii.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}"
+      - uv run python smart_redact_pii.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}"
     requires:
       vars: [INPUT_DIR, OUTPUT_DIR]

   convert:
     desc: "Convert document from CLI (e.g., task convert INPUT=doc.docx OUTPUT=doc.pdf FORMAT=pdf)"
     cmds:
-      - python convert_cli.py {{.INPUT}} {{.OUTPUT}} {{.FORMAT}}
+      - uv run python convert_cli.py {{.INPUT}} {{.OUTPUT}} {{.FORMAT}}
     requires:
       vars: [INPUT, OUTPUT, FORMAT]

   batch:
     desc: "Batch process documents (e.g., task batch INPUT_DIR=./docs OUTPUT_DIR=./output FORMAT=pdf PATTERN='*.docx')"
     cmds:
-      - python batch_process.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}" {{.FORMAT}} "{{.PATTERN | default "*"}}"
+      - uv run python batch_process.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}" {{.FORMAT}} "{{.PATTERN | default "*"}}"
     requires:
       vars: [INPUT_DIR, OUTPUT_DIR, FORMAT]

   extract:
     desc: "Extract forms or tables from PDF (e.g., task extract MODE=forms INPUT=form.pdf OUTPUT=data.json)"
     cmds:
-      - python extract_data.py {{.MODE}} "{{.INPUT}}" "{{.OUTPUT}}"
+      - uv run python extract_data.py {{.MODE}} "{{.INPUT}}" "{{.OUTPUT}}"
     requires:
       vars: [MODE, INPUT, OUTPUT]

   password-protect:
     desc: "Bulk password protect PDFs (e.g., task password-protect INPUT_DIR=./pdfs OUTPUT_DIR=./protected PASSWORD=secret123)"
     cmds:
-      - python bulk_password_protect.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}" "{{.PASSWORD}}"
+      - uv run python bulk_password_protect.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}" "{{.PASSWORD}}"
     requires:
       vars: [INPUT_DIR, OUTPUT_DIR, PASSWORD]

   redact-keyword:
     desc: "Redact specific keywords from a PDF (e.g., task redact-keyword INPUT=doc.pdf OUTPUT=redacted.pdf KEYWORDS='confidential secret')"
     cmds:
-      - python redact_by_keyword.py {{.INPUT}} {{.OUTPUT}} {{.KEYWORDS}}
+      - uv run python redact_by_keyword.py {{.INPUT}} {{.OUTPUT}} {{.KEYWORDS}}
     requires:
       vars: [INPUT, OUTPUT, KEYWORDS]

   install:
-    desc: "Install Python dependencies"
+    desc: "Install Python dependencies using uv"
     cmds:
-      - pip install -r requirements.txt
+      - uv sync

   setup:
     desc: "Setup environment (copy .env.example to .env)"
@@ -57,7 +57,7 @@ tasks:
   prepare-distribution:
     desc: "Prepare documents for external distribution (convert, compress, remove metadata) (e.g., task prepare-distribution INPUT_DIR=./brochures OUTPUT_DIR=./ready)"
     cmds:
-      - python prepare_pdf_for_distribution.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}"
+      - uv run python prepare_pdf_for_distribution.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}"
     requires:
       vars: [INPUT_DIR, OUTPUT_DIR]

@@ -66,6 +66,6 @@ tasks:
   onboard-employees:
     desc: "Send company policies to new employees for signature (e.g., task onboard-employees POLICIES_DIR=./policies CSV=new_hires.csv)"
     cmds:
-      - python employee_policy_onboarding.py "{{.POLICIES_DIR}}" "{{.CSV}}"
+      - uv run python employee_policy_onboarding.py "{{.POLICIES_DIR}}" "{{.CSV}}"
     requires:
       vars: [POLICIES_DIR, CSV]

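Note on quoting in the Taskfile commands: `redact-keyword` passes `{{.KEYWORDS}}` unquoted, so a value such as `'confidential secret'` reaches the script as two separate arguments, while the quoted `"{{.INPUT_DIR}}"` style used by other tasks keeps a path with spaces as one argument. The sketch below illustrates the difference. The `render` helper is hypothetical; it substitutes placeholders naively and uses `shlex.split` as an approximation of POSIX word splitting, on the assumption that Task's shell interpreter behaves the same way for these simple cases.

```python
import shlex


def render(template: str, variables: dict[str, str]) -> list[str]:
    """Substitute {{.NAME}} placeholders, then split the line like a POSIX shell."""
    line = template
    for name, value in variables.items():
        line = line.replace("{{." + name + "}}", value)
    return shlex.split(line)


# Unquoted placeholder: the value is split on whitespace into separate arguments.
unquoted = render(
    "uv run python redact_by_keyword.py {{.INPUT}} {{.OUTPUT}} {{.KEYWORDS}}",
    {"INPUT": "doc.pdf", "OUTPUT": "redacted.pdf", "KEYWORDS": "confidential secret"},
)

# Quoted placeholder: a value containing a space stays a single argument.
quoted = render(
    'uv run python smart_redact_pii.py "{{.INPUT_DIR}}" "{{.OUTPUT_DIR}}"',
    {"INPUT_DIR": "./my documents", "OUTPUT_DIR": "./redacted"},
)

print(unquoted[-2:])  # ['confidential', 'secret']
print(quoted[-2:])    # ['./my documents', './redacted']
```

The same splitting applies to the unquoted `{{.INPUT}}` and `{{.OUTPUT}}` in `convert` and `redact-keyword`, so file names containing spaces would need quoting there.
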