Skip to content

Version 2.4.1: CSV Parser Bulletproofing#287

Merged
vincentlaucsb merged 17 commits intomasterfrom
feb-15-2026
Feb 16, 2026
Merged

Version 2.4.1: CSV Parser Bulletproofing#287
vincentlaucsb merged 17 commits intomasterfrom
feb-15-2026

Conversation

@vincentlaucsb
Copy link
Copy Markdown
Owner

@vincentlaucsb vincentlaucsb commented Feb 16, 2026

  • Fixes Debug builds on linux always link against gcov #279 by disabling gcov by default
  • Fixes numerous address and thread safety issues with lock-free design
    • Replaced std::deque with fixed-size atomic block pointer table in CSVFieldList to prevent heap-use-after-free from internal map reallocation
    • CSVRow::iterator now captures shared_ptr to data, preventing use-after-free when row is reassigned
    • ThreadSafeDeque hot paths (empty(), is_waitable()) now lock-free with atomic flags
      • Removed unnecessary locks from notify_all()/kill_all()
    • Added iterator semantics documentation warning against forward-iterator algorithms
  • Made CSVField hex parsing templated
    • Does not break existing int usages
    • Now defaults to long long for increased overflow safety
  • Added ASan/TSan to GitHub CI/CD

Thread safety issue wasn't in library, but rather a test was was not constructed properly.
Allows for using long long for better overflow safety without breaking int usages
Fixed with low cost copies of simple data structures to avoid possible concurrency related issues
std::deque actually does have invalidation issues (of iterators) when expanding just like std::vector
Use more lockless sync. Fix some thread safety issues.
Concurrent R/W to std::map is UB
@github-advanced-security
Copy link
Copy Markdown
Contributor

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Comment on lines +11 to +72
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
sanitizer: [address, thread, undefined]
cxx_standard: [17, 20]
include:
- sanitizer: address
flag: "-fsanitize=address -fno-omit-frame-pointer"
name: "AddressSanitizer"
- sanitizer: thread
flag: "-fsanitize=thread"
name: "ThreadSanitizer"
- sanitizer: undefined
flag: "-fsanitize=undefined -fno-omit-frame-pointer"
name: "UndefinedBehaviorSanitizer"

env:
ASAN_OPTIONS: "detect_leaks=1:halt_on_error=0"
TSAN_OPTIONS: "halt_on_error=1"
UBSAN_OPTIONS: "halt_on_error=1"

steps:
- name: Checkout repository and submodules
uses: actions/checkout@v4
with:
submodules: recursive

- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential cmake

- name: Configure CMake with ${{ matrix.name }}
run: |
mkdir -p build
cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Debug \
-DCSV_CXX_STANDARD=${{ matrix.cxx_standard }} \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_C_COMPILER=gcc \
-DCMAKE_CXX_FLAGS="${{ matrix.flag }} -g" \
-DCMAKE_C_FLAGS="${{ matrix.flag }} -g"

- name: Build with ${{ matrix.name }}
run: cmake --build build --config Debug

- name: Test with ${{ matrix.name }}
working-directory: build
run: ctest --output-on-failure -V
timeout-minutes: 20

- name: Upload sanitizer logs
if: failure()
uses: actions/upload-artifact@v4
with:
name: sanitizer-logs-${{ matrix.sanitizer }}-std${{ matrix.cxx_standard }}
path: build/Testing/

valgrind:

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 2 months ago

In general, the fix is to explicitly set a restrictive permissions block in the workflow, ideally at the top level so it applies to all jobs, and only broaden it if/when a specific job needs more. For this workflow, the jobs only need to read repository contents and interact with artifacts; they do not need to write to code, PRs, or issues. A suitable minimal configuration is permissions: contents: read, which still allows actions/checkout to function, and does not interfere with actions/upload-artifact (which does not use GITHUB_TOKEN).

The single best fix without changing functionality is to add a top-level permissions section right under the name: (and before on:) that sets contents: read. This will apply to both sanitizers and valgrind jobs, satisfying CodeQL and enforcing least privilege. No other code, steps, or jobs need modifications, and no imports or external libraries are involved since this is a YAML configuration change only.

Specifically, in .github/workflows/sanitizers.yml, insert:

permissions:
  contents: read

between line 1 (name: Memory and Thread Sanitizers) and line 3 (on:). No other file changes are required.

Suggested changeset 1
.github/workflows/sanitizers.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/sanitizers.yml b/.github/workflows/sanitizers.yml
--- a/.github/workflows/sanitizers.yml
+++ b/.github/workflows/sanitizers.yml
@@ -1,4 +1,6 @@
 name: Memory and Thread Sanitizers
+permissions:
+  contents: read
 
 on:
   push:
EOF
@@ -1,4 +1,6 @@
name: Memory and Thread Sanitizers
permissions:
contents: read

on:
push:
Copilot is powered by AI and may make mistakes. Always verify output.
Comment on lines +73 to +111
runs-on: ubuntu-latest

steps:
- name: Checkout repository and submodules
uses: actions/checkout@v4
with:
submodules: recursive

- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential cmake valgrind

- name: Configure CMake
run: |
mkdir -p build
cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Debug \
-DCSV_CXX_STANDARD=17 \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_C_COMPILER=gcc \
-DCMAKE_CXX_FLAGS="-g -O1" \
-DCMAKE_C_FLAGS="-g -O1"

- name: Build
run: cmake --build build --config Debug

- name: Test with Valgrind
working-directory: build
run: ctest --output-on-failure
timeout-minutes: 60

- name: Upload Valgrind results
if: failure()
uses: actions/upload-artifact@v4
with:
name: valgrind-results
path: build/Testing/

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 2 months ago

In general, to fix this issue you should explicitly declare permissions for the GITHUB_TOKEN either at the workflow root (applying to all jobs) or for each job individually, granting only the minimal scopes needed. For a build-and-test workflow that only checks out the repo and uploads artifacts, contents: read is typically sufficient.

The best, least-invasive fix here is to add a single root-level permissions block right under the name: (or under on:) in .github/workflows/sanitizers.yml. This will apply to both sanitizers and valgrind jobs and ensures that the GITHUB_TOKEN has read-only access to repository contents and no write scopes. No existing steps need modification, and no new dependencies or imports are required, because GitHub Actions interprets the permissions key natively.

Concretely:

  • Edit .github/workflows/sanitizers.yml.
  • Insert:
permissions:
  contents: read

after the name: Memory and Thread Sanitizers line (or equivalently between on: and jobs:), ensuring indentation matches YAML rules (root-level, no extra spaces). This documents and enforces least-privilege access for the workflow.

Suggested changeset 1
.github/workflows/sanitizers.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/sanitizers.yml b/.github/workflows/sanitizers.yml
--- a/.github/workflows/sanitizers.yml
+++ b/.github/workflows/sanitizers.yml
@@ -1,5 +1,8 @@
 name: Memory and Thread Sanitizers
 
+permissions:
+  contents: read
+
 on:
   push:
     branches: [ "master", "feb-15-2026" ]
EOF
@@ -1,5 +1,8 @@
name: Memory and Thread Sanitizers

permissions:
contents: read

on:
push:
branches: [ "master", "feb-15-2026" ]
Copilot is powered by AI and may make mistakes. Always verify output.
Comment thread tests/test_csv_iterator.cpp Dismissed
@vincentlaucsb vincentlaucsb merged commit eb35215 into master Feb 16, 2026
25 checks passed
@vincentlaucsb vincentlaucsb deleted the feb-15-2026 branch February 16, 2026 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Debug builds on linux always link against gcov

2 participants