Version 2.4.1: CSV Parser Bulletproofing by vincentlaucsb · Pull Request #287 · vincentlaucsb/csv-parser

vincentlaucsb · 2026-02-16T07:01:21Z

Fixes Debug builds on linux always link against gcov #279 by disabling gcov by default
Fixes numerous address and thread safety issues with lock-free design
- Replaced std::deque with fixed-size atomic block pointer table in CSVFieldList to prevent heap-use-after-free from internal map reallocation
- CSVRow::iterator now captures shared_ptr to data, preventing use-after-free when row is reassigned
- ThreadSafeDeque hot paths (empty(), is_waitable()) now lock-free with atomic flags
  - Removed unnecessary locks from notify_all()/kill_all()
- Added iterator semantics documentation warning against forward-iterator algorithms
Made CSVField hex parsing templated
- Does not break existing int usages
- Now defaults to long long for increased overflow safety
Added ASan/TSan to GitHub CI/CD

Thread safety issue wasn't in library, but rather a test was was not constructed properly.

Allows for using long long for better overflow safety without breaking int usages

Fixed with low cost copies of simple data structures to avoid possible concurrency related issues

std::deque actually does have invalidation issues (of iterators) when expanding just like std::vector

Use more lockless sync. Fix some thread safety issues.

Concurrent R/W to std::map is UB

github-advanced-security · 2026-02-16T07:01:23Z

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

+    runs-on: ubuntu-latest
+
+    strategy:
+      fail-fast: false
+      matrix:
+        sanitizer: [address, thread, undefined]
+        cxx_standard: [17, 20]
+        include:
+          - sanitizer: address
+            flag: "-fsanitize=address -fno-omit-frame-pointer"
+            name: "AddressSanitizer"
+          - sanitizer: thread
+            flag: "-fsanitize=thread"
+            name: "ThreadSanitizer"
+          - sanitizer: undefined
+            flag: "-fsanitize=undefined -fno-omit-frame-pointer"
+            name: "UndefinedBehaviorSanitizer"
+
+    env:
+      ASAN_OPTIONS: "detect_leaks=1:halt_on_error=0"
+      TSAN_OPTIONS: "halt_on_error=1"
+      UBSAN_OPTIONS: "halt_on_error=1"
+
+    steps:
+    - name: Checkout repository and submodules
+      uses: actions/checkout@v4
+      with:
+        submodules: recursive
+
+    - name: Install dependencies
+      run: |
+        sudo apt-get update
+        sudo apt-get install -y build-essential cmake
+
+    - name: Configure CMake with ${{ matrix.name }}
+      run: |
+        mkdir -p build
+        cd build
+        cmake .. \
+          -DCMAKE_BUILD_TYPE=Debug \
+          -DCSV_CXX_STANDARD=${{ matrix.cxx_standard }} \
+          -DCMAKE_CXX_COMPILER=g++ \
+          -DCMAKE_C_COMPILER=gcc \
+          -DCMAKE_CXX_FLAGS="${{ matrix.flag }} -g" \
+          -DCMAKE_C_FLAGS="${{ matrix.flag }} -g"
+
+    - name: Build with ${{ matrix.name }}
+      run: cmake --build build --config Debug
+
+    - name: Test with ${{ matrix.name }}
+      working-directory: build
+      run: ctest --output-on-failure -V
+      timeout-minutes: 20
+
+    - name: Upload sanitizer logs
+      if: failure()
+      uses: actions/upload-artifact@v4
+      with:
+        name: sanitizer-logs-${{ matrix.sanitizer }}-std${{ matrix.cxx_standard }}
+        path: build/Testing/
+
+  valgrind:


In general, the fix is to explicitly set a restrictive permissions block in the workflow, ideally at the top level so it applies to all jobs, and only broaden it if/when a specific job needs more. For this workflow, the jobs only need to read repository contents and interact with artifacts; they do not need to write to code, PRs, or issues. A suitable minimal configuration is permissions: contents: read, which still allows actions/checkout to function, and does not interfere with actions/upload-artifact (which does not use GITHUB_TOKEN).

The single best fix without changing functionality is to add a top-level permissions section right under the name: (and before on:) that sets contents: read. This will apply to both sanitizers and valgrind jobs, satisfying CodeQL and enforcing least privilege. No other code, steps, or jobs need modifications, and no imports or external libraries are involved since this is a YAML configuration change only.

Specifically, in .github/workflows/sanitizers.yml, insert:

permissions: contents: read

between line 1 (name: Memory and Thread Sanitizers) and line 3 (on:). No other file changes are required.

+    runs-on: ubuntu-latest
+
+    steps:
+    - name: Checkout repository and submodules
+      uses: actions/checkout@v4
+      with:
+        submodules: recursive
+
+    - name: Install dependencies
+      run: |
+        sudo apt-get update
+        sudo apt-get install -y build-essential cmake valgrind
+
+    - name: Configure CMake
+      run: |
+        mkdir -p build
+        cd build
+        cmake .. \
+          -DCMAKE_BUILD_TYPE=Debug \
+          -DCSV_CXX_STANDARD=17 \
+          -DCMAKE_CXX_COMPILER=g++ \
+          -DCMAKE_C_COMPILER=gcc \
+          -DCMAKE_CXX_FLAGS="-g -O1" \
+          -DCMAKE_C_FLAGS="-g -O1"
+
+    - name: Build
+      run: cmake --build build --config Debug
+
+    - name: Test with Valgrind
+      working-directory: build
+      run: ctest --output-on-failure
+      timeout-minutes: 60
+
+    - name: Upload Valgrind results
+      if: failure()
+      uses: actions/upload-artifact@v4
+      with:
+        name: valgrind-results
+        path: build/Testing/


In general, to fix this issue you should explicitly declare permissions for the GITHUB_TOKEN either at the workflow root (applying to all jobs) or for each job individually, granting only the minimal scopes needed. For a build-and-test workflow that only checks out the repo and uploads artifacts, contents: read is typically sufficient.

The best, least-invasive fix here is to add a single root-level permissions block right under the name: (or under on:) in .github/workflows/sanitizers.yml. This will apply to both sanitizers and valgrind jobs and ensures that the GITHUB_TOKEN has read-only access to repository contents and no write scopes. No existing steps need modification, and no new dependencies or imports are required, because GitHub Actions interprets the permissions key natively.

Concretely:

Edit .github/workflows/sanitizers.yml.

Insert:

permissions: contents: read

after the name: Memory and Thread Sanitizers line (or equivalently between on: and jobs:), ensuring indentation matches YAML rules (root-level, no extra spaces). This documents and enforces least-privilege access for the workflow.

vincentlaucsb added 17 commits February 15, 2026 20:04

Set up memory sanitizers

4aa9ae9

Update sanitizers.yml

95797eb

Fix poorly organized iterator test suite

718eadc

Fix thread safety issue in test

27282ed

Thread safety issue wasn't in library, but rather a test was was not constructed properly.

Fix potential heap use after free in CSVRow::iterator

1507866

Fix CSV iterator test to not use std::max_element

6a31bdc

Update sanitizers.yml

3f02253

Made CSVField::try_parse_hex a template

6604cac

Allows for using long long for better overflow safety without breaking int usages

Disable gcov by default

a78686d

Fix more thread safety issues

bcd9432

Fixed with low cost copies of simple data structures to avoid possible concurrency related issues

Update thread_safe_deque.hpp

384e753

Replace std::deque with std::map in internal data structure

a412c83

std::deque actually does have invalidation issues (of iterators) when expanding just like std::vector

Fix issue with std::map implementation

ba4c42e

Update thread_safe_deque.hpp

483911d

Use more lockless sync. Fix some thread safety issues.

Update thread_safe_deque.hpp

c82889d

Use std::vector of unique_ptrs to RawCSVFields

23f8999

Concurrent R/W to std::map is UB

Remove some unnecessary locks in ThreadSafeDeque

3596f1a

github-advanced-security AI found potential problems Feb 16, 2026

View reviewed changes

vincentlaucsb merged commit eb35215 into master Feb 16, 2026
25 checks passed

vincentlaucsb deleted the feb-15-2026 branch February 16, 2026 07:18

vincentlaucsb mentioned this pull request Feb 16, 2026

Linker error in Debug mode, apprently due to "--coverage" #184

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Version 2.4.1: CSV Parser Bulletproofing#287

Version 2.4.1: CSV Parser Bulletproofing#287
vincentlaucsb merged 17 commits intomasterfrom
feb-15-2026

vincentlaucsb commented Feb 16, 2026 •

edited

Loading

Uh oh!

github-advanced-security AI commented Feb 16, 2026

Uh oh!

Check warning

Copilot Autofix

Check warning

Copilot Autofix

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

@@ -1,4 +1,6 @@
             name: Memory and Thread Sanitizers
+            permissions:
+              contents: read
             on:
               push:

Conversation

vincentlaucsb commented Feb 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-advanced-security AI commented Feb 16, 2026

Uh oh!

Check warning

Copilot Autofix

Check warning

Copilot Autofix

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vincentlaucsb commented Feb 16, 2026 •

edited

Loading