Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix Edge Cases and Third-Party Stream Compatibility
Summary
This PR addresses two legitimate edge cases and improvements discovered by community members:
Issue #218: Infinite Loop on Rows Exceeding Chunk Size
Thanks to @sjoubert for the detailed report and working patch suggestion.
Problem: When a CSV row exceeds
ITERATION_CHUNK_SIZE(10MB), the parser would enter an infinite loop spawning threads indefinitely, eventually crashing with resource exhaustion.Solution:
set_chunk_size(size_t)API to allow users to customize chunk sizeread_row()test_edge_cases_large_rows.cppCredit: The core detection logic is based on @sjoubert's patch from the issue.
Issue #259: StreamParser Compilation Error with Non-Copyable Streams
Thanks to @addy90 for the clear problem description and working solution with Godbolt example.
Problem:
StreamParserattempted tostd::move()a stream reference into a value member, causing compilation errors with third-party stream libraries that have deleted copy constructors (e.g., Boost streams, custom streams).Solution:
TStream _sourcetoTStream& _source(reference member)std::move()calls in constructorstest_stream_sources.cppwith mock non-copyable stream to prevent regressionCredit: Implementation follows exactly the approach suggested by @addy90.
Testing
✅ All existing tests pass
✅ New test:
test_edge_cases_large_rows.cpp- validates chunk size customization and infinite loop detection✅ New test:
test_stream_sources.cpp- validates third-party stream compatibility✅ Full CI/CD pipeline (CMake, Sanitizers, CodeQL) runs on PR
Changes
New Public API
void CSVReader::set_chunk_size(size_t size)- Configure chunk size (min 10MB, default 10MB)Modified Files
include/internal/csv_reader.hpp- Addedset_chunk_size(),_chunk_size,_read_requestedmembersinclude/internal/csv_reader.cpp- Implemented infinite loop detection inread_row()include/internal/basic_csv_parser.hpp- Fixed StreamParser to use reference memberNew Test Files
tests/test_edge_cases_large_rows.cpp- Comprehensive edge case coverage for large rowstests/test_stream_sources.cpp- Third-party stream compatibility testsContributors
Special thanks to:
Notes