iceberg: support bucket names with dots in URIs#29071
iceberg: support bucket names with dots in URIs#29071nvartolomei merged 1 commit intoredpanda-data:devfrom
Conversation
Previously, the URI parser would reject bucket names containing dots by treating any host with a dot as an unsupported virtual host-style URI. This prevented using S3 buckets with dots in their names. We also change slightly the behavior to reject uris like `s3://bucket.s3.amazonaws.com/path/to/object`. I never seen them in the wild and I think it is unsafe to accept them without additional checks. Fixes https://redpandadata.atlassian.net/browse/CORE-15069
c643382 to
1ddaad0
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug in Iceberg's URI parser that incorrectly rejected S3 bucket names containing dots. Previously, the parser treated any hostname with a dot as an unsupported virtual host-style URI, preventing the use of valid bucket names like "test.bucket.name".
Key Changes:
- Simplified URI validation logic to accept bucket names with dots
- Added test coverage for bucket names containing dots
- Updated comments to clarify the rejection of virtual host-style URIs
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/v/iceberg/uri.cc | Removed dot-checking logic that prevented bucket names with dots from being parsed |
| src/v/iceberg/tests/uri_test.cc | Added test case for bucket names with dots and refactored test structure |
| src/v/iceberg/tests/BUILD | Removed unused test_schemas dependency |
can you explicitly say what is wrong with this? looks fine to me... |
dotnwat
left a comment
There was a problem hiding this comment.
should we include a ducktape test? it seems like bucket names end up getting used in many additional places inside and outside redpanda so having something that can catch regressions with issues with dot-containing-bucket-names seems good?
without a more throughout input validation we could mistake a table returned uri |
|
/backport v25.3.x |
|
/backport v25.2.x |
|
/backport v25.1.x |
Previously, the URI parser would reject bucket names containing dots by treating any host with a dot as an unsupported virtual host-style URI. This prevented using S3 buckets with dots in their names.
We also change slightly the behavior to reject uris like
s3://bucket.s3.amazonaws.com/path/to/object. I never seen them in the wild and I think it is unsafe to accept them without additional checks.Fixes https://redpandadata.atlassian.net/browse/CORE-15069
Backports Required
Release Notes
Bug Fixes
s3://bucket.s3.amazonaws.com/path/to/object.