datalake: support iceberg_default_catalog_namespace config #29113
Pull request overview
This PR adds support for customizing the Iceberg catalog namespace (database name) through a new configuration option iceberg_default_catalog_namespace. Previously, the namespace was hardcoded to "redpanda". The default value remains "redpanda" for backward compatibility, and nested namespaces are not yet supported.
Key Changes:
- Added a new configuration property iceberg_default_catalog_namespace with validation to ensure at least one element and prevent multi-level namespaces for now
- Updated C++ code to use the configurable namespace instead of the hardcoded "redpanda" value
- Modified Python test infrastructure to support the Identifier type (a tuple of strings) for namespaces and updated related functions
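The validation rule described above can be sketched in Python for illustration. This is not the C++ validator from the PR: the function name `validate_namespace` and the error strings are hypothetical, and the `Identifier` alias is assumed to match the "tuple of strings" described for the test infrastructure.

```python
from typing import Optional

# Assumption: mirrors the Identifier type described for the Python tests,
# i.e. a tuple of namespace levels such as ("redpanda",).
Identifier = tuple[str, ...]


def validate_namespace(namespace: Identifier) -> Optional[str]:
    """Return an error message if the namespace is rejected, else None.

    Hypothetical sketch of the two rules named in the overview:
    at least one element, and no nested (multi-level) namespaces yet.
    """
    if len(namespace) == 0:
        return "namespace must contain at least one element"
    if len(namespace) > 1:
        return "nested namespaces are not supported yet"
    return None
```

Under these assumptions, `("redpanda",)` and `("analytics",)` pass, while `()` and `("a", "b")` are rejected.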
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/v/config/configuration.h | Added property declaration for iceberg_default_catalog_namespace |
| src/v/config/configuration.cc | Defined the configuration property with default value ["redpanda"] and validation |
| src/v/config/validators.h | Added validator function declaration for namespace validation |
| src/v/config/validators.cc | Implemented validation logic ensuring non-empty namespace with single element only |
| src/v/config/tests/validator_tests.cc | Added unit tests for the namespace validator |
| src/v/datalake/table_id_provider.cc | Updated to use the configurable namespace instead of hardcoded "redpanda" |
| src/v/datalake/coordinator/iceberg_file_committer.cc | Added debug logging for committed files to main and DLQ tables |
| tests/rptest/tests/datalake/iceberg.py | Defined Identifier type for representing table namespaces |
| tests/rptest/tests/datalake/query_engine_base.py | Updated count_table to accept Identifier type for namespace |
| tests/rptest/tests/datalake/query_engine_factory.py | Added type hints and QueryEngineService protocol |
| tests/rptest/tests/datalake/datalake_services.py | Updated methods to use Identifier type and support custom namespaces |
| tests/rptest/tests/datalake/datalake_e2e_test.py | Added end-to-end test for custom namespace functionality |
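To illustrate how a configurable namespace feeds into a table identifier, here is a minimal Python sketch. It is not the code in table_id_provider.cc or the test helpers: `table_id` and `qualified_name` are hypothetical names, and the sketch assumes the table is named after the topic and that query engines address it as a dot-separated name.

```python
# Assumption: Identifier is a tuple of strings, as described for the tests.
Identifier = tuple[str, ...]

# Default stated in the PR: the namespace stays "redpanda" unless overridden.
DEFAULT_NAMESPACE: Identifier = ("redpanda",)


def table_id(topic: str, namespace: Identifier = DEFAULT_NAMESPACE) -> Identifier:
    """Build a full table identifier: namespace levels followed by the table.

    Hypothetical: assumes the Iceberg table is named after the topic.
    """
    return namespace + (topic,)


def qualified_name(identifier: Identifier) -> str:
    """Join identifier parts with dots (assumed query-engine convention)."""
    return ".".join(identifier)
```

With the default, `qualified_name(table_id("orders"))` gives `redpanda.orders`; passing `("analytics",)` as the namespace gives `analytics.orders` instead.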
iceberg::table_identifier table_id_provider::table_id(const model::topic& t) {
    return {
      // TODO: namespace as a topic property? Keep it in the table metadata?
q: have we foreclosed on the idea of making this per-topic configurable?
No. I removed the TODO by accident. The cluster-level setting comes as a separate PR so that it is backportable.
/backport 25.3.x
Add a new configuration option to customize the Iceberg table namespace
(database name), which was previously hardcoded to "redpanda". This
allows users to specify a custom namespace for their Iceberg tables
within the catalog.
The default value remains "redpanda" for backward compatibility.
https://redpandadata.atlassian.net/browse/CORE-13736
Backports Required
Release Notes
Features
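- Added the iceberg_default_catalog_namespace configuration option to customize the Iceberg catalog namespace (default: redpanda) at cluster level.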