CORE-15587 proto: allow large TLS certs in shadow link#29718
CORE-15587 proto: allow large TLS certs in shadow link#29718pgellert merged 2 commits intoredpanda-data:devfrom
Conversation
The lookup_field() generator was missing .share() for string fields with (redpanda.core.pbgen.iobuf) = true. Since iobuf is move-only, this caused compile errors. Fix by checking isIOBuf(f) alongside the existing BytesKind check.
d42881c to
e27e32b
Compare
There was a problem hiding this comment.
Pull request overview
This PR removes the 128KiB limitation on TLS PEM certificate fields in shadow link configurations by using iobuf-backed protobuf fields instead of regular strings. The change enables large CA bundles with multiple certificates to be used in shadow link TLS settings without deserialization failures.
Changes:
- Added
(redpanda.core.pbgen.iobuf) = trueannotation to ca, key, and cert fields in TLSPEMSettings proto - Updated pbgen code generator to handle iobuf-backed string fields with
.share()calls in lookup_field - Modified shadow_link converter to use iobuf_to_string() helper for converting iobuf PEM data to sstring
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| proto/redpanda/core/common/v1/tls.proto | Added iobuf annotations to ca, key, and cert string fields in TLSPEMSettings |
| bazel/pbgen/main.go | Extended lookup_field generation to call .share() for iobuf-backed string fields |
| src/v/serde/protobuf/tests/codegen_test.proto | Added test message IOBufStringField to verify iobuf annotation codegen |
| src/v/serde/protobuf/tests/goldens/codegen_test.proto.h | Generated header for IOBufStringField test message |
| src/v/serde/protobuf/tests/goldens/codegen_test.proto.cc | Generated implementation with correct .share() usage in lookup_field |
| src/v/serde/protobuf/tests/round_trip_test.cc | Added test verifying 150KiB iobuf-backed field roundtrips correctly |
| src/v/redpanda/admin/services/shadow_link/converter.cc | Implemented iobuf_to_string() helper and updated all TLS PEM conversions |
| src/v/redpanda/admin/services/shadow_link/BUILD | Added iobuf_parser dependency |
| src/v/redpanda/admin/services/shadow_link/tests/converter_test.cc | Updated tests to use iobuf::from() for TLS PEM test data |
Comments suppressed due to low confidence (1)
src/v/redpanda/admin/services/shadow_link/tests/converter_test.cc:471
- The converter_test lacks coverage for the primary issue this PR addresses: TLS certificates exceeding 128KiB. While the protobuf layer is tested with a 150KiB string in round_trip_test.cc, there is no end-to-end test verifying that large TLS PEM data (ca, key, cert) can be successfully converted from proto to metadata and back. Consider adding a test case that exercises the iobuf_to_string conversion with TLS data exceeding the previous 128KiB limit to ensure the complete flow works correctly.
TEST(converter_test, create_with_tls_value) {
const auto name = "test-link";
const auto ca = "ca";
const auto key = "key";
const auto cert = "cert";
proto::admin::shadow_link shadow_link;
proto::admin::create_shadow_link_request req;
proto::admin::shadow_link_configurations shadow_link_configurations;
proto::admin::shadow_link_client_options shadow_link_client_options;
proto::common::tls_settings tls_settings;
proto::common::tlspem_settings tls_pem_settings;
tls_pem_settings.set_ca(iobuf::from(ca));
tls_pem_settings.set_key(iobuf::from(key));
tls_pem_settings.set_cert(iobuf::from(cert));
tls_settings.set_tls_pem_settings(std::move(tls_pem_settings));
shadow_link_client_options.set_tls_settings(std::move(tls_settings));
shadow_link_client_options.set_bootstrap_servers({"localhost:9092"});
shadow_link_configurations.set_client_options(
std::move(shadow_link_client_options));
shadow_link.set_configurations(std::move(shadow_link_configurations));
shadow_link.set_name(ss::sstring{name});
req.set_shadow_link(std::move(shadow_link));
auto md = admin::convert_create_to_metadata(std::move(req));
ASSERT_TRUE(md.connection.ca.has_value());
ASSERT_TRUE(
std::holds_alternative<cluster_link::model::tls_value>(
md.connection.ca.value()));
EXPECT_EQ(
std::get<cluster_link::model::tls_value>(md.connection.ca.value()), ca);
ASSERT_TRUE(md.connection.key.has_value());
ASSERT_TRUE(
std::holds_alternative<cluster_link::model::tls_value>(
md.connection.key.value()));
EXPECT_EQ(
std::get<cluster_link::model::tls_value>(md.connection.key.value()), key);
ASSERT_TRUE(md.connection.cert.has_value());
ASSERT_TRUE(
std::holds_alternative<cluster_link::model::tls_value>(
md.connection.cert.value()));
EXPECT_EQ(
std::get<cluster_link::model::tls_value>(md.connection.cert.value()),
cert);
}
| /// Converts an iobuf to ss::sstring for TLS PEM data. | ||
| /// | ||
| /// Note: This creates a contiguous allocation which may exceed 128KiB for large | ||
| /// CA bundles. While large allocations are generally discouraged in Seastar, | ||
| /// this is acceptable here because the Seastar TLS layer (set_x509_key, etc.) | ||
| /// requires linearized certificate data anyway. | ||
| ss::sstring iobuf_to_string(const iobuf& buf) { | ||
| auto sz = buf.size_bytes(); | ||
| iobuf_const_parser p(buf); | ||
| return p.read_string(sz); | ||
| } |
There was a problem hiding this comment.
Instead of crashing, can we temporarily disable the crash on oversized alloc behavior in our allocator and instead catch and rethrow the bad_alloc as an rpc error?
There was a problem hiding this comment.
Great idea, I've added this now
There was a problem hiding this comment.
We should consider making this a method in iobuf proper, but otherwise this LGTM
CI test resultstest results on build#81144
|
e27e32b to
289ff8e
Compare
|
force-push: address PR feedback (throw on bad alloc) |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (4)
src/v/redpanda/admin/services/shadow_link/converter.cc:92
iobuf_to_stringcan throwserde::pb::rpc::resource_exhausted_exception, which will bypass thestd::invalid_argumentcatch-and-wrap inconvert_create_to_metadata. If callers/documentation assume onlyinvalid_argument_exceptionfrom conversion, this introduces a new exception contract. Consider updating the conversion API docs/handling to reflect/translate this error consistently.
} catch (const std::bad_alloc&) {
throw serde::pb::rpc::resource_exhausted_exception(
ssx::sformat(
"TLS certificate data too large to linearize ({} bytes)", sz));
}
src/v/redpanda/admin/services/shadow_link/tests/converter_test.cc:440
- These tests cover TLS PEM handling but still only exercise very small CA/key/cert payloads. Since the regression here is specifically about values exceeding the 128KiB string allocation limit, adding a test that constructs a CA/key/cert >128KiB (and verifies conversion succeeds) would better lock in the intended behavior for shadow link creation.
TEST(converter_test, create_with_tls_value) {
const auto name = "test-link";
const auto ca = "ca";
const auto key = "key";
const auto cert = "cert";
proto::admin::shadow_link shadow_link;
proto::admin::create_shadow_link_request req;
proto::admin::shadow_link_configurations shadow_link_configurations;
proto::admin::shadow_link_client_options shadow_link_client_options;
proto::common::tls_settings tls_settings;
proto::common::tlspem_settings tls_pem_settings;
tls_pem_settings.set_ca(iobuf::from(ca));
tls_pem_settings.set_key(iobuf::from(key));
tls_pem_settings.set_cert(iobuf::from(cert));
tls_settings.set_tls_pem_settings(std::move(tls_pem_settings));
proto/redpanda/core/common/v1/tls.proto:64
- By making
ca/certiobuf-backed, these fields can now be much larger than before. Code paths that log full request objects (e.g.,create_shadow_link: {}formatting) may end up emitting very large log lines containing full PEM material. Consider addingdebug_redact(or another truncation/redaction mechanism) for these fields, or ensuring request logging avoids printing the raw PEM payloads.
// The CA
string ca = 1 [(redpanda.core.pbgen.iobuf) = true];
// Key and Cert are optional but if one is provided, then both must be
// The key
string key = 2 [
(google.api.field_behavior) = INPUT_ONLY,
debug_redact = true,
(redpanda.core.pbgen.iobuf) = true
];
// The SHA-256 of the key, in base64 format
string key_fingerprint = 3 [(google.api.field_behavior) = OUTPUT_ONLY];
// The cert
string cert = 4 [(redpanda.core.pbgen.iobuf) = true];
src/v/redpanda/admin/services/shadow_link/converter.cc:83
iobuf_to_stringtemporarily flips Seastar's global (per-shard) abort-on-allocation-failure flag. Because other fibers on the same shard may interleave while this flag is disabled, unrelated allocations could start throwingstd::bad_allocinstead of aborting, changing failure mode outside this conversion. Consider avoiding global state changes (e.g., fail fast on an explicit size cap / preflight, or introduce a scoped helper in Seastar/ssx that restores the previous value without affecting other tasks).
// Temporarily disable crash-on-allocation-failure so we can convert
// std::bad_alloc to an RPC error instead of crashing the broker.
ss::memory::set_abort_on_allocation_failure(false);
auto guard = ss::defer([]() {
ss::memory::set_abort_on_allocation_failure(
config::shard_local_cfg().memory_abort_on_alloc_failure());
});
Shadow link TLS PEM fields (ca, key, cert) were limited to 128KiB by proto serde's string size limit. Large CA bundles with multiple certificates could exceed this, causing deserialization failures. Add (redpanda.core.pbgen.iobuf) = true annotation to these fields in TLSPEMSettings proto. Update shadow_link converter to use iobuf_to_string() for conversions. This allows arbitrarily large PEM data while maintaining sstring API compatibility.
289ff8e to
34090ee
Compare
|
force-push: fix clang tidy failure by switching to |
|
/backport v25.3.x |
Shadow link TLS PEM fields (ca, key, cert) were limited to 128KiB
by proto serde's string size limit. Large CA bundles with multiple
certificates could exceed this, causing deserialization failures.
Add (redpanda.core.pbgen.iobuf) = true annotation to these fields
in TLSPEMSettings proto. Update shadow_link converter to use
iobuf_to_string() for conversions. This allows arbitrarily large
PEM data while maintaining sstring API compatibility.
Fixes https://redpandadata.atlassian.net/browse/CORE-15587
Backports Required
Release Notes
Improvements