Validate build hash in handshake#65601
Conversation
There is no guarantee of wire compatibility between nodes running different builds of the same version, but today we do not validate whether two communicating nodes are compatible or not. This results in confusing failures that look like serialization bugs, and it usually takes nontrivial effort to determine that the failure is in fact due to the user running incompatible builds. This commit adds the build hash to the transport service handshake and validates that matching versions have matching build hashes. Closes elastic#65249
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
NB I opened this PR against |
original-brownbear
left a comment
There was a problem hiding this comment.
LGTM, just random NITs that can be ignored :)
| private static final Logger logger = LogManager.getLogger(TransportService.class); | ||
|
|
||
| private static final String PERMIT_HANDSHAKES_FROM_INCOMPATIBLE_BUILDS_KEY = "es.unsafely_permit_handshake_from_incompatible_builds"; | ||
| private static final boolean PERMIT_HANDSHAKES_FROM_INCOMPATIBLE_BUILDS = getPermitHandshakesFromIncompatibleBuilds(); |
There was a problem hiding this comment.
NIT: maybe just put the logic in this method in a static { } block below these two constants to make this easier to read in one go?
There was a problem hiding this comment.
Ah yeah not sure why I didn't do that. Done in 8132bf3
| if (PERMIT_HANDSHAKES_FROM_INCOMPATIBLE_BUILDS) { | ||
| logger.warn("transport handshakes from incompatible builds are unsafely permitted on this node; remove system property [" + | ||
| PERMIT_HANDSHAKES_FROM_INCOMPATIBLE_BUILDS_KEY + "] to resolve this warning"); | ||
| DeprecationLogger.getLogger(TransportService.class).deprecate("permit_handshake_from_incompatible_builds", |
There was a problem hiding this comment.
NIT: little much to log two messages for one thing? This seems like it's kind of a dev-only option anyway, why bother with the deprecation logger?
There was a problem hiding this comment.
Ehh I worry we'll get people using this, and the two logs are for two separate aspects. Deprecation logs don't go into the main server log (by default at least) but I think this deserves a visible warning whenever it's used.
Today in `7.x` there is a deprecated system property that bypasses the check that prevents nodes of incompatible builds from communicating. This commit removes the system property in `master` so that the check is always enforced. Relates elastic#65601, elastic#65249
There is no guarantee of wire compatibility between nodes running
different builds of the same version, but today we do not validate
whether two communicating nodes are compatible or not. This results in
confusing failures that look like serialization bugs, and it usually
takes nontrivial effort to determine that the failure is in fact due to
the user running incompatible builds.
This commit adds the build hash to the transport service handshake and
validates that matching versions have matching build hashes.
Closes #65249