(from discussion with @lmolkova in #2469 (comment))
In the HTTP protocol, the peer you connect to can be unclear if an HTTP-level proxy is used. In that case, an application opens a TCP connection to the configured proxy host and port and sends it an HTTP request with a full URL as the target (GET http://hostname.example.com/some?thing=1 HTTP/1.1).
In this case, it is not clear whether net.peer.name should be the hostname from the requested URL (hostname.example.com) or of the HTTP proxy the application sends the HTTP request to.
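To make the ambiguity concrete, here is a minimal Python sketch that derives both candidate peers for a proxied request. The function name `peer_candidates` and the proxy address `proxy.internal.example:3128` are made up for illustration; only the request URL comes from the example above.

```python
from urllib.parse import urlsplit

# Default ports, used when the request URL does not carry an explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def peer_candidates(request_target: str, proxy_host: str, proxy_port: int) -> dict:
    """Return both peers that net.peer.name could plausibly describe.

    'logical' is the host named in the absolute-form request target,
    'socket' is the proxy the TCP connection is actually opened to.
    """
    url = urlsplit(request_target)
    logical_port = url.port or DEFAULT_PORTS.get(url.scheme)
    return {
        "logical": (url.hostname, logical_port),
        "socket": (proxy_host, proxy_port),
    }


# Hypothetical proxy address, assumed for this example only.
candidates = peer_candidates(
    "http://hostname.example.com/some?thing=1",
    proxy_host="proxy.internal.example",
    proxy_port=3128,
)
print(candidates["logical"])
print(candidates["socket"])
```

Both tuples are legitimate answers to "which peer is this span talking to?", which is exactly why a single net.peer.name attribute is ambiguous here.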
One thing that seems clear is that net.peer.name and net.peer.ip should be consistent with each other. If net.peer.name is set to a logical peer distinct from the socket-level peer, net.peer.ip would have to be left unset, and there would currently be no way to store network-level connection information.
There is also an issue with gRPC, which (usually?) goes over HTTP/2: the RPC conventions require at least one of net.peer.ip or net.peer.name. Clearly, in the absence of any other attribute to hold it, the destination hostname would be more useful than the hostname of an HTTP proxy.
An open question to me is whether this problem is specific to HTTP (and HTTP-based protocols) or whether it is also common in other protocols.
- Option 1: Leave net.peer.* ambiguous and deprecate it. Depending on the semantic convention used, it might be the logical peer, the socket-level peer or (usually) both. Introduce separate net.sock.* and net.app.* namespaces to have unambiguous options.
- Option 2: Define net.peer.* to be either one or the other, introduce only the other namespace.
- Option 3: Try to stay as close to the status quo as possible: Define net.peer.* to be the network-level peer. Protocols that have a logical peer canonically identified by a DNS hostname that might differ from the network-level peer should have a special attribute that defines the role of this peer (http.host, which already exists, rpc.endpoint, db.server_name, ...).
- Option 4: ("Do nothing") Leave net.peer.* ambiguous and accept that only one of network-level or logical peer can be set on a span in most semantic conventions (e.g. HTTP server conventions have both net.peer.ip and http.client_ip, while RPC client conventions only have net.peer.ip/name).
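As a rough illustration of how Options 1 and 3 would differ for the proxied request above, the following Python sketch builds the attribute set each would put on a client span. The concrete keys under net.sock.* and net.app.* (`net.sock.peer.name`, `net.sock.peer.ip`, `net.app.peer.name`) are assumptions, since only the namespaces are proposed here; the helper `span_attributes`, the proxy hostname and the IP address are likewise made up.

```python
def span_attributes(option: int, logical_host: str, proxy_host: str, proxy_ip: str) -> dict:
    """Sketch of client span attributes for a request sent through an HTTP proxy."""
    if option == 1:
        # Option 1: net.peer.* is deprecated; two unambiguous namespaces replace it.
        # The key names below are hypothetical, only the namespaces are proposed.
        return {
            "net.app.peer.name": logical_host,
            "net.sock.peer.name": proxy_host,
            "net.sock.peer.ip": proxy_ip,
        }
    if option == 3:
        # Option 3: net.peer.* always means the network-level peer; the logical
        # peer goes into a protocol-specific attribute (http.host for HTTP).
        return {
            "net.peer.name": proxy_host,
            "net.peer.ip": proxy_ip,
            "http.host": logical_host,
        }
    raise ValueError("only Options 1 and 3 are sketched here")


# Hypothetical proxy name and documentation-range IP address.
for option in (1, 3):
    print(option, span_attributes(option, "hostname.example.com",
                                  "proxy.internal.example", "192.0.2.10"))
```

In both variants the name and IP within one namespace describe the same peer, so the consistency concern raised above is addressed; the difference is whether the logical peer lives in a generic namespace or in a per-protocol attribute.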
Option 1 would be the easiest. Option 3 would be the most complex, including for processing on the backend, but would be future-proof in case some protocol defines multiple peers with different roles that you might connect to within the same span. That seems unlikely, so I don't think Option 3 is very useful.
Option 2 is similar to Option 1 and would be preferable if we find that net.peer.* is overwhelmingly used for either network-level or logical peer information, instead of being used for both in different instrumentations/languages. Choosing Option 2 would need a survey of some sort to find out how feasible it is and which of the two meanings to pick.