rdmasniff: clean up some resource freeing issues.#1633
rdmasniff: clean up some resource freeing issues.#1633guyharris wants to merge 1 commit intothe-tcpdump-group:masterfrom
Conversation
f241f7a to
9d82b14
Compare
pcap-rdmasniff.c
Outdated
| */ | ||
| port = strchr(handle->opt.device, ':'); | ||
| if (port) { | ||
| port_num = strtoul(port + 1, NULL, 10); |
There was a problem hiding this comment.
Would the new pcapint_get_decuint() be a better fit?
There was a problem hiding this comment.
Yes - I discovered the use-after-free issue while pcapint_get_decuint()ifying a bunch of modules, but decided to make this change separately from the pcapint_get_decuint()ifying, which I'll do in a subsequent change.
(Is this a change that should be backported? It fixes some significant bugs, but it's also a somewhat significant change.)
There was a problem hiding this comment.
For severe problems only, obviously correct bug fixes only, or we risk delaying 1.11.0 even more.
pcap-rdmasniff.c
Outdated
| port_num = strtoul(port + 1, NULL, 10); | ||
| if (port_num > 0) { | ||
| namelen = port - handle->opt.device; | ||
| } else { |
There was a problem hiding this comment.
How would this treat name: (with the colon, without a number)?
There was a problem hiding this comment.
The same way as it always did, so there might be a bug there.
C99 says on page 310 that strtoul() and other functions
If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of
nptris stored in the object pointed to byendptr, provided thatendptris not a null pointer.
Unfortunately, endptr is NULL here, so we can't test for that. However, C99 also says, on the same page, under "Returns" that "If no conversion could be performed, zero is returned.", so presumably it returns 0 for port_num, and it defaults to port 1.
There was a problem hiding this comment.
It would be wrong to treat an invalid input as a valid input.
There was a problem hiding this comment.
It would be wrong to treat an invalid input as a valid input.
Yes, but it's worked that way since Day One (f5f1484).
So the question is "what, if anything, would break if iwarp0: no longer meant 'port 1 of device iwarp0?"
If the answer is "nothing" or "so little that they could just adapt and explicitly say iwarp0:1 - or just use iwarp0, as port 1 is the default port - without any real disruption", we might as well reject that.
There was a problem hiding this comment.
The original pull request for RDMA sniffer support is #585.
There was a problem hiding this comment.
And I've asked some questions in that pull request.
pcap-rdmasniff.c
Outdated
| * If so, if there's more than one port to specify on it, | ||
| * we'll need to handle that. | ||
| */ | ||
| port_num = strtoul(port + 1, NULL, 10); |
There was a problem hiding this comment.
This looks similar to the earlier block in rdmasniff_activate(). If the name and the port remain the same, would it be better to parse this exactly once and to keep struct rdmasniff as it is?
There was a problem hiding this comment.
Yeah, we could save the results of the parsing and reuse them.
There was a problem hiding this comment.
Or not bother parsing the port number in the create routine, so that we accept anything of the form <known_rdma_device>:, and then parse it in the activate routine and return an error if the port number isn't valid, so if somebody does rdma17:ffffffff they get an error from pcap-rdmasniff.c pointing that out, rather than having pcap-rdmasniffer.c not accept it as an RDMA device - even if rdma17 is a valid RDMA device - so that it probably ends up getting opened as a regular networking device and probably getting an error claiming that rdma17:ffffffff isn't a known device.
That's what I've done.
9d82b14 to
afe379b
Compare
afe379b to
d576191
Compare
|
In the commit message: |
d576191 to
0cbdf0e
Compare
Fixed. |
|
@rolandd do you have any comments on this change or on any of the issues it's intended to fix? |
|
@paravmellanox do you have any comments on this change or on any of the issues it's intended to fix? |
| priv->context = ibv_open_device(priv->rdma_device); | ||
| dev_list = ibv_get_device_list(&numdev); | ||
| if (!dev_list) { | ||
| if (errno == EPERM) { |
There was a problem hiding this comment.
this is unlikely to happen as this mainly reads the /sys/class/infiniband files. but does not hurt either.
There was a problem hiding this comment.
Might as well catch unexpected errors and report them, so we can figure out what's causing them and either fixing them or, perhaps, ignoring them if they're unfixable.
rolandd
left a comment
There was a problem hiding this comment.
This all looks good to me, I'm glad to see longstanding bugs fixed and I don't have strong feelings about some of the finer points on device name parsing etc discussed here.
b7b3519 to
7a6456f
Compare
Add a routine rdmasniff_free_resoources() to free up all resources pointed to by the struct pcap_rdmasniff structure for an open device. Use it both when closing a pcap_t and when cleaning up after a failed attempt to activate a pcap_t. Free those routines in the order specified in issue the-tcpdump-group#1532. Don't save the struct ibv_device * found in pcap_create(); we free up the ibv_device array allocated by ibv_get_device_list() when we return from rdmasniff_create(), and that invalidates the entries in that array. See the ibv_get_device_list() documentation, and issue the-tcpdump-group#1300. Fixes the-tcpdump-group#1532. Fixes the-tcpdump-group#1300. Improve checking for ibv_ error returns and messages for those errors. Improve handling of the port number - don't fetch or check it in the create routine, so that an invalid port number doesn't prevent the underlying device from being recognized as a valid RDMA device, but do fetch and check it in the activate routine, so we can repor the error to the user.
7a6456f to
4309ca3
Compare
Add a routine rdmasniff_free_resoources() to free up all resources pointed to by the struct pcap_rdmasniff structure for an open device. Use it both when closing a pcap_t and when cleaning up after a failed attempt to activate a pcap_t.
Free those routines in the order specified in issue #1532.
Don't save the struct ibv_device * found in pcap_create(); we free up the ibv_device array allocated by ibv_get_device_list() when we return from rdmasniff_create(), and that invalidates the entries in that array. See the ibv_get_device_list() documentation, and issue #1300.
Fixes #1532.
Fixes #1300.
Improve checking for ibv_ error returns and messages for those errors.
Improve handling of the port number - don't fetch or check it in the create routine, so that an invalid port number doesn't prevent the underlying device from being recognized as a valid RDMA device, but do fetch and check it in the activate routine, so we can repor the error to the user.