Skip to content

network: various fixes for OS X socket behavior#1381

Merged
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
turbinelabs:osx-7-socket-fixes
Aug 10, 2017
Merged

network: various fixes for OS X socket behavior#1381
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
turbinelabs:osx-7-socket-fixes

Conversation

@zuercher
Copy link
Copy Markdown
Member

@zuercher zuercher commented Aug 2, 2017

Some network socket changes for OS X support:

  • getpeername returns a larger size in its addrlen parameter for an anonymous peer than on Linux.
  • OS X doesn't have a mechanism to create a socket in non-blocking mode. It require a call to fcntl to set O_NONBLOCK. While making this change I made all the socket creation paths RELEASE_ASSERT on the fd being valid.
  • OS X returns an error on setsockopt if it's invoked on an already-closed socket.
  • Fixes a race between artificially generated write events and the write event that signals connection completed in ConnectionImpl.
  • OS X doesn't support getsockopt(,,SOL_ORIGINAL_DST,) -- use getsockname instead and work around the fact that the latter works for IPv6 where the Linux getsockopt call does not. Added direct testing of Network::Utility::getOriginalDst.

(Split out from #1348, in support of #128.)

return;
}
}
#endif
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora you mentioned kq_errno but I'm not familiar with that and google only seems to be showing me references to nginx source code. It looks like there the event flags from libevent are preserved. Does that enable it to detect this condition?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops, sorry! kq_errno is indeed just NGINX's field name for that.

What you want is to check flags (if it's EV_EOF) and fflags (for errno) fields from struct kevent, see kqueue(2).

INSTANTIATE_TEST_CASE_P(IpVersions, NetworkUtilityGetOriginalDst,
testing::ValuesIn(TestEnvironment::getIpVersionsForTest()));

TEST_P(NetworkUtilityGetOriginalDst, GetOriginalDst) {
Copy link
Copy Markdown
Member Author

@zuercher zuercher Aug 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is a bit disingenuous since the connection isn't really being forwarded from another port.

However, I don't think it's even possible to do that kind of port redirection in OS X, so perhaps I should change getOriginalDst to always return nullptr in OS X and instead consider disabling the use_original_dst feature instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to disable this feature altogether on OS X for now, and bring it back via separate PR at some point in the future, if it's possible at all (I think it should be possible to redirect traffic via either divert(2) or pf.conf on OS X, but I didn't verify it).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I would just revert all of this IMO and only do the SOL_IP call on linux, and on apple just set error to ENOTSUPPORTED (or whatever) so that the subsequence if check fails with a TODO.

UNREFERENCED_PARAMETER(rc);

#ifdef __APPLE__
// When a connect() is performed on a non-blocking socket, it'll return -1/EINPROGRESS
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole thing is absolutely nutty. Is there some documentation or stack overflow or something we can link to?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR LGTM with this exception as well. Just some better citations would be sufficient if there is no way to improve on this.

signal();
thread->join();

#if !defined(__APPLE__)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what this test is doing. Just making sure the calls actually work? And we don't support it on IPv6 basically? Per above I would just disable all of this on OSX and just leave the code as is before. If we keep this test I have some other comments but I can hold off for now until we decide.

RELEASE_ASSERT(fd != -1);

#ifdef __APPLE__
// Cannot set SOCK_NONBLOCK as a ::socket flag
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make Linux do what is needed for OS X here, and reduce the special casing?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are perf implications for this (multiple system calls vs. 1), so in this case I would be in favor of the ifdef. If there is a way we can make it less visually annoying that's fine.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched flagsFromSocketType to be socketFromSocketType and do all the flag setting, fcntl, and socket creation there. I think this is better, since it's all contained in one block.


#ifdef __APPLE__
// Cannot set SOCK_NONBLOCK as a ::socket flag
RELEASE_ASSERT(fcntl(fd, F_SETFL, O_NONBLOCK) != -1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overwrites existing flags... It's probably safe, since we've just created the socket, but for my sanity, could you verify that fcntl(fd, F_GETFL) returns 0 at this point? Just one-off experiment, not in the PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I'm confused. Apparently, it doesn't overwrite flags on OS X.

fcntl(fd, F_GETFL) on a new socket returns 0x0002 (O_RDWR), but it returns 0x0006 (O_RDWR|O_NONBLOCK) after fcntl(fd, F_SETFL, O_NONBLOCK), which is not something I'd expect.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html:

F_SETFL
Set the file status flags, defined in <fcntl.h>, for the file description associated with fildes from the corresponding bits in the third argument, arg, taken as type int. Bits corresponding to the file access mode and the file creation flags, as defined in <fcntl.h>, that are set in arg shall be ignored. If any bits in arg other than those mentioned here are changed by the application, the result is unspecified. If fildes does not support non-blocking operations, it is unspecified whether the O_NONBLOCK flag will be ignored.

which explains why O_RDWR isn't cleared.

mattklein123
mattklein123 previously approved these changes Aug 4, 2017
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM beyond my ask for some additional comments on how you knew about the crazy SO_ERROR stuff on OS X. @htuch @PiotrSikora ?

@PiotrSikora
Copy link
Copy Markdown
Contributor

@zuercher ping on the flags & EV_EOF instead of getpeername() hack.

@mattklein123
Copy link
Copy Markdown
Member

Ah maybe @PiotrSikora knows how to get rid of the crazy hack. Nice.

@zuercher
Copy link
Copy Markdown
Member Author

zuercher commented Aug 5, 2017

Sorry, I was deep in bazel internals all day trying to figure out a better solution for my other open PR.

I plan to revisit this and get rid of that ifdef if possible. I'm off Monday, so not much will happen till Tuesday.

Empirically, that block does get triggered but when it does the read ends up returning EAGAIN. Then the next call to onWriteReady gets no errors. Taking the block out crashes some integration tests. It seems like the socket isn't fully connected somehow, but that seems crazy as well. I'm hopeful that I can inspect the event earlier and detect the EAGAIN case before we ever call onWriteReady.

@zuercher
Copy link
Copy Markdown
Member Author

zuercher commented Aug 8, 2017

So, I can't directly look at EV_EOF (as @PiotrSikora suggested) because the kqueue event is hidden inside libevent.

I believe the problem occurs when ConnectionImpl::write triggers a FileTypeReady::Write event before a connection has been formed. The eventual call to writev within libevent returns an error, but instead of errno == EAGAIN we see errno == ENOTCONN. If we treat that result the same as EAGAIN, things seem to work. This is basically the same condition as the original #ifdef, but without the extra system calls. It also seems to change the sequence of events in some of the unit tests, which I'm working through.

@zuercher
Copy link
Copy Markdown
Member Author

zuercher commented Aug 9, 2017

I eventually realized that ENOTCONN in both the original #ifdef and the one I described above was caused by this sequence in tests:

  1. Construct client-side ConnectionImpl
  2. Invoke ConnectionImpl::write. This buffers the write data and activates an artificial write event because the ConnectionImpl is not in the connecting state.
  3. Invoke ConnectionImpl::connect. The connect call sees EINPROGRESS. Connecting state is flagged.
  4. When dispatcher's run loop is triggered, the artificial write event tricks ConnectionImpl::onWriteReady into thinking the connection completed and it attempts to write to the socket, which fails with ENOTCONN. (Or in the #ifdef code, getpeername returns ENOTCONN and read returns EAGAIN.)

The original ifdef code worked by ignoring the ENOTCONN error and waiting for the real write event that occurs when the connection completes. You can get a similar effect by ignoring the ENOTCONN from write, however things become tricky in the tests, because there's no way to tell when the connection is actually completed (the ConnectionEvent::Connected is triggered spuriously by the artificial write event).

Instead, I added an Unconnected state to ConnectionImpl::InternalState and skip generating the artificial write event if either that or Connecting are set. Now the ConnectionEvent::Connected event is triggered only when the connection actually completes, and subsequent errors from write can be trusted. This then allows tests to invoke the dispatcher's run loop in blocking mode to wait for the connection to complete before attempting to write.

static const uint32_t Connecting = 0x02;
static const uint32_t CloseWithFlush = 0x04;
static const uint32_t ImmediateConnectionError = 0x08;
static const uint32_t Unconnected = 0x10;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we just start client connections in the Connecting state? Is there any downside to that? Do we need Unconnected state for anything?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should work. Just me being pedantic.

int new_value = enable;
rc = setsockopt(fd_, IPPROTO_TCP, TCP_NODELAY, &new_value, sizeof(new_value));
#ifdef __APPLE__
if (-1 == rc && errno == EINVAL) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this happen? Is this needed with the connected fixes?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. This still happens and I think it's basically the same reason.

We call ConnectionImpl::connect() and then immediately ConnectionImpl::noDelay(true), but the socket isn't connected yet, so OS X returns an error.

I added some code to read the value of TCP_NODELAY and it does get set despite the error.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way to get rid of this #ifdef is to only set TCP_NODELAY after the socket is connected. That means either changing all the call sites to invoke this from a callback or else tracking the desired setting and applying in onWriteReady.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an OS X bug to me, but I digress. Hmm. In this case I would be in favor of just putting in more of a comment on why this is done/breaks and maybe a TODO?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also file an issue on this and reference it here. I'm imagining some point in the future where we're looking at performance on OS X, and the TCP_NODELAY fail ends up being due to some reason that does not actually set the option. We will then have non-deterministic performance characteristics. This is only slightly hypothetical - we've recently been chasing down a TCP proxy overhead issue that relates somewhat to TCP_NODELAY on envoy-users.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1446.

server_connection_ = std::move(conn);
server_connection_->addConnectionCallbacks(server_callbacks_);
server_connection_->addReadFilter(read_filter_);
printf("argh!\n");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

del

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops nevermind. You did del. :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did. It snuck in via a different PR of mine. 😬

: ConnectionImpl(dispatcher, address->socket(Address::SocketType::Stream), address,
getNullLocalAddress(*address), false) {}
getNullLocalAddress(*address), false) {
markConnecting();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IMO it would be better to just pass a boolean to ConnectionImpl constructor saying whether to mark connecting or not vs. having a method.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry two more nits. LGTM. @PiotrSikora @dnoe @htuch ?

if (version == IpVersion::v6) {
domain = AF_INET6;
} else {
RELEASE_ASSERT(version == IpVersion::v4);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would make this normal ASSERT

domain = AF_INET;
}
} else {
RELEASE_ASSERT(type() == Type::Pipe);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would make this normal ASSERT

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if you can file that issue on TCP_NODELAY. Thanks.

@mattklein123 mattklein123 merged commit 2f180a6 into envoyproxy:master Aug 10, 2017
@zuercher zuercher deleted the osx-7-socket-fixes branch August 10, 2017 16:07
@danmcd
Copy link
Copy Markdown

danmcd commented Jan 30, 2020

Seeing #1446 also reference illumos, I'm curious if at least some of the fixes here may need to apply to illumos (and possibly Oracle Solaris) as well?

jpsim pushed a commit that referenced this pull request Nov 28, 2022
Signed-off-by: Jose Nino <jnino@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Signed-off-by: Jose Nino <jnino@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
mathetake pushed a commit that referenced this pull request Mar 3, 2026
**Description**

This PR adds the Anthropic logo to the LLM providers section and sorts
all providers alphabetically for better organization.

**Changes:**

- Added Anthropic SVG logo to static assets
- Updated provider list in both README.md and website component
- Implemented alphabetical sorting for the provider list on homepage
- Updated website description to clarify that providers are displayed
alphabetically

**Related Issues/PRs:**

Related PR: #1369

---------

Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants