Skip to content

Conversation

@rschu1ze
Copy link
Member

No description provided.

yijiem and others added 30 commits August 21, 2023 16:20
… on Windows (#34107)

Local Bazel invocation succeeds:

```
C:\Users\yijiem\projects\grpc>bazel --output_base=C:\bazel2 test --dynamic_mode=off --verbose_failures //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1 up-to-date:
  bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker@poller=epoll1.exe
INFO: Elapsed time: 199.262s, Critical Path: 193.48s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions
//test/cpp/naming:resolver_component_tests_runner_invoker@poller=epoll1  PASSED in 193.4s

Executed 1 out of 1 test: 1 test passes.
```

The local invocation of RBE failed with linker error `LINK : error
LNK2001: unresolved external symbol mainCRTStartup`, but that does not
limited to this target:
https://gist.github.com/yijiem/2c6cbd9a31209a6de8fd711afbf2b479.

<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
This is to make sure upgrading packaging module won't break our logic on
version-based version skipping.
This also fixes a small issue with `dev-` prefix - it should only be
allowed on the left side of the comparison.

Context: packaging module needs to be upgraded to be compatible with
`blackd`.
…o pull in __STDC_FORMAT_MACROS define) (#32159)

Some versions of MacOS (as well as some `glibc`-based platforms) require
`__STDC_FORMAT_MACROS` to be defined prior to including `<cinttypes>` or
`<inttypes.h>`, otherwise build fails with undeclared `PRIdMAX`,
`PRIdPTR` etc.

See note here: https://en.cppreference.com/w/cpp/types/integer

label: (release notes: no)
Add example for TLS.

<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
…… (#34129)

…th Bazel on Windows (#34107)"

This reverts commit d540b4c.
…34127)

The tests are skipped incorrectly because `config.server_lang` is
incorrectly compared with the string value "java", instead of
`skips.Lang.JAVA`.

This has been broken since #26998. 

```
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInRouteRule -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[  SKIPPED ] setUpClass (timeout_test.TestTimeoutInRouteRule)
xds_url_map_testcase.py:372] ----- Testing TestTimeoutInApplication -----
xds_url_map_testcase.py:373] Logs timezone: UTC
skips.py:121] Skipping TestConfig(client_lang='java', server_lang='java', version='v1.57.x')
[  SKIPPED ] setUpClass (timeout_test.TestTimeoutInApplication)
```
- Add Github Action to conditionally run PSM Interop unit tests:
- Only run when changes are detected in
`tools/run_tests/xds_k8s_test_driver` or any of the proto files used by
the driver
  - Only run against PRs and pushes to `master`, `v1.*.*` branches
  - Runs using `python3.9` and `python3.10`
  - Ready to be added to the list of required GitHub checks
- Add `tools/run_tests/xds_k8s_test_driver/tests/unit/__main__.py` test
loader that recursively discovers all unit tests in
`tools/run_tests/xds_k8s_test_driver/tests/unit`
- Add basic coverage for `XdsTestClient` and `XdsTestServer` to verify
the test loader picks up all folders

Related:
- First unit tests without automated CI added in #34097
We enabled OpenSSL3 testing with #31256 and missed a failing test

It wasn't running before, so this isn't a regression - disabling it so
master doesn't fail while we figure out how to fix it.
This WorkStealingThreadPool change improves the (lagging)
`cpp_protobuf_async_streaming_qps_unconstrained_secure` 32-core
benchmark.

Baseline OriginalThreadPool QPS: 830k
Previous average WorkStealingThreadPool QPS: 755k
New WorkStealingThreadPool average (2 runs) QPS: 850k
…(#34130)

Fixes an issue when an active context selected automatically picked up
as context for `secondary_k8s_api_manager`.

This was introducing an error in GAMMA Baseline PoC

```
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('100.71.2.143', 56723), raddr=('35.199.174.232', 443)>
```

Here's how the secondary context is incorrectly falls back to the
default context when `--secondary_kube_context` is not set:
```
k8s.py:142] Using kubernetes context "gke_grpc-testing_us-central1-a_psm-interop-security", active host: https://35.202.85.90
k8s.py:142] Using kubernetes context "None", active host: https://35.202.85.90
```
Change was created by the release automation script. See
go/grpc-release.
The `work_stealing` experiment on its own is not very valuable, so let's
delete it and save CI resources. We have a benchmark for
`GRPC_EXPERIMENTS=event_engine_listener,work_stealing`, which is really
what we care about right now.
Previously black wouldn't install, as it required newer `packaging`
package.

This fixes `pip install -r requirements-dev.txt`. In addition, `black`
in dev dependencies file is changed to `black[d]`, which bundles
`blackd` binary (["black as a
server"](https://black.readthedocs.io/en/stable/usage_and_configuration/black_as_a_server.html)).
This helps developers run benchmark loadtests locally. See comments in
scenario_runner.py for usage.

---------

Co-authored-by: drfloob <[email protected]>
…s (#34150)

Reminder: changes to this file are picked up immediately by all
tests/all branches/all langs.
This reverts commit fe1ba18.

Reason: break import



<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
1. Revert parts of 440eef2 that
reverted 16b67ae
2. Fix race conditions in the test case that caused TSAN failures.
… on RBE (#34122)

This makes the resolver component tests suite run on Window RBE by
adding a flag in the test driver to further differentiate between Bazel
local run and Bazel RBE run on Windows since they have different
RUNFILES behavior.

Local Bazel run succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --output_base=C:\bazel2 test --dynamic_mode=off --verbose_failures --test_arg=--running_locally=true //test/cpp/naming:resolver_component_tests_runner_invoker
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker up-to-date:
  bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker.exe
INFO: Elapsed time: 196.080s, Critical Path: 193.21s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions
//test/cpp/naming:resolver_component_tests_runner_invoker                PASSED in 193.1s

Executed 1 out of 1 test: 1 test passes.
```

RBE run succeeds:
```
C:\Users\yijiem\projects\grpc>bazel --bazelrc=tools/remote_build/windows.bazelrc test --config=windows_opt --dynamic_mode=off --verbose_failures --host_linkopt=/NODEFAULTLIB:libcmt.lib --host_linkopt=/DEFAULTLIB:msvcrt.lib --nocache_test_results //test/cpp/naming:resolver_component_tests_runner_invoker
INFO: Invocation ID: d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
WARNING: --remote_upload_local_results is set, but the remote cache does not support uploading action results or the current account is not authorized to write local results to the remote cache.
INFO: Streaming build results to: https://source.cloud.google.com/results/invocations/d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
INFO: Analyzed target //test/cpp/naming:resolver_component_tests_runner_invoker (0 packages loaded, 133 targets configured).
INFO: Found 1 test target...
Target //test/cpp/naming:resolver_component_tests_runner_invoker up-to-date:
  bazel-bin/test/cpp/naming/resolver_component_tests_runner_invoker.exe
INFO: Elapsed time: 41.627s, Critical Path: 39.42s
INFO: 2 processes: 1 internal, 1 remote.
//test/cpp/naming:resolver_component_tests_runner_invoker                PASSED in 33.0s

Executed 1 out of 1 test: 1 test passes.
INFO: Streaming build results to: https://source.cloud.google.com/results/invocations/d467f2e3-7da6-4bb5-8b9b-84f1181ebc60
INFO: Build completed successfully, 2 total actions
```


<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
Relands  #34117, using a different json parsing mechanism.
…#34148)

* Before change:
* Test hangs
[here](https://github.com/grpc/grpc/blob/master/src/python/grpcio/grpc/_cython/_cygrpc/aio/call.pyx.pxi#L250)(Called
from
[here](https://github.com/grpc/grpc/blob/3b23fe62cac979a4120bc5f8909414c45805230e/src/python/grpcio/grpc/aio/_call.py#L261))
waiting for status.

* After change, we'll manually set status. We're also printing traceback
like the following:
```
Traceback (most recent call last):
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 492, in _writ
e
    await self._cython_call.send_serialized_message(serialized_request)
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/call.pyx.pxi", line 379, in send_serialized_message
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 163, in _send_message
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 160, in _cython.cygrpc._send_message
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 106, in execute_batch
_cython.cygrpc.ExecuteBatchError: Failed grpc_call_start_batch: 11 with grpc_call_error value: 'GRPC_CALL_ERROR_INVALID_MESSAGE'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", l
ine 31, in wrapper
    return loop.run_until_complete(f(*args, **kwargs))
  File "/usr/local/google/home/xuanwn/.pyenv/versions/3.10.9/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/metadata_test.py"
, line 310, in test_stream_unary
    await call.write(_REQUEST)
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 517, in write
    await self._write(request)
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 495, in _writ
e
    await self._raise_for_status()
  File "/usr/local/google/home/xuanwn/.cache/bazel/_bazel_xuanwn/da3828576aa39e99a5c826cc2e2e22fb/sandbox/linux-sandbox/1576/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/unit/metadata_test.runfiles/com_github_grpc_grpc/src/python/grpcio/grpc/aio/_call.py", line 263, in _rais
e_for_status
    raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Internal error from Core"
        debug_error_string = "Failed grpc_call_start_batch: 11 with grpc_call_error value: 'GRPC_CALL_ERROR_INVALID_MESSAGE'"
>

```

<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
This turns on the `work_stealing` experiment everywhere, enabling the
work-stealing thread pool in Posix & Windows EventEngine
implementations. This only really has an effect when EventEngine is used
for I/O (currently flag-guarded).
Adds initial support for K8s
[GAMMA](https://gateway-api.sigs.k8s.io/concepts/gamma/) (Gateway API
for Service Mesh) initiative.

- Add framework support for loading CRD-based APIs using k8s python
dynamic client
- Add basic mesh baseline test (aka ping-pong) using GAMMA setup
- Implement initial framework changes needed to run PSM tests on
GAMMA-enabled cluster using
[TDMesh](https://cloud.google.com/traffic-director/docs/gke-gateway-overview#gateway-api)
and GRPCRoute.

Based on grpc/grpc#33504.
gnossen and others added 17 commits September 26, 2023 20:29
Backports grpc/grpc#34477 to v1.59.x

This is test-only, so going forward with 1.59.0-pre despite the one
failing distribtest image.

Co-authored-by: Eugene Ostroukhov <[email protected]>
Change was created by the release automation script. See go/grpc-release
…1.59.x backport) (#34725)

Backport of #34712 to v1.59.x.
---
Fix: grpc/grpc#34672
With some recent changes in core, now `grpc_ssl_credentials_create` is
guarded by `gpr_once_init`. In our current implementation, The thread
got `gpr_once_init` lock might require GIL lock during the execution of
`grpc_ssl_credentials_create`, which might cause a deadlock if another
thread is holding GIL lock and waiting for `gpr_once_init` lock.

This change adds `with nogil` to calls to native function
`grpc_ssl_credentials_create` to make sure GIL is released before
calling `grpc_ssl_credentials_create`.
<!--

If you know who should review your pull request, please assign it to
that
person, otherwise the pull request would get assigned randomly.

If your pull request is for a specific language, please add the
appropriate
lang label.

-->
Co-authored-by: ctiller <[email protected]>
Co-authored-by: Mark D. Roth <[email protected]>
These two PRs bump the Python Windows distribtest timeout, which has
been causing several artifact builds and distribtests to fail.

---------

Co-authored-by: Xuan Wang <[email protected]>
@CLAassistant
Copy link

CLAassistant commented Nov 21, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 16 committers have signed the CLA.

✅ rschu1ze
❌ apolcyn
❌ HannahShiSFB
❌ markdroth
❌ eugeneo
❌ yashykt
❌ XuanWang-Amos
❌ sergiitk
❌ jtattermusch
❌ veblush
❌ gnossen
❌ ginayeh
❌ dconeybe
❌ ananda1066
❌ drfloob
❌ ctiller
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.