Introduce rally-tracks compatibility testing#1564
Introduce rally-tracks compatibility testing#1564michaelbaamonde merged 12 commits intoelastic:masterfrom
Conversation
This commit modifies and extends `pytest-rally` so that it can be invoked from
within the Rally repo. This enables Rally developers and CI jobs to test
Rally changes against arbitrary revisions of local track repositories,
using arbitrary versions of Elasticsearch.
The actual contents of the tests live in the track repository. The
plugin will run any tests found in the `it` subdirectory of the provided track
repository by default, including those that are auto-generated by
`pytest-rally`.
By default, tests will be run against the `master` branch of the track
repository checked out in `$RALLY_HOME/benchmarks/tracks/default` (typically
`rally-tracks`), using a build of the `main` branch of Elasticsearch.
To run tests with these defaults, run the following from the root of this
repository:
`pytest it/track_repo_compatibility`
Here is an example invocation that overrides these defaults:
```
pytest it/track_repo_compatibility \
--track-repository=/path/to/repo \
--track-revision=some-branch \
--distribution-version=8.3.2
```
| install: install-user | ||
| # Also install development dependencies | ||
| . $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install -e .[develop] | ||
| . $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install git+https://github.com/elastic/pytest-rally.git |
There was a problem hiding this comment.
This is a stop-gap until we cut a release of pytest-rally. I cannot manage to get pip and/or hatchling to behave if I attempt to install this from source as an optional develop dependency declared in pyproject.toml, even if tool.hatch.metadata.allow-direct-references is set to true. It's a rabbit hole for another day.
There was a problem hiding this comment.
I think it's a bug in pip. In any case, I think we should upload pytest-rally on PyPI since it's not expected to change much and the resulting experience would be nicer.
There was a problem hiding this comment.
Yeah, I looked into it a bit and it does seem to be a bug in pip. Definitely agree on uploading the plugin to PyPI once it's stable.
| pytest | ||
|
|
||
| [testenv:rally-tracks-compat] | ||
| deps = pytest-rally @ git+https://github.com/elastic/pytest-rally.git |
There was a problem hiding this comment.
| default=RALLY_TRACKS_DIR, | ||
| help=("Path to a local track repository\n" f"(default: {RALLY_TRACKS_DIR})"), | ||
| ) | ||
| group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`")) |
There was a problem hiding this comment.
Adding that comma and running black will make this call more consistent with the others and thus easier to read:
| group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`")) | |
| group.addoption("--track-revision", action="store", default="master", help=("Track repository revision to test\n" "default: `master`"),) |
| if repo == RALLY_TRACKS_DIR: | ||
| try: | ||
| # this will perform the initial clone of rally-tracks | ||
| subprocess.run(shlex.split("esrally list tracks"), text=True, capture_output=True, check=True) |
There was a problem hiding this comment.
Relying on a central but undocumented feature of esrally list tracks sounds off. Have you considered using track.GitTrackRepository or even esrally.utils.git.RallyRepository directly by supplying it the few mandatory params it needs? This may require moving some default values out of esrally/rally.py though, which could make this approach too burdensome in practice.
| install: install-user | ||
| # Also install development dependencies | ||
| . $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install -e .[develop] | ||
| . $(VENV_ACTIVATE_FILE); $(PIP_WRAPPER) install git+https://github.com/elastic/pytest-rally.git |
There was a problem hiding this comment.
I think it's a bug in pip. In any case, I think we should upload pytest-rally on PyPI since it's not expected to change much and the resulting experience would be nicer.
|
|
||
| RALLY_HOME = os.getenv("RALLY_HOME", os.path.expanduser("~")) | ||
| RALLY_CONFIG_DIR = os.path.join(RALLY_HOME, ".rally") | ||
| RALLY_TRACKS_DIR = os.path.join(RALLY_CONFIG_DIR, "benchmarks", "tracks", "default") |
There was a problem hiding this comment.
Should we consider another location in order to not affect and be affected by changes and checkouts in the default tracks git repository?
There was a problem hiding this comment.
I was originally intending to have these tests mimic a clean installation of Rally. By default, if a user installs Rally, it will clone rally-tracks into ~/.rally/benchmarks/tracks/default the first time an esrally command is run that requires a track repository to be available on disk.
I think this makes sense for CI, but I can see the argument for instead explicitly cloning into a distinct, non-default location for these tests, especially locally.
This commit implements the following control flow:
- If
--track-repositoryis provided directly, we make sure that the directory actually exists on disk, raising an exception if it doesn't. - If
--track-repositoryisn't provided, we default to~/.rally/benchmarks/tracks/rally-tracks-compat, cloninghttps://github.com/elastic/rally-tracksthere if the directory doesn't yet exist.
Re: your suggestions here, shelling out to git does seem easier for cloning, so I went with that.
I considered making other things configurable (the remote repo URL, an option to perform an initial clone into a custom track repository path, etc.) but I think this approach serves CI and local development well enough in their default cases.
What do you think?
There was a problem hiding this comment.
On one hand, I don't want to conflict with people's local configurations. On the other hand, I don't want additional checkouts present on my local in order to isolate testing properly.
I propose creating a rally-compat.ini which has a comment at the header # Automatically managed - do not touch and running compatibility tests via --configuration-name compat. This would specify the canonical upstream default.url = https://github.com/elastic/rally-tracks (as well as our teams and source URLs) in a place that doesn't need to be touched by users and would better avoid users shooting themselves in the foot. Is this feasible?
There was a problem hiding this comment.
Mike and I discussed this offline. This is not a huge concern compared to the trappy nature of trying to isolate it perfectly. I agree with him that we could just move forward with this approach as-is
There was a problem hiding this comment.
Yeah I also think the rally-tracks-compat checkout is the best compromise here. Thanks!
|
Thanks for the review @pquentin! I believe I've addressed your comments, so this is ready for another look whenever you're free. |
pquentin
left a comment
There was a problem hiding this comment.
Thanks, this looks great now! Haven't tested the CI configuration though.
It's a little cumbersome (especially for jobs triggered by PRs) unless you're conversant in the local Jenkins workflow we use internally, which maybe you are already. We can sync on this if you'd like. |
This PR adds the infrastructure necessary to run
rally-tracksintegration tests (see elastic/rally-tracks#289) from within the Rally repository, both locally and--more importantly--as a PR check via CI.We modify and extend the
pytest-rallyplugin so that its behavior and defaults make sense when run from within the Rally repo as opposed to a track repo. This enables Rally developers and CI jobs to test Rally changes against arbitrary revisions of local track repositories, using arbitrary versions of Elasticsearch.In the default case, it simply ensures that changes being made to Rally do not break the
masterbranch ofrally-tracks. It does this by executing whatever tests are contained in theitsubdirectory of~/.rally/benchmarks/tracks/default. By default, it uses a build of the main branch of ES from source (i.e.--revision=currentin Rally terms) as the benchmark candidate.To run these tests, execute
pytest it/track_repo_compatibility/from the root of the Rally repo after runningmake install, or runmake rally-tracks-compatto run them within atoxenvironment.Default behavior
Here's an example of what happens by default, but we'll limit the example to just one test (for brevity of output) and add some extra logging to more clearly see what the
pytest-rallyplugin is doing:pytest it/track_repo_compatibility --log-cli-level=INFO -k metricbeatYou can see that we've applied the defaults for the
--track-repositoryand--track-revisionoptions in the first few lines of the output:rally: track-repository=/home/baamonde/.rally/benchmarks/tracks/default, track-revision=masterAnd that these make their way into the
esrallycommands thatpytest-rallygenerates:Overriding defaults
For local development, overriding these defaults could be useful. For example, if you're working on changes to both Rally and rally-tracks locally, and want to test with a released version of ES, you could do something like this:
pytest it/track_repo_compatibility/ \ --log-cli-level=INFO \ --track-repository=/home/baamonde/code/elastic/rally-tracks \ --track-revision=ci \ --distribution-version=8.3.2 \ -k metricbeatThe pytest session header will reflect these changes:
The plugin will pass along the
--distribution-versionto install the ES fixture:And race commands will provide the correct
--track-repositoryand--track-revision: