Adding code obj version backward comptability test#3866
Adding code obj version backward comptability test#3866
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new extended functional test intended to validate backward compatibility across AMDGPU code object versions by building a HIP sample, detecting its produced code object version, rebuilding with older versions, and validating runtime output.
Changes:
- Added a functional test script that compiles and runs the
bit_extractHIP sample across detected and older code object versions. - Added a JSON config file to parameterize paths and build inputs for the new functional test.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
tests/extended_tests/functional/scripts/test_code_obj_version.py |
Implements the code object version backward compatibility functional test (build, detect, rebuild, run, record results). |
tests/extended_tests/functional/configs/cov_backward_comp.json |
Provides configuration (sample path, include path, source file, binary prefix) for the new functional test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from typing import Any, Dict, List, Tuple | ||
|
|
||
| sys.path.insert(0, str(Path(__file__).resolve().parents[2])) # For utils | ||
| sys.path.insert(0, str(Path(__file__).resolve().parent)) # For functional_base |
There was a problem hiding this comment.
functional_base is imported from the scripts directory, but there is no functional_base.py (or functional_base module) anywhere in the repository, so this test will fail to import/run as-is. Add the missing base module to the PR (as described in functional/README.md), or update the import/path to the actual location of FunctionalBase and run_functional_main.
| sys.path.insert(0, str(Path(__file__).resolve().parent)) # For functional_base | |
| sys.path.insert(0, str(Path(__file__).resolve().parents[1])) # For functional_base |
There was a problem hiding this comment.
Will rebase once the 3648 PR is merged
| test_name="cov_backward_comptability", | ||
| display_name="Cov Backward Comptability", |
There was a problem hiding this comment.
Typo in the test identifier/display name: "comptability" should be "compatibility" (this affects log output and any tooling keyed off test_name).
| test_name="cov_backward_comptability", | |
| display_name="Cov Backward Comptability", | |
| test_name="cov_backward_compatibility", | |
| display_name="Cov Backward Compatibility", |
| def _run_and_validate_binary(self, binary_path: Path) -> Tuple[int, str, str]: | ||
| env = self._get_rocm_env_with_path() | ||
| rc, output = self._execute_command_with_output( | ||
| [str(binary_path)], cwd=self.sample_dir, env=env | ||
| ) | ||
| if rc != 0: | ||
| return rc, output, f"Binary execution failed with return code {rc}" | ||
| if "PASSED!" not in output: | ||
| return rc, output, "Binary did not print expected token: PASSED!" | ||
| return rc, output, "" |
There was a problem hiding this comment.
PR description says the test "will not use any gpu resources", but this code executes the built HIP binaries and asserts runtime output ("PASSED!"). If GPU-less environments are expected to run this, consider changing the test to build-only and/or make runtime execution conditional/skip with a clear reason; otherwise please update the PR description to match the behavior.
There was a problem hiding this comment.
compile step does not use gpu resources
| """ | ||
| code object version functional test. | ||
|
|
||
| Build default bit_extract, detect code object version n, rebuild with n-1/n-2, | ||
| then run all binaries and verify they print "PASSED!". | ||
| """ |
There was a problem hiding this comment.
New Python files in tests/extended_tests consistently include the copyright + SPDX-License-Identifier: MIT header at the top (e.g. tests/extended_tests/benchmark/scripts/test_rocblas_benchmark.py:1-2). This new script starts directly with a docstring; please add the standard header for license compliance and consistency.
| return str(candidate) | ||
| return default_tool | ||
|
|
||
| def _compile_sample(self, output_binary: Path, code_obj_version: int = None) -> int: |
There was a problem hiding this comment.
how long does this take? we generally want to keep builds to TheRock build step, and this should be responsible for testing only
There was a problem hiding this comment.
This takes very less time not even seconds and does not utilize any gpu resources.
| "Host-binary readelf fallback is disabled." | ||
| ) | ||
|
|
||
| def _run_and_validate_binary(self, binary_path: Path) -> Tuple[int, str, str]: |
There was a problem hiding this comment.
what is the diff between this and run_tests?
There was a problem hiding this comment.
run_and_validate_binary -- is a lower level method and it does the following executes single compiled binary and validates the o/p where as run_tests compiles with the default binary detects the cov and calls run_and_validate_binary which executes the binary and records the result
| json.dump(self.test_results, f, indent=2) | ||
| log.info(f"{self.display_name} results saved to {self.results_json}") | ||
|
|
||
| def parse_results(self) -> List[Dict[str, Any]]: |
There was a problem hiding this comment.
this seems overkill, can we not just report findings as find them? why write to a json file when it just gets written to and immediately gets read and reports results?
There was a problem hiding this comment.
done removed writing to json file
geomin12
left a comment
There was a problem hiding this comment.
lgtm, but can you validate this passes with a CI Nightly workflow dispatch run?
Adding backward comptability test for code object versions.
Removed redundant json file. We need to build for each code obj version which doesn't take much time and will not use any gpu resources.
Will rebase this PR once 3648 PR is merged for function base and test matrix files.