Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request adds acceptance tests for write operations. The changes look good overall, but I have a few suggestions to improve code quality and test correctness.
In the Java test file, I've suggested using Google Truth's assertThat for consistency with existing assertions.
In the Python test script, I've pointed out several unused imports that should be removed. More importantly, I've proposed a refactoring of the data verification logic to make it more efficient and to provide clearer, more accurate results in case of a data mismatch. The current logic is inefficient and can produce confusing failure messages.
...-lib/src/test/java/com/google/cloud/spark/spanner/acceptance/DataprocAcceptanceTestBase.java
Outdated
Show resolved
Hide resolved
spark-3.1-spanner-lib/src/test/resources/acceptance/write_test_table.py
Outdated
Show resolved
Hide resolved
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request adds acceptance tests for the write functionality using a new Python Spark script. It also refactors the DataprocServerlessAcceptanceTestBase to use a one-time setup and teardown, which is a good improvement for test performance.
I've found a critical issue in the refactored setup logic where the initialization flag was not being set, causing the setup to run for every test. I've also suggested a minor improvement for assertion consistency. Please see the detailed comments.
| if (!initialized) { | ||
| try { | ||
| context = | ||
| new AcceptanceTestContext( | ||
| testId, generateClusterName(testId), testBaseGcsDir, connectorJarUri); | ||
| AcceptanceTestUtils.uploadConnectorJar( | ||
| connectorJarDirectory, connectorJarPrefix, context.connectorJarUri); | ||
| createSpannerDataset(); | ||
| } catch (Exception e) { | ||
| initializationException = e; | ||
| throw e; | ||
| } | ||
| } |
There was a problem hiding this comment.
The one-time initialization logic has a bug: the initialized flag is never set to true. This causes the expensive setup to run before every test, defeating the purpose of the refactoring. Additionally, the check is not thread-safe, which could lead to race conditions if tests are run in parallel. I suggest wrapping the initialization logic in a synchronized block and setting the flag upon successful completion.
synchronized (DataprocServerlessAcceptanceTestBase.class) {
if (!initialized) {
try {
context =
new AcceptanceTestContext(
testId, generateClusterName(testId), testBaseGcsDir, connectorJarUri);
AcceptanceTestUtils.uploadConnectorJar(
connectorJarDirectory, connectorJarPrefix, context.connectorJarUri);
createSpannerDataset();
initialized = true;
} catch (Exception e) {
initializationException = e;
throw e;
}
}
}
No description provided.