Implement overwrite save mode by MaxKsyunz · Pull Request #26 · Bit-Quill/spark-spanner-connector

MaxKsyunz · 2026-03-09T22:03:47Z

It supports an overwriteMode option with two possible values:

truncate -- All rows in the table are deleted. The incoming DataFrame must match the existing schema.
recreate -- The table is dropped and recreated based on the schema of the incoming DataFrame.
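
To make the two values concrete, here is a minimal sketch of how the option string could be mapped to a mode. The OverwriteMode enum and its fromOptions method are hypothetical names used for illustration, and falling back to truncate when the option is absent is an assumption, not something this PR states.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical enum mirroring the two overwriteMode values described above. */
enum OverwriteMode {
  TRUNCATE,
  RECREATE;

  /** Parses the "overwriteMode" write option, ignoring case and surrounding whitespace. */
  static OverwriteMode fromOptions(Map<String, String> options) {
    String raw = options.get("overwriteMode");
    if (raw == null) {
      // Assumption: truncate is the default when the option is not set.
      return TRUNCATE;
    }
    switch (raw.trim().toLowerCase(Locale.ROOT)) {
      case "truncate":
        return TRUNCATE;
      case "recreate":
        return RECREATE;
      default:
        // Fail fast so a typo never silently selects a destructive mode.
        throw new IllegalArgumentException(
            "Unsupported overwriteMode '" + raw + "'; expected 'truncate' or 'recreate'");
    }
  }
}
```

Rejecting unknown values up front keeps a misspelled option from being treated as either destructive mode.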

MaxKsyunz · 2026-03-09T22:04:55Z

/gemini review

gemini-code-assist

Code Review

This pull request implements the overwrite save mode for the Spanner Spark connector, supporting both truncate and recreate strategies, leveraging Spark's SupportsTruncate interface and including well-designed integration tests. However, a critical security issue was identified where the target table for truncation or recreation is determined by user-supplied write options rather than the SpannerTable instance properties. This could lead to operations being performed on unintended tables if conflicting options are provided, requiring careful validation to ensure destructive operations target the correct table. Additionally, there are suggestions to improve efficiency and exception handling in the implementation.

gemini-code-assist · 2026-03-09T22:14:21Z

spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spanner/SpannerWriteBuilder.java

+    String instanceId = SpannerUtils.getRequiredOption(opts, "instanceId");
+    String databaseId = SpannerUtils.getRequiredOption(opts, "databaseId");
+    String tableName = SpannerUtils.getRequiredOption(opts, "table");


In recreateTable, the table name, instance ID, and database ID are retrieved from the user-supplied write options (opts) instead of the properties of the SpannerTable for which the WriteBuilder was created. If a user provides different values for these options in the write call, the recreate operation will be performed on the specified table instead of the intended one. This could lead to accidental or malicious data loss in a different table. It is recommended to use this.properties which contains the validated connection and table details from the SpannerTable instance.
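
The recommendation above could look roughly like the following sketch, which reads the target only from the table's own properties. OverwriteTarget and fromTableProperties are hypothetical names, the plain Map stands in for whatever type this.properties actually has, and including projectId here is an assumption, since the quoted diff does not show it.

```java
import java.util.Map;

/** Hypothetical holder for the table a destructive operation is allowed to touch. */
final class OverwriteTarget {
  final String projectId;
  final String instanceId;
  final String databaseId;
  final String table;

  private OverwriteTarget(String projectId, String instanceId, String databaseId, String table) {
    this.projectId = projectId;
    this.instanceId = instanceId;
    this.databaseId = databaseId;
    this.table = table;
  }

  /** Builds the target from the table's properties only; write options are never consulted. */
  static OverwriteTarget fromTableProperties(Map<String, String> tableProperties) {
    return new OverwriteTarget(
        required(tableProperties, "projectId"),
        required(tableProperties, "instanceId"),
        required(tableProperties, "databaseId"),
        required(tableProperties, "table"));
  }

  private static String required(Map<String, String> props, String key) {
    String value = props.get(key);
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException("Missing required table property: " + key);
    }
    return value;
  }
}
```

Because the write options never reach this helper, a conflicting "table" value in the write call cannot redirect the drop and recreate.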

gemini-code-assist · 2026-03-09T22:14:22Z

spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spanner/SpannerWriteBuilder.java

+    String projectId = SpannerUtils.getRequiredOption(opts, "projectId");
+    String instanceId = SpannerUtils.getRequiredOption(opts, "instanceId");
+    String databaseId = SpannerUtils.getRequiredOption(opts, "databaseId");
+    String tableName = SpannerUtils.getRequiredOption(opts, "table");


In truncateTable, the project ID, instance ID, database ID, and table name are retrieved from the user-supplied write options (opts) instead of the properties of the SpannerTable. This allows a user to potentially truncate an arbitrary table by providing its details in the write options, bypassing the intended target of the WriteBuilder. Additionally, these values are used without the validation performed in SpannerUtils.connectionFromProperties. It is recommended to use this.properties to ensure the operation is performed on the correct and validated target.
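
One way to surface the mismatch described above, rather than silently preferring one source, is to compare the write options against the table properties before any destructive call. This is a sketch under assumptions: OptionConflicts and conflictingKeys are hypothetical names, and plain Maps stand in for the connector's actual option and property types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical check for write options that disagree with the table's properties. */
final class OptionConflicts {
  /** Keys that identify the target, as named in the review comments. */
  static final List<String> TARGET_KEYS =
      List.of("projectId", "instanceId", "databaseId", "table");

  /** Returns the target keys whose write-option value differs from the table property. */
  static List<String> conflictingKeys(
      Map<String, String> tableProperties, Map<String, String> writeOptions) {
    List<String> conflicts = new ArrayList<>();
    for (String key : TARGET_KEYS) {
      String requested = writeOptions.get(key);
      // An option that is absent cannot conflict; only explicit mismatches are reported.
      if (requested != null && !requested.equals(tableProperties.get(key))) {
        conflicts.add(key);
      }
    }
    return conflicts;
  }
}
```

A caller could refuse to truncate or recreate whenever the returned list is non-empty and report the offending keys in the error message.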

MaxKsyunz · 2026-03-09T22:35:36Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces support for overwriteMode with truncate and recreate options, enhancing the flexibility of data writing operations. The changes include adding a truncateTableDml method in SpannerInformationSchema, updating SpannerTable to support TRUNCATE capability, and implementing the SupportsTruncate interface in SpannerWriteBuilder with logic for both overwrite modes. New integration tests have been added to validate the functionality of these overwrite modes.
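
For readers unfamiliar with the truncate path, the following sketch shows how a delete-all statement of the kind the first commit describes (delete from ... where true) could be built per dialect. It is not the PR's truncateTableDml implementation: TruncateDml, Dialect, and forTable are hypothetical names, and restricting table names to simple unqualified identifiers is an assumption made to keep the quoting trivial.

```java
import java.util.regex.Pattern;

/** Hypothetical builder for a delete-all statement; not the connector's actual code. */
final class TruncateDml {
  enum Dialect {
    GOOGLE_SQL,
    POSTGRESQL
  }

  // Assumption: only simple, unqualified identifiers are accepted.
  private static final Pattern SIMPLE_IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

  /** Returns "DELETE FROM <quoted table> WHERE true" using the dialect's identifier quoting. */
  static String forTable(String tableName, Dialect dialect) {
    if (tableName == null || !SIMPLE_IDENTIFIER.matcher(tableName).matches()) {
      throw new IllegalArgumentException("Unsupported table name: " + tableName);
    }
    // GoogleSQL quotes identifiers with backticks; the PostgreSQL dialect uses double quotes.
    String quoted =
        dialect == Dialect.POSTGRESQL ? "\"" + tableName + "\"" : "`" + tableName + "`";
    return "DELETE FROM " + quoted + " WHERE true";
  }
}
```

With the Cloud Spanner Java client, a statement like this would typically be passed to DatabaseClient.executePartitionedUpdate, which runs it as Partitioned DML.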

MaxKsyunz and others added 12 commits March 9, 2026 10:32

Implement overwrite mode as delete from ... where true sent as part…

996c3f0

…itioned DML. Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Fix-up formatting

0d6344b

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Fix-up formatting II

07171af

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

No comments, no problems.

78b8fe2

spotless:apply was not fixing error that spotless:check was reporting. Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Update spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spa…

1e2f8a1

…nner/SpannerWriteBuilder.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Replace print statements with logs.

9680724

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

add support for overwriteMode option for Overwrite save mode, support…

f43a71a

… "recreate" Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Update spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spa…

4efe8d3

…nner/SpannerWriteBuilder.java Temporary code. will be merged with catalog branch. Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Rework recreate mode to use same DDL generation code as SpannerCatalo…

48255b2

…g class. Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Update spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spa…

6ff3f1b

…nner/SpannerWriteBuilder.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Refactor Overwrite save mode integration tests.

7fe7619

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

Formatting clean-up

7d3b9f5

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

gemini-code-assist bot reviewed Mar 9, 2026

MaxKsyunz and others added 3 commits March 9, 2026 15:15

Simplify dialect check

3d3a69f

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Catch appropriate exception for DDL error

58cc7c1

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Clean up property usage.

baaddcf

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>

gemini-code-assist bot reviewed Mar 9, 2026

Update spark-3.1-spanner-lib/src/main/java/com/google/cloud/spark/spa…

1eacb24

…nner/SpannerWriteBuilder.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

MaxKsyunz merged commit 676dc77 into integ/overwrite_save_mode Mar 10, 2026
1 check passed

MaxKsyunz merged 16 commits into integ/overwrite_save_mode from dev/overwrite_mode_v3
