Support for Spark Catalogs and ErrorIfExists save mode. (#19) #20

Open
MaxKsyunz wants to merge 18 commits into main from integ/save_mode_simple

Conversation

@MaxKsyunz

Main Changes:

  • Connector implements Catalog API.
  • ErrorIfExists and Ignore save modes are supported (via Catalog API).
  • Separate format (cloud-spanner-graph) and table provider for Spanner Graphs.

Introducing a separate format for graphs was necessary because the current implementation tightly couples the Spark table definition and the graph query. Supporting the Catalog API for both Spanner graphs and tables with a single Spark table provider would require significantly refactoring graph support.

Specifically, I ran into a case where loading a graph table and then querying it resulted in type-conversion errors because the query code casts Id columns to `String` even though they are defined as `INT64`.
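The save-mode behavior the Catalog API enables can be sketched in plain Python. This is a minimal illustration of the semantics only; the `InMemoryCatalog` and `TableExistsError` names are hypothetical and are not the connector's actual API:

```python
class TableExistsError(Exception):
    """Raised when an ErrorIfExists write finds an existing table."""

class InMemoryCatalog:
    """Hypothetical stand-in for a catalog backed by Spanner."""

    def __init__(self):
        self.tables = {}

    def table_exists(self, name):
        return name in self.tables

    def write(self, name, rows, mode="ErrorIfExists"):
        # Implementing the Catalog API lets the connector check for the
        # table first, then honor the requested save mode.
        if self.table_exists(name):
            if mode == "ErrorIfExists":
                raise TableExistsError(name)
            if mode == "Ignore":
                return  # silently skip the write
        self.tables[name] = list(rows)

catalog = InMemoryCatalog()
catalog.write("Shakespeare", [("hamlet", 1)])
catalog.write("Shakespeare", [("lear", 2)], mode="Ignore")  # no-op
```

Without a catalog, the connector has no standard way to ask "does this table exist?" before writing, which is why these two modes arrive together with the Catalog API support.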


@MaxKsyunz MaxKsyunz requested a review from stevelordbq March 2, 2026 21:25
MaxKsyunz and others added 2 commits March 2, 2026 13:41
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@MaxKsyunz MaxKsyunz force-pushed the integ/save_mode_simple branch from 8ab0e02 to d4a3399 Compare March 2, 2026 21:41
MaxKsyunz added 16 commits March 2, 2026 15:28
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
…rmat(#22)

- `SpannerCatalogTableProviderBase.extractIdentifier` encodes the dataframe options supported by graphs into the returned identifier.
- These are: `graph`, `type`, `configs`, `graphQuery`, `timestamp`, `viewsEnabled`.
- `SpannerCatalog.loadTable` extracts these options to correctly instantiate the `SpannerGraph` class.
- Removed classes related to `cloud-spanner-graph`.
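The encode/decode round trip described above can be sketched as follows. The option names come from the commit, but the delimiter-based encoding here is an illustrative assumption, not the connector's actual identifier format:

```python
# Graph-related dataframe options that ride along inside the identifier.
GRAPH_OPTIONS = ("graph", "type", "configs", "graphQuery", "timestamp", "viewsEnabled")

def encode_identifier(table, options):
    """Fold the supported graph options into the table identifier (illustrative)."""
    parts = [f"{k}={options[k]}" for k in GRAPH_OPTIONS if k in options]
    return table if not parts else table + ";" + ";".join(parts)

def decode_identifier(identifier):
    """Recover the table name and the graph options loadTable needs."""
    table, _, rest = identifier.partition(";")
    options = dict(p.split("=", 1) for p in rest.split(";") if p)
    return table, options

ident = encode_identifier("Friends", {"graph": "SocialGraph", "type": "node"})
table, opts = decode_identifier(ident)
```

Carrying the options inside the identifier is what lets `loadTable`, which only receives an identifier, reconstruct a fully configured `SpannerGraph` without a separate format.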
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Other refactorings and clean-up

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
# Conflicts:
#	spark-3.1-spanner-lib/src/test/java/com/google/cloud/spark/spanner/integration/WriteIntegrationTest.java
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
- Make sure most integration tests continue to exercise the dataframe API without the catalog
- Add integration tests specifically for Catalog-based write operations
- Store the writer's dataframe options in a case-insensitive map. Spark options are case-insensitive, and they sometimes arrive normalized to all lower case.
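The case-insensitivity pitfall can be sketched in plain Python. Spark itself provides `CaseInsensitiveStringMap` for this on the JVM side; the dict subclass below is just an illustration of why a plain map breaks when options arrive lower-cased:

```python
class CaseInsensitiveMap(dict):
    """Dict that ignores key case, mirroring how Spark treats options."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __contains__(self, key):
        return super().__contains__(key.lower())

opts = CaseInsensitiveMap()
opts["graphQuery"] = "MATCH (n) RETURN n"
# Spark may hand the same option back normalized to all lower case;
# a plain dict lookup for "graphquery" would raise KeyError here.
```

A writer that stores options under the camel-cased key the user typed, then looks them up with the lower-cased key Spark delivers, silently loses configuration; normalizing at storage time removes that whole class of bug.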

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>
…arity

They both perform the same write operation twice with different results.

Signed-off-by: Max Ksyunz <max.ksyunz@improving.com>