Skip to content

Add mysql_fdw for MySQL/Doris support#584

Open
wenerme wants to merge 6 commits intosupabase:mainfrom
wenerme:feat/mysql
Open

Add mysql_fdw for MySQL/Doris support#584
wenerme wants to merge 6 commits intosupabase:mainfrom
wenerme:feat/mysql

Conversation

@wenerme
Copy link

@wenerme wenerme commented Mar 4, 2026

Summary

  • Add a new native MySQL FDW (mysql_fdw) that supports querying MySQL and MySQL-protocol compatible databases (e.g., Apache Doris)
  • Uses mysql_async crate with text protocol (simple_query) instead of prepared statements for broader compatibility with databases that don't support mysql_stmt_result_metadata
  • Based on mssql_fdw template with MySQL-specific adaptations: LIMIT without ORDER BY, string-based date/time parsing, support for conn_string and conn_string_id (Vault) options

@coderabbitai
Copy link

coderabbitai bot commented Mar 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This pull request introduces a MySQL Foreign Data Wrapper (FDW) to enable PostgreSQL to query MySQL tables. The changes add the mysql_async dependency and feature flag to Cargo.toml, register the mysql_fdw module conditionally, implement an error handling layer (MysqlFdwError), and provide a complete FDW implementation with lifecycle methods (new, begin_scan, iter_scan, re_scan, end_scan), SQL query generation (deparse), and row type conversion (field_to_cell) supporting multiple column types. A test module scaffold is also included.

Sequence Diagram

sequenceDiagram
    participant PG as PostgreSQL<br/>Query Engine
    participant FDW as MysqlFdw
    participant RT as Tokio<br/>Runtime
    participant MySQL as MySQL<br/>Database
    participant Conv as Type<br/>Converter

    PG->>FDW: new(server)
    FDW->>RT: Create runtime
    FDW->>FDW: Resolve connection string
    FDW-->>PG: FDW instance

    PG->>FDW: begin_scan(quals, columns, sorts, limit)
    FDW->>FDW: Build SELECT query via deparse()
    FDW->>RT: Execute async query
    RT->>MySQL: SELECT with WHERE/ORDER BY/LIMIT
    MySQL-->>RT: Return rows
    RT-->>FDW: Store Vec<Row>
    FDW-->>PG: Scan ready

    loop For each row
        PG->>FDW: iter_scan(row)
        FDW->>Conv: field_to_cell() for each column
        Conv->>Conv: Parse type (bool/int/float/numeric/text/date/timestamp)
        Conv-->>FDW: Cell values
        FDW->>FDW: Materialize target row
        FDW-->>PG: Row with data
    end

    PG->>FDW: end_scan()
    FDW->>FDW: Clear results
    FDW-->>PG: Scan complete
Loading

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wrappers/Cargo.toml`:
- Around line 146-151: The new feature group mysql_fdw is declared but not
included in the aggregate native_fdws/all_fdws features; update the feature
lists so native_fdws includes "mysql_fdw" and ensure any higher-level aggregate
(e.g., all_fdws or all) that should enable every FDW includes native_fdws (or
directly includes "mysql_fdw") so builds that rely on native_fdws/all_fdws will
pull in the mysql_fdw feature.

In `@wrappers/src/fdw/mysql_fdw/mod.rs`:
- Around line 35-38: The From<MysqlFdwError> for ErrorReport implementation
currently forwards raw error text which can leak DSN details and flattens
OptionsError into a generic FDW error; update this conversion to (1) match on
MysqlFdwError variants, (2) for mysql_async/connection-related variants produce
a sanitized message (redacting DSN/credentials) instead of the raw error string,
and (3) when encountering the OptionsError variant return an ErrorReport that
preserves its specialized SQLSTATE mapping rather than using
PgSqlErrorCode::ERRCODE_FDW_ERROR; modify the match arms in the from()
implementation to map each variant appropriately and only include non-sensitive
diagnostic details.

In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 125-160: The SQL builder is inserting unquoted/unescaped
identifiers and deparsed qualifiers directly (variables: tgts, self.table,
columns, quals with qual.deparse(), sorts with sort.field), which risks SQL
injection and breaks on reserved/special characters; update sql assembly to
properly quote/escape identifiers (table and column names from self.table,
columns/tgts, and sort.field) using the DB-safe identifier quoting/escaping
routine or a helper (e.g., quote_identifier or escape_identifier) and ensure
quals/deparse() produces only parameter placeholders with values passed
separately or is rewritten to return a safe expression and parameters, rather
than interpolating raw text—apply this consistently where tgts, self.table,
qual.deparse(), and sort.field are used to build the WHERE and ORDER BY clauses.
- Around line 57-82: The code silently converts datetime parse failures into
NULL by using .ok()/.and_then (in the DATEOID, TIMESTAMPOID, TIMESTAMPTZOID
branches and the similar block at lines ~91-95); change these to propagate parse
errors instead of swallowing them: replace the .ok()/.and_then chains around
get_col::<String> and parse_naive_datetime with explicit Result handling (match
or ? on Result from parse_naive_datetime) and return a descriptive Err (or map
the parse error into the function's error type) so callers see conversion
failures; reference the branches for PgOid::BuiltIn(PgBuiltInOids::DATEOID),
TIMESTAMPOID, TIMESTAMPTZOID and functions get_col, parse_naive_datetime,
to_timestamp, and the Cell::Date/Cell::Timestamp/Cell::Timestamptz constructors
when implementing the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 34ec0ec6-dffe-4b26-9657-da47005e453f

📥 Commits

Reviewing files that changed from the base of the PR and between 77d4200 and 11fa34b.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • wrappers/Cargo.toml
  • wrappers/src/fdw/mod.rs
  • wrappers/src/fdw/mysql_fdw/mod.rs
  • wrappers/src/fdw/mysql_fdw/mysql_fdw.rs
  • wrappers/src/fdw/mysql_fdw/tests.rs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
wrappers/src/fdw/mysql_fdw/mysql_fdw.rs (1)

139-148: ⚠️ Potential issue | 🔴 Critical

Unsafe qualifier SQL interpolation still allows predicate injection and identifier breakage.

q.deparse() output is interpolated directly into the remote SQL. In this code path, qualifier identifiers/literals are not safely MySQL-quoted/escaped, so crafted string predicates can alter SQL semantics and reserved/special identifiers in quals can fail at runtime.

Suggested fix (build WHERE clauses with MySQL-safe deparsing)
+fn quote_literal(value: &str) -> String {
+    format!("'{}'", value.replace('\\', "\\\\").replace('\'', "''"))
+}
+
+fn format_mysql_cell(cell: &Cell) -> String {
+    match cell {
+        Cell::String(v) => quote_literal(v),
+        _ => format!("{cell}"),
+    }
+}
+
+fn deparse_mysql_qual(q: &Qual) -> MysqlFdwResult<String> {
+    let field = quote_ident(&q.field);
+    let op = match q.operator.as_str() {
+        "=" | "<>" | ">" | ">=" | "<" | "<=" | "is" | "is not" => q.operator.as_str(),
+        "~~" => "like",
+        "!~~" => "not like",
+        other => {
+            return Err(MysqlFdwError::ConversionError(format!(
+                "unsupported operator for pushdown: {other}"
+            )));
+        }
+    };
+
+    if q.use_or {
+        match &q.value {
+            Value::Array(cells) => Ok(cells
+                .iter()
+                .map(|cell| format!("{field} {op} {}", format_mysql_cell(cell)))
+                .collect::<Vec<_>>()
+                .join(" or ")),
+            Value::Cell(_) => unreachable!(),
+        }
+    } else {
+        match &q.value {
+            Value::Cell(Cell::String(v)) if (op == "is" || op == "is not") && v == "null" => {
+                Ok(format!("{field} {op} null"))
+            }
+            Value::Cell(cell) => Ok(format!("{field} {op} {}", format_mysql_cell(cell))),
+            Value::Array(_) => unreachable!(),
+        }
+    }
+}
@@
-            let cond = quals
-                .iter()
-                .map(|q| q.deparse())
-                .collect::<Vec<String>>()
-                .join(" and ");
+            let cond = quals
+                .iter()
+                .map(deparse_mysql_qual)
+                .collect::<MysqlFdwResult<Vec<String>>>()?
+                .join(" and ");

Verification (read-only): confirm the current code path still interpolates raw Qual::deparse() and that Qual::deparse/string formatting are not MySQL-safe. Expected result: both commands return matches.

#!/bin/bash
set -euo pipefail

rg -n -C3 'map\(\|q\| q\.deparse\(\)\)' wrappers/src/fdw/mysql_fdw/mysql_fdw.rs
rg -n -C3 'impl Qual|deparse_with_fmt|format!\("\{\} \{\} \{\}"|Cell::String\(v\).*write!' supabase-wrappers/src/interface.rs

As per coding guidelines, “Restrict feedback to errors, security risks, or functionality-breaking problems.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs` around lines 139 - 148, The current
code concatenates raw q.deparse() outputs into the SQL (in the quals -> cond ->
sql.push_str path), which allows predicate injection and identifier/literal
breakage; change the WHERE construction to build predicates using a MySQL-safe
deparsing routine or explicit escaping rather than raw string interpolation: for
each Qual in quals call a safe formatter (e.g. a deparse_with_fmt or dedicated
escape_identifier/escape_literal helpers) that returns properly quoted
identifiers (`backticks`) and escaped string/number literals, then join those
safe predicates with " and " and append to sql; alternatively switch to
parameterized queries if supported by the remote execution path. Ensure the
change targets the quals iteration and the q.deparse() call so all qualifier
output is sanitized before sql.push_str.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 139-148: The current code concatenates raw q.deparse() outputs
into the SQL (in the quals -> cond -> sql.push_str path), which allows predicate
injection and identifier/literal breakage; change the WHERE construction to
build predicates using a MySQL-safe deparsing routine or explicit escaping
rather than raw string interpolation: for each Qual in quals call a safe
formatter (e.g. a deparse_with_fmt or dedicated escape_identifier/escape_literal
helpers) that returns properly quoted identifiers (`backticks`) and escaped
string/number literals, then join those safe predicates with " and " and append
to sql; alternatively switch to parameterized queries if supported by the remote
execution path. Ensure the change targets the quals iteration and the
q.deparse() call so all qualifier output is sanitized before sql.push_str.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 66dfb6b6-84e7-4774-935e-1acb024ef716

📥 Commits

Reviewing files that changed from the base of the PR and between 11fa34b and 433873c.

📒 Files selected for processing (3)
  • wrappers/Cargo.toml
  • wrappers/src/fdw/mysql_fdw/mod.rs
  • wrappers/src/fdw/mysql_fdw/mysql_fdw.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • wrappers/src/fdw/mysql_fdw/mod.rs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wrappers/src/fdw/tencent_cls_fdw/tencent_cls_fdw.rs`:
- Around line 165-175: The current headers.insert calls use
HeaderValue::from_str(...).unwrap(), which will panic on non-ASCII
user-controlled values (authorization, self.region, timestamp) — change these
unwraps to safe, fallible handling: replace the unwraps around
HeaderValue::from_str for "Authorization", "X-TC-Region", and "X-TC-Timestamp"
with code that either (a) validates the input is ASCII (e.g.
authorization.is_ascii(), self.region.is_ascii()) and returns a proper
Error/Result to the caller if not, or (b) attempts HeaderValue::from_str and
maps Err into a returned error (propagate with context) instead of panicking;
update the function that builds the headers (the block with headers.insert(...)
for Authorization, X-TC-Region, X-TC-Timestamp and X-TC-Timestamp using
timestamp.to_string()) to return Result so header conversion failures are
propagated safely.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 4886062a-74dc-4aa3-989d-97f481e64560

📥 Commits

Reviewing files that changed from the base of the PR and between 433873c and a771087.

📒 Files selected for processing (7)
  • wrappers/Cargo.toml
  • wrappers/src/fdw/mod.rs
  • wrappers/src/fdw/mysql_fdw/mod.rs
  • wrappers/src/fdw/mysql_fdw/mysql_fdw.rs
  • wrappers/src/fdw/tencent_cls_fdw/mod.rs
  • wrappers/src/fdw/tencent_cls_fdw/tencent_cls_fdw.rs
  • wrappers/src/fdw/tencent_cls_fdw/tests.rs

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
wrappers/src/fdw/mysql_fdw/mysql_fdw.rs (1)

153-159: ⚠️ Potential issue | 🔴 Critical

WHERE clause still interpolates raw qualifier field/operator text.

Line 157 uses q.deparse_with_fmt(&mut fmt), and Qual::deparse_with_fmt emits field/operator verbatim. This leaves identifier/operator safety weaker than the quoted SELECT/ORDER BY paths and can still produce injection/runtime-break behavior.

As per coding guidelines, “Comment only when the issue must be resolved before merge.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs` around lines 153 - 159, The WHERE
construction currently uses Qual::deparse_with_fmt (called in the quals.iter()
map) which emits field and operator verbatim; replace this by building safe
conditions: for each Qual in quals, validate/quote the identifier (use or extend
MysqlCellFormatter to provide a quote_identifier method) and whitelist/validate
the operator instead of interpolating it, produce a condition with a parameter
placeholder (e.g., "quoted_field {op} ?") and collect the corresponding value
into a params vector; update the code around quals.iter() (the closure
generating cond and the MysqlCellFormatter usage) to call the new safe
quoting/validation routines and to return both the condition string and
parameter(s) rather than using deparse_with_fmt directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 196-201: The current conn_string resolution swallows Vault errors
by calling get_vault_secret(conn_str_id).unwrap_or_default(), producing an empty
DSN on failure; change this to propagate the Vault retrieval error instead of
defaulting: after resolving conn_string_id via require_option, call
get_vault_secret(conn_str_id) and propagate its Err (using ? or map_err to
convert to the function's error type) so the function returns an error when
secret retrieval fails; update the match arm that handles
server.options.get("conn_string") and ensure the surrounding function (which
uses conn_str) returns/handles the propagated error.
- Around line 193-281: MysqlFdw currently only implements scan methods so DMLs
are no-ops, WHERE identifiers are unquoted, and vault failures are masked; fix
by (1) implementing the ForeignDataWrapper DML methods (begin_modify, insert,
update, delete, end_modify) in MysqlFdw to actually execute the corresponding
MySQL statements using the runtime/connection logic already used in begin_scan
and return MysqlFdwError on failure instead of default success (reference:
methods named begin_modify, insert, update, delete, end_modify), (2) change
qualifier deparsing to quote identifiers the same way SELECT deparses (update
Qual::deparse_with_fmt or the MysqlFdw::deparse path to call the existing
quote/escape helper used for columns and tables so field names in WHERE are
safely quoted and cannot inject SQL; refer to deparse and
Qual::deparse_with_fmt), and (3) remove unwrap_or_default() when fetching the
vault secret in MysqlFdw::new (replace get_vault_secret(...).unwrap_or_default()
with a fallible path that returns an error via ? or map_err so missing/failed
vault lookups surface as errors and don’t yield an empty conn_str; reference
get_vault_secret and require_option usage in MysqlFdw::new).

---

Duplicate comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 153-159: The WHERE construction currently uses
Qual::deparse_with_fmt (called in the quals.iter() map) which emits field and
operator verbatim; replace this by building safe conditions: for each Qual in
quals, validate/quote the identifier (use or extend MysqlCellFormatter to
provide a quote_identifier method) and whitelist/validate the operator instead
of interpolating it, produce a condition with a parameter placeholder (e.g.,
"quoted_field {op} ?") and collect the corresponding value into a params vector;
update the code around quals.iter() (the closure generating cond and the
MysqlCellFormatter usage) to call the new safe quoting/validation routines and
to return both the condition string and parameter(s) rather than using
deparse_with_fmt directly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 8672de56-1845-422a-96fd-2094811dd4b7

📥 Commits

Reviewing files that changed from the base of the PR and between a771087 and 9154020.

📒 Files selected for processing (1)
  • wrappers/src/fdw/mysql_fdw/mysql_fdw.rs

@burmecia
Copy link
Member

burmecia commented Mar 5, 2026

Thanks for the PR! Could you fix the issues raised in CI first?

wenerme added 4 commits March 5, 2026 14:41
Uses mysql_async crate with text protocol (simple_query) instead of
prepared statements, which makes it compatible with Doris and other
MySQL-protocol databases that don't support mysql_stmt_result_metadata.

Based on mssql_fdw template with MySQL-specific adaptations:
- MySQL LIMIT without requiring ORDER BY
- String-based date/time parsing for broader compatibility
- Support for conn_string and conn_string_id (vault) options
- Add mysql_fdw to native_fdws feature list in Cargo.toml
- Preserve SQLSTATE codes for OptionsError/CreateRuntimeError and
  sanitize mysql_async error messages to avoid leaking credentials
- Propagate datetime/date parse errors instead of silently returning NULL
- Quote SQL identifiers with backticks to prevent injection and
  reserved word conflicts
Add MysqlCellFormatter that escapes single quotes and backslashes in
string literals, and formats booleans as 0/1 for MySQL compatibility.
Use deparse_with_fmt instead of deparse to apply the safe formatter
when building WHERE predicates.
@wenerme
Copy link
Author

wenerme commented Mar 5, 2026

@burmecia ping

wenerme added 2 commits March 5, 2026 22:51
…in WHERE, and DML support

- Replace get_vault_secret().unwrap_or_default() with proper error propagation
  so missing/failed vault lookups surface as errors instead of empty conn_str
- Add custom deparse_qual() that quotes field names with backticks in WHERE
  clauses, matching the quoting used in SELECT columns and ORDER BY
- Implement DML methods (begin_modify, insert, update, delete, end_modify)
  using the same runtime/connection pattern as begin_scan
@wenerme
Copy link
Author

wenerme commented Mar 6, 2026

@burmecia CI issues have been fixed. The clippy uninlined_format_args errors and all CodeRabbit review feedback have been addressed in the latest commits. The most recent CI run shows action_required (needs approval to run).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants