Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis pull request introduces a MySQL Foreign Data Wrapper (FDW) to enable PostgreSQL to query MySQL tables. The changes add the mysql_async dependency and feature flag to Cargo.toml, register the mysql_fdw module conditionally, implement an error handling layer (MysqlFdwError), and provide a complete FDW implementation with lifecycle methods (new, begin_scan, iter_scan, re_scan, end_scan), SQL query generation (deparse), and row type conversion (field_to_cell) supporting multiple column types. A test module scaffold is also included. Sequence DiagramsequenceDiagram
participant PG as PostgreSQL<br/>Query Engine
participant FDW as MysqlFdw
participant RT as Tokio<br/>Runtime
participant MySQL as MySQL<br/>Database
participant Conv as Type<br/>Converter
PG->>FDW: new(server)
FDW->>RT: Create runtime
FDW->>FDW: Resolve connection string
FDW-->>PG: FDW instance
PG->>FDW: begin_scan(quals, columns, sorts, limit)
FDW->>FDW: Build SELECT query via deparse()
FDW->>RT: Execute async query
RT->>MySQL: SELECT with WHERE/ORDER BY/LIMIT
MySQL-->>RT: Return rows
RT-->>FDW: Store Vec<Row>
FDW-->>PG: Scan ready
loop For each row
PG->>FDW: iter_scan(row)
FDW->>Conv: field_to_cell() for each column
Conv->>Conv: Parse type (bool/int/float/numeric/text/date/timestamp)
Conv-->>FDW: Cell values
FDW->>FDW: Materialize target row
FDW-->>PG: Row with data
end
PG->>FDW: end_scan()
FDW->>FDW: Clear results
FDW-->>PG: Scan complete
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@wrappers/Cargo.toml`:
- Around line 146-151: The new feature group mysql_fdw is declared but not
included in the aggregate native_fdws/all_fdws features; update the feature
lists so native_fdws includes "mysql_fdw" and ensure any higher-level aggregate
(e.g., all_fdws or all) that should enable every FDW includes native_fdws (or
directly includes "mysql_fdw") so builds that rely on native_fdws/all_fdws will
pull in the mysql_fdw feature.
In `@wrappers/src/fdw/mysql_fdw/mod.rs`:
- Around line 35-38: The From<MysqlFdwError> for ErrorReport implementation
currently forwards raw error text which can leak DSN details and flattens
OptionsError into a generic FDW error; update this conversion to (1) match on
MysqlFdwError variants, (2) for mysql_async/connection-related variants produce
a sanitized message (redacting DSN/credentials) instead of the raw error string,
and (3) when encountering the OptionsError variant return an ErrorReport that
preserves its specialized SQLSTATE mapping rather than using
PgSqlErrorCode::ERRCODE_FDW_ERROR; modify the match arms in the from()
implementation to map each variant appropriately and only include non-sensitive
diagnostic details.
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 125-160: The SQL builder is inserting unquoted/unescaped
identifiers and deparsed qualifiers directly (variables: tgts, self.table,
columns, quals with qual.deparse(), sorts with sort.field), which risks SQL
injection and breaks on reserved/special characters; update sql assembly to
properly quote/escape identifiers (table and column names from self.table,
columns/tgts, and sort.field) using the DB-safe identifier quoting/escaping
routine or a helper (e.g., quote_identifier or escape_identifier) and ensure
quals/deparse() produces only parameter placeholders with values passed
separately or is rewritten to return a safe expression and parameters, rather
than interpolating raw text—apply this consistently where tgts, self.table,
qual.deparse(), and sort.field are used to build the WHERE and ORDER BY clauses.
- Around line 57-82: The code silently converts datetime parse failures into
NULL by using .ok()/.and_then (in the DATEOID, TIMESTAMPOID, TIMESTAMPTZOID
branches and the similar block at lines ~91-95); change these to propagate parse
errors instead of swallowing them: replace the .ok()/.and_then chains around
get_col::<String> and parse_naive_datetime with explicit Result handling (match
or ? on Result from parse_naive_datetime) and return a descriptive Err (or map
the parse error into the function's error type) so callers see conversion
failures; reference the branches for PgOid::BuiltIn(PgBuiltInOids::DATEOID),
TIMESTAMPOID, TIMESTAMPTZOID and functions get_col, parse_naive_datetime,
to_timestamp, and the Cell::Date/Cell::Timestamp/Cell::Timestamptz constructors
when implementing the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 34ec0ec6-dffe-4b26-9657-da47005e453f
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
wrappers/Cargo.tomlwrappers/src/fdw/mod.rswrappers/src/fdw/mysql_fdw/mod.rswrappers/src/fdw/mysql_fdw/mysql_fdw.rswrappers/src/fdw/mysql_fdw/tests.rs
There was a problem hiding this comment.
♻️ Duplicate comments (1)
wrappers/src/fdw/mysql_fdw/mysql_fdw.rs (1)
139-148:⚠️ Potential issue | 🔴 CriticalUnsafe qualifier SQL interpolation still allows predicate injection and identifier breakage.
q.deparse()output is interpolated directly into the remote SQL. In this code path, qualifier identifiers/literals are not safely MySQL-quoted/escaped, so crafted string predicates can alter SQL semantics and reserved/special identifiers in quals can fail at runtime.Suggested fix (build WHERE clauses with MySQL-safe deparsing)
+fn quote_literal(value: &str) -> String { + format!("'{}'", value.replace('\\', "\\\\").replace('\'', "''")) +} + +fn format_mysql_cell(cell: &Cell) -> String { + match cell { + Cell::String(v) => quote_literal(v), + _ => format!("{cell}"), + } +} + +fn deparse_mysql_qual(q: &Qual) -> MysqlFdwResult<String> { + let field = quote_ident(&q.field); + let op = match q.operator.as_str() { + "=" | "<>" | ">" | ">=" | "<" | "<=" | "is" | "is not" => q.operator.as_str(), + "~~" => "like", + "!~~" => "not like", + other => { + return Err(MysqlFdwError::ConversionError(format!( + "unsupported operator for pushdown: {other}" + ))); + } + }; + + if q.use_or { + match &q.value { + Value::Array(cells) => Ok(cells + .iter() + .map(|cell| format!("{field} {op} {}", format_mysql_cell(cell))) + .collect::<Vec<_>>() + .join(" or ")), + Value::Cell(_) => unreachable!(), + } + } else { + match &q.value { + Value::Cell(Cell::String(v)) if (op == "is" || op == "is not") && v == "null" => { + Ok(format!("{field} {op} null")) + } + Value::Cell(cell) => Ok(format!("{field} {op} {}", format_mysql_cell(cell))), + Value::Array(_) => unreachable!(), + } + } +} @@ - let cond = quals - .iter() - .map(|q| q.deparse()) - .collect::<Vec<String>>() - .join(" and "); + let cond = quals + .iter() + .map(deparse_mysql_qual) + .collect::<MysqlFdwResult<Vec<String>>>()? + .join(" and ");Verification (read-only): confirm the current code path still interpolates raw
Qual::deparse()and thatQual::deparse/string formatting are not MySQL-safe. Expected result: both commands return matches.#!/bin/bash set -euo pipefail rg -n -C3 'map\(\|q\| q\.deparse\(\)\)' wrappers/src/fdw/mysql_fdw/mysql_fdw.rs rg -n -C3 'impl Qual|deparse_with_fmt|format!\("\{\} \{\} \{\}"|Cell::String\(v\).*write!' supabase-wrappers/src/interface.rsAs per coding guidelines, “Restrict feedback to errors, security risks, or functionality-breaking problems.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs` around lines 139 - 148, The current code concatenates raw q.deparse() outputs into the SQL (in the quals -> cond -> sql.push_str path), which allows predicate injection and identifier/literal breakage; change the WHERE construction to build predicates using a MySQL-safe deparsing routine or explicit escaping rather than raw string interpolation: for each Qual in quals call a safe formatter (e.g. a deparse_with_fmt or dedicated escape_identifier/escape_literal helpers) that returns properly quoted identifiers (`backticks`) and escaped string/number literals, then join those safe predicates with " and " and append to sql; alternatively switch to parameterized queries if supported by the remote execution path. Ensure the change targets the quals iteration and the q.deparse() call so all qualifier output is sanitized before sql.push_str.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 139-148: The current code concatenates raw q.deparse() outputs
into the SQL (in the quals -> cond -> sql.push_str path), which allows predicate
injection and identifier/literal breakage; change the WHERE construction to
build predicates using a MySQL-safe deparsing routine or explicit escaping
rather than raw string interpolation: for each Qual in quals call a safe
formatter (e.g. a deparse_with_fmt or dedicated escape_identifier/escape_literal
helpers) that returns properly quoted identifiers (`backticks`) and escaped
string/number literals, then join those safe predicates with " and " and append
to sql; alternatively switch to parameterized queries if supported by the remote
execution path. Ensure the change targets the quals iteration and the
q.deparse() call so all qualifier output is sanitized before sql.push_str.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 66dfb6b6-84e7-4774-935e-1acb024ef716
📒 Files selected for processing (3)
wrappers/Cargo.tomlwrappers/src/fdw/mysql_fdw/mod.rswrappers/src/fdw/mysql_fdw/mysql_fdw.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- wrappers/src/fdw/mysql_fdw/mod.rs
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@wrappers/src/fdw/tencent_cls_fdw/tencent_cls_fdw.rs`:
- Around line 165-175: The current headers.insert calls use
HeaderValue::from_str(...).unwrap(), which will panic on non-ASCII
user-controlled values (authorization, self.region, timestamp) — change these
unwraps to safe, fallible handling: replace the unwraps around
HeaderValue::from_str for "Authorization", "X-TC-Region", and "X-TC-Timestamp"
with code that either (a) validates the input is ASCII (e.g.
authorization.is_ascii(), self.region.is_ascii()) and returns a proper
Error/Result to the caller if not, or (b) attempts HeaderValue::from_str and
maps Err into a returned error (propagate with context) instead of panicking;
update the function that builds the headers (the block with headers.insert(...)
for Authorization, X-TC-Region, X-TC-Timestamp and X-TC-Timestamp using
timestamp.to_string()) to return Result so header conversion failures are
propagated safely.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 4886062a-74dc-4aa3-989d-97f481e64560
📒 Files selected for processing (7)
wrappers/Cargo.tomlwrappers/src/fdw/mod.rswrappers/src/fdw/mysql_fdw/mod.rswrappers/src/fdw/mysql_fdw/mysql_fdw.rswrappers/src/fdw/tencent_cls_fdw/mod.rswrappers/src/fdw/tencent_cls_fdw/tencent_cls_fdw.rswrappers/src/fdw/tencent_cls_fdw/tests.rs
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
wrappers/src/fdw/mysql_fdw/mysql_fdw.rs (1)
153-159:⚠️ Potential issue | 🔴 CriticalWHERE clause still interpolates raw qualifier field/operator text.
Line 157 uses
q.deparse_with_fmt(&mut fmt), andQual::deparse_with_fmtemitsfield/operatorverbatim. This leaves identifier/operator safety weaker than the quoted SELECT/ORDER BY paths and can still produce injection/runtime-break behavior.As per coding guidelines, “Comment only when the issue must be resolved before merge.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs` around lines 153 - 159, The WHERE construction currently uses Qual::deparse_with_fmt (called in the quals.iter() map) which emits field and operator verbatim; replace this by building safe conditions: for each Qual in quals, validate/quote the identifier (use or extend MysqlCellFormatter to provide a quote_identifier method) and whitelist/validate the operator instead of interpolating it, produce a condition with a parameter placeholder (e.g., "quoted_field {op} ?") and collect the corresponding value into a params vector; update the code around quals.iter() (the closure generating cond and the MysqlCellFormatter usage) to call the new safe quoting/validation routines and to return both the condition string and parameter(s) rather than using deparse_with_fmt directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 196-201: The current conn_string resolution swallows Vault errors
by calling get_vault_secret(conn_str_id).unwrap_or_default(), producing an empty
DSN on failure; change this to propagate the Vault retrieval error instead of
defaulting: after resolving conn_string_id via require_option, call
get_vault_secret(conn_str_id) and propagate its Err (using ? or map_err to
convert to the function's error type) so the function returns an error when
secret retrieval fails; update the match arm that handles
server.options.get("conn_string") and ensure the surrounding function (which
uses conn_str) returns/handles the propagated error.
- Around line 193-281: MysqlFdw currently only implements scan methods so DMLs
are no-ops, WHERE identifiers are unquoted, and vault failures are masked; fix
by (1) implementing the ForeignDataWrapper DML methods (begin_modify, insert,
update, delete, end_modify) in MysqlFdw to actually execute the corresponding
MySQL statements using the runtime/connection logic already used in begin_scan
and return MysqlFdwError on failure instead of default success (reference:
methods named begin_modify, insert, update, delete, end_modify), (2) change
qualifier deparsing to quote identifiers the same way SELECT deparses (update
Qual::deparse_with_fmt or the MysqlFdw::deparse path to call the existing
quote/escape helper used for columns and tables so field names in WHERE are
safely quoted and cannot inject SQL; refer to deparse and
Qual::deparse_with_fmt), and (3) remove unwrap_or_default() when fetching the
vault secret in MysqlFdw::new (replace get_vault_secret(...).unwrap_or_default()
with a fallible path that returns an error via ? or map_err so missing/failed
vault lookups surface as errors and don’t yield an empty conn_str; reference
get_vault_secret and require_option usage in MysqlFdw::new).
---
Duplicate comments:
In `@wrappers/src/fdw/mysql_fdw/mysql_fdw.rs`:
- Around line 153-159: The WHERE construction currently uses
Qual::deparse_with_fmt (called in the quals.iter() map) which emits field and
operator verbatim; replace this by building safe conditions: for each Qual in
quals, validate/quote the identifier (use or extend MysqlCellFormatter to
provide a quote_identifier method) and whitelist/validate the operator instead
of interpolating it, produce a condition with a parameter placeholder (e.g.,
"quoted_field {op} ?") and collect the corresponding value into a params vector;
update the code around quals.iter() (the closure generating cond and the
MysqlCellFormatter usage) to call the new safe quoting/validation routines and
to return both the condition string and parameter(s) rather than using
deparse_with_fmt directly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8672de56-1845-422a-96fd-2094811dd4b7
📒 Files selected for processing (1)
wrappers/src/fdw/mysql_fdw/mysql_fdw.rs
|
Thanks for the PR! Could you fix the issues raised in CI first? |
Uses mysql_async crate with text protocol (simple_query) instead of prepared statements, which makes it compatible with Doris and other MySQL-protocol databases that don't support mysql_stmt_result_metadata. Based on mssql_fdw template with MySQL-specific adaptations: - MySQL LIMIT without requiring ORDER BY - String-based date/time parsing for broader compatibility - Support for conn_string and conn_string_id (vault) options
- Add mysql_fdw to native_fdws feature list in Cargo.toml - Preserve SQLSTATE codes for OptionsError/CreateRuntimeError and sanitize mysql_async error messages to avoid leaking credentials - Propagate datetime/date parse errors instead of silently returning NULL - Quote SQL identifiers with backticks to prevent injection and reserved word conflicts
Add MysqlCellFormatter that escapes single quotes and backslashes in string literals, and formats booleans as 0/1 for MySQL compatibility. Use deparse_with_fmt instead of deparse to apply the safe formatter when building WHERE predicates.
|
@burmecia ping |
…in WHERE, and DML support - Replace get_vault_secret().unwrap_or_default() with proper error propagation so missing/failed vault lookups surface as errors instead of empty conn_str - Add custom deparse_qual() that quotes field names with backticks in WHERE clauses, matching the quoting used in SELECT columns and ORDER BY - Implement DML methods (begin_modify, insert, update, delete, end_modify) using the same runtime/connection pattern as begin_scan
|
@burmecia CI issues have been fixed. The clippy |
Summary
mysql_fdw) that supports querying MySQL and MySQL-protocol compatible databases (e.g., Apache Doris)mysql_asynccrate with text protocol (simple_query) instead of prepared statements for broader compatibility with databases that don't supportmysql_stmt_result_metadatamssql_fdwtemplate with MySQL-specific adaptations: LIMIT without ORDER BY, string-based date/time parsing, support forconn_stringandconn_string_id(Vault) options