Skip to content

Comments

finished adding type aware parser for time and timstamp#37965

Open
manasagar wants to merge 5 commits intoapache:masterfrom
manasagar:37437
Open

finished adding type aware parser for time and timstamp#37965
manasagar wants to merge 5 commits intoapache:masterfrom
manasagar:37437

Conversation

@manasagar
Copy link
Contributor

Fixes #37437

Before committing this PR, I'm sure that I have checked the following options:

  • My code follows the code of conduct of this project.
  • I have self-reviewed the commit code.
  • I have (or in comment I request) added corresponding labels for the pull request.
  • I have passed maven check locally : ./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e.
  • I have made corresponding changes to the documentation.
  • I have added corresponding unit tests for my changes.
  • I have updated the Release Notes of the current development version. For more details, see Update Release Note

Changes proposed in this pull request:

In the PostgreSQL text protocol write path (PostgreSQLDataRowPacket#writeTextValue), used type-aware formatting for timestamp/timestamptz/date/time instead of generic toString().

PostgreSQL 17 Temporal Data Behavior

Data Type Input Output
TIME '10:00:00' 10:00:00
TIME '10:00:00.0' 10:00:00
TIME '10:00:00.1' 10:00:00.1
TIMETZ '10:00:00+05:30' 10:00:00+05:30
TIMETZ '10:00:00.1+05:30' 10:00:00.100000+05:30
TIMETZ '10:00:00.0' 10:00:00
TIMESTAMP '2026-02-04 10:00:00' 2026-02-04 10:00:00
TIMESTAMP '2026-02-04 10:00:00.0' 2026-02-04 10:00:00
TIMESTAMP '2026-02-04 10:00:00.1' 2026-02-04 10:00:00.1
TIMESTAMPTZ '2026-02-04 10:00:00.0+05:30' 2026-02-04 04:30:00+00
TIMESTAMPTZ '2026-02-04 10:00:00.1+05:30' 2026-02-04 04:30:00.1+00

Key Points:

  • .0 fractional seconds are truncated
  • TIMETZ pads fractional seconds to 6 digits
  • TIMESTAMPTZ converts to UTC when timezone is specified

Copy link
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the effort to make the text protocol type-aware and for adding initial time-related tests—this is the right direction.

Change requests:

  • Root cause still unresolved: the proxy returns java.sql.Timestamp, which still goes through each.toString() and keeps the .0 suffix. Issue #37437 remains.
  • Time zone semantics are broken: the OffsetDateTime branch forces withOffsetSameInstant(UTC) and emits +00, diverging from PostgreSQL’s session time zone output and risking visible offsets.
  • Trailing-zero handling is inconsistent: OffsetTime always pads to 6 decimals, so 10:00:00.0+05:30 becomes 10:00:00.000000+05:30; LocalTime/LocalDateTime precision and trimming still don’t align with PostgreSQL rules.
  • Type detection: relying on instanceof misses the main Timestamp path from JDBC. The formatter should pick behavior based on column type, not just runtime class.
  • Tests don’t cover the failing scenario (timestamp without fractional seconds) or timestamptz/timetz behavior, so the fix isn’t demonstrated and regressions aren’t guarded.
  • Please keep ResultSetUtils untouched for this issue; the correct place to fix is the PostgreSQL text protocol formatter, not the generic JDBC conversion utilities.

Recommended next steps:

  1. In PostgreSQLDataRowPacket.writeTextValue, format timestamp/timestamptz/time/timetz per PostgreSQL text protocol: max 6 fractional digits, trim trailing zeros, preserve session time zone.
  2. Base formatting on column type, not only runtime instanceof, so Timestamp values are handled.
  3. Add tests for the reported case and for timestamptz/timetz outputs to prove the fix.

Please address these before we merge; happy to review the next revision.

@manasagar manasagar force-pushed the 37437 branch 2 times, most recently from 274ff90 to ee5134d Compare February 10, 2026 08:48
@manasagar
Copy link
Contributor Author

manasagar commented Feb 10, 2026

Thanks for the effort to make the text protocol type-aware and for adding initial time-related tests—this is the right direction.

Change requests:

  • Root cause still unresolved: the proxy returns java.sql.Timestamp, which still goes through each.toString() and keeps the .0 suffix. Issue PostgreSQL proxy: timestamp output adds trailing ".0" compared to native PostgreSQL #37437 remains.
  • Time zone semantics are broken: the OffsetDateTime branch forces withOffsetSameInstant(UTC) and emits +00, diverging from PostgreSQL’s session time zone output and risking visible offsets.
  • Trailing-zero handling is inconsistent: OffsetTime always pads to 6 decimals, so 10:00:00.0+05:30 becomes 10:00:00.000000+05:30; LocalTime/LocalDateTime precision and trimming still don’t align with PostgreSQL rules.
  • Type detection: relying on instanceof misses the main Timestamp path from JDBC. The formatter should pick behavior based on column type, not just runtime class.
  • Tests don’t cover the failing scenario (timestamp without fractional seconds) or timestamptz/timetz behavior, so the fix isn’t demonstrated and regressions aren’t guarded.
  • Please keep ResultSetUtils untouched for this issue; the correct place to fix is the PostgreSQL text protocol formatter, not the generic JDBC conversion utilities.

Recommended next steps:

  1. In PostgreSQLDataRowPacket.writeTextValue, format timestamp/timestamptz/time/timetz per PostgreSQL text protocol: max 6 fractional digits, trim trailing zeros, preserve session time zone.
  2. Base formatting on column type, not only runtime instanceof, so Timestamp values are handled.
  3. Add tests for the reported case and for timestamptz/timetz outputs to prove the fix.

Please address these before we merge; happy to review the next revision.
@terrymanu thanks for the feedback

I have implemented the column value based type checking .
When looking into ResultSetUtil I saw that Timstamp from db can be converted into LocalDateTime or TimeStamp or String so I also added runtime formatter along with the column type to make sure correct formatting is applied correctly. I have also changed the parser for OffsetTime and OffsetDateTime solving the issues that previously occured(6 trailing zeroes and adding 00 respectively) .

Copy link
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot merge yet.

Below is a ready-to-post English review comment:

Thanks for the update. I reviewed the latest version (head 3fd4977) and compared it with the previous CHANGES_REQUESTED items.

Merge Verdict: Not Mergeable

Reviewed Scope:

  • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java
  • database/protocol/dialect/postgresql/src/test/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacketTest.java
  • proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/simple/PostgreSQLComQueryExecutor.java
  • proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/extended/Portal.java
  • proxy/frontend/dialect/opengauss/src/main/java/org/apache/shardingsphere/proxy/frontend/opengauss/command/query/simple/OpenGaussComQueryExecutor.java

Compliance Gate: PASS
Rule ID: RULE-COD-062
Rule Source: CODE_OF_CONDUCT.md:62
Commands:

  1. rg '\.java$' /tmp/pr37965_files_all.txt (exit 0)
  2. rg -nP '\([^)]*\bOptional<[^>]+>\s+\w+[^)]*\)' <changed_java_files> (exit 1, no matches)

Major issues:

  1. Type-domain mismatch (root cause still not reliably fixed):
  • columnTypes comes from JDBC types (QueryHeader.getColumnType()), but formatter branches compare PostgreSQL OIDs.
  • Evidence: proxy/backend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/backend/postgresql/response/header/query/PostgreSQLQueryHeaderBuilder.java:41
  • Evidence: database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:119
  • This can bypass the new timestamp/time formatting paths and fall back to toString().
  1. Potential NPE in extended-query execute path:
  • columnTypes is set in describe() but used in row packet generation.
  • Evidence: proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/extended/Portal.java:134
  • Evidence: database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:81
  • If execute runs without describe, columnTypes can be null.
  1. TIMESTAMPTZ still forces UTC for Timestamp:
  • Evidence: database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:138
  • This still conflicts with the prior request to preserve session time-zone semantics.

Status of previous CHANGES_REQUESTED items:

  • Root cause (Timestamp formatting path): Not fixed
  • Time-zone semantics: Partially fixed
  • Trailing-zero handling: Mostly fixed
  • Column-type-based formatting: Partially fixed (implementation issue above)
  • Tests for failing scenarios: Partially fixed (unit coverage improved, but real type path mismatch not covered)
  • Keep ResultSetUtils untouched: Fixed

Latest commit changes (3fd4977):

  • Added column-type-aware formatting in PostgreSQLDataRowPacket.
  • Added/expanded formatter tests in PostgreSQLDataRowPacketTest.
  • Passed columnTypes from PostgreSQL/openGauss query executors and portal to data row packet.

Please address the three blocking issues above, then I can re-review quickly.

@manasagar
Copy link
Contributor Author

manasagar commented Feb 16, 2026

@terrymanu sorry about the delay
Type domain mismatch — Switching to java.sql.Types constants resolves the JDBC-vs-OID mismatch. This should now correctly route timestamp/time values through the formatting logic.
Fixed in: database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java
NPE risk — Moving columnTypes initialization into execute() ensures the array is always populated before row packet generation, eliminating the null reference path.
Fixed in: proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/extended/Portal.java
Time zone handling — Changing from ZoneOffset.UTC to ZoneId.systemDefault() is a step in the right direction.
Changed in: database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:138 .I am unclear what you mean incorrect processing of timestamp for timestamptz .
Timestamp does not possess any information regarding timezones so I decided to use systemDefault but this seems to be causing it to fail on some testcases the alternative would be to simply add +00:00 every time their is a timestamp in timstamptz. Please confirm which method to utilize I have the alternative ready.

@manasagar
Copy link
Contributor Author

Currently failing because of the environment CI/ Cl not allowing systemdefault

Copy link
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update and for addressing the JDBC type-domain mismatch in the latest head (999d519).
The direction is correct, but there are still merge blockers.

Merge Verdict: Not Mergeable

Please address the following before merge:

  1. CI is still failing on the latest head.

  2. Time zone semantics are still not stable for timestamptz/timetz formatting.

    • Current code uses system default zone and current date:
      • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:129
      • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:141
      • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:145
    • This can vary by host timezone/DST and may diverge from session-timezone behavior.
    • Please switch to session-timezone-aware formatting.
  3. There is a protocol consistency risk when columnTypes is shorter than data.

    • Column count is written as data.size(), but the row loop breaks early if column types are exhausted:
      • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:82
      • database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:90
    • Please avoid early break and use a safe fallback type instead.
  4. Tests are still not mapped one-to-one with all new fix points.

    • The new columnTypes propagation in simple/extended/openGauss paths is not fully validated with dedicated regression assertions.
    • Please add targeted tests for:
      • execute-before-describe path in Portal
      • columnTypes/data size mismatch handling
      • session-timezone output behavior for timestamptz/timetz

Once these are fixed and CI is green, I can re-review quickly.

Copy link
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I reviewed the latest head, and the overall direction is good, but there are still blockers before merge.

Merge Verdict: Not Mergeable

What is in the right direction

  • The fix now targets the real root-cause path for Timestamp text formatting (.0 suffix), not only fallback logic.
  • JDBC type-domain alignment and columnTypes propagation in simple/extended paths are improved.
  • The previous execute-before-describe null-path risk is addressed.
  • CI status for the latest head is green, which is a good step.

Change requests

  1. Please fix session timezone parsing logic before merge.
  • Current timezone extraction uses ZoneId.getAvailableZoneIds().contains(tzName) in:
    • proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/simple/PostgreSQLComQueryExecutor.java
    • proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/extended/Portal.java
    • proxy/frontend/dialect/opengauss/src/main/java/org/apache/shardingsphere/proxy/frontend/opengauss/command/query/simple/OpenGaussComQueryExecutor.java
  • This rejects valid PostgreSQL-style offset values such as +05:30, and falls back to UTC, which can break TIMESTAMPTZ/TIMETZ semantics.
  • Please parse timezone directly with safe fallback (including quoted values and offset forms), instead of filtering by available zone ID set.
  1. Please make the session-timezone source chain explicit and reliable.
  • The current flow still does not fully prove that PostgreSQL/openGauss session timezone is consistently recorded and consumed for row formatting.
  • Please either:
    • complete the replay/record path for timezone variables, or
    • switch to an existing authoritative session-timezone source in connection/session context.
  • Also please add targeted regression tests for offset timezone values (for example +05:30) to prevent fallback-to-UTC regressions.

Newly introduced issue (needs fix or rollback)

  • The new timezone extraction implementation introduced a compatibility regression risk by rejecting valid offset timezone values and defaulting to UTC.
  • Please fix this behavior or roll back this part.

Please continue refining this PR. Once the above items are addressed, I can do a focused re-review quickly.

@manasagar
Copy link
Contributor Author

manasagar commented Feb 19, 2026

@terrymanu ## Update Summary

This PR has been updated with significant enhancements in the latest commit (f8c11bf):

Key Changes:

  • Added replay provider support for PostgreSQL and OpenGauss: Implemented comprehensive replay functionality to improve database compatibility and operation handling
  • Enhanced session time zone ID management: Developed robust handling for both offset-based and location-based time zone identifiers, ensuring accurate temporal data processing across different database systems
  • Comprehensive test coverage: Added extensive test cases to validate the new time zone function

@manasagar
Copy link
Contributor Author

I have added seperate replay for both openGauss and postgres right now it only propogate timestamp due to the requirement

Copy link
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decision

  • Merge Verdict: Not Mergeable
  • Reviewed Scope: database/protocol/dialect/postgresql/.../PostgreSQLDataRowPacket.java, database/protocol/dialect/postgresql/.../PostgreSQLDataRowPacketTest.java, proxy/frontend/dialect/postgresql/.../PostgreSQLComQueryExecutor.java, proxy/frontend/dialect/
    postgresql/.../Portal.java, proxy/frontend/dialect/opengauss/.../OpenGaussComQueryExecutor.java, proxy/backend/core/.../RequiredSessionVariableRecorder.java, new PostgreSQL/openGauss replay providers + their tests/resources.
  • Not Reviewed Scope: live PostgreSQL/openGauss end-to-end runtime behavior with real server/session timezone settings.
  • Need Expert Review: Yes (PostgreSQL protocol/timezone semantics).

Positive direction is clear: the root .0 formatting path for Timestamp is directly addressed, CI is green, and targeted tests pass.

Major Issues

  1. TIMESTAMPTZ/TIMETZ session-timezone semantics are still not proven/correct on the real driver path.
  • Evidence: OffsetDateTime is formatted directly without session-zone normalization in database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:145.
  • Evidence: session timezone is sourced only from replayed session vars in proxy/frontend/dialect/postgresql/src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/simple/PostgreSQLComQueryExecutor.java:140, proxy/frontend/dialect/postgresql/
    src/main/java/org/apache/shardingsphere/proxy/frontend/postgresql/command/query/extended/Portal.java:235, proxy/frontend/dialect/opengauss/src/main/java/org/apache/shardingsphere/proxy/frontend/opengauss/command/query/simple/OpenGaussComQueryExecutor.java:140.
  • Risk: PGJDBC docs say returned OffsetDateTime is UTC; output can stay UTC instead of session timezone.
  • Recommendation: normalize tz values to session zone (or explicitly keep UTC with documented compatibility rationale) and prove with integration tests.
  1. Tz-branch routing depends on JDBC types that may not be emitted as expected by PGJDBC.
  • Evidence: formatter branches rely on Types.TIMESTAMP_WITH_TIMEZONE/Types.TIME_WITH_TIMEZONE in database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:129 and database/
    protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:141.
  • Evidence: upstream value loading falls back to generic getObject for non-covered JDBC types in database/connector/core/src/main/java/org/apache/shardingsphere/database/connector/core/resultset/ResultSetMapper.java:93.
  • Inference from source: PGJDBC may still expose TIMESTAMPTZ metadata as Types.TIMESTAMP (issue #1766), which can bypass intended tz-specific formatting.
  • Recommendation: add contract/integration tests that start from ResultSetMetaData + ResultSetMapper and assert final wire text for timestamptz/timetz.

Newly Introduced Risk

  1. Per-value timezone parsing in hot row-write path.
  • Evidence: repeated zone parsing/checks in database/protocol/dialect/postgresql/src/main/java/org/apache/shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:217 and database/protocol/dialect/postgresql/src/main/java/org/apache/
    shardingsphere/database/protocol/postgresql/packet/command/query/PostgreSQLDataRowPacket.java:222.
  • Recommendation: parse/cache session ZoneId once per packet.

Multi-Round Comparison

  • Fixed: column-types propagation + execute-before-describe null-path; CI now green.
  • Partially fixed: timezone offset parsing/replay wiring.
  • Unresolved: real-driver tz semantics for timestamptz/timetz.
  • New risk: per-row zone parsing overhead.

@RaigorJiang
Copy link
Contributor

Hi @manasagar
Since version 5.5.3 is about to be released, I will update the milestone of this PR to the next version. Thank you!

@RaigorJiang RaigorJiang modified the milestones: 5.5.3, 5.5.4 Feb 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PostgreSQL proxy: timestamp output adds trailing ".0" compared to native PostgreSQL

4 participants