Unify applied transform semantics: main table, working CSV, and GET reload (filter / sort / query / pivot) #238
Description
Problem
Some operations (filter, sort, advQueryFilter, pivotTables) return updated columns / rows from POST /projects/{id}/transform when should_save == false, so the working copy CSV may stay unchanged. Different UI flows also update the main table inconsistently (some operations push results into the grid via onTransform / context; others only show a preview or partial UI).
After a reload, GET /projects/get/{project_id} reads from disk, so users can see data that does not match what they thought they had applied—and CSV export (from the working file) can disagree with what they last saw in the app. None of this is spelled out as an intentional rule today.
Example (reproducible)
- Upload a CSV with a column you can filter on (e.g. status with values active / inactive).
- Open Filter, apply e.g. status = active.
- Observe the UI:
  - The filtered result may appear in a preview while the main table still shows all rows, or the table may reflect the filter, depending on the code path. Either way, the experience is inconsistent across transforms.
- Refresh the browser (full reload).
- GET /projects/get/{id} (what the app loads after refresh) returns the on-disk working copy, typically the full, unfiltered dataset for this class of operations.
- Export CSV: the file matches the unfiltered working copy, not the last filtered view.
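The reproduction can also be modeled outside the app. This is a minimal sketch, not the project's code: apply_filter and load_working_copy are hypothetical stand-ins for POST /projects/{id}/transform and GET /projects/get/{id}, and should_save is assumed to be the only thing that gates the write to the working CSV.

```python
import csv
import os
import tempfile

FIELDS = ["id", "status"]


def write_working_copy(path, rows):
    """Overwrite the working CSV on disk."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def load_working_copy(path):
    """Hypothetical stand-in for GET /projects/get/{id}: always reads disk."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def apply_filter(path, column, value, should_save):
    """Hypothetical stand-in for a filter sent to POST .../transform."""
    rows = load_working_copy(path)
    result = [r for r in rows if r[column] == value]
    if should_save:  # assumption: this flag alone decides persistence
        write_working_copy(path, result)
    return result  # returned to the UI either way


def demo(should_save):
    """Return (rows shown after Apply, rows seen after reload)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        write_working_copy(path, [
            {"id": "1", "status": "active"},
            {"id": "2", "status": "inactive"},
            {"id": "3", "status": "active"},
        ])
        shown = apply_filter(path, "status", "active", should_save)
        reloaded = load_working_copy(path)
        return len(shown), len(reloaded)
    finally:
        os.remove(path)
```

With should_save=False, demo returns (2, 3): the user saw two rows after Apply, but reload and export see three. With should_save=True it returns (2, 2).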
Expected (product decision, pick one and enforce):
- Persist: Filter/sort/query/pivot (or a defined subset) are written to the working copy and survive reload and export; or
- Ephemeral: They never touch disk, but the UI clearly states they are temporary and reload resets to disk state with no surprise.
Right now behavior sits in between, which is confusing and hard to combine with checkpoints/logs (#224, #166, #49).
Suspected cause
- Transform results are applied inconsistently between frontend routes (preview-only vs updating shared table state).
- should_save is not a single, user-visible contract: some ops skip persisting to the working CSV while the UI does not always say so.
- No single source of truth: the “current dataset” is split between in-memory / preview state, GET from disk, and export; they can diverge after the same user action.
Proposal
Agree on one explicit contract (persist vs ephemeral vs hybrid with e.g. “Commit to dataset”), document it, then align backend (should_save, logging) and frontend (always update shared state + messaging) and add tests.
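One way to make the contract explicit in code is sketched below. Everything here is illustrative: Contract, TransformResult, and apply_transform are hypothetical names, the working copy is modeled as a plain list instead of a CSV file, and the notice strings are placeholders for whatever copy the maintainers choose.

```python
from dataclasses import dataclass
from enum import Enum


class Contract(Enum):
    PERSIST = "persist"      # every Apply is written to the working copy
    EPHEMERAL = "ephemeral"  # Apply never touches the working copy
    HYBRID = "hybrid"        # preview first, explicit "Commit to dataset"


@dataclass
class TransformResult:
    rows: list
    persisted: bool
    notice: str  # user-facing message, so the UI always says what happened


def apply_transform(working_copy, transform, contract, commit=False):
    """Apply transform under the given contract.

    working_copy is a list standing in for the on-disk CSV; it is
    mutated in place only when the contract says the result persists.
    """
    rows = list(transform(list(working_copy)))
    persist = contract is Contract.PERSIST or (
        contract is Contract.HYBRID and commit
    )
    if persist:
        working_copy[:] = rows
        return TransformResult(rows, True, "Saved to dataset.")
    return TransformResult(
        rows, False, "Temporary view: reload resets to the saved dataset."
    )
```

Under this shape, should_save stops being a per-operation detail: the persistence decision is taken in one place, and the response always carries a flag and a message the frontend can show.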
Goals
- Main table behavior is consistent after Apply across filter, sort, advanced query, pivot (and matches the chosen contract).
- Reload / GET and export match that contract and are documented.
- PRs reference related checkpoint work where needed.
Non-goals
- Fixing every checkpoint edge case in one PR (see related issues).
- Redesigning null JSON handling (see #220, "Backend response serialization coerces missing values to empty strings, losing null semantics"), beyond making sure this work does not make it worse.
Suggested phases
- Audit: map each operation_type → should_save → which components call onTransform / updateData.
- Maintainer/user-visible decision: persist vs ephemeral (short doc in README or CONTRIBUTING).
- Frontend: one consistent path from transform response → table state (+ clear copy if ephemeral).
- Backend: if persisting, adjust writes/logging; add integration coverage (Integration Tests for Transform Endpoint #91).
- Test: e.g. apply → reload → assert rows/columns per contract.
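The test phase (apply → reload → assert) could take roughly the following shape. This is a sketch under assumptions: check_contract is a hypothetical helper that accepts plain callables, so it could later be pointed at a real API client, and FakeBackend is an in-memory stand-in, not the project's backend.

```python
def check_contract(apply, reload, export, persist):
    """Apply one transform, then compare reload and export to the contract.

    apply, reload and export are callables returning lists of rows.
    persist is the agreed contract: True means an applied transform
    survives reload and export, False means it never does.
    """
    before = reload()
    shown = apply()
    after_reload, exported = reload(), export()
    expected = shown if persist else before
    assert after_reload == expected, "reload disagrees with contract"
    assert exported == expected, "export disagrees with contract"
    return True


class FakeBackend:
    """In-memory stand-in for the working copy (illustration only)."""

    def __init__(self, rows, persist):
        self.disk = list(rows)
        self.persist = persist

    def filter_active(self):
        result = [r for r in self.disk if r["status"] == "active"]
        if self.persist:
            self.disk = list(result)
        return result

    def read(self):
        return list(self.disk)
```

A backend that persists passes with persist=True, one that does not passes with persist=False, and any mismatch between the backend's behavior and the documented contract fails with an AssertionError.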
Suggested direction: Centralize transform application (one pipeline or single app-level dataset state) used by the main table, CSV export, and whatever GET reload hydrates—so they cannot silently disagree.
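As a rough illustration of that direction, the sketch below keeps one dataset state and lets every consumer read from it. DatasetStore and its methods are hypothetical names. The sketch is written in Python to stay self-contained, although the real app-level state would presumably live in the frontend or behind the API.

```python
import csv
import io


class DatasetStore:
    """Single owner of the current dataset (illustrative sketch)."""

    def __init__(self, rows, fieldnames):
        self._rows = [dict(r) for r in rows]
        self._fieldnames = list(fieldnames)
        self._listeners = []

    def subscribe(self, listener):
        """Register a consumer (e.g. the main table) and send current rows."""
        self._listeners.append(listener)
        listener(self.snapshot())

    def snapshot(self):
        """Return a copy so consumers cannot mutate the shared state."""
        return [dict(r) for r in self._rows]

    def apply(self, transform):
        """The only way to change the dataset; every consumer is notified."""
        self._rows = [dict(r) for r in transform(self.snapshot())]
        for listener in self._listeners:
            listener(self.snapshot())

    def export_csv(self):
        """Export reads the same rows the table was just given."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=self._fieldnames)
        writer.writeheader()
        writer.writerows(self._rows)
        return buf.getvalue()
```

Because the table, the export, and any reload hydration all go through the same object, a transform cannot update one of them without the others seeing the same rows.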
Acceptance criteria
- Documented contract (merged or approved in this issue).
- Main table aligned with API after Apply for filter / sort / query / pivot per contract.
- Reload + export behavior matches contract; ≥1 automated test.
- Related issues (#200, #224, #220, #166, #49, #91) are handled in follow-up PRs as needed.
Related issues
- [Feature] On clicking the column button sorting should apply like there is in excel #200 — column sort UX
- Repeated saves drop previously checkpointed transforms by rebuilding from only pending logs #224 — checkpoint / save replay
- Backend response serialization coerces missing values to empty strings, losing null semantics #220 — null serialization in API
- [Bug] Revert to original does not clear pending unapplied logs, subsequent save re-applies stale transformations #166, [Bug] Revert doesn't clear unapplied logs — saving after revert replays reverted transformations #49 — revert / pending logs
- Integration Tests for Transform Endpoint #91 — integration tests