Backend response serialization coerces missing values to empty strings, losing null semantics #220

@nodesagar

Summary

The shared backend response serializer currently collapses missing values into empty strings before returning API payloads. This happens in dataframe_to_response() in dataloom-backend/app/utils/pandas_helpers.py, where the DataFrame is normalized with fillna("").

That means API consumers cannot distinguish a true missing value (null) from a real empty string ("").

Why this matters

This affects every endpoint that uses the shared serializer, including project fetch/upload responses and transform responses. It also creates downstream ambiguity for:

  • profiling and null-count correctness
  • data-quality checks
  • formula behavior
  • export/report fidelity
  • frontend rendering and edit flows that should preserve the difference between null and ""
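As a rough illustration of the null-count point, the sketch below compares missing-value counts before and after a `fillna("")` pass. The sample series is made up for the example and is not taken from the project.

```python
import pandas as pd

# Illustrative data: one real value, one missing value, one real empty string.
s = pd.Series(["alice", None, ""])

nulls_before = int(s.isna().sum())             # missing values in the source data
filled = s.fillna("")                          # same normalization the issue describes
nulls_after = int(filled.isna().sum())         # missing values a consumer could still detect
blanks_after = int((filled == "").sum())       # empty strings after normalization

print(nulls_before, nulls_after, blanks_after)
```

After the fill, the one missing value is counted as an empty string, so any profiling built on the response would report zero nulls and two blanks.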

Current behavior

A DataFrame like this:

  • None in one cell
  • "" in another cell

is serialized so both values become "" in the API response.
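A minimal reproduction of the behavior described above, assuming the serializer's normalization step amounts to `fillna("")` followed by a conversion to records. The exact body of `dataframe_to_response()` is not shown here, so this only mimics the relevant step.

```python
import pandas as pd

# One cell holds None (missing), another holds a real empty string.
df = pd.DataFrame({"name": ["alice", None, ""]})

# Assumed equivalent of the normalization in the shared serializer.
records = df.fillna("").to_dict(orient="records")

print(records)
```

Rows two and three come out identical, so the payload no longer records which cell was actually missing.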

Expected behavior

  • Missing values should remain null in API responses
  • Real empty strings should remain ""
  • Any display-layer blank rendering should happen in the frontend, not in backend serialization

Suggested direction

Normalize response values at the serializer boundary instead of calling fillna(""): convert pandas missing markers (None, NaN, NaT) to JSON null and leave real empty strings untouched, so the payload stays JSON-safe while preserving null semantics.

This is a small backend-only change, but it improves correctness for future profiling, quality, formula, and export work.
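One possible shape for that normalization is sketched below. The helper names `to_json_safe` and `dataframe_to_records` are hypothetical, and the real `dataframe_to_response()` may return a different structure; the sketch only shows per-value conversion of missing markers to `None`.

```python
import json

import numpy as np
import pandas as pd


def to_json_safe(value):
    """Hypothetical helper: map missing markers to None, keep "" as-is."""
    # pd.isna covers None, NaN, NaT and pd.NA for scalar inputs.
    if pd.api.types.is_scalar(value) and pd.isna(value):
        return None
    # Unwrap NumPy scalars so the standard json module can encode them.
    if isinstance(value, np.generic):
        return value.item()
    return value


def dataframe_to_records(df):
    """Hypothetical stand-in for the serializer's record-building step."""
    return [
        {col: to_json_safe(val) for col, val in row.items()}
        for row in df.to_dict(orient="records")
    ]


df = pd.DataFrame({"name": ["alice", None, ""], "score": [1.5, np.nan, 3.0]})
records = dataframe_to_records(df)
payload = json.dumps(records)
print(payload)
```

With this approach the missing cells serialize as JSON `null` while the empty string in the third row stays `""`, and the frontend can decide how to render each case.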
