Skip to content

feat: Add DataFrame.from_dict#2885

Merged
dangotbanned merged 9 commits intomainfrom
dataframe-from-dict
Jul 26, 2025
Merged

feat: Add DataFrame.from_dict#2885
dangotbanned merged 9 commits intomainfrom
dataframe-from-dict

Conversation

@dangotbanned
Copy link
Copy Markdown
Member

@dangotbanned dangotbanned commented Jul 25, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

  • Resolve [var-annotated] by not using Self (695f9ac)

Possible Follow-ups

@dangotbanned dangotbanned added the enhancement New feature or request label Jul 25, 2025
```
"""

_version: ClassVar[Version] = Version.MAIN
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli

Doing this is gonna be quite handy for the rest of (#2116)

Since we're already inside the class, we can just start from the known version there

ns = cls._version.namespace.from_backend(implementation).compliant

instead of using _stablify afterwards 😄

ns = Version.MAIN.namespace.from_backend(implementation).compliant

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@dangotbanned dangotbanned marked this pull request as ready for review July 25, 2025 18:42
@dangotbanned dangotbanned mentioned this pull request Jul 25, 2025
6 tasks
dangotbanned added a commit that referenced this pull request Jul 25, 2025
Copy link
Copy Markdown
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned !

just left a comment

Comment on lines +511 to +521
if is_eager_allowed(implementation):
ns = cls._version.namespace.from_backend(implementation).compliant
compliant = ns._dataframe.from_dict(data, schema=schema, context=ns)
return cls(compliant, level="full")
# NOTE: (#2786) needs resolving for extensions
msg = (
f"{implementation} support in Narwhals is lazy-only, but `DataFrame.from_dict` is an eager-only function.\n\n"
"Hint: you may want to use an eager backend and then call `.lazy`, e.g.:\n\n"
f" nw.DataFrame.from_dict({{'a': [1, 2]}}, backend='pyarrow').lazy('{implementation}')"
)
raise ValueError(msg)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm confused as to why is this necessary - if we have a DataFrame class, aren't we already eager?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just adapting the existing error in nw.from_dict

msg = (
f"{implementation} support in Narwhals is lazy-only, but `from_dict` is an eager-only function.\n\n"
"Hint: you may want to use an eager backend and then call `.lazy`, e.g.:\n\n"
f" nw.from_dict({{'a': [1, 2]}}, backend='pyarrow').lazy('{implementation}')"
)

Did you want this version to do something different if we wrote?

import narwhals as nw

nw.DataFrame.from_dict({'a': [1, 2]}, backend="pyspark")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see sorry, this is indeed needed and correct, thanks!

my only other comment would be - do we want to repeat the docstring, or do like numpy and say "see [...] for full docstring"? probably fine to repeat?

https://numpy.org/doc/stable/reference/generated/numpy.ndarray.std.html

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"see [...] for full docstring"? probably fine to repeat?

Definitely repeat!

I'm guessing even to get that doc you'd need to not have type stubs or have a jupyter kernel running?

image

I'm always gonna feel strongly about IDE-experience-first-docs, since that's how I consume them 🙂

Sadly useful docs + static typing often mean we need to repeat ourselves.
But putting in that extra effort is something I really appreciate from polars vs pandas

Copy link
Copy Markdown
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs adding to dataframe.md in the docs, right?

other than that, looks good

@dangotbanned
Copy link
Copy Markdown
Member Author

this needs adding to dataframe.md in the docs, right?

oop yeah will do that now!

@dangotbanned dangotbanned marked this pull request as draft July 26, 2025 13:55
@dangotbanned
Copy link
Copy Markdown
Member Author

(#2885 (comment))
Just need to fix check_api_reference.py to handle classmethod

Preview of the doc

image

@dangotbanned dangotbanned marked this pull request as ready for review July 26, 2025 14:07
@dangotbanned dangotbanned merged commit bdcf661 into main Jul 26, 2025
33 checks passed
@dangotbanned dangotbanned deleted the dataframe-from-dict branch July 26, 2025 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants