fix: missing `__getitem__` type fixes by EdAbati · Pull Request #1963 · narwhals-dev/narwhals

EdAbati · 2025-02-07T17:54:17Z

What type of PR is this? (check all applicable)

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

I am so sorry @MarcoGorelli, I was a bit too quick to press "Ready for Review" 😭

This is a follow-up to #1958

EdAbati · 2025-02-07T17:55:02Z

narwhals/dataframe.py

    def __getitem__(
        self: Self,
-        key: (
+        item: (


silly mistake (I blame copilot) 🙈

I wonder why it was not picked up though

narwhals/dataframe.py

tests/frame/get_column_test.py

MarcoGorelli · 2025-02-08T09:09:36Z

I haven't got permissions

you do now 😉 feel free to make improvements, thanks for all your help + evangelism of Narwhals

dangotbanned · 2025-02-08T11:13:28Z

I haven't got permissions

you do now 😉 feel free to make improvements, thanks for all your help + evangelism of Narwhals

Oh hello 😊
Thanks @MarcoGorelli!

Resolves narwhals-dev#1963 (comment)

dangotbanned · 2025-02-08T12:52:08Z

Maybe this is too much for the PR.
But from someone who doesn't use __getitem__ - replacing these types with the polars aliases would be much easier to read IMO.

# Annotations for `__getitem__` methods
SingleIndexSelector: TypeAlias = int
MultiIndexSelector: TypeAlias = Union[
    slice,
    range,
    Sequence[int],
    "Series",
    "np.ndarray[Any, Any]",
]
SingleNameSelector: TypeAlias = str
MultiNameSelector: TypeAlias = Union[
    slice,
    Sequence[str],
    "Series",
    "np.ndarray[Any, Any]",
]
BooleanMask: TypeAlias = Union[
    Sequence[bool],
    "Series",
    "np.ndarray[Any, Any]",
]
SingleColSelector: TypeAlias = Union[SingleIndexSelector, SingleNameSelector]
MultiColSelector: TypeAlias = Union[MultiIndexSelector, MultiNameSelector, BooleanMask]

Still ends up pretty complex in the @overload(s), but much easier to differentiate the parts:

    # `str` overlaps with `Sequence[str]`
    # We can ignore this but we must keep this overload ordering
    @overload
    def __getitem__(
        self, key: tuple[SingleIndexSelector, SingleColSelector]
    ) -> Any: ...

    @overload
    def __getitem__(  # type: ignore[overload-overlap]
        self, key: str | tuple[MultiIndexSelector, SingleColSelector]
    ) -> Series: ...

    @overload
    def __getitem__(
        self,
        key: (
            SingleIndexSelector
            | MultiIndexSelector
            | MultiColSelector
            | tuple[SingleIndexSelector, MultiColSelector]
            | tuple[MultiIndexSelector, MultiColSelector]
        ),
    ) -> DataFrame: ...

    def __getitem__(
        self,
        key: (
            SingleIndexSelector
            | SingleColSelector
            | MultiColSelector
            | MultiIndexSelector
            | tuple[SingleIndexSelector, SingleColSelector]
            | tuple[SingleIndexSelector, MultiColSelector]
            | tuple[MultiIndexSelector, SingleColSelector]
            | tuple[MultiIndexSelector, MultiColSelector]
        ),
    ) -> DataFrame | Series | Any:

@MarcoGorelli do you see these aliases as a desirable thing to steal (re-implement) from polars?
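To make the overload ordering above concrete, here is a minimal, self-contained sketch. `ToyFrame` and `ToySeries` are hypothetical stand-ins (not narwhals or polars classes), and the aliases are trimmed versions of the ones quoted above, so treat it as an illustration of the pattern rather than the proposed implementation.

```python
from __future__ import annotations

from typing import Any, Sequence, Union, overload

# Simplified, hypothetical versions of the aliases quoted above.
SingleIndexSelector = int
MultiIndexSelector = Union[slice, range, Sequence[int]]
SingleNameSelector = str
MultiNameSelector = Sequence[str]


class ToySeries:
    """Hypothetical single-column container."""

    def __init__(self, name: str, values: list[Any]) -> None:
        self.name = name
        self.values = values


class ToyFrame:
    """Hypothetical dict-of-lists frame, only here to exercise the overloads."""

    def __init__(self, data: dict[str, list[Any]]) -> None:
        self.data = data

    # `str` is itself a `Sequence[str]`, so the `str` overload must come
    # before the multi-selector overload and needs the overlap ignore.
    @overload
    def __getitem__(
        self, key: tuple[SingleIndexSelector, SingleNameSelector]
    ) -> Any: ...
    @overload
    def __getitem__(  # type: ignore[overload-overlap]
        self, key: SingleNameSelector
    ) -> ToySeries: ...
    @overload
    def __getitem__(self, key: MultiIndexSelector | MultiNameSelector) -> ToyFrame: ...

    def __getitem__(self, key: Any) -> Any:
        if isinstance(key, tuple):  # (row, column name) -> scalar
            row, col = key
            return self.data[col][row]
        if isinstance(key, str):  # column name -> series
            return ToySeries(key, self.data[key])
        if isinstance(key, slice):  # row slice -> frame
            return ToyFrame({name: col[key] for name, col in self.data.items()})
        keys = list(key)
        if all(isinstance(k, str) for k in keys):  # column names -> frame
            return ToyFrame({name: self.data[name] for name in keys})
        # Remaining case: row positions -> frame
        return ToyFrame({name: [col[i] for i in keys] for name, col in self.data.items()})
```

With this layout, `df[0, "b"]` is typed as `Any`, `df["a"]` as `ToySeries`, and `df[1:]` or `df[["b"]]` as `ToyFrame`.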

EdAbati · 2025-02-08T13:03:04Z

narwhals/dataframe.py

-        key: (
-            slice
+        item: (
+            int


Hey @dangotbanned thanks for fixing! super quick I didn't have the chance to see the comments :D

Anyway I think that int is a bit of a special case.
The narwhals rule at the moment is that int in __getitem__ should not be fully supported or encouraged (https://narwhals-dev.github.io/narwhals/pandas_like_concepts/column_names/).

I think we shouldn't add it in typing here, but ignore the int slicing in the tests if a typing error occurs.

Having said that, it is weird that after my changes mypy wants me to remove the # ignore in the tests

@EdAbati interesting 🤔

So should this line in the doc be rewritten?

narwhals/narwhals/dataframe.py

Line 928 in 733ab52

- Integers are always interpreted as positions

I do appreciate that what you're saying is documented, but writing "always" reads to me as something that belongs in the annotation?

Mmm oh yeah actually good catch, you are right! then it makes a lot of sense

@EdAbati marking as unresolved just to show another option if a reviewer wanted it:

diff --git a/narwhals/dataframe.py b/narwhals/dataframe.py
index 6b673a80..207ef7b2 100644
--- a/narwhals/dataframe.py
+++ b/narwhals/dataframe.py
@@ -859,6 +859,8 @@ class DataFrame(BaseFrame[DataFrameT]):
         """
         return self._compliant_frame.estimated_size(unit=unit)  # type: ignore[no-any-return]
 
+    @overload
+    def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
     @overload
     def __getitem__(  # type: ignore[overload-overlap]
         self: Self,
diff --git a/narwhals/stable/v1/__init__.py b/narwhals/stable/v1/__init__.py
index 5aefefe2..af8aed47 100644
--- a/narwhals/stable/v1/__init__.py
+++ b/narwhals/stable/v1/__init__.py
@@ -84,6 +84,7 @@ if TYPE_CHECKING:
     from types import ModuleType
 
     import numpy as np
+    import pandas as pd
     from typing_extensions import Self
 
     from narwhals.dtypes import DType
@@ -131,6 +132,8 @@ class DataFrame(NwDataFrame[IntoDataFrameT]):
     def _lazyframe(self: Self) -> type[LazyFrame[Any]]:
         return LazyFrame
 
+    @overload
+    def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
     @overload
     def __getitem__(  # type: ignore[overload-overlap]
         self: Self,

I think this describes the pandas edge case?
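The technique in that diff, annotating `self` with one specific parametrization so that an overload only applies to that backend, can be sketched in isolation. Everything below (`FakeNative`, `FakePandasFrame`, `FakePolarsFrame`, `Wrapper`) is hypothetical and only meant to show the typing pattern, not how narwhals is implemented.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar, overload


class FakeNative:
    """Hypothetical base for the native frames wrapped below."""

    def __init__(self, columns: dict[Any, list[Any]]) -> None:
        self.columns = columns


class FakePandasFrame(FakeNative):
    """Stand-in for a backend where an integer key is accepted."""


class FakePolarsFrame(FakeNative):
    """Stand-in for a backend where only string keys are advertised."""


NativeT = TypeVar("NativeT", bound=FakeNative)


class Wrapper(Generic[NativeT]):
    def __init__(self, native: NativeT) -> None:
        self.native = native

    # Only visible to the type checker when `self` is Wrapper[FakePandasFrame].
    @overload
    def __getitem__(self: Wrapper[FakePandasFrame], item: int) -> Any: ...
    # Visible for every backend.
    @overload
    def __getitem__(self, item: str) -> list[Any]: ...

    def __getitem__(self, item: int | str) -> Any:
        # Overloads are erased at runtime; the lookup itself is backend-agnostic.
        return self.native.columns[item]
```

A type checker should accept `Wrapper(FakePandasFrame(...))[0]` but reject `Wrapper(FakePolarsFrame(...))[0]`, even though both run the same code at runtime.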

EdAbati · 2025-02-08T13:07:20Z

Still ends up pretty complex in the @overload(s), but much easier to differentiate the parts:

I agree, my plan was actually to add these in this PR or a follow-up. I find them easier to read and follow

MarcoGorelli · 2025-02-08T14:49:42Z

tests/frame/getitem_test.py


    with pytest.raises(TypeError, match="Expected str or slice, got:"):
-        nw.from_native(constructor_eager(data), eager_only=True)[Foo()]  # type: ignore[call-overload]
+        nw.from_native(constructor_eager(data), eager_only=True)[Foo()]  # type: ignore[call-overload, unused-ignore]


🤔 ignoring unused-ignore seems a bit strange?

@MarcoGorelli see #1963 (comment)

It is there to silence pre-commit, which does not recognize that the call is invalid
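One plausible reading of this exchange, sketched below with hypothetical `Frame` and `Foo` classes: an environment that resolves the overloads reports `call-overload` for the invalid key, while an environment that does not see the error would report the ignore comment itself as unused (assuming mypy with `warn_unused_ignores` enabled). Listing `unused-ignore` next to `call-overload` keeps both quiet.

```python
from __future__ import annotations

from typing import Any, overload


class Foo:
    """An arbitrary type that is not a valid key."""


class Frame:
    """Hypothetical frame that only accepts `str` or `slice` keys."""

    def __init__(self, data: dict[str, list[int]]) -> None:
        self.data = data

    @overload
    def __getitem__(self, item: str) -> list[int]: ...
    @overload
    def __getitem__(self, item: slice) -> Frame: ...

    def __getitem__(self, item: Any) -> Any:
        if isinstance(item, str):
            return self.data[item]
        if isinstance(item, slice):
            return Frame({k: v[item] for k, v in self.data.items()})
        msg = f"Expected str or slice, got: {type(item)}"
        raise TypeError(msg)


def bad_lookup(frame: Frame) -> str:
    """Trigger the invalid call on purpose and return the error message."""
    try:
        # `call-overload` covers checkers that see the invalid key;
        # `unused-ignore` covers checkers that do not.
        frame[Foo()]  # type: ignore[call-overload, unused-ignore]
    except TypeError as exc:
        return str(exc)
    return ""
```

The runtime behaviour is unaffected by the comment: the call still raises `TypeError`, which is what the test above asserts with `pytest.raises`.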

MarcoGorelli

thanks both!

Discussed in: #1963 (comment), #1963 (comment)

EdAbati added 2 commits February 7, 2025 18:13

fix silly mistake

6272946

missing another np.array

89983fe

Merge branch 'main' into more-fixes-getitem

2517334

dangotbanned mentioned this pull request Feb 8, 2025

feat: add Schema.to_(arrow|pandas|polars) #1924

Merged

4 tasks

dangotbanned added 3 commits February 8, 2025 12:22

chore(typing): ignore type on fail case

88a0ca5

fix(typing): add int cases to overloads

932a2f9

Resolves narwhals-dev#1963 (comment)

chore(typing): ignore pre-commit false-positive

733ab52

EdAbati marked this pull request as ready for review February 8, 2025 13:25

MarcoGorelli reviewed Feb 8, 2025

MarcoGorelli added internal typing labels Feb 8, 2025

MarcoGorelli approved these changes Feb 8, 2025

MarcoGorelli merged commit 02bcc1d into narwhals-dev:main Feb 8, 2025
23 checks passed

EdAbati deleted the more-fixes-getitem branch February 8, 2025 14:53

dangotbanned added a commit that referenced this pull request Feb 15, 2025

refactor(typing): Add __getitem__ selector aliases

8d8895b

Discussed in: #1963 (comment), #1963 (comment)

dangotbanned mentioned this pull request Feb 15, 2025

refactor(typing): Add __getitem__ selector aliases #2020

Closed

10 tasks

dangotbanned mentioned this pull request Mar 12, 2025

chore(typing): Use SQLFrame instead of PySpark for _spark_like internally #2190

Merged

10 tasks
