fix: missing __getitem__ type fixes#1963
Conversation
| def __getitem__( | ||
| self: Self, | ||
| key: ( | ||
| item: ( |
There was a problem hiding this comment.
silly mistake (I blame copilot) 🙈
I wonder why it was not picked up though
This comment was marked as resolved.
This comment was marked as resolved.
you do now 😉 feel free to make improvements, thanks for all your help + evangelism of Narwhals |
Oh hello 😊 |
|
Maybe this is too much for the PR. # Annotations for `__getitem__` methods
SingleIndexSelector: TypeAlias = int
MultiIndexSelector: TypeAlias = Union[
slice,
range,
Sequence[int],
"Series",
"np.ndarray[Any, Any]",
]
SingleNameSelector: TypeAlias = str
MultiNameSelector: TypeAlias = Union[
slice,
Sequence[str],
"Series",
"np.ndarray[Any, Any]",
]
BooleanMask: TypeAlias = Union[
Sequence[bool],
"Series",
"np.ndarray[Any, Any]",
]
SingleColSelector: TypeAlias = Union[SingleIndexSelector, SingleNameSelector]
MultiColSelector: TypeAlias = Union[MultiIndexSelector, MultiNameSelector, BooleanMask]Still ends up pretty complex in the # `str` overlaps with `Sequence[str]`
# We can ignore this but we must keep this overload ordering
@overload
def __getitem__(
self, key: tuple[SingleIndexSelector, SingleColSelector]
) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self, key: str | tuple[MultiIndexSelector, SingleColSelector]
) -> Series: ...
@overload
def __getitem__(
self,
key: (
SingleIndexSelector
| MultiIndexSelector
| MultiColSelector
| tuple[SingleIndexSelector, MultiColSelector]
| tuple[MultiIndexSelector, MultiColSelector]
),
) -> DataFrame: ...
def __getitem__(
self,
key: (
SingleIndexSelector
| SingleColSelector
| MultiColSelector
| MultiIndexSelector
| tuple[SingleIndexSelector, SingleColSelector]
| tuple[SingleIndexSelector, MultiColSelector]
| tuple[MultiIndexSelector, SingleColSelector]
| tuple[MultiIndexSelector, MultiColSelector]
),
) -> DataFrame | Series | Any:@MarcoGorelli do you see these aliases as a desirable thing to steal (re-implement) from |
| key: ( | ||
| slice | ||
| item: ( | ||
| int |
There was a problem hiding this comment.
Hey @dangotbanned thanks for fixing! super quick I didn't have the chance to see the comments :D
Anyway I think that int is a bit of a special case.
The narwhals rule at the moment is that int in __getitem__ should not be fully supported (https://narwhals-dev.github.io/narwhals/pandas_like_concepts/column_names/) or encouraged .
I think we shouldn't add it in typing here, but ignore the int slicing in the tests if a typing error occurs.
Having said that, it is weird that after my changes mypy wants me to remove the # ignore in the tests
There was a problem hiding this comment.
@EdAbati interesting 🤔
So should this line in the doc be rewritten?
narwhals/narwhals/dataframe.py
Line 928 in 733ab52
I do appreciate that what you're saying is documented, but writing always reads to me as something that belongs in the annotation?
There was a problem hiding this comment.
Mmm oh yeah actually good catch, you are right! then it makes a lot of sense
There was a problem hiding this comment.
@EdAbati marking as unresolved just to show another option if a reviewer wanted it:
diff --git a/narwhals/dataframe.py b/narwhals/dataframe.py
index 6b673a80..207ef7b2 100644
--- a/narwhals/dataframe.py
+++ b/narwhals/dataframe.py
@@ -859,6 +859,8 @@ class DataFrame(BaseFrame[DataFrameT]):
"""
return self._compliant_frame.estimated_size(unit=unit) # type: ignore[no-any-return]
+ @overload
+ def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self,
diff --git a/narwhals/stable/v1/__init__.py b/narwhals/stable/v1/__init__.py
index 5aefefe2..af8aed47 100644
--- a/narwhals/stable/v1/__init__.py
+++ b/narwhals/stable/v1/__init__.py
@@ -84,6 +84,7 @@ if TYPE_CHECKING:
from types import ModuleType
import numpy as np
+ import pandas as pd
from typing_extensions import Self
from narwhals.dtypes import DType
@@ -131,6 +132,8 @@ class DataFrame(NwDataFrame[IntoDataFrameT]):
def _lazyframe(self: Self) -> type[LazyFrame[Any]]:
return LazyFrame
+ @overload
+ def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self,
I think this describes the pandas edge case?
I agree, my plan was actually to add these in this or follow-up PR. I find them easier to read and follow |
|
|
||
| with pytest.raises(TypeError, match="Expected str or slice, got:"): | ||
| nw.from_native(constructor_eager(data), eager_only=True)[Foo()] # type: ignore[call-overload] | ||
| nw.from_native(constructor_eager(data), eager_only=True)[Foo()] # type: ignore[call-overload, unused-ignore] |
There was a problem hiding this comment.
🤔 ignoring unused-ignore seems a bit strange?
There was a problem hiding this comment.
@MarcoGorelli see #1963 (comment)
It is to ignore pre-commit not understanding the call is invalid


What type of PR is this? (check all applicable)
Checklist
If you have comments or can explain your changes, please do so below
I am so sorry @MarcoGorelli , I was a bit to quick to press "Ready for Review" 😭
This is a follow up of #1958