chore: refactor EVERYTHING to do with broadcasting#2005
chore: refactor EVERYTHING to do with broadcasting#2005MarcoGorelli merged 140 commits intonarwhals-dev:mainfrom
Conversation
yup! that's a fair point, thanks, might as well align with Polars' internal name |
No worries StrAsLit: TypeAlias = bool
"""<Explain me>"""
def func(str_as_lit: StrAsLit):...Personal preference I suppose, but I like being able to get more info - but not having it always there if I'm familiar. These things seem small but do add up over time |
This comment was marked as resolved.
This comment was marked as resolved.
|
great suggestions, thanks - fancy adding a commit with them? |
Sure thing |
Suggested edit:The import was blocking me using the normal suggestion. diff --git a/narwhals/_expression_parsing.py b/narwhals/_expression_parsing.py
index 38300ad8..09419a5d 100644
--- a/narwhals/_expression_parsing.py
+++ b/narwhals/_expression_parsing.py
@@ -8,6 +8,7 @@ from enum import auto
from typing import TYPE_CHECKING
from typing import Any
from typing import Callable
+from typing import Iterable
from typing import Sequence
from typing import TypedDict
from typing import TypeVar
@@ -445,25 +446,27 @@ def infer_kind(obj: IntoExpr | _1DArray | object, *, str_as_lit: bool) -> ExprKi
return ExprKind.LITERAL
+def _iter_n_ary_operation(
+ compliant_exprs: Iterable[CompliantExpr[Any] | object], kinds: Sequence[ExprKind]
+) -> Iterable[CompliantExpr[Any] | object]:
+ broadcast = any(kind is ExprKind.TRANSFORM for kind in kinds)
+ broadcast_kind = {ExprKind.AGGREGATION, ExprKind.LITERAL}
+ if not broadcast:
+ yield from compliant_exprs
+ else:
+ for compliant_expr, kind in zip(compliant_exprs, kinds):
+ if kind in broadcast_kind and is_compliant_expr(compliant_expr):
+ yield compliant_expr.broadcast(kind)
+ else:
+ yield compliant_expr
+
+
def apply_n_ary_operation(
- plx: CompliantNamespace,
- function: Any,
- *comparands: IntoExpr,
- str_as_lit: bool,
+ plx: CompliantNamespace, function: Any, *comparands: IntoExpr, str_as_lit: bool
) -> CompliantExpr[Any]:
compliant_exprs = (
extract_compliant(plx, comparand, str_as_lit=str_as_lit)
for comparand in comparands
)
kinds = [infer_kind(comparand, str_as_lit=str_as_lit) for comparand in comparands]
-
- broadcast = any(kind is ExprKind.TRANSFORM for kind in kinds)
- compliant_exprs = (
- compliant_expr.broadcast(kind)
- if broadcast
- and kind in (ExprKind.AGGREGATION, ExprKind.LITERAL)
- and is_compliant_expr(compliant_expr)
- else compliant_expr
- for compliant_expr, kind in zip(compliant_exprs, kinds)
- )
- return function(*compliant_exprs)
+ return function(*_iter_n_ary_operation(compliant_exprs, kinds))
|
|
@MarcoGorelli ah I forgot about how singular aliases inherit their docs (unlike unions): I'm not as sold on my own idea now 😅 Also I was thinking this did the same thing as attribute docs, but it isn't visible everywhere like: |
| for x in kwargs.values() | ||
| ) | ||
| exprs = chain(args, kwargs.values()) | ||
| agg_or_lit = {ExprKind.AGGREGATION, ExprKind.LITERAL} |
There was a problem hiding this comment.
Maybe you'd want some of these sets defined on ExprKind like you've got for Implementation?
Lines 238 to 256 in 390d6e0
I think you'd be able to make a lot of little ergonomic changes - if there's more cases that are repeated a lot
| if ( | ||
| is_narwhals_series(obj) | ||
| or is_numpy_array(obj) | ||
| or (isinstance(obj, str) and not str_as_lit) | ||
| ): |
There was a problem hiding this comment.
Could split out into is_transform(obj: Any, str_as_lit: bool)
| compliant_exprs = [ | ||
| compliant_expr.broadcast(kind) | ||
| if kind in (ExprKind.AGGREGATION, ExprKind.LITERAL) | ||
| else compliant_expr | ||
| for compliant_expr, kind in zip(compliant_exprs, kinds) | ||
| ] |
There was a problem hiding this comment.
This is identicial to https://github.com/MarcoGorelli/narwhals/blob/5ee5ce276d54d676371a518521b7b180504dd83b/narwhals/dataframe.py#L139-L144
Maybe move into a function they both use?
| and x._metadata["kind"] in (ExprKind.AGGREGATION, ExprKind.LITERAL) | ||
| for x in kwargs.values() | ||
|
|
||
| def infer_kind(obj: IntoExpr | _1DArray | object, *, str_as_lit: bool) -> ExprKind: |
There was a problem hiding this comment.
Have you thought about making this a @classmethod for ExprKind?
Maybe called from_expr or I guess technically from_into_expr (but thats a mouthful)
| def maybe_evaluate_expr( | ||
| df: CompliantDataFrame, expr: CompliantExpr[CompliantSeriesT_co] | ||
| ) -> Sequence[CompliantSeriesT_co]: ... | ||
| ) -> CompliantSeriesT_co: ... |
There was a problem hiding this comment.
Note for a follow up:
This reminds me that we should fix the typing for CompliantSeriesT_co
for Spark we use Column, in duckdb Expression. They are not compliant series :/
|
Thanks all for comments, much appreciated 🙏 If people are OK with the logic of the changes, OK if we get this in and then get further refactors in as follow-ups? |
dangotbanned
left a comment
There was a problem hiding this comment.
Great work @MarcoGorelli
|
Thanks for your review and for taking such a close look, legend mate! Merging then so we can proceed with further improvements (e.g. other typing PRs, order-dependent lazy ops, and more) before too many merge conflicts crop up |



This...took some effort
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below