Narwhals implementation of from_dataframe and performance benchmark #2661
MarcoGorelli
left a comment
thanks for giving this a go!
I've left a couple of comments
I suspect the .to_list() calls may be responsible for the slow-down. I'll take a look
Hi @MarcoGorelli, thanks for already looking at this and for your insights!
Hi @MarcoGorelli, I investigated the issue, and it appears that the |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2661      +/-   ##
==========================================
- Coverage   94.17%   94.09%   -0.09%
==========================================
  Files         141      141
  Lines       15582    15601      +19
==========================================
+ Hits        14674    14679       +5
- Misses        908      922      +14
…om/authierj/darts into feature/add_timeseries_from_polars
To compare the performance of the methods Averaged over 10 runs, the processing times are as follows:
Therefore, As a consequence of this significant result, I will change the implementation of |
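For readers who want to reproduce this kind of comparison, here is a minimal sketch of an "average over 10 runs" timing harness. It is not the benchmark script from this PR: the helper `average_runtime` and the workload `build_rows` are hypothetical names introduced only for illustration, and in practice the callable would wrap the two conversion methods being compared.

```python
import statistics
import time
from typing import Callable


def average_runtime(func: Callable[[], object], n_runs: int = 10) -> float:
    """Return the mean wall-clock time of `func` in seconds over `n_runs` calls."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def build_rows() -> list:
    # Hypothetical stand-in workload; replace with the conversion under test.
    return [(i, float(i)) for i in range(10_000)]


mean_seconds = average_runtime(build_rows, n_runs=10)
print(f"mean over 10 runs: {mean_seconds:.6f} s")
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, high-resolution clock, which matters when the individual runs are short.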
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
…om/authierj/darts into feature/add_timeseries_from_polars
This is very cool and I'm sure will make many users' lives easier!
raise_log(
    ValueError(
        "No time column or index found in the DataFrame. `time_col=None` "
        "is only supported for pandas DataFrame which is indexed with one of the "
        "supported index types: a DatetimeIndex, a RangeIndex, or an integer "
        "Index that can be converted into a RangeIndex.",
    ),
)
Should you consider the value to be np.arange(len(df)) or is that too big of an assumption?
We do! The condition np.issubdtype(time_index.dtype, np.integer) is True if the index is np.arange(len(df)) :)
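As a quick illustration of that condition, here is a minimal sketch, assuming numpy is installed. The helper name `is_integer_index` is hypothetical and not part of darts; it only wraps the `np.issubdtype` call quoted above.

```python
import numpy as np


def is_integer_index(values: np.ndarray) -> bool:
    """Hypothetical helper: True if `values` has an integer dtype."""
    return bool(np.issubdtype(values.dtype, np.integer))


n_rows = 5
range_like = np.arange(n_rows)              # 0, 1, ..., n_rows - 1
float_like = np.linspace(0.0, 1.0, n_rows)  # float64 values
datetime_like = np.arange("2024-01-01", "2024-01-06", dtype="datetime64[D]")

print(is_integer_index(range_like))     # True
print(is_integer_index(float_like))     # False
print(is_integer_index(datetime_like))  # False
```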
I believe what @FBruzzesi meant was: if no time_col is given and the DF doesn't have an index (I assume that is the case with polars), should we assign a range index? Otherwise, the user would have to add this index manually to the polars df.
For pandas this case never occurs, but it does for the other backends.
We could do the following and, to begin with, raise a warning instead of an error:
if time_index is None:
    time_index = pd.RangeIndex(len(df))
    logger.info(
        "No time column specified (`time_col=None`) and no index found in the DataFrame. Defaulting to "
        "`pandas.RangeIndex(len(df))`. If this is not desired consider adding a time column "
        "to your dataframe and defining `time_col`."
    )
elif not ...
Yes, exactly: non-pandas cases would end up raising if time_col is not provided. It's a design choice you will have to make, but I wanted to point out that this was the case :)
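The design choice discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR: `resolve_time_index` is a hypothetical helper, and Python's built-in `range` stands in for `pandas.RangeIndex` so that the example has no third-party dependencies.

```python
import logging
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def resolve_time_index(n_rows: int, time_index: Optional[Sequence] = None) -> Sequence:
    """Hypothetical helper: return `time_index`, or a default 0..n_rows-1 range."""
    if time_index is None:
        # Built-in `range` stands in for `pandas.RangeIndex(n_rows)` here.
        logger.info(
            "No time column specified (`time_col=None`) and no index found. "
            "Defaulting to a range index of length %d.",
            n_rows,
        )
        return range(n_rows)
    return time_index


# A frame without an index (e.g. polars) gets the default range.
default_index = resolve_time_index(3)
# A frame that already carries time information keeps it unchanged.
explicit_index = resolve_time_index(3, ["2024-01-01", "2024-01-02", "2024-01-03"])

print(list(default_index))  # [0, 1, 2]
print(explicit_index)
```

Logging instead of raising keeps the call working for index-less backends while still telling the user that a default was applied.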
Agreed @hrzn :) Support for converting to any dataframe will be added in another PR.
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
dennisbader
left a comment
Very nice @authierj 🚀 This looks great!
Also thanks @FBruzzesi for the additional comments.
Just some minor suggestions, then we're ready.
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
RED Phase:
- Created 3 tests for ChronosModel implementation
- test_chronos_model_file_exists: Failed - file doesn't exist
- test_chronos_model_has_capability_identifiers: Failed - no identifiers
- test_chronos_model_has_lazy_import_check: Failed - no lazy import

GREEN Phase:
- Implemented ChronosModel class extending FoundationForecastingModel
- Set capability identifiers: chronos/chronos-2/base
- Implemented _check_chronos_available() for lazy import
- Raises ImportError with helpful message showing uv installation
- Implemented __init__ with variant parameter (small/base/large)
- Implemented _zero_shot_fit() with capability validation
- Added comprehensive docstrings with examples

Features:
- Lazy import: Only imports chronos-forecasting when instantiated
- Helpful errors: Shows "uv pip install 'darts[chronos]'" command
- Variant support: Allows choosing small/base/large models
- Capability validation: Inherits from base class validation

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

docs: add foundation models installation guide

Added comprehensive Foundation Models section to INSTALL.md:
- Installation instructions for chronos, timesfm, lag-llama extras
- Both pip and uv installation commands
- Model-specific requirements and code examples
- Links to papers and official documentation
- all-foundation bundle option

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix: support models without variants (Chronos 2)

Made capabilities system flexible to support both models with and without variants:

Capabilities Loader (capabilities.py):
- Made variant_name parameter optional in get_variant()
- Returns subfamily capabilities when no variants exist
- Returns variant capabilities when variants exist
- Raises ValueError if mismatched (variant_name provided but no variants, or vice versa)

Base Class (base.py):
- Updated property methods to not require _variant_name
- Passes variant_name=None for models without variants

ChronosModel (chronos.py):
- Removed variant parameter from __init__
- Set _variant_name = None (Chronos 2 has no variants)
- Updated docstrings

Capabilities YAML:
- Chronos 2: Capabilities at subfamily level (no variants)
- TimesFM: Still uses variants structure (for future support)

Tests:
- Updated all tests to reflect no-variant structure for Chronos 2
- test_subfamily_has_capabilities replaces test_variant_has_capabilities
- TestModel uses _variant_name = None

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

feat: migrate TimesFM to new foundation model infrastructure

Migrated TimesFM to use unified FoundationForecastingModel base:

TimesFM Migration:
- Changed base class: GlobalForecastingModel → FoundationForecastingModel
- Added capability identifiers: timesfm/timesfm-2.5/[200m|500m]
- Refactored fit() → _zero_shot_fit() to match base class pattern
- Uses _validate_series_capabilities() for automatic validation
- Removed duplicate property methods (now inherited from base)
- Added PEFT method stubs (for future fine-tuning support)

Capabilities Registry:
- Updated TimesFM to version 2.5 (matches latest release)
- Added two variants: 200m and 500m parameter models
- Context length: 1024 points (up from 512)
- Prediction length: 256 points (up from 128)
- Model IDs point to PyTorch versions on HuggingFace

Benefits:
- Unified infrastructure with ChronosModel
- Automatic capability validation
- No breaking changes to public API
- Reduced code duplication
- Better extensibility for PEFT/LoRA

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

docs: add comprehensive Chronos 2 documentation and comparison

Added complete tutorial and documentation following TimesFM pattern:

Tutorial Notebooks:
- examples/26-Chronos-foundation-model.ipynb
  * Zero-shot forecasting examples
  * Probabilistic forecasting with quantiles
  * Comparison with traditional models
  * Batch forecasting and backtesting
  * Multiple confidence interval demonstrations
- examples/27-Foundation-Models-Comparison.ipynb
  * Side-by-side comparison of Chronos 2 and TimesFM 2.5
  * Performance benchmarks (speed, accuracy)
  * Capabilities comparison table
  * When to use each model (decision guide)
  * Architecture comparison
  * Practical recommendations for production

User Guide Updates:
- docs/userguide/foundation_models.md
  * Complete Chronos 2 section
  * Installation instructions
  * Code examples and use cases
  * Links to papers and resources
  * Updated "Learn More" section with both notebooks

README Updates:
- Added ChronosModel to Foundation Models table
- Capabilities: Univariate ✅, Probabilistic ✅✅
- Links to paper (arXiv:2403.07815), GitHub, HuggingFace

CHANGELOG Updates:
- Added ChronosModel entry
- Updated foundation model documentation
- Mentioned darts[chronos] extra

All documentation follows established TimesFM patterns while highlighting Chronos 2's unique probabilistic forecasting strengths.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix: correct Chronos converter narwhals integration and add comprehensive tests

Fixed two critical bugs in Chronos DataFrame converters discovered through TDD:

1. **pd_dataframe() Bug**: Replaced non-existent ts.pd_dataframe() with ts.to_dataframe() - the modern narwhals-based API (pd_dataframe was removed in March 2025, PR unit8co#2733)
2. **Column Ordering Bug**: Fixed incorrect sequence of insert/reset_index operations by using time_as_index=False parameter, simplifying logic and ensuring correct Chronos DataFrame format (id, timestamp, target)
3. **Model ID Correction**: Updated default model_id from invalid amazon/chronos-2-base to correct S3 path s3://autogluon/chronos-2

Why narwhals matters:
- Provides zero-dependency DataFrame compatibility (pandas/Polars/PyArrow)
- 10% faster than legacy pandas-only code (PR unit8co#2661)
- Enables users to leverage faster DataFrame libraries
- Following TDD exposed both bugs before runtime failures

Test Coverage (5 passing unit tests):
- Univariate/multivariate conversion to Chronos format
- Multiple series batch conversion
- Reverse conversion from Chronos predictions to TimeSeries
- Integration tests added (require S3 model download, can run separately)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

test: add integration tests for foundation model packaging flow

Complete Task 10 from the implementation plan - adds comprehensive packaging integration tests verifying:

1. **pyproject.toml → setup.py flow**: Ensures extras are properly read from pyproject.toml and available via setup.py
2. **Dependency declarations**: Verifies chronos and timesfm extras are correctly defined with required packages
3. **Import flow**: Tests that ChronosModel can be imported from darts.models namespace
4. **Model instantiation**: Verifies ChronosModel can be instantiated with mocked Chronos2Pipeline
5. **End-to-end integration**: Complete flow from pyproject.toml configuration through model instantiation

All 6 packaging tests pass, completing the foundation model integration test suite (13 tests total: 7 functionality + 6 packaging).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix: properly convert Chronos quantiles to stochastic TimeSeries

Fixed critical bug where probabilistic forecasts from Chronos only returned the median, discarding all other quantiles.

**Problem:**
- Chronos returns 9 quantiles (0.1, 0.2, ..., 0.9)
- Converter only extracted 0.5 (median) and discarded others
- plot(low_quantile=0.1, high_quantile=0.9) showed no bands

**Solution:**
- Detect when multiple quantile columns present
- Convert each quantile to a "sample" in stochastic TimeSeries
- Shape: (time_steps, 1 component, n_quantiles samples)
- Now plot() correctly shows uncertainty bands

**Testing:**
- Verified shape (5, 1, 9) for 5 timesteps with 9 quantiles
- Confirmed is_stochastic=True and n_samples=9
- Probabilistic forecasting now fully functional

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix: set _fit_called flag after zero-shot fit in FoundationForecastingModel

Fixed bug where calling fit() on zero-shot foundation models didn't set the internal _fit_called flag, causing historical_forecasts() to fail with "model has not been fitted yet" error.

**Problem:**
- FoundationForecastingModel.fit() returned directly from _zero_shot_fit()
- Never set self._fit_called = True
- historical_forecasts(retrain=False) failed with unfitted model error
- Users forced to use workaround: retrain=1

**Solution:**
- Capture result from _zero_shot_fit() and _train_with_peft()
- Set self._fit_called = True before returning
- Now works for both zero-shot and PEFT fine-tuning paths

**Testing:**
- Verified _fit_called=False before fit()
- Verified _fit_called=True after fit()
- historical_forecasts() now works with retrain=False

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix: correct Chronos 2 multivariate capability (was incorrectly univariate-only)

Fixed critical capability error where Chronos 2 was marked as univariate-only when it actually supports univariate, multivariate, AND covariate-informed forecasting via in-context learning.

**Source of Error:**
Initial implementation (commit c66f8c71) incorrectly set multivariate: false, likely based on outdated Chronos 1 documentation. Amazon Science blog clearly states Chronos 2 supports "arbitrary forecasting tasks — univariate, multivariate, and covariate informed — in a zero-shot manner."

**What Was Wrong:**
- capabilities.yaml: multivariate: false
- Docstrings: "Chronos 2 is univariate-only"
- Validation: Would reject valid multivariate series

**What Was Correct:**
- Converter implementation: Already handled multivariate properly!
- Lines 89-94: Correctly renames to target_0, target_1, etc.
- Tests: test_timeseries_to_chronos_df_multivariate passes

**Changes:**
1. capabilities.yaml:10: multivariate: false → true
2. Updated description: "Universal forecasting (univariate, multivariate, covariate-informed)"
3. Fixed ChronosModel docstring to reflect actual capabilities
4. Verified multivariate test passes

**Reference:** https://www.amazon.science/blog/introducing-chronos-2-from-univariate-to-universal-forecasting

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

fix(foundation): add configurable context_length to fix historical_forecasts on short series

**Problem:**
ChronosModel's historical_forecasts() failed on Air Passengers dataset (144 points) with "Cannot build any dataset for prediction" error. Root cause was hardcoded context_length of 512 in extreme_lags, which exceeded available data at start=0.75. TimesFM worked because it used configurable max_context_length=96 in the notebook, but Chronos hardcoded 512 with no way to adjust for shorter series.

**Solution:**
1. Add configurable context_length and max_forecast_horizon parameters to both models
2. Load hard architectural limits from capabilities.yaml (read-only constants):
   - Chronos: max 8192 context, 1024 horizon, patch_size 16
   - TimesFM: max 16384 context, 256 horizon, patch_size 32
3. Create shared validation.py module with generic validation functions
4. Both models validate user preferences against hard limits + patch_size divisibility
5. Fix TimesFM extreme_lags to return None (not 0) for unsupported covariates per API spec

**Key Design:**
- Hard limits in capabilities.yaml (model architecture constraints)
- User preferences as parameters (validated against hard limits)
- Shared validation eliminates duplication between models
- TimesFM remains on GlobalForecastingModel (migration TODO added)

**Example:**
```python
model = ChronosModel(context_length=96)  # Rounded to patch_size=16
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Checklist before merging this PR:
Fixes #2635.
Summary
A first draft of from_dataframe has been adapted to work with any dataframe. This is done using narwhals, and the function is called from_narwhals_dataframe. To test the performance of the method, a file narwhals_test_time.py has been added to the pull request.
With the latest commits, from_narwhals_dataframe is now as fast as from_dataframe.
Other Information