Narwhals implementation of from_dataframe and performance benchmark #2661

Merged
dennisbader merged 42 commits into unit8co:master from authierj:feature/add_timeseries_from_polars
Feb 28, 2025

Conversation

@authierj
Contributor

@authierj authierj commented Jan 31, 2025

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Fixes #2635.

Summary

This PR adapts a first draft of from_dataframe to work with any dataframe. It is built on narwhals and exposed as from_narwhals_dataframe. To test the method's performance, the file narwhals_test_time.py has been added to the pull request.
With the latest commits, from_narwhals_dataframe is now as fast as from_dataframe.

Other Information

@MarcoGorelli MarcoGorelli left a comment

thanks for giving this a go!

I've left a couple of comments

I suspect the .to_list() calls may be responsible for the slow-down. I'll take a look

@authierj
Contributor Author

Hi @MarcoGorelli ,

Thanks for already looking at this and for your insights!

@authierj
Contributor Author

authierj commented Feb 3, 2025

Hi @MarcoGorelli,

I investigated the issue, and it appears that the .to_list() call is not responsible for the slowdown. However, the call series_df.to_numpy()[:, :, np.newaxis] on line 906 is very slow. The investigation is ongoing!
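
For context, here is a small sketch of what the `[:, :, np.newaxis]` part of that expression does. The array below is made up for illustration; only the indexing pattern comes from the comment above.

```python
import numpy as np

# Hypothetical 2-D value matrix: 4 time steps x 2 components.
values_2d = np.arange(8, dtype=float).reshape(4, 2)

# Indexing with np.newaxis appends a length-1 third axis.
values_3d = values_2d[:, :, np.newaxis]

print(values_2d.shape)  # (4, 2)
print(values_3d.shape)  # (4, 2, 1)
# The result is a view on the same memory, so this step does not copy data.
print(np.shares_memory(values_2d, values_3d))  # True
```

Since the indexing step only creates a view, it seems plausible (though this thread does not confirm it) that the time is spent in the `to_numpy()` conversion rather than in adding the axis.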

Contributor

@FBruzzesi FBruzzesi left a comment

Thanks @authierj for the effort on this! We really appreciate it! I left very non-relevant comments 😂

@codecov

codecov bot commented Feb 4, 2025

Codecov Report

Attention: Patch coverage is 91.83673% with 4 lines in your changes missing coverage. Please review.

Project coverage is 94.09%. Comparing base (e086582) to head (3fa924f).
Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| darts/timeseries.py | 91.83% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2661      +/-   ##
==========================================
- Coverage   94.17%   94.09%   -0.09%     
==========================================
  Files         141      141              
  Lines       15582    15601      +19     
==========================================
+ Hits        14674    14679       +5     
- Misses        908      922      +14     


@authierj
Contributor Author

To compare the performance of the methods from_dataframe() (which only accepts pandas DataFrames as input) and from_narwhals_dataframe() (which accepts all kinds of DataFrames), I implemented the script narwhals_test_time.py, see here. This script calls the two functions for a large number of different pandas DataFrames, with shuffled or unshuffled data, varying sizes, indices, and datetime formats.

Averaged over 10 runs, the processing times are as follows:

| method | average processing time [s] |
| --- | --- |
| from_dataframe() | 10.9718 |
| from_narwhals_dataframe() | 9.8564 |

Therefore, from_narwhals_dataframe() is 1.1154 seconds faster than from_dataframe(), representing a 10.17% decrease in processing time on average.
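
The figures above can be checked with a few lines of arithmetic. The two timings are the ones reported in this comment; the variable names are only for illustration.

```python
t_pandas = 10.9718    # from_dataframe(), average seconds over 10 runs
t_narwhals = 9.8564   # from_narwhals_dataframe(), average seconds over 10 runs

abs_gain = t_pandas - t_narwhals   # absolute time saved
rel_gain = abs_gain / t_pandas     # saving relative to the pandas-only baseline

print(f"{abs_gain:.4f} s")  # 1.1154 s
print(f"{rel_gain:.2%}")    # 10.17%
```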

As a consequence of this significant result, I will change the implementation of from_dataframe() and also modify from_series() to use the narwhals approach.

@authierj authierj marked this pull request as ready for review February 14, 2025 12:22
@hrzn
Contributor

hrzn commented Feb 22, 2025

This is very cool and I'm sure will make many users' lives easier!
I think it might be worth updating the docs / quickstart to maybe showcase an example for creating/exporting from/to polars?

Contributor

@FBruzzesi FBruzzesi left a comment

Hey @authierj , I left a few suggestions and considerations in the from_* functions! Hope they help :)

Comment on lines +722 to +729
raise_log(
    ValueError(
        "No time column or index found in the DataFrame. `time_col=None` "
        "is only supported for pandas DataFrame which is indexed with one of the "
        "supported index types: a DatetimeIndex, a RangeIndex, or an integer "
        "Index that can be converted into a RangeIndex.",
    ),
)
Contributor

Should you consider the value to be np.arange(len(df)) or is that too big of an assumption?

Contributor Author

We do! The condition np.issubdtype(time_index.dtype, np.integer) is True if the index is np.arange(len(df)) :)
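
A quick way to see which pandas index types pass that check is sketched below. The three indices are made-up examples; only the `np.issubdtype(..., np.integer)` condition is taken from the comment above.

```python
import numpy as np
import pandas as pd

range_index = pd.RangeIndex(5)
int_index = pd.Index(np.arange(5))
date_index = pd.date_range("2025-01-01", periods=5, freq="D")

# Integer-typed indices pass the check, a DatetimeIndex does not.
is_int = [
    np.issubdtype(idx.dtype, np.integer)
    for idx in (range_index, int_index, date_index)
]
print(is_int)  # [True, True, False]
```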

Collaborator

@dennisbader dennisbader Feb 28, 2025

I believe what @FBruzzesi meant was that if no time_col is given, and the DF doesn't have an index (I assume that's the case with polars), should we assign a range index? Otherwise, the user would have to add this index manually to the polars df.

For pandas this case will never exist, but it will for the others.

We could do the following and, to begin with, raise a warning instead of an error:

if time_index is None:
    time_index = pd.RangeIndex(len(df))
    logger.info(
        "No time column specified (`time_col=None`) and no index found in the DataFrame. Defaulting to "
        "`pandas.RangeIndex(len(df))`. If this is not desired consider adding a time column "
        "to your dataframe and defining `time_col`."
    )
elif not ...
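
A self-contained sketch of that fallback, for anyone who wants to try the behaviour in isolation. The helper name `resolve_time_index` is hypothetical and not part of darts; it only mirrors the `if time_index is None` branch suggested above.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def resolve_time_index(time_index, n_rows):
    """Hypothetical helper: fall back to a RangeIndex when no index is available."""
    if time_index is None:
        # Index-less backends end up here when `time_col=None`.
        logger.info(
            "No time column or index found; defaulting to pandas.RangeIndex(%d).",
            n_rows,
        )
        return pd.RangeIndex(n_rows)
    # An existing index is passed through untouched.
    return time_index


fallback = resolve_time_index(None, 4)
kept = resolve_time_index(pd.RangeIndex(10, 14), 4)
print(list(fallback))  # [0, 1, 2, 3]
print(list(kept))      # [10, 11, 12, 13]
```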

Contributor

Yes exactly, non-pandas cases would end up raising if time_col is not provided. It's a design choice you will have to make, but I wanted to point out that this is the case :)

@dennisbader
Collaborator

> This is very cool and I'm sure will make many users' lives easier! I think it might be worth updating the docs / quickstart to maybe showcase an example for creating/exporting from/to polars?

Agreed @hrzn :) Support for exporting to any dataframe will be added in another PR.

Collaborator

@dennisbader dennisbader left a comment

Very nice @authierj 🚀 This looks great!

Also thanks @FBruzzesi for the additional comments.

Just some minor suggestions, then we're ready.

authierj and others added 3 commits February 28, 2025 16:28
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
@dennisbader dennisbader merged commit 24cec52 into unit8co:master Feb 28, 2025
9 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in darts Feb 28, 2025
@dennisbader dennisbader moved this from Done to Released in darts Mar 10, 2025
@cnhwl cnhwl mentioned this pull request Apr 9, 2025
3 tasks
nehalecky added a commit to nehalecky/darts that referenced this pull request Oct 31, 2025
RED Phase:
- Created 3 tests for ChronosModel implementation
- test_chronos_model_file_exists: Failed - file doesn't exist
- test_chronos_model_has_capability_identifiers: Failed - no identifiers
- test_chronos_model_has_lazy_import_check: Failed - no lazy import

GREEN Phase:
- Implemented ChronosModel class extending FoundationForecastingModel
- Set capability identifiers: chronos/chronos-2/base
- Implemented _check_chronos_available() for lazy import
- Raises ImportError with helpful message showing uv installation
- Implemented __init__ with variant parameter (small/base/large)
- Implemented _zero_shot_fit() with capability validation
- Added comprehensive docstrings with examples
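
The lazy import check described above might look roughly like this. It is a sketch under assumptions: `check_package_available` and `LazyModel` are hypothetical names, and the standard-library module `json` stands in for the optional dependency so the example runs anywhere.

```python
import importlib.util


def check_package_available(module_name, extra):
    """Hypothetical helper: raise a helpful ImportError if an optional dependency is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this model. "
            f"Install it with: uv pip install 'darts[{extra}]'"
        )


class LazyModel:
    """Toy model that checks for its optional dependency only when instantiated."""

    def __init__(self, module_name="json", extra="example"):
        # The check runs at instantiation time, not when this file is imported.
        check_package_available(module_name, extra)


LazyModel()  # succeeds because `json` is always available
```

Deferring the check to `__init__` means that importing the package itself never fails for users who have not installed the extra.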

Features:
- Lazy import: Only imports chronos-forecasting when instantiated
- Helpful errors: Shows "uv pip install 'darts[chronos]'" command
- Variant support: Allows choosing small/base/large models
- Capability validation: Inherits from base class validation

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

docs: add foundation models installation guide

Added comprehensive Foundation Models section to INSTALL.md:
- Installation instructions for chronos, timesfm, lag-llama extras
- Both pip and uv installation commands
- Model-specific requirements and code examples
- Links to papers and official documentation
- all-foundation bundle option

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix: support models without variants (Chronos 2)

Made capabilities system flexible to support both models with and without variants:

Capabilities Loader (capabilities.py):
- Made variant_name parameter optional in get_variant()
- Returns subfamily capabilities when no variants exist
- Returns variant capabilities when variants exist
- Raises ValueError if mismatched (variant_name provided but no variants, or vice versa)
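
A minimal sketch of the optional-variant lookup described in this list. The dictionary layout is an assumption standing in for the parsed capabilities YAML, and the function is illustrative rather than the actual capabilities.py code.

```python
def get_variant(subfamily, variant_name=None):
    """Hypothetical lookup; `subfamily` is a plain dict standing in for the parsed YAML."""
    variants = subfamily.get("variants")
    if variants is None:
        # No variants defined: capabilities live at the subfamily level.
        if variant_name is not None:
            raise ValueError("variant_name given, but this subfamily has no variants")
        return subfamily["capabilities"]
    # Variants defined: a variant name is mandatory.
    if variant_name is None:
        raise ValueError("this subfamily defines variants; variant_name is required")
    return variants[variant_name]


# Illustrative entries only; the real YAML structure may differ.
no_variants = {"capabilities": {"multivariate": True}}
with_variants = {"variants": {"200m": {"context_length": 1024}}}

print(get_variant(no_variants))            # {'multivariate': True}
print(get_variant(with_variants, "200m"))  # {'context_length': 1024}
```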

Base Class (base.py):
- Updated property methods to not require _variant_name
- Passes variant_name=None for models without variants

ChronosModel (chronos.py):
- Removed variant parameter from __init__
- Set _variant_name = None (Chronos 2 has no variants)
- Updated docstrings

Capabilities YAML:
- Chronos 2: Capabilities at subfamily level (no variants)
- TimesFM: Still uses variants structure (for future support)

Tests:
- Updated all tests to reflect no-variant structure for Chronos 2
- test_subfamily_has_capabilities replaces test_variant_has_capabilities
- TestModel uses _variant_name = None

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

feat: migrate TimesFM to new foundation model infrastructure

Migrated TimesFM to use unified FoundationForecastingModel base:

TimesFM Migration:
- Changed base class: GlobalForecastingModel → FoundationForecastingModel
- Added capability identifiers: timesfm/timesfm-2.5/[200m|500m]
- Refactored fit() → _zero_shot_fit() to match base class pattern
- Uses _validate_series_capabilities() for automatic validation
- Removed duplicate property methods (now inherited from base)
- Added PEFT method stubs (for future fine-tuning support)

Capabilities Registry:
- Updated TimesFM to version 2.5 (matches latest release)
- Added two variants: 200m and 500m parameter models
- Context length: 1024 points (up from 512)
- Prediction length: 256 points (up from 128)
- Model IDs point to PyTorch versions on HuggingFace

Benefits:
- Unified infrastructure with ChronosModel
- Automatic capability validation
- No breaking changes to public API
- Reduced code duplication
- Better extensibility for PEFT/LoRA

All 19 foundation tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

docs: add comprehensive Chronos 2 documentation and comparison

Added complete tutorial and documentation following TimesFM pattern:

Tutorial Notebooks:
- examples/26-Chronos-foundation-model.ipynb
  * Zero-shot forecasting examples
  * Probabilistic forecasting with quantiles
  * Comparison with traditional models
  * Batch forecasting and backtesting
  * Multiple confidence interval demonstrations

- examples/27-Foundation-Models-Comparison.ipynb
  * Side-by-side comparison of Chronos 2 and TimesFM 2.5
  * Performance benchmarks (speed, accuracy)
  * Capabilities comparison table
  * When to use each model (decision guide)
  * Architecture comparison
  * Practical recommendations for production

User Guide Updates:
- docs/userguide/foundation_models.md
  * Complete Chronos 2 section
  * Installation instructions
  * Code examples and use cases
  * Links to papers and resources
  * Updated "Learn More" section with both notebooks

README Updates:
- Added ChronosModel to Foundation Models table
- Capabilities: Univariate ✅, Probabilistic ✅✅
- Links to paper (arXiv:2403.07815), GitHub, HuggingFace

CHANGELOG Updates:
- Added ChronosModel entry
- Updated foundation model documentation
- Mentioned darts[chronos] extra

All documentation follows established TimesFM patterns while highlighting
Chronos 2's unique probabilistic forecasting strengths.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix: correct Chronos converter narwhals integration and add comprehensive tests

Fixed two critical bugs in Chronos DataFrame converters discovered through TDD, and corrected the default model ID:

1. **pd_dataframe() Bug**: Replaced non-existent ts.pd_dataframe() with
   ts.to_dataframe() - the modern narwhals-based API (pd_dataframe was
   removed in March 2025, PR unit8co#2733)

2. **Column Ordering Bug**: Fixed incorrect sequence of insert/reset_index
   operations by using time_as_index=False parameter, simplifying logic
   and ensuring correct Chronos DataFrame format (id, timestamp, target)

3. **Model ID Correction**: Updated default model_id from invalid
   amazon/chronos-2-base to correct S3 path s3://autogluon/chronos-2

Why narwhals matters:
- Provides zero-dependency DataFrame compatibility (pandas/Polars/PyArrow)
- 10% faster than legacy pandas-only code (PR unit8co#2661)
- Enables users to leverage faster DataFrame libraries
- Following TDD exposed both bugs before runtime failures

Test Coverage (5 passing unit tests):
- Univariate/multivariate conversion to Chronos format
- Multiple series batch conversion
- Reverse conversion from Chronos predictions to TimeSeries
- Integration tests added (require S3 model download, can run separately)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

test: add integration tests for foundation model packaging flow

Complete Task 10 from the implementation plan - adds comprehensive
packaging integration tests verifying:

1. **pyproject.toml → setup.py flow**: Ensures extras are properly
   read from pyproject.toml and available via setup.py

2. **Dependency declarations**: Verifies chronos and timesfm extras
   are correctly defined with required packages

3. **Import flow**: Tests that ChronosModel can be imported from
   darts.models namespace

4. **Model instantiation**: Verifies ChronosModel can be instantiated
   with mocked Chronos2Pipeline

5. **End-to-end integration**: Complete flow from pyproject.toml
   configuration through model instantiation

All 6 packaging tests pass, completing the foundation model
integration test suite (13 tests total: 7 functionality + 6 packaging).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix: properly convert Chronos quantiles to stochastic TimeSeries

Fixed critical bug where probabilistic forecasts from Chronos only
returned the median, discarding all other quantiles.

**Problem:**
- Chronos returns 9 quantiles (0.1, 0.2, ..., 0.9)
- Converter only extracted 0.5 (median) and discarded others
- plot(low_quantile=0.1, high_quantile=0.9) showed no bands

**Solution:**
- Detect when multiple quantile columns present
- Convert each quantile to a "sample" in stochastic TimeSeries
- Shape: (time_steps, 1 component, n_quantiles samples)
- Now plot() correctly shows uncertainty bands

**Testing:**
- Verified shape (5, 1, 9) for 5 timesteps with 9 quantiles
- Confirmed is_stochastic=True and n_samples=9
- Probabilistic forecasting now fully functional
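
The shape handling described above can be sketched with NumPy alone. The forecast values are made up, and the dictionary of per-quantile arrays is an assumed stand-in for the Chronos output; only the target shape (time_steps, 1 component, n_quantiles samples) comes from the commit message.

```python
import numpy as np

n_steps = 5
quantile_levels = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Hypothetical per-quantile forecasts: one 1-D array of length n_steps per level.
quantile_forecasts = {q: np.full(n_steps, 100.0 + 10.0 * q) for q in quantile_levels}

# Stack the quantiles along a new last axis, giving shape (5, 9).
stacked = np.stack([quantile_forecasts[q] for q in quantile_levels], axis=-1)

# Insert the single-component axis in the middle: (time_steps, 1, n_quantiles).
values = stacked[:, np.newaxis, :]

print(values.shape)  # (5, 1, 9)
```

Note that treating each quantile as a "sample" is a convenience for plotting bands; the nine values are quantile estimates, not random draws.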

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix: set _fit_called flag after zero-shot fit in FoundationForecastingModel

Fixed bug where calling fit() on zero-shot foundation models didn't
set the internal _fit_called flag, causing historical_forecasts() to
fail with "model has not been fitted yet" error.

**Problem:**
- FoundationForecastingModel.fit() returned directly from _zero_shot_fit()
- Never set self._fit_called = True
- historical_forecasts(retrain=False) failed with unfitted model error
- Users forced to use workaround: retrain=1

**Solution:**
- Capture result from _zero_shot_fit() and _train_with_peft()
- Set self._fit_called = True before returning
- Now works for both zero-shot and PEFT fine-tuning paths

**Testing:**
- Verified _fit_called=False before fit()
- Verified _fit_called=True after fit()
- historical_forecasts() now works with retrain=False
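
The fix boils down to a small pattern, sketched here with a toy class. `ZeroShotModel` is a hypothetical stand-in and not the FoundationForecastingModel implementation; it keeps only the `_fit_called` and `_zero_shot_fit` names from the description above.

```python
class ZeroShotModel:
    """Toy stand-in for a zero-shot model; not the darts implementation."""

    def __init__(self):
        self._fit_called = False

    def _zero_shot_fit(self, series):
        # A zero-shot model has nothing to train; it only remembers the series.
        self.training_series = series
        return self

    def fit(self, series):
        # Capture the result instead of returning it directly,
        # so the fitted flag is always set before fit() returns.
        result = self._zero_shot_fit(series)
        self._fit_called = True
        return result


model = ZeroShotModel()
before = model._fit_called
model.fit([1.0, 2.0, 3.0])
after = model._fit_called
print(before, after)  # False True
```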

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix: correct Chronos 2 multivariate capability (was incorrectly univariate-only)

Fixed critical capability error where Chronos 2 was marked as univariate-only
when it actually supports univariate, multivariate, AND covariate-informed
forecasting via in-context learning.

**Source of Error:**
Initial implementation (commit c66f8c71) incorrectly set multivariate: false,
likely based on outdated Chronos 1 documentation. Amazon Science blog clearly
states Chronos 2 supports "arbitrary forecasting tasks — univariate,
multivariate, and covariate informed — in a zero-shot manner."

**What Was Wrong:**
- capabilities.yaml: multivariate: false
- Docstrings: "Chronos 2 is univariate-only"
- Validation: Would reject valid multivariate series

**What Was Correct:**
- Converter implementation: Already handled multivariate properly!
- Lines 89-94: Correctly renames to target_0, target_1, etc.
- Tests: test_timeseries_to_chronos_df_multivariate passes

**Changes:**
1. capabilities.yaml:10: multivariate: false → true
2. Updated description: "Universal forecasting (univariate, multivariate, covariate-informed)"
3. Fixed ChronosModel docstring to reflect actual capabilities
4. Verified multivariate test passes

**Reference:**
https://www.amazon.science/blog/introducing-chronos-2-from-univariate-to-universal-forecasting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

fix(foundation): add configurable context_length to fix historical_forecasts on short series

**Problem:**
ChronosModel's historical_forecasts() failed on Air Passengers dataset (144 points)
with "Cannot build any dataset for prediction" error. Root cause was hardcoded
context_length of 512 in extreme_lags, which exceeded available data at start=0.75.

TimesFM worked because it used configurable max_context_length=96 in the notebook,
but Chronos hardcoded 512 with no way to adjust for shorter series.

**Solution:**
1. Add configurable context_length and max_forecast_horizon parameters to both models
2. Load hard architectural limits from capabilities.yaml (read-only constants):
   - Chronos: max 8192 context, 1024 horizon, patch_size 16
   - TimesFM: max 16384 context, 256 horizon, patch_size 32
3. Create shared validation.py module with generic validation functions
4. Both models validate user preferences against hard limits + patch_size divisibility
5. Fix TimesFM extreme_lags to return None (not 0) for unsupported covariates per API spec

**Key Design:**
- Hard limits in capabilities.yaml (model architecture constraints)
- User preferences as parameters (validated against hard limits)
- Shared validation eliminates duplication between models
- TimesFM remains on GlobalForecastingModel (migration TODO added)

**Example:**
```python
from darts.models import ChronosModel
model = ChronosModel(context_length=96)  # Rounded to patch_size=16
```
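
The validation step in the solution list could be sketched as follows. `validate_context_length` is a hypothetical function, and it is an assumption that invalid values are rejected rather than rounded; the limits used in the call are the Chronos figures quoted above.

```python
def validate_context_length(context_length, max_context, patch_size):
    """Hypothetical validator: check a user preference against hard model limits."""
    if not 0 < context_length <= max_context:
        raise ValueError(
            f"context_length must be in (0, {max_context}], got {context_length}"
        )
    if context_length % patch_size != 0:
        raise ValueError(
            f"context_length must be a multiple of patch_size={patch_size}, "
            f"got {context_length}"
        )
    return context_length


# Chronos limits from the commit message: max 8192 context, patch_size 16.
ok = validate_context_length(96, max_context=8192, patch_size=16)
print(ok)  # 96
```

Keeping the hard limits as arguments lets one function serve both models, which matches the stated goal of sharing validation between them.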

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Labels

feature request: Use this label to request a new feature
improvement: New feature or improvement

Projects

Status: Released

Development

Successfully merging this pull request may close these issues.

Add TimeSeries.from_polars

5 participants