Skip to content

Add multiseries time series regression pipeline#4256

Merged
eccabay merged 25 commits intomainfrom
4254_msts_pipeline
Aug 8, 2023
Merged

Add multiseries time series regression pipeline#4256
eccabay merged 25 commits intomainfrom
4254_msts_pipeline

Conversation

@eccabay
Copy link
Copy Markdown
Contributor

@eccabay eccabay commented Aug 2, 2023

Closes #4254

@codecov
Copy link
Copy Markdown

codecov bot commented Aug 2, 2023

Codecov Report

Merging #4256 (7ed66bd) into main (045686d) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #4256     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        351     353      +2     
  Lines      38512   38643    +131     
=======================================
+ Hits       38391   38522    +131     
  Misses       121     121             
Files Changed Coverage Δ
evalml/preprocessing/__init__.py 100.0% <ø> (ø)
...onent_tests/test_multiseries_baseline_regressor.py 100.0% <ø> (ø)
evalml/pipelines/__init__.py 100.0% <100.0%> (ø)
...sors/multiseries_time_series_baseline_regressor.py 100.0% <100.0%> (ø)
...valml/pipelines/multiseries_regression_pipeline.py 100.0% <100.0%> (ø)
evalml/pipelines/time_series_pipeline_base.py 100.0% <100.0%> (ø)
evalml/pipelines/utils.py 99.5% <100.0%> (+0.1%) ⬆️
evalml/preprocessing/utils.py 100.0% <100.0%> (ø)
evalml/tests/conftest.py 98.3% <100.0%> (ø)
...line_tests/test_multiseries_regression_pipeline.py 100.0% <100.0%> (ø)
... and 2 more

@eccabay eccabay marked this pull request as ready for review August 2, 2023 20:58
Copy link
Copy Markdown
Collaborator

@jeremyliweishih jeremyliweishih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM but requesting one small test case and had some clarifying questions!

Copy link
Copy Markdown
Contributor

@christopherbunn christopherbunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question but otherwise LGTM

return stacked_series


def stack_X(X, series_id_name, time_index, starting_index=None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, this is able to handle exogenous features too right? (e.g. stacking columns feature_a, feature_b, and feature_c columns into one feature column). If so, is there a test for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. stack_data will stack a dataframe of a single unstacked column, and stack_X is essentially just a wrapper around stack_data which calls it for every original stacked column. The test is test_stack_X in the pipeline_utils tests

Copy link
Copy Markdown
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just a few nitty things!

return self

def _fit(self, X, y):
from evalml.pipelines.utils import unstack_multiseries
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's maybe move this to the top with the other imports?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circular imports if moved to the top - I'm open to any other suggestions on how to mitigate!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we file a ticket to deal with that? That makes me...uncomfortable.

@eccabay eccabay requested a review from chukarsten August 7, 2023 20:38
Copy link
Copy Markdown
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

return self

def _fit(self, X, y):
from evalml.pipelines.utils import unstack_multiseries
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we file a ticket to deal with that? That makes me...uncomfortable.

@eccabay eccabay merged commit 4982dd1 into main Aug 8, 2023
@eccabay eccabay deleted the 4254_msts_pipeline branch August 8, 2023 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multiseries Time Series Pipeline (EvalML)

4 participants