Add recommendation score#4156

Merged
chukarsten merged 34 commits into main from 4149_recommendation_score
May 9, 2023
Conversation

@eccabay (Contributor) commented Apr 21, 2023

Closes #4149

codecov bot commented Apr 21, 2023

Codecov Report

Merging #4156 (9b67d47) into main (b530abd) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #4156     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        349     349             
  Lines      37809   38094    +285     
=======================================
+ Hits       37692   37977    +285     
  Misses       117     117             
Impacted Files Coverage Δ
evalml/objectives/__init__.py 100.0% <ø> (ø)
evalml/automl/automl_search.py 99.6% <100.0%> (+0.1%) ⬆️
evalml/objectives/standard_metrics.py 100.0% <100.0%> (ø)
evalml/objectives/utils.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_automl.py 99.6% <100.0%> (+0.1%) ⬆️
evalml/tests/conftest.py 98.3% <100.0%> (ø)
evalml/tests/objective_tests/test_objectives.py 100.0% <100.0%> (ø)

@eccabay eccabay marked this pull request as ready for review April 24, 2023 16:36
self.data_splitter = data_splitter
self.optimize_thresholds = optimize_thresholds
self.ensembling = ensembling
if objective == "auto":
@eccabay (Contributor, Author):

This section was not deleted, just moved lower to join the rest of the objective handling. It had to move down (rather than moving the new handling up here) because the new recommendation_objective logic requires several other validation checks to have already run and self.X_train and self.y_train to be set.
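The ordering constraint described here can be sketched roughly as follows (a hypothetical simplification; the class, method names, and defaults are illustrative, not EvalML's actual internals):

```python
class AutoMLSearchSketch:
    """Illustrates why 'auto' objective resolution must come last."""

    def __init__(self, X_train, y_train, problem_type, objective="auto"):
        # 1. Run validation checks first.
        if X_train is None or y_train is None:
            raise ValueError("Must provide training data.")
        # 2. Assign training data: the recommendation-objective logic
        #    may need to inspect it (e.g. to detect class imbalance).
        self.X_train = X_train
        self.y_train = y_train
        self.problem_type = problem_type
        # 3. Only now resolve the 'auto' objective, since this step
        #    depends on the checks and attributes established above.
        if objective == "auto":
            objective = self._default_objective()
        self.objective = objective

    def _default_objective(self):
        # Placeholder stand-in for the real default-objective logic.
        return "Log Loss Binary" if self.problem_type == "binary" else "R2"
```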

@jeremyliweishih (Collaborator) left a comment:

Overall LGTM - great work! Just have some nits and you may need to take a look at the docs for the new rankings!

]

if problem_type == ProblemTypes.MULTICLASS and imbalanced:
objective_list.remove(objectives.AUCMicro.name)
Collaborator:

Do you think it would be clearer to define lists for every problem type instead of encoding it in logic? I'm not too sure about this, but just a thought.

@eccabay (Contributor, Author):

Yeah, I went back and forth on this a lot. I figured this was more concise, but it may also be less clear. I'm happy to change this around if need be!
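For comparison, the two styles under discussion might look like this (a hypothetical sketch, not the actual EvalML code; the objective names and the exact removal condition simply mirror the snippet quoted above):

```python
# Style A: one shared list adjusted with conditional logic
# (the approach taken in this PR).
def objectives_via_logic(problem_type, imbalanced):
    objective_list = ["Log Loss Multiclass", "AUC Micro", "MCC Multiclass"]
    if problem_type == "multiclass" and imbalanced:
        # Mirrors the snippet above, where AUCMicro is removed.
        objective_list.remove("AUC Micro")
    return objective_list

# Style B: an explicit list per (problem type, condition), as the
# reviewer suggests -- more verbose, but every case is spelled out.
OBJECTIVES_BY_CASE = {
    ("multiclass", False): ["Log Loss Multiclass", "AUC Micro", "MCC Multiclass"],
    ("multiclass", True): ["Log Loss Multiclass", "MCC Multiclass"],
}

def objectives_via_lookup(problem_type, imbalanced):
    return list(OBJECTIVES_BY_CASE[(problem_type, imbalanced)])
```

Both return the same lists; the trade-off is purely concision versus explicitness.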

")\n",
"automl_recommendation.search(interactive_plot=False)\n",
"\n",
"automl_recommendation.rankings"
Collaborator:

Spacing is off here for some reason
[Screenshot 2023-04-25 at 11:14:53 AM: rendered rankings table with squished column spacing]

@eccabay (Contributor, Author):

This is a weird one. It doesn't happen when running the jupyter notebooks directly, and it looks like the same issue comes up with other rankings dataframes as well - this one looks the worst since it has the most columns. I think I'll drop some unnecessary columns so that they aren't so squished, but I'm also open to other suggestions, as Google was unhelpful here.
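The column-dropping workaround could be sketched like this (hypothetical column names; the real rankings DataFrame may differ):

```python
import pandas as pd

# Stand-in for automl_recommendation.rankings in the docs notebook.
rankings = pd.DataFrame({
    "id": [0, 1],
    "pipeline_name": ["Baseline", "Random Forest"],
    "search_order": [0, 1],
    "recommendation_score": [50.0, 92.3],
    "mean_cv_score": [0.66, 0.21],
    "parameters": [{}, {}],
})

# Drop wide, low-value columns so the rendered docs table is narrower.
display_columns = rankings.drop(columns=["search_order", "parameters"])
```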

Comment on lines +367 to +368
if prioritized_objective is not None:
if prioritized_objective not in objectives:
Contributor:

prioritized_objective allows us to give more (or, I guess, less) weight to a single objective. Is it worth designing this function to instead accept an objectives_weighting (dict[str, float]) mapping objectives to their relative weights in the recommendation score?

I was kind of thinking that there would be existing presets functioning as data-scientist-selected "blends" of metrics, based on what our user is looking for. This would require weighting multiple metrics at the same time.
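The objectives_weighting idea could be sketched like this (purely illustrative; neither this parameter nor these preset blends exist in the PR as merged):

```python
# Hypothetical presets: curated "blends" of metrics for common goals.
PRESET_BLENDS = {
    "fraud_detection": {"Recall": 0.5, "Precision": 0.3, "AUC": 0.2},
    "balanced": {"Log Loss Binary": 0.34, "F1": 0.33, "AUC": 0.33},
}

def recommendation_score(normalized_scores, objectives_weighting):
    """Weighted sum of already-normalized (0-1) objective scores,
    rescaled to 0-100."""
    total_weight = sum(objectives_weighting.values())
    return 100 * sum(
        weight * normalized_scores[name]
        for name, weight in objectives_weighting.items()
    ) / total_weight
```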

@eccabay (Contributor, Author):

That was my original design - amended after discussion here

assert "already one of the default objectives" in caplog.text


def test_recommendation_include_non_optimization(X_y_binary):
Contributor:

What's this test about?

@eccabay (Contributor, Author) commented Apr 26, 2023:

We get the additional (non-main-objective) scores through the additional_objectives mechanism. By default, this calculates the objectives for all of the optimization metrics, excluding any that are classified as ranking only (and we have a check against it). However, we want to make sure users have full control over which objectives are included in the recommendation score, including ones that are ranking only (for example, recall). This test ensures that the logic permitting this works.
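The behavior under test might be sketched as follows (illustrative names, not the actual EvalML implementation):

```python
# Hypothetical set of objectives classified as ranking-only.
RANKING_ONLY = {"Recall", "Recall Macro"}

def objectives_for_recommendation(defaults, include_objectives=None):
    """Ranking-only objectives are excluded by default, but a user who
    names one explicitly gets it included anyway."""
    selected = [obj for obj in defaults if obj not in RANKING_ONLY]
    for obj in include_objectives or []:
        if obj not in selected:
            selected.append(obj)
    return selected
```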



@pytest.mark.parametrize("imbalanced_data", [True, False])
def test_use_recommendation_score_imbalanced(
Collaborator:

I reviewed the code...why are we treating imbalanced data differently?

@eccabay (Contributor, Author):

This was discussed in the design doc - it's used exclusively in the multiclass classification case, when selecting whether to use AUC Micro or AUC Weighted, because micro averages are better for imbalanced datasets.
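The imbalance check that drives this choice might look something like the following (a sketch only; the threshold, function name, and the exact detection rule are assumptions, not the design doc's actual criteria):

```python
from collections import Counter

def is_imbalanced(y_train, threshold=0.2):
    """Flag a target as imbalanced when the rarest class falls below a
    minimum fraction of all samples (threshold is illustrative).  For
    multiclass problems, this flag would then decide between the
    AUC Micro and AUC Weighted variants."""
    counts = Counter(y_train)
    return min(counts.values()) / len(y_train) < threshold
```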

Comment on lines +1704 to +1705
ranking_column = "ranking_score"
if self.use_recommendation:
Contributor:

Is there any benefit to modifying the API of full_rankings() so the ranking column is selectable via the function call rather than the AutoML init setting? Probably not, I guess - we wouldn't want the user to be able to select "recommendation_score" when self.use_recommendation hasn't been set...
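The guard being discussed can be sketched as a free function (illustrative; the real logic lives inside AutoMLSearch and uses a pandas DataFrame rather than a list of dicts):

```python
def full_rankings(results, use_recommendation):
    """Sort results by the active ranking column.  Exposing the column
    as a call-time argument instead would let callers request
    'recommendation_score' even when use_recommendation is False --
    the inconsistency the review comment warns about."""
    ranking_column = "ranking_score"
    if use_recommendation:
        ranking_column = "recommendation_score"
    return sorted(results, key=lambda row: row[ranking_column], reverse=True)
```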

@eccabay eccabay requested a review from chukarsten May 1, 2023 12:37
Comment on lines +358 to +366
prioritized_objective (str): An optional name of a priority objective that should be given heavier weight
than the other objectives contributing to the score. Defaults to None, where all objectives are
weighted equally.
prioritized_weight (float): The weight (maximum of 1) to attribute to the prioritized objective, if it exists.
Should be between 0 and 1. Defaults to 0.5.
custom_weights (dict[str,float]): A dictionary mapping objective names to corresponding weights between 0 and 1.
If all objectives are listed, should add up to 1. If a subset of objectives are listed, should add up to less
than 1, and remaining weight will be evenly distributed between the remaining objectives. Should not be used
at the same time as prioritized_objective.
Contributor:

It's my fault for doing this piecemeal - I blame being distracted - but perhaps we should consider an API where these two are put together and the input is a dict of str:float. If there is only one entry, perhaps that should be interpreted as the prioritized objective and weight.

@eccabay (Contributor, Author):

As we've discussed previously, I'm hesitant to completely remove the prioritized objective/weight API. I think the benefit of the simplicity outweighs the slightly larger API. That being said, after thinking about it more I would be open to removing the prioritized_weight argument, and just keeping prioritized_objective. That allows users to have an easy way to say "I care about this one more", and if they want to be more specific about the weights, there is still custom_weights to open up that opportunity. Thoughts?
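The weighting semantics documented in the docstring above could be sketched as a standalone helper (hypothetical code, not the PR's actual implementation):

```python
def compute_weights(objectives, prioritized_objective=None,
                    prioritized_weight=0.5, custom_weights=None):
    """Resolve per-objective weights per the documented semantics:
    custom_weights wins if given, with leftover weight split evenly
    among unlisted objectives; otherwise a prioritized objective gets
    prioritized_weight and the rest split the remainder evenly."""
    if custom_weights is not None:
        weights = dict(custom_weights)
        remaining = [o for o in objectives if o not in custom_weights]
        leftover = 1.0 - sum(custom_weights.values())
        for obj in remaining:
            weights[obj] = leftover / len(remaining)
        return weights
    if prioritized_objective is not None:
        others = [o for o in objectives if o != prioritized_objective]
        weights = {o: (1.0 - prioritized_weight) / len(others) for o in others}
        weights[prioritized_objective] = prioritized_weight
        return weights
    # Default: all objectives weighted equally.
    return {o: 1.0 / len(objectives) for o in objectives}
```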

@chukarsten chukarsten merged commit d777c6c into main May 9, 2023
@chukarsten chukarsten deleted the 4149_recommendation_score branch May 9, 2023 17:48
@chukarsten chukarsten mentioned this pull request May 10, 2023


Development

Successfully merging this pull request may close these issues.

Implement .recommendation_score() for Pipelines

3 participants