    a08555a2
    Raghav RV authored
    [MRG + 2] ENH Allow `cross_val_score`, `GridSearchCV` et al. to evaluate on multiple metrics (#7388)
    
    * ENH cross_val_score now supports multiple metrics
    
    * DOCFIX permutation_test_score
    
    * ENH validate multiple metric scorers
    
    * ENH Move validation of multimetric scoring param out
    
    * ENH GridSearchCV and RandomizedSearchCV now support multiple metrics
    
    * EXA Add an example demonstrating the multiple metric in GridSearchCV
    
    * ENH Let check_multimetric_scoring tell if it's multimetric or not
    
    * FIX For single metric name of scorer should remain 'score'
    
    * ENH validation_curve and learning_curve now support multiple metrics
    
    * MNT move _aggregate_score_dicts helper into _validation.py
    
    * TST More testing/ Fixing scores to the correct values
    
    * EXA Add cross_val_score to multimetric example
    
    * Rename to multiple_metric_evaluation.py
    
    * MNT Remove scaffolding
    
    * FIX doctest imports
    
    * FIX wrap the scorer and unwrap the score when using _score() in rfe
    
    * TST Cleanup the tests. Test for is_multimetric too
    
    * TST Make sure it registers as single metric when scoring is of that type
    
    * PEP8
    
    * Don't use dict comprehension to make it work in python2.6
    
    * ENH/FIX/TST grid_scores_ should not be available for multimetric evaluation
    
    * FIX+TST delegated methods NA when multimetric is enabled...
    
    TST Add general tests to GridSearchCV and RandomizedSearchCV
    
    * ENH add option to disable delegation on multimetric scoring
    
    * Remove old function from __all__
    
    * flake8
    
    * FIX revert disable_on_multimetric
    
    * stash
    
    * Fix incorrect rebase
    
    * [ci skip]
    
    * Make sure refit works as expected and remove irrelevant tests
    
    * Allow passing standard scorers by name in multimetric scorers
    
    * Fix example
    
    * flake8
    
    * Address reviews
    
    * Fix indentation
    
    * Ensure {'acc': 'accuracy'} and ['precision'] are valid inputs
    
    * Test that for single metric, 'score' is a key
    
    * Typos
    
    * Fix incorrect rebase
    
    * Compare multimetric grid search with multiple single metric searches
    
    * Test X, y list and pandas input; Test multimetric for unsupervised grid search
    
    * Fix tests; Unsupervised multimetric gs will not pass until #8117 is merged
    
    * Make a plot of Precision vs ROC AUC for RandomForest varying the n_estimators
    
    * Add example to grid_search.rst
    
    * Use the classic tuning of C param in SVM instead of estimators in RF
    
    * FIX Remove scoring arg in default scorer test
    
    * flake8
    
    * Search for min_samples_split in DTC; Also show f-score
    
    * REVIEW Make check_multimetric_scoring private
    
    * FIX Add more samples to see if 3% mismatch on 32 bit systems gets fixed
    
    * REVIEW Plot best score; Shorten legends
    
    * REVIEW/COSMIT multimetric --> multi-metric
    
    * REVIEW Mark the best scores of P/R scores too
    
    * Revert "FIX Add more samples to see if 3% mismatch on 32 bit systems gets fixed"
    
    This reverts commit ba766d98353380a186fbc3dade211670ee72726d.
    
    * ENH Use looping for iid testing
    
    * FIX use param grid as scipy's stats dist in 0.12 do not accept seed
    
    * ENH more looping less code; Use small non-noisy dataset
    
    * FIX Use named arg after expanded args
    
    * TST More testing of the refit parameter
    
    * Test that in multimetric search refit to single metric, the delegated methods
      work as expected.
    * Test that setting probability=False works with multimetric too
    * Test refit=False gives sensible error
    
    * COSMIT multimetric --> multi-metric
    
    * REV Correct example doc
    
    * COSMIT
    
    * REVIEW Make tests stronger; Fix bugs in _check_multimetric_scorer
    
    * REVIEW refit param: Raise for empty strings
    
    * TST Invalid refit params
    
    * REVIEW Use <scorer_name> alone; recall --> Recall
    
    * REV specify when we expect scorers to not be None
    
    * FLAKE8
    
    * REVERT multimetrics in learning_curve and validation_curve
    
    * REVIEW Simpler coding style
    
    * COSMIT
    
    * COSMIT
    
    * REV Compress example a bit. Move comment to top
    
    * FIX fit_grid_point's previous API must be preserved
    
    * Flake8
    
    * TST Use loop; Compare with single-metric
    
    * REVIEW Use dict-comprehension instead of helper
    
    * REVIEW Remove redundant test
    
    * Fix tests incorrect braces
    
    * COSMIT
    
    * REVIEW Use regexp
    
    * REV Simplify aggregation of score dicts
    
    * FIX precision and accuracy test
    
    * FIX doctest and flake8
    
    * TST the best_* attributes multimetric with single metric
    
    * Address @jnothman's review
    
    * Address more comments \o/
    
    * DOCFIXES
    
    * Fix use the validated fit_param from fit's arguments
    
    * Revert alpha to a lower value as before
    
    * Using def instead of lambda
    
    * Address @jnothman's review batch 1: Fix tests / Doc fixes
    
    * Remove superfluous tests
    
    * Remove more superfluous testing
    
    * TST/FIX loop over refit and check found n_clusters
    
    * Cosmetic touches
    
    * Use zip instead of manually listing the keys
    
    * Fix inverse_transform
    
    * FIX bug in fit_grid_point; Allow only single score
    
    TST if fit_grid_point works as intended
    
    * ENH Use only ROC-AUC and F1-score
    
    * Fix typos and flake8; Address Andy's reviews
    
    MNT Add a comment on why we do such a transpose + some fixes
    
    * ENH Better error messages for incorrect multimetric scoring values +...
    
    ENH Avoid exception traceback while using incorrect scoring string
    
    * Dict keys must be of string type only
    
    * 1. Better error message for invalid scoring 2...
    Internal functions return single score for single metric scoring
    
    * Fix test failures and shuffle tests
    
    * Avoid wrapping scorer as dict in learning_curve
    
    * Remove doc example as asked for
    
    * Some leftover ones
    
    * Don't wrap scorer in validation_curve either
    
    * Add a doc example and skip it as dict order fails doctest
    
    * Import zip from six for python2.7 compat
    
    * Make cross_val_score return a cv_results-like dict
    
    * Add relevant sections to userguide
    
    * Flake8 fixes
    
    * Add whatsnew and fix broken links
    
    * Use AUC and accuracy instead of f1
    
    * Fix failing doctests cross_validation.rst
    
    * DOC add the wrapper example for metrics that return multiple return values
    
    * Address andy's comments
    
    * Be less weird
    
    * Address more of andy's comments
    
    * Make a separate cross_validate function to return dict and a cross_val_score
    
    * Update the docs to reflect the new cross_validate function
    
    * Add cross_validate to toc-tree
    
    * Add more tests on type of cross_validate return and time limits
    
    * FIX failing doctests
    
    * FIX ensure keys are not plural
    
    * DOC fix
    
    * Address some pending comments
    
    * Remove the comment as it is irrelevant now
    
    * Remove excess blank line
    
    * Fix flake8 inconsistencies
    
    * Allow fit_times to be 0 to conform with windows precision
    
    * DOC specify how refit param is to be set in multiple metric case
    
    * TST ensure cross_validate works for string single metrics + address @jnothman's reviews
    
    * Doc fixes
    
    * Remove the shape and transform parameter of _aggregate_score_dicts
    
    * Address Joel's doc comments
    
    * Fix broken doctest
    
    * Fix the spurious file
    
    * Address Andy's comments
    
    * MNT Remove erroneous entry
    
    * Address Andy's comments
    
    * FIX broken links
    
    * Update whats_new.rst
    
    missing newline
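Several entries above refer to aggregating per-split score dicts (the `_aggregate_score_dicts` helper, "Use zip instead of manually listing the keys", and the note about a transpose). As a rough illustration only, the idea can be sketched in plain Python. The function name `aggregate_score_dicts`, the list-based output, and the score values below are assumptions made for this sketch and are not taken from the commit. The library's own helper may differ, for example by returning arrays.

```python
def aggregate_score_dicts(scores):
    """Transpose a list of per-split score dicts into a dict of per-metric lists.

    Hypothetical sketch: assumes every dict in `scores` has the same keys.
    """
    if not scores:
        return {}
    # Take the metric names from the first split, then collect each metric's
    # value across all splits, preserving split order.
    return {key: [split[key] for split in scores] for key in scores[0]}


# Made-up scores for three cross-validation splits and two metrics.
per_split = [
    {"accuracy": 0.90, "roc_auc": 0.95},
    {"accuracy": 0.85, "roc_auc": 0.91},
    {"accuracy": 0.88, "roc_auc": 0.93},
]

by_metric = aggregate_score_dicts(per_split)
print(by_metric["accuracy"])  # [0.9, 0.85, 0.88]
print(by_metric["roc_auc"])   # [0.95, 0.91, 0.93]
```

Keeping one list per metric name makes it straightforward to report each metric under its own key, which matches the commit's description of a `cv_results`-like dict for multi-metric evaluation.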