test_split.py

Commit 3f8743f4
Main Commits - Major
      Raghav RV authored
      --------------------
      
* ENH Reorganize classes/fn from grid_search into search.py
* ENH Reorganize classes/fn from cross_validation into split.py
* ENH Reorganize classes/fn from cross_validation/learning_curve into validate.py
      
      * MAINT Merge _check_cv into check_cv inside the model_selection module
      * MAINT Update all the imports to point to the model_selection module
* FIX use iter_cv to iterate through the new-style/old-style cv objs
      * TST Add tests for the new model_selection members
      * ENH Wrap the old-style cv obj/iterables instead of using iter_cv
      
* ENH Use scipy's binomial coefficient function comb for calculation of nCk
      * ENH Few enhancements to the split module
      * ENH Improve check_cv input validation and docstring
      * MAINT _get_test_folds(X, y, labels) --> _get_test_folds(labels)
      * TST if 1d arrays for X introduce any errors
      * ENH use 1d X arrays for all tests;
      * ENH X_10 --> X (global var)
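The nCk bullet above refers to counting splits with a binomial coefficient, as leave-p-out style splitters do. A minimal sketch of that idea (the helper name is hypothetical; the commit itself uses scipy.special.comb, which with exact=True returns the same integer as the stdlib math.comb used here):

```python
from math import comb  # scipy.special.comb(n, k, exact=True) is equivalent

def n_splits_leave_p_out(n_labels, p):
    # Number of ways to choose p held-out labels from n_labels,
    # i.e. the number of splits a leave-p-labels-out splitter yields.
    return comb(n_labels, p)

print(n_splits_leave_p_out(5, 2))  # 10
```

Using exact integer arithmetic here avoids the float overflow/rounding that a factorial-based formula would hit for large n.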
      
      Minor
      -----
      
      * ENH _PartitionIterator --> _BaseCrossValidator;
      * ENH CVIterator --> CVIterableWrapper
      * TST Import the old SKF locally
      * FIX/TST Clean up the split module's tests.
      * DOC Improve documentation of the cv parameter
      * COSMIT consistently hyphenate cross-validation/cross-validator
      * TST Calculate n_samples from X
      * COSMIT Use separate lines for each import.
      * COSMIT cross_validation_generator --> cross_validator
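The CVIterator --> CVIterableWrapper rename above concerns the adapter that lets old-style iterables of (train, test) index pairs be used through the new split()/get_n_splits() interface. A hypothetical minimal sketch of what such a wrapper does (the real class lives in sklearn; this is an illustration, not its implementation):

```python
class CVIterableWrapper:
    """Expose the new cv interface on top of an old-style iterable."""

    def __init__(self, cv_iterable):
        # Materialize the iterable so it can be re-iterated safely.
        self.cv = list(cv_iterable)

    def get_n_splits(self, X=None, y=None, labels=None):
        return len(self.cv)

    def split(self, X=None, y=None, labels=None):
        for train, test in self.cv:
            yield train, test
```

Wrapping once at check_cv time means downstream code only ever sees the new-style interface.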
      
      Commits merged manually
      -----------------------
      
* FIX Document the random_state attribute in RandomizedSearchCV
      * MAINT Use check_cv instead of _check_cv
      * ENH refactor OVO decision function, use it in SVC for sklearn-like
        decision_function shape
      * FIX avoid memory cost when sampling from large parameter grids
      
ENH Major to Minor incremental enhancements to the model_selection module
      
      Squashed commit messages - (For reference)
      
      Major
      -----
      
      * ENH p --> n_labels
      * FIX *ShuffleSplit: all float/invalid type errors at init and int error at split
      * FIX make PredefinedSplit accept test_folds in constructor; Cleanup docstrings
* ENH+TST KFold: generate the rng at every split call for reproducibility
      * FIX/MAINT KFold: make shuffle a public attr
      * FIX Make CVIterableWrapper private.
      * FIX reuse len_cv instead of recalculating it
      * FIX Prevent adding *SearchCV estimators from the old grid_search module
* re-FIX In all_estimators: sort using only the 1st item (name)
    to avoid collision between the old and the new GridSearch classes.
      * FIX test_validate.py: Use 2D X (1D X is being detected as a single sample)
      * MAINT validate.py --> validation.py
      * MAINT make the submodules private
      * MAINT Support old cv/gs/lc until 0.19
      * FIX/MAINT n_splits --> get_n_splits
      * FIX/TST test_logistic.py/test_ovr_multinomial_iris:
          pass predefined folds as an iterable
      * MAINT expose BaseCrossValidator
      * Update the model_selection module with changes from master
- From #5161:
  - MAINT remove redundant p variable
  - Add check for sparse prediction in cross_val_predict
        - From #5201 - DOC improve random_state param doc
        - From #5190 - LabelKFold and test
        - From #4583 - LabelShuffleSplit and tests
        - From #5300 - shuffle the `labels` not the `indxs` in LabelKFold + tests
        - From #5378 - Make the GridSearchCV docs more accurate.
        - From #5458 - Remove shuffle from LabelKFold
        - From #5466(#4270) - Gaussian Process by Jan Metzen
        - From #4826 - Move custom error / warnings into sklearn.exception
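One of the Major bullets above makes PredefinedSplit accept test_folds in its constructor. A hypothetical pure-Python sketch of the test_folds semantics (the real class is sklearn's; here a generator stands in for it): entry i of test_folds names the fold in which sample i appears as a test sample, and -1 means the sample is never used for testing.

```python
def predefined_split(test_folds):
    """Yield (train, test) index lists for user-specified folds.

    test_folds[i] is the fold index of sample i when it is a test
    sample; -1 excludes the sample from every test set.
    """
    unique_folds = sorted({f for f in test_folds if f != -1})
    for fold in unique_folds:
        test = [i for i, f in enumerate(test_folds) if f == fold]
        train = [i for i, f in enumerate(test_folds) if f != fold]
        yield train, test

for train, test in predefined_split([0, 1, -1, 1, 0]):
    print(train, test)
```

Note that sample 2 (fold -1) appears in every training set but in no test set.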
      
      Minor
      -----
      
      * ENH Make the KFold shuffling test stronger
      * FIX/DOC Use the higher level model_selection module as ref
      * DOC in check_cv "y : array-like, optional"
      * DOC a supervised learning problem --> supervised learning problems
      * DOC cross-validators --> cross-validation strategies
      * DOC Correct Olivier Grisel's name ;)
      * MINOR/FIX cv_indices --> kfold
      * FIX/DOC Align the 'See also' section of the new KFold, LeaveOneOut
      * TST/FIX imports on separate lines
      * FIX use __class__ instead of classmethod
      * TST/FIX import directly from model_selection
      * COSMIT Relocate the random_state documentation
      * COSMIT remove pass
      * MAINT Remove deprecation warnings from old tests
      * FIX correct import at test_split
      * FIX/MAINT Move P_sparse, X, y defns to top; rm unused W_sparse, X_sparse
      * FIX random state to avoid doctest failure
      * TST n_splits and split wrapping of _CVIterableWrapper
      * FIX/MAINT Use multilabel indicator matrix directly
      * TST/DOC clarify why we conflate classes 0 and 1
      * DOC add comment that this was taken from BaseEstimator
      * FIX use of labels is not needed in stratified k fold
      * Fix cross_validation reference
      * Fix the labels param doc
      
      FIX/DOC/MAINT Addressing the review comments by Arnaud and Andy
      
      COSMIT Sort the members alphabetically
      COSMIT len_cv --> n_splits
      COSMIT Merge 2 if; FIX Use kwargs
      DOC Add my name to the authors :D
      DOC make labels parameter consistent
      FIX Remove hack for boolean indices; + COSMIT idx --> indices; DOC Add Returns
      COSMIT preds --> predictions
      DOC Add Returns and neatly arrange X, y, labels
FIX idx(s)/ind(s) --> indice(s)
      COSMIT Merge if and else to elif
      COSMIT n --> n_samples
      COSMIT Use bincount only once
COSMIT cls --> class_i; class_i (ith class indices) --> perm_indices_class_i
      
      FIX/ENH/TST Addressing the final reviews
      
      COSMIT c --> count
      FIX/TST make check_cv raise ValueError for string cv value
      TST nested cv (gs inside cross_val_score) works for diff cvs
FIX/ENH Raise ValueError when labels is None for label-based cvs
TST that labels is passed correctly to the cv and that the ValueError is
propagated to cross_val_score/predict and grid search
      FIX pass labels to cross_val_score
      FIX use make_classification
      DOC Add Returns; COSMIT Remove scaffolding
      TST add a test to check the _build_repr helper
      REVERT the old GS/RS should also be tested by the common tests.
ENH Add a tuple of all/label-based CVs
FIX raise ValueError even at get_n_splits if labels is None
      FIX Fabian's comments
      PEP8
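The final-review bullets above require label-based cross-validators to raise ValueError when labels is None, at both split() and get_n_splits(). A hypothetical minimal sketch of that contract (the class name and split strategy, a leave-one-label-out style split, are illustrative, not the commit's code):

```python
class LabelBasedCV:
    """Sketch of a cv that refuses to run without labels."""

    def get_n_splits(self, X=None, y=None, labels=None):
        if labels is None:
            raise ValueError("The 'labels' parameter should not be None.")
        return len(set(labels))

    def split(self, X, y=None, labels=None):
        if labels is None:
            raise ValueError("The 'labels' parameter should not be None.")
        # Leave one label's samples out as the test set per split.
        for label in sorted(set(labels)):
            test = [i for i, l in enumerate(labels) if l == label]
            train = [i for i, l in enumerate(labels) if l != label]
            yield train, test
```

Raising at get_n_splits too, not just at split, is what lets cross_val_score and the search estimators fail fast with a clear message.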