diff --git a/doc/datasets/index.rst b/doc/datasets/index.rst index cc258422a421177d257e00378e77b1b43c09b043..7bff294e52048dcae1657a576ea4670b25aebf86 100644 --- a/doc/datasets/index.rst +++ b/doc/datasets/index.rst @@ -81,7 +81,7 @@ and pipeline on 2D data. load_sample_images load_sample_image -.. image:: ../auto_examples/cluster/images/plot_color_quantization_1.png +.. image:: ../auto_examples/cluster/images/plot_color_quantization_001.png :target: ../auto_examples/cluster/plot_color_quantization.html :scale: 30 :align: right @@ -108,7 +108,7 @@ Sample generators In addition, scikit-learn includes various random sample generators that can be used to build artificial datasets of controlled size and complexity. -.. image:: ../auto_examples/datasets/images/plot_random_dataset_1.png +.. image:: ../auto_examples/datasets/images/plot_random_dataset_001.png :target: ../auto_examples/datasets/plot_random_dataset.html :scale: 50 :align: center diff --git a/doc/modules/biclustering.rst b/doc/modules/biclustering.rst index 1ad5eae4862f8daff2a2ba41a7910c6f0e240a4b..e4583b451e17f8b799d4ef0b6d9ba3e49978b959 100644 --- a/doc/modules/biclustering.rst +++ b/doc/modules/biclustering.rst @@ -44,8 +44,8 @@ biclusters on the diagonal. Here is an example of this structure where biclusters have higher average values than the other rows and columns: -.. figure:: ../auto_examples/bicluster/images/plot_spectral_coclustering_3.png - :target: ../auto_examples/bicluster/images/plot_spectral_coclustering_3.png +.. figure:: ../auto_examples/bicluster/images/plot_spectral_coclustering_003.png + :target: ../auto_examples/bicluster/images/plot_spectral_coclustering_003.png :align: center :scale: 50 @@ -56,8 +56,8 @@ each column belongs to all row clusters. Here is an example of this structure where the variance of the values within each bicluster is small: -.. figure:: ../auto_examples/bicluster/images/plot_spectral_biclustering_3.png - :target: ../auto_examples/bicluster/images/plot_spectral_biclustering_3.png +.. figure:: ../auto_examples/bicluster/images/plot_spectral_biclustering_003.png + :target: ../auto_examples/bicluster/images/plot_spectral_biclustering_003.png :align: center :scale: 50 diff --git a/doc/modules/clustering.rst b/doc/modules/clustering.rst index 6ed6ac742f90e4ed5bb8b733aebc078946740d7b..4cb91923aafeff52799036fab32fca80b8eff978 100644 --- a/doc/modules/clustering.rst +++ b/doc/modules/clustering.rst @@ -33,7 +33,7 @@ data can be found in the ``labels_`` attribute. Overview of clustering methods =============================== -.. figure:: ../auto_examples/cluster/images/plot_cluster_comparison_1.png +.. figure:: ../auto_examples/cluster/images/plot_cluster_comparison_001.png :target: ../auto_examples/cluster/plot_cluster_comparison.html :align: center :scale: 50 @@ -161,7 +161,7 @@ and the new centroids are computed and the algorithm repeats these last two steps until this value is less than a threshold. In other words, it repeats until the centroids do not move significantly. -.. image:: ../auto_examples/cluster/images/plot_kmeans_digits_1.png +.. image:: ../auto_examples/cluster/images/plot_kmeans_digits_001.png :target: ../auto_examples/cluster/plot_kmeans_digits.html :align: right :scale: 35 @@ -245,7 +245,7 @@ convergence or a predetermined number of iterations is reached. of the results is reduced. In practice this difference in quality can be quite small, as shown in the example and cited reference. -.. figure:: ../auto_examples/cluster/images/plot_mini_batch_kmeans_1.png +.. 
figure:: ../auto_examples/cluster/images/plot_mini_batch_kmeans_001.png :target: ../auto_examples/cluster/plot_mini_batch_kmeans.html :align: center :scale: 100 @@ -283,7 +283,7 @@ values from other pairs. This updating happens iteratively until convergence, at which point the final exemplars are chosen, and hence the final clustering is given. -.. figure:: ../auto_examples/cluster/images/plot_affinity_propagation_1.png +.. figure:: ../auto_examples/cluster/images/plot_affinity_propagation_001.png :target: ../auto_examples/cluster/plot_affinity_propagation.html :align: center :scale: 50 @@ -384,7 +384,7 @@ Labelling a new sample is performed by finding the nearest centroid for a given sample. -.. figure:: ../auto_examples/cluster/images/plot_mean_shift_1.png +.. figure:: ../auto_examples/cluster/images/plot_mean_shift_001.png :target: ../auto_examples/cluster/plot_mean_shift.html :align: center :scale: 50 @@ -424,11 +424,11 @@ graph vertices are pixels, and edges of the similarity graph are a function of the gradient of the image. -.. |noisy_img| image:: ../auto_examples/cluster/images/plot_segmentation_toy_1.png +.. |noisy_img| image:: ../auto_examples/cluster/images/plot_segmentation_toy_001.png :target: ../auto_examples/cluster/plot_segmentation_toy.html :scale: 50 -.. |segmented_img| image:: ../auto_examples/cluster/images/plot_segmentation_toy_2.png +.. |segmented_img| image:: ../auto_examples/cluster/images/plot_segmentation_toy_002.png :target: ../auto_examples/cluster/plot_segmentation_toy.html :scale: 50 @@ -455,11 +455,11 @@ function of the gradient of the image. * :ref:`example_cluster_plot_lena_segmentation.py`: Spectral clustering to split the image of lena in regions. -.. |lena_kmeans| image:: ../auto_examples/cluster/images/plot_lena_segmentation_1.png +.. |lena_kmeans| image:: ../auto_examples/cluster/images/plot_lena_segmentation_001.png :target: ../auto_examples/cluster/plot_lena_segmentation.html :scale: 65 -.. |lena_discretize| image:: ../auto_examples/cluster/images/plot_lena_segmentation_2.png +.. |lena_discretize| image:: ../auto_examples/cluster/images/plot_lena_segmentation_002.png :target: ../auto_examples/cluster/plot_lena_segmentation.html :scale: 65 @@ -545,15 +545,15 @@ Different linkage type: Ward, complete and average linkage :class:`AgglomerativeClustering` supports Ward, average, and complete linkage strategies. -.. image:: ../auto_examples/cluster/images/plot_digits_linkage_1.png +.. image:: ../auto_examples/cluster/images/plot_digits_linkage_001.png :target: ../auto_examples/cluster/plot_digits_linkage.html :scale: 43 -.. image:: ../auto_examples/cluster/images/plot_digits_linkage_2.png +.. image:: ../auto_examples/cluster/images/plot_digits_linkage_002.png :target: ../auto_examples/cluster/plot_digits_linkage.html :scale: 43 -.. image:: ../auto_examples/cluster/images/plot_digits_linkage_3.png +.. image:: ../auto_examples/cluster/images/plot_digits_linkage_003.png :target: ../auto_examples/cluster/plot_digits_linkage.html :scale: 43 @@ -582,11 +582,11 @@ constraints forbid the merging of points that are not adjacent on the swiss roll, and thus avoid forming clusters that extend across overlapping folds of the roll. -.. |unstructured| image:: ../auto_examples/cluster/images/plot_ward_structured_vs_unstructured_1.png +.. |unstructured| image:: ../auto_examples/cluster/images/plot_ward_structured_vs_unstructured_001.png :target: ../auto_examples/cluster/plot_ward_structured_vs_unstructured.html :scale: 49 -.. 
|structured| image:: ../auto_examples/cluster/images/plot_ward_structured_vs_unstructured_2.png +.. |structured| image:: ../auto_examples/cluster/images/plot_ward_structured_vs_unstructured_002.png :target: ../auto_examples/cluster/plot_ward_structured_vs_unstructured.html :scale: 49 @@ -634,19 +634,19 @@ enable only merging of neighboring pixels on an image, as in the clusters and almost empty ones. (see the discussion in :ref:`example_cluster_plot_agglomerative_clustering.py`). -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_1.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_001.png :target: ../auto_examples/cluster/plot_agglomerative_clustering.html :scale: 38 -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_2.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_002.png :target: ../auto_examples/cluster/plot_agglomerative_clustering.html :scale: 38 -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_3.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_003.png :target: ../auto_examples/cluster/plot_agglomerative_clustering.html :scale: 38 -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_4.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_004.png :target: ../auto_examples/cluster/plot_agglomerative_clustering.html :scale: 38 @@ -670,15 +670,15 @@ The guidelines for choosing a metric is to use one that maximizes the distance between samples in different classes, and minimizes that within each class. -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_5.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_005.png :target: ../auto_examples/cluster/plot_agglomerative_clustering_metrics.html :scale: 32 -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_6.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_006.png :target: ../auto_examples/cluster/plot_agglomerative_clustering_metrics.html :scale: 32 -.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_7.png +.. image:: ../auto_examples/cluster/images/plot_agglomerative_clustering_metrics_007.png :target: ../auto_examples/cluster/plot_agglomerative_clustering_metrics.html :scale: 32 @@ -728,7 +728,7 @@ indicating core samples found by the algorithm. Smaller circles are non-core samples that are still part of a cluster. Moreover, the outliers are indicated by black points below. -.. |dbscan_results| image:: ../auto_examples/cluster/images/plot_dbscan_1.png +.. |dbscan_results| image:: ../auto_examples/cluster/images/plot_dbscan_001.png :target: ../auto_examples/cluster/plot_dbscan.html :scale: 50 @@ -1169,7 +1169,7 @@ Drawbacks smaller sample sizes or larger number of clusters it is safer to use an adjusted index such as the Adjusted Rand Index (ARI)**. -.. figure:: ../auto_examples/cluster/images/plot_adjusted_for_chance_measures_1.png +.. 
figure:: ../auto_examples/cluster/images/plot_adjusted_for_chance_measures_001.png :target: ../auto_examples/cluster/plot_adjusted_for_chance_measures.html :align: center :scale: 100 diff --git a/doc/modules/computational_performance.rst b/doc/modules/computational_performance.rst index f421c40fac684e6a8727eea55c9f1806acf3e512..cc5a792a47d572cc3a69f4d78eb25d79570a478d 100644 --- a/doc/modules/computational_performance.rst +++ b/doc/modules/computational_performance.rst @@ -51,13 +51,13 @@ linear algebra libraries optimizations etc.). Here we see on a setting with few features that independently of estimator choice the bulk mode is always faster, and for some of them by 1 to 2 orders of magnitude: -.. |atomic_prediction_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_1.png +.. |atomic_prediction_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_001.png :target: ../auto_examples/applications/plot_prediction_latency.html :scale: 80 .. centered:: |atomic_prediction_latency| -.. |bulk_prediction_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_2.png +.. |bulk_prediction_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_002.png :target: ../auto_examples/applications/plot_prediction_latency.html :scale: 80 @@ -79,7 +79,7 @@ From a computing perspective it also means that the number of basic operations too. Here is a graph of the evolution of the prediction latency with the number of features: -.. |influence_of_n_features_on_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_3.png +.. |influence_of_n_features_on_latency| image:: ../auto_examples/applications/images/plot_prediction_latency_003.png :target: ../auto_examples/applications/plot_prediction_latency.html :scale: 80 @@ -148,7 +148,7 @@ describe it fully. Of course sparsity influences in turn the prediction time as the sparse dot-product takes time roughly proportional to the number of non-zero coefficients. -.. |en_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_1.png +.. |en_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_001.png :target: ../auto_examples/applications/plot_model_complexity_influence.html :scale: 80 @@ -163,7 +163,7 @@ support vector. In the following graph the ``nu`` parameter of :class:`sklearn.svm.classes.NuSVR` was used to influence the number of support vectors. -.. |nusvr_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_2.png +.. |nusvr_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_002.png :target: ../auto_examples/applications/plot_model_complexity_influence.html :scale: 80 @@ -175,7 +175,7 @@ important role. Latency and throughput should scale linearly with the number of trees. In this case we used directly the ``n_estimators`` parameter of :class:`sklearn.ensemble.gradient_boosting.GradientBoostingRegressor`. -.. |gbt_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_3.png +.. 
|gbt_model_complexity| image:: ../auto_examples/applications/images/plot_model_complexity_influence_003.png :target: ../auto_examples/applications/plot_model_complexity_influence.html :scale: 80 @@ -199,7 +199,7 @@ files, tokenizing the text and hashing it into a common vector space) is taking 100 to 500 times more time than the actual prediction code, depending on the chosen model. - .. |prediction_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_4.png + .. |prediction_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_004.png :target: ../auto_examples/applications/plot_out_of_core_classification.html :scale: 80 @@ -218,7 +218,7 @@ time. Here is a benchmark from the :ref:`example_applications_plot_prediction_latency.py` example that measures this quantity for a number of estimators on synthetic data: -.. |throughput_benchmark| image:: ../auto_examples/applications/images/plot_prediction_latency_4.png +.. |throughput_benchmark| image:: ../auto_examples/applications/images/plot_prediction_latency_004.png :target: ../auto_examples/applications/plot_prediction_latency.html :scale: 80 diff --git a/doc/modules/covariance.rst b/doc/modules/covariance.rst index 9ccb8c964e0b0a48f61874ab76d01e11d5ee7423..9da5be0a4083be27145bbf37ddb2f5bf180aa053 100644 --- a/doc/modules/covariance.rst +++ b/doc/modules/covariance.rst @@ -133,7 +133,7 @@ with the :meth:`oas` function of the `sklearn.covariance` package, or it can be otherwise obtained by fitting an :class:`OAS` object to the same sample. -.. figure:: ../auto_examples/covariance/images/plot_covariance_estimation_1.png +.. figure:: ../auto_examples/covariance/images/plot_covariance_estimation_001.png :target: ../auto_examples/covariance/plot_covariance_estimation.html :align: center :scale: 65% @@ -155,7 +155,7 @@ object to the same sample. an :class:`OAS` estimator of the covariance. -.. figure:: ../auto_examples/covariance/images/plot_lw_vs_oas_1.png +.. figure:: ../auto_examples/covariance/images/plot_lw_vs_oas_001.png :target: ../auto_examples/covariance/plot_lw_vs_oas.html :align: center :scale: 75% @@ -187,7 +187,7 @@ the precision matrix: the higher its ``alpha`` parameter, the more sparse the precision matrix. The corresponding :class:`GraphLassoCV` object uses cross-validation to automatically set the ``alpha`` parameter. -.. figure:: ../auto_examples/covariance/images/plot_sparse_cov_1.png +.. figure:: ../auto_examples/covariance/images/plot_sparse_cov_001.png :target: ../auto_examples/covariance/plot_sparse_cov.html :align: center :scale: 60% @@ -308,11 +308,11 @@ attributes of a :class:`MinCovDet` robust covariance estimator object. :class:`MinCovDet` covariance estimators in terms of Mahalanobis distance (so we get a better estimate of the precision matrix too). -.. |robust_vs_emp| image:: ../auto_examples/covariance/images/plot_robust_vs_empirical_covariance_1.png +.. |robust_vs_emp| image:: ../auto_examples/covariance/images/plot_robust_vs_empirical_covariance_001.png :target: ../auto_examples/covariance/plot_robust_vs_empirical_covariance.html :scale: 49% -.. |mahalanobis| image:: ../auto_examples/covariance/images/plot_mahalanobis_distances_1.png +.. 
|mahalanobis| image:: ../auto_examples/covariance/images/plot_mahalanobis_distances_001.png :target: ../auto_examples/covariance/plot_mahalanobis_distances.html :scale: 49% diff --git a/doc/modules/cross_decomposition.rst b/doc/modules/cross_decomposition.rst index caa3bccdfed4815bb1d4eb4960d54d60c9366ed8..c55a2168458a06108e66940df2551bf64d160f24 100644 --- a/doc/modules/cross_decomposition.rst +++ b/doc/modules/cross_decomposition.rst @@ -13,7 +13,7 @@ These families of algorithms are useful to find linear relations between two multivariate datasets: the ``X`` and ``Y`` arguments of the ``fit`` method are 2D arrays. -.. figure:: ../auto_examples/cross_decomposition/images/plot_compare_cross_decomposition_1.png +.. figure:: ../auto_examples/cross_decomposition/images/plot_compare_cross_decomposition_001.png :target: ../auto_examples/cross_decomposition/plot_compare_cross_decomposition.html :scale: 75% :align: center diff --git a/doc/modules/decomposition.rst b/doc/modules/decomposition.rst index 33732aaa0630dbdbcab19d2803e5bf66892fc1be..74c47b4a2c70862a39f75b5dacbdc8f25b77342a 100644 --- a/doc/modules/decomposition.rst +++ b/doc/modules/decomposition.rst @@ -34,7 +34,7 @@ longer exact since some information is lost while forward transforming. Below is an example of the iris dataset, which is comprised of 4 features, projected on the 2 dimensions that explain most variance: -.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_lda_1.png +.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_lda_001.png :target: ../auto_examples/decomposition/plot_pca_vs_lda.html :align: center :scale: 75% @@ -45,7 +45,7 @@ probabilistic interpretation of the PCA that can give a likelihood of data based on the amount of variance it explains. As such it implements a `score` method that can be used in cross-validation: -.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_fa_model_selection_1.png +.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_fa_model_selection_001.png :target: ../auto_examples/decomposition/plot_pca_vs_fa_model_selection.html :align: center :scale: 75% @@ -89,11 +89,11 @@ singular vectors reshaped as portraits. Since we only require the top and :math:`n_{features} = 64 \times 64 = 4096`, the computation time it less than 1s: -.. |orig_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_1.png +.. |orig_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_001.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% -.. |pca_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_2.png +.. |pca_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_002.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% @@ -147,7 +147,7 @@ applications including denoising, compression and structured prediction (kernel dependency estimation). :class:`KernelPCA` supports both ``transform`` and ``inverse_transform``. -.. figure:: ../auto_examples/decomposition/images/plot_kernel_pca_1.png +.. figure:: ../auto_examples/decomposition/images/plot_kernel_pca_001.png :target: ../auto_examples/decomposition/plot_kernel_pca.html :align: center :scale: 75% @@ -197,7 +197,7 @@ norms that take into account adjacency and different kinds of structure; see For more details on how to use Sparse PCA, see the Examples section, below. -.. |spca_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_5.png +.. 
|spca_img| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_005.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% @@ -401,11 +401,11 @@ dictionary fixed, and then updating the dictionary to best fit the sparse code. 0 \leq k < n_{atoms} -.. |pca_img2| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_2.png +.. |pca_img2| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_002.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% -.. |dict_img2| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_6.png +.. |dict_img2| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_006.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% @@ -420,7 +420,7 @@ The following image shows how a dictionary learned from 4x4 pixel image patches extracted from part of the image of Lena looks like. -.. figure:: ../auto_examples/decomposition/images/plot_image_denoising_1.png +.. figure:: ../auto_examples/decomposition/images/plot_image_denoising_001.png :target: ../auto_examples/decomposition/plot_image_denoising.html :align: center :scale: 50% @@ -458,7 +458,7 @@ does not fit into the memory. .. currentmodule:: sklearn.cluster -.. image:: ../auto_examples/cluster/images/plot_dict_face_patches_1.png +.. image:: ../auto_examples/cluster/images/plot_dict_face_patches_001.png :target: ../auto_examples/cluster/plot_dict_face_patches.html :scale: 50% :align: right @@ -533,11 +533,11 @@ Factor analysis *can* produce similar components (the columns of its loading matrix) to :class:`PCA`. However, one can not make any general statements about these components (e.g. whether they are orthogonal): -.. |pca_img3| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_2.png +.. |pca_img3| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_002.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% -.. |fa_img3| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_9.png +.. |fa_img3| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_009.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% @@ -547,7 +547,7 @@ The main advantage for Factor Analysis (over :class:`PCA` is that it can model the variance in every direction of the input space independently (heteroscedastic noise): -.. figure:: ../auto_examples/decomposition/images/plot_faces_decomposition_8.png +.. figure:: ../auto_examples/decomposition/images/plot_faces_decomposition_008.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :align: center :scale: 75% @@ -555,7 +555,7 @@ it can model the variance in every direction of the input space independently This allows better model selection than probabilistic PCA in the presence of heteroscedastic noise: -.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_fa_model_selection_2.png +.. figure:: ../auto_examples/decomposition/images/plot_pca_vs_fa_model_selection_002.png :target: ../auto_examples/decomposition/plot_pca_vs_fa_model_selection.html :align: center :scale: 75% @@ -582,7 +582,7 @@ of the PCA variants. It is classically used to separate mixed signals (a problem known as *blind source separation*), as in the example below: -.. figure:: ../auto_examples/decomposition/images/plot_ica_blind_source_separation_1.png +.. 
figure:: ../auto_examples/decomposition/images/plot_ica_blind_source_separation_001.png :target: ../auto_examples/decomposition/plot_ica_blind_source_separation.html :align: center :scale: 60% @@ -591,11 +591,11 @@ It is classically used to separate mixed signals (a problem known as ICA can also be used as yet another non linear decomposition that finds components with some sparsity: -.. |pca_img4| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_2.png +.. |pca_img4| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_002.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% -.. |ica_img4| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_4.png +.. |ica_img4| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_004.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% @@ -639,11 +639,11 @@ resulting in interpretable models. The following example displays 16 sparse components found by :class:`NMF` from the images in the Olivetti faces dataset, in comparison with the PCA eigenfaces. -.. |pca_img5| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_2.png +.. |pca_img5| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_002.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% -.. |nmf_img5| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_3.png +.. |nmf_img5| image:: ../auto_examples/decomposition/images/plot_faces_decomposition_003.png :target: ../auto_examples/decomposition/plot_faces_decomposition.html :scale: 60% diff --git a/doc/modules/density.rst b/doc/modules/density.rst index ff91abad858b534069af345fd72314ba8ab27ffe..c9f5c271f7f15ec2190181671130f2c3779903e9 100644 --- a/doc/modules/density.rst +++ b/doc/modules/density.rst @@ -24,7 +24,7 @@ A histogram is a simple visualization of data where bins are defined, and the number of data points within each bin is tallied. An example of a histogram can be seen in the upper-left panel of the following figure: -.. |hist_to_kde| image:: ../auto_examples/neighbors/images/plot_kde_1d_1.png +.. |hist_to_kde| image:: ../auto_examples/neighbors/images/plot_kde_1d_001.png :target: ../auto_examples/neighbors/plot_kde_1d.html :scale: 80 @@ -68,7 +68,7 @@ dimensionality causes its performance to degrade in high dimensions. In the following figure, 100 points are drawn from a bimodal distribution, and the kernel density estimates are shown for three choices of kernels: -.. |kde_1d_distribution| image:: ../auto_examples/neighbors/images/plot_kde_1d_3.png +.. |kde_1d_distribution| image:: ../auto_examples/neighbors/images/plot_kde_1d_003.png :target: ../auto_examples/neighbors/plot_kde_1d.html :scale: 80 @@ -103,7 +103,7 @@ to an unsmooth (i.e. high-variance) density distribution. :class:`sklearn.neighbors.KernelDensity` implements several common kernel forms, which are shown in the following figure: -.. |kde_kernels| image:: ../auto_examples/neighbors/images/plot_kde_1d_2.png +.. |kde_kernels| image:: ../auto_examples/neighbors/images/plot_kde_1d_002.png :target: ../auto_examples/neighbors/plot_kde_1d.html :scale: 80 @@ -145,7 +145,7 @@ is an example of using a kernel density estimate for a visualization of geospatial data, in this case the distribution of observations of two different species on the South American continent: -.. |species_kde| image:: ../auto_examples/neighbors/images/plot_species_kde_1.png +.. 
|species_kde| image:: ../auto_examples/neighbors/images/plot_species_kde_001.png :target: ../auto_examples/neighbors/plot_species_kde.html :scale: 80 @@ -158,7 +158,7 @@ Here is an example of using this process to create a new set of hand-written digits, using a Gaussian kernel learned on a PCA projection of the data: -.. |digits_kde| image:: ../auto_examples/neighbors/images/plot_digits_kde_sampling_1.png +.. |digits_kde| image:: ../auto_examples/neighbors/images/plot_digits_kde_sampling_001.png :target: ../auto_examples/neighbors/plot_digits_kde_sampling.html :scale: 80 diff --git a/doc/modules/ensemble.rst b/doc/modules/ensemble.rst index 9296d0c376b013b3a245b96a3b0ea91af6b92bed..963a1837cc37312466b89eb556d68bb49cfed241 100644 --- a/doc/modules/ensemble.rst +++ b/doc/modules/ensemble.rst @@ -181,7 +181,7 @@ in bias:: >>> scores.mean() > 0.999 True -.. figure:: ../auto_examples/ensemble/images/plot_forest_iris_1.png +.. figure:: ../auto_examples/ensemble/images/plot_forest_iris_001.png :target: ../auto_examples/ensemble/plot_forest_iris.html :align: center :scale: 75% @@ -257,7 +257,7 @@ The following example shows a color-coded representation of the relative importances of each individual pixel for a face recognition task using a :class:`ExtraTreesClassifier` model. -.. figure:: ../auto_examples/ensemble/images/plot_forest_importances_faces_1.png +.. figure:: ../auto_examples/ensemble/images/plot_forest_importances_faces_001.png :target: ../auto_examples/ensemble/plot_forest_importances_faces.html :align: center :scale: 75 @@ -333,7 +333,7 @@ ever-increasing influence. Each subsequent weak learner is thereby forced to concentrate on the examples that are missed by the previous ones in the sequence [HTF]_. -.. figure:: ../auto_examples/ensemble/images/plot_adaboost_hastie_10_2_1.png +.. figure:: ../auto_examples/ensemble/images/plot_adaboost_hastie_10_2_001.png :target: ../auto_examples/ensemble/plot_adaboost_hastie_10_2.html :align: center :scale: 75 @@ -497,7 +497,7 @@ to determine the optimal number of trees (i.e. ``n_estimators``) by early stoppi The plot on the right shows the feature importances which can be obtained via the ``feature_importances_`` property. -.. figure:: ../auto_examples/ensemble/images/plot_gradient_boosting_regression_1.png +.. figure:: ../auto_examples/ensemble/images/plot_gradient_boosting_regression_001.png :target: ../auto_examples/ensemble/plot_gradient_boosting_regression.html :align: center :scale: 75 @@ -694,7 +694,7 @@ outperforms no-shrinkage. Subsampling with shrinkage can further increase the accuracy of the model. Subsampling without shrinkage, on the other hand, does poorly. -.. figure:: ../auto_examples/ensemble/images/plot_gradient_boosting_regularization_1.png +.. figure:: ../auto_examples/ensemble/images/plot_gradient_boosting_regularization_001.png :target: ../auto_examples/ensemble/plot_gradient_boosting_regularization.html :align: center :scale: 75 @@ -785,7 +785,7 @@ usually chosen among the most important features. The Figure below shows four one-way and one two-way partial dependence plots for the California housing dataset: -.. figure:: ../auto_examples/ensemble/images/plot_partial_dependence_1.png +.. 
figure:: ../auto_examples/ensemble/images/plot_partial_dependence_001.png :target: ../auto_examples/ensemble/plot_partial_dependence.html :align: center :scale: 70 diff --git a/doc/modules/feature_extraction.rst b/doc/modules/feature_extraction.rst index 4043338990356b27d1a30906bfd7b811cf9c9449..b757883fb015e18153259e55beb0195405f2aad3 100644 --- a/doc/modules/feature_extraction.rst +++ b/doc/modules/feature_extraction.rst @@ -890,7 +890,7 @@ features or samples. For instance Ward clustering (:ref:`hierarchical_clustering`) can cluster together only neighboring pixels of an image, thus forming contiguous patches: -.. figure:: ../auto_examples/cluster/images/plot_lena_ward_segmentation_1.png +.. figure:: ../auto_examples/cluster/images/plot_lena_ward_segmentation_001.png :target: ../auto_examples/cluster/plot_lena_ward_segmentation.html :align: center :scale: 40 diff --git a/doc/modules/feature_selection.rst b/doc/modules/feature_selection.rst index fb6aedcc6b44f72a43f181d8ff81aed0ae019ddd..0fc8a5abf0532f15948250b7964c30fb535e1b8b 100644 --- a/doc/modules/feature_selection.rst +++ b/doc/modules/feature_selection.rst @@ -210,7 +210,7 @@ settings, using the Lasso, while :class:`RandomizedLogisticRegression` uses the logistic regression and is suitable for classification tasks. To get a full path of stability scores you can use :func:`lasso_stability_path`. -.. figure:: ../auto_examples/linear_model/images/plot_sparse_recovery_2.png +.. figure:: ../auto_examples/linear_model/images/plot_sparse_recovery_002.png :target: ../auto_examples/linear_model/plot_sparse_recovery.html :align: center :scale: 60 diff --git a/doc/modules/gaussian_process.rst b/doc/modules/gaussian_process.rst index aad5aad45699117ebf6f95350a7d48ab99fe6c65..a272dd177fa1eca27e6b51e42bc7a74fb8fe3760 100644 --- a/doc/modules/gaussian_process.rst +++ b/doc/modules/gaussian_process.rst @@ -59,7 +59,7 @@ data. Depending on the number of parameters provided at instantiation, the fitting procedure may recourse to maximum likelihood estimation for the parameters or alternatively it uses the given parameters. -.. figure:: ../auto_examples/gaussian_process/images/plot_gp_regression_1.png +.. figure:: ../auto_examples/gaussian_process/images/plot_gp_regression_001.png :target: ../auto_examples/gaussian_process/plot_gp_regression.html :align: center @@ -100,7 +100,7 @@ equivalent to specifying a fractional variance in the input. That is With ``nugget`` and ``corr`` properly set, Gaussian Processes can be used to robustly recover an underlying function from noisy data: -.. figure:: ../auto_examples/gaussian_process/images/plot_gp_regression_2.png +.. figure:: ../auto_examples/gaussian_process/images/plot_gp_regression_002.png :target: ../auto_examples/gaussian_process/plot_gp_regression.html :align: center diff --git a/doc/modules/isotonic.rst b/doc/modules/isotonic.rst index c781beaa186dde7bcb0d56f1e73889ce0ddbb2f8..9da18e4f069a20b1b46aa38342de2462e0a3db14 100644 --- a/doc/modules/isotonic.rst +++ b/doc/modules/isotonic.rst @@ -18,6 +18,6 @@ arbitrary real number. It yields the vector which is composed of non-decreasing elements the closest in terms of mean squared error. In practice this list of elements forms a function that is piecewise linear. -.. figure:: ../auto_examples/images/plot_isotonic_regression_1.png +.. 
figure:: ../auto_examples/images/plot_isotonic_regression_001.png :target: ../auto_examples/images/plot_isotonic_regression.html :align: center diff --git a/doc/modules/kernel_approximation.rst b/doc/modules/kernel_approximation.rst index a6ce6f44ab6026f3a9969c0f51b8cede98e45dcc..148d09df58894f93e97354bd8ddc4a3086e6c77f 100644 --- a/doc/modules/kernel_approximation.rst +++ b/doc/modules/kernel_approximation.rst @@ -83,7 +83,7 @@ For a given value of ``n_components`` :class:`RBFSampler` is often less accurate as :class:`Nystroem`. :class:`RBFSampler` is cheaper to compute, though, making use of larger feature spaces more efficient. -.. figure:: ../auto_examples/images/plot_kernel_approximation_2.png +.. figure:: ../auto_examples/images/plot_kernel_approximation_002.png :target: ../auto_examples/plot_kernel_approximation.html :scale: 50% :align: center diff --git a/doc/modules/label_propagation.rst b/doc/modules/label_propagation.rst index 38f94c61f6d9f83572a204865757f10c2d1a51ec..80f865f01c4d4c5928d261287dd6ce69bf788c09 100644 --- a/doc/modules/label_propagation.rst +++ b/doc/modules/label_propagation.rst @@ -37,7 +37,7 @@ A few features available in this model: :class:`LabelPropagation` and :class:`LabelSpreading`. Both work by constructing a similarity graph over all items in the input dataset. -.. figure:: ../auto_examples/semi_supervised/images/plot_label_propagation_structure_1.png +.. figure:: ../auto_examples/semi_supervised/images/plot_label_propagation_structure_001.png :target: ../auto_examples/semi_supervised/plot_label_propagation_structure.html :align: center :scale: 60% diff --git a/doc/modules/lda_qda.rst b/doc/modules/lda_qda.rst index 89b29c206bc240fc64ca28ae08acfc5adf39227b..2706cbb405b3b31077003d4d9b36e222ea97e663 100644 --- a/doc/modules/lda_qda.rst +++ b/doc/modules/lda_qda.rst @@ -16,7 +16,7 @@ can be easily computed, are inherently multiclass, and have proven to work well in practice. Also there are no parameters to tune for these algorithms. -.. |ldaqda| image:: ../auto_examples/images/plot_lda_qda_1.png +.. |ldaqda| image:: ../auto_examples/images/plot_lda_qda_001.png :target: ../auto_examples/plot_lda_qda.html :scale: 80 diff --git a/doc/modules/learning_curve.rst b/doc/modules/learning_curve.rst index 176813ec373ca601a132ff491517c1ee66b1cef4..19b97945468f05b908613af1545f2e190f79e183 100644 --- a/doc/modules/learning_curve.rst +++ b/doc/modules/learning_curve.rst @@ -21,7 +21,7 @@ the second estimator approximates it almost perfectly and the last estimator approximates the training data perfectly but does not fit the true function very well, i.e. it is very sensitive to varying training data (high variance). -.. figure:: ../auto_examples/images/plot_underfitting_overfitting_1.png +.. figure:: ../auto_examples/images/plot_underfitting_overfitting_001.png :target: ../auto_examples/plot_underfitting_overfitting.html :align: center :scale: 50% @@ -98,7 +98,7 @@ training score and a high validation score is usually not possible. All three cases can be found in the plot below where we vary the parameter :math:`\gamma` of an SVM on the digits dataset. -.. figure:: ../auto_examples/images/plot_validation_curve_1.png +.. figure:: ../auto_examples/images/plot_validation_curve_001.png :target: ../auto_examples/plot_validation_curve.html :align: center :scale: 50% @@ -118,7 +118,7 @@ size of the training set, we will not benefit much from more training data. In the following plot you can see an example: naive Bayes roughly converges to a low score. -.. 
figure:: ../auto_examples/images/plot_learning_curve_1.png +.. figure:: ../auto_examples/images/plot_learning_curve_001.png :target: ../auto_examples/plot_learning_curve.html :align: center :scale: 50% @@ -130,7 +130,7 @@ the maximum number of training samples, adding more training samples will most likely increase generalization. In the following plot you can see that the SVM could benefit from more training examples. -.. figure:: ../auto_examples/images/plot_learning_curve_2.png +.. figure:: ../auto_examples/images/plot_learning_curve_002.png :target: ../auto_examples/plot_learning_curve.html :align: center :scale: 50% diff --git a/doc/modules/linear_model.rst b/doc/modules/linear_model.rst index c6dc3584f7e35e046d32c88c7f21fbe28814acaa..0ad9dc78cc3a2f5b395ddba6d28ba54972967cc5 100644 --- a/doc/modules/linear_model.rst +++ b/doc/modules/linear_model.rst @@ -33,7 +33,7 @@ solves a problem of the form: .. math:: \underset{w}{min\,} {|| X w - y||_2}^2 -.. figure:: ../auto_examples/linear_model/images/plot_ols_1.png +.. figure:: ../auto_examples/linear_model/images/plot_ols_001.png :target: ../auto_examples/linear_model/plot_ols.html :align: center :scale: 50% @@ -90,7 +90,7 @@ Here, :math:`\alpha \geq 0` is a complexity parameter that controls the amount of shrinkage: the larger the value of :math:`\alpha`, the greater the amount of shrinkage and thus the coefficients become more robust to collinearity. -.. figure:: ../auto_examples/linear_model/images/plot_ridge_path_1.png +.. figure:: ../auto_examples/linear_model/images/plot_ridge_path_001.png :target: ../auto_examples/linear_model/plot_ridge_path.html :align: center :scale: 50% @@ -230,11 +230,11 @@ the advantage of exploring more relevant values of `alpha` parameter, and if the number of samples is very small compared to the number of observations, it is often faster than :class:`LassoCV`. -.. |lasso_cv_1| image:: ../auto_examples/linear_model/images/plot_lasso_model_selection_2.png +.. |lasso_cv_1| image:: ../auto_examples/linear_model/images/plot_lasso_model_selection_002.png :target: ../auto_examples/linear_model/plot_lasso_model_selection.html :scale: 48% -.. |lasso_cv_2| image:: ../auto_examples/linear_model/images/plot_lasso_model_selection_3.png +.. |lasso_cv_2| image:: ../auto_examples/linear_model/images/plot_lasso_model_selection_003.png :target: ../auto_examples/linear_model/plot_lasso_model_selection.html :scale: 48% @@ -255,7 +255,7 @@ is correct, i.e. that the data are actually generated by this model. They also tend to break when the problem is badly conditioned (more features than samples). -.. figure:: ../auto_examples/linear_model/images/plot_lasso_model_selection_1.png +.. figure:: ../auto_examples/linear_model/images/plot_lasso_model_selection_001.png :target: ../auto_examples/linear_model/plot_lasso_model_selection.html :align: center :scale: 50% @@ -289,7 +289,7 @@ The objective function to minimize is in this case \frac{\alpha(1-\rho)}{2} ||w||_2 ^ 2} -.. figure:: ../auto_examples/linear_model/images/plot_lasso_coordinate_descent_path_1.png +.. figure:: ../auto_examples/linear_model/images/plot_lasso_coordinate_descent_path_001.png :target: ../auto_examples/linear_model/plot_lasso_coordinate_descent_path.html :align: center :scale: 50% @@ -318,11 +318,11 @@ with a simple Lasso or a MultiTaskLasso. The Lasso estimates yields scattered non-zeros while the non-zeros of the MultiTaskLasso are full columns. -.. |multi_task_lasso_1| image:: ../auto_examples/linear_model/images/plot_multi_task_lasso_support_1.png +.. 
|multi_task_lasso_1| image:: ../auto_examples/linear_model/images/plot_multi_task_lasso_support_001.png :target: ../auto_examples/linear_model/plot_multi_task_lasso_support.html :scale: 48% -.. |multi_task_lasso_2| image:: ../auto_examples/linear_model/images/plot_multi_task_lasso_support_2.png +.. |multi_task_lasso_2| image:: ../auto_examples/linear_model/images/plot_multi_task_lasso_support_002.png :target: ../auto_examples/linear_model/plot_multi_task_lasso_support.html :scale: 48% @@ -399,7 +399,7 @@ algorithm, and unlike the implementation based on coordinate_descent, this yields the exact solution, which is piecewise linear as a function of the norm of its coefficients. -.. figure:: ../auto_examples/linear_model/images/plot_lasso_lars_1.png +.. figure:: ../auto_examples/linear_model/images/plot_lasso_lars_001.png :target: ../auto_examples/linear_model/plot_lasso_lars.html :align: center :scale: 50% @@ -556,7 +556,7 @@ log likelihood*. By default :math:`\alpha_1 = \alpha_2 = \lambda_1 = \lambda_2 = 1.e^{-6}`. -.. figure:: ../auto_examples/linear_model/images/plot_bayesian_ridge_1.png +.. figure:: ../auto_examples/linear_model/images/plot_bayesian_ridge_001.png :target: ../auto_examples/linear_model/plot_bayesian_ridge.html :align: center :scale: 50% @@ -623,7 +623,7 @@ has its own standard deviation :math:`\lambda_i`. The prior over all :math:`\lambda_i` is chosen to be the same gamma distribution given by hyperparameters :math:`\lambda_1` and :math:`\lambda_2`. -.. figure:: ../auto_examples/linear_model/images/plot_ard_1.png +.. figure:: ../auto_examples/linear_model/images/plot_ard_001.png :target: ../auto_examples/linear_model/plot_ard.html :align: center :scale: 50% @@ -752,7 +752,7 @@ which may be subject to noise, and outliers, which are e.g. caused by erroneous measurements or invalid hypotheses about the data. The resulting model is then estimated only from the determined inliers. -.. figure:: ../auto_examples/linear_model/images/plot_ransac_1.png +.. figure:: ../auto_examples/linear_model/images/plot_ransac_001.png :target: ../auto_examples/linear_model/plot_ransac.html :align: center :scale: 50% @@ -841,7 +841,7 @@ flexibility to fit a much broader range of data. Here is an example of applying this idea to one-dimensional data, using polynomial features of varying degrees: -.. figure:: ../auto_examples/linear_model/images/plot_polynomial_interpolation_1.png +.. figure:: ../auto_examples/linear_model/images/plot_polynomial_interpolation_001.png :target: ../auto_examples/linear_model/plot_polynomial_interpolation.html :align: center :scale: 50% diff --git a/doc/modules/manifold.rst b/doc/modules/manifold.rst index 32921f0d0e408b13c96967eb2c004c2edc15ed0e..20c4328baede6cab6f80ef5e93b8cd31ed3a5c67 100644 --- a/doc/modules/manifold.rst +++ b/doc/modules/manifold.rst @@ -20,7 +20,7 @@ Manifold learning -.. figure:: ../auto_examples/manifold/images/plot_compare_methods_1.png +.. figure:: ../auto_examples/manifold/images/plot_compare_methods_001.png :target: ../auto_examples/manifold/plot_compare_methods.html :align: center :scale: 60 @@ -46,11 +46,11 @@ to be desired. In a random projection, it is likely that the more interesting structure within the data will be lost. -.. |digits_img| image:: ../auto_examples/manifold/images/plot_lle_digits_1.png +.. |digits_img| image:: ../auto_examples/manifold/images/plot_lle_digits_001.png :target: ../auto_examples/manifold/plot_lle_digits.html :scale: 50 -.. |projected_img| image:: ../auto_examples/manifold/images/plot_lle_digits_2.png +.. 
|projected_img| image:: ../auto_examples/manifold/images/plot_lle_digits_002.png :target: ../auto_examples/manifold/plot_lle_digits.html :scale: 50 @@ -66,11 +66,11 @@ These methods can be powerful, but often miss important non-linear structure in the data. -.. |PCA_img| image:: ../auto_examples/manifold/images/plot_lle_digits_3.png +.. |PCA_img| image:: ../auto_examples/manifold/images/plot_lle_digits_003.png :target: ../auto_examples/manifold/plot_lle_digits.html :scale: 50 -.. |LDA_img| image:: ../auto_examples/manifold/images/plot_lle_digits_4.png +.. |LDA_img| image:: ../auto_examples/manifold/images/plot_lle_digits_004.png :target: ../auto_examples/manifold/plot_lle_digits.html :scale: 50 @@ -106,7 +106,7 @@ Isomap seeks a lower-dimensional embedding which maintains geodesic distances between all points. Isomap can be performed with the object :class:`Isomap`. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_5.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_005.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -162,7 +162,7 @@ Locally linear embedding can be performed with function :func:`locally_linear_embedding` or its object-oriented counterpart :class:`LocallyLinearEmbedding`. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_6.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_006.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -216,7 +216,7 @@ linear embedding* (MLLE). MLLE can be performed with function :class:`LocallyLinearEmbedding`, with the keyword ``method = 'modified'``. It requires ``n_neighbors > n_components``. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_7.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_007.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -266,7 +266,7 @@ for small output dimension. HLLE can be performed with function :class:`LocallyLinearEmbedding`, with the keyword ``method = 'hessian'``. It requires ``n_neighbors > n_components * (n_components + 3) / 2``. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_8.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_008.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -358,7 +358,7 @@ tangent spaces to learn the embedding. LTSA can be performed with function :func:`locally_linear_embedding` or its object-oriented counterpart :class:`LocallyLinearEmbedding`, with the keyword ``method = 'ltsa'``. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_9.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_009.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -416,7 +416,7 @@ vision, the algorithms will try to preserve the order of the distances, and hence seek for a monotonic relationship between the distances in the embedded space and the similarities/dissimilarities. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_10.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_010.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 @@ -451,7 +451,7 @@ A trivial solution to this problem is to set all the points on the origin. In order to avoid that, the disparities :math:`\hat{d}_{ij}` are normalized. -.. figure:: ../auto_examples/manifold/images/plot_mds_1.png +.. 
figure:: ../auto_examples/manifold/images/plot_mds_001.png :target: ../auto_examples/manifold/plot_mds.html :align: center :scale: 60 @@ -487,7 +487,7 @@ of the KL divergence. Hence, it is sometimes useful to try different seeds and select the embedding with the lowest KL divergence. -.. figure:: ../auto_examples/manifold/images/plot_lle_digits_13.png +.. figure:: ../auto_examples/manifold/images/plot_lle_digits_013.png :target: ../auto_examples/manifold/plot_lle_digits.html :align: center :scale: 50 diff --git a/doc/modules/mixture.rst b/doc/modules/mixture.rst index bb6de877f164b4cbdba702a6bbc68761f06a2311..14ed5a63a56b38515df4b3f5ef5fda2c5c4922ad 100644 --- a/doc/modules/mixture.rst +++ b/doc/modules/mixture.rst @@ -14,7 +14,7 @@ matrices supported), sample them, and estimate them from data. Facilities to help determine the appropriate number of components are also provided. - .. figure:: ../auto_examples/mixture/images/plot_gmm_pdf_1.png + .. figure:: ../auto_examples/mixture/images/plot_gmm_pdf_001.png :target: ../auto_examples/mixture/plot_gmm_pdf.html :align: center :scale: 50% @@ -55,7 +55,7 @@ The :class:`GMM` comes with different options to constrain the covariance of the difference classes estimated: spherical, diagonal, tied or full covariance. -.. figure:: ../auto_examples/mixture/images/plot_gmm_classifier_1.png +.. figure:: ../auto_examples/mixture/images/plot_gmm_classifier_001.png :target: ../auto_examples/mixture/plot_gmm_classifier.html :align: center :scale: 75% @@ -102,7 +102,7 @@ only in the asymptotic regime (i.e. if much data is available). Note that using a :ref:`DPGMM <dpgmm>` avoids the specification of the number of components for a Gaussian mixture model. -.. figure:: ../auto_examples/mixture/images/plot_gmm_selection_1.png +.. figure:: ../auto_examples/mixture/images/plot_gmm_selection_001.png :target: ../auto_examples/mixture/plot_gmm_selection.html :align: center :scale: 50% @@ -210,11 +210,11 @@ components, and at the expense of extra computational time the user only needs to specify a loose upper bound on this number and a concentration parameter. -.. |plot_gmm| image:: ../auto_examples/mixture/images/plot_gmm_1.png +.. |plot_gmm| image:: ../auto_examples/mixture/images/plot_gmm_001.png :target: ../auto_examples/mixture/plot_gmm.html :scale: 48% -.. |plot_gmm_sin| image:: ../auto_examples/mixture/images/plot_gmm_sin_1.png +.. |plot_gmm_sin| image:: ../auto_examples/mixture/images/plot_gmm_sin_001.png :target: ../auto_examples/mixture/plot_gmm_sin.html :scale: 48% diff --git a/doc/modules/model_evaluation.rst b/doc/modules/model_evaluation.rst index f3fd09b65ae0551c8f2e27602c5d0d252e7bafe6..97d7682b57f277941a2d40c0d7edc29a60dedc06 100644 --- a/doc/modules/model_evaluation.rst +++ b/doc/modules/model_evaluation.rst @@ -292,7 +292,7 @@ predicted to be in group :math:`j`. Here an example of such confusion matrix:: Here a visual representation of such confusion matrix (this figure comes from the :ref:`example_plot_confusion_matrix.py` example): -.. image:: ../auto_examples/images/plot_confusion_matrix_1.png +.. image:: ../auto_examples/images/plot_confusion_matrix_001.png :target: ../auto_examples/plot_confusion_matrix.html :scale: 75 :align: center @@ -794,7 +794,7 @@ Here a small example of how to use the :func:`roc_curve` function:: The following figure shows an example of such ROC curve. -.. image:: ../auto_examples/images/plot_roc_1.png +.. 
image:: ../auto_examples/images/plot_roc_001.png
:target: ../auto_examples/plot_roc.html
:scale: 75
:align: center
@@ -835,7 +835,7 @@ F1 score, ROC AUC doesn't require to optimize a threshold for each label. The
if predicted outputs have been binarized.
-.. image:: ../auto_examples/images/plot_roc_2.png
+.. image:: ../auto_examples/images/plot_roc_002.png
:target: ../auto_examples/plot_roc.html
:scale: 75
:align: center
diff --git a/doc/modules/multiclass.rst b/doc/modules/multiclass.rst
index a28652879cba9c99403061d77ea091e13b730b2b..2852dd6b763d7a4da79375fb1443bb02c7e0f0e7 100644
--- a/doc/modules/multiclass.rst
+++ b/doc/modules/multiclass.rst
@@ -142,7 +142,7 @@ To use this feature, feed the classifier an indicator matrix, in which
cell [i, j] indicates the presence of label j in sample i.
-.. figure:: ../auto_examples/images/plot_multilabel_1.png
+.. figure:: ../auto_examples/images/plot_multilabel_001.png
:target: ../auto_examples/plot_multilabel.html
:align: center
:scale: 75%
diff --git a/doc/modules/neighbors.rst b/doc/modules/neighbors.rst
index f98711f40936c418b1191342dfd22d167bfeb425..5b8d4ba2e2c037e9daf2dec0a2366c102ed7d528 100644
--- a/doc/modules/neighbors.rst
+++ b/doc/modules/neighbors.rst
@@ -184,11 +184,11 @@ distance can be supplied which is used to compute the weights.
-.. |classification_1| image:: ../auto_examples/neighbors/images/plot_classification_1.png
+.. |classification_1| image:: ../auto_examples/neighbors/images/plot_classification_001.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50
-.. |classification_2| image:: ../auto_examples/neighbors/images/plot_classification_2.png
+.. |classification_2| image:: ../auto_examples/neighbors/images/plot_classification_002.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50
@@ -227,7 +227,7 @@ weights proportional to the inverse of the distance from the query point.
Alternatively, a user-defined function of the distance can be supplied,
which will be used to compute the weights.
-.. figure:: ../auto_examples/neighbors/images/plot_regression_1.png
+.. figure:: ../auto_examples/neighbors/images/plot_regression_001.png
:target: ../auto_examples/neighbors/plot_regression.html
:align: center
:scale: 75
@@ -237,7 +237,7 @@ The use of multi-output nearest neighbors for regression is demonstrated in
X are the pixels of the upper half of faces and the outputs Y are the pixels of
the lower half of those faces.
-.. figure:: ../auto_examples/images/plot_multioutput_face_completion_1.png
+.. figure:: ../auto_examples/images/plot_multioutput_face_completion_001.png
:target: ../auto_examples/plot_multioutput_face_completion.html
:scale: 75
:align: center
@@ -496,11 +496,11 @@ This is useful, for example, for removing noisy features. In the example below,
using a small shrink threshold increases the accuracy of the model from 0.81 to 0.82.
-.. |nearest_centroid_1| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_1.png
+.. |nearest_centroid_1| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_001.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50
-.. |nearest_centroid_2| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_2.png
+.. |nearest_centroid_2| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_002.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50
diff --git a/doc/modules/neural_networks.rst b/doc/modules/neural_networks.rst
index 7c1cc36bd29be689c20f3c3a0c6c1121415a97b5..7519ba01a15dd9458770756cde382995e6906387 100644
--- a/doc/modules/neural_networks.rst
+++ b/doc/modules/neural_networks.rst
@@ -32,7 +32,7 @@ density estimation. The method gained popularity for initializing deep neural networks
with the weights of independent RBMs. This method is known as unsupervised pre-training.
-.. figure:: ../auto_examples/images/plot_rbm_logistic_classification_1.png
+.. figure:: ../auto_examples/images/plot_rbm_logistic_classification_001.png
:target: ../auto_examples/plot_rbm_logistic_classification.html
:align: center
:scale: 100%
diff --git a/doc/modules/outlier_detection.rst b/doc/modules/outlier_detection.rst
index 08b1616f819db2d98eafc86c76ce1b935ca9653f..ee7c483c73a7edc28797f9d9ffc6cacc6d73f907 100644
--- a/doc/modules/outlier_detection.rst
+++ b/doc/modules/outlier_detection.rst
@@ -69,7 +69,7 @@ but regular, observation outside the frontier.
frontier learned around some data by a :class:`svm.OneClassSVM` object.
-.. figure:: ../auto_examples/svm/images/plot_oneclass_1.png
+.. figure:: ../auto_examples/svm/images/plot_oneclass_001.png
:target: ../auto_examples/svm/plot_oneclasse.html
:align: center
:scale: 75%
@@ -105,7 +105,7 @@ whithout being influenced by outliers). The Mahalanobis distances obtained from this
estimate is used to derive a measure of outlyingness. This strategy is illustrated below.
-.. figure:: ../auto_examples/covariance/images/plot_mahalanobis_distances_1.png
+.. figure:: ../auto_examples/covariance/images/plot_mahalanobis_distances_001.png
:target: ../auto_examples/covariance/plot_mahalanobis_distances.html
:align: center
:scale: 75%
@@ -138,15 +138,15 @@ The examples below illustrate how the performance of the
less unimodal. :class:`svm.OneClassSVM` works better on data with multiple modes.
-.. |outlier1| image:: ../auto_examples/covariance/images/plot_outlier_detection_1.png
+.. |outlier1| image:: ../auto_examples/covariance/images/plot_outlier_detection_001.png
:target: ../auto_examples/covariance/plot_outlier_detection.html
:scale: 50%
-.. |outlier2| image:: ../auto_examples/covariance/images/plot_outlier_detection_2.png
+.. |outlier2| image:: ../auto_examples/covariance/images/plot_outlier_detection_002.png
:target: ../auto_examples/covariance/plot_outlier_detection.html
:scale: 50%
-.. |outlier3| image:: ../auto_examples/covariance/images/plot_outlier_detection_3.png
+.. |outlier3| image:: ../auto_examples/covariance/images/plot_outlier_detection_003.png
:target: ../auto_examples/covariance/plot_outlier_detection.html
:scale: 50%
diff --git a/doc/modules/random_projection.rst b/doc/modules/random_projection.rst
index 51d874650ff2fa17d4a23968eb6210cfd2ad4fac..e6ef3cb63e02a035886f048fe8bb7537bf3d633f 100644
--- a/doc/modules/random_projection.rst
+++ b/doc/modules/random_projection.rst
@@ -64,12 +64,12 @@ bounded distortion introduced by the random projection::
>>> johnson_lindenstrauss_min_dim(n_samples=[1e4, 1e5, 1e6], eps=0.1)
array([ 7894, 9868, 11841])
-.. figure:: ../auto_examples/images/plot_johnson_lindenstrauss_bound_1.png
+.. figure:: ../auto_examples/images/plot_johnson_lindenstrauss_bound_001.png
:target: ../auto_examples/plot_johnson_lindenstrauss_bound.html
:scale: 75
:align: center
-.. figure:: ../auto_examples/images/plot_johnson_lindenstrauss_bound_2.png
+.. figure:: ../auto_examples/images/plot_johnson_lindenstrauss_bound_002.png
:target: ../auto_examples/plot_johnson_lindenstrauss_bound.html
:scale: 75
:align: center
diff --git a/doc/modules/scaling_strategies.rst b/doc/modules/scaling_strategies.rst
index 0131650f47e3037632041e5a370db93cc13f1314..e4c5e9953ca756cb14d49fd53da5965594aecb98 100644
--- a/doc/modules/scaling_strategies.rst
+++ b/doc/modules/scaling_strategies.rst
@@ -95,7 +95,7 @@ systems and demonstrates most of the notions discussed above. Furthermore,
it also shows the evolution of the performance of different algorithms with
the number of processed examples.
-.. |accuracy_over_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_1.png
+.. |accuracy_over_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_001.png
:target: ../auto_examples/applications/plot_out_of_core_classification.html
:scale: 80
@@ -107,7 +107,7 @@ algorithms, ``MultinomialNB`` is the most expensive, but its overhead can be
mitigated by increasing the size of the mini-batches (exercise: change
``minibatch_size`` to 100 and 10000 in the program and compare).
-.. |computation_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_3.png
+.. |computation_time| image:: ../auto_examples/applications/images/plot_out_of_core_classification_003.png
:target: ../auto_examples/applications/plot_out_of_core_classification.html
:scale: 80
diff --git a/doc/modules/sgd.rst b/doc/modules/sgd.rst
index 4a09081399ac85859fe24a76b779dcf13e56500a..7f8fb758cc7e01fccc23ffe68b50d80f3da5a2dc 100644
--- a/doc/modules/sgd.rst
+++ b/doc/modules/sgd.rst
@@ -46,7 +46,7 @@ The class :class:`SGDClassifier` implements a plain stochastic gradient
descent learning routine which supports different loss functions and
penalties for classification.
-.. figure:: ../auto_examples/linear_model/images/plot_sgd_separating_hyperplane_1.png
+.. figure:: ../auto_examples/linear_model/images/plot_sgd_separating_hyperplane_001.png
:target: ../auto_examples/linear_model/plot_sgd_separating_hyperplane.html
:align: center
:scale: 75
@@ -136,7 +136,7 @@ below illustrates the OVA approach on the iris dataset. The dashed
lines represent the three OVA classifiers; the background colors show
the decision surface induced by the three classifiers.
-.. figure:: ../auto_examples/linear_model/images/plot_sgd_iris_1.png
+.. figure:: ../auto_examples/linear_model/images/plot_sgd_iris_001.png
:target: ../auto_examples/linear_model/plot_sgd_iris.html
:align: center
:scale: 75
@@ -283,7 +283,7 @@ Different choices for :math:`L` entail different classifiers such as
All of the above loss functions can be regarded as an upper bound on the
misclassification error (Zero-one loss) as shown in the Figure below.
-.. figure:: ../auto_examples/linear_model/images/plot_sgd_loss_functions_1.png
+.. figure:: ../auto_examples/linear_model/images/plot_sgd_loss_functions_001.png
:align: center
:scale: 75
@@ -297,7 +297,7 @@ Popular choices for the regularization term :math:`R` include:
The Figure below shows the contours of the different regularization terms
in the parameter space when :math:`R(w) = 1`.
-.. figure:: ../auto_examples/linear_model/images/plot_sgd_penalties_1.png
+.. figure:: ../auto_examples/linear_model/images/plot_sgd_penalties_001.png
:align: center
:scale: 75
diff --git a/doc/modules/svm.rst b/doc/modules/svm.rst
index afb3b8e41ee51bade315c3ff868d6bb2d09e8b0c..dc8d3bbaf757a84e5c9033b7bd475c447f99b0fd 100644
--- a/doc/modules/svm.rst
+++ b/doc/modules/svm.rst
@@ -51,7 +51,7 @@ Classification
capable of performing multi-class classification on a dataset.
-.. figure:: ../auto_examples/svm/images/plot_iris_1.png
+.. figure:: ../auto_examples/svm/images/plot_iris_001.png
:target: ../auto_examples/svm/plot_iris.html
:align: center
@@ -243,7 +243,7 @@ classes or certain individual samples keywords ``class_weight`` and
``{class_label : value}``, where value is a floating point number > 0
that sets the parameter ``C`` of class ``class_label`` to ``C * value``.
-.. figure:: ../auto_examples/svm/images/plot_separating_hyperplane_unbalanced_1.png
+.. figure:: ../auto_examples/svm/images/plot_separating_hyperplane_unbalanced_001.png
:target: ../auto_examples/svm/plot_separating_hyperplane_unbalanced.html
:align: center
:scale: 75
@@ -255,7 +255,7 @@ that sets the parameter ``C`` of class ``class_label`` to ``C * value``.
set the parameter ``C`` for the i-th example to ``C * sample_weight[i]``.
-.. figure:: ../auto_examples/svm/images/plot_weighted_samples_1.png
+.. figure:: ../auto_examples/svm/images/plot_weighted_samples_001.png
:target: ../auto_examples/svm/plot_weighted_samples.html
:align: center
:scale: 75
@@ -325,7 +325,7 @@ will only take as input an array X, as there are no class labels.
See, section :ref:`outlier_detection` for more details on this usage.
-.. figure:: ../auto_examples/svm/images/plot_oneclass_1.png
+.. figure:: ../auto_examples/svm/images/plot_oneclass_001.png
:target: ../auto_examples/svm/plot_oneclass.html
:align: center
:scale: 75
@@ -537,7 +537,7 @@ margin), since in general the larger the margin the lower the
generalization error of the classifier.
-.. figure:: ../auto_examples/svm/images/plot_separating_hyperplane_1.png
+.. figure:: ../auto_examples/svm/images/plot_separating_hyperplane_001.png
:align: center
:scale: 75
diff --git a/doc/modules/tree.rst b/doc/modules/tree.rst
index 3e4703351fea8a22a638e48688ac6b5007fb31be..e4c55bc88129a57f72d56ebcb922a8be649f6cf2 100644
--- a/doc/modules/tree.rst
+++ b/doc/modules/tree.rst
@@ -16,7 +16,7 @@ For instance, in the example below, decision trees learn from data to
approximate a sine curve with a set of if-then-else decision rules. The deeper
the tree, the more complex the decision rules and the fitter the model.
-.. figure:: ../auto_examples/tree/images/plot_tree_regression_1.png
+.. figure:: ../auto_examples/tree/images/plot_tree_regression_001.png
:target: ../auto_examples/tree/plot_tree_regression.html
:scale: 75
:align: center
@@ -160,7 +160,7 @@ After being fitted, the model can then be used to predict new values::
>>> clf.predict(iris.data[0, :])
array([0])
-.. figure:: ../auto_examples/tree/images/plot_iris_1.png
+.. figure:: ../auto_examples/tree/images/plot_iris_001.png
:target: ../auto_examples/tree/plot_iris.html
:align: center
:scale: 75
@@ -175,7 +175,7 @@ After being fitted, the model can then be used to predict new values::
Regression
==========
-.. figure:: ../auto_examples/tree/images/plot_tree_regression_1.png
+.. figure:: ../auto_examples/tree/images/plot_tree_regression_001.png
:target: ../auto_examples/tree/plot_tree_regression.html
:scale: 75
:align: center
@@ -240,7 +240,7 @@ The use of multi-output trees for regression is demonstrated in
:ref:`example_tree_plot_tree_regression_multioutput.py`. In this example, the input
X is a single real value and the outputs Y are the sine and cosine of X.
-.. figure:: ../auto_examples/tree/images/plot_tree_regression_multioutput_1.png
+.. figure:: ../auto_examples/tree/images/plot_tree_regression_multioutput_001.png
:target: ../auto_examples/tree/plot_tree_regression_multioutput.html
:scale: 75
:align: center
@@ -250,7 +250,7 @@ The use of multi-output trees for classification is demonstrated in
X are the pixels of the upper half of faces and the outputs Y are the pixels of
the lower half of those faces.
-.. figure:: ../auto_examples/images/plot_multioutput_face_completion_1.png
+.. figure:: ../auto_examples/images/plot_multioutput_face_completion_001.png
:target: ../auto_examples/plot_multioutput_face_completion.html
:scale: 75
:align: center
diff --git a/doc/sphinxext/gen_rst.py b/doc/sphinxext/gen_rst.py
index dd1f766acb97aee83291ddd0215c0f40b490f7e4..a7ac4dd2861ca53a80ecae660a5ac0f95995e900 100644
--- a/doc/sphinxext/gen_rst.py
+++ b/doc/sphinxext/gen_rst.py
@@ -415,11 +415,11 @@ SINGLE_IMAGE = """
# thumbnails for the front page of the scikit-learn home page.
# key: first image in set
# values: (number of plot in set, height of thumbnail)
-carousel_thumbs = {'plot_classifier_comparison_1.png': (1, 600),
- 'plot_outlier_detection_1.png': (3, 372),
- 'plot_gp_regression_1.png': (2, 250),
- 'plot_adaboost_twoclass_1.png': (1, 372),
- 'plot_compare_methods_1.png': (1, 349)}
+carousel_thumbs = {'plot_classifier_comparison_001.png': (1, 600),
+ 'plot_outlier_detection_001.png': (3, 372),
+ 'plot_gp_regression_001.png': (2, 250),
+ 'plot_adaboost_twoclass_001.png': (1, 372),
+ 'plot_compare_methods_001.png': (1, 349)}
def extract_docstring(filename, ignore_heading=False):
@@ -883,7 +883,7 @@ def generate_file_rst(fname, target_dir, src_dir, root_dir, plot_gallery):
""" Generate the rst file for a given example.
"""
base_image_name = os.path.splitext(fname)[0]
- image_fname = '%s_%%s.png' % base_image_name
+ image_fname = '%s_%%03d.png' % base_image_name
this_template = rst_template
last_dir = os.path.split(src_dir)[-1]
@@ -988,12 +988,8 @@ def generate_file_rst(fname, target_dir, src_dir, root_dir, plot_gallery):
print(" - time elapsed : %.2g sec" % time_elapsed)
else:
figure_list = [f[len(image_dir):]
- for f in glob.glob(image_path % '[1-9]')]
- #for f in glob.glob(image_path % '*')]
- # Catter for the fact that there can be more than 10 images
- if len(figure_list) >= 9:
- figure_list.extend([f[len(image_dir):]
- for f in glob.glob(image_path % '1[0-9]')])
+ for f in glob.glob(image_path.replace("%03d", '*'))]
+ figure_list.sort()
# generate thumb file
this_template = plot_rst_template
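For context on the gen_rst.py hunk above: zero-padding the figure numbers is what lets the two-pattern glob ('[1-9]' plus '1[0-9]') collapse into a single wildcard glob followed by figure_list.sort(), because zero-padded names sort lexicographically in numeric order. A minimal standalone sketch of that point (the file names below are invented for illustration; only the '%03d' / replace idiom is taken from the patch):

# Illustrative sketch, not part of the patch: why '%03d'-numbered images can be
# collected with a plain glob + sort, while '%d'-numbered images could not.
old_names = ["plot_example_%d.png" % i for i in range(1, 12)]
new_names = ["plot_example_%03d.png" % i for i in range(1, 12)]

# Single-digit suffixes interleave 10 and 11 between 1 and 2 when sorted as strings.
print(sorted(old_names)[:4])
# ['plot_example_1.png', 'plot_example_10.png', 'plot_example_11.png', 'plot_example_2.png']

# Zero-padded suffixes sort as strings in numeric order, so sort() alone suffices.
print(sorted(new_names)[:4])
# ['plot_example_001.png', 'plot_example_002.png', 'plot_example_003.png', 'plot_example_004.png']

# The patched generate_file_rst() builds image_fname = '%s_%%03d.png' and globs
# with image_path.replace("%03d", '*'); the same substitution on a made-up name:
pattern = "plot_example_%03d.png".replace("%03d", "*")
print(pattern)  # plot_example_*.png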
diff --git a/doc/tutorial/basic/tutorial.rst b/doc/tutorial/basic/tutorial.rst
index 5451fc7916325a18114bd3ba5c78f0626666d17b..bb6c1fd943f5595e4d7ebbf95dd29a167101e607 100644
--- a/doc/tutorial/basic/tutorial.rst
+++ b/doc/tutorial/basic/tutorial.rst
@@ -189,7 +189,7 @@ which we have not used to train the classifier::
The corresponding image is the following:
-.. image:: ../../auto_examples/datasets/images/plot_digits_last_image_1.png
+.. image:: ../../auto_examples/datasets/images/plot_digits_last_image_001.png
:target: ../../auto_examples/datasets/plot_digits_last_image.html
:align: center
:scale: 50
diff --git a/doc/tutorial/statistical_inference/model_selection.rst b/doc/tutorial/statistical_inference/model_selection.rst
index 828143225012a930139ed7ffcacddd779ac06619..76ca0bbfebc4d008e47fa96abcf2af9d48c18898 100644
--- a/doc/tutorial/statistical_inference/model_selection.rst
+++ b/doc/tutorial/statistical_inference/model_selection.rst
@@ -110,7 +110,7 @@ of the computer.
.. topic:: **Exercise**
:class: green
- .. image:: ../../auto_examples/exercises/images/plot_cv_digits_1.png
+ .. image:: ../../auto_examples/exercises/images/plot_cv_digits_001.png
:target: ../../auto_examples/exercises/plot_cv_digits.html
:align: right
:scale: 90
diff --git a/doc/tutorial/statistical_inference/putting_together.rst b/doc/tutorial/statistical_inference/putting_together.rst
index 4a1260b1cdabe1ee5eb4efb68bcc8755f97c13f0..eec42d0e7f3c6167fba5bb9488b98b492c9dea28 100644
--- a/doc/tutorial/statistical_inference/putting_together.rst
+++ b/doc/tutorial/statistical_inference/putting_together.rst
@@ -11,7 +11,7 @@ Pipelining
We have seen that some estimators can transform data and that some estimators
can predict variables. We can also create combined estimators:
-.. image:: ../../auto_examples/images/plot_digits_pipe_1.png
+.. image:: ../../auto_examples/images/plot_digits_pipe_001.png
:target: ../../auto_examples/plot_digits_pipe.html
:scale: 65
:align: right
diff --git a/doc/tutorial/statistical_inference/settings.rst b/doc/tutorial/statistical_inference/settings.rst
index 8865212f6149b7f4625f559c567db7ff1a205993..fead00cf952fb1abb409a2700567c191523f994e 100644
--- a/doc/tutorial/statistical_inference/settings.rst
+++ b/doc/tutorial/statistical_inference/settings.rst
@@ -31,7 +31,7 @@ needs to be preprocessed in order to be used by scikit-learn.
.. topic:: An example of reshaping data would be the digits dataset
- .. image:: ../../auto_examples/datasets/images/plot_digits_last_image_1.png
+ .. image:: ../../auto_examples/datasets/images/plot_digits_last_image_001.png
:target: ../../auto_examples/datasets/plot_digits_last_image.html
:align: right
:scale: 60
diff --git a/doc/tutorial/statistical_inference/supervised_learning.rst b/doc/tutorial/statistical_inference/supervised_learning.rst
index a4b382d1868764ba99b905952e9e3ba1165b7c5b..7f54a1e92e905fe0f6f270f4318177e05824422f 100644
--- a/doc/tutorial/statistical_inference/supervised_learning.rst
+++ b/doc/tutorial/statistical_inference/supervised_learning.rst
@@ -38,7 +38,7 @@ Nearest neighbor and the curse of dimensionality
.. topic:: Classifying irises:
- .. image:: ../../auto_examples/datasets/images/plot_iris_dataset_1.png
+ .. image:: ../../auto_examples/datasets/images/plot_iris_dataset_001.png
:target: ../../auto_examples/datasets/plot_iris_dataset.html
:align: right
:scale: 65
@@ -75,7 +75,7 @@ Scikit-learn documentation for more information about this type of classifier.)
**KNN (k nearest neighbors) classification example**:
-.. image:: ../../auto_examples/neighbors/images/plot_classification_1.png
+.. image:: ../../auto_examples/neighbors/images/plot_classification_001.png
:target: ../../auto_examples/neighbors/plot_classification.html
:align: center
:scale: 70
@@ -158,7 +158,7 @@ in it's simplest form, fits a linear model to the data set by adjusting a set of
parameters in order to make the sum of the squared residuals of the model as small as possible.
-.. image:: ../../auto_examples/linear_model/images/plot_ols_1.png
+.. image:: ../../auto_examples/linear_model/images/plot_ols_001.png
:target: ../../auto_examples/linear_model/plot_ols.html
:scale: 40
:align: right
@@ -199,7 +199,7 @@ Shrinkage
If there are few data points per dimension, noise in the observations
induces high variance:
-.. image:: ../../auto_examples/linear_model/images/plot_ols_ridge_variance_1.png
+.. image:: ../../auto_examples/linear_model/images/plot_ols_ridge_variance_001.png
:target: ../../auto_examples/linear_model/plot_ols_ridge_variance.html
:scale: 70
:align: right
@@ -228,7 +228,7 @@ regression coefficients to zero: any two randomly chosen set of observations
are likely to be uncorrelated. This is called :class:`Ridge` regression:
-.. image:: ../../auto_examples/linear_model/images/plot_ols_ridge_variance_2.png
+.. image:: ../../auto_examples/linear_model/images/plot_ols_ridge_variance_002.png
:target: ../../auto_examples/linear_model/plot_ols_ridge_variance.html
:scale: 70
:align: right
@@ -274,15 +274,15 @@ Sparsity
----------
-.. |diabetes_ols_1| image:: ../../auto_examples/linear_model/images/plot_ols_3d_1.png
+.. |diabetes_ols_1| image:: ../../auto_examples/linear_model/images/plot_ols_3d_001.png
:target: ../../auto_examples/linear_model/plot_ols_3d.html
:scale: 65
-.. |diabetes_ols_3| image:: ../../auto_examples/linear_model/images/plot_ols_3d_3.png
+.. |diabetes_ols_3| image:: ../../auto_examples/linear_model/images/plot_ols_3d_003.png
:target: ../../auto_examples/linear_model/plot_ols_3d.html
:scale: 65
-.. |diabetes_ols_2| image:: ../../auto_examples/linear_model/images/plot_ols_3d_2.png
+.. |diabetes_ols_2| image:: ../../auto_examples/linear_model/images/plot_ols_3d_002.png
:target: ../../auto_examples/linear_model/plot_ols_3d.html
:scale: 65
@@ -349,7 +349,7 @@ application of Occam's razor: *prefer simpler models*.
Classification
---------------
-.. image:: ../../auto_examples/linear_model/images/plot_logistic_1.png
+.. image:: ../../auto_examples/linear_model/images/plot_logistic_001.png
:target: ../../auto_examples/linear_model/plot_logistic.html
:scale: 65
:align: right
@@ -375,7 +375,7 @@ function or **logistic** function:
This is known as :class:`LogisticRegression`.
-.. image:: ../../auto_examples/linear_model/images/plot_iris_logistic_1.png
+.. image:: ../../auto_examples/linear_model/images/plot_iris_logistic_001.png
:target: ../../auto_examples/linear_model/plot_iris_logistic.html
:scale: 83
@@ -423,11 +423,11 @@ the separating line (less regularization).
.. currentmodule :: sklearn.svm
-.. |svm_margin_unreg| image:: ../../auto_examples/svm/images/plot_svm_margin_1.png
+.. |svm_margin_unreg| image:: ../../auto_examples/svm/images/plot_svm_margin_001.png
:target: ../../auto_examples/svm/plot_svm_margin.html
:scale: 70
-.. |svm_margin_reg| image:: ../../auto_examples/svm/images/plot_svm_margin_2.png
+.. |svm_margin_reg| image:: ../../auto_examples/svm/images/plot_svm_margin_002.png
:target: ../../auto_examples/svm/plot_svm_margin.html
:scale: 70
@@ -473,11 +473,11 @@ build a decision function that is not linear but may be polynomial instead.
This is done using the *kernel trick* that can be seen as
creating a decision energy by positioning *kernels* on observations:
-.. |svm_kernel_linear| image:: ../../auto_examples/svm/images/plot_svm_kernels_1.png
+.. |svm_kernel_linear| image:: ../../auto_examples/svm/images/plot_svm_kernels_001.png
:target: ../../auto_examples/svm/plot_svm_kernels.html
:scale: 65
-.. |svm_kernel_poly| image:: ../../auto_examples/svm/images/plot_svm_kernels_2.png
+.. |svm_kernel_poly| image:: ../../auto_examples/svm/images/plot_svm_kernels_002.png
:target: ../../auto_examples/svm/plot_svm_kernels.html
:scale: 65
@@ -515,7 +515,7 @@ creating a decision energy by positioning *kernels* on observations:
-.. |svm_kernel_rbf| image:: ../../auto_examples/svm/images/plot_svm_kernels_3.png
+.. |svm_kernel_rbf| image:: ../../auto_examples/svm/images/plot_svm_kernels_003.png
:target: ../../auto_examples/svm/plot_svm_kernels.html
:scale: 65
@@ -548,7 +548,7 @@ creating a decision energy by positioning *kernels* on observations:
``svm_gui.py``; add data points of both classes with right and left button,
fit the model and change parameters and data.
-.. image:: ../../auto_examples/datasets/images/plot_iris_dataset_1.png
+.. image:: ../../auto_examples/datasets/images/plot_iris_dataset_001.png
:target: ../../auto_examples/datasets/plot_iris_dataset.html
:align: right
:scale: 70
diff --git a/doc/tutorial/statistical_inference/unsupervised_learning.rst b/doc/tutorial/statistical_inference/unsupervised_learning.rst
index 1d281e9a7d26b72a72d45f96ace854febb44eda4..d62c7e50d61bb2c4382c6a3400530b3326285bf4 100644
--- a/doc/tutorial/statistical_inference/unsupervised_learning.rst
+++ b/doc/tutorial/statistical_inference/unsupervised_learning.rst
@@ -24,7 +24,7 @@ Note that there exist a lot of different clustering criteria and associated
algorithms. The simplest clustering algorithm is :ref:`k_means`.
-.. image:: ../../auto_examples/cluster/images/plot_cluster_iris_2.png
+.. image:: ../../auto_examples/cluster/images/plot_cluster_iris_002.png
:target: ../../auto_examples/cluster/plot_cluster_iris.html
:scale: 70
:align: right
@@ -45,15 +45,15 @@ algorithms. The simplest clustering algorithm is
>>> print(y_iris[::10])
[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2]
-.. |k_means_iris_bad_init| image:: ../../auto_examples/cluster/images/plot_cluster_iris_3.png
+.. |k_means_iris_bad_init| image:: ../../auto_examples/cluster/images/plot_cluster_iris_003.png
:target: ../../auto_examples/cluster/plot_cluster_iris.html
:scale: 63
-.. |k_means_iris_8| image:: ../../auto_examples/cluster/images/plot_cluster_iris_1.png
+.. |k_means_iris_8| image:: ../../auto_examples/cluster/images/plot_cluster_iris_001.png
:target: ../../auto_examples/cluster/plot_cluster_iris.html
:scale: 63
-.. |cluster_iris_truth| image:: ../../auto_examples/cluster/images/plot_cluster_iris_4.png
+.. |cluster_iris_truth| image:: ../../auto_examples/cluster/images/plot_cluster_iris_004.png
:target: ../../auto_examples/cluster/plot_cluster_iris.html
:scale: 63
@@ -85,19 +85,19 @@ algorithms. The simplest clustering algorithm is
**Don't over-interpret clustering results**
-.. |lena| image:: ../../auto_examples/cluster/images/plot_lena_compress_1.png
+.. |lena| image:: ../../auto_examples/cluster/images/plot_lena_compress_001.png
:target: ../../auto_examples/cluster/plot_lena_compress.html
:scale: 60
-.. |lena_regular| image:: ../../auto_examples/cluster/images/plot_lena_compress_2.png
+.. |lena_regular| image:: ../../auto_examples/cluster/images/plot_lena_compress_002.png
:target: ../../auto_examples/cluster/plot_lena_compress.html
:scale: 60
-.. |lena_compressed| image:: ../../auto_examples/cluster/images/plot_lena_compress_3.png
+.. |lena_compressed| image:: ../../auto_examples/cluster/images/plot_lena_compress_003.png
:target: ../../auto_examples/cluster/plot_lena_compress.html
:scale: 60
-.. |lena_histogram| image:: ../../auto_examples/cluster/images/plot_lena_compress_4.png
+.. |lena_histogram| image:: ../../auto_examples/cluster/images/plot_lena_compress_004.png
:target: ../../auto_examples/cluster/plot_lena_compress.html
:scale: 60
@@ -177,7 +177,7 @@ This can be useful, for instance, to retrieve connected regions (sometimes
also referred to as connected components) when clustering an image:
-.. image:: ../../auto_examples/cluster/images/plot_lena_ward_segmentation_1.png
+.. image:: ../../auto_examples/cluster/images/plot_lena_ward_segmentation_001.png
:target: ../../auto_examples/cluster/plot_lena_ward_segmentation.html
:scale: 40
:align: right
@@ -200,7 +200,7 @@ features: **feature agglomeration**. This approach can be implemented by
clustering in the feature direction, in other words clustering the transposed data.
-.. image:: ../../auto_examples/cluster/images/plot_digits_agglomeration_1.png
+.. image:: ../../auto_examples/cluster/images/plot_digits_agglomeration_001.png
:target: ../../auto_examples/cluster/plot_digits_agglomeration.html
:align: right
:scale: 57
@@ -242,11 +242,11 @@ Principal component analysis: PCA
:ref:`PCA` selects the successive components that explain the maximum variance in the signal.
-.. |pca_3d_axis| image:: ../../auto_examples/decomposition/images/plot_pca_3d_1.png
+.. |pca_3d_axis| image:: ../../auto_examples/decomposition/images/plot_pca_3d_001.png
:target: ../../auto_examples/decomposition/plot_pca_3d.html
:scale: 70
-.. |pca_3d_aligned| image:: ../../auto_examples/decomposition/images/plot_pca_3d_2.png
+.. |pca_3d_aligned| image:: ../../auto_examples/decomposition/images/plot_pca_3d_002.png
:target: ../../auto_examples/decomposition/plot_pca_3d.html
:scale: 70
@@ -294,7 +294,7 @@ Independent Component Analysis: ICA
a maximum amount of independent information. It is able to recover
**non-Gaussian** independent signals:
-.. image:: ../../auto_examples/decomposition/images/plot_ica_blind_source_separation_1.png
+.. image:: ../../auto_examples/decomposition/images/plot_ica_blind_source_separation_001.png
:target: ../../auto_examples/decomposition/plot_ica_blind_source_separation.html
:scale: 70
:align: center