diff --git a/doc/datasets/index.rst b/doc/datasets/index.rst
index 2e36c85b659da5dae9ff11c5b5607386782e21b7..09abd44bcebcacdc5c4f2694a9586f16d590226b 100644
--- a/doc/datasets/index.rst
+++ b/doc/datasets/index.rst
@@ -27,15 +27,16 @@ The simplest one is the interface for sample images, which is described below
 in the :ref:`sample_images` section.
 
 The dataset generation functions and the svmlight loader share a simplistic
-interface, returning a tuple ``(X, y)`` consisting of a n_samples x n_features
-numpy array X and an array of length n_samples containing the targets y.
+interface, returning a tuple ``(X, y)`` consisting of a ``n_samples`` *
+``n_features`` numpy array ``X`` and an array of length ``n_samples``
+containing the targets ``y``.
 
 The toy datasets as well as the 'real world' datasets and the datasets
 fetched from mldata.org have more sophisticated structure.
 These functions return a dictionary-like object holding at least two items:
-an array of shape ``n_samples`` * `` n_features`` with key ``data``
+an array of shape ``n_samples`` * ``n_features`` with key ``data``
 (except for 20newsgroups)
-and a NumPy array of length ``n_samples``, containing the target values,
+and a numpy array of length ``n_samples``, containing the target values,
 with key ``target``.
 
 The datasets also contain a description in ``DESCR`` and some contain
diff --git a/doc/datasets/labeled_faces.rst b/doc/datasets/labeled_faces.rst
index 8786335e6fec02f79f10b955fb6e13b74421eaf2..85a2b41a6475bbbaa742e757a745e7cf216d21d6 100644
--- a/doc/datasets/labeled_faces.rst
+++ b/doc/datasets/labeled_faces.rst
@@ -92,15 +92,16 @@ is a pair of two picture belonging or not to the same person::
 
     >>> lfw_pairs_train.target.shape
     (2200,)
 
-Both for the ``fetch_lfw_people`` and ``fetch_lfw_pairs`` function it is
+Both for the :func:`sklearn.datasets.fetch_lfw_people` and
+:func:`sklearn.datasets.fetch_lfw_pairs` functions it is
 possible to get an additional dimension with the RGB color channels by
 passing ``color=True``, in that case the shape will be
 ``(2200, 2, 62, 47, 3)``.
 
-The ``fetch_lfw_pairs`` datasets is subdivided into 3 subsets: the development
-``train`` set, the development ``test`` set and an evaluation ``10_folds``
-set meant to compute performance metrics using a 10-folds cross
-validation scheme.
+The :func:`sklearn.datasets.fetch_lfw_pairs` dataset is subdivided into
+3 subsets: the development ``train`` set, the development ``test`` set and
+an evaluation ``10_folds`` set meant to compute performance metrics using a
+10-fold cross-validation scheme.
 
 .. topic:: References:
diff --git a/doc/datasets/mldata.rst b/doc/datasets/mldata.rst
index 5620a43df3ee5ceb287729566edea8e15fb655e7..5083317cffc5300a459bd6df8b0fd56a62255b4f 100644
--- a/doc/datasets/mldata.rst
+++ b/doc/datasets/mldata.rst
@@ -13,7 +13,8 @@ Downloading datasets from the mldata.org repository
 data, supported by the `PASCAL network <http://www.pascal-network.org>`_ .
 
 The ``sklearn.datasets`` package is able to directly download data
-sets from the repository using the function ``fetch_mldata(dataname)``.
+sets from the repository using the function
+:func:`sklearn.datasets.fetch_mldata`.
 
 For example, to download the MNIST digit recognition database::
 
@@ -38,14 +39,15 @@ specified by the ``data_home`` keyword argument, which defaults to
     ['mnist-original.mat']
 
 Data sets in `mldata.org <http://mldata.org>`_ do not adhere to a strict
-naming or formatting convention. ``fetch_mldata`` is able to make sense
-of the most common cases, but allows to tailor the defaults to individual
-datasets:
+naming or formatting convention. :func:`sklearn.datasets.fetch_mldata` is
+able to make sense of the most common cases, but allows you to tailor the
+defaults to individual datasets:
 
 * The data arrays in `mldata.org <http://mldata.org>`_ are most often
   shaped as ``(n_features, n_samples)``. This is the opposite of the
-  ``scikit-learn`` convention, so ``fetch_mldata`` transposes the matrix
-  by default. The ``transpose_data`` keyword controls this behavior::
+  ``scikit-learn`` convention, so :func:`sklearn.datasets.fetch_mldata`
+  transposes the matrix by default. The ``transpose_data`` keyword controls
+  this behavior::
 
     >>> iris = fetch_mldata('iris', data_home=custom_data_home)
     >>> iris.data.shape
@@ -55,12 +57,12 @@ datasets:
     >>> iris.data.shape
     (4, 150)
 
-* For datasets with multiple columns, ``fetch_mldata`` tries to identify
-  the target and data columns and rename them to ``target`` and ``data``.
-  This is done by looking for arrays named ``label`` and ``data`` in the
-  dataset, and failing that by choosing the first array to be ``target``
-  and the second to be ``data``. This behavior can be changed with the
-  ``target_name`` and ``data_name`` keywords, setting them to a specific
+* For datasets with multiple columns, :func:`sklearn.datasets.fetch_mldata`
+  tries to identify the target and data columns and rename them to ``target``
+  and ``data``. This is done by looking for arrays named ``label`` and
+  ``data`` in the dataset, and failing that by choosing the first array to be
+  ``target`` and the second to be ``data``. This behavior can be changed with
+  the ``target_name`` and ``data_name`` keywords, setting them to a specific
   name or index number (the name and order of the columns in the datasets
   can be found at its `mldata.org <http://mldata.org>`_ under the tab "Data"::
diff --git a/doc/datasets/twenty_newsgroups.rst b/doc/datasets/twenty_newsgroups.rst
index 003366efa4606e532587e7bb129fb21973cb7e1d..593f35978017d5c31160d0b76d26e8dc5486a080 100644
--- a/doc/datasets/twenty_newsgroups.rst
+++ b/doc/datasets/twenty_newsgroups.rst
@@ -10,22 +10,22 @@ between the train and test set is based upon a messages posted before
 and after a specific date.
 
 This module contains two loaders. The first one,
-``sklearn.datasets.fetch_20newsgroups``,
+:func:`sklearn.datasets.fetch_20newsgroups`,
 returns a list of the raw texts that can be fed to text feature
 extractors such as :class:`sklearn.feature_extraction.text.Vectorizer`
 with custom parameters so as to extract feature vectors.
-The second one, ``sklearn.datasets.fetch_20newsgroups_vectorized``,
+The second one, :func:`sklearn.datasets.fetch_20newsgroups_vectorized`,
 returns ready-to-use features, i.e., it is not necessary to use a feature
 extractor.
 
 Usage
 -----
 
-The ``sklearn.datasets.fetch_20newsgroups`` function is a data
+The :func:`sklearn.datasets.fetch_20newsgroups` function is a data
 fetching / caching functions that downloads the data archive from
 the original `20 newsgroups website`_, extracts the archive contents
 in the ``~/scikit_learn_data/20news_home`` folder and calls the
-``sklearn.datasets.load_file`` on either the training or
+:func:`sklearn.datasets.load_files` on either the training or
 testing set folder, or both of them::
 
     >>> from sklearn.datasets import fetch_20newsgroups
@@ -65,7 +65,8 @@ attribute is the integer index of the category::
     array([12, 6, 9, 8, 6, 7, 9, 2, 13, 19])
 
 It is possible to load only a sub-selection of the categories by passing the
-list of the categories to load to the ``fetch_20newsgroups`` function::
+list of the categories to load to the
+:func:`sklearn.datasets.fetch_20newsgroups` function::
 
     >>> cats = ['alt.atheism', 'sci.space']
     >>> newsgroups_train = fetch_20newsgroups(subset='train', categories=cats)
@@ -106,7 +107,7 @@ components by sample in a more than 30000-dimensional space
     >>> vectors.nnz / float(vectors.shape[0])
     159.01327433628319
 
-``sklearn.datasets.fetch_20newsgroups_vectorized`` is a function which returns
+:func:`sklearn.datasets.fetch_20newsgroups_vectorized` is a function which returns
 ready-to-use tfidf features instead of file names.
 
 .. _`20 newsgroups website`: http://people.csail.mit.edu/jrennie/20Newsgroups/
@@ -147,7 +148,7 @@ Let's take a look at what the most informative features are:
     ...     for i, category in enumerate(categories):
    ...         top10 = np.argsort(classifier.coef_[i])[-10:]
     ...         print("%s: %s" % (category, " ".join(feature_names[top10])))
-    ... 
+    ...
     >>> show_top10(clf, vectorizer, newsgroups_train.target_names)
     alt.atheism: sgi livesey atheists writes people caltech com god keith edu
     comp.graphics: organization thanks files subject com image lines university edu graphics
@@ -176,7 +177,7 @@ of each file.
 **remove** should be a tuple containing any subset of
 ``('headers', 'footers', 'quotes')``, telling it to remove headers, signature
 blocks, and quotation blocks respectively.
-    >>> newsgroups_test = fetch_20newsgroups(subset='test', 
+    >>> newsgroups_test = fetch_20newsgroups(subset='test',
     ...                                      remove=('headers', 'footers', 'quotes'),
     ...                                      categories=categories)
     >>> vectors_test = vectorizer.transform(newsgroups_test.data)
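
A note for reviewers: the two return conventions described in the ``index.rst`` hunk above can be sanity-checked without network access. This is a minimal sketch using ``make_classification`` and ``load_iris`` as stand-ins for the generation functions and the toy-dataset loaders (the specific parameters are illustrative, not part of this patch):

```python
# Sketch of the two return conventions documented in doc/datasets/index.rst.
from sklearn.datasets import load_iris, make_classification

# Dataset generation functions (and the svmlight loader) return a plain
# (X, y) tuple: X has shape (n_samples, n_features), y has length n_samples.
X, y = make_classification(n_samples=150, n_features=4, random_state=0)
print(X.shape)  # (150, 4)
print(y.shape)  # (150,)

# Toy and 'real world' loaders instead return a dictionary-like object
# holding at least the ``data`` and ``target`` keys, plus a ``DESCR``
# description.
iris = load_iris()
print(iris.data.shape, iris.target.shape)  # (150, 4) (150,)
print("DESCR" in iris)  # True
```

Running this confirms the shapes and keys that the rewritten paragraphs promise, so the cross-references introduced by this patch stay consistent with the actual API.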