diff --git a/doc/images/last_digit.png b/doc/images/last_digit.png
new file mode 100644
index 0000000000000000000000000000000000000000..f6c715a54e216999839fa861dd55d590a5aa585b
Binary files /dev/null and b/doc/images/last_digit.png differ
diff --git a/doc/tutorial.rst b/doc/tutorial.rst
index 3a37eecb4d7db639e7608fe9e5db0aba0c8749a2..fb8c89dbe8937a046a929556caa8fba3875c3de6 100644
--- a/doc/tutorial.rst
+++ b/doc/tutorial.rst
@@ -1,6 +1,13 @@
 Getting started: an introduction to learning with the scikit
 =============================================================
 
+.. topic:: Section contents
+
+    In this section, we introduce the machine learning vocabulary that we
+    use throughout the `scikit.learn` and give a simple example of
+    solving a learning problem.
+
+
 Machine learning: the problem setting
 ---------------------------------------
@@ -27,7 +34,17 @@
 * **unsupervised learning**, in which we are trying to learning a
   synthetic representation of the data.
 
-Loading a sample dataset
+.. topic:: Training set and testing set
+
+    Machine learning is about learning some properties of a data set and
+    applying them to new data. This is why a common practice in machine
+    learning, to evaluate an algorithm, is to split the data at hand into
+    two sets: one that we call a *training set*, on which we learn data
+    properties, and one that we call a *testing set*, on which we test
+    these properties.
+
+
+Loading an example dataset
 --------------------------
 
 The `scikit.learn` comes with a few standard datasets, for instance the
@@ -39,61 +56,97 @@
 the `digits dataset
 
 >>> iris = datasets.load_iris()
 >>> digits = datasets.load_digits()
 
-A dataset is a dictionary-like object that holds all the samples and
-some metadata about the samples. You can access the underlying data
-with members `.data` and `.target`.
-
-For instance, in the case of the iris dataset, `iris.data` gives access
-to the features that can be used to classify the iris samples::
-
-    >>> iris.data
-    array([[ 5.1,  3.5,  1.4,  0.2],
-           [ 4.9,  3. ,  1.4,  0.2],
-           [ 4.7,  3.2,  1.3,  0.2],
-           ...
-           [ 6.5,  3. ,  5.2,  2. ],
-           [ 6.2,  3.4,  5.4,  2.3],
-           [ 5.9,  3. ,  5.1,  1.8]])
-
-and `iris.target` gives the ground thruth for the iris dataset, that is
-the labels describing the different classes of irises that we are trying
-to learn:
+A dataset is a dictionary-like object that holds all the data and some
+metadata about the data. This data is stored in the `.data` member, which
+is an `n_samples, n_features` array. In the case of a supervised problem,
+one or more response variables are stored in the `.target` member.
 
->>> iris.target
-array([ 0.,  0.,  0.,  0., ...  2.,  2.,  2.,  2.])
+For instance, in the case of the digits dataset, `digits.data` gives
+access to the features that can be used to classify the digits samples::
+
+    >>> digits.data
+    array([[  0.,   0.,   5., ...,   0.,   0.,   0.],
+           [  0.,   0.,   0., ...,  10.,   0.,   0.],
+           [  0.,   0.,   0., ...,  16.,   9.,   0.],
+           ...,
+           [  0.,   0.,   1., ...,   6.,   0.,   0.],
+           [  0.,   0.,   2., ...,  12.,   0.,   0.],
+           [  0.,   0.,  10., ...,  12.,   1.,   0.]])
 
-Prediction
-----------
+and `digits.target` gives the ground truth for the digit dataset, that
+is the number corresponding to each digit image that we are trying to
+learn:
 
-Suppose some given data points each belong to one of two classes, and
-the goal is to decide which class a new data point will be in. In
-``scikits.learn`` this is done with an *estimator*. An *estimator* is
-just a plain Python class that implements the methods fit(X, Y) and
-predict(T).
+>>> digits.target
+array([0, 1, 2, ..., 8, 9, 8])
 
-An example of predictor is the class
-``scikits.learn.neighbors.Neighbors``(XXX ref). The constructor of a predictor
-takes as arguments the parameters of the model. In this case, our only
-parameter is k, the number of neighbors to consider.
-
->>> from scikits.learn import neighbors
->>> clf = neighbors.Neighbors(k=3)
-
-The predictor now must be fitted to the model, that is, it must
-`learn` from the model. This is done by passing our training set to
-the ``fit`` method.
-
->>> clf.fit(iris.data, iris.target) #doctest: +ELLIPSIS
-<scikits.learn.neighbors.Neighbors instance at 0x...>
-
-Now you can predict new values
-
->>> print clf.predict([[0, 0, 0, 0]])
-[[ 0.]]
-
-
-Regression
-----------
-In the regression problem, classes take continous values.
-Linear Regression. TODO
+.. topic:: Shape of the data arrays
+
+    The data is always a 2D array, `n_samples, n_features`, although
+    the original data may have had a different shape. In the case of the
+    digits, each original sample is an image of shape `8, 8` and can be
+    accessed using:
+
+    >>> digits.images[0]
+    array([[  0.,   0.,   5.,  13.,   9.,   1.,   0.,   0.],
+           [  0.,   0.,  13.,  15.,  10.,  15.,   5.,   0.],
+           [  0.,   3.,  15.,   2.,   0.,  11.,   8.,   0.],
+           [  0.,   4.,  12.,   0.,   0.,   8.,   8.,   0.],
+           [  0.,   5.,   8.,   0.,   0.,   9.,   8.,   0.],
+           [  0.,   4.,  11.,   0.,   1.,  12.,   7.,   0.],
+           [  0.,   2.,  14.,   5.,  10.,  12.,   0.,   0.],
+           [  0.,   0.,   6.,  13.,  10.,   0.,   0.,   0.]])
+
+    The :ref:`simple example on this dataset <example_plot_digits_classification.py>`
+    illustrates how, starting from the original problem, one can shape the
+    data for consumption in the `scikit.learn`.
+
+
+Learning and predicting
+------------------------
+
+In the case of the digits dataset, the task is to predict the value of a
+hand-written digit from an image. We are given samples of each of the 10
+possible classes, on which we *fit* an `estimator` to be able to *predict*
+the labels corresponding to new data.
+
+In `scikit.learn`, an *estimator* is just a plain Python class that
+implements the methods `fit(X, Y)` and `predict(T)`.
+
+An example of an estimator is the class ``scikits.learn.svm.SVC`` that
+implements `Support Vector Classification
+<http://en.wikipedia.org/wiki/Support_vector_machine>`_.
The
+constructor of an estimator takes as arguments the parameters of the
+model, but for the time being, we will consider the estimator as a black
+box and not worry about these:
+
+>>> from scikits.learn import svm
+>>> clf = svm.SVC()
+
+We call our estimator instance `clf`, as it is a classifier. It now must
+be fitted to the data, that is, it must `learn` from the data. This is
+done by passing our training set to the ``fit`` method. As a training
+set, let us use all the images of our dataset apart from the last
+one:
+
+>>> clf.fit(digits.data[:-1], digits.target[:-1]) #doctest: +ELLIPSIS
+<scikits.learn.svm.SVC object at 0x...>
+
+Now you can predict new values. In particular, we can ask the
+classifier what the digit of the last image in the `digits` dataset is,
+which we have not used to train the classifier:
+
+>>> clf.predict(digits.data[-1])
+array([ 8.])
+
+The corresponding image is the following:
+
+.. image:: images/last_digit.png
+   :align: center
+
+As you can see, it is a challenging task: the images are of poor
+resolution. Do you agree with the classifier?
+
+A complete example of this classification problem is available as an
+example that you can run and study:
+:ref:`example_plot_digits_classification.py`.
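The tutorial above says an estimator is just a plain Python class implementing `fit(X, Y)` and `predict(T)`. As a sketch of that contract, the hypothetical toy class below (not part of the scikit; it uses a 1-nearest-neighbour rule on plain lists instead of a support vector machine) shows what "fit, then predict a held-out sample" amounts to:

```python
# A toy illustration of the estimator contract: a plain Python class
# implementing fit(X, Y) and predict(T).  Hypothetical example only,
# NOT a class from the scikit.

class OneNearestNeighbor:
    def fit(self, X, Y):
        # "Learning" here is simply memorizing the training set.
        self.X_train = X
        self.Y_train = Y
        return self

    def predict(self, T):
        # Label each query point with the label of the closest
        # training sample (squared Euclidean distance).
        predictions = []
        for t in T:
            distances = [sum((a - b) ** 2 for a, b in zip(x, t))
                         for x in self.X_train]
            predictions.append(self.Y_train[distances.index(min(distances))])
        return predictions

# Mirror the digits example: train on every sample except the last one,
# then predict the single held-out sample.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
Y = [0, 0, 1, 1]
clf = OneNearestNeighbor().fit(X[:-1], Y[:-1])
print(clf.predict(X[-1:]))  # held-out [0.9, 1.1] is closest to [1.0, 1.0] -> [1]
```

`svm.SVC` follows the same two-method interface; only the decision rule learned inside `fit` differs.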