diff --git a/scikits/learn/datasets/README.txt b/scikits/learn/datasets/README.txt
new file mode 100644
index 0000000000000000000000000000000000000000..4e8bdbafff4ecb6764fede8bbd41673b1a09c2de
--- /dev/null
+++ b/scikits/learn/datasets/README.txt
@@ -0,0 +1,28 @@
+Last Change: Tue Jul 17 04:00 PM 2007 J
+
+The datasets package defines a set of subpackages, each containing a dataset
+useful for demos, examples, etc. It can be seen as an equivalent of the R
+datasets package, but for Python.
+
+Each subdirectory is a Python package and should define a function load that
+returns the corresponding data. For example, to access the dataset data1:
+
+>>> from datasets.data1 import load
+>>> d = load()  # d now contains the data of the dataset data1
+
+load can obtain the data however it wants: from a file (Python script, CSV
+file, etc.), from the internet, and so on. Each package must also define the
+following special variables, each holding a Python string:
+    - COPYRIGHT: copyright information
+    - SOURCE: where the data come from
+    - DESCSHORT: short description
+    - DESCLONG: long description
+    - NOTE: notes on the dataset
+
+For the datasets to be usable in the learn scikit, the project that initiated
+this package, load must return a dict with the following keys:
+    - 'data': a record array containing the actual data.
+    - 'label': a rank 1 array of integers; label[i] is the label index of
+      data[i].
+    - 'class': a record array such that class[i] is the class name for label
+      index i, i.e. the mapping from label index to label name.
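
Note on the conventions in this README: the requirements it lists (a load function, five string variables, and a dict with 'data', 'label' and 'class') can be checked mechanically. The following is a minimal sketch, not part of the patch. The helper check_dataset_package and the stand-in module data1 are hypothetical, plain lists stand in for the record arrays so that the sketch needs only the standard library, and the short-description variable is spelled DESCSHORT by analogy with DESCLONG (adjust if a package uses a different spelling).

```python
import types

# Variable names follow the README; the spelling DESCSHORT is an assumption
# made by analogy with DESCLONG.
REQUIRED_STRINGS = ("COPYRIGHT", "SOURCE", "DESCSHORT", "DESCLONG", "NOTE")
REQUIRED_KEYS = ("data", "label", "class")


def check_dataset_package(pkg):
    """Return a list of convention violations (empty if none were found)."""
    problems = []
    # Each special variable must exist and hold a string.
    for name in REQUIRED_STRINGS:
        if not isinstance(getattr(pkg, name, None), str):
            problems.append("missing or non-string variable: " + name)
    # The package must expose a callable named load.
    load = getattr(pkg, "load", None)
    if not callable(load):
        problems.append("missing load function")
        return problems
    d = load()
    if not isinstance(d, dict):
        problems.append("load() did not return a dict")
        return problems
    for key in REQUIRED_KEYS:
        if key not in d:
            problems.append("missing key: " + key)
    # label[i] describes data[i], so both must have the same length.
    if "data" in d and "label" in d and len(d["data"]) != len(d["label"]):
        problems.append("'data' and 'label' differ in length")
    return problems


# Hypothetical stand-in for a dataset package such as datasets.data1.
# Plain lists replace the record arrays to keep the sketch dependency-free.
data1 = types.ModuleType("data1")
data1.COPYRIGHT = "public domain"
data1.SOURCE = "made up for this sketch"
data1.DESCSHORT = "two toy samples"
data1.DESCLONG = "two toy samples with two features each"
data1.NOTE = "illustration only"
data1.load = lambda: {
    "data": [(5.1, 3.5), (6.2, 2.9)],
    "label": [0, 1],
    "class": ["a", "b"],
}

print(check_dataset_package(data1))
```

For the stand-in module above the check reports no problems and prints an empty list. A real package would return NumPy record arrays rather than lists; the length check works the same way for both.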