Commit 9784dfd6 authored by Mathieu Blondel

Address @ogrisel and @amueller's comments.

parent 0b0d8d14
......@@ -99,6 +99,9 @@ zero features)::
>>> vectors.nnz / vectors.shape[0]
118
``sklearn.datasets.fetch_20newsgroups_tfidf`` is a function which returns
ready-to-use tfidf features instead of file names.
.. _`20 newsgroups website`: http://people.csail.mit.edu/jrennie/20Newsgroups/
.. _`TF-IDF`: http://en.wikipedia.org/wiki/Tf-idf
......
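For reference, the `vectors.nnz / vectors.shape[0]` expression in the hunk above is the average number of stored non-zero features per sample. Below is a minimal pure-Python sketch of that ratio. The toy rows and the helper name `average_nonzero_per_row` are hypothetical, and plain dicts stand in for the sparse matrix; none of this is part of the commit.

```python
# Hypothetical toy data: each row maps feature index -> tf-idf weight.
# These dicts stand in for a sparse matrix; this is not the 20 newsgroups data.
rows = [
    {0: 0.5, 3: 0.2},
    {1: 0.7},
    {0: 0.1, 2: 0.4, 4: 0.9},
]


def average_nonzero_per_row(sparse_rows):
    """Return the number of stored non-zero entries divided by the row count."""
    nnz = sum(len(row) for row in sparse_rows)  # analogue of vectors.nnz
    return nnz / len(sparse_rows)               # analogue of vectors.shape[0]


avg = average_nonzero_per_row(rows)
print(avg)  # 2.0 (6 non-zero entries over 3 rows)
```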
......@@ -97,7 +97,7 @@ def download_20newsgroups(target_dir, cache_path):
def fetch_20newsgroups(data_home=None, subset='train', categories=None,
shuffle=True, random_state=42, download_if_missing=True):
-    """Load the filenames of the 20 newsgroups dataset
+    """Load the filenames of the 20 newsgroups dataset.
Parameters
----------
......@@ -225,6 +225,7 @@ def fetch_20newsgroups_tfidf(subset="train", data_home=None):
data_home = get_data_home(data_home=data_home)
mem = Memory(cachedir=data_home, verbose=False)
# we shuffle but use a fixed seed for the memoization
data_train = fetch_20newsgroups(data_home=data_home,
subset='train',
categories=None,
......
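The comment about shuffling with a fixed seed can be illustrated with a small sketch. It uses the standard library's `functools.lru_cache` as a stand-in for the `Memory` object in the hunk, and the function `load_shuffled` and its toy data are hypothetical. The assumption is that the cache is keyed on the call arguments, so a fixed seed keeps the cached result identical to what a fresh computation would return.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def load_shuffled(subset, seed):
    """Return a deterministically shuffled toy 'dataset' (hypothetical stand-in)."""
    items = list(range(10))
    # A private Random instance with a fixed seed gives the same order every time.
    random.Random(seed).shuffle(items)
    return tuple(items)


first = load_shuffled("train", 42)   # computed and stored in the cache
second = load_shuffled("train", 42)  # served from the cache

# Bypassing the cache yields the same order, because the seed is fixed.
recomputed = load_shuffled.__wrapped__("train", 42)
print(first == second == recomputed)  # True
```

With an unseeded shuffle, the cached value would silently differ from a recomputed one, which is the situation a fixed seed avoids.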