[MRG + 1] Added some documentation for loading external datasets (Issue 3808) (#7516)
* Update tutorial.rst
* Update tutorial.rst
* Update tutorial.rst
* Update tutorial.rst
* Update index.rst
* Update index.rst
* Update tutorial.rst
* Update tutorial.rst
* Update tutorial.rst
* Update faq.rst
* Update faq.rst
* Divided in two cases (standard columnar and misc data)
  I also added a preprocessing note at the end
* Update tutorial.rst
* Update faq.rst
* Update index.rst
  Also added some references that were in the original FAQ and pointed the FAQ to here
* Update index.rst
  Added the information from the removed part of the FAQ because I felt that the FAQ version was better than the explanation I gave.
* Update index.rst
  reference to skimage and also has sklearn.preprocessing.OneHotEncoder instead of OneHotEncoder
* Update index.rst
* Update index.rst
  Changed with @jnothman's feedback https://github.com/scikit-learn/scikit-learn/pull/7516/files/d16ac523ed404188fc1f2529ac89050d4a974e3f
* Update faq.rst
* optimized file formats added to datasets/index.rst
  Note: if you manage your own numerical data it is recommended to use an optimized file format such as HDF5 to reduce data load times. Various libraries such as H5Py, PyTables and pandas provides a Python interface for reading and writing data in that format. - From the FAQ
* faq.rst: Moved the comment in bunch section to datasets index
  This comment has been moved to the datasets index in the external_datasets section: Note: if you manage your own numerical data it is recommended to use an optimized file format such as HDF5 to reduce data load times. Various libraries such as H5Py, PyTables and pandas provides a Python interface for reading and writing data in that format.
* Update index.rst
  Included all changes mentioned by @amueller and @jnothman
* Update faq.rst
* Update faq.rst
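
For context on the "standard columnar" case this commit documents, here is a minimal sketch of reading external columnar data into the feature matrix and target vector that scikit-learn estimators expect. It is not taken from the commit itself: the in-memory CSV, the column names, and the choice of the last column as the target are assumptions made for illustration, and NumPy's `genfromtxt` is only one of several possible readers.

```python
import io

import numpy as np

# Hypothetical CSV content standing in for an external columnar file;
# the column names and values are illustrative only.
csv_text = """f1,f2,target
1.0,2.0,0
3.5,4.5,1
5.0,6.0,0
"""

# Parse the numeric rows, skipping the single header line.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Assumption: every column but the last is a feature, the last is the label.
X = data[:, :-1]             # shape (n_samples, n_features)
y = data[:, -1].astype(int)  # shape (n_samples,)
```

With a real file, the `io.StringIO` object would be replaced by a path. Categorical or otherwise non-numeric columns would need the kind of preprocessing the commit message alludes to before being passed to an estimator.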