Commit df27e261 authored by Raul Garreta, committed by Ignacio Rossi

model persistence doc, added improvements from ogrisel comments

parent 8489c330
@@ -36,12 +36,25 @@ persistence model, namely `pickle <http://docs.python.org/library/pickle.html>`_
 In the specific case of the scikit, it may be more interesting to use
 joblib's replacement of pickle (``joblib.dump`` & ``joblib.load``),
-which is more efficient on big data, but can only pickle to the disk
-and not to a string::
+which is more efficient on objects that carry large numpy arrays internally as
+is often the case for fitted scikit-learn estimators, but can only pickle to the
+disk and not to a string::

   >>> from sklearn.externals import joblib
   >>> joblib.dump(clf, 'filename.pkl') # doctest: +SKIP

+Later you can load back the pickled model (possibly in another Python process)
+with::
+
+  >>> clf = joblib.load('filename.pkl') # doctest:+SKIP
+
+.. note::
+
+   joblib.dump returns a list of filenames. Each individual numpy array
+   contained in the `clf` object is serialized as a separate file on the
+   filesystem. All files are required in the same folder when reloading the
+   model with joblib.load.
+
 Security & maintainability limitations
 --------------------------------------
......
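The "to the disk and not to a string" contrast in the hunk above can be illustrated with the standard library alone. The sketch below is only a hedged illustration: `model_state` is a hypothetical stand-in for a fitted estimator, and on Python 3 the "string" produced by `pickle.dumps` is a `bytes` object.

```python
import pickle

# Hypothetical stand-in for the learned state of a fitted estimator.
model_state = {"coef": [0.5, -1.25], "intercept": 0.1}

# pickle.dumps serializes to an in-memory bytes object; no file is written.
payload = pickle.dumps(model_state)

# pickle.loads rebuilds an equal object directly from those bytes.
restored = pickle.loads(payload)
```

joblib's `dump` and `load`, by comparison, are used here with a filename, which is the trade-off the changed paragraph describes.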
@@ -234,7 +234,19 @@ and not to a string::
   >>> from sklearn.externals import joblib
   >>> joblib.dump(clf, 'filename.pkl') # doctest: +SKIP

-It's important for you to know that pickle has some security and maintainability
-issues. Please refer to section :ref:`model_persistence` for more detailed
-information about model persistence with scikit-learn.
+Later you can load back the pickled model (possibly in another Python process)
+with::
+
+  >>> clf = joblib.load('filename.pkl') # doctest:+SKIP
+
+.. note::
+
+   joblib.dump returns a list of filenames. Each individual numpy array
+   contained in the `clf` object is serialized as a separate file on the
+   filesystem. All files are required in the same folder when reloading the
+   model with joblib.load.
+
+Note that pickle has some security and maintainability issues. Please refer to
+section :ref:`model_persistence` for more detailed information about model
+persistence with scikit-learn.
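For readers who want to try the dump/load round trip from this change, here is a minimal sketch under a few assumptions: it imports the standalone `joblib` package rather than `sklearn.externals.joblib` as in the diff, it writes into a temporary directory, and `model_state` is a hypothetical stand-in for a fitted `clf`. How many files `joblib.dump` writes may depend on the joblib version, so the sketch only checks that every returned filename exists.

```python
import os
import tempfile

import joblib  # assumption: standalone package, not sklearn.externals.joblib

# Hypothetical stand-in for a fitted estimator such as `clf`.
model_state = {"coef": [0.5, -1.25], "intercept": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filename.pkl")

    # joblib.dump returns the list of filenames it wrote.
    written = joblib.dump(model_state, path)

    # Every returned file must be present when the model is reloaded.
    all_exist = all(os.path.exists(name) for name in written)

    # joblib.load rebuilds the object, possibly in another Python process.
    restored = joblib.load(path)
```

Keeping the returned list around makes it easier to copy or archive the persisted model as a unit, which is the point of the note about keeping all files in the same folder.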