Commit 0b818d3b authored by Nelson Liu, committed by Andreas Mueller

[MRG+1] edited the FAQ to fix a typo and include info regarding global random states (#5927)

* edited the FAQ to fix a typo and include info regarding global random states.

* more explicitly delineated scikit-learn's use of numpy global random state and added comments about thread safety.

* amended info about multiprocessing and instead discussed how global random states are prone to unintended change. Also used sphinx formatting on numpy.random.seed.

* fixed line breaks

* link to function

* emphasize random_state param for estimators and cv splitters

* edit line length and add backticks to param name
parent c726085e
......@@ -8,7 +8,8 @@ Here we try to give some answers to questions that regularly pop up on the maili
What is the project name (a lot of people get it wrong)?
--------------------------------------------------------
scikit-learn, but not scikit or SciKit nor sci-kit learn. Also not scikits.learn or scikits-learn, which where previously used.
scikit-learn, but not scikit or SciKit nor sci-kit learn.
Also not scikits.learn or scikits-learn, which were previously used.
How do you pronounce the project name?
------------------------------------------
......@@ -201,7 +202,6 @@ DBSCAN with Levenshtein distances::
Similar tricks can be used, with some care, for tree kernels, graph kernels,
etc.
Why do I sometime get a crash/freeze with n_jobs > 1 under OSX or Linux?
------------------------------------------------------------------------
......@@ -249,9 +249,9 @@ program: Insert the following instructions in your main script::
You can find more default on the new start methods in the `multiprocessing
documentation <https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods>`_.
Why is there no support for deep or reinforcement learning / Will there be support for deep or reinforcement learning in scikit-learn?
--------------------------------------------------------------------------------------------------------------------------------------
Deep learning and reinforcement learning both require a rich vocabulary to
define an architecture, with deep learning additionally requiring
GPUs for efficient computing. However, neither of these fit within
......@@ -259,9 +259,9 @@ the design constraints of scikit-learn; as a result, deep learning
and reinforcement learning are currently out of scope for what
scikit-learn seeks to achieve.
Why is my pull request not getting any attention?
-------------------------------------------------
The scikit-learn review process takes a significant amount of time, and
contributors should not be discouraged by a lack of activity or review on
their pull request. We care a lot about getting things right
......@@ -276,3 +276,23 @@ If a review of your pull request comes slowly, it is likely because the
reviewers are busy. We ask for your understanding and request that you
not close your pull request or discontinue your work solely because of
this reason.
How do I set a ``random_state`` for an entire execution?
--------------------------------------------------------
For testing and replicability, it is often important to have the entire execution
controlled by a single seed for the pseudo-random number generator used in
algorithms that have a randomized component. Scikit-learn does not use its own
global random state; whenever a RandomState instance or an integer random seed
is not provided as an argument, it relies on the numpy global random state,
which can be set using :func:`numpy.random.seed`.
For example, to seed the numpy global random state with 42, one could
execute the following in a script::

    import numpy as np
    np.random.seed(42)

However, a global random state is prone to modification by other code during
execution. Thus, the only way to ensure replicability is to pass ``RandomState``
instances everywhere and ensure that both estimators and cross-validation
splitters have their ``random_state`` parameter set.
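
As a minimal sketch of the advice above, the following passes an explicit seed to the data generator, the estimator, and the cross-validation splitter, then checks that the scores are unchanged even after other code has touched the numpy global random state. The dataset, the choice of ``RandomForestClassifier`` and ``KFold``, the helper ``run_once``, and the seed value are illustrative assumptions, not part of the original FAQ. An integer seed is used here for simplicity; a ``RandomState`` instance could be passed instead, with the caveat that its state advances each time it is consumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

SEED = 42  # hypothetical seed value, chosen only for illustration


def run_once(seed):
    """Run one cross-validated evaluation in which every random component is seeded."""
    # Synthetic data, generated with an explicit seed.
    X, y = make_classification(n_samples=200, n_features=10, random_state=seed)
    # Both the estimator and the CV splitter receive random_state explicitly.
    clf = RandomForestClassifier(n_estimators=20, random_state=seed)
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(clf, X, y, cv=cv)


first = run_once(SEED)

# Mimic unrelated code that reseeds and consumes the global random state.
np.random.seed(0)
np.random.rand(10)

second = run_once(SEED)

# The scores match because nothing in run_once depends on the global state.
print(np.array_equal(first, second))
```

Because no step falls back on the numpy global random state, the intervening ``np.random.seed(0)`` call has no effect on the result.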