diff --git a/doc/modules/clustering.rst b/doc/modules/clustering.rst
index b990159820f92b4c23b975650738c06c49d7bb41..2b6ec645b6bd8b6fe1ae2128e7fd7125bb8dce7e 100644
--- a/doc/modules/clustering.rst
+++ b/doc/modules/clustering.rst
@@ -565,13 +565,15 @@ Mathematical formulation
 ~~~~~~~~~~~~~~~~~~~~~~~~
 Assume two label assignments (of the same data), :math:`U` with :math:`R`
 classes and :math:`V` with :math:`C` classes. The entropy of either is the
- amount of uncertaintly for an array, and can be calculated as:
+amount of uncertaintly for an array, and can be calculated as:
 
 .. math:: H(U) = \sum_{i=1}^{|R|}P(i)log(P(i))
 
 Where P(i) is the number of instances in U that are in class :math:`R_i`.
 Likewise, for :math:`V`:
+
 .. math:: H(V) = \sum_{j=1}^{|C|}P'(j)log(P'(j))
+
 Where P'(j) is the number of instances in V that are in class :math:`C_j`.
 
 The (non-adjusted) mutual information between :math:`U` and :math:`V` is
diff --git a/doc/modules/mixture.rst b/doc/modules/mixture.rst
index 5b32ccecd0fa97d57bbfb6be42868b3eb8d189b9..c1a97bf5c22714855140522ac2f609ac594e9603 100644
--- a/doc/modules/mixture.rst
+++ b/doc/modules/mixture.rst
@@ -47,7 +47,7 @@ only needs to specify a loose upper bound on this number and a concentration
 parameter.
 
 Expectation-maximization
------------------------
+------------------------
 
 The main difficulty in learning gaussian mixture models from unlabeled
 data is that it is one usually doesn't know which points came from