Divergence and Distance
DiscreteEntropy.cross_entropy
— Function cross_entropy(P::CountVector, Q::CountVector, ::Type{T}) where {T<:AbstractEstimator}
\[H(P,Q) = - \sum_x(P(x) \log(Q(x)))\]
Compute the cross entropy of $P$ and $Q$, given an estimator of type $T$. $P$ and $Q$ must be the same length. Both vectors are normalised. The cross entropy of a probability distribution $P$ with itself is equal to its entropy, i.e. $H(P, P) = H(P)$.
Example
julia> P = cvector([1,2,3,4,3,2])
julia> Q = cvector([2,5,5,4,3,4])
julia> ce = cross_entropy(P, Q, MaximumLikelihood)
1.778564897565542
Note: not every estimator is currently supported.
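As a quick sanity check of the identity $H(P, P) = H(P)$, a minimal sketch reusing the same count vector (output elided; the approximate value in the comment is our own calculation, not package output):
julia> using DiscreteEntropy
julia> P = cvector([1,2,3,4,3,2])
julia> cross_entropy(P, P, MaximumLikelihood)  # ≈ 1.7141 nats, the maximum likelihood entropy of P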
DiscreteEntropy.kl_divergence
— Function kl_divergence(P::CountVector, Q::CountVector, estimator::Type{T}; truncate::Union{Nothing, Int} = nothing) where {T<:AbstractEstimator}
\[D_{KL}(P ‖ Q) = \sum_{x \in X} P(x) \log \left( \frac{P(x)}{Q(x)} \right)\]
Compute the Kullback-Leibler divergence between two discrete distributions. $P$ and $Q$ must be the same length. If the distributions are not normalised, they will be normalised before the divergence is computed.
If the distributions are not over the same space or the cross entropy is negative, then the function returns Inf.
If truncate is set to some integer value x, the result is rounded to x decimal places.
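A usage sketch with the count vectors from the cross entropy example (output elided; the approximate value in the comment is our own calculation):
julia> P = cvector([1,2,3,4,3,2])
julia> Q = cvector([2,5,5,4,3,4])
julia> kl_divergence(P, Q, MaximumLikelihood)               # D_KL(P ‖ Q) = H(P,Q) - H(P) ≈ 0.0645 nats
julia> kl_divergence(P, Q, MaximumLikelihood; truncate=4)   # same value, rounded to 4 decimal places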
DiscreteEntropy.jensen_shannon_divergence
— Function jensen_shannon_divergence(countsP::CountVector, countsQ::CountVector)
jensen_shannon_divergence(countsP::CountVector, countsQ::CountVector, estimator::Type{T}) where {T<:NonParamterisedEstimator}
jensen_shannon_divergence(countsP::CountVector, countsQ::CountVector, estimator::Type{Bayes}, α)
Compute the Jensen-Shannon divergence between discrete distributions $P$ and $Q$, as represented by their histograms. If no estimator is specified, it defaults to MaximumLikelihood.
\[\widehat{JS}(P, Q) = \hat{H}\left(\frac{P + Q}{2} \right) - \left( \frac{\hat{H}(P) + \hat{H}(Q)}{2} \right) \]
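A usage sketch of the three methods (output elided; we assume the α argument of the Bayes method is that estimator's Dirichlet prior parameter):
julia> P = cvector([1,2,3,4,3,2])
julia> Q = cvector([2,5,5,4,3,4])
julia> jensen_shannon_divergence(P, Q)                      # defaults to MaximumLikelihood
julia> jensen_shannon_divergence(P, Q, MaximumLikelihood)   # estimator given explicitly
julia> jensen_shannon_divergence(P, Q, Bayes, 1.0)          # Bayes estimator with α = 1.0 (assumed prior parameter)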
DiscreteEntropy.jensen_shannon_distance
— Function jensen_shannon_distance(P::CountVector, Q::CountVector, estimator::Type{T}) where {T<:AbstractEstimator}
Compute the Jensen-Shannon distance between $P$ and $Q$, using the given estimator.
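A usage sketch reusing $P$ and $Q$ from the previous example; the comment assumes the conventional definition of the distance as the square root of the divergence:
julia> jensen_shannon_distance(P, Q, MaximumLikelihood)   # assumed equal to sqrt(jensen_shannon_divergence(P, Q, MaximumLikelihood))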
DiscreteEntropy.jeffreys_divergence
— Function jeffreys_divergence(P::CountVector, Q::CountVector)
jeffreys_divergence(P::CountVector, Q::CountVector, estimator::Type{T}) where T<:AbstractEstimator
\[J(p, q) = D_{KL}(p \Vert q) + D_{KL}(q \Vert p)\]
If no estimator is specified, the maximum likelihood estimator is used.
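A usage sketch reusing $P$ and $Q$ from the earlier examples (output elided); since the definition is a sum of the two KL divergences, swapping the arguments gives the same value:
julia> jeffreys_divergence(P, Q)                     # maximum likelihood by default
julia> jeffreys_divergence(Q, P, MaximumLikelihood)  # symmetric: same value as jeffreys_divergence(P, Q, MaximumLikelihood)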
DiscreteEntropy.uncertainty_coefficient
— Function uncertainty_coefficient(joint::Matrix{I}, estimator::Type{T}; symmetric=false) where {T<:AbstractEstimator, I<:Real}
Compute Theil's uncertainty coefficient of the 2-dimensional matrix joint using estimator, where joint is the histogram of the joint distribution of two random variables $(X;Y)$ and $I(X;Y)$ is the (estimated) mutual information.
\[U(X \mid Y) = \frac{I(X;Y)}{H(X)}\]
If symmetric is true, compute the weighted average between $X$ and $Y$:
\[U(X, Y) = 2 \left[ \frac{H(X) + H(Y) - H(X, Y)} {H(X) + H(Y)} \right]\]
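A usage sketch with a small joint count matrix (values chosen purely for illustration; output elided):
julia> joint = [10 2 1; 3 12 2; 1 4 9]
julia> uncertainty_coefficient(joint, MaximumLikelihood)                  # U(X | Y) = I(X;Y) / H(X)
julia> uncertainty_coefficient(joint, MaximumLikelihood; symmetric=true)  # weighted, symmetric version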