genieclust.inequity
Inequity (inequality) measures
- genieclust.inequity.bonferroni_index(x, is_sorted=False)
Computes the normalised Bonferroni index
- Parameters
- x : ndarray
A vector with non-negative elements.
- is_sorted : bool
Indicates whether x is already sorted in increasing order.
- Returns
- double
The value of the inequity index, a number in [0,1].
See also
genieclust.inequity.devergottini_index
The normalised De Vergottini index
genieclust.inequity.gini_index
The normalised Gini index
Notes
The normalised Bonferroni index [1] is given by:
\[B(x_1,\dots,x_n) = \frac{ \sum_{i=1}^{n} \left( n-\sum_{j=1}^i \frac{n}{n-j+1} \right) x_{\sigma(n-i+1)} }{ (n-1) \sum_{i=1}^n x_i },\]
where \(\sigma\) is an ordering permutation of \((x_1,\dots,x_n)\), i.e., \(x_{\sigma(1)} \leq \dots \leq x_{\sigma(n)}\).
Time complexity: \(O(n)\) for sorted data.
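The formula above can be transcribed almost literally into NumPy. The following is only a reference sketch for checking values by hand, not the library's own implementation; the name bonferroni_sketch is hypothetical, and the code assumes n > 1 and a positive sum of x.

```python
import numpy as np

def bonferroni_sketch(x, is_sorted=False):
    """Reference sketch of the normalised Bonferroni index (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if not is_sorted:
        x = np.sort(x)  # increasing order
    n = x.shape[0]
    i = np.arange(1, n + 1)
    # Weight of the i-th largest element: n - sum_{j=1}^{i} n/(n-j+1)
    w = n - np.cumsum(n / (n - i + 1))
    # x[::-1] lists the elements from largest to smallest, i.e. x_{sigma(n-i+1)}
    return float(np.dot(w, x[::-1]) / ((n - 1) * x.sum()))

print(round(bonferroni_sketch([7, 0, 3, 0, 0]), 2))  # 0.91
```

Once the data are sorted, the weights come from a single cumulative sum, which is consistent with the linear running time stated above.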
References
- [1] Bonferroni C., Elementi di Statistica Generale, Libreria Seber, Firenze, 1930.
Examples
No inequality (perfect equality):
>>> round(genieclust.inequity.bonferroni_index(np.r_[2, 2, 2, 2, 2]), 2)
0.0
One has it all (total inequity):
>>> round(genieclust.inequity.bonferroni_index(np.r_[0, 0, 10, 0, 0]), 2)
1.0
Give to the poor, take away from the rich:
>>> round(genieclust.inequity.bonferroni_index(np.r_[7, 0, 3, 0, 0]), 2)
0.91
Robin Hood even more:
>>> round(genieclust.inequity.bonferroni_index(np.r_[6, 0, 3, 1, 0]), 2)
0.83
- genieclust.inequity.devergottini_index(x, is_sorted=False)
Computes the normalised De Vergottini index
- Parameters
- x : ndarray
A vector with non-negative elements.
- is_sorted : bool
Indicates whether x is already sorted in increasing order.
- Returns
- double
The value of the inequity index, a number in [0,1].
See also
genieclust.inequity.bonferroni_index
The normalised Bonferroni index
genieclust.inequity.gini_index
The normalised Gini index
Notes
The normalised De Vergottini index is given by:
\[\frac{1}{\sum_{i=2}^n \frac{1}{i}} \left( \frac{ \sum_{i=1}^n \left( \sum_{j=i}^{n} \frac{1}{j} \right) x_{\sigma(n-i+1)} }{\sum_{i=1}^{n} x_i} - 1 \right),\]
where \(\sigma\) is an ordering permutation of \((x_1,\dots,x_n)\), i.e., \(x_{\sigma(1)} \leq \dots \leq x_{\sigma(n)}\).
Time complexity is \(O(n)\) for sorted data.
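As a cross-check of the formula above, here is a minimal NumPy sketch. It is not the library's implementation; the name devergottini_sketch is hypothetical, and the code assumes n > 1 and a positive sum of x.

```python
import numpy as np

def devergottini_sketch(x, is_sorted=False):
    """Reference sketch of the normalised De Vergottini index (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if not is_sorted:
        x = np.sort(x)  # increasing order
    n = x.shape[0]
    inv = 1.0 / np.arange(1, n + 1)  # 1/1, 1/2, ..., 1/n
    # tail[i-1] = sum_{j=i}^{n} 1/j, obtained with a reversed cumulative sum
    tail = np.cumsum(inv[::-1])[::-1]
    # x[::-1] lists the elements from largest to smallest, i.e. x_{sigma(n-i+1)}
    ratio = np.dot(tail, x[::-1]) / x.sum()
    # Normalise by sum_{i=2}^{n} 1/i so that the result lies in [0, 1]
    return float((ratio - 1.0) / inv[1:].sum())

print(round(devergottini_sketch([7, 0, 3, 0, 0]), 2))  # 0.77
```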
Examples
No inequality (perfect equality):
>>> round(genieclust.inequity.devergottini_index(np.r_[2, 2, 2, 2, 2]), 2)
0.0
One has it all (total inequity):
>>> round(genieclust.inequity.devergottini_index(np.r_[0, 0, 10, 0, 0]), 2)
1.0
Give to the poor, take away from the rich:
>>> round(genieclust.inequity.devergottini_index(np.r_[7, 0, 3, 0, 0]), 2)
0.77
Robin Hood even more:
>>> round(genieclust.inequity.devergottini_index(np.r_[6, 0, 3, 1, 0]), 2)
0.65
- genieclust.inequity.gini_index(x, is_sorted=False)
Computes the normalised Gini index
- Parameters
- x : ndarray
A vector with non-negative elements.
- is_sorted : bool
Indicates whether x is already sorted in increasing order.
- Returns
- double
The value of the inequity index, a number in [0,1].
See also
genieclust.inequity.bonferroni_index
The normalised Bonferroni index
genieclust.inequity.devergottini_index
The normalised De Vergottini index
Notes
The normalised Gini index [1] is given by:
\[G(x_1,\dots,x_n) = \frac{ \sum_{i=1}^{n-1} \sum_{j=i+1}^n |x_i-x_j| }{ (n-1) \sum_{i=1}^n x_i }.\]
The time complexity is \(O(n)\) for sorted data, because the index can equivalently be written as:
\[G(x_1,\dots,x_n) = \frac{ \sum_{i=1}^{n} (n-2i+1) x_{\sigma(n-i+1)} }{ (n-1) \sum_{i=1}^n x_i },\]
where \(\sigma\) is an ordering permutation of \((x_1,\dots,x_n)\), i.e., \(x_{\sigma(1)} \leq \dots \leq x_{\sigma(n)}\).
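The two expressions for the Gini index can be compared numerically. The sketch below implements the pairwise definition (quadratic time) next to the sorted-data formula; both function names are hypothetical and neither is the library's own code. It assumes n > 1 and a positive sum of x.

```python
import numpy as np

def gini_pairwise(x):
    """Pairwise definition: sum of |x_i - x_j| over i < j, O(n^2)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # The full difference matrix counts every pair twice, hence the division by 2
    total = np.abs(x[:, None] - x[None, :]).sum() / 2.0
    return float(total / ((n - 1) * x.sum()))

def gini_sorted(x, is_sorted=False):
    """Sorted-data formula: linear time once x is in increasing order."""
    x = np.asarray(x, dtype=float)
    if not is_sorted:
        x = np.sort(x)
    n = x.shape[0]
    w = n - 2 * np.arange(1, n + 1) + 1  # n-1, n-3, ..., -(n-1)
    # x[::-1] lists the elements from largest to smallest, i.e. x_{sigma(n-i+1)}
    return float(np.dot(w, x[::-1]) / ((n - 1) * x.sum()))

v = [7, 0, 3, 0, 0]
print(round(gini_pairwise(v), 2), round(gini_sorted(v), 2))  # 0.85 0.85
```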
The Gini, Bonferroni, and De Vergottini indices can be used to quantify the “inequity” of a numeric sample. They can be perceived as measures of data dispersion. For constant vectors (perfect equity), the indices yield values of 0. Vectors with all elements but one equal to 0 (perfect inequity) are assigned scores of 1. The indices follow the Pigou-Dalton principle (they are Schur-convex): replacing \(x_i\) with \(x_i - h\) and \(x_j\) with \(x_j + h\), where \(h > 0\) and \(x_i - h \geq x_j + h\) (taking from the “rich” and giving to the “poor”), decreases the inequity.
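A single Pigou-Dalton transfer can be illustrated with a small self-contained sketch. The helpers gini_sketch and transfer are hypothetical names introduced only for this illustration; gini_sketch uses the sorted-data Gini formula and assumes n > 1 and a positive sum.

```python
import numpy as np

def gini_sketch(x):
    """Normalised Gini index via the sorted-data formula (hypothetical helper)."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]  # largest first
    n = x.shape[0]
    w = n - 2 * np.arange(1, n + 1) + 1
    return float(np.dot(w, x) / ((n - 1) * x.sum()))

def transfer(x, rich, poor, h):
    """Move h > 0 from x[rich] to x[poor], requiring x[rich] - h >= x[poor] + h."""
    y = np.array(x, dtype=float)
    if h <= 0 or y[rich] - h < y[poor] + h:
        raise ValueError("not a Pigou-Dalton transfer")
    y[rich] -= h
    y[poor] += h
    return y

before = np.array([7, 0, 3, 0, 0])
after = transfer(before, rich=0, poor=3, h=1)  # becomes [6, 0, 3, 1, 0]
print(round(gini_sketch(before), 2), round(gini_sketch(after), 2))  # 0.85 0.75
```

The total stays at 10 in both vectors; only its distribution changes, and the index drops accordingly.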
These indices have applications in economics, among other fields. The Genie clustering algorithm uses the Gini index as a measure of the inequality of cluster sizes.
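To make "inequality of cluster sizes" concrete, the sketch below computes the Gini index of the cluster cardinalities for two made-up label vectors. It only illustrates the quantity itself; how Genie evaluates or uses it internally is not shown here, and the helper name and label vectors are hypothetical.

```python
import numpy as np

def gini_sketch(x):
    """Normalised Gini index via the sorted-data formula (hypothetical helper)."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]  # largest first
    n = x.shape[0]
    w = n - 2 * np.arange(1, n + 1) + 1
    return float(np.dot(w, x) / ((n - 1) * x.sum()))

# Hypothetical cluster labels for nine points, three clusters each
unbalanced = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])  # sizes 6, 2, 1
balanced = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])    # sizes 3, 3, 3

sizes_unbalanced = np.bincount(unbalanced)
sizes_balanced = np.bincount(balanced)
print(round(gini_sketch(sizes_unbalanced), 2), round(gini_sketch(sizes_balanced), 2))  # 0.56 0.0
```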
References
- [1] Gini C., Variabilità e Mutabilità, Tipografia di Paolo Cuppini, Bologna, 1912.
Examples
No inequality (perfect equality):
>>> round(genieclust.inequity.gini_index(np.r_[2, 2, 2, 2, 2]), 2)
0.0
One has it all (total inequity):
>>> round(genieclust.inequity.gini_index(np.r_[0, 0, 10, 0, 0]), 2)
1.0
Give to the poor, take away from the rich:
>>> round(genieclust.inequity.gini_index(np.r_[7, 0, 3, 0, 0]), 2)
0.85
Robin Hood even more:
>>> round(genieclust.inequity.gini_index(np.r_[6, 0, 3, 1, 0]), 2)
0.75