Spearman's Rank Coefficient - estebanz01/ruby-statistics GitHub Wiki

Spearman's Rank Coefficient

This is an implementation of Spearman's Rank Coefficient, together with the ranking method used by the Spearman's Rank Correlation test.

Class methods

Rank

It expects two keywords, data: and return_ranks_only:, where the latter defaults to true.

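If return_ranks_only: is true, it returns an array of ranks, one per element, in the same order as data:. Tied values receive fractional ranks.
If return_ranks_only: is false, it returns a hash keyed by each unique element of data:, whose values hold :counter (the number of occurrences), :rank and :tie_rank. In the examples below, the :rank of a tied value is the sum of the positions it occupies, and :tie_rank is that sum divided by :counter.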

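The function detects ties and resolves them by giving each tied value the average of the positions it occupies. As the examples show, the largest value receives rank 1.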

# without ties
[14] pry(main)> RubyStatistics::SpearmanRankCoefficient.rank(data: [10, 30, 34, 340, 35])
=> [5, 4, 3, 1, 2]
[15] pry(main)> RubyStatistics::SpearmanRankCoefficient.rank(data: [10, 30, 34, 340, 35], return_ranks_only: false)
=> {10=>{:counter=>1, :rank=>5, :tie_rank=>5},
 30=>{:counter=>1, :rank=>4, :tie_rank=>4},
 34=>{:counter=>1, :rank=>3, :tie_rank=>3},
 340=>{:counter=>1, :rank=>1, :tie_rank=>1},
 35=>{:counter=>1, :rank=>2, :tie_rank=>2}}

# with ties
[18] pry(main)> RubyStatistics::SpearmanRankCoefficient.rank(data: [10, 30, 34, 340, 35, 35, 10])
=> [6.5, 5, 4, 1, 2.5, 2.5, 6.5]
[19] pry(main)> RubyStatistics::SpearmanRankCoefficient.rank(data: [10, 30, 34, 340, 35, 35, 10], return_ranks_only: false)
=> {10=>{:counter=>2, :rank=>13, :tie_rank=>6.5},
 30=>{:counter=>1, :rank=>5, :tie_rank=>5},
 34=>{:counter=>1, :rank=>4, :tie_rank=>4},
 340=>{:counter=>1, :rank=>1, :tie_rank=>1},
 35=>{:counter=>2, :rank=>5, :tie_rank=>2.5}}
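
The tie handling shown above can be sketched in plain Ruby. This is a minimal illustration, not the library's implementation: average_ranks is a hypothetical helper name, and it assumes the same convention as the examples (largest value gets rank 1, tied values share the average of their positions). Unlike the transcript above, it returns every rank as a Float.

```ruby
# Hypothetical helper (not part of ruby-statistics): ranks values in
# descending order and gives tied values the mean of their positions.
def average_ranks(data)
  # Sort descending so the largest value occupies position 1.
  sorted = data.sort.reverse

  # Collect the 1-based positions occupied by each distinct value.
  positions = Hash.new { |hash, key| hash[key] = [] }
  sorted.each_with_index { |value, index| positions[value] << (index + 1) }

  # Map each original element to the average of its positions,
  # preserving the order of the input.
  data.map do |value|
    slots = positions[value]
    slots.sum.to_f / slots.size
  end
end

average_ranks([10, 30, 34, 340, 35, 35, 10])
# => [6.5, 5.0, 4.0, 1.0, 2.5, 2.5, 6.5]
```

Here the two 35s occupy positions 2 and 3, so each gets (2 + 3) / 2 = 2.5, and the two 10s occupy positions 6 and 7, so each gets 6.5.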

Coefficient

It calculates Spearman's Rank Coefficient from two sets of rankings and tries to detect ties in them. Both sets of rankings must have the same number of elements.

[20] pry(main)> set_one = RubyStatistics::SpearmanRankCoefficient.rank(data: [10, 30, 34, 340, 35, 35, 10])
=> [6.5, 5, 4, 1, 2.5, 2.5, 6.5]
[21] pry(main)> set_two = RubyStatistics::SpearmanRankCoefficient.rank(data: [50, 30, 12, 33, 12, 44, 70])
=> [2, 5, 6.5, 4, 6.5, 3, 1]
[22] pry(main)> RubyStatistics::SpearmanRankCoefficient.coefficient(set_one, set_two)
=> -0.5046083923495819
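
One standard way to compute the coefficient when ties are present is to apply the Pearson correlation formula to the two rank arrays. The sketch below does that in plain Ruby. The name rank_correlation and the ArgumentError it raises are assumptions made for illustration, and this page does not say which formula the library uses internally. For the rankings above, it gives the same value as the transcript.

```ruby
# Hypothetical helper (not part of ruby-statistics): Pearson correlation
# applied to two arrays of ranks, which is valid with or without ties.
def rank_correlation(ranks_one, ranks_two)
  unless ranks_one.size == ranks_two.size
    raise ArgumentError, 'Both sets of ranks must have the same size.'
  end

  mean_one = ranks_one.sum.to_f / ranks_one.size
  mean_two = ranks_two.sum.to_f / ranks_two.size

  covariance = 0.0
  spread_one = 0.0
  spread_two = 0.0

  # Accumulate the cross products and squared deviations pair by pair.
  ranks_one.zip(ranks_two) do |one, two|
    diff_one = one - mean_one
    diff_two = two - mean_two
    covariance += diff_one * diff_two
    spread_one += diff_one**2
    spread_two += diff_two**2
  end

  # The result is NaN if either set of ranks is constant (zero spread).
  covariance / Math.sqrt(spread_one * spread_two)
end

set_one = [6.5, 5, 4, 1, 2.5, 2.5, 6.5]
set_two = [2, 5, 6.5, 4, 6.5, 3, 1]
rank_correlation(set_one, set_two)
# => approximately -0.5046
```

Both rank sets have mean 4, the sum of cross products is -13.75, and the squared deviations sum to 27 and 27.5, so the result is -13.75 / sqrt(27 * 27.5).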

It raises a RuntimeError if the two sets of rankings differ in size:

[23] pry(main)> set_one = [1, 2, 3]
=> [1, 2, 3]
[24] pry(main)> set_two = [1, 2, 3, 4, 5]
=> [1, 2, 3, 4, 5]
[25] pry(main)> RubyStatistics::SpearmanRankCoefficient.coefficient(set_one, set_two)
RuntimeError: Both group sets must have the same number of cases.