PHP Number Tips - Part 9: To calculate statistical measures such as variance or skewness for a set of numbers, you can use PEAR's Math_Stats class, as shown here:
<?php
//include Math_Stats class
include "Math/Stats.php";
//initialize object
$stats = new Math_Stats();
//define number series
$series = array(76, 7348, 56, 2.6, 189, 67.59, 17594, 2648, 1929.79,
    54, 329, 820);
//connect object to series
$stats->setData($series);
//calculate complete statistics
$data = $stats->calcFull();
print_r($data);
?>
PEAR's Math_Stats class is designed specifically to calculate statistical measures for a set of numbers. The number set must be expressed as an array and passed to the class's setData( ) method; the calcBasic( ) or calcFull( ) method can then be used to generate a basic or expanded set of statistics, respectively, about the number set.
Here is the output of the calcFull( ) method:
Array
(
[min] => 2.6
[max] => 17594
[sum] => 31113.98
[sum2] => 375110698.612
[count] => 12
[mean] => 2592.83166667
[median] => 259
[mode] => Array
(
[0] => 1929.79
[1] => 820
[2] => 2648
[3] => 7348
[4] => 17594
[5] => 329
[6] => 189
[7] => 54
[8] => 56
[9] => 67.59
[10] => 76
[11] => 2.6
)
[midrange] => 8798.3
[geometric_mean] => 324.444468821
[harmonic_mean] => 26.1106363977
[stdev] => 5173.68679862
[absdev] => 3301.9175
[variance] => 26767035.0902
[range] => 17591.4
[std_error_of_mean] => 1493.51473294
[skewness] => 2.02781206173
[kurtosis] => 2.98190358339
[coeff_of_variation] => 1.99538090541
[sample_central_moments] => Array
(
[1] => 0
[2] => 24536448.8327
[3] => 280820044848
[4] => 4.2858793901E+015
[5] => 6.34511539688E+019
)
[sample_raw_moments] => Array
(
[1] => 2592.83166667
[2] => 31259224.8844
[3] => 489107716046
[4] => 8.23326983124E+015
[5] => 1.42287015523E+020
)
[frequency] => Array
(
[2.6] => 1
[54] => 1
[56] => 1
[67.59] => 1
[76] => 1
[189] => 1
[329] => 1
[820] => 1
[1929.79] => 1
[2648] => 1
[7348] => 1
[17594] => 1
)
[quartiles] => Array
(
[25] => 61.795
[50] => 259
[75] => 2288.895
)
[interquartile_range] => 2227.1
[interquartile_mean] => 568.563333333
[quartile_deviation] => 1113.55
[quartile_variation_coefficient] => 94.7423947862
[quartile_skewness_coefficient] => 0.822904225226
)
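To make a few of these figures concrete, here is a minimal plain-PHP sketch that recomputes the mean, variance, standard deviation, median, range, and midrange for the same series without PEAR. The summarize( ) function is a hypothetical helper written for this illustration and is not part of Math_Stats. It assumes the reported variance is the sample variance (divided by n - 1), which is consistent with the numbers in the output.

```php
<?php
// Cross-check of a few calcFull() values in plain PHP (no PEAR needed).
// summarize() is a hypothetical helper, not part of Math_Stats.
function summarize(array $series)
{
    $n = count($series);
    $mean = array_sum($series) / $n;

    // sum of squared deviations from the mean
    $ss = 0.0;
    foreach ($series as $x) {
        $ss += ($x - $mean) * ($x - $mean);
    }
    // assumption: sample variance, so divide by n - 1 rather than n
    $variance = $ss / ($n - 1);

    // median: middle value, or mean of the two middle values
    $sorted = $series;
    sort($sorted);
    $mid = (int) floor($n / 2);
    $median = ($n % 2 == 1)
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;

    $min = $sorted[0];
    $max = $sorted[$n - 1];

    return array(
        'mean'     => $mean,
        'variance' => $variance,
        'stdev'    => sqrt($variance),
        'median'   => $median,
        'range'    => $max - $min,
        'midrange' => ($max + $min) / 2,
    );
}

//same number series as in the Math_Stats listing
$series = array(76, 7348, 56, 2.6, 189, 67.59, 17594, 2648, 1929.79,
    54, 329, 820);
$summary = summarize($series);
print_r($summary);
```

The values should agree with the corresponding calcFull( ) entries up to floating-point rounding. The other measures (skewness, kurtosis, moments, quartiles) are left to the library.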
As the output shows, calcFull( ) generates a complete set of statistics for the number set, including its mean, median, mode, and range; its variance and standard deviation; its skewness, kurtosis, and moments; and its quartiles, interquartile range, and quartile deviation. It is also possible to generate a histogram and plot the frequency distribution of a data set with PEAR's Math_Histogram package, as follows:
<?php
//include Math_Histogram class
include "Math/Histogram.php";
//define number series
$series = array(10,73,27,11,92,97,49,86,92,4,32,61,2,13,48,81,94,17,8);
//initialize an object
$hist = new Math_Histogram();
//connect class to data series
$hist->setData($series);
//define number of bins and upper/lower limits
$hist->setBinOptions(10,0,100);
//calculate frequencies
$hist->calculate();
//print as ASCII bar chart
echo $hist->printHistogram();
?>
Here, a number series is expressed as an array and passed to the setData( ) method, and the calculate( ) method then computes the frequency for each bin. The number of histogram bins and the lower and upper limits of the plot range are controlled with the setBinOptions( ) method. The printHistogram( ) method produces an ASCII representation of the histogram, as follows:
Histogram
Number of bins: 10
Plot range: [0, 100]
Data range: [2, 97]
Original data range: [2, 97]
BIN (FREQUENCY) ASCII_BAR (%)
10.000 (4 ) |**** (21.1%)
20.000 (3 ) |*** (15.8%)
30.000 (1 ) |* (5.3%)
40.000 (1 ) |* (5.3%)
50.000 (2 ) |** (10.5%)
60.000 (0 ) | (0.0%)
70.000 (1 ) |* (5.3%)
80.000 (1 ) |* (5.3%)
90.000 (2 ) |** (10.5%)
100.000 (4 ) |**** (21.1%)
Note that the Math_Histogram package supports both simple and cumulative histograms, as well as histograms in three and four dimensions.
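For readers who want to see how the bin frequencies in the simple histogram come about, here is a minimal plain-PHP sketch of fixed-width binning. The binCounts( ) function is a hypothetical helper written for this illustration, not part of Math_Histogram, and it is not the package's actual implementation. It assumes each bin is labelled by its upper limit and includes that limit (so 10 falls in the bin labelled 10.000), which matches the frequencies in the output above, and it simply skips values outside the plot range.

```php
<?php
// Plain-PHP sketch of fixed-width binning.
// binCounts() is a hypothetical helper, not part of Math_Histogram.
function binCounts(array $series, $bins, $low, $high)
{
    $width = ($high - $low) / $bins;
    $counts = array_fill(0, $bins, 0);
    foreach ($series as $x) {
        if ($x < $low || $x > $high) {
            continue; // assumption: ignore values outside the plot range
        }
        // upper limit inclusive: 10 lands in the first bin, 11 in the second
        $i = (int) ceil(($x - $low) / $width) - 1;
        $i = max(0, $i); // a value equal to $low belongs to the first bin
        $counts[$i]++;
    }
    return $counts;
}

//same number series and bin options as in the Math_Histogram listing
$series = array(10,73,27,11,92,97,49,86,92,4,32,61,2,13,48,81,94,17,8);
$bins = 10;
$low = 0;
$high = 100;
$counts = binCounts($series, $bins, $low, $high);

//print one row per bin, labelled by the bin's upper limit
$width = ($high - $low) / $bins;
foreach ($counts as $i => $n) {
    printf("%7.3f (%d) |%s\n", $low + ($i + 1) * $width, $n, str_repeat('*', $n));
}
```

The row layout only loosely imitates printHistogram( ); the point is the index calculation, which maps each value to a bin in a single pass over the series.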