The statistics module in Python provides functions for calculating mathematical statistics of numeric data. It includes functions for calculating measures of central tendency, measures of spread, and other statistical properties. This module is part of the Python standard library, so no additional installation is required.
1. Importing the statistics Module
To use the functions provided by the statistics module, you need to import it first:
import statistics
2. Measures of Central Tendency
Measures of central tendency describe the center of a data set. The statistics module provides functions to calculate the mean, median, and mode.
2.1. Mean (Average)
statistics.mean(data) returns the arithmetic mean (average) of the data.
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
mean_value = statistics.mean(data)
print("Mean:", mean_value)
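As a quick cross-check, the arithmetic mean is the sum of the values divided by their count. The sketch below (variable names are illustrative) compares that manual calculation with statistics.mean() and shows that empty data raises statistics.StatisticsError.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Manual arithmetic mean: sum of the values divided by how many there are.
manual_mean = sum(data) / len(data)
library_mean = statistics.mean(data)
print("Manual mean:", manual_mean)
print("Library mean:", library_mean)

# mean() cannot average an empty data set and raises StatisticsError.
try:
    statistics.mean([])
    empty_raised = False
except statistics.StatisticsError:
    empty_raised = True
print("Empty data raised StatisticsError:", empty_raised)
```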
2.2. Median
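statistics.median(data) returns the median (middle value) of the data. The data does not need to be sorted first. When the data has an even number of values, the median is the average of the two middle values.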
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median_value = statistics.median(data)
print("Median:", median_value)
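The example above uses an odd number of values, so the median is an element of the data. A short illustrative sketch of the even-length case, where there is no single middle element, might look like this:

```python
import statistics

# With an even number of values, median() averages the two middle values.
even_data = [1, 2, 3, 4, 5, 6, 7, 8]
even_median = statistics.median(even_data)
print("Median of even-length data:", even_median)

# The input does not have to be sorted; median() orders the values itself.
unsorted_data = [7, 1, 5, 3]
unsorted_median = statistics.median(unsorted_data)
print("Median of unsorted data:", unsorted_median)
```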
2.3. Mode
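statistics.mode(data) returns the mode (most common value) of the data. If several values are tied for most common, Python 3.8 and later return the first one encountered in the data; earlier versions raise StatisticsError instead.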
import statistics
data = [1, 2, 2, 3, 4, 4, 4, 5, 6]
mode_value = statistics.mode(data)
print("Mode:", mode_value)
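Two details are easy to miss with mode(): ties and non-numeric data. The sketch below assumes Python 3.8 or later, where a tie resolves to the first mode encountered; the sample values are made up for illustration.

```python
import statistics

# 1 and 2 both appear twice; on Python 3.8+ the first one encountered wins.
tied_data = [1, 1, 2, 2, 3]
tie_mode = statistics.mode(tied_data)
print("Mode with a tie:", tie_mode)

# mode() also works on nominal (non-numeric) data such as strings.
colours = ["red", "blue", "red", "green"]
colour_mode = statistics.mode(colours)
print("Most common colour:", colour_mode)
```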
3. Measures of Spread
Measures of spread describe how much the data varies. The statistics module provides functions to calculate variance and standard deviation.
3.1. Variance
statistics.variance(data) returns the sample variance of the data, a measure of how far the values spread out from the mean. It requires at least two data points.
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
variance_value = statistics.variance(data)
print("Variance:", variance_value)
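To make the calculation concrete, here is a rough by-hand version. It assumes the usual definitions: the sample variance divides the sum of squared deviations by n - 1, while the population variance, available as statistics.pvariance(), divides by n. The helper variable names are illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = len(data)
mean_value = statistics.mean(data)

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean_value) ** 2 for x in data)

# Sample variance divides by n - 1; population variance divides by n.
manual_sample_variance = squared_deviations / (n - 1)
manual_population_variance = squared_deviations / n

print("Sample variance:", statistics.variance(data), manual_sample_variance)
print("Population variance:", statistics.pvariance(data), manual_population_variance)
```

Use variance() when the data is a sample drawn from a larger population, and pvariance() when the data is the entire population.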
3.2. Standard Deviation
statistics.stdev(data) returns the sample standard deviation of the data, which is the square root of the sample variance. It measures the spread of the data around the mean in the same units as the data itself.
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
stdev_value = statistics.stdev(data)
print("Standard Deviation:", stdev_value)
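The relationship to the variance can be checked directly. This sketch also calls statistics.pstdev(), the population counterpart, which is not covered above but belongs to the same module.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# The sample standard deviation is the square root of the sample variance.
sample_stdev = statistics.stdev(data)
sqrt_of_variance = math.sqrt(statistics.variance(data))
print("stdev:", sample_stdev)
print("sqrt(variance):", sqrt_of_variance)

# pstdev() treats the data as the whole population, so it divides by n
# rather than n - 1 and comes out slightly smaller.
population_stdev = statistics.pstdev(data)
print("pstdev:", population_stdev)
```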
4. Other Statistical Functions
The statistics module also includes several other useful functions for statistical analysis.
4.1. Median Low and Median High
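statistics.median_low(data): Returns the low median (the smaller of the two middle values) when the data has an even number of elements.
statistics.median_high(data): Returns the high median (the larger of the two middle values) when the data has an even number of elements.
When the data has an odd number of elements, both functions return the middle value. Either way, the result is always an actual member of the data set.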
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8]
median_low_value = statistics.median_low(data)
median_high_value = statistics.median_high(data)
print("Median Low:", median_low_value)
print("Median High:", median_high_value)
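A side-by-side comparison with statistics.median() may help show when these variants matter. The data sets below are illustrative.

```python
import statistics

# Even length: median() averages the two middle values, while
# median_low() and median_high() each pick one of them.
even_data = [1, 2, 3, 4, 5, 6, 7, 8]
low = statistics.median_low(even_data)
mid = statistics.median(even_data)
high = statistics.median_high(even_data)
print("Even data:", low, mid, high)

# Odd length: all three functions agree on the single middle value.
odd_data = [1, 2, 3, 4, 5]
odd_results = (
    statistics.median_low(odd_data),
    statistics.median(odd_data),
    statistics.median_high(odd_data),
)
print("Odd data:", odd_results)
```

The low and high medians are useful for discrete data, where an averaged value such as 4.5 would not be a valid observation.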
4.2. Median Grouped
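statistics.median_grouped(data, interval=1) returns the median of grouped continuous data, calculated as the 50th percentile using interpolation. Each value is treated as the midpoint of a class interval whose width is given by interval.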
import statistics
data = [1, 2, 2, 2, 3, 4, 4, 5, 6]
median_grouped_value = statistics.median_grouped(data)
print("Median Grouped:", median_grouped_value)
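The interpolation can be reproduced by hand. The helper below, grouped_median_by_hand, is a hypothetical name, and the formula is the textbook grouped-median interpolation L + interval * (n/2 - cf) / f, which is assumed here to correspond to what the library computes. L is the lower limit of the median class, cf is the count of values below that class, and f is the count of values in it.

```python
import math
import statistics

def grouped_median_by_hand(values, interval=1):
    """Textbook grouped-median interpolation (illustrative helper)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = ordered[n // 2]                        # midpoint of the median class
    lower_limit = middle - interval / 2             # L: lower edge of that class
    below = sum(1 for v in ordered if v < middle)   # cf: values below the class
    in_class = ordered.count(middle)                # f: values inside the class
    return lower_limit + interval * (n / 2 - below) / in_class

data = [1, 2, 2, 2, 3, 4, 4, 5, 6]
by_hand = grouped_median_by_hand(data)
from_library = statistics.median_grouped(data)
print("By hand:", by_hand)
print("Library:", from_library)
```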
4.3. Harmonic Mean
statistics.harmonic_mean(data) returns the harmonic mean of the data, which is the reciprocal of the arithmetic mean of the reciprocals of the data values.
import statistics
data = [40, 60, 80]
harmonic_mean_value = statistics.harmonic_mean(data)
print("Harmonic Mean:", harmonic_mean_value)
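One common use of the harmonic mean is averaging rates. As an illustrative scenario, suppose the values 40, 60, and 80 are speeds in km/h over three legs of equal distance; the overall average speed is then the harmonic mean, not the arithmetic mean. The sketch below also checks the definition by hand.

```python
import math
import statistics

speeds = [40, 60, 80]  # assumed: km/h over three legs of equal distance

# Definition: the count divided by the sum of reciprocals.
manual_harmonic = len(speeds) / sum(1 / s for s in speeds)
harmonic = statistics.harmonic_mean(speeds)
arithmetic = statistics.mean(speeds)

print("Harmonic mean:", harmonic)
print("Manual harmonic mean:", manual_harmonic)
print("Arithmetic mean:", arithmetic)
```

The harmonic mean comes out lower than the arithmetic mean of 60 because more time is spent travelling on the slower legs.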
4.4. Geometric Mean
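statistics.geometric_mean(data) returns the geometric mean of the data, which is the nth root of the product of n numbers. It is available in Python 3.8 and later and is particularly useful for averaging ratios or growth rates, where values combine by multiplication rather than addition.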
import statistics
data = [1, 2, 3, 4, 5]
geometric_mean_value = statistics.geometric_mean(data)
print("Geometric Mean:", geometric_mean_value)
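To illustrate the growth-rate use case, the sketch below treats three made-up yearly growth factors (+10%, +20%, -10%) as the data. It assumes Python 3.8 or later, which provides both statistics.geometric_mean() and math.prod().

```python
import math
import statistics

growth_factors = [1.10, 1.20, 0.90]  # illustrative yearly growth factors
years = len(growth_factors)

# Definition: the nth root of the product of the n values.
manual_factor = math.prod(growth_factors) ** (1 / years)
average_factor = statistics.geometric_mean(growth_factors)

# Applying the average factor every year reproduces the total growth.
total_growth = math.prod(growth_factors)
compounded = average_factor ** years

print("Average yearly factor:", average_factor)
print("Total growth:", total_growth)
print("Average factor compounded:", compounded)
```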
5. Handling Data with Multiple Modes
If your data set has multiple modes, you can use statistics.multimode(data), available in Python 3.8 and later, to return a list of all the modes in the order they first appear in the data:
import statistics
data = [1, 2, 2, 3, 3, 4, 4]
modes = statistics.multimode(data)
print("Modes:", modes)
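A few edge cases are worth seeing alongside the example above. This sketch, with illustrative inputs, shows that multimode() always returns a list, even for a single mode or for empty data.

```python
import statistics

# Three values tie for most common, so all three are returned.
tied_modes = statistics.multimode([1, 2, 2, 3, 3, 4, 4])
print("Tied modes:", tied_modes)

# A single mode still comes back wrapped in a list.
single_mode = statistics.multimode([1, 2, 2, 3])
print("Single mode:", single_mode)

# Empty data yields an empty list instead of raising an error.
no_modes = statistics.multimode([])
print("Empty data:", no_modes)
```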