NumPy statistical functions with code examples

NumPy provides a wide range of statistical functions for analyzing data.

These functions operate on arrays and can compute statistics like mean, median, variance, standard deviation, minimum, maximum, and more.

They’re essential for data analysis and numerical computations.

Importing NumPy and Creating Arrays

First, let’s import NumPy and create a sample array to work with:

import numpy as np

# Example array
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

1. np.mean – Mean (Average)

The mean of an array is the sum of all elements divided by the number of elements.

Example

mean_value = np.mean(data)
print("Mean:", mean_value)

Output

Mean: 5.5
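
As a quick check of the definition above, the mean can be reproduced by hand as the sum divided by the element count. This is only a small sketch; `manual_mean` is an illustrative name, not part of NumPy.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Sum of all elements divided by the number of elements (55 / 10)
manual_mean = data.sum() / data.size

print("Manual mean:", manual_mean)
print("Matches np.mean:", np.isclose(manual_mean, np.mean(data)))
```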

2. np.median – Median

The median is the middle value of the data once it is sorted; np.median handles the sorting, so the input can be in any order. If the array length is even, the median is the average of the two middle values.

Example

median_value = np.median(data)
print("Median:", median_value)

Output

Median: 5.5
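
The odd-length and even-length cases described above can be seen side by side in the following sketch. The arrays `odd_data` and `even_data` are made-up inputs for illustration, deliberately left unsorted.

```python
import numpy as np

# Unsorted input with an odd number of elements
odd_data = np.array([7, 1, 5, 3, 9])
# Unsorted input with an even number of elements
even_data = np.array([7, 1, 5, 3])

median_odd = np.median(odd_data)    # sorted: 1 3 5 7 9 -> middle value is 5
median_even = np.median(even_data)  # sorted: 1 3 5 7   -> (3 + 5) / 2 = 4

print("Median (odd length):", median_odd)
print("Median (even length):", median_even)
```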

3. np.std – Standard Deviation

The standard deviation measures the amount of variation or dispersion in a set of values. A low standard deviation indicates that values are close to the mean, while a high standard deviation indicates a wide range of values.

Example

std_dev = np.std(data)
print("Standard Deviation:", std_dev)

Output

Standard Deviation: 2.8722813232690143
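
The result above is the population standard deviation: by default np.std divides by N, the number of elements. When the data is a sample and an estimate for a larger population is wanted, the ddof parameter changes the divisor to N - ddof. The sketch below compares the two; the variable names are illustrative.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

population_std = np.std(data)       # default ddof=0: divide by N
sample_std = np.std(data, ddof=1)   # ddof=1: divide by N - 1

print("Population standard deviation:", population_std)
print("Sample standard deviation:", sample_std)
```

The sample value is always slightly larger than the population value because the divisor is smaller.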

4. np.var – Variance

Variance is the average of the squared differences from the mean. It’s a measure of how spread out the values are.

Example

variance = np.var(data)
print("Variance:", variance)

Output

Variance: 8.25
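
To connect the definition to the function, the following sketch computes the variance step by step and confirms that the standard deviation is its square root. The names `deviations` and `manual_variance` are illustrative only.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Differences from the mean, squared, then averaged
deviations = data - data.mean()
manual_variance = np.mean(deviations ** 2)

print("Manual variance:", manual_variance)
print("Matches np.var:", np.isclose(manual_variance, np.var(data)))

# The standard deviation is the square root of the variance
print("sqrt(variance) equals np.std:", np.isclose(np.sqrt(manual_variance), np.std(data)))
```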

5. np.min and np.max – Minimum and Maximum

These functions return the minimum and maximum values in an array.

Example

min_value = np.min(data)
max_value = np.max(data)
print("Minimum:", min_value)
print("Maximum:", max_value)

Output

Minimum: 1
Maximum: 10

6. np.percentile – Percentile

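The percentile function finds the value below which a given percentage of the observations fall. For example, the 25th percentile is the value below which 25% of the observations fall. When the requested percentile lies between two data points, np.percentile interpolates linearly between them by default, which is why the results can be values that do not appear in the array.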

Example

percentile_25 = np.percentile(data, 25)
percentile_50 = np.percentile(data, 50)  # Equivalent to median
percentile_75 = np.percentile(data, 75)

print("25th Percentile:", percentile_25)
print("50th Percentile (Median):", percentile_50)
print("75th Percentile:", percentile_75)

Output

25th Percentile: 3.25
50th Percentile (Median): 5.5
75th Percentile: 7.75
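
The value 3.25 does not appear in the array because the default method interpolates linearly between neighbouring elements. The sketch below requests several percentiles in one call and then reproduces the 25th percentile by hand, assuming that default method; `position`, `lower`, and `fraction` are illustrative names.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Several percentiles in a single call
quartiles = np.percentile(data, [25, 50, 75])
print("Quartiles:", quartiles)

# Reproduce the 25th percentile with linear interpolation
sorted_data = np.sort(data)
position = 0.25 * (sorted_data.size - 1)   # fractional index 2.25
lower = int(np.floor(position))            # index 2, value 3
fraction = position - lower                # 0.25 of the way to the next value
manual_p25 = sorted_data[lower] + fraction * (sorted_data[lower + 1] - sorted_data[lower])
print("Manual 25th percentile:", manual_p25)
```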

7. np.quantile – Quantile

Quantiles are the same idea as percentiles. The only difference is the scale: percentiles are given as percentages between 0 and 100, while quantiles are given as fractions between 0 and 1 (0.25, 0.5, 0.75, etc.).

Example

quantile_25 = np.quantile(data, 0.25)
quantile_50 = np.quantile(data, 0.5)   # Equivalent to median
quantile_75 = np.quantile(data, 0.75)

print("0.25 Quantile:", quantile_25)
print("0.5 Quantile (Median):", quantile_50)
print("0.75 Quantile:", quantile_75)

Output

0.25 Quantile: 3.25
0.5 Quantile (Median): 5.5
0.75 Quantile: 7.75
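
Because the two functions differ only in scale, multiplying the fractions by 100 and passing them to np.percentile should give the same results. A short sketch of that check, with illustrative variable names:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

fractions = np.array([0.25, 0.5, 0.75])

from_quantile = np.quantile(data, fractions)
from_percentile = np.percentile(data, fractions * 100)  # same points, as percentages

print("np.quantile:  ", from_quantile)
print("np.percentile:", from_percentile)
print("Identical:", np.allclose(from_quantile, from_percentile))
```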

8. np.sum – Sum

The sum function calculates the sum of all elements in the array.

Example

total_sum = np.sum(data)
print("Sum:", total_sum)

Output

Sum: 55

9. np.prod – Product

The product function calculates the product of all elements in the array.

Example

total_product = np.prod(data)
print("Product:", total_product)

Output

Product: 3628800

10. np.cumsum – Cumulative Sum

The cumulative sum function returns an array where each element is the sum of the input elements up to and including that position.

Example

cumulative_sum = np.cumsum(data)
print("Cumulative Sum:", cumulative_sum)

Output

Cumulative Sum: [ 1  3  6 10 15 21 28 36 45 55]
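
One way to read this output: the element at index i is the sum of the slice data[:i + 1], and the last element equals np.sum(data). The sketch below verifies both; `manual_cumsum` is an illustrative name.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

cumulative_sum = np.cumsum(data)

# Rebuild the running total from slices that include the current element
manual_cumsum = np.array([data[: i + 1].sum() for i in range(data.size)])

print("Matches np.cumsum:", np.array_equal(cumulative_sum, manual_cumsum))
print("Last element equals total:", cumulative_sum[-1] == np.sum(data))
```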

11. np.cumprod – Cumulative Product

The cumulative product function returns an array where each element is the product of the input elements up to and including that position.

Example

cumulative_product = np.cumprod(data)
print("Cumulative Product:", cumulative_product)

Output

Cumulative Product: [      1       2       6      24     120     720    5040   40320  362880 3628800]
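
Since the sample array holds the integers 1 through 10, the running product at each position is a factorial: 1!, 2!, 3!, and so on up to 10!. The sketch below checks this against math.factorial from the Python standard library; `factorials` is an illustrative name.

```python
import math

import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

cumulative_product = np.cumprod(data)

# n! for n = 1..10, computed independently of NumPy
factorials = np.array([math.factorial(n) for n in range(1, 11)])

print("Matches factorials:", np.array_equal(cumulative_product, factorials))
print("Last element equals np.prod:", cumulative_product[-1] == np.prod(data))
```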

12. np.ptp – Peak-to-Peak (Range)

The peak-to-peak function calculates the range of values (maximum – minimum) in the array.

Example

range_value = np.ptp(data)
print("Range (Peak-to-Peak):", range_value)

Output

Range (Peak-to-Peak): 9
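
The same value can be obtained by subtracting the minimum from the maximum, as this short sketch shows; `manual_range` is an illustrative name.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

range_value = np.ptp(data)
manual_range = np.max(data) - np.min(data)   # 10 - 1

print("np.ptp:", range_value)
print("max - min:", manual_range)
```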

13. np.mean, np.median, np.var on Multidimensional Arrays

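These statistical functions can also be applied to multidimensional arrays. By specifying the axis parameter, you can calculate statistics along specific dimensions. For a 2D array, axis=0 collapses the rows and returns one value per column, while axis=1 collapses the columns and returns one value per row. Without axis, the statistic is computed over all elements.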

Example

# Create a 2D array
data_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Mean across columns (axis=0)
mean_cols = np.mean(data_2d, axis=0)
print("Mean across columns:", mean_cols)

# Mean across rows (axis=1)
mean_rows = np.mean(data_2d, axis=1)
print("Mean across rows:", mean_rows)

# Variance across columns
variance_cols = np.var(data_2d, axis=0)
print("Variance across columns:", variance_cols)

# Median across rows
median_rows = np.median(data_2d, axis=1)
print("Median across rows:", median_rows)

Output

Mean across columns: [4. 5. 6.]
Mean across rows: [2. 5. 8.]
Variance across columns: [6. 6. 6.]
Median across rows: [2. 5. 8.]
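
Checking the shape of each result is a simple way to see which dimension was collapsed. The sketch below also uses the keepdims parameter, which keeps the reduced axis with length 1 so the result can be subtracted from the original array; `centered` is an illustrative name.

```python
import numpy as np

data_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_means = np.mean(data_2d, axis=0)   # one value per column
row_means = np.mean(data_2d, axis=1)   # one value per row
overall_mean = np.mean(data_2d)        # no axis: all elements

print("Column means:", col_means, "shape:", col_means.shape)
print("Row means:", row_means, "shape:", row_means.shape)
print("Overall mean:", overall_mean)

# keepdims=True keeps the reduced axis, giving shape (3, 1)
row_means_kept = np.mean(data_2d, axis=1, keepdims=True)
centered = data_2d - row_means_kept    # subtract each row's mean from that row
print("Rows centered on their means:\n", centered)
```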

14. np.corrcoef – Correlation Coefficient

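The correlation coefficient function returns the matrix of Pearson correlation coefficients, which measure the strength of the linear relationship between variables. It accepts either separate 1D arrays, as in the example here, or a 2D array in which each row is one variable.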

Example

# Create two sample arrays
data_x = np.array([1, 2, 3, 4, 5])
data_y = np.array([5, 4, 3, 2, 1])

# Calculate correlation coefficient matrix
correlation = np.corrcoef(data_x, data_y)
print("Correlation Coefficient Matrix:\n", correlation)

Output

Correlation Coefficient Matrix:
 [[ 1. -1.]
 [-1.  1.]]

Explanation

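Each coefficient lies between -1 and 1. A value of 1 or -1 indicates a perfect linear relationship, with -1 meaning that one variable decreases as the other increases. The diagonal entries are always 1 because each variable is perfectly correlated with itself; the off-diagonal entries give the correlation between data_x and data_y.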
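
For readers who want to see where the -1 comes from, the following sketch computes the Pearson coefficient directly from its textbook formula and compares it with the off-diagonal entry of the matrix. The names `dx`, `dy`, and `manual_r` are illustrative.

```python
import numpy as np

data_x = np.array([1, 2, 3, 4, 5])
data_y = np.array([5, 4, 3, 2, 1])

# Deviations of each variable from its own mean
dx = data_x - data_x.mean()
dy = data_y - data_y.mean()

# Pearson r: sum of products of deviations, scaled by both spreads
manual_r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# The off-diagonal entry of the matrix is the x-y correlation
r_from_matrix = np.corrcoef(data_x, data_y)[0, 1]

print("Manual r:", manual_r)
print("Matches np.corrcoef:", np.isclose(manual_r, r_from_matrix))
```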

15. np.histogram – Histogram

The histogram function computes the histogram of the input data, providing bin counts and edges, which is useful for data distribution analysis.

Example

# Generate random data
data_random = np.random.randint(0, 100, size=50)

# Compute histogram
counts, bin_edges = np.histogram(data_random, bins=5)
print("Bin counts:", counts)
print("Bin edges:", bin_edges)

Explanation

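np.histogram splits the data into bins and counts the number of elements in each bin.
bins=5 specifies that the data should be divided into 5 equal-width intervals spanning the minimum to the maximum of the data, so bin_edges contains 6 values.
Because the data is random, the counts differ from run to run, which is why no output is shown.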
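
To make the result repeatable and easier to inspect, one option is to draw the data from a seeded generator. The sketch below does that with np.random.default_rng (the seed value 0 is an arbitrary choice for illustration) and checks a few properties that hold whatever the data is.

```python
import numpy as np

# Seeded generator so every run draws the same 50 integers in [0, 100)
rng = np.random.default_rng(seed=0)
data_random = rng.integers(0, 100, size=50)

counts, bin_edges = np.histogram(data_random, bins=5)

# Every value lands in exactly one bin, so the counts add up to the sample size
print("Total count:", counts.sum())

# 5 bins are delimited by 6 edges
print("Number of edges:", bin_edges.size)

# With an integer bins argument, all bins have the same width
bin_widths = np.diff(bin_edges)
print("Equal widths:", np.allclose(bin_widths, bin_widths[0]))
```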

Summary of Common NumPy Statistical Functions

Function	Description
np.mean	Calculates the mean (average) of elements
np.median	Finds the median of elements
np.std	Calculates the standard deviation
np.var	Calculates the variance
np.min, np.max	Finds the minimum and maximum
np.percentile	Calculates specified percentiles
np.quantile	Calculates specified quantiles
np.sum	Calculates the sum of elements
np.prod	Calculates the product of elements
np.cumsum	Computes the cumulative sum
np.cumprod	Computes the cumulative product
np.ptp	Calculates the peak-to-peak range
np.corrcoef	Calculates the correlation coefficient
np.histogram	Computes the histogram

NumPy’s statistical functions provide efficient ways to analyze data, making it a powerful tool for data science, analytics, and scientific computing.

These functions operate on arrays and can be applied across different axes, making them flexible for multidimensional data analysis.
