This lab uses the pandas library in Python to compute statistical measures (mean, median, and standard deviation) and to visualize the data with a histogram.
Prerequisite: basic knowledge of Python.
Step 1: Import the Libraries
Create a new Python script named PandasTest.py and start by importing the required libraries.
import pandas as pd
import matplotlib.pyplot as plt
Step 2: Compute the Statistical Measures
The following are the weights (in pounds) of 20 people:
164, 158, 172, 153, 144, 156, 189, 163, 134, 159, 143, 176, 177, 162, 141, 151, 182, 185, 171, 152
# Create a pandas Series with given data
weights = pd.Series([164, 158, 172, 153, 144, 156, 189, 163, 134, 159,
                     143, 176, 177, 162, 141, 151, 182, 185, 171, 152])
# Calculate mean
mean_weight = weights.mean()
print(f"Mean: {mean_weight}")
# Calculate median
median_weight = weights.median()
print(f"Median: {median_weight}")
# Calculate sample standard deviation (pandas std() divides by n - 1 by default)
std_dev_weight = weights.std()
print(f"Standard Deviation: {std_dev_weight}")
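The standard deviation printed by the script is the sample standard deviation, because pandas uses ddof=1 unless told otherwise. The following is a minimal sketch, not part of the original lab script, that contrasts it with the population standard deviation (ddof=0) and uses describe() to get several statistics in one call. The variable names sample_std, population_std, and summary are chosen here for illustration only.

```python
import pandas as pd

# Same 20 weight values (in pounds) used in the lab.
weights = pd.Series([164, 158, 172, 153, 144, 156, 189, 163, 134, 159,
                     143, 176, 177, 162, 141, 151, 182, 185, 171, 152])

# Sample standard deviation (divides by n - 1); this is the pandas default.
sample_std = weights.std()

# Population standard deviation (divides by n), requested with ddof=0.
population_std = weights.std(ddof=0)

print(f"Sample std (ddof=1): {sample_std:.3f}")
print(f"Population std (ddof=0): {population_std:.3f}")

# describe() reports count, mean, std, min, quartiles, and max in one call.
# The std it reports is the sample standard deviation.
summary = weights.describe()
print(summary)
```

The sample value is always slightly larger than the population value for the same data. Which one is appropriate depends on whether the 20 people are treated as a sample from a larger group or as the entire group of interest.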
Step 3: Plot a Histogram
# Plot histogram
plt.hist(weights, bins=5, color='blue', edgecolor='black')
plt.xlabel('Weight (lbs)')
plt.ylabel('Frequency')
plt.title('Histogram of Weights')
plt.show()
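To read exact bar heights rather than estimating them from the plot, the bin edges and counts can be computed directly. The sketch below is an optional addition to the lab and assumes NumPy is available (it is installed alongside pandas). With the same bins=5 argument, numpy.histogram splits the range from the minimum to the maximum value into five equal-width bins, which is also what plt.hist does by default.

```python
import numpy as np

# Same 20 weight values (in pounds) used in the lab.
weights = [164, 158, 172, 153, 144, 156, 189, 163, 134, 159,
           143, 176, 177, 162, 141, 151, 182, 185, 171, 152]

# Five equal-width bins spanning min(weights) to max(weights).
counts, edges = np.histogram(weights, bins=5)

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which includes both edges.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f} lbs: {count}")
```

For this data the bins are 11 lbs wide, starting at 134 and ending at 189.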
The script prints the computed mean (161.6), median (160.5), and sample standard deviation (approximately 15.45).
It then displays a histogram showing the distribution of the given weights across five bins.
This lab demonstrates how pandas computes common summary statistics, each with a single method call on a Series. The histogram shows how the weights are distributed across their range.