Sample SD, population SD, complete descriptive statistics, per-value z-scores, and side-by-side dataset comparison — with full deviation tables and step-by-step solutions.
The most important distinction in any standard deviation calculation is whether your data represents a sample (a subset drawn from a larger group) or a population (every single member of the group you care about). The formulas differ in one critical place: what you divide by.
The sample formula divides by n−1 (called Bessel's correction). Why? When you compute the mean from sample data, you've already used one “degree of freedom” — the deviations from the sample mean are forced to sum to zero, so they carry only n−1 independent pieces of information. Dividing by n−1 instead of n corrects for this bias and makes the sample variance an unbiased estimator of the true population variance. As n grows large, the difference becomes negligible.
When to use which: If your data is the complete population (every student in a single class, all products in one batch, all members of a specific group), use population SD. If your data is a random sample and you want to infer about the broader population (1,000 survey respondents representing all voters, 50 test items representing all possible items), use sample SD.
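The two divisors are easy to see side by side with Python's standard `statistics` module, which implements both formulas (the dataset here is purely illustrative):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative dataset, mean = 5

# Population SD: divides the sum of squared deviations by n
pop_sd = statistics.pstdev(data)    # 2.0

# Sample SD: divides by n - 1 (Bessel's correction)
samp_sd = statistics.stdev(data)    # ~2.138, slightly larger

print(pop_sd, samp_sd)
```

The sample SD is always a bit larger than the population SD for the same data, and the gap shrinks as n grows.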
The deviation table is the most transparent way to see exactly how standard deviation is computed. Each row shows one data point and its contribution to the final result. The columns are: the original value x, the deviation from mean (x − x̅), the squared deviation (x − x̅)², and optionally the z-score. The final row sums the squared deviations — that sum divided by n or n−1, then square-rooted, gives the standard deviation.
Values with large squared deviations have outsized influence on SD. One extreme outlier can dramatically inflate the standard deviation, which is why it's sensitive to outliers in a way that median-based measures (like IQR) are not.
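A quick way to see that sensitivity (the numbers are illustrative):

```python
import statistics

base = [10, 11, 12, 13, 14]
with_outlier = base + [50]             # one extreme value appended

print(statistics.stdev(base))          # ~1.58
print(statistics.stdev(with_outlier))  # ~15.6 -- one point inflates SD roughly tenfold
```

Because the outlier's deviation is squared, its contribution dominates the sum even though it is only one of six values.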
| Statistic | Formula | What it measures |
|---|---|---|
| Mean (μ or x̅) | Σx / n | Average; center of mass of data |
| Median | Middle value when sorted | Center value; robust to outliers |
| Mode | Most frequent value(s) | Most common observation |
| Range | Max − Min | Total spread; sensitive to outliers |
| Variance (s²) | Σ(x − x̅)² / (n−1) | Average squared deviation; in units² |
| Standard Deviation | √Variance | Typical distance from mean; in original units |
| Q1 (25th pct) | Median of lower half | Value below which 25% of data falls |
| Q3 (75th pct) | Median of upper half | Value below which 75% of data falls |
| IQR | Q3 − Q1 | Middle 50% spread; very robust to outliers |
| CV (%) | (s / x̅) × 100 | Relative variability; unitless |
| SEM | s / √n | Precision of mean estimate |
| Skewness | Standardized 3rd moment | >0: right tail; <0: left tail |
| Kurtosis | Excess 4th moment | >0: heavy tails; <0: light tails vs normal |
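Most of the statistics in the table are one call (or one line) away in Python's standard library; this sketch uses an illustrative dataset, and note that quartile conventions vary slightly between implementations:

```python
import math
import statistics

data = [6, 7, 7, 8, 9, 10, 12, 15]   # illustrative dataset
n = len(data)

mean   = statistics.fmean(data)       # 9.25
median = statistics.median(data)      # 8.5
mode   = statistics.mode(data)        # 7
rng    = max(data) - min(data)        # 9
s      = statistics.stdev(data)       # sample SD, ~3.01
var    = s ** 2                       # sample variance

# Quartiles: the default "exclusive" method may differ from other conventions
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                         # middle 50% spread

cv  = s / mean * 100                  # coefficient of variation, %
sem = s / math.sqrt(n)                # standard error of the mean
```

Skewness and kurtosis are not in the standard library; a manual implementation appears later in this article's skewness/kurtosis section's context.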
For a normally distributed dataset, approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations. This “empirical rule” is the foundation of quality control (Six Sigma tolerances), grading on a curve, outlier detection, and confidence interval intuition.
A value more than 2 SD from the mean is unusual (under 5% of values expected). A value more than 3 SD away is very rare (under 0.3%). These percentages hold strictly only for normal distributions; skewed data can deviate substantially. The calculator visualizes the distribution for every dataset so you can immediately see whether your data has unusual spread or outliers.
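The empirical rule is easy to verify by simulating normal data; this sketch uses Python's `random.gauss`, and the exact fractions vary slightly from run to run:

```python
import random
import statistics

random.seed(42)                      # reproducible illustration
data = [random.gauss(100, 15) for _ in range(100_000)]

mean = statistics.fmean(data)
sd = statistics.pstdev(data)

for k in (1, 2, 3):
    within = sum(abs(x - mean) <= k * sd for x in data) / len(data)
    print(f"within {k} SD: {within:.1%}")   # ~68.3%, ~95.4%, ~99.7%
```

Running the same check on skewed data would show how far those fractions can drift from 68/95/99.7.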
A z-score measures how many standard deviations a value is from the mean: z = (x − x̅) / s. A z-score of +2.0 means the value is 2 standard deviations above average; −1.5 means 1.5 SDs below. Z-scores let you compare values from different datasets on the same scale — is a score of 85 on one test better or worse than 70 on a harder test?
Z-scores are also used for outlier detection: values with |z| > 2 are mild outliers, |z| > 3 are extreme outliers. The calculator flags these automatically in the z-score table.
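A minimal z-score pass over a dataset, applying the |z| > 2 and |z| > 3 thresholds described above (the data is illustrative):

```python
import statistics

data = [68, 70, 71, 69, 72, 70, 71, 69, 70, 68, 72, 110]  # illustrative
mean = statistics.fmean(data)
s = statistics.stdev(data)

for x in data:
    z = (x - mean) / s
    flag = "extreme outlier" if abs(z) > 3 else "mild outlier" if abs(z) > 2 else ""
    print(f"{x:>5}  z = {z:+.2f}  {flag}")   # 110 is flagged; the rest are not
```

Note that with very small samples a single outlier inflates the SD itself, which caps how large its own z-score can get; z-based flagging works best with at least a dozen or so points.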
Standard deviation is measured in the same units as your data, which makes it hard to compare variability across datasets with different units or scales. The coefficient of variation (CV) solves this: CV = (s / x̅) × 100%. A height dataset with SD = 10 cm and mean = 170 cm has CV = 5.9% (low variability). A price dataset with SD = $500 and mean = $1000 has CV = 50% (high variability). CV works best when data is positive and the mean is meaningfully non-zero.
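As a sketch, the two comparisons from that paragraph:

```python
def cv_percent(sd: float, mean: float) -> float:
    """Coefficient of variation as a percentage (mean must be non-zero)."""
    return sd / mean * 100

print(cv_percent(10, 170))    # ~5.9  (heights: low relative variability)
print(cv_percent(500, 1000))  # 50.0  (prices: high relative variability)
```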
Skewness measures asymmetry. Positive skewness means a long right tail (many low values, a few very high ones) — typical of income distributions. Negative skewness means a long left tail. A perfectly symmetric distribution has skewness = 0. Values beyond ±1 are generally considered notably skewed; beyond ±2, substantially skewed.
Excess kurtosis (what this calculator reports, also called Fisher kurtosis) measures tail heaviness relative to a normal distribution. Kurtosis = 0 means normal tails (mesokurtic). Positive kurtosis = heavier tails, more extreme values than expected (leptokurtic). Negative = lighter tails, fewer extremes (platykurtic). Financial return data typically shows high positive kurtosis — rare but very large gains and losses occur more often than a normal distribution would predict.
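Both measures follow directly from the moment definitions above; this sketch uses the uncorrected population estimators (library routines such as `scipy.stats.skew` and `scipy.stats.kurtosis` offer bias-corrected variants):

```python
import statistics

def skew_and_excess_kurtosis(data):
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)                  # population SD for the moments
    m3 = sum((x - mean) ** 3 for x in data) / n   # 3rd central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # 4th central moment
    skew = m3 / sd ** 3                           # standardized 3rd moment
    excess_kurt = m4 / sd ** 4 - 3                # minus 3 so a normal scores 0
    return skew, excess_kurt

# Long right tail (income-like shape) -> positive skewness
print(skew_and_excess_kurtosis([1, 2, 2, 3, 3, 3, 4, 20]))

# Symmetric data -> skewness 0; flat shape -> negative excess kurtosis
print(skew_and_excess_kurtosis([1, 2, 3, 4, 5]))
```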