Outlier Calculator
Detect outliers in your data using IQR, Z-score, or Modified Z-score (MAD) methods. Identify which values are statistical outliers and get a cleaned dataset. See also our Interquartile Range Calculator, Z-Score Calculator, and Standard Deviation Calculator.
How to Use the Outlier Calculator
Outliers are data points that differ significantly from other observations. They can result from measurement errors, data entry mistakes, or genuine extreme values. This calculator offers three detection methods, each with different strengths: the IQR method makes no normality assumption and is resistant to extreme values, the Z-score method assumes approximately normal data, and the Modified Z-score uses the median absolute deviation (MAD), so the outliers themselves have little influence on the cutoff.
Enter your data as comma-separated numbers, choose a detection method, and adjust the threshold if needed. The IQR method (default multiplier 1.5) flags values below Q1 - 1.5×IQR or above Q3 + 1.5×IQR. The Z-score method (default threshold 2) flags values more than 2 standard deviations from the mean. The Modified Z-score (default threshold 3.5) replaces the mean and standard deviation with the median and MAD, which a few extreme values barely move, so the outliers do not distort the scale used to detect them.
After detection, the calculator shows which values are outliers and provides a cleaned dataset with outliers removed. Always investigate outliers before removing them — they may contain important information about your data or process. Legitimate extreme values should be kept; only errors or irrelevant observations should be removed.
Formula
IQR Method (Tukey's Fences):
Lower fence = Q1 - k × IQR
Upper fence = Q3 + k × IQR
where IQR = Q3 - Q1, k = 1.5 (mild) or 3 (extreme)
Z-Score Method:
z = (x - mean) / std_dev
Outlier if |z| > threshold (typically 2 or 3)
Modified Z-Score (MAD):
MAD = median(|xᵢ - median(x)|)
M_i = 0.6745 × (xᵢ - median) / MAD
Outlier if |M_i| > 3.5
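The Modified Z-score formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's actual implementation. The function name `modified_z_scores` is hypothetical, and the dataset is the one from the Example Calculation section.

```python
from statistics import median

def modified_z_scores(values):
    """Return the modified Z-score M_i for each value, based on the median and MAD."""
    med = median(values)
    # MAD = median of the absolute deviations from the median
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        # Assumption: treat a zero MAD as an error rather than dividing by zero
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    return [0.6745 * (x - med) / mad for x in values]

data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 50]
scores = modified_z_scores(data)
outliers = [x for x, m in zip(data, scores) if abs(m) > 3.5]
print(outliers)  # [50]
```

For this dataset the median is 9 and the MAD is 3, so the value 50 gets a score of about 9.2, well above the 3.5 cutoff.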
Example Calculation
Data: 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 50
Sorted: 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 50
IQR Method:
Q1 = 5.5, Q3 = 13, IQR = 7.5
Lower fence = 5.5 - 1.5×7.5 = -5.75
Upper fence = 13 + 1.5×7.5 = 24.25
Outlier: 50 (exceeds upper fence of 24.25)
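As a rough check of the IQR steps above, here is a short Python sketch. It assumes the "exclusive" quartile convention from the standard library's `statistics.quantiles`, which happens to reproduce Q1 = 5.5 and Q3 = 13 for this dataset. Other quartile conventions can give slightly different fences. The name `tukey_fences` is hypothetical.

```python
from statistics import quantiles

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: the "exclusive" quartile method; it matches the worked example
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 50]
lower, upper = tukey_fences(data)
outliers = [x for x in data if x < lower or x > upper]
cleaned = [x for x in data if lower <= x <= upper]
print(lower, upper, outliers)  # -5.75 24.25 [50]
```

The `cleaned` list holds the remaining 12 values, which corresponds to the cleaned dataset the calculator reports.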
Z-Score Method (threshold=2):
Mean = 11.77, Std Dev = 12.11 (sample, n - 1)
z(50) = (50-11.77)/12.11 = 3.16 → OUTLIER
z(2) = (2-11.77)/12.11 = -0.81 → not outlier
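The Z-score check can be sketched the same way. This version assumes the sample standard deviation (dividing by n - 1). A population standard deviation would shift the z values slightly, but this dataset gives the same flagged value either way. The function name `z_score_outliers` is hypothetical.

```python
from statistics import mean, stdev

def z_score_outliers(values, threshold=2.0):
    """Return the values whose |z| exceeds the threshold."""
    mu = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation (n - 1)
    return [x for x in values if abs((x - mu) / sd) > threshold]

data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 50]
print(z_score_outliers(data))  # [50]
```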
Reference Table
| Method | Type | Rule | Robust? |
|---|---|---|---|
| IQR (1.5×) | Mild Outlier | x < Q1-1.5×IQR or x > Q3+1.5×IQR | Yes |
| IQR (3×) | Extreme Outlier | x < Q1-3×IQR or x > Q3+3×IQR | Yes |
| Z-Score (\|z\|>2) | Moderate | \|x - mean\| > 2×std | No |
| Z-Score (\|z\|>3) | Extreme | \|x - mean\| > 3×std | No |
| Modified Z (MAD) | Robust | \|0.6745(x-median)/MAD\| > 3.5 | Yes |
| Grubbs Test | Statistical | Max \|x-mean\|/s vs critical value | No |
| Dixon Q Test | Small samples | Gap/range ratio vs critical | Moderate |
| DBSCAN | Multivariate | Density-based clustering | Yes |
Frequently Asked Questions
What is an outlier?
An outlier is a data point that lies an abnormal distance from the other observations in a dataset. Outliers can be caused by measurement errors, data entry mistakes, sampling problems, or genuine extreme values in the population. They can significantly affect statistical measures like the mean and standard deviation, potentially leading to misleading conclusions if not properly handled.
Which outlier detection method should I use?
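Use the IQR method for general-purpose detection: it is robust and doesn't assume normality. Use Z-scores when data is approximately normal and you want a probabilistic interpretation. Use Modified Z-scores (MAD) when you suspect multiple outliers that might mask each other (masking effect) or when the data is heavily skewed. For small samples (n < 25), the IQR method or Grubbs' test is preferred over Z-scores, because a single extreme value inflates the standard deviation it is measured against. With the sample standard deviation, |z| cannot exceed (n - 1)/√n, so a threshold of 3 can never be reached when n ≤ 10.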
Should I always remove outliers?
No. Outliers should be investigated, not automatically removed. If an outlier is due to a measurement error or data entry mistake, remove it. If it's a legitimate extreme value, keep it — it may contain important information. In some fields (fraud detection, quality control), outliers are the primary interest. Report analyses both with and without outliers to show their impact. Document your decision and reasoning for transparency.
What is the masking effect?
Masking occurs when multiple outliers inflate the mean and standard deviation so much that they no longer appear extreme by Z-score criteria. For example, with the values 1 through 10 plus 100 and 200, the mean shifts to about 29.6 and the sample standard deviation to about 60.2, so 100 has z ≈ 1.17 and 200 has z ≈ 2.83. Neither exceeds a threshold of 3. The IQR and Modified Z-score methods are resistant to masking because they use the median and quartiles, which are largely unaffected by a small number of extreme values.
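A small Python sketch can make the masking effect concrete. It uses the same values (1 through 10 plus 100 and 200) and compares the Z-score rule with the Modified Z-score rule. The Z-score threshold of 3 and the use of the sample standard deviation are assumptions chosen for illustration.

```python
from statistics import mean, median, stdev

data = list(range(1, 11)) + [100, 200]

# Z-score rule: both extreme values inflate the mean and standard deviation
mu, sd = mean(data), stdev(data)  # assumption: sample standard deviation
z_flagged = [x for x in data if abs((x - mu) / sd) > 3]

# Modified Z-score rule: median and MAD barely move
med = median(data)
mad = median(abs(x - med) for x in data)
mz_flagged = [x for x in data if abs(0.6745 * (x - med) / mad) > 3.5]

print(z_flagged)   # []
print(mz_flagged)  # [100, 200]
```

With these assumptions the Z-score rule flags nothing, while the median-based rule flags both 100 and 200.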
What is the difference between IQR multiplier 1.5 and 3?
A multiplier of 1.5 identifies "mild" outliers — values that are unusual but not extreme. For normally distributed data, about 0.7% of values fall outside 1.5×IQR fences. A multiplier of 3 identifies "extreme" outliers — values that are very far from the bulk of data. For normal data, only about 0.0002% fall outside 3×IQR fences. Use 1.5 for general screening and 3 for identifying only the most extreme values.
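The percentages quoted for normal data can be checked with a short sketch based on `statistics.NormalDist` from the Python standard library. It assumes a standard normal distribution, where Q1 and Q3 are symmetric around zero. The function name `fraction_outside_fences` is hypothetical.

```python
from statistics import NormalDist

def fraction_outside_fences(k):
    """Fraction of a normal distribution outside Q1 - k*IQR and Q3 + k*IQR."""
    std_normal = NormalDist()        # mean 0, standard deviation 1
    q3 = std_normal.inv_cdf(0.75)    # about 0.6745; Q1 is -q3 by symmetry
    iqr = 2 * q3
    upper = q3 + k * iqr
    # Both tails are equal by symmetry, so double the upper tail
    return 2 * (1 - std_normal.cdf(upper))

for k in (1.5, 3):
    print(f"k={k}: {fraction_outside_fences(k):.6%}")
```

The results come out at roughly 0.7% for k = 1.5 and roughly 0.0002% for k = 3, in line with the figures above.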
How do outliers affect statistical analysis?
Outliers can dramatically affect: (1) Mean — pulled toward the outlier; (2) Standard deviation — inflated; (3) Correlation — a single outlier can create or destroy apparent correlation; (4) Regression — can tilt the regression line significantly; (5) t-tests and ANOVA — inflated variance reduces power. Robust statistics (median, IQR, trimmed mean) are less affected. Always check for outliers before running parametric tests that assume normality.