The Secret Sauce for detecting outliers
Or: How to mathematically prove that your favorite takeoff is actually worth the drive
You know that friend who always brags about their "epic" sites, but you're pretty sure they're just full of hot air? Z-scores are the mathematical BS detector you've been waiting for.
The Short Version: A z-score tells you how unusual a data point is compared to the rest of your data. It's basically asking: "Is this site actually special, or am I just remembering that one good day?"
The Formula (don't run away):
Z-Score = (Your Value - Average) / Standard Deviation
Translation for Pilots: How many "standard deviations" away from average is this thing?
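As a minimal sketch, the formula is a one-liner in TypeScript. The function name `zScore` and its parameter names are made up for this illustration. Returning 0 when the standard deviation is 0 is an assumption of the sketch, chosen so that a dataset where every value is identical does not divide by zero.

```typescript
// Hypothetical helper: how many standard deviations is `value` from `mean`?
function zScore(value: number, mean: number, stdDev: number): number {
  // If every value in the dataset is identical, the standard deviation is 0
  // and the ratio is undefined. Treating that case as "perfectly average" (0)
  // is an assumption of this sketch.
  if (stdDev === 0) return 0;
  return (value - mean) / stdDev;
}

// A value 1.5 above an average of 3.0, with a standard deviation of 0.5
const example = zScore(4.5, 3.0, 0.5); // 3
```

A positive result means "above average", a negative one "below average", and the size tells you how far.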
Imagine you've flown from 10 different takeoff sites over the season. You want to know which ones are your actual go-to spots vs. which ones you just happened to visit once on an epic day.
Here's your data (average flight duration from each site):
| Site | Avg Duration (hours) |
|---|---|
| Annecy | 2.5 |
| Chamonix | 3.8 |
| Verbier | 2.2 |
| Bassano | 5.0 |
| Kössen | 2.4 |
| Interlaken | 2.6 |
| Pokhara | 1.8 |
| Saint-Hilaire | 2.7 |
| Bir | 2.3 |
| Oludeniz | 2.4 |
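Average across all sites: 2.77 hours. Standard deviation: 0.89 hours.
Let's calculate the z-score for Bassano (because 5.0 hours is suspiciously good):
Z-Score = (5.0 - 2.77) / 0.89
Z-Score = 2.23 / 0.89
Z-Score = 2.51
What does 2.51 mean?
Bassano sits about 2.5 standard deviations above your average site. Anything beyond 2 counts as an outlier, so this one really does stand out.
Now let's check Pokhara (1.8 hours, feels short):
Z-Score = (1.8 - 2.77) / 0.89
Z-Score = -0.97 / 0.89
Z-Score = -1.09
What does -1.09 mean?
Pokhara is about one standard deviation below average: on the short side, but nowhere near outlier territory.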
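If you'd rather let the computer do the arithmetic for all ten sites at once, here is a rough sketch. The names `siteDurations` and `siteZScores` are invented for this example. It uses the population standard deviation (dividing by N) and full floating-point precision, so results can differ in the second decimal from a hand calculation done with rounded inputs.

```typescript
// Average flight duration (hours) per site, copied from the table above.
const siteDurations = [
  { site: "Annecy", hours: 2.5 },
  { site: "Chamonix", hours: 3.8 },
  { site: "Verbier", hours: 2.2 },
  { site: "Bassano", hours: 5.0 },
  { site: "Kössen", hours: 2.4 },
  { site: "Interlaken", hours: 2.6 },
  { site: "Pokhara", hours: 1.8 },
  { site: "Saint-Hilaire", hours: 2.7 },
  { site: "Bir", hours: 2.3 },
  { site: "Oludeniz", hours: 2.4 },
];

// Hypothetical helper: one z-score per site, in the same order as the input.
function siteZScores(
  data: { site: string; hours: number }[]
): { site: string; z: number }[] {
  const n = data.length;
  const mean = data.reduce((sum, d) => sum + d.hours, 0) / n;
  // Population variance: average squared distance from the mean.
  const variance =
    data.reduce((sum, d) => sum + (d.hours - mean) * (d.hours - mean), 0) / n;
  const stdDev = Math.sqrt(variance);
  return data.map((d) => ({
    site: d.site,
    // Guard against a flat dataset, where stdDev would be 0.
    z: stdDev === 0 ? 0 : (d.hours - mean) / stdDev,
  }));
}

const allScores = siteZScores(siteDurations);
```

`allScores` then holds one `{ site, z }` entry per row of the table, ready to be sorted or filtered.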
Here's how to interpret your z-scores:
| Z-Score Range | What It Means | Pilot Translation |
|---|---|---|
| z > +2.0 | Extreme outlier (positive) | 🌟 This site absolutely RIPS - worth the 3-hour drive |
| +1.0 to +2.0 | Above average | 👍 Solid site, you'll usually score here |
| -1.0 to +1.0 | Pretty average | 🤷 Meh. Conditions matter more than the site |
| -2.0 to -1.0 | Below average | 👎 You're probably getting skunked here often |
| z < -2.0 | Extreme outlier (negative) | 💀 Avoid unless you enjoy landing in the LZ |
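The table translates directly into a small lookup function. This is only a sketch: `rateZScore` and the label names are invented here, and the table does not say which band the exact values ±1.0 belong to, so counting them as "average" is an assumption. Exactly ±2.0 falls in the inner bands, because the table's outer rows are strict (`z > +2.0`, `z < -2.0`).

```typescript
type Rating =
  | "extreme-high"
  | "above-average"
  | "average"
  | "below-average"
  | "extreme-low";

// Hypothetical helper mirroring the interpretation table.
function rateZScore(z: number): Rating {
  if (z > 2) return "extreme-high";    // z > +2.0
  if (z > 1) return "above-average";   // +1.0 to +2.0
  if (z >= -1) return "average";       // -1.0 to +1.0 (boundaries assumed)
  if (z >= -2) return "below-average"; // -2.0 to -1.0
  return "extreme-low";                // z < -2.0
}

const rating = rateZScore(2.5); // "extreme-high"
```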
You: "Dude, I LOVE flying from [Random Site]. It's amazing!" Your Brain: Remembering that ONE epic 6-hour XC from 3 years ago while conveniently forgetting the 15 sled rides
Z-Score Solution: If the site's z-score is close to zero or negative, your brain is lying to you. The data doesn't lie - on average, that site is ordinary at best.
You've flown Bassano once (5 hours, epic), and you've flown your local hill 47 times (average 2.1 hours).
Which is better? Hard to say without context, right?
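Z-Score Solution: The z-score puts every site on the same scale, so you can compare apples to oranges. It asks: "Compared to ALL my sites, how does this one stack up?" Just remember that it's built from each site's average, so one epic flight at Bassano carries as much weight as 47 flights at your local hill.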
You want to justify that 4-hour drive to your partner/boss/"I'm working from home today" excuse.
Z-Score Solution: Print out your cockpit stats showing a z-score of +2.8 for that site. "See honey, it's not just fun, it's STATISTICALLY SIGNIFICANT fun. This is science."
When you look at your By Takeoff stats, you'll see a Performance Score for each site. This is the z-score, and it's calculated from your average flight duration per site (the ratio of total hours to number of flights).
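To make "the ratio of total hours to number of flights" concrete, here is a sketch of how a flight log could be rolled up per takeoff. The `LoggedFlight` and `TakeoffSummary` shapes and the `summarizeByTakeoff` function are hypothetical and only meant to show the arithmetic. The app's real data types are not shown in this article and may look different.

```typescript
// Hypothetical shape of one logged flight.
interface LoggedFlight {
  takeoff: string;
  hours: number;
}

// Hypothetical per-site summary; `ratio` is the value the z-score is built on.
interface TakeoffSummary {
  takeoff: string;
  totalHours: number;
  flights: number;
  ratio: number; // average hours per flight = totalHours / flights
}

function summarizeByTakeoff(log: LoggedFlight[]): TakeoffSummary[] {
  const summaries: TakeoffSummary[] = [];
  for (const flight of log) {
    // Look for an existing summary for this takeoff (linear scan keeps it simple).
    let summary: TakeoffSummary | undefined;
    for (const s of summaries) {
      if (s.takeoff === flight.takeoff) summary = s;
    }
    if (summary === undefined) {
      summary = { takeoff: flight.takeoff, totalHours: 0, flights: 0, ratio: 0 };
      summaries.push(summary);
    }
    summary.totalHours += flight.hours;
    summary.flights += 1;
    summary.ratio = summary.totalHours / summary.flights;
  }
  return summaries;
}

const demoSummary = summarizeByTakeoff([
  { takeoff: "Local Hill", hours: 1.0 },
  { takeoff: "Bassano", hours: 5.0 },
  { takeoff: "Local Hill", hours: 2.0 },
]);
// Local Hill: 3.0 h over 2 flights, ratio 1.5; Bassano: 5.0 h over 1 flight, ratio 5
```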
```typescript
// Adapted from src/lib/utils.ts. The average and standard deviation are the
// same for every site, so they are computed once instead of inside the loop.
export function computeZScore(dataSerie: chartData[]) {
  // Calculate the average ratio (avg hours per flight) across all sites
  let avg_ratio = 0;
  for (let index = 0; index < dataSerie.length; index++) {
    avg_ratio += dataSerie[index].ratio;
  }
  avg_ratio = avg_ratio / dataSerie.length;

  // Calculate the (population) standard deviation
  let sum_deviations = 0;
  for (let index = 0; index < dataSerie.length; index++) {
    sum_deviations += Math.pow(dataSerie[index].ratio - avg_ratio, 2);
  }
  const deviation = Math.sqrt(sum_deviations / dataSerie.length);

  // For each takeoff site: (this site's ratio - average) / std deviation
  for (let index = 0; index < dataSerie.length; index++) {
    dataSerie[index].z = deviation === 0
      ? 0
      : (dataSerie[index].ratio - avg_ratio) / deviation;
  }
}
```
The code comments say: "Values with |z| > 2 are considered statistical outliers."
This means:
Let's say you check "By Takeoff" and see these three rows in your list of sites:
| Takeoff Site | Total Hours | Flights | Avg/Flight | Performance Score |
|---|---|---|---|---|
| Mont Blanc | 45.5h | 8 | 5.69h | +2.84 🌟 |
| Local Hill | 67.2h | 42 | 1.60h | -0.52 😐 |
| That One Time | 3.2h | 1 | 3.20h | +0.79 🤷 |
What this tells you:
Mont Blanc (z = +2.84): Holy crap, this site is a statistical unicorn. Across 8 flights you average far longer here than anywhere else. Book the Airbnb NOW.
Local Hill (z = -0.52): Slightly below average, but not terrible. It's convenient, and you're building hours. Keep using it for practice, but don't expect epics.
That One Time (z = +0.79): Above average, but you've only flown it once. Could be luck. Need more data before you know if it's actually good.
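Since the score itself says nothing about how many flights stand behind it, one option is to show the flight count right next to it. The sketch below does that. `ScoredSite`, `describeScore`, and the cutoff of 5 flights are all assumptions made up for this example, not something the cockpit is documented to do.

```typescript
// Hypothetical shape: a site with its flight count and z-score.
interface ScoredSite {
  name: string;
  flights: number;
  z: number;
}

// Assumed cutoff: below this many flights, treat the score as unconfirmed.
const MIN_FLIGHTS = 5;

function describeScore(site: ScoredSite, minFlights: number = MIN_FLIGHTS): string {
  const sign = site.z >= 0 ? "+" : "";
  const label = site.name + ": z = " + sign + site.z.toFixed(2);
  return site.flights < minFlights
    ? label + " (only " + site.flights + " flight(s), needs more data)"
    : label;
}

const wellFlown = describeScore({ name: "Mont Blanc", flights: 8, z: 2.84 });
// "Mont Blanc: z = +2.84"
const barelyFlown = describeScore({ name: "New Site", flights: 1, z: 1.2 });
// "New Site: z = +1.20 (only 1 flight(s), needs more data)"
```

Pick the cutoff to match how much you trust a handful of flights; there is no magic number.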
1. Small Sample Sizes
2. All Values Are The Same
deviation === 0 ? 0 : (value - avg) / deviation
3. Seasonal Bias
4. Different Flight Styles
Which wing gives you the longest flights?
When should you book your vacation?
Which flying destination is worth the plane ticket?
Your buddy wants to drive 5 hours to a new site. You check your cockpit:
Decision: The nearby site has a higher z-score AND more data. It's the safer bet for a good day. Save the 5-hour drive for when you have more data on the new site.
You're deciding between two wings based on your logged flights:
Decision: Don't buy the friend's wing based on 2 flights. The z-score looks good, but sample size is too small. Could be luck, could be the friend is a better pilot than you.
Your partner: "You drive 2 hours to that site every weekend. Is it even worth it?"
You: "Check this out. Z-score of +2.3. I average 4.2 hours there vs. 2.1 hours at the local hill. That's statistically significant. I'm flying TWICE as long for a 2-hour drive. The math checks out."
Result: Your partner still thinks you're crazy, but at least now you have DATA.
What is a z-score? A measure of how unusual a value is compared to your average. It's normalized by standard deviation so you can compare different things.
Why do we use it? To identify which takeoff sites (or gliders, or months, or countries) are actually exceptional vs. just lucky one-offs.
How do I read it?
Where do I see it? In your Cockpit Dashboard, under "Performance Score" for takeoffs, gliders, months, etc.
What should I do with it? Use it to make smarter decisions about where to fly, what gear to use, and when to plan trips. The z-score is your data-driven wingman.
Dataset: Average flight hours per site
Sites: [2.5, 3.8, 2.2, 5.0, 2.4, 2.6, 1.8, 2.7, 2.3, 2.4]
Step 1: Calculate Mean
Mean = (2.5 + 3.8 + 2.2 + 5.0 + 2.4 + 2.6 + 1.8 + 2.7 + 2.3 + 2.4) / 10
Mean = 27.7 / 10 = 2.77 hours
Step 2: Calculate Variance
Variance = Σ(x - mean)² / N
For each value:
(2.5 - 2.77)² = 0.073
(3.8 - 2.77)² = 1.061
(2.2 - 2.77)² = 0.325
(5.0 - 2.77)² = 4.973 ← Bassano
(2.4 - 2.77)² = 0.137
(2.6 - 2.77)² = 0.029
(1.8 - 2.77)² = 0.941
(2.7 - 2.77)² = 0.005
(2.3 - 2.77)² = 0.221
(2.4 - 2.77)² = 0.137
Sum = 7.902
Variance = 7.902 / 10 = 0.790
Step 3: Calculate Standard Deviation
Std Dev = √Variance = √0.790 = 0.889 hours
Step 4: Calculate Z-Scores
Z(Bassano) = (5.0 - 2.77) / 0.889 = +2.51 🌟
Z(Pokhara) = (1.8 - 2.77) / 0.889 = -1.09 👎
Z(Annecy) = (2.5 - 2.77) / 0.889 = -0.30 🤷
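One detail worth knowing: Step 2 divides by N, which is the population formula. Many spreadsheets and stats tools default to the sample formula, which divides by N - 1 and gives a slightly larger standard deviation (and therefore slightly smaller z-scores). The sketch below compares the two on the same dataset. The function name `stdDev` and its `divisorOffset` parameter are invented for this illustration.

```typescript
const hours = [2.5, 3.8, 2.2, 5.0, 2.4, 2.6, 1.8, 2.7, 2.3, 2.4];

// divisorOffset = 0 gives the population formula (divide by N),
// divisorOffset = 1 gives the sample formula (divide by N - 1).
function stdDev(values: number[], divisorOffset: 0 | 1): number {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const sumSquares = values.reduce(
    (sum, v) => sum + (v - mean) * (v - mean),
    0
  );
  return Math.sqrt(sumSquares / (values.length - divisorOffset));
}

const populationSd = stdDev(hours, 0); // ≈ 0.889, as in Step 3
const sampleSd = stdDev(hours, 1);     // ≈ 0.937

// Bassano's z-score under each convention
const bassanoPopulationZ = (5.0 - 2.77) / populationSd; // ≈ 2.51
const bassanoSampleZ = (5.0 - 2.77) / sampleSd;         // ≈ 2.38
```

Either way Bassano clears the |z| > 2 bar, but if your own spreadsheet numbers come out a bit lower than the ones here, this is the likely reason.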
A z-score tells you what is unusual, not why it's unusual.
Always look at:
Z-scores are a tool, not the truth. But they're a damn good tool for cutting through the BS and finding the signal in the noise.
Now get out there and fly! And when your friends ask why you're driving 3 hours to that one site, you can confidently say: "Because the z-score is +2.6, bro. SCIENCE."
"In statistics we trust, but in thermals we must." - Every data nerd pilot