This study applies Principal Component Analysis (PCA) combined with K-Means clustering to modern NFL quarterback performance data, incorporating the advanced metric Completion Percentage Over Expected (CPOE). Using 2025 season data and historical trends, we identified three robust archetypes: Dual-Threat, Balanced, and Pocket. Key findings show that Dual-Threat quarterbacks exhibit the highest fantasy points per game (mean ~22.8 FPG) with greater variance, while Pocket passers demonstrate the most consistent and predictable production (mean ~18.4 FPG). CPOE significantly improved archetype separation by highlighting precision differences beyond raw completion percentage. These data-driven archetypes provide actionable insights for fantasy football, team building, and player evaluation.
The NFL quarterback position has evolved dramatically with the rise of mobile athletes and advanced analytics. Traditional classifications (pocket passer vs. dual-threat) often rely on subjective observation. Recent work using rushing contribution percentages and machine learning has shown clear statistical groupings. However, few studies have incorporated CPOE, an advanced Next Gen Stats metric that adjusts completion percentage for depth of target, pressure, and situational difficulty.
This study aims to:
Apply PCA + clustering with CPOE to define objective archetypes.
Compare fantasy production across groups.
Provide visual and statistical validation of the three-archetype model (Dual-Threat, Balanced, Pocket).
Data Sources
2025 regular season statistics for starting and significant backup quarterbacks were compiled, including:
Passing: Yards per game, Completion %, Touchdowns per game, Interceptions per game
Rushing: Yards per game
Advanced: CPOE (Completion % Over Expected)
Contextual: Age
Seven normalized features were used: Pass Yds/G, Comp%, Pass TD/G, INT/G, Rush Yds/G, Age, and CPOE.
Analytical Pipeline
Data standardization (z-score)
Principal Component Analysis (PCA) for dimensionality reduction
K-Means clustering (k=3) on standardized features
Archetype labeling based on cluster centroids (primarily driven by Rush Yds/G and CPOE)
Fantasy Points Per Game (FPG) calculated using standard PPR/Superflex scoring
PCA scatter plots, boxplots of FPG and CPOE by archetype, and summary statistics were generated.
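The pipeline steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code behind the study (the references indicate the plots and statistics came from JASP). The feature short names, the random placeholder matrix, and the rule that names clusters by their rushing centroid alone are all assumptions made for the example; the study describes labeling as driven by both Rush Yds/G and CPOE.

```python
import numpy as np

# Column order follows the seven features listed above; the short names are hypothetical.
FEATURES = ["pass_yds_g", "comp_pct", "pass_td_g", "int_g", "rush_yds_g", "age", "cpoe"]

def zscore(X):
    """Standardize each column to mean 0 and standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca(Z, n_components=2):
    """Project standardized (already centered) data onto its leading principal components."""
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # share of variance per component
    return Z @ Vt[:n_components].T, explained[:n_components]

def kmeans(Z, k=3, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, recompute, repeat."""
    rng = np.random.default_rng(seed)
    centroids = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def name_archetypes(centroids):
    """Assumed rule: rank clusters by their standardized rushing centroid."""
    rush = FEATURES.index("rush_yds_g")
    order = np.argsort(centroids[:, rush])  # lowest to highest rushing
    return {int(order[0]): "Pocket", int(order[1]): "Balanced", int(order[2]): "Dual-Threat"}

# Placeholder input: random numbers standing in for a (quarterbacks x features) table.
rng = np.random.default_rng(42)
X = rng.normal(size=(40, len(FEATURES)))

Z = zscore(X)
scores, explained = pca(Z)           # 2-D coordinates for a scatter plot
labels, centroids = kmeans(Z, k=3)   # clustering runs on the standardized features
names = name_archetypes(centroids)
archetypes = [names[int(label)] for label in labels]
```

With scikit-learn available, `StandardScaler`, `PCA`, and `KMeans` cover the same three steps and would be the more usual choice; the hand-written versions here just keep the sketch dependency-light.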
The inclusion of CPOE meaningfully refined archetype boundaries compared to earlier rushing-only models. High-CPOE Pocket passers (e.g., Burrow, Purdy, Maye) emerge as highly efficient despite limited mobility, while Dual-Threat players trade some accuracy for explosive rushing production.
Fantasy Implications
Dual-Threat offers the highest ceiling and weekly volatility — ideal for high-variance strategies.
Pocket provides reliable floor production, especially valuable in superflex or best-ball formats.
Balanced represents the “safe” high-floor/high-ceiling middle ground.
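To make "ceiling", "floor", and "volatility" concrete, the comparison can be reduced to a few summary statistics per archetype. The sketch below is illustrative only: the scoring weights are a commonly used default (the study says only that standard scoring was applied, so the exact values are an assumption), rushing touchdowns are included even though they are not among the seven clustering features, and the FPG numbers are made-up placeholders, not results from the study.

```python
from statistics import mean, stdev

def fantasy_points(pass_yds, pass_td, interceptions, rush_yds, rush_td=0):
    """One game's QB fantasy points under assumed weights:
    1 pt per 25 pass yards, 4 per pass TD, -2 per INT,
    1 pt per 10 rush yards, 6 per rush TD."""
    return (0.04 * pass_yds + 4 * pass_td - 2 * interceptions
            + 0.1 * rush_yds + 6 * rush_td)

def summarize(fpg_by_archetype):
    """Mean, spread, floor, and ceiling of FPG for each archetype."""
    return {
        name: {
            "mean": mean(values),
            "sd": stdev(values),      # sample standard deviation as a volatility proxy
            "floor": min(values),
            "ceiling": max(values),
        }
        for name, values in fpg_by_archetype.items()
    }

# Placeholder per-quarterback FPG values, chosen only to exercise the function.
example = {
    "Dual-Threat": [12.0, 22.0, 32.0],
    "Balanced": [16.0, 20.0, 24.0],
    "Pocket": [17.0, 18.0, 19.0],
}
summary = summarize(example)
```

A wider standard deviation with a higher maximum corresponds to the high-ceiling, high-variance profile described for Dual-Threat, while a narrow spread corresponds to the reliable floor described for Pocket.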
Limitations
Limited sample size and single-season focus (2025 statistics and CPOE only)
CPOE values approximated where exact per-player data was limited
Does not account for team context, offensive scheme, or injury history
Future work could incorporate EPA/play, pressure metrics, and multi-year longitudinal data to track archetype evolution over careers.
PCA + K-Means clustering incorporating CPOE successfully validates three distinct quarterback archetypes with clear statistical and fantasy-relevant differences. These findings provide a modern, data-driven framework for evaluating and rostering quarterbacks in the 2020s.
Next Gen Stats (CPOE methodology)
FantasyPros/PFR/FFToday/TeamRankings.com - historical QB data
Prior archetype studies (DLF, Sharp Football Analysis, Secret Stuff)
PCA, clustering, box plots, and other statistics (JASP.org)