Club América, the most decorated club in Mexican football, has built its legacy on history, titles, and a passionate fan base that spans generations. Yet in an increasingly competitive football landscape, every detail matters. Optimizing how corners are executed and defended can mean the difference between sustaining dominance and falling behind.
This challenge is not just about analyzing data; it is about turning corners into goals and delivering insights that make a difference on the pitch.
Challenge ¶
Turn eight seasons of Liga MX corner kicks into simple, evidence-based insights a coach or player can use.
Strategy overview ¶
Corner effectiveness: calculate concrete KPIs (shot probability, xG per corner, goal probability, chance of conceding a counter within 10s).
Tactical choices: compare “crowd 6-yard box” vs “spread” setups using player-density metrics and estimate their marginal effects.
Models to predict: (a) shot happening, (b) shot xG, (c) opponent counter risk.
Simulations: replace 20% of outswingers with short corners and estimate the changes in shots, xG, and counter risk.
Understanding corner effectiveness¶
Descriptive analysis and Quantification¶
The following KPIs are the basis for actionable insights:
- Shots per 100 corners, segmented by corner type and crowd density (crowd_index quartiles).
- xG per corner, analyzed by delivery type and crowd_index.
- Goal rate, grouped by the number of attackers inside the 6-yard box (0, 1–2, 3+).
- Counter-attack rate (opponent shot within 10 seconds), compared between short vs. long corners and outswingers vs. inswingers.
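A minimal sketch of how these KPIs could be tabulated with pandas. The table layout is an assumption: one row per corner, with hypothetical columns `corner_type`, `n_attackers_0_6`, `shot_happened`, `xg`, `goal`, and `opp_shot_within_10s`. The toy rows are invented purely so the example runs.

```python
import pandas as pd

def corner_kpis(corners: pd.DataFrame, by: str) -> pd.DataFrame:
    """Aggregate per-corner outcomes into KPIs for each level of `by`."""
    grouped = corners.groupby(by, observed=True)
    return pd.DataFrame({
        "n_corners": grouped.size(),
        "shots_per_100": 100 * grouped["shot_happened"].mean(),
        "xg_per_corner": grouped["xg"].mean(),
        "goal_rate": grouped["goal"].mean(),
        "counter_rate": grouped["opp_shot_within_10s"].mean(),
    })

# Toy data only (hypothetical column names, made-up values)
toy = pd.DataFrame({
    "corner_type": ["inswinger", "inswinger", "outswinger",
                    "outswinger", "short", "short"],
    "n_attackers_0_6": [3, 1, 0, 2, 0, 4],
    "shot_happened": [1, 0, 0, 1, 0, 1],
    "xg": [0.12, 0.0, 0.0, 0.05, 0.0, 0.20],
    "goal": [0, 0, 0, 0, 0, 1],
    "opp_shot_within_10s": [0, 0, 1, 0, 0, 0],
})

# Bucket attackers inside the 6-yard box as 0, 1-2, 3+
toy["box_bucket"] = pd.cut(
    toy["n_attackers_0_6"],
    bins=[-1, 0, 2, float("inf")],
    labels=["0", "1-2", "3+"],
)

by_type = corner_kpis(toy, "corner_type")
by_bucket = corner_kpis(toy, "box_bucket")
print(by_type)
print(by_bucket)
```

Quartiles of crowd_index could be built the same way with `pd.qcut(..., 4)` and passed as the grouping column.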
Statistical validation 📈¶
To ensure the observed differences are meaningful and not due to chance, we apply bootstrap resampling. We compute bootstrapped 95% confidence intervals (CIs) for each key rate and compare tactical strategies (e.g., crowded vs. spread setups) using bootstrap-derived p-values, which provide a robust, non-parametric measure of significance.
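The two bootstrap steps could look roughly like the sketch below. It assumes each strategy is represented by an array of 0/1 outcomes per corner (for example shot_happened). The function names and the toy arrays are hypothetical, and the p-value uses one common variant (resampling both groups from the pooled sample to impose the null hypothesis), which may differ from the exact procedure used in the analysis.

```python
import numpy as np

def bootstrap_ci(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a per-corner outcome."""
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes, dtype=float)
    # Each row of idx is one resample of the corners, drawn with replacement
    idx = rng.integers(0, len(outcomes), size=(n_boot, len(outcomes)))
    means = outcomes[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

def bootstrap_diff_pvalue(a, b, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for mean(a) - mean(b).

    Both groups are resampled from the pooled sample, which imposes the
    null hypothesis that the two strategies share one outcome distribution.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    boot_a = rng.choice(pooled, size=(n_boot, len(a)), replace=True).mean(axis=1)
    boot_b = rng.choice(pooled, size=(n_boot, len(b)), replace=True).mean(axis=1)
    diffs = boot_a - boot_b
    # +1 in numerator and denominator keeps the p-value strictly positive
    p = (np.sum(np.abs(diffs) >= abs(observed)) + 1) / (n_boot + 1)
    return observed, p

# Made-up outcomes: 30/100 shots from crowded corners, 20/100 from spread ones
crowded = np.array([1] * 30 + [0] * 70)
spread = np.array([1] * 20 + [0] * 80)

print("95% CI, crowded shot rate:", bootstrap_ci(crowded))
print("difference and p-value:", bootstrap_diff_pvalue(crowded, spread))
```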
Predictive modeling¶
- Shot occurrence model: binary classification (shot_happened).
  - Inputs: spatial features, delivery type, taker profile, opposition density, context.
  - Model: gradient boosting (XGBoost / LightGBM) or logistic regression for interpretability.
  - Output: P(shot | corner).
- Shot quality model: regression for shot_xG | shot.
  - Inputs: shot location, header/foot, pressure, delivery height.
  - Model: XGBoost regression or a simple calibrated xG logistic model.
- Conceding risk model: binary classification of opp_shot_within_10s and opp_goal_within_30s.
  - Inputs: short corner, number of players left back, transition metrics (distance of nearest defender to the halfway line), keeper position.
  - Model: classification tree or logistic regression.
- On-Ball Value (OBV): compute the marginal expected value of each corner event vs. baseline.
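As an illustration of the shot occurrence model, here is a minimal sketch using the interpretable logistic option with scikit-learn. Everything about the data is an assumption: the three feature names are hypothetical stand-ins for the inputs listed above, and the outcomes are simulated from an invented process only so that the model has a signal to learn.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000

# Synthetic features (hypothetical names) standing in for real corner data
n_attackers_0_6 = rng.integers(0, 6, size=n)
is_inswinger = rng.integers(0, 2, size=n)
opp_density = rng.normal(0.0, 1.0, size=n)
X = np.column_stack([n_attackers_0_6, is_inswinger, opp_density])

# Invented data-generating process, not an estimate from Liga MX data
logit = -1.5 + 0.3 * n_attackers_0_6 + 0.4 * is_inswinger - 0.5 * opp_density
shot_happened = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    X, shot_happened, test_size=0.25, random_state=0, stratify=shot_happened
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_shot = model.predict_proba(X_test)[:, 1]  # P(shot | corner)
auc = roc_auc_score(y_test, p_shot)

print("coefficients:", model.coef_[0])
print("hold-out AUC:", round(auc, 3))
```

A gradient-boosting classifier could be swapped in behind the same `fit` / `predict_proba` interface; the conceding risk model would follow the same pattern with a different target column.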
Counterfactual simulation: “What if we changed 20% of our corners?”¶
We want to test a simple tactical question:
What would happen if we replaced 20% of our outswinging corners with short corners? To do this, we use a counterfactual simulation, essentially a “what-if” experiment run on data.
- Find the current outswingers
- Pick 20% of them
- Create their “what-if” version
- Predict new outcomes
- Compare both versions
- Repeat to be sure
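The six steps above can be sketched as a short loop. This is an illustration under stated assumptions, not the actual pipeline: `predict_shot_prob` is a hypothetical stand-in for a fitted model with invented numbers, and the corner data is randomly generated.

```python
import numpy as np

def predict_shot_prob(corner_type, n_attackers_0_6):
    """Stand-in for a fitted shot model; every number here is invented."""
    short = 0.22 + 0.005 * n_attackers_0_6
    outswinger = 0.18 + 0.02 * n_attackers_0_6
    other = 0.25 + 0.02 * n_attackers_0_6
    return np.where(corner_type == "short", short,
                    np.where(corner_type == "outswinger", outswinger, other))

def simulate_switch(corner_type, n_attackers_0_6, frac=0.20, n_rep=500, seed=0):
    """Change in total expected shots when `frac` of outswingers go short."""
    rng = np.random.default_rng(seed)
    outswingers = np.flatnonzero(corner_type == "outswinger")   # 1. find
    n_switch = int(round(frac * len(outswingers)))
    baseline = predict_shot_prob(corner_type, n_attackers_0_6).sum()
    deltas = np.empty(n_rep)
    for r in range(n_rep):                                      # 6. repeat
        chosen = rng.choice(outswingers, size=n_switch, replace=False)  # 2. pick
        what_if = corner_type.copy()
        what_if[chosen] = "short"                               # 3. what-if version
        new_total = predict_shot_prob(what_if, n_attackers_0_6).sum()  # 4. predict
        deltas[r] = new_total - baseline                        # 5. compare
    return deltas

# Randomly generated corners, for illustration only
rng = np.random.default_rng(1)
types = rng.choice(np.array(["inswinger", "outswinger", "short"]), size=300)
attackers = rng.integers(0, 6, size=300)

deltas = simulate_switch(types, attackers)
low, high = np.quantile(deltas, [0.025, 0.975])
print(f"Change in expected shots: {deltas.mean():+.2f} "
      f"(95% interval {low:+.2f} to {high:+.2f})")
```

The same loop could be run with an xG or counter-risk predictor in place of `predict_shot_prob` to cover the other two outcomes.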
Causal-style estimation: does crowding the 6-yard box help?¶
Propensity score matching (PSM)
- Fit a propensity model for n_attackers_0_6 >= 3 (treatment = crowded) using pre-corner covariates (score_state, minute, home/away, opponent strength).
- Match crowded corners to similar non-crowded corners.
- Estimate the Average Treatment Effect on the Treated (ATT) for outcomes: shot probability, xG, goal rate.
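A minimal sketch of the matching procedure, assuming scikit-learn is available. The choices here are assumptions rather than the analysis's confirmed settings: 1-nearest-neighbour matching with replacement, a caliper of 0.05 on the propensity score, and a hypothetical helper name `att_psm`. The data is simulated with a known treatment effect of 0.15 so the estimate can be sanity-checked.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def att_psm(covariates, treated, outcome, caliper=0.05):
    """ATT via 1-nearest-neighbour propensity score matching (with replacement).

    Returns the ATT estimate and the number of treated corners matched.
    """
    treated = np.asarray(treated).astype(bool)
    outcome = np.asarray(outcome, dtype=float)
    # Propensity score: P(crowded | pre-corner covariates)
    ps_model = LogisticRegression(max_iter=1000).fit(covariates, treated)
    ps = ps_model.predict_proba(covariates)[:, 1]
    # For each crowded corner, find the non-crowded corner with the closest score
    nn = NearestNeighbors(n_neighbors=1).fit(ps[~treated].reshape(-1, 1))
    dist, idx = nn.kneighbors(ps[treated].reshape(-1, 1))
    keep = dist[:, 0] <= caliper  # drop treated corners without a close control
    matched_controls = outcome[~treated][idx[keep, 0]]
    att = (outcome[treated][keep] - matched_controls).mean()
    return att, int(keep.sum())

# Simulated corners: home advantage affects both crowding and shots (confounder)
rng = np.random.default_rng(0)
n = 4000
home = rng.integers(0, 2, n)
opp_strength = rng.normal(size=n)
X = np.column_stack([home, opp_strength])
crowded = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * home + 0.5 * opp_strength))))
shot = rng.binomial(1, 0.15 + 0.10 * home + 0.15 * crowded)  # true effect: 0.15

att, n_matched = att_psm(X, crowded, shot)
naive = shot[crowded == 1].mean() - shot[crowded == 0].mean()
print(f"naive difference: {naive:.3f}, matched ATT: {att:.3f} ({n_matched} matched)")
```

In practice, covariate balance after matching should be checked before the ATT is interpreted.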
Model-based marginal effect (conditional)
- Using the shot occurrence model (from the predictive models), compute the marginal effect by setting n_attackers_0_6 = k+1, where k is the observed value, while holding other features constant (or sampling realistic covariates), and average the predicted change.
- This gives an Average Marginal Effect (AME).
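The AME computation can be sketched in a few lines. This is a sketch under assumptions: `predict_shot` is a hypothetical stand-in for the fitted shot occurrence model with invented coefficients, and column 0 of the feature matrix is assumed to hold n_attackers_0_6.

```python
import numpy as np

def average_marginal_effect(predict_fn, X, col, step=1):
    """Mean change in predicted probability when column `col` rises by `step`,
    with all other features held at their observed values."""
    X_plus = X.copy()
    X_plus[:, col] = X_plus[:, col] + step
    return float(np.mean(predict_fn(X_plus) - predict_fn(X)))

def predict_shot(X):
    """Stand-in for the fitted shot model; coefficients are invented."""
    z = -1.5 + 0.3 * X[:, 0] - 0.5 * X[:, 1]
    return 1 / (1 + np.exp(-z))

# Randomly generated features: column 0 = n_attackers_0_6, column 1 = opposition density
rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(0, 5, 500), rng.normal(size=500)]).astype(float)

ame = average_marginal_effect(predict_shot, X, col=0)
print(f"AME of one extra attacker in the 6-yard box: {ame:+.4f}")
```

With a fitted scikit-learn classifier, `predict_fn` could be `lambda X: model.predict_proba(X)[:, 1]`.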
Conclusions – Turning Corners into Goals ⚽¶
1️⃣ Every corner counts. Small tactical changes can shape shot creation and defensive balance.
2️⃣ Backed by evidence. Bootstrap validation confirms which strategies truly make a difference.
3️⃣ Predictive power. Models highlight when, where, and how corners are most effective.
4️⃣ Tactical clarity. Causal analysis isolates the real impact of crowding the 6-yard box.
5️⃣ What-if simulation. Testing 20% short corners shows potential gains before applying on the pitch.
Data-driven insights to help Club América turn corners into goals.
Next steps¶
- Extend the analysis to other teams and to additional features.
- Study corners that lead to penalties, free kicks, or yellow and red cards for the opposing team.