Built an end-to-end churn prediction system on 1,701 student records, from raw behavioral data to a deployment-ready XGBoost model with 90.3% accuracy and an AUC of 0.978, to identify at-risk students before they drop off.
An edtech platform was losing students with no early warning system. By the time disengagement was noticed, the student had already mentally left. The platform needed to know who would churn and why — before it happened.
Before modeling, exploratory data analysis (EDA) identified which behavioral signals separate churned students from active ones, summarized in four Python-generated charts.
Raw behavioral logs don't predict churn directly. The key was engineering composite features that capture engagement quality, not just activity volume.
```python
# Composite engagement score (top correlated feature at 0.69)
df['engagement_score'] = (
    df['logins_per_week'] * 0.3
    + df['assignment_completion_pct'] * 0.4
    + df['forum_posts_monthly'] * 0.2
    - df['support_tickets'] * 0.1
)

# Early warning flags (1 = flag raised, 0 = not raised)
df['critical_low_engagement'] = (df['engagement_score'] < 2.0).astype(int)
df['failing_assignments'] = (df['assignment_completion_pct'] < 40).astype(int)
df['socially_isolated'] = (df['forum_posts_monthly'] == 0).astype(int)

# Engagement trend (week 1 minus week 4; positive values mean engagement fell)
df['engagement_trend'] = df['engagement_week_1'] - df['engagement_week_4']
```
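To make the weighting concrete, here is a minimal sketch that applies the same formula and flag thresholds to two made-up student records in plain Python. The records and the helper names `engagement_score` and `warning_flags` are hypothetical. The sketch assumes the inputs are used on their raw scales, so `assignment_completion_pct` (0 to 100) dominates the score; if the original pipeline rescaled its inputs first, which the source does not say, the numbers would differ.

```python
# Hypothetical student records; the values are made up for illustration.
students = [
    {"id": "s1", "logins_per_week": 5, "assignment_completion_pct": 80,
     "forum_posts_monthly": 3, "support_tickets": 1},
    {"id": "s2", "logins_per_week": 1, "assignment_completion_pct": 2,
     "forum_posts_monthly": 0, "support_tickets": 3},
]


def engagement_score(s):
    """Weighted composite using the weights from the feature-engineering code."""
    return (
        s["logins_per_week"] * 0.3
        + s["assignment_completion_pct"] * 0.4
        + s["forum_posts_monthly"] * 0.2
        - s["support_tickets"] * 0.1
    )


def warning_flags(s):
    """Return the score plus the three binary early-warning flags."""
    score = engagement_score(s)
    return {
        "engagement_score": round(score, 2),
        "critical_low_engagement": int(score < 2.0),
        "failing_assignments": int(s["assignment_completion_pct"] < 40),
        "socially_isolated": int(s["forum_posts_monthly"] == 0),
    }


results = {s["id"]: warning_flags(s) for s in students}
for student_id, flags in results.items():
    print(student_id, flags)
```

The first record scores about 34 and raises no flags; the second scores about 0.8 and raises all three.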
Highest correlated feature (0.69). Weighted composite of logins, assignments, forum activity, and support tickets.
Top XGBoost importance feature (0.202). Students below 40% completion show dramatically higher churn.
Binary flag — failing first assignment is a 3× churn risk multiplier. Engineered from raw completion data.
Binary early warning: engagement score below threshold within first week flags immediate at-risk status.
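A multiplier such as the 3× figure above is usually read as a relative risk: the churn rate among flagged students divided by the churn rate among unflagged students. The source does not show how its figure was computed, so the sketch below only illustrates that calculation. The function name `relative_risk` and all counts are hypothetical.

```python
def relative_risk(flagged_churned, flagged_total, unflagged_churned, unflagged_total):
    """Churn rate of flagged students divided by churn rate of unflagged students."""
    flagged_rate = flagged_churned / flagged_total
    unflagged_rate = unflagged_churned / unflagged_total
    return flagged_rate / unflagged_rate


# Hypothetical counts: 150 of 200 flagged students churned (75%),
# versus 100 of 400 unflagged students (25%).
multiplier = relative_risk(150, 200, 100, 400)
print(f"Flagged students churn {multiplier:.1f}x as often as unflagged students")
```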
Four algorithms were trained and evaluated on an 80/20 split with 5-fold cross-validation. XGBoost won on every metric that matters for a production churn system.
| Algorithm | Accuracy | AUC | Recall | Why chosen / rejected |
|---|---|---|---|---|
| Logistic Regression | 82.1% | 0.891 | 78% | Too simple — misses non-linear patterns |
| Random Forest | 88.4% | 0.951 | 86% | Good, but slower and less interpretable |
| Neural Network | 87.9% | 0.944 | 84% | Black box — can't explain WHY to retention team |
| XGBoost | 90.3% | 0.978 | 92% | ✓ Best accuracy + interpretable feature importance |
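```python
from pathlib import Path

import joblib
from sklearn.model_selection import cross_val_score, train_test_split
from xgboost import XGBClassifier

# X (engineered features) and y (churn labels) are assumed to be prepared already.
# 80/20 hold-out split; random_state is an assumption added for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = XGBClassifier(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    random_state=42,
    eval_metric='logloss',
)
model.fit(X_train, y_train)

# 5-fold CV (refits a fresh copy of the model on each fold of the full dataset)
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
print(f"CV Accuracy: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")
# Output: CV Accuracy: 0.901 ± 0.012

# Export for deployment (create the target folder if it does not exist yet)
Path('models').mkdir(exist_ok=True)
joblib.dump(model, 'models/churn_prediction_model.pkl')
```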
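The comparison table reports AUC and recall, while the training code only prints cross-validated accuracy. Below is a minimal sketch of how those hold-out metrics could be computed with scikit-learn. It is not the original evaluation code: the data is synthetic (generated to resemble 1,701 rows with roughly 28% churners), `GradientBoostingClassifier` is a stand-in for `XGBClassifier` so the example needs only scikit-learn, and the stratified split is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1,701 rows, roughly 28% positives (churners).
X, y = make_classification(
    n_samples=1701, n_features=10, weights=[0.72, 0.28], random_state=42
)

# 80/20 split; stratify keeps the churn rate similar in both parts (assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Stand-in for XGBClassifier; both expose fit, predict, and predict_proba.
model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

churn_proba = model.predict_proba(X_test)[:, 1]  # P(churn) for each test row
churn_pred = model.predict(X_test)               # predicted class labels

auc = roc_auc_score(y_test, churn_proba)   # ranking quality, threshold-free
recall = recall_score(y_test, churn_pred)  # share of true churners caught
tn, fp, fn, tp = confusion_matrix(y_test, churn_pred).ravel()

print(f"test rows: {len(y_test)}")
print(f"AUC: {auc:.3f}  recall: {recall:.3f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

AUC is computed from predicted probabilities, not hard labels, which is why `predict_proba` is used for it and `predict` for recall.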
On the 341-sample test set (20% hold-out, ~28% churn rate), the confusion matrix shows 237 true negatives, 85 true positives, 8 false negatives, and 11 false positives. The feature importance chart then shows which behaviors predict churn and when to intervene.
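The standard metrics can be derived directly from the four confusion-matrix counts. The short calculation below uses only the counts stated above. They imply a test-set accuracy of about 94.4% and a recall of about 91.4%, so the headline 90.3% accuracy presumably comes from a different evaluation run, such as the cross-validation; the source does not say which.

```python
# Counts from the reported confusion matrix.
tn, tp, fn, fp = 237, 85, 8, 11

total = tn + tp + fn + fp        # all test samples
actual_churners = tp + fn        # students who really churned

accuracy = (tp + tn) / total     # share of all predictions that were right
recall = tp / actual_churners    # share of churners the model caught
precision = tp / (tp + fp)       # share of churn alerts that were correct
specificity = tn / (tn + fp)     # share of retained students left unflagged
churn_rate = actual_churners / total

print(f"samples:     {total}")
print(f"churn rate:  {churn_rate:.1%}")
print(f"accuracy:    {accuracy:.1%}")
print(f"recall:      {recall:.1%}")
print(f"precision:   {precision:.1%}")
print(f"specificity: {specificity:.1%}")
```

For a retention team, recall and precision are the practical pair: 8 churners were missed, and 11 of the 96 alerts went to students who stayed.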
A model that sits in a notebook helps nobody. Here's how the exported pipeline would be deployed as an operational churn prevention system.
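As one possible shape for the scoring step, the sketch below turns churn probabilities into a ranked list with risk tiers that a retention team could work through. Everything in it is an assumption, not the author's pipeline: `StubModel` is a hypothetical stand-in that mimics the `predict_proba` interface of the exported classifier, and the tier thresholds (0.7 and 0.4) are illustrative. In a real deployment the stub would be replaced by `joblib.load('models/churn_prediction_model.pkl')`.

```python
class StubModel:
    """Hypothetical stand-in for the exported classifier.

    Mimics predict_proba by returning [P(stay), P(churn)] per row. Here the
    churn probability is derived from the first feature only, for demonstration.
    """

    def predict_proba(self, rows):
        out = []
        for row in rows:
            p_churn = min(max(1.0 - row[0] / 10.0, 0.0), 1.0)
            out.append([1.0 - p_churn, p_churn])
        return out


def risk_tier(p_churn, high=0.7, medium=0.4):
    """Map a churn probability to a tier; thresholds are illustrative."""
    if p_churn >= high:
        return "high"
    if p_churn >= medium:
        return "medium"
    return "low"


def score_students(model, student_ids, rows):
    """Score a batch and return students sorted from highest to lowest risk."""
    report = []
    for student_id, pair in zip(student_ids, model.predict_proba(rows)):
        p_churn = float(pair[1])  # second column is P(churn)
        report.append({
            "student_id": student_id,
            "churn_probability": round(p_churn, 3),
            "tier": risk_tier(p_churn),
        })
    return sorted(report, key=lambda r: r["churn_probability"], reverse=True)


ids = ["s1", "s2", "s3"]
rows = [[9.0], [5.0], [1.0]]  # made-up single-feature rows
report = score_students(StubModel(), ids, rows)
for entry in report:
    print(entry)
```

Iterating over the rows of `predict_proba` rather than slicing a column keeps the function usable with both plain lists and NumPy arrays.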
I build ML models that identify at-risk users early — with clear explanations your team can act on.