Executive Summary
Headline attrition metrics and top risk signals
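Overall attrition stands at 16.1% across 1,470 employees (logistic AUC = 0.848). Among predictors not affected by separation, Job Role: Sales Representative has the largest logistic odds ratio (OR = 6.23), although its 95% CI (0.66–69.07) includes 1. The strongest statistically significant predictor is Overtime: Yes (OR = 5.97, p < 0.001). XGBoost ranks Monthly Income as the highest-gain feature.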
Attrition Rate by Segment
Attrition headcount and rate across department, overtime, travel, and job level
| Segment | Segment Value | Attrited | Headcount | Attrition Rate (%) |
|---|---|---|---|---|
| Business Travel | Travel_Frequently | 69 | 277 | 24.91 |
| Business Travel | Travel_Rarely | 156 | 1043 | 14.96 |
| Business Travel | Non-Travel | 12 | 150 | 8.00 |
| Department | Sales | 92 | 446 | 20.63 |
| Department | Human Resources | 12 | 63 | 19.05 |
| Department | Research & Development | 133 | 961 | 13.84 |
| Job Level | Level 1 | 143 | 543 | 26.34 |
| Job Level | Level 3 | 32 | 218 | 14.68 |
| Job Level | Level 2 | 52 | 534 | 9.74 |
| Job Level | Level 5 | 5 | 69 | 7.25 |
| Job Level | Level 4 | 5 | 106 | 4.72 |
| Overtime | Yes | 127 | 416 | 30.53 |
| Overtime | No | 110 | 1054 | 10.44 |
Attrition rates are broken down by department, overtime status, business travel frequency, and job level. The highest-attrition segment is employees who work overtime (Overtime: Yes), at 30.5% (127 attritions among 416 employees). Groups with fewer than 5 employees are excluded.
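The rates in the segment table follow directly from the attrited and headcount columns. The sketch below reproduces a few of them; the counts are copied from the table, while the names `segments`, `MIN_HEADCOUNT`, and `attrition_rate_pct` are illustrative and not taken from the report's own pipeline.

```python
# Counts copied from the segment table: (attrited, headcount).
segments = {
    ("Overtime", "Yes"): (127, 416),
    ("Overtime", "No"): (110, 1054),
    ("Business Travel", "Travel_Frequently"): (69, 277),
    ("Department", "Sales"): (92, 446),
}

# Assumed to mirror the report's rule that groups under 5 employees are excluded.
MIN_HEADCOUNT = 5


def attrition_rate_pct(attrited: int, headcount: int) -> float:
    """Attrition rate as a percentage, rounded to two decimals."""
    return round(100 * attrited / headcount, 2)


# Keep only groups that meet the minimum size, then find the highest rate.
rates = {
    key: attrition_rate_pct(attrited, headcount)
    for key, (attrited, headcount) in segments.items()
    if headcount >= MIN_HEADCOUNT
}
highest = max(rates, key=rates.get)
print(highest, rates[highest])
```

Applying the size filter before taking the maximum keeps very small groups from dominating the ranking with unstable rates.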
Attrition Rate by Job Role
Percentage of employees who left, broken down by job role
Sales Representative has the highest attrition rate at 39.8%, while Research Director has the lowest at 2.5%. Across all 9 roles shown, the median attrition rate is 16.1%. Roles with fewer than 5 employees are excluded.
Logistic Regression: Odds Ratios
Top 10 predictors by effect size with 95% confidence intervals
The chart shows the top 10 predictors by absolute log-odds magnitude. Job Role: Sales Representative has the highest odds ratio (OR = 6.23, 95% CI: 0.66–69.07), meaning the odds of attrition for employees in this role are estimated at 6.2 times those of the reference role, holding the other predictors constant. Because the interval includes 1, this estimate is not statistically significant (p = 0.112). 7 of 10 shown predictors are associated with increased attrition (OR > 1). Predictors with extreme or infinite ORs (separation) are excluded.
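An odds ratio multiplies odds, not probabilities, so an OR of 6.23 does not translate into a 6.23-fold higher attrition probability. The sketch below makes the conversion explicit. It assumes, purely for illustration, that the overall 16.1% attrition rate is the baseline probability; the fitted model's actual baseline depends on the reference categories and the other covariates, and `apply_odds_ratio` is a hypothetical helper.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


baseline = 0.161   # illustrative assumption: overall attrition rate as baseline
odds_ratio = 6.23  # point estimate for Job Role: Sales Representative

implied_prob = apply_odds_ratio(baseline, odds_ratio)
risk_ratio = implied_prob / baseline

print(f"implied probability: {implied_prob:.3f}")
print(f"risk ratio: {risk_ratio:.2f}")
```

Under this assumed baseline the implied probability is about 0.54, a risk ratio of roughly 3.4 rather than 6.2. The gap between the odds ratio and the risk ratio widens as the baseline probability grows.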
Logistic Regression: Full Coefficient Table
Every predictor with odds ratio, 95% CI, and p-value
| Predictor | Odds Ratio | CI Lower | CI Upper | P Value |
|---|---|---|---|---|
| Age | 0.9626 | 0.938 | 0.9869 | 0.0033 |
| Department: Research & Development | 209340.8217 | 1.997e+86 | 6.578e+74 | 0.9744 |
| Department: Sales | 332185.6684 | 4.253e+60 | 4.978e+73 | 0.9734 |
| Job Role: Human Resources | 875381.3986 | 0 | — | 0.9714 |
| Job Role: Laboratory Technician | 5.2511 | 2.129 | 14.0957 | 5.416e-04 |
| Job Role: Manager | 0.8666 | 0.1386 | 4.5177 | 0.8697 |
| Job Role: Manufacturing Director | 1.2691 | 0.4522 | 3.627 | 0.6501 |
| Job Role: Research Director | 0.2867 | 0.0343 | 1.656 | 0.1913 |
| Job Role: Research Scientist | 2.1032 | 0.8343 | 5.7155 | 0.1276 |
| Job Role: Sales Executive | 2.2519 | 0.2626 | 22.9255 | 0.4605 |
| Job Role: Sales Representative | 6.2317 | 0.6588 | 69.0678 | 0.112 |
| Job Level | 0.9769 | 0.5398 | 1.7632 | 0.9383 |
| Monthly Income | 1 | 0.9999 | 1.0002 | 0.7274 |
| Years at Company | 1.0906 | 1.0132 | 1.1725 | 0.0195 |
| Years in Current Role | 0.874 | 0.802 | 0.9516 | 0.002 |
| Overtime: Yes | 5.9684 | 4.2041 | 8.5536 | 5.564e-23 |
| Business Travel: Travel Frequently | 5.4407 | 2.6411 | 12.0076 | 1.041e-05 |
| Business Travel: Travel Rarely | 2.4155 | 1.244 | 5.0693 | 0.0133 |
| Job Satisfaction | 0.6825 | 0.5854 | 0.7937 | 8.423e-07 |
| Environment Satisfaction | 0.6656 | 0.5696 | 0.7756 | 2.305e-07 |
| Work Life Balance | 0.7372 | 0.585 | 0.9285 | 0.0096 |
| Distance from Home | 1.0406 | 1.0198 | 1.0619 | 1.137e-04 |
| Total Working Years | 0.9443 | 0.8924 | 0.9971 | 0.0424 |
| Number of Companies Worked | 1.1829 | 1.1016 | 1.2699 | 3.541e-06 |
| Years Since Last Promotion | 1.1785 | 1.0897 | 1.2769 | 4.790e-05 |
| Stock Option Level | 0.5627 | 0.4482 | 0.6989 | 3.755e-07 |
| Training Times Last Year | 0.8432 | 0.734 | 0.9653 | 0.0146 |
| Years with Curr Manager | 0.8701 | 0.7972 | 0.9503 | 0.0019 |
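Full logistic regression results for all 28 predictors (after dummy coding), of which 17 are statistically significant at p < 0.05. Odds ratios and CIs are exponentiated from log-odds. NA/Inf values (shown as —), zero bounds, and extremely large estimates indicate near-complete separation and should be interpreted with caution. This affects both Department rows, whose printed lower bounds exceed their upper bounds, and Job Role: Human Resources, whose interval starts at 0 and has no finite upper bound.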
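As a minimal sketch of how such a table can be produced and screened, the code below exponentiates a log-odds coefficient into an odds ratio with a Wald-type interval and flags rows whose intervals look degenerate. The report does not state which interval method was used, so the Wald form is an assumption; `odds_ratio_with_ci`, `is_separation_suspect`, and the `max_width_ratio` threshold are hypothetical. The row values are copied from the table.

```python
import math
from typing import Optional


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Exponentiate a log-odds coefficient and its Wald interval (assumed method)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def is_separation_suspect(lower: float, upper: Optional[float],
                          max_width_ratio: float = 1e6) -> bool:
    """Flag intervals that are missing, inverted, zero-bounded, or implausibly wide."""
    if upper is None or not math.isfinite(upper):
        return True
    if lower <= 0 or lower > upper:
        return True
    return upper / lower > max_width_ratio


# (odds ratio, CI lower, CI upper) as printed in the coefficient table.
rows = {
    "Overtime: Yes": (5.9684, 4.2041, 8.5536),
    "Job Role: Sales Representative": (6.2317, 0.6588, 69.0678),
    "Department: Research & Development": (209340.8217, 1.997e86, 6.578e74),
    "Job Role: Human Resources": (875381.3986, 0.0, None),
}

flagged = [name for name, (_, lo, hi) in rows.items()
           if is_separation_suspect(lo, hi)]
# A predictor is significant at the 5% level when its interval excludes 1.
excludes_one = [name for name, (_, lo, hi) in rows.items()
                if name not in flagged and (lo > 1 or hi < 1)]
print(flagged)
print(excludes_one)
```

Screening for degenerate intervals before ranking predictors avoids presenting separated coefficients as the strongest effects.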
XGBoost SHAP Feature Importance
Gain-based feature importance ranking from XGBoost (SHAP proxy)
XGBoost gain-based feature importance across the top 10 drivers, used here as a proxy for mean absolute SHAP values. Monthly Income contributes the most gain to the model's attrition predictions, suggesting it provides the largest non-linear discrimination between employees who leave and those who stay. Unlike logistic odds ratios, gain rankings reflect non-linear and interaction effects but carry no direction, which may explain why Monthly Income ranks first here while its logistic odds ratio is 1.00 (p = 0.727).
Overtime × Department Attrition Heatmap
Attrition rate for each combination of overtime status and department
The overtime × department interaction reveals where workload and organisational context combine to elevate attrition risk. The highest-risk cell is employees who work overtime (Overtime: Yes) in Sales, at 37.5% attrition. Cells with fewer than 5 employees are excluded from this view.
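The heatmap values are cell-level attrition rates with a minimum cell size. The sketch below shows one way to compute them. The records are toy data invented for illustration and do not come from the report; `attrition_by_cell` and `MIN_CELL_SIZE` are hypothetical names.

```python
from collections import defaultdict

# Assumed to mirror the report's rule that cells under 5 employees are excluded.
MIN_CELL_SIZE = 5


def attrition_by_cell(records, min_cell_size=MIN_CELL_SIZE):
    """Attrition rate (%) per (overtime, department) cell, dropping small cells."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [attrited, headcount]
    for overtime, department, left in records:
        cell = counts[(overtime, department)]
        cell[0] += int(left)
        cell[1] += 1
    return {
        key: round(100 * attrited / headcount, 1)
        for key, (attrited, headcount) in counts.items()
        if headcount >= min_cell_size
    }


# Toy records (overtime, department, left_company); NOT the report's data.
records = (
    [("Yes", "Sales", True)] * 2 + [("Yes", "Sales", False)] * 6
    + [("No", "Sales", True)] * 1 + [("No", "Sales", False)] * 9
    + [("Yes", "Human Resources", True)] * 2  # 2 rows only: below the threshold
)

cell_rates = attrition_by_cell(records)
print(cell_rates)
```

The two-row Human Resources cell is dropped even though every record in it is an attrition, which is the behaviour the exclusion rule is meant to enforce.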
Employee Retention Survival Curve
Kaplan-Meier probability of remaining at the company over years of tenure
The Kaplan-Meier survival curve shows employee retention probability over tenure. By year 5, an estimated 87.3% of employees are still with the company; after 33 years, an estimated 51.0% remain. Steep early drops indicate that the first few years carry the highest attrition risk.
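For reference, the Kaplan-Meier estimate multiplies, at each tenure where someone leaves, the fraction of at-risk employees who stay. The sketch below is a minimal pure-Python version under the usual setup, which is assumed here rather than confirmed by the report: the event is leaving the company, and current employees are censored at their present tenure. The sample data is a toy example.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    durations: tenure in years for each employee
    events:    True if the employee left at that tenure, False if censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        leavers = sum(1 for d, e in zip(durations, events) if d == t and e)
        removed = sum(1 for d in durations if d == t)
        if leavers:
            survival *= 1 - leavers / at_risk
            curve.append((t, survival))
        # Leavers and censored employees both exit the risk set after time t.
        at_risk -= removed
    return curve


# Toy data; NOT the report's data.
durations = [1, 2, 2, 3, 5, 5]
events = [True, True, False, True, False, False]

km_curve = kaplan_meier(durations, events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Censored employees still count in the risk set up to their last observed tenure, which is why the estimate differs from a simple share of leavers.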
Model Discrimination: ROC Curve
Logistic regression ROC curve (AUC = 0.848)
The ROC curve summarises the logistic regression model's ability to distinguish employees who will attrite from those who will not. AUC = 0.848 indicates good discriminative performance. A diagonal line would represent random guessing (AUC = 0.5); the further the curve bows toward the top-left, the better the model.
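AUC can also be read as the probability that a randomly chosen employee who left receives a higher predicted score than a randomly chosen employee who stayed. The sketch below computes it from that pairwise definition, counting ties as half. The labels and scores are toy values rather than model output, and `roc_auc` is a hypothetical helper, not the function used to produce the reported 0.848.

```python
def roc_auc(labels, scores):
    """AUC as the share of (leaver, stayer) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example; NOT the report's predictions. 1 = left, 0 = stayed.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise loop is quadratic in the number of employees, which is acceptable for a dataset of this size; a rank-based formula gives the same value for larger data.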