Multiple Linear Regression: Predicting Student Loan Debt Based on Socioeconomic Factors
Background
The rising cost of higher education has increased reliance on student loans in the U.S. Some students accumulate substantially more debt than others, which leads to disparities in financial burden after graduation. This study will use multiple linear regression to analyze the factors associated with student loan debt levels.
Possible Research Questions
- Which socioeconomic factors (e.g., parental income, tuition costs, major choice, school type) are significantly associated with total student loan debt at graduation?
- Is employment during college associated with lower overall debt, after controlling for the other factors?
- Do in-state and out-of-state students differ significantly in loan amounts?
Possible Data Sources
Key Variables
Dependent Variable: Total student loan debt at graduation (continuous)
Independent Variables: Family income, tuition costs, financial aid received, major category, institution type, employment status, residency status (in-state vs. out-of-state), etc.
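One way the main-effects model could be written is sketched below. The symbol names are assumptions made for illustration: categorical predictors (major category, institution type) are assumed to enter as dummy indicators with one reference level omitted, and employment status is assumed to be a binary indicator.

```latex
\text{Debt}_i = \beta_0
  + \beta_1\,\text{Income}_i
  + \beta_2\,\text{Tuition}_i
  + \beta_3\,\text{Aid}_i
  + \beta_4\,\text{Employed}_i
  + \sum_{j=1}^{J-1} \gamma_j\,\text{Major}_{ij}
  + \sum_{k=1}^{K-1} \delta_k\,\text{InstType}_{ik}
  + \varepsilon_i
```

Here $i$ indexes students, $J$ and $K$ are the numbers of major categories and institution types, and $\varepsilon_i$ is the error term, assumed to have mean zero and constant variance.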
Methods
- Perform multiple linear regression to predict total student loan debt.
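- Check for multicollinearity (e.g., variance inflation factors), heteroscedasticity (e.g., a Breusch-Pagan test), and non-normal residuals (e.g., a Q-Q plot or Jarque-Bera test).
- Compare models with and without interaction terms (e.g., with a nested-model F-test or adjusted R-squared).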
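The steps listed under Methods can be sketched in Python with NumPy alone. This is a minimal illustration, not the study's implementation. The data are simulated, and all variable names, coefficients, and units are assumptions. The heteroscedasticity check uses the studentized (Koenker) form of the Breusch-Pagan statistic, and the normality check uses the Jarque-Bera statistic. Only one interaction (tuition by employment) is shown.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit. X must already contain an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, resid, ss_res, 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    """Adjusted R-squared; p counts all columns of X, intercept included."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)

def vif(X):
    """Variance inflation factor for each non-intercept column (column 0 is the intercept)."""
    factors = []
    for j in range(1, X.shape[1]):
        others = np.delete(X, j, axis=1)
        r2_j = ols(others, X[:, j])[3]
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)

def breusch_pagan_lm(X, resid):
    """Koenker's studentized LM statistic: n * R^2 of squared residuals regressed on X."""
    r2_aux = ols(X, resid ** 2)[3]
    return len(resid) * r2_aux

def jarque_bera(resid):
    """Jarque-Bera statistic and its chi-square(2) p-value."""
    n = len(resid)
    z = resid - resid.mean()
    s2 = (z ** 2).mean()
    skew = (z ** 3).mean() / s2 ** 1.5
    kurt = (z ** 4).mean() / s2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, float(np.exp(-jb / 2.0))  # chi-square(2) survival function is exp(-x/2)

# --- Simulated data (hypothetical; amounts in thousands of dollars) ---
rng = np.random.default_rng(0)
n = 500
income = rng.normal(70, 25, n)                  # family income
tuition = rng.normal(30, 10, n)                 # tuition costs
aid = rng.normal(10, 4, n)                      # financial aid received
employed = rng.integers(0, 2, n).astype(float)  # 1 = worked during college
debt = (40 + 0.9 * tuition - 0.15 * income - 0.8 * aid
        - 3.0 * employed - 0.4 * tuition * employed + rng.normal(0, 5, n))

# --- Main-effects model ---
X_main = np.column_stack([np.ones(n), income, tuition, aid, employed])
beta_main, resid_main, ssr_main, r2_main = ols(X_main, debt)

# --- Diagnostics on the main-effects model ---
vifs = vif(X_main)
bp_lm = breusch_pagan_lm(X_main, resid_main)
jb_stat, jb_p = jarque_bera(resid_main)

# --- Model with a tuition-by-employment interaction ---
X_int = np.column_stack([X_main, tuition * employed])
beta_int, resid_int, ssr_int, r2_int = ols(X_int, debt)

# --- Nested-model comparison ---
adj_main = adjusted_r2(r2_main, n, X_main.shape[1])
adj_int = adjusted_r2(r2_int, n, X_int.shape[1])
q = X_int.shape[1] - X_main.shape[1]  # number of added terms
f_stat = ((ssr_main - ssr_int) / q) / (ssr_int / (n - X_int.shape[1]))

print("VIFs:", np.round(vifs, 3))
print("Breusch-Pagan LM:", round(bp_lm, 3))
print("Jarque-Bera:", round(jb_stat, 3), "p =", round(jb_p, 4))
print("Adjusted R^2 (main, interaction):", round(adj_main, 4), round(adj_int, 4))
print("F statistic for the interaction term:", round(f_stat, 2))
```

VIF values near 1 suggest little multicollinearity, and values above roughly 5 to 10 are a common warning sign. The Breusch-Pagan statistic is compared against a chi-square distribution with degrees of freedom equal to the number of non-intercept predictors. With real data, a library such as statsmodels would also supply standard errors and p-values for each coefficient.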