The Analysis Plan is your roadmap for the Final Project. Its purpose is to:
The Analysis Plan accounts for 30% of your final grade and sets the foundation for your entire project.
A literature review is a standard section in academic research papers. It summarizes the research already done on your topic, which we expect you to find in peer-reviewed journals. Whenever you're doing research, your aim is to understand how the data and the questions of interest fit into the world's existing body of knowledge.
Discuss the problem's origin
Discuss how the problem has progressed from its origin to the present
Discuss the impact of the problem and the good that could come from solving an aspect of it
Discuss 2-3 previous analytical attempts to understand this problem (NOT public policy measures). Focus on research studies that used data analysis, statistical modeling, or machine learning to investigate the problem. For each analytical approach, describe its merits and shortcomings.
You must have at least 20 peer-reviewed journal articles, each cited at least once in the Literature Review section. Only peer-reviewed journal articles count toward the 20-source requirement - books, white papers, websites, and other sources may be cited if useful but do NOT count toward the 20. Datasets should also be cited but do NOT count toward the 20-source requirement - clearly demarcate dataset citations (e.g., in a separate 'Data Sources' section of your references) so they can be easily distinguished from literature sources. Your literature review should be densely cited: if you're not familiar with literature reviews, find a few examples and emulate them.
In this section, you are essentially saying: we have decided to use these N modeling approaches, and here's why.
Each student must carry out their own individual analytical approach. Each approach must include all of the steps we would expect given what you've learned in the class (e.g., you perform X checks before and after your models, as shown in the class). Team GitHub repositories will be created after Analysis Plan submission, as some groups may change due to students dropping the course.
Name each approach with a subheader (e.g., Approach #1: Poisson Regression). In 2-3 paragraphs following the subheader, detail:
Describe why you think your approaches might succeed where previous attempts failed.
Describe the challenges you anticipate your approaches may encounter.
Describe at least 2 other approaches you considered and why you chose not to pursue them.
You are essentially a team of individual data-analysis superheroes coming together to do something awesome.
This appendix must report two types of contributions: (a) each individual's contributions to the shared work of editing, assembling, writing the joint report, and reviewing each other's analyses, and (b) attribution of each individual analysis approach to its respective group member. Groups must note any individual who does not contribute at least 80% of what is expected to the shared work. This appendix is required for all Analysis Plans.