ISYE 6414

Code Deliverable

Overview

You're meant to turn in all of your work in a reproducible form. We need to be able to run your code and get your exact results.

If you used downloaded files and they're reasonably sized, upload them with your submission. If they're huge, provide a link to them along with clear instructions for downloading.
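One way to make a linked download verifiable is to list a checksum next to the download instructions, so a grader can confirm they fetched the same file you used. The following is a minimal sketch of that idea, not a course requirement; the function names `sha256_of` and `verify_download` are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so very large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Check a downloaded file against the checksum given in the instructions."""
    return sha256_of(path) == expected_sha256.lower()
```

With something like this in place, the download instructions could state the expected digest, and the first cell of the notebook could call `verify_download` before any cleaning runs.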

Important: Code accounts for 20% of your final grade, and that grade is assigned individually.

Your code submission is a critical component of your project and will be graded individually for each student. We need to be able to understand and reproduce your analysis.

Deliverables

1. Data Loading and Cleaning Code

Code to load, clean, and process your datasets. We should be able to run these steps and get your exact cleaned data. Include all data preprocessing, missing value handling, and feature engineering.

Team vs. Individual Work:

Initial data cleaning and processing can be shared team work (placed in repository root). If you work with additional datasets for your individual analysis, place that cleaning code in your personal folder.
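As an illustration of what a rerunnable cleaning step might look like, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example: the inlined `RAW_CSV` data, the column names, and the choice of median imputation. A real project would read its own files and would likely use a data-frame library instead.

```python
import csv
import io
from statistics import median

# Hypothetical raw data, inlined so the sketch runs on its own.
# A real project would read the file submitted with the code.
RAW_CSV = """id,price,sqft
1,250000,1400
2,,1600
3,310000,
3,310000,
4,275000,1500
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Deduplicate, handle missing values, and add one engineered feature."""
    # 1. Drop exact duplicate rows, keeping the first occurrence.
    seen, deduped = set(), []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            deduped.append(row)

    # 2. Drop rows with a missing response (price).
    kept = [r for r in deduped if r["price"]]

    # 3. Impute missing sqft with the median of the observed values.
    fill = median(float(r["sqft"]) for r in kept if r["sqft"])

    cleaned = []
    for r in kept:
        price = float(r["price"])
        sqft = float(r["sqft"]) if r["sqft"] else fill
        # 4. Feature engineering: price per square foot.
        cleaned.append({"id": int(r["id"]), "price": price,
                        "sqft": sqft, "price_per_sqft": price / sqft})
    return cleaned


cleaned = clean_rows(load_rows(RAW_CSV))
```

Because every step is a deterministic function of the raw input, rerunning the script from the raw file always yields the same cleaned data.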

2. Data Integration Code

Since you're combining multiple data sources, provide the code used to join/merge your datasets. This is crucial for understanding your data integration approach.

Team vs. Individual Work:

Basic data integration can be shared team work (placed in repository root). Individual analyses may require additional data merging specific to each approach.
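A merge step is easier to review when it reports what failed to match rather than silently dropping rows. The sketch below shows one possible shape for that, assuming two hypothetical tables keyed on a made-up `county_id` column. The function name `inner_join_with_report` is likewise invented for the example.

```python
def inner_join_with_report(left, right, key):
    """Inner-join two lists of dicts on `key`.

    Returns the merged rows plus the left-side keys that found no match,
    so dropped records are visible instead of silent.
    """
    lookup = {}
    for row in right:
        if row[key] in lookup:
            # A duplicated key on the right would silently multiply or
            # overwrite rows, so fail loudly instead.
            raise ValueError(f"duplicate key in right table: {row[key]!r}")
        lookup[row[key]] = row

    merged, unmatched = [], []
    for row in left:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            merged.append({**row, **match})
    return merged, unmatched


# Hypothetical example tables.
sales = [
    {"county_id": "A", "median_price": 250000},
    {"county_id": "B", "median_price": 310000},
    {"county_id": "C", "median_price": 275000},
]
demographics = [
    {"county_id": "A", "population": 12000},
    {"county_id": "B", "population": 48000},
]

merged, unmatched = inner_join_with_report(sales, demographics, "county_id")
```

Here `unmatched` contains `"C"`, which tells a reader exactly which record was lost in the join and gives you something concrete to document.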

3. Analysis Code with Seeds

All analytical code, including model fitting, statistical tests, and validation. Set a seed wherever your code involves randomness so that we can reproduce your results exactly. Include:

Exploratory Data Analysis (EDA) code
Outlier screening and handling procedures
Variable selection procedures
Goodness of Fit testing and model assumption checks
Model training and evaluation code
Statistical analysis and hypothesis testing
Cross-validation and model selection procedures

Individual Work Required:

Each student must perform their own complete end-to-end data analysis with their chosen modeling approach. This code must be in your personal folder.
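To illustrate what seeding buys you, the sketch below builds cross-validation folds from a fixed seed using only the Python standard library. The seed value and the function name `k_fold_indices` are arbitrary choices for the example. Statistical libraries have their own seeding mechanisms (for example `set.seed()` in R, `numpy.random.default_rng(seed)` in NumPy, or the `random_state` argument in scikit-learn), and whichever one your code relies on is the one that needs to be set.

```python
import random

SEED = 6414  # arbitrary; any fixed integer works, as long as it is recorded


def k_fold_indices(n_rows, k, seed):
    """Split row indices 0..n_rows-1 into k shuffled folds, reproducibly."""
    # A local generator keeps the result independent of any other code
    # that touches the global random state.
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    return [indices[i::k] for i in range(k)]


folds_first_run = k_fold_indices(10, 3, SEED)
folds_second_run = k_fold_indices(10, 3, SEED)
```

Both calls return identical folds, so any model fit or validation score computed from them can be reproduced by someone else running the same code.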

4. Graphics Generation Code

Code used to generate all graphics, tables, and visualizations in your report. Your Final Report must meet all graphics requirements: exactly 5 graphics (any combination of tables and data visualizations such as charts and graphs), of which at least 1-2 should be impressive "show-stoppers" with excellent visual design, coloring, and labeling. Use seeds where appropriate so that the graphics are reproducible.

Team vs. Individual Work:

Graphics specific to individual analyses go in personal folders. Overarching graphics that don't stem from a single analysis (e.g., dataset overview charts) can be shared team work in the repository root.

5. Team Contributions Documentation

Include documentation noting any team member who does not contribute at least 80% of what is expected for the code deliverable. This can be a simple text file or comment block in your main notebook.

Format