GWU

CSCI 4364/6364

Introduction to Machine Learning, Spring 2022

GWU Computer Science


Homework 2 (due 1/26 at 11:59pm)

As homework, you will complete and submit the in-class exercises from lecture. For this homework, you will be modifying your Homework 1 submission.

Please copy each line of the grading rubric (including its number) into a Markdown cell in your Jupyter notebook, alongside the cell that completes it, so we don't miss anything during grading :-)

GRADING RUBRIC for Homework 2:

0. Correctly changes all features of the DataFrame to be numeric, using apply() or another method in your notebook (write Python code instead of doing it manually or dropping columns as in the first homework) (5 points)
1. Dataset is correctly split into train, validate, and holdout DataFrames (5 points)
2. DataFrame is correctly split into train and holdout (5 points)
3. A RandomForest model is correctly trained on the training data (no validation) and evaluated on the holdout (5 points)
4. Accuracy of the baseline RandomForest model is correctly calculated and reported on the holdout (5 points)
5. The RandomForest model is re-trained five times and tested on the validation set (reporting the best-performing accuracy) (5 points)
6. The best-performing RandomForest model is correctly tested and scored on the holdout dataset (5 points)
7. Discussion explaining your results and whether your best-trained model generalized (5 points)
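
The rubric steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: the toy dataset, the column names (f0..f5, color, label), the 60/20/20 split, the helper names to_numeric and fit_and_score, and the choice to vary max_depth across the five re-trainings are all hypothetical. The baseline here is trained on train plus validate together, which is one reading of items 2 and 3. Substitute the DataFrame from your own Homework 1 notebook.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data; replace with the DataFrame from your Homework 1 notebook.
X_toy, y_toy = make_classification(n_samples=300, n_features=6, random_state=0)
df = pd.DataFrame(X_toy, columns=[f"f{i}" for i in range(6)])
df["color"] = np.where(df["f0"] > 0, "red", "blue")  # a non-numeric feature
df["label"] = y_toy


# Rubric 0: make every feature numeric with apply() instead of dropping columns.
def to_numeric(col):
    if pd.api.types.is_numeric_dtype(col):
        return col
    return col.astype("category").cat.codes  # one integer code per category


features = df.drop(columns="label").apply(to_numeric)
labels = df["label"]

# Rubric 1-2: carve off the holdout first, then split the rest into train/validate.
X_rest, X_hold, y_rest, y_hold = train_test_split(
    features, labels, test_size=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)


def fit_and_score(X_fit, y_fit, X_eval, y_eval, **params):
    """Train a RandomForest and return it with its accuracy on the eval set."""
    model = RandomForestClassifier(random_state=0, **params)
    model.fit(X_fit, y_fit)
    return model, accuracy_score(y_eval, model.predict(X_eval))


# Rubric 3-4: baseline trained without a validation step, scored on the holdout.
_, baseline_acc = fit_and_score(X_rest, y_rest, X_hold, y_hold)

# Rubric 5: five re-trainings (here with different max_depth), scored on validation.
candidates = [
    fit_and_score(X_train, y_train, X_val, y_val, max_depth=depth)
    for depth in (2, 4, 6, 8, None)
]
best_model, best_val_acc = max(candidates, key=lambda pair: pair[1])

# Rubric 6: score the best validation model once on the untouched holdout.
best_holdout_acc = accuracy_score(y_hold, best_model.predict(X_hold))

print(f"baseline holdout accuracy:   {baseline_acc:.3f}")
print(f"best validation accuracy:    {best_val_acc:.3f}")
print(f"best model holdout accuracy: {best_holdout_acc:.3f}")
```

For the discussion in item 7, comparing the best validation accuracy with the same model's holdout accuracy is one way to judge generalization: a holdout score well below the validation score suggests the model selection overfit the validation set.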