Introduction to Machine Learning, Spring 2022
GWU Computer Science
As homework, you will complete and submit the in-class exercises from lecture. For this homework, you will be modifying your Homework 1 submission.
Please copy each line of the grading rubric (including its number) into a Markdown cell in your
Jupyter notebook next to the code cell that completes it, so we don't miss anything during grading :-)
GRADING RUBRIC for Homework 2:
0. Correctly changes all features of the DataFrame to be numeric, using apply() or another method in your notebook (write Python code instead of converting values manually or dropping columns as in the first homework) | 5 points |
1. Correctly splits the dataset into train, validate, and holdout DataFrames | 5 points |
2. | |
3. A RandomForest model is correctly trained on training data (no validation), and evaluated on the holdout | 5 points |
4. Accuracy of the baseline RandomForest model is correctly calculated and reported on the holdout | 5 points |
5. The RandomForest model is re-trained five times and tested on the validation set (reporting the best-performing accuracy) | 5 points |
6. The best performing RandomForest model is correctly tested and scored on the holdout dataset | 5 points |
7. Discussion explaining your results and whether your best-trained model generalized | 5 points |
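
The workflow behind rubric items 0, 1, and 3 through 6 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the expected solution. The toy DataFrame, its column names, the helper `to_numeric_codes`, the 60/20/20 split, and the five hyperparameter settings are all hypothetical. In particular, "re-trained five times" is read here as trying five different settings, which is one possible interpretation. The sketch assumes pandas and scikit-learn are installed.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data; in the homework you would use your Homework 1 dataset.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "color": rng.choice(["red", "green", "blue"], size=n),
    "member": rng.choice(["yes", "no"], size=n),
})
df["label"] = ((df["age"] > 45) & (df["member"] == "yes")).astype(int)


def to_numeric_codes(col):
    """Hypothetical helper: leave numeric columns alone, integer-encode the rest."""
    if pd.api.types.is_numeric_dtype(col):
        return col
    return col.astype("category").cat.codes


# Rubric 0: convert every feature to numeric with apply(), without dropping columns.
features = df.drop(columns="label").apply(to_numeric_codes)
labels = df["label"]

# Rubric 1: carve off the holdout first, then split the rest into train/validate.
# The 60/20/20 proportions are an assumption.
X_temp, X_holdout, y_temp, y_holdout = train_test_split(
    features, labels, test_size=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=0
)

# Rubric 3 and 4: baseline model trained on the training data only,
# then scored on the holdout.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline_acc = accuracy_score(y_holdout, baseline.predict(X_holdout))

# Rubric 5: re-train five times (here: five assumed settings) and compare
# the models on the validation set, keeping the best one.
candidates = [
    {"n_estimators": 10, "max_depth": 2},
    {"n_estimators": 50, "max_depth": 4},
    {"n_estimators": 100, "max_depth": 8},
    {"n_estimators": 200, "max_depth": None},
    {"n_estimators": 100, "max_depth": None, "min_samples_leaf": 5},
]
best_model, best_val_acc = None, -1.0
for params in candidates:
    model = RandomForestClassifier(random_state=0, **params).fit(X_train, y_train)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    if val_acc > best_val_acc:
        best_model, best_val_acc = model, val_acc

# Rubric 6: score the best validation model once on the holdout.
best_holdout_acc = accuracy_score(y_holdout, best_model.predict(X_holdout))

print(f"baseline holdout accuracy: {baseline_acc:.3f}")
print(f"best validation accuracy:  {best_val_acc:.3f}")
print(f"best model on holdout:     {best_holdout_acc:.3f}")
```

Comparing `best_val_acc` with `best_holdout_acc` is one way to ground the discussion in item 7: a large drop from validation to holdout would suggest the selected model did not generalize well.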