Introduction to Machine Learning, Fall 2021
GWU Computer Science
In this option, you will work on a single semester-long project on a topic and dataset of your choice. This option is meant for students in CS6364 and those with previous machine learning experience. The goal is to produce work that could form the foundation of an academic article or similar. Students selecting this option must submit the proposal below by 01/19 to be approved. Selected projects need to be complex. This assignment must be completed in teams of two or three; project complexity must scale with the number of team members specified.
Your project must make use of at least one deep learning model.
Note: please be mindful of where you host and source your data. We will not allow students to work on projects that contain objectionable and/or illegal material prohibited by the school's computing policy (which includes, but is not limited to, pornographic material). You and your team are solely responsible for ensuring that the data you are working with abides by GW's computing policy as well as the law.
Please fill out the sections below giving details of your chosen project. Then, review the grading rubric here and explain how your project will meet each existing requirement, or suggest an equivalent, alternative-in-spirit item for professor approval.
Please copy each line of the grading rubric (including its number) into a markdown element in your Jupyter notebook that matches the cell that completes it, so we don't miss anything during grading :-)
1. | Load data correctly and show contents in a cell | 2 points |
2. | Holdout dataset split as specified | 2 points |
3. | Correct explanation of how such a holdout split supports generalization. | 5 points |
4. | Printout of dataset distribution, including missing data. For imagery datasets, provide the "average image" for each class. For tabular data, use value_counts() and describe(). For textual data, show the distribution of your labels/targets. | 2 points |
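For tabular data, item 4 could look something like the sketch below. The DataFrame here is a toy stand-in, not your dataset; substitute your own loading step.

```python
import pandas as pd

# Hypothetical toy dataset; replace with your own load step
df = pd.DataFrame({
    "label": ["cat", "dog", "cat", "cat", None],
    "weight": [4.1, 9.8, 3.7, None, 5.0],
})

counts = df["label"].value_counts(dropna=False)  # class distribution, incl. missing labels
summary = df["weight"].describe()                # summary statistics for a numeric feature
missing = df.isna().sum()                        # missing values per column

print(counts, summary, missing, sep="\n")
```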
5. | Discussion of how the dataset distribution can/will affect your modeling. | 5 points |
6. | Handle any missing data. For imagery/text datasets, discuss what records/items you might drop and why. | 2 points |
7. | The holdout dataset also contains missing data/bad images/text. Discuss how you handled this in your holdout, or why it was not a problem for you. | 5 points |
8. | Discuss (and implement if applicable) whether or not you need to scale/normalize your features, and which ones, if any, for tabular data or imagery. For textual data, display the outputs of the word embeddings and discuss why they look the way they do. | 5 points |
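For item 8 on tabular data, one standard approach is z-score standardization fitted on the training split only, then applied unchanged to validation and holdout (to avoid leakage). A sketch with toy features:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training features on very different scales
X_train = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])

scaler = StandardScaler().fit(X_train)   # fit on training data only
X_scaled = scaler.transform(X_train)     # each column now has mean 0, std 1

# Later: scaler.transform(X_holdout) reuses the training statistics
```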
9. | If your dataset has categorical features: discuss and implement if you will encode them as ordinal numbers, or one-hot encode them, and why you chose to do so for each such feature. If you are using images/text, discuss whether you are performing classification or regression on your dataset and why (instead of the other one). | 5 points |
10. | Give an example of an ordinal feature that you've seen used by others when it should have been treated as categorical. | 2 points |
11. | For tabular data: Use a heatmap to show the correlation between all feature pairs. Discuss, if any, which features you would recommend dropping from your model. Also discuss why you would want to drop them (what is the expected benefit?). For imagery/text: Show a histogram of the distribution of pixels or word embeddings across your dataset. | 5 points |
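A minimal correlation-heatmap sketch for item 11, using synthetic features (one deliberately redundant) rather than your dataset; seaborn's heatmap would work equally well in place of imshow:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "x": x,
    "x_copy": x * 2 + 0.1,              # perfectly correlated with x - a drop candidate
    "noise": rng.normal(size=200),      # unrelated feature
})

corr = df.corr()  # pairwise Pearson correlations

fig, ax = plt.subplots()
im = ax.imshow(corr.values, vmin=-1, vmax=1, cmap="coolwarm")
ax.set_xticks(range(len(corr)), corr.columns)
ax.set_yticks(range(len(corr)), corr.columns)
fig.colorbar(im)
fig.savefig("corr_heatmap.png")
```

Highly correlated pairs (like x and x_copy above) carry redundant information; dropping one can simplify the model and reduce overfitting risk.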
12. | Discuss what feature you would engineer (and implement) if using tabular data, what customized dataset augmentation you would use (not required to implement) if images, or what non-standard pre-processing might help, if text | 5 points |
13. | Separate your training data into features and labels. | 2 points |
14. | Discuss and implement how you will handle any dataset imbalance. | 5 points |
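One common way to address item 14 is class weighting, which up-weights the loss contribution of rare classes. A sketch with toy imbalanced labels:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 90 + [1] * 10)  # toy 90/10 imbalance

# "balanced" weight = n_samples / (n_classes * count_per_class)
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]), y=y)
```

These weights can be passed to many scikit-learn estimators via `class_weight`, or to PyTorch's `CrossEntropyLoss(weight=...)`. Oversampling the minority class is a reasonable alternative.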
15. | Instantiate a model of your choosing. | 2 points |
16. | Define a grid to tune at least three different hyperparameters with at least two different values each. Discuss why you think these parameter values might be useful for this dataset. | 5 points |
17. | Set up a GridSearchCV with 5-fold cross validation (scikit-learn) or equivalent in PyTorch. Discuss what accuracy metric you chose and why. | 5 points |
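Items 16-18 together could look like the sketch below; the dataset, model, and parameter values are placeholders for your own choices:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)  # stand-in for your dataset

# Grid: three hyperparameters, two values each, as the rubric asks
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
    "min_samples_split": [2, 4],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross validation
    scoring="f1_macro",  # macro-F1 weights all classes equally; justify your own choice
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```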
18. | Train your model using grid search (or equivalent), and report the best performing hyperparameters. | 2 points |
19. | Calculate accuracy, precision and recall on the holdout dataset. Discuss which metric you think is most meaningful for this dataset, and why. | 5 points |
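The metrics in item 19 are one-liners with scikit-learn; the toy labels below stand in for your holdout labels and predictions:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [0, 0, 1, 1, 1]  # hypothetical holdout labels
y_pred = [0, 1, 1, 1, 0]  # hypothetical model predictions

acc = accuracy_score(y_true, y_pred)    # fraction of correct predictions
prec = precision_score(y_true, y_pred)  # of predicted positives, how many are right
rec = recall_score(y_true, y_pred)      # of actual positives, how many were found
```

For multi-class problems, pass `average="macro"` (or another averaging mode) to precision and recall.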
20. | Discuss how the model performance on holdout compares to the model performance during training. Do you think your model will generalize well? Why or why not? | 5 points |
21. | Generate a confusion matrix and discuss your results. | 5 points |
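A minimal confusion-matrix sketch for item 21, again with toy labels (scikit-learn's `ConfusionMatrixDisplay` can render it graphically):

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
```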
22. | Train and tune another type of model on your training dataset. Using the best performing hyperparameters, test this model on your holdout. How did it perform, compared to your earlier model? Do you think your results will generalize? | 5 points |
23. | Next, repeat training and tuning on the same data with a third model, dissimilar from the other two. Do you need to do any additional feature cleaning or scaling here? Why or why not? | 5 points |
24. | For images, define a list of image transformations to be used during training, passing them to transforms.Compose(). For text and tabular data, discuss what pre-processing you used. Discuss why you think these transformations might help. | 5 points |
25. | Repeat the step above for test and validation transformations. | 2 points |
26. | Correctly set up DataLoaders for the three folders (train, validation, holdout). Discuss what options you chose for these loaders, and why (including batch size, shuffling, and dropping last). | 5 points |
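A DataLoader sketch for item 26; the tensor dataset below is a toy stand-in for your three image folders (for which you would typically use `torchvision.datasets.ImageFolder`):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in datasets; in practice each split gets its own dataset object
ds = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))

# Shuffle training batches; drop the last partial batch for stable batch statistics
train_loader = DataLoader(ds, batch_size=16, shuffle=True, drop_last=True)

# Evaluation loaders: no shuffling needed, keep every sample
val_loader = DataLoader(ds, batch_size=16, shuffle=False)
holdout_loader = DataLoader(ds, batch_size=16, shuffle=False)

xb, yb = next(iter(val_loader))  # first validation batch
```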
27. | Instantiate any pre-trained model. Discuss why you chose it amongst the others. | 5 points |
28. | Write code to freeze/unfreeze the pretrained model layers. | 2 points |
29. | Replace the head of the model with sequential layer(s) to predict however many classes you need. | 2 points |
30. | What activation function did you use in the step above? Why? | 5 points |
31. | Did you use dropout in the step above? Why or why not? | 5 points |
32. | Did you use batch normalization in the step above? Why or why not? | 5 points |
33. | Choose and instantiate an optimizer. Discuss your choice. | 5 points |
34. | Choose and instantiate a loss function. Discuss your choice. | 5 points |
35. | Write code that places the model on the GPU if one exists, otherwise on the CPU. | 2 points |
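Item 35 is the standard device-selection idiom; the small linear layer here is just a stand-in for your model:

```python
import torch

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(4, 2).to(device)  # stand-in for your model
```

Remember to move each batch to the same device (`xb.to(device)`) inside your training loop.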
36. | Correctly set up your model to train over 20 epochs. | 2 points |
37. | Correctly set up your model to use your batches for training. | 2 points |
38. | Correctly make predictions with your model (the predictions can be wrong). | 2 points |
39. | Correctly choose a loss function and back-propagate its results. | 2 points |
40. | Use the optimizer correctly to update weights/gradients. | 2 points |
41. | Correctly record training losses for each epoch. | 2 points |
42. | Correctly set up validation at each epoch. | 2 points |
43. | Correctly record validation losses for each epoch. | 2 points |
44. | Correctly record training and validation accuracies for each epoch. | 2 points |
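Items 36-44 fit into one training loop. The sketch below uses a toy linear model and random data so it runs end to end, and trains for 5 epochs instead of the 20 the rubric asks for; swap in your model, loaders, and loss:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # toy setup so the loop is runnable
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 4), torch.randint(0, 2, (64,))), batch_size=16)
val_loader = DataLoader(
    TensorDataset(torch.randn(32, 4), torch.randint(0, 2, (32,))), batch_size=16)
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()

history = {"train_loss": [], "val_loss": [], "train_acc": [], "val_acc": []}
for epoch in range(5):  # use 20 epochs in your project
    model.train()
    total, correct, loss_sum = 0, 0, 0.0
    for xb, yb in train_loader:           # iterate over batches
        optimizer.zero_grad()
        logits = model(xb)                # make predictions
        loss = criterion(logits, yb)
        loss.backward()                   # back-propagate the loss
        optimizer.step()                  # update weights
        loss_sum += loss.item() * len(xb)
        correct += (logits.argmax(1) == yb).sum().item()
        total += len(xb)
    history["train_loss"].append(loss_sum / total)
    history["train_acc"].append(correct / total)

    model.eval()
    total, correct, loss_sum = 0, 0, 0.0
    with torch.no_grad():                 # no gradients during validation
        for xb, yb in val_loader:
            logits = model(xb)
            loss_sum += criterion(logits, yb).item() * len(xb)
            correct += (logits.argmax(1) == yb).sum().item()
            total += len(xb)
    history["val_loss"].append(loss_sum / total)
    history["val_acc"].append(correct / total)
```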
45. | Graph training versus validation loss using matplotlib.pyplot (or other). Was your model overfitting, underfitting, or neither? | 5 points |
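A plotting sketch for item 45; the loss values below are invented for illustration, so substitute the per-epoch lists you recorded:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Hypothetical per-epoch losses; use your recorded history instead
train_losses = [0.9, 0.6, 0.4, 0.3, 0.25]
val_losses = [0.95, 0.7, 0.55, 0.5, 0.52]

fig, ax = plt.subplots()
ax.plot(train_losses, label="train")
ax.plot(val_losses, label="validation")
ax.set_xlabel("epoch")
ax.set_ylabel("loss")
ax.legend()
fig.savefig("loss_curves.png")
```

Validation loss flattening (or rising) while training loss keeps falling is the classic signature of overfitting.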
46. | Make a list of reasons why your model may have under-performed. | 5 points |
47. | Make a list of ways you could improve your model performance (you don't have to implement these unless you want to). | 5 points |
48. | Graph training versus validation accuracy using matplotlib.pyplot (or other). Score your model on its predictions on the holdout. Discuss why you think your results will or will not generalize. | 5 points |
49. | Generate a dataset of just three items, one for each class, and show your model correctly labels them. (display each item in your notebook, pass it to your model, and then print the prediction). | 5 points |
50. | Generate three datasets of your inputs, where each has only two of the classes. What do you predict the performance should be for three binary classifiers trained on these three datasets? Re-train your model on these three datasets, and discuss your results. | 5 points |
51. | Generate a dataset from your original dataset where 20% of the items in one class are mis-labelled as the remaining two classes. How do you think your model performance will be impacted? Re-train your model on this modified dataset, and discuss your results. | 5 points |
52. | Take a look at each of the items in all classes individually. What aspects of the items (such as backgrounds) might be influencing the decision-making of the model, besides the salient parts themselves? | 5 points |
53. | Is the data biased in any way that could impact your results? Why or why not? | 5 points |
54. | If you noted some potential biases in the modeling/dataset above, discuss how you could help mitigate these biases (you don't need to implement, just discuss). If you didn't note any biases in this dataset, discuss what biases there could have been, and how the dataset designers might have helped mitigate them. | 5 points |
55. | Correctly train your model without pre-training (and discuss how this affects performance). | 5 points |
56. | Correctly implement saliency maps for all images. If doing text or tabular data, discuss feature importances or other metric. | 5 points |
57. | Discussion of saliency mapping or other metric from above. | 5 points |
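One simple saliency-map approach for items 56-57 is vanilla gradient saliency: back-propagate the predicted class score to the input and take the absolute gradient per pixel. The tiny model and random image below are toy stand-ins:

```python
import torch
import torch.nn as nn

# Toy model and input image; substitute your trained model and real images
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
img = torch.randn(1, 3, 8, 8, requires_grad=True)

logits = model(img)
score = logits[0, logits.argmax()]  # score of the predicted class
score.backward()                    # gradients w.r.t. the input pixels

# Saliency: absolute gradient, reduced over the channel dimension
saliency = img.grad.abs().max(dim=1).values  # shape (1, H, W)
```

High-saliency pixels are those where small changes most affect the prediction; visualizing them (e.g. with `plt.imshow`) shows what the model attends to.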
Presentations are expected to be about 15 minutes, with all group members present and speaking.
1. | Discussion of motivation for work, including explanation of related work. | 5 points |
2. | Discussion of dataset acquisition and preparation. | 5 points |
3. | Discussion of model selection and hyperparameter options chosen/hypothesized. | 5 points |
4. | Discussion of results. | 5 points |
5. | Good use of PowerPoint, and presentation meets length requirements. | 5 points |
Provide a list of milestones and due dates tailored to your project. You may use the grading rubric above, or the milestones can be more holistic. Include at least three milestones with dates.
Extra credit: Choose up to four papers related to your work, and summarize each one in a paragraph.
Discuss paper 1 in one paragraph. | 5 points |
Discuss paper 2 in one paragraph. | 5 points |