GWU

CS 4364/6364

Introduction to Machine Learning, Spring 2022

GWU Computer Science

Syllabus

Welcome to CS4364/6364 -- Introduction to Machine Learning!

This course will cover theory and applications of core machine learning concepts and techniques, including deep learning. We will be developing projects in Python with scikit-learn and doing deep learning with PyTorch. No previous machine learning experience is required, but if you have some, you will be able to select advanced assignments to work on in this course.




Professor: Dr. Kinga Dobolyi

Research interests: Software testing, computer vision and natural language processing for biomedical applications, computer science education

Contact: Please use the Ed discussion board first! Otherwise, reach out during office hours if possible (see below). For non-Ed questions, you can also email me (kinga@gwu.edu).

Office Hours: Wednesdays from 1-2:30pm on Zoom, or email me for an appointment (fifteen-minute blocks between 9am and 4pm, Monday through Thursday, schedule permitting). Although we'd love to see you, please first check whether your question can be (or already has been) answered on Ed; this will get you faster answers and help your classmates, who probably have the same questions!


Class Schedule: Tuesdays and Thursdays 12:45pm - 2:00pm, MON 250

Objectives -

  • Students should be able to apply machine learning techniques to practical problems.
  • Students should also develop an understanding of which techniques are applicable to which problems.
  • Advanced students will also gain tools for the design and analysis of machine learning algorithms.

Course Description and Prerequisites:

  • CSCI 4364/6364: Overview of core machine learning techniques/algorithms: nearest-neighbor, regression, classification, perceptron, kernel methods, support vector machine (SVM), logistic regression, ensemble methods, hidden Markov models (HMM), non-parametrics, online learning, active learning, clustering, feature selection, parameter tuning, and cross-validation.
  • Students cannot receive credit for both CSCI 4364 taken while an undergraduate and CSCI 6364 taken while a graduate student.
  • Students in the combined BS/MS program cannot receive credit for both CSCI 4364 and CSCI 6364.
  • Prerequisites: Coursework in analysis of algorithms (CSCI 3212 or 6212), probability theory or statistics (e.g. CSCI 3362/6362), and linear algebra (MATH 2184), or equivalent. Students with non-traditional preparation should email the instructor.

  • Webpage for the course (you are here)
  • Blackboard: we will use Blackboard to upload programming assignments and for online office hours (and lectures/labs/assessments if we must move the course online due to COVID19 this semester).
  • Ed discussion board: Ed will be our discussion forum, the place where instructors will post all announcements, and the quickest and preferred way for students to communicate with the course staff and each other. Note that Ed allows both public posts (that the whole class can see) and private posts (that only instructors can see). We will make every effort to answer questions within 24 business hours, and usually sooner (often in less than two hours on weekdays).
  • Textbook: None required, but Python Machine Learning by Sebastian Raschka is a solid place to start.

Grade Calculation:

  • Homeworks: 10%
  • Projects: 40%
  • Exams: 30%
  • Final Exam: 20%
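
As a rough illustration of how these weights combine, here is a minimal Python sketch. It is not the official grading script: the function name `weighted_course_grade`, the dictionary keys, and the sample scores are hypothetical, and it assumes each component has already been averaged to a percentage between 0 and 100.

```python
# Weights from the grade breakdown above, expressed as fractions of the final grade.
WEIGHTS = {
    "homeworks": 0.10,
    "projects": 0.40,
    "exams": 0.30,
    "final_exam": 0.20,
}


def weighted_course_grade(component_scores):
    """Combine per-component percentages (0-100) into one course percentage.

    Assumption: each component score is already an average on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical student; these scores are made up for illustration only.
example_scores = {
    "homeworks": 100.0,
    "projects": 90.0,
    "exams": 80.0,
    "final_exam": 85.0,
}
example_total = weighted_course_grade(example_scores)
print(f"{example_total:.1f}")  # 10 + 36 + 24 + 17 = 87.0
```

Because projects carry 40% of the grade, a ten-point change in the project average moves the course percentage by four points, while the same change in the homework average moves it by only one.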


Final course grades are calculated as follows for undergraduates:
A (>= 92.0%) A– (>= 90.0%)
B+ (>= 88.0%) B (>= 82.0%) B– (>= 80.0%)
C+ (>= 78.0%) C (>= 72.0%) C– (>= 70.0%)
D (>= 60.0%)
F (< 60.0%)

For graduate students, any grade below a C– (70.0%) will result in an F in the course.
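
The cutoffs above can be read as a simple lookup from percentage to letter. The sketch below is illustrative only: the names `UNDERGRAD_CUTOFFS` and `letter_grade` are hypothetical, it applies no rounding (the syllabus does not say whether percentages are rounded), and it assumes graduate students follow the same cutoffs at or above 70.0%, which the syllabus does not state explicitly. Minus grades are written with a plain hyphen in the code.

```python
# Undergraduate cutoffs from the table above: (minimum percentage, letter),
# listed from highest to lowest so the first match wins.
UNDERGRAD_CUTOFFS = [
    (92.0, "A"), (90.0, "A-"),
    (88.0, "B+"), (82.0, "B"), (80.0, "B-"),
    (78.0, "C+"), (72.0, "C"), (70.0, "C-"),
    (60.0, "D"),
]


def letter_grade(percentage, graduate=False):
    """Map a course percentage to a letter grade.

    Assumption: graduate students use the undergraduate cutoffs at or above
    70.0%; below that, the syllabus rule converts the grade to an F.
    """
    if graduate and percentage < 70.0:
        return "F"
    for minimum, letter in UNDERGRAD_CUTOFFS:
        if percentage >= minimum:
            return letter
    return "F"


print(letter_grade(87.0))                 # B
print(letter_grade(65.0))                 # D
print(letter_grade(65.0, graduate=True))  # F
```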

Late Submissions:

Late work will not be accepted in the course for projects, with the exceptions stated in the COVID19 policy section below.
If you are unable to attend an exam due to an unforeseen event like an illness, a doctor's note (or documentation for another university-approved excuse) will be required.


Homeworks:

Homeworks will typically be assigned and started in lecture, to be finished at home. You will be assigned to a group of five students; homework is to be completed together, with one student submitting for the group. Please place the names of all group members at the top of the code submission (typically a Jupyter notebook).

Projects:

Students in the course may choose between two ways of grading for the Project component of their grade:
  • Completing the three standard projects assigned this semester in the course (decision trees, CNNs, and an ML testing/evaluation project). This approach is recommended for students signed up for CS4364 who have not had any previous machine learning experience. Datasets will be provided by the instructor. These assignments are to be completed individually.
  • Completing a single semester-long project on a topic and dataset of the student's choosing. This option is meant for students in CS6364 and those with previous machine learning experience. The goal is to work on a project that could form the foundation of an academic article or similar. Students selecting this option will need to submit a proposal by the deadline on the schedule to be approved; if you do not submit a proposal, we will expect that you will be working on the three projects above instead. The projects selected need to be complex. This assignment must be completed in teams of two or three -- project complexity must scale with the number of team members specified.
You may choose to complete the projects in any programming language of your choice, although instructional support will only be available for Python and PyTorch.

COVID19 Policy:

  • At the time this syllabus was written, GWU expects full in-person instruction starting January 18th. As circumstances may change depending on university policy, we may have to adjust the delivery of the course to be fully remote, and/or make modifications to accommodate social distancing. In-person groupwork and assessment may no longer be feasible at certain points during the semester. We will publish updates to Ed regarding policy changes. The grading of the course should not need to change, though assessments may be moved online if in-person exams are no longer feasible.
  • Instructors and students will all need to be flexible regarding unforeseen circumstances in terms of absences due to illness and/or quarantine, as applicable. If you are ill, please send an email to the instructor as soon as possible, so we can make sure to connect and make arrangements.
  • We will attempt to have in-person lectures recorded this semester. Remote lectures will not be recorded.
  • We expect that many students may be absent due to illness during the semester. Because of the potential volume of these absences, students must manage their circumstances proactively: 1) Email the instructor before any due date to request an extension. A doctor's note is NOT required for an illness-related extension as long as the request is made at least 12 hours before the assignment deadline. Requests made later than that (including after the due date has passed) require a doctor's note detailing the dates of the illness. 2) Because of the expected volume of late submissions and makeups due to COVID19, students turning in late work must follow up with the instructor to ensure that their work is graded and recorded on Blackboard within two weeks of the late due date for that assignment. Late work will not be graded or recorded after that deadline.

If you feel pressured about an assignment, please email the instructor instead of cheating. All work that you submit in this course for a grade should be your own (or the work of your group, whose names you have documented). If we detect cheating, we reserve the right to assign the student a 0 on the assignment, or an F in the course for more egregious violations. We will also use automated software to check submitted code for cheating.

You are not allowed to collaborate on any graded assignment unless explicitly told to. Group assignments require collaboration within each group, but no collaboration between groups is permitted. Please refer to the academic integrity policy linked from the course web page. This policy will be strictly enforced. If you're having significant trouble with an assignment, please contact the instructors. Please see: Academic Integrity Policy

If you are a student with a disability and you need academic accommodations, please see the instructor during the first week of class, and contact the Disability Support Services Office (DSSO). All academic accommodations must be arranged through DSSO.

All people have the right to be addressed and referred to in accordance with their personal identity. In this class, we will have the chance to indicate the name that we prefer to be called and, if we choose, to identify pronouns with which we would like to be addressed...I will do my best to address and refer to all students accordingly and support classmates in doing so as well.

Instructor Communication


We encourage active communication and feedback between the instructor and students!

Please make sure to check Ed at least daily during weekdays, as the instructors will post all important information there about the course (we will not be using Blackboard except for assignment submissions).

The instructor will also periodically reach out to students via an email check-in; when you get such an email, it doesn't mean you did anything wrong! We simply like to know how our students are doing and feeling as the course goes on, especially if there are things we can help with!

Of course, communication goes both ways: we would love to hear from you at any time during the semester about anything you're struggling with (or enjoying) in the course -- please do not hesitate to reach out via email (and/or have an office hours session)!