CSCI 3411 - Operating Systems

Posted on August 1, 2019 by Gabriel Parmer

This course covers the fundamental concepts of operating systems, focusing on resource management and abstraction. This includes OS structure, processes and thread management, communication with peripherals (I/O), synchronization, deadlocks, memory management, Virtual Machines, cloud infrastructures, and abstractions for cloud computation. The workload for this class is heavy and programming intensive.

Course Info

Instructors Contact info:

Professor: Gabriel Parmer
Teaching Assistants:
- Ratnadeep Bhattacharya (Deep) – ratnadeepb at gwu dot edu (CS)
- Jordan McShan – jordan_mcshan at gwu dot edu (ECE)
Tutor:
- Joe Hallal (do not contact via email)

Fall 2018 Class Time/Location:

Class: Tuesdays/Thursdays @ 12:45-2:00, FNGR 223
Labs: Thursdays @ 2:10-4:00 (CS), 4:10-6:00 (ECE), 6:10-8:00 (CS), SEH 4040

Office Hours:

For general class questions, use Piazza.
Gabe’s office hours in SEH 4600: Wednesday 4-5, Thursday 10-11
- We can always walk and chat after class
- “Evening OS Hour” Wednesday 9-10pm via Twitch
TA/Tutor’s office hours:
- Deep: Monday 2-4 in the 4th floor common area, SEH
- Jordan: Wednesday 2-4 in the 4th floor common area, SEH
- Joe: Sunday 10:30am-12:30pm in the 4th floor common area, SEH
Don’t forget about the ACM study group in Tompkins 2-4 on Sundays

TODO

Register for the class on Piazza. I’m opening this early, over the summer so that you can ask any questions you may have in preparation for the class.
Once the class starts, on Piazza you’ll find a link to the course survey. You cannot receive any grades until you fill this out; please do so immediately.
On Piazza, you should also find a link to the first homework (HW0) on github (which requires you to make/have a github account).

Course Overview

Learning Outcomes

Objectives - In completing this class, students will…

understand key concepts involving system resource management, organization, and abstraction
understand how an OS manages and interfaces with hardware
understand fundamental trade-offs integral to system design
experience both development and experimentation in a real OS

Structure - This class is broken into two main activities: lectures and lab.

Lectures will discuss the main concepts in systems with reference to implementation details where beneficial. A series of homeworks (all in C) will test student’s ability to apply the courses concepts in a set of projects. The midterm and final exams test student’s understanding of the lecture’s concepts.
Labs will help you jump into the xv6 kernel and learn how to write and understand code in key subsystems. You will have programming assignments related to lab lectures throughout the semester. In the second half of the class, you will have a large group project involving kernel programming.

Direct Instruction and Independent Learning

Each semester, you’re expected to spend at least:

70 hours in class and lab for direct instruction, and
150 hours out of class working on your project and class assignments.

Course Prerequisites and Student Responsibilities

Prerequisites:

Computer Architecture I, Software Engineering
Assignments for this class will be done in C, and will require kernel-level programming. Though we do not assume that you have kernel experience, if you aren’t somewhat comfortable with C, you will have a very difficult time. Thus, a familiarity with C or a willingness to quickly learn and practice it is required. If you are not comfortable with C, at least go through Essential C and the exercises in the Course Material Section.

Responsibilities - Students must

Attend all classes unless you are sick or there is an emergency in which cases you must contact the professor via email before class.
Interact, ask questions, and generally participate in class discussions.
Attend all labs, and do work assigned therein.
Complete programming problems assigned in lab, and all written assignments individually.
Work productively as a group on an extended and difficult project.
Complete all work in accordance with the academic honesty rules for the university and for the class.
Work to digest and process all provided material including Piazza content (you cannot ignore Piazza).
Contact the Professor if you are worried about completing any responsibilities.

Professor and Instructional Staff Responsibilities

You should depend on the instructional staff (including Gabe) for the class to

be clear about student’s responsibilities in the class,
convey the course’s materials on this webpage, on Piazza, in class, and in lab in a manner that is approachable and understandable to every single student,
provide an environment in which everyone feels open to ask questions, and seek to remedy any misunderstandings or confusions, and
provide additional material on demand if anyone is curious about knowledge beyond the scope of the class.

These can be remembered as “we’re here to make it clear what you should do, what you should know, and what to do if you don’t feel like you know what you’re supposed to do or know”. If at any point, you have concerns about any of these, of if we’re dropping the ball on any of them, please let us know and we’ll do better. There are trade-offs made in the class (for example, homeworks are not formally specified as I don’t want them to be 50 pages each), but you always encouraged to ask questions and clarify.

Above all, it is important that this class provides an environment for learning and self improvement. Gabe takes this very seriously, so if it is not adequately serving that purpose, don’t hesitate to let him know why.

Experiments

Part of keeping a course “fresh” and successful is trying new things, and keeping the things that work. Given this, the experiments I’m running this year include:

Online office hours. I really want to make office hours work this year. Historically a very small minority of the class attends office hours. There are many reasons for this, but two of them are inconvenience/timing and the fact that office hours can be intimidating. Thus we’re going to try a few things:

I will hold an hour of office hours (called the “Evening OS Hour”) a week via twitch, at a time that I hope everyone can make it: 9pm-10. This office hour will be in addition to the normal office hours so that the traditional options are still fully available to you.
For this hour, I’ll be covering a topic voted on in the subsequent week. These can include the normal class activities (Q&A about the class topics), but can also include a discussion about “skills”. These skills might include: how to use an editor efficiently, how to walk through a code-base, how to use and run xv6, how to debug using gdb, how to decipher a Makefile, how to understand and debug C, etc…
Topics for discussion during this time will be generated in the previous meeting, and voting is open to all students.

Organized class clarifications. I’d like to encourage a strong feedback loop for answering questions and concerns about the material in lecture and lab. The instructional staff make mistakes and might not convey everything perfectly, and everyone at some point in the class will get confused by the material. You must take the student responsibility (above) seriously that requires you to ask questions in class if something isn’t clear. However, some of these confusions only surface after you have some time to process/absorb the material.

Thus, after each class, we will post a specific Piazza note asking for areas in which you don’t feel confident. If I forget to post this, please do so with the title “Class clarifications for y xx/xx” where y is the day of the week (Tuesday or Thursday), and xx/xx is the month/day, and use the Piazza topic lecture_clarifications. I will look at these topics, and try and cover them as much as possible at the start of the next class. This is your opportunity to tell me what needs more coverage in the class.

In class, I’ll be calling on random students. I’m sorry.

This class requires active discussion, and for a broad cross-section of the class must be involved. Historically fewer than 10 students have fielded the majority of the questions, which isn’t a healthy dynamic, nor does it lead to a discussion that will help the majority. I completely understand that answering questions can be very stressful, and the constant possibility of being called on next can keep you on your toes. Know a few things:

Most of the things that I say are wrong when I’m learning something new. I think this applies to most people.
I learn slow, but once I get it, I know it well. This makes answering questions (and taking exams) on a tight timeline particularly difficult.

So I completely sympathize that answering in-class questions can be quite challenging. Because of this, my expectations are not that you correctly answer every question. Often the most intelligent way to answer a question is with another question. So I expect that if you cannot take a shot at a question, you try and identify where the confusion is and point me in the right direction to better explain the material. However, please stay alert, and do your best.

Everyone gets a single day that they are guaranteed not to be called on. Though attendance is mandatory, if you know you aren’t going to be able pay attention (personal complications, multiple exams, no sleep the night before, you know, college 😉), then you can get a pass. You must fill out the form at least 15 minutes before class, and can only fill it out once. Double check that you’re choosing the correct day.

Course Schedule

Topic	`xv6` Book Chapters/Additional Reading¹	Silberschatz Chapters²
OS Fundamentals: Hardware and Structure	0, 1, 3 and this post	1, 2
Processes, IPC, Threads, Isolation	this post	3, 4
Scheduling and Critical Sections	5, 4	5, 6, 19
Mutexes, Condition Vars, Deadlock	N/A	6, 7
Midterm	all of the things…
Memory Management and Allocation	N/A	8
Memory Protection	2	9
File Systems and Storage	6	10, 11, 12, 13
Security	N/A	15
Final	all of the things…

You can find the lectures on Piazza after they are delivered. By default, I’ll provide these after they are fully covered in class. I’ve found that people read ahead which can harm the learning processes (as quite a bit of information is not on the slides) as it removes the problem solving processes. If this policy hinders your learning process, talk to me and we can discuss options. For your reference, all of the lecture’s pdfs are on the following topics:

Class Introduction, OS definition and history
System structure and protection
System structure and processes
IPC and threads
Synchronization
Deadlocks
Scheduling
Real-Time Scheduling
Memory Management
Memory Management II
File Systems APIs and Abstractions
File Systems Design and Implementation

Course Material

Strongly encouraged (but optional) Text:

Operating Systems Concepts, newest edition, by Silberschatz, Galvin, and Gagne. If you have another edition, it is probably OK.

If you have trouble understanding the xv6 source code, I recommend the following resources:

xv6 main webpage
xv6 book
The book is paired with the printed code (can be generated with make print)

You can fork the xv6 source on github. When you do, test that you can run it with make, and make qemu.

In the end, you have to get comfortable walking through the code, and understanding it. Using a code indexing system will make this easier. ggtags or ctags for emacs does great, but there are corresponding technologies for nearly all editors.

If you’re having trouble with C, here is a list of references:

A good short refresher/introduction is Nick Parlante’s Essential C, memory and pointers, linked lists, and linked list problems. These are reproduced from here. The Linked List problem sets are especially useful.
A few resources from our Software Engineering course (i.e. csci2113).
If you’re having trouble understanding what a specific declaration is in C, use the cdecl command, or, if it isn’t installed, you can use the cdecl webpage.
For those of you who know C, but want to learn how to write better C, one source is the Composite Style Guide.

It is your responsibility to quickly get up to speed in C, so please use these resources.

Assignment Submission

All of your homeworks for this class will be submitted via github. The last commit that you make to the repo will be graded, and the lateness penalties of that commit (see “Late Policy” below) will apply. Please do not make pushes to your repo after the version you want graded. Please also not that it is necessary to issue a git pull up to the github repo for submission. Local commits to your own repo do not update github, thus cannot be counted as submission.

Grading

Grades will be assigned with the following proportions:

Homeworks and Quizzes: 50%
Midterm: 20%
Final: 20%
Class and Lab Participation: 10%

Homeworks include smaller programming assignment, written responses, and a long final programming project.

Participation will be graded not only on attendance, but also on interactivity and involvement during class and lab.

If needs be, there will be short quizzes at the beginning of classes. These will only happen if people make it a habit of not coming to class, or coming late.

Homework Grade Modifications:

The following modifies apply to your grade on all homeworks (but not the project):

+3 points: At least 10 commits that include “real” work (features, fixes, or tests). You should only commit when your code compiles. Including commits that “inflate” your commit count will result in -50% credit. The intention of this is to encourage you to iteratively develop your code, and ensure that you’re using git properly, and you will get extra credit if we believe you’re acting in the spirit of this intention.
+3 points: making more than 10 commits, with at least 7 of them 5 days before the deadline. Including commits that “inflate” your commit count will result in -50% credit. The intention of this is to encourage you to not procrastinate. You will get extra credit if we believe your acting in the spirit of this intention.
-50 points: not removing debugging/logging/informational prints from your code (in industry, you cannot commit code with printouts).
-50 points: not following the naming requirements, or not structuring code into specific files as required in the specification. If you rename functions, and that causes your code to not compile with our test code, see the next bullet.
-100 points: code does not compile as submitted. You must ensure it compiles with no warnings (using the xv6 warning flags, or -Wall -Wextra -Werror if not in xv6) as we will make compilations fail (with -Werror) if there are warnings. Forgetting to git add files is not an excuse. “A small change makes it compile” is not an excuse. You should always do a fresh git clone of your submitted repo, to make sure that it compiles and your tests work. Make sure it works in the VM using the version of the compiler we use (i.e. Ubuntu 14.04). If you don’t test in this specific environment, your code will likely fail compilation.

Your grade can be minimum 0%, but your maximum grade can be higher than 100%. There is one exception:

-200 points: If you cheat on a homework. You cannot get credit for work done by cheating, so you will get a zero on the assignment. In addition, you get this penalty. Please just come and talk to me if you’re feeling desperate. Talk to me before you do anything rash that sets you up for failure. You cannot resubmit work that you cheated on (see “Late Policy” below).

I cannot emphasize this point enough: The only way people have failed this class is by not trying or by cheating.

Late Policy:

The only acceptable excuses for late submissions are medical, family, and related issues. For these, you will require a note (from a doctor, parent, etc…) to justify an extension. Unacceptable excuses include:

your computer failing (we’re using github for a reason),
lots of midterms,
forgetting to git add necessary files, or forgetting to git push (proper git usage is a requirement for the class), and
you worked very hard and put a lot of time and effort in, but still aren’t quite “there” yet.

You get 0 credit for late work that is unexcused. However, once the grades are released, you can resubmit your code within seven days, rounded up to the closest midnight. For example, if the grades are released at 2:30pm on the 8th, resubmissions are due at midnight on the 15th (i.e. $8 + 7 = 15$). You will get 0 credit for submissions after this date.

We will grade the last commit before the resubmission deadline, and you will receive 50% of the credit on top of the original grade, rounded up to the nearest percent. For example, if you got a 43% on the initial deadline, and resubmit and get a 68%, your final grade ($g$) will be $g = \lceil 43 + \frac{(68 - 43)}{2} \rceil = 56$.

Do not come to rely on the resubmission process. Homeworks are due every week or every other week, so it is very harmful to get behind on them, or to add more work to your schedule. Thus, our late policy is quite strict to discourage procrastination.

Please note that in terms of academic honesty, you cannot share your code with your peers even after it is due.

Testing your implementation:

When you are required to test your code, please note that our test cases will best every edge case in the implementation. If you do not write code to test the edge-cases of the specifications, then you will lose credit. This is the largest reason why students get lower grades than they believe they deserve. If you believe that an error was made in grading, then please explicitly address why you believe your test cases are sufficient to test all cases. See the discussion about testing and debugging below.

Style and Code Craftsmanship:

Any homework graded by demo (i.e. the project) must adhere to style guidelines, and you must curate your code. It must be simple, readable, and clean. See this post For some thoughts on how to make your code more readable, see the Composite Style Guide.

Academic Honesty

Just as you can do a google search for code online, it is trivial for us to do the same. We have caught numerous people cheating in the past in this way. If you feel pressured about an assignment, please come see me instead of cheating.

You are not allowed to collaborate on the homeworks and the lab assignments. The group projects require collaboration within each group, but no collaboration between teams is permitted. Please refer to the academic integrity policy linked from the course web page. This policy will be strictly enforced. If you’re having significant trouble with an assignment, please contact me. If you have not signed the academic integrity statement for homework 0, you will not receive a passing grade for the class.

Summary: I want to be very clear:

If you see the code of another student (who is not in your group), you are cheating.
If you share your code with any other student (not in your group), you are cheating. This applies even when you are no longer taking the class.

In the past, the only way that people have failed this class is to either not try (not do homeworks), or to cheat. There is almost no reason to cheat. Come to me, and we can discuss where you’re at, and what you have to do to make it through this difficult class!

Justification: Cheating hurts your ability to compete for CS jobs. Everyone came to college to learn. You learn by doing work to reinforce class ideas. Understanding ideas from class is nowhere near as effective at learning as applying those ideas in an implementation. Thus, you need to do your own implementation to be a competitive CS student in the real-world!

Cheating hurts your peers. Courses that are graded on a curve have the side-effect that if someone inflates their grade not by hard work, but by cheating, your good grade (should the Professor not catch you) will hurt the grade of all of your colleagues that put the work in.

Cheating also hurts the quality of the OS class in the long term, thus hurts all future generations of GW CS undergrads. If you cheat, especially using previous year student’s work, to the point where the Professor believes that they must develop new assignments every year, then those assignments will not be as strong. They will not meet learning objectives in a course like this. All future CS students will be less prepared for competing for the best jobs.

Attending Office Hours

Please come see Gabe to address any of your concerns. You should not wait until a problem comes to a head before meeting. Gabe can help you find a path to avoid bigger problems. Of course, the main use for office hours is to discuss the homeworks, or help prepare for exams.

If you have a particularly horrible debugging problem, then Gabe might be your best bet. Regardless, who you talk to, if you want them to look at code then you must adhere to style conventions. See style conventions below. At the most basic level, you must

properly indent your code,
simplify your logical structure as much as you can to make the logic clear,
reduce the levels of indentation in your code as much as possible (see the Composite Style Guide for information about how to do this),
use functions to clarify your code, and
use well-chosen variable and function names.

If you don’t demonstrate to us that you actively tried to understand your own code by making it as easy to understand as possible, then it isn’t reasonable for us to dive into your code. Note that you really should take all of these steps when developing, debugging, and testing your own code, regardless. In the “real world” it is expected that you generate code that adheres to these guidelines.

For online office hours, you don’t need to prepare at all. Show up, and see what happens. If there aren’t questions, we’ll talk about the material, technical skills, or do code walk throughs.

Additional Material

xv6

To run xv6, you need qemu installed. If you aren’t running Linux, then you might execute Linux in a virtual machine (e.g. using Virtual Box or VMWare Workstation), and qemu within Linux. I believe others have gotten qemu executing in OSX as well (but it takes installing a cross-compiled ELF-version of gcc), so discuss on Piazza if you’re interested in that.

$ git clone https://github.com/gparmer/gwu-xv6.git xv6
$ cd xv6
$ make qemu

Simple!

Note that the xv6 book has a great overview of the design of the entire system. Please use it as a resource if you need it. Note that since you don’t own this repo, you cannot git push to it. You have to fork it on github to become the owner of your own fork. For this class, all assignments will be completed on repos that we provide for you.

Some hints for understanding the code-base. All code that runs in user-level is in the files that correspond to the entries of UPROGS, and all of the header files that they include. The rest of the code executes in the kernel. Thus to add a new program, you only need to provide the .c file, and add it in a similar manner to UPROGS.

When you’re looking at a .c or .h file, always make sure you understand if it is executing in user-space (thus must make a system call to enter the kernel), or is in the kernel (thus can directly call functions within the kernel). Using a code indexing system will make it easier to walk through the source code. ggtags or ctags for emacs does great, but there are corresponding technologies for nearly all editors.

`git` and `github`

Using github. github provides a wonderful set of repository management features online. You can go through the github bootcamp if you don’t know how to use it.

You can use github to coordinate between team members in many different ways. I’d strongly recommend that when working in a team, you use the feature branch workflow pattern of interaction. At the highest-level, this means that each team member will be working on their features on separate branches, and will integrate them into the mainline to coordinate with everyone else.

A few of the ways to interact with github:

git pull and git push are you main means of pulling your repo off of github, and to update the version on github. If you’re working in a group, these are the only ways that you synchronize your work between team members. If you want to submit an assignment, git push is the only means to do so.
Note that you’ll often have to resolve merge conflicts when you do a git pull when another team member has previously made changes to the repo. You have to go through your code looking for any place that git has inserted text like “HEAD”, “>>>”, “<<<”, and “—” that indicate where you’ve made changes that conflict with the github repo.
Alternatively, you can use “Pull Requests” (PRs) to and the feature branch workflow to coordinate with your team. This is highly recommended and emulates industry environments. A huge benefit of these is that you can do code reviews using github on PRs! See the software engineering guide for some information on code reviews and PRs. However, it is more advanced, so make sure you understand the simpler method, above, and get some practice using it before moving on to PRs.

Using git. I’m not going to do a deep dive into using git here, and expect that you can figure out the main features online. Some quick git tips can be found at the bottom of software engineering guide. An incomplete list of commands that you can use with git include:

git diff, git diff --stat, and git status which all let you understand at different granularities what changes have been made. Before you commit code, you should likely execute all of these commands. You do this for two reasons. First, you want to make sure that only code modifications that you want in the commit, are in it. Each commit should be relatively focused, so it is important to catch this. Second, it is a good way to remind yourself what the commit contains so that you can better summarize it in the commit message. It is also useful to know that you can execute diff on ranges of commits to see what changed (see git log to see the list of commits).
git commit -a, obviously. Do not, do not, use the m flag. You should write a “full form” message, not a single line. The focus of your message should be on 1. summarizing the commit in a single-line (which is displayed along with the commit in github), and 2. an explanation of the details of the commit. The latter should explain the intention, and often includes an itemized list of the main aspects of the commit. @s should be used to reference any issues it addresses.
git branch <b> and git co <b> to use and manage your branches. Get used to using branches. Branches are useful for when you’re trying to debug a problem, or implement a new feature. To see how they’re useful, it is important to realize that it is not uncommon to get distracted, and have to redirect your attention to another feature or bug. Without branches (and the ability to roll-back to before your work on the feature), you end up with a commit history of half-finished features intertwined with bug fixes and other features. This makes your Pull Requests schizophrenic and difficult to review. So by default, you should always be developing on a non-mainline branch. When you go fix a bug, or work on a new feature, ask yourself if it deserves its own branch.
git stash is an acknowledgment that we mess up with our branches. If you find that you’re somewhat accidentally developing a new feature, or fixing a new bug, but have a bunch of unstaged changes on the current branch, you can stash them, create a new branch, stash pop the changes into the new branch. I’m sure there are other ways to do this, but this example at the very least demonstrates how stash can be used to delay a current set of changes.
git rebase enables you to update a branch to an updated master, and to “squash” multiple commits, and clean up your commit history. When the master progresses beyond the point where your branch split from it, you generally want to fast-forward merge your changes onto the new master. git rebase master is your friend here. It is generally superior to git merge as it maintains a cleaner (linear) history. git rebase -i allows you to rewrite history by combining different commits. This is very useful as it enables you to have a set of commits that tell a story, and don’t have a number of “in progress” commits. My suggestion is that you label your commits that you’ll likely want to get rid of in the future with TODO as a header on the title line. More information about this can be found here and here.

Submitting assignments. When you submit an assignment, I suggest that you follow this procedure:

git status to make sure that you have git added all of the intended files.
git diff and git diff --stat to make sure that you’re committing what you think you are. Of note, check to see if you’re committing any printfs that are intended for debugging, not for program output.
git commit and git push to upload your code to github.
Go to the associated webpage for the repository, and make sure that the most recent commit is present in your repository.
Use git clone ... to get a fresh version of the repository, build it, and run your tests. Students who omit this step get bad grades for very avoidable reasons. Always make sure that your submission matches what you think you’re submitting.

Style

Please document in your repository (e.g. in the README.md) which coding style you’re using. By default, the xv6 repo we use is formatted in accordance with the Composite Style Guide. See that description to understand why style is important, and why we choose the style we do.

It is important, when you’re working on a team to use a consistent style. It is more important to choose one and stick with it, than to choose the “perfect style”. You can find many different styles online, including

Google’s C++ style, Java style, and Python style
Linux’s C style
GWU’s OS research group’s C style along with justifications

Though not relevant for this class, many modern languages have prescribed styles such as go, and Rust.

Style requirements for this class. For the purposes of this class, never bring code to the instructional staff that isn’t properly indented. That is the minimum bar that humans need to be able to understand code. If you bring code that isn’t indented, then we will think that you haven’t put a sincere effort into understanding your code as you haven’t invested care into making it understandable.

Debugging

Debugging is a skill. Just like any other skill, practicing doing it well will make you much better at it. Most students are so rushed by deadlines (often due to procrastination) that they focus only on “making it work” rather than on refining their debugging skills to become better computer scientists. Finding a fix to any given issue is important, but you must focus on developing your long-term skills, or else you’ll find your “ceiling” of capability much sooner than you’d like.

To deliberately and continuously improve your debugging skills, I implore you to focus on the following:

listing and check your assumptions,
applying the scientific method by treating your assumptions like hypothesis,
planning your debugging approach, and feeding this back into the development of your code, and
continuously rewriting your code to make it more debuggable and understandable.

They are each briefly discussed below.

List and Check Your Assumptions

When something goes wrong with your code, it is because your mental model of how everything should fit together does not match the instructions you’re providing the computer. Your mental model is build of many assumptions about what your code is doing at any point in time. One of the most important learned skills is recognizing when you’re making an assumption, and identifying what the assumptions are. Examples of these assumptions include:

The list head is properly initialized before calling the add function.
Error checking code should properly catch an aberrant input.
When the thread is initialized, it should use the specific part of the stack previously that was allocated.
The addresses used in the stack are within the bounds of the allocated stack.
When I use the sbrk system call in my code, it expand the heap and allows me to use more memory.
When a system call is invoked in xv6, the first code in the kernel properly accesses the system call’s arguments.
…

Most your debugging time should be going to coming up with creative ways to check your assumptions. When something goes wrong, (at least) one of your assumptions is wrong. Your job is to figure out which one(s). This changes the emphasis of debugging from “why so wrong!?!?” to “which specific part of my understanding of this is incorrect?”. This is how you change debugging from what feels like a random process, to something that is methodical, deliberate, and, most importantly, productive.

It is very common for people to get stuck with no obvious path forward. The program is simply not doing what they think it should, and it makes no sense! If you’re at this point, then you’ve already made too many assumptions (at least one of them being wrong), and have built your entire mental model upon them. The solution is to go back, and start testing not your code, but the assumptions made throughout the lines of your code. Where is your understanding of your code going wrong?

Scientific Method

The scientific method is based around formulation hypothesis about how some phenomenon will behave, designing tests for checking the hypothesis, running the experiments/tests, and either confirming the hypothesis, or starting the entire process again. An important aspect of this process is that it is focused on knowledge creation. Deriving a hypothesis takes a deliberate process that focuses you to apply your brain, and think about the workings of the entire system. If it is confirmed, you often know not only that the hypothesis is true, but also why. If the hypothesis is wrong, you learn that there is a bug in its formulation, and you might go back to more foundational tests to figure out how and why it is wrong. This enables you to understand the system better, and made a better hypothesis next time around.

You should treat your assumptions about your own code with the same care. First, acknowledge that they are assumptions – never just assume that a part of your code works without testing – and then treat the assumptions as hypothesis about the code that need to be tested. Most importantly treat your assumptions as something that you refine through intentional testing. It is through this process that you learn about how your code works.

Planning your Approach

When you sit down with the goal of solving some programming task, it is natural to try and jump in and start developing. This is a fine approach if the problem that you’re trying to solve is below some complexity threshold. If it is a very comfortable task for you, then just “diving in” can be a perfectly viable approach. As you get more and more skilled as a programmer, this threshold will increase. However, as you get more and more skilled, you’re responsibilities will likely increase faster than this threshold. Thus, it is necessary to learn to be productive on complex projects that you cannot easily just “sit down and implement”.

It is necessary to do work that is beyond this threshold for your own personal development. Much of the work in this class is valuable because it is, by design, too complex to implement productively without a plan. This forces you to practice the act of planning out your approach. I help with this in the homeworks by providing a set of “levels” that need to be completed. You must complete the levels one at a time as they provide a path to incrementally add complexity only after you’ve learned about the problem from the previous levels.

Your development plan is simply a sequence of implementation goals that you implement, and test one at a time. You don’t need to have a complete plan mapping from the start of the implementation to the end, but you do want the first N steps so that you have a plan. This allows you to implement you code in a way that has a little bit more of a long-term design behind it.

Rewriting Code

When we write code, we often don’t understand the problem domain well enough to provide an ideal implementation. The effect of this is that our code is far too complex. As we learn about different aspects of the problem domain – “ohhh, I have to worry about i > N as well”, “yikes, what if the head is NULL?”, etc.. – we add more and more conditions and loops. Keep this in mind: at any point in your code look at the level of indentation; that is a good proxy for the number of things you need to keep in your head to understand a given line of code. Each condition and loop encodes variables and changes that you need to remember to understand the code in their bodies. Given that human short-term memory holds only 7$$2, code simply becomes too complex to keep in your head.

It is necessary to revisit and rewrite your code when it becomes to complex. Debugging code that is too difficult to understand and too challenging to even formulate assumptions about will take orders of magnitude more time to debug than to rewrite your code with a simpler structure and debug that. I very frequently find that I cannot make sense about how to modify my code to achieve some goal, and the only anecdote that I know is rewriting the code to constrain and abstract away complexity. You’d be amazing how much a few well-named functions containing well-abstracted code can make your code easier to understand. Additionally, the act of trying to simplify your code makes your code easier to debug and forces you to understand the fundamental problems, which also makes debugging (and assumption formulation) easier. In short, real-world programming involves a lot of code rewriting. For much more information about how to write C code with an eye toward constraining the complexity, see the composite style guide.

More Resources

Devon O’Dell’s Debugging Mindset
Bryan Cantrill’s Debugging Trilogy (more amusing than informative)
Do you have any other suggestions? Send me an email!

Testing

Untested code must be assumed wrong, simply put. You can never make the assumption that any part of your code works that you haven’t tested, no matter how simple. “It should work” is something that the instructional staff will never believe, and should be an indication of hard debugging ahead if you find yourself uttering it. “It should work” is a huge assumption with a builtin recognition that you aren’t going to test the assumption. Put another way, it is a recognition that you’re about to waste a lot of your own time. So I want to make it clear that you must test all parts of your own code. If your grade is lower because a branch doesn’t quite do what it is supposed to, “but it is just a small error” not only is completely incorrect, but demonstrates a complete lack of understanding of the fundamental importance and purpose of testing.

The job of a library, an operating system, or a piece of utility code is to make the programmer able to perform some operation, or to make the operation easier. In such situations, you have to assume that programmers don’t know what they are doing, and that, to your greatest ability, you’ll catch their errors and return a useful indication of where they went wrong. In the case of an OS, you have to check that all parameters/inputs are valid as if they are not, they might cause a bug that could crash the entire system, or allow the user to take over the system. This is accomplished in a similar way in every language, on every platform: write the code to explicitly check the validity of arguments, write the code to check for edge cases, and return the proper result. In the case (as in this class) where you’re given the specification for each operation, you simply need to ensure that you encode all of these erroneous inputs into conditions, and return the proper value.

Next, since code that isn’t tested doesn’t work, you have to write tests to make sure that all inputs result in the correct output. Properly testing your code is simple: codify your assumptions about how your implementation behaves into code. This absolutely means that you’ll spend time writing your testing code to generate all the edge cases, but this is absolutely necessary.

Most submissions that lose points, lose them because:

Conditions aren’t there for specified inputs.
Conditions are there, but aren’t properly tested, thus produce incorrect behavior.
Implementations are incomplete, or have known bugs.

There is almost no reason to lose points for 1. and 2. Testing is an essential part of being a developer. Thus, the instructional staff will not be lenient if you don’t do proper testing.

Find the book online ↩
Silberschatz chapters are specified with respect to the 9th edition↩