CS 297, Spring 2010: Advanced Operating Systems

Instructor: Professor Gabriel Parmer
gparmer at gwu
Class Schedule: Monday 6:10-8:40pm, Tompkins 205
Office hours: Monday 11am-12noon, 5-6pm

This course covers advanced topics in Operating Systems (OSes), while also reviewing and reinforcing fundamental concepts including concurrency and parallelism, OS structure, threading models, accountability, security, reliability, and data movement.

News:
  • 4/21/10: Please check the schedule for your presentation slot, and directions on the final report.
  • 2/13/10: The class Composite repository can now dynamically self-load components. A new component, boot.o, can create a new component, test.o, load its memory, and initiate execution. If you require fine-grained control of 1) a component's memory, 2) capabilities, or 3) spd creation/deletion, you might want to check out the code (git pull). It is backwards compatible (i.e. run.sh still works without change).
  • 2/9/10: Please check the schedule for changes. The snow day required dropping a paper, and moving some others around.

Objectives

Students will gain an understanding of the primary issues in the design and implementation of OSes such as structure, threading, accountability, parallelism, data movement, reliability, and security. By investigating research projects that approach the different design dimensions in interesting ways, students will witness, understand, and evaluate the trade-offs and implications of the different implementation decisions. By reading about, understanding the purpose of, and discussing a variety of systems, students will gain an appreciation for the potential impact of OSes and how they affect both specific applications and software in general. Through a semester-long project, students will gain hands-on experience implementing OS services and applying the different methodologies that are discussed throughout the class.

Student Responsibilities

Students must
  1. Read all papers for the class, and write summaries.
  2. Present research papers to the class.
  3. Complete a semester-long project investigating an aspect of OSes in depth.

The course will be paper based. This means we will read a variety of research papers, students will present them in class, and we will discuss them together.

Paper Summaries: By 11:59pm on the Sunday before each class, you must email me a summary of each paper that is going to be discussed. Please put your name at the top of your reviews. All summaries for the week should be contained in a single text (.txt) or .pdf document. Word documents (.doc, .docx) might not be graded. You are exempt from reviewing papers only if you are presenting on that day. The summary must include:

  1. A 1 to 3 sentence summary of the purpose of the paper (i.e. what is its contribution?).
  2. What you liked about the paper, and what you thought were its limitations (e.g. generality of the approach, restrictiveness of the assumptions, etc.).
  3. Questions you had about the paper. You are not expected to understand 100% of each paper, and the questions you may have can be part of the class discussion.
These summaries should not be long. In fact, the more concise they are, the better. You should bring a copy of each paper to class.

Presentations: When you are scheduled to present in the upcoming class, you should contact me and I will give some advice on which topics to focus on for your specific paper. You must email me your presentation by Friday at 11:59pm. This will allow us two days to alter the presentation if necessary.

Projects: See the section below on the course project.

Reading and Presenting Research Papers

Reading and understanding research papers is a skill that you will develop throughout the course. The most important question to answer when you're done reading a paper is "what was the purpose of the research?" In answering this, you will define the contributions of the paper. You should have a general idea of what the contributions are after reading the abstract and introduction, but keep them in mind throughout as they provide motivation for how the system is implemented, and how it is evaluated (which tests are done). When the authors are presenting techniques and implementation details, always keep in mind what the assumptions are, and what the limitations are of those techniques. As you are reading through the papers, make notes of what questions you have, what you don't understand, what you like, and what facets of the research seem limiting.

Making presentations for research papers can be difficult, but keeping a few things in mind will help organize your approach. First, it is important to convey in your presentation the motivation for the approach taken in the paper. You will answer the question: what is the problem being addressed by the paper? This might include how other systems have insufficiently approached the same problem, how some specific application scenarios necessitate the research, etc. After the motivation, the implementation details should be presented. You should not include all details here. Instead, you should focus on the most important details essential for understanding the empirical evaluation. If minute details are important, we can look at the paper directly. Last, you want to discuss how the authors justify their system through empirical evaluation. What tests do they do, why, and what are the results?

Often presentations already exist online for some papers (especially the more recent ones). You can feel free to use content from these presentations (I encourage it) but you must include attributions and credits appropriately. If you do not do this, you are plagiarizing (see Academic Honesty below).

General advice on giving good presentations and writing good papers can be found here.

Course Project

Throughout the course of the semester, you will work on an implementation project that focuses on extending an OS in a novel way. Some of you are already working on projects, or have ideas that might be applicable to this class. If this is the case, I encourage you to meet with me as soon as possible to determine how your project or idea can be extended for the class. Most of you will not have a project already that is relevant for the class. In this case, I have a list of possible projects, and we will meet in the first two weeks of the class to determine which one you are most interested in. Most suggestions will be focused on the Composite OS, an in-house system at GW. You can expect that if you do a project involving Composite, you can use the professor as a resource for implementation questions you may have. To some extent, the same applies if you choose to do a project in Linux. However, if your project involves another system, you should not expect much implementation guidance from the professor.

The project, topics, and the software infrastructure for the course will be discussed in class.

Overview of the Composite OS

Composite is a component-based operating system. Individual system policies, mechanisms, and abstractions are defined as separate, independent components. These components execute in user-space in their own page-table protected protection domains (see footnote 1). Each component exposes an interface (a set of functions) that can be invoked by other components. A component, then, is a specific implementation of an interface. There can be multiple implementations for a single interface. For example, the scheduler interface can be implemented by different components that separately provide fixed-priority round-robin, and earliest deadline first scheduling policies.
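
As a rough illustration of one interface with several implementations, the sketch below models a scheduler interface as a C struct of function pointers. Every name in it (struct task, struct sched_iface, fp_pick_next, edf_pick_next) is hypothetical and chosen for this example. It is not Composite's actual interface mechanism: real invocations cross protection-domain boundaries, while this sketch only calls through a pointer inside one program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor for this sketch; not a Composite structure. */
struct task {
    int  id;
    int  priority;  /* lower number = higher priority */
    long deadline;  /* absolute deadline in arbitrary time units */
};

/* The "interface": the set of functions every scheduler must provide. */
struct sched_iface {
    const char *name;
    const struct task *(*pick_next)(const struct task *tasks, size_t n);
};

/* Implementation 1: fixed priority, the lowest priority number wins. */
static const struct task *fp_pick_next(const struct task *tasks, size_t n)
{
    const struct task *best = NULL;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (best == NULL || tasks[i].priority < best->priority)
            best = &tasks[i];
    return best; /* NULL when there is nothing to run */
}

/* Implementation 2: earliest deadline first, the smallest deadline wins. */
static const struct task *edf_pick_next(const struct task *tasks, size_t n)
{
    const struct task *best = NULL;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (best == NULL || tasks[i].deadline < best->deadline)
            best = &tasks[i];
    return best;
}

/* Two "components" that implement the same interface. */
const struct sched_iface fixed_priority_sched = { "fixed-priority", fp_pick_next };
const struct sched_iface edf_sched            = { "edf", edf_pick_next };
```

A client that holds a pointer to a struct sched_iface calls pick_next(tasks, n) through it without knowing which policy sits behind it, so swapping the scheduling policy means swapping the implementation and leaving the callers alone.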

When a component, A, invokes functions within the interface for B, we say that A depends on B. An executable system is composed from a set of components by satisfying all of their depends-on constraints. Threads execute through the system by invoking components and continuing execution within them. Component-based systems are beneficial because the features and policies of the system are chosen specifically to satisfy the goals of the system. Additionally, as each component is confined within its own protection domain, a fault in one does not necessarily affect other components. This fault isolation has promise to greatly improve the reliability of the system as a whole. For a course like this, an additional benefit is that making modifications and additions to the system is easier due to the required modularity of the code.
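
To make the idea of satisfying dependencies concrete, here is a minimal sketch that checks whether every interface a component needs is exported by some component in the set. The struct component layout, the MAX_DEPS limit, and the first_unsatisfied function are assumptions made up for this example; Composite's own build system and loader are described in the doc/ directory and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4 /* arbitrary limit for this sketch */

/* Hypothetical description of a component; not a Composite structure. */
struct component {
    const char *name;            /* component name */
    const char *exports;         /* interface it implements, or NULL if none */
    const char *needs[MAX_DEPS]; /* interfaces it invokes, NULL-terminated unless full */
};

/* Does any component in the set export the named interface? */
static int interface_provided(const struct component *cs, size_t n,
                              const char *iface)
{
    for (size_t i = 0; i < n; i++)
        if (cs[i].exports != NULL && strcmp(cs[i].exports, iface) == 0)
            return 1;
    return 0;
}

/*
 * Returns NULL when every dependency in the set is satisfied, otherwise the
 * name of the first interface that no component exports.
 */
const char *first_unsatisfied(const struct component *cs, size_t n)
{
    assert(cs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t d = 0; d < MAX_DEPS && cs[i].needs[d] != NULL; d++)
            if (!interface_provided(cs, n, cs[i].needs[d]))
                return cs[i].needs[d];
    return NULL;
}
```

A set in which one component needs an interface that another exports passes the check; removing the exporting component makes first_unsatisfied return the name of the missing interface, which is the condition under which the system cannot be composed.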

The current code-base includes ~30 components that implement the necessary functionality for an HTTP server (a.k.a. a web server). There are many possible projects relevant to this class, including extending the functionality of the system or enhancing the current implementation.

The bootstrapping model for Composite is unique. The hardware is booted into Linux, and Composite is inserted into the kernel as a module. Once in the kernel, it hijacks the hardware by resetting hardware-defined kernel-entry points to point to code in the Composite system. For example, system calls are redirected to execute the Composite system call handler instead of the default Linux handler. In this way, Composite has all the power of a normal kernel, but can be developed with more ease as "rebooting" it consists of only reinserting the Composite module. See "Hijack: Taking Control of COTS Systems for Real-Time User-Level Services" for more details.

To log into your virtual machine, use the user-name composite, and the password composite. To get a root terminal, use sudo su with the same user name and password. The Composite source tree can be found in your virtual machine image under /home/composite/Development/composite. Specifically note the doc/ directory that includes documents explaining the directory structure, the build system, and how to implement a component and an interface. The source is in a git repository and you should do your development on the advos branch. Please check in your changes regularly.

Footnotes:
1 - Composite provides Mutable Protection Domains (MPD). These enable the system to dynamically construct and remove protection domain barriers between separate components at run-time. This enables a novel approach to dynamically trading off the fault isolation and performance of the system depending on the inter-component communication patterns at any point in time.

Composite FAQ

Question: The root terminal returns control back to me, and httperf doesn't work anymore! What's happening?

Composite is set up to run for only a limited amount of time before shutting down and returning all control to Linux. You can see how long this "runtime" is by looking at the RUNTIME_SEC define in src/components/implementation/sched/sched_timing.h. By default Composite executes for 30 seconds before quitting. If you try to use httperf after Composite has shut down, nothing will happen.

This runtime limit exists for a reason. You often wish to run Composite as the highest priority in the Linux system (to get more predictable timing results). In such a case, you must have a mechanism for the system to stop executing. This timeout provides that. If the timeout becomes annoying to you, please feel free to change the above file to have a higher runtime.

So keep in mind that you can only exercise the Composite system while it is executing, and that is for a finite amount of time. Further, remember that at any point in time, you can terminate Composite by using Ctrl-C in the terminal executing it.

Question: The virtual machine locks up when I try and run Composite. What's happening?

If you are using VirtualBox, you need to switch to VMware Player or Workstation. VirtualBox does not emulate essential x86 instructions.

If you are using VMware and it still locks up, change the style of virtualization it uses. You have the option of using binary translation or hardware virtualization (assuming your hardware isn't old). Try both.

Question: Composite segfaults when I try and use httperf. Why?

There could be multiple reasons for this to happen.

  1. You must wait for Composite to complete its bootup procedure before contacting it with httperf. To tell if Composite has fully booted yet, wait for it to print out information about each component before starting httperf. If you do not, the components might not have initialized by the time packets start flowing through the system. This will likely cause faults. This could be solved by making the initialization sequence more robust.
  2. There are still bugs in the system. If the segfault was triggered by an assert (see doc/debugging.txt), then either you hit a pending bug, you introduced a bug, or one component invoked another before the second was initialized. In the last case, simply make sure that threads for components that make invocations are initialized with a lower priority than those that are invoked. This is done in the runscript (see doc/executing_composite.txt).
  3. Be sure to always run Composite by executing the following:
    $ make
    $ sh run.sh

    Don't forget the make.

Question: OMG, make isn't working, there are complaints about modules not installed, files not present, and the world is ending!

There are many possible reasons for this. Instead of diving into each of these, we will simply reset the system to a good state so that you can attempt to execute it again. Instructions follow.

First, make sure that you have two terminals, A and B. In B, you should be logged in as root (by using sudo su, for example) and be in /root/experiments, while in A, you should be in the Composite src/ directory. If you are getting the weird errors described above, first reset your transfer directory (see doc/build_system.txt): rm ~/transfer/*. Second, in B (in /root/experiments), remove all components: rm *. Next, check whether the modules are inserted (run lsmod and look for cos and cosnet). If they are, remove them with rmmod.

You have cleaned up the system. Now in A, type
$ make
$ make cp

Now your ~/transfer/ directory has all relevant files in it. In B, type
$ cp ~composite/transfer/* .
$ make init
$ make

Now you should be able to run Composite by doing the normal:
$ make
$ sh run.sh

Question: How do I make hello world?

There are two general types of components in the system: 1) those that export an interface to be used by other components, and 2) those that don't. Additionally, there are components that require a thread to initialize them, and those that don't, having threads execute through them via invocation instead.

All of these types of components are exemplified by the ping pong example. In microkernel circles, ping pong means two components (or servers, or processes, or address spaces) that simply pass a token back and forth. This is used to measure the cost of an invocation (an RPC). The ping component requires a thread to initialize it, and this is the thread that makes all of the invocations. ping does not export an interface. Thus, all relevant code for ping is in:
src/components/implementation/other/ping/*
Note that all components that don't export an interface are placed into the other directory.

pong, on the other hand, does not need a thread to initialize it, instead simply responding to invocations from the ping component. It does, however, need to export an interface to be used by ping. The interface and implementation:
src/components/interface/pong/*
src/components/implementation/pong/pingpong/*
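
For readers who have not seen the ping pong pattern before, the following sketch shows the same idea using ordinary POSIX processes and pipes, not Composite components. It is only an analogy under stated assumptions: the pingpong function and its parameters are invented for this example, the parent plays the role of ping, a forked child plays the role of pong, and the measured cost is that of a pipe round trip on the host, which says nothing about the cost of a Composite invocation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Bounce a one-byte token between a parent ("ping") and a child ("pong")
 * n times over two pipes. Returns the number of completed round trips, or
 * -1 on setup failure, and stores the elapsed nanoseconds in *elapsed_ns.
 * Assumption: plain POSIX IPC stands in for a Composite invocation.
 */
long pingpong(long n, int64_t *elapsed_ns)
{
    int to_pong[2], to_ping[2];
    char token = 'x';
    long done = 0;
    struct timespec t0, t1;
    pid_t pid;

    assert(n >= 0 && elapsed_ns != NULL);
    if (pipe(to_pong) < 0) return -1;
    if (pipe(to_ping) < 0) {
        close(to_pong[0]); close(to_pong[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(to_pong[0]); close(to_pong[1]);
        close(to_ping[0]); close(to_ping[1]);
        return -1;
    }
    if (pid == 0) {                       /* pong: echo each token back */
        close(to_pong[1]);
        close(to_ping[0]);
        while (read(to_pong[0], &token, 1) == 1)
            if (write(to_ping[1], &token, 1) != 1) break;
        _exit(0);
    }

    close(to_pong[0]);                    /* ping: keep only its own ends */
    close(to_ping[1]);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (done < n) {
        if (write(to_pong[1], &token, 1) != 1) break;
        if (read(to_ping[0], &token, 1) != 1) break;
        done++;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(to_pong[1]);                    /* EOF lets pong leave its loop */
    close(to_ping[0]);
    waitpid(pid, NULL, 0);

    *elapsed_ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return done;
}
```

Dividing the elapsed time by the number of completed round trips gives an average round-trip cost. As in the Composite example, only the ping side drives the exchange; the pong side does nothing until it is contacted.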

The runscript to execute a system focused on ping pong is found in:
src/platform/linux/util/pingpong_run.sh
You can see that the ping component is initialized with a thread with priority 4 (...;pingp.o,a4;...), while pong is not (...;ppong.o, :...). Always remember that a numerically lower priority value means a higher effective priority.

It is trivial to create a hello world component from these examples. For another trivial component, see src/components/implementation/other/cpu/*.

Schedule

Deadlines are in bold. Note that this schedule is subject to change.

Date Topic:
  • Paper, location* / Presenter
1/11 Administrative details, Introduction on OSes and System Structure:
  • Policy/Mechanism Separation in Hydra, policy_mech_sep.pdf / G. Parmer [pdf]
  • Worse is Better, worse_is_better.pdf / G. Parmer
1/18 Martin Luther King Day, no class

System Structure 0:

  • The UNIX Time-Sharing System, historical/unix.pdf
  • The Structure of the "THE"-Multiprogramming System, historical/dijkstra_the.pdf
  • The Nucleus of a Multiprogramming System, historical/nucleus.pdf
  • Hydra: The Kernel of a Multiprocessor Operating System, historical/hydra.pdf
No one will present these papers, but we will discuss them throughout the "System Structure" segment of the class. You must still summarize them.

You must have a discussion with me about your interests and possible topics for the course project.

1/25 System Structure I:
  • Exokernel: An Operating System Architecture for Application-Level Resource Management, structure/exokernel.pdf / James Marshall
  • Extensibility, Safety and Performance in the SPIN Operating System, structure/spin.pdf / Scotty Smith
  • An introduction to the Composite component-based OS, no required reading / G. Parmer
1/27 (Wed)

Brief (1/2 page at most) writeup due about the topic of your project.

2/1 System Structure II:
  • On Microkernel Construction, structure/on_ukern_construction.pdf / Ryan Festag
  • Achieved IPC Performance (Still the Foundation for Extensibility), structure/achieved_ipc_perf.pdf and Extensible Kernels are leading OS Research Astray, structure/extensibility_leading_research_astray.pdf / G. Parmer
  • Xen and the Art of Virtualization, structure/xen.pdf / Tareque Hossain
2/8 System Structure III:

Cancelled due to DC snow fail.

2/15 Presidents Day, no class

Progress Report 1 due for your project.

The expressibility of an interface is determined by the breadth of different situations, applications, and requirements it can cater to. A "holy-grail" is to find a single interface that is maximally expressive. Keep this in mind as you read this paper. This interface is "higher-level" than the ones we have discussed so far, and therefore focuses more on providing abstractions to applications than on system infrastructure (e.g. libOSes).
  • The Ubiquitous File Server in Plan 9, structure/plan9.pdf / Review-only
2/22 System Structure IV, Data Movement and Resource Accounting I:
  • Are Virtual Machine Monitors Microkernels Done Right?, structure/vms_ukern_right.pdf and Are Virtual Machine Monitors Microkernels Done Right?, structure/ukern_vms_right.pdf / Andrew Thaeler
  • Briefly: Hype and Virtue, structure/vms_leading_research_astray.pdf.pdf / G. Parmer
  • Isolating Web Programs in Modern Browser Architectures, structure/chrome.pdf / Shan Li
  • Fbufs: a High-Bandwidth Cross-Domain Transfer Facility, data_movement/fbufs.pdf / Wei Wang
3/1 Resource Accounting II:
  • Making Paths Explicit in the Scout Operating System, accounting/scout.pdf / Kerry McKay
  • Resource Containers: a New Facility for Resource Management in Server Systems, accouting/resource_containers.pdf / Andrew Sweeney
  • Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java, accouting/kaffeos.pdf / Edward Robinson
3/8 Threading and Concurrency I:
  • Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism, threading/scheduler_activations.pdf / Francesco Scarimbolo
  • Flash: an Efficient and Portable Webserver, threading/flash.pdf / Syed Muhammad Gilani
  • Capriccio: Scalable Threads for Internet Services, threading/capriccio.pdf / Patrick Thomson
3/15 Spring Break, no class
3/22 Threading and Concurrency II and Parallelism and Synchronization I:
  • Why Events are a Bad Idea, threading/events_bad.pdf and Why Threads are a Bad Idea, threading/threads_bad.pdf / Samy Al Bahra
  • Cooperative Task Management without Manual Stack Management, threading/fibers.pdf / Murugappan Alagappan
  • Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors, parallelism/synchronization.pdf / Samy Al Bahra

Progress Report 2 due for your project

3/29 Parallelism and Synchronization II:
  • Tornado: Maximizing Locality and Concurrency in a Shared Memory Multiprocessor Operating System, parallelism/tornado.pdf / Francesco Scarimbolo
  • Corey: an Operating System for Many Cores, parallelism/corey.pdf / Jiguo Song
  • The Multikernel: a New OS Architecture for Scalable Multicore Systems, parallelism/barrelfish.pdf / Ahmed Anber
4/5 Parallelism and Synchronization III and Reliability I:
  • Mapreduce: Simplifying Data Processing on Large Clusters, parallelism/mapreduce.pdf / Jiaqiang Xu
  • Practical Byzantine Fault Tolerance, reliability/byzantine_fault_tolerance.pdf / Bilal Habib
  • Microreboot: A Technique for Cheap Recovery, reliability/microreboot.pdf / Natalie Rabinovich
4/12 Reliability II:
  • CuriOS: Improving Reliability through Operating System Structure, reliability/curios.pdf / Malik Sharif
  • XFI: Software Guards for System Address Spaces, reliability/xfi.pdf / Rabia Shahid
  • Mondrix: Memory Isolation for Linux using Mondriaan Memory Protection, reliability/mondrix.pdf / Yue Wu
4/19 Reliability III and Security I:
  • Enhancing Server Availability and Security through Failure-Oblivious Computing, reliability/failure_oblivious.pdf / Juan Falquez
  • No review required: A Note on the Confinement Problem, security/confinement.pdf / G. Parmer
  • Protection, security/protection.pdf and Protection and the Control of Information Sharing in Multics, security/multics.pdf / Kerry McKay
  • Eros: a Fast Capability System, security/eros.pdf / Cheryl Babcock
4/26 Security II:
  • Labels and Event Processes in the Asbestos Operating System, security/asbestos.pdf / Teng Li
  • Class wrap-up and reflection.
4/27

Project Presentations:

  • Rabia Shahid
  • Malik Sharif
  • Kerry McKay
  • Juan Falquez
  • Yue Wu
  • Natalie Rabinovich
  • Ahmed Anber
  • Jiguo Song
  • Murugappan Alagappan
  • Samy Al Bahra
  • Patrick Thomson
  • Wei Wang

4/28

Final report due (midnight): If you worked on a Composite project: Include your source code in a tarball (.tgz) or zip archive. Please ensure that whichever method you use, you use compression. This can be a copy of the composite source tree. If you wish to do that, do a make clean, and go into src/components/lib/dietlibc-0.29/ and also do a make clean before creating the archive. A README file must describe how to use/test your code.

If you did not do a Composite project, please explain the separate modules of your project, how you tested them, and what results you produced.

For all projects, please include a design document (project report) describing the design of your project, what you completed in your project, what you didn't complete, why you didn't complete it, what the stumbling blocks were, and what you learned along the way.

Project Presentations:

  • Bilal Habib
  • Teng Li
  • Syed Muhammad Gilani
  • Francesco Scarimbolo
  • Edward Robinson
  • Andrew Sweeney
  • Shan Li
  • Andrew Thaeler
  • Tareque Hossain
  • Ryan Festag
  • Scotty Smith
  • James Marshall
  • Jiaqiang Xu

* - You can retrieve the archive of all papers for the course; this field gives each paper's location within that archive.

Grading

Grades will be assigned with the following proportions:
  • Class Presentation(s): 20%*
  • Class Participation: 40%*. This criterion includes
    1. coming to class,
    2. making comments and contributing to in-class discussion, and
    3. paper summaries and comments that you must submit by 11:59pm each Sunday before class.
  • Semester Project: 40%, broken down as follows
    • 5% progress report 1
    • 5% progress report 2
    • 30% final report and presentation
* - Those who make two presentations instead of one have the option of changing these to Presentations: 30%, Participation: 30%.
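
As a worked example of the two weightings above, the sketch below computes an overall score on a 0-100 scale. The final_grade function and its parameters are hypothetical helpers written for this page, not an official grading tool, and the project score is assumed to be already combined from the two progress reports and the final report and presentation.

```c
#include <assert.h>

/*
 * Weighted course score on a 0-100 scale.
 * two_presentations != 0 selects the optional 30/30/40 split from the
 * footnote; otherwise the default 20/40/40 split applies.
 * All three inputs are component scores in the range 0-100.
 */
double final_grade(double presentation, double participation,
                   double project, int two_presentations)
{
    const double w_pres = two_presentations ? 30.0 : 20.0;
    const double w_part = two_presentations ? 30.0 : 40.0;
    const double w_proj = 40.0;

    assert(presentation >= 0.0 && presentation <= 100.0);
    assert(participation >= 0.0 && participation <= 100.0);
    assert(project >= 0.0 && project <= 100.0);

    return (w_pres * presentation + w_part * participation
            + w_proj * project) / 100.0;
}
```

With scores of 90 for presentations, 80 for participation, and 70 for the project, the default split gives 0.20 x 90 + 0.40 x 80 + 0.40 x 70 = 78, while the two-presentation split gives 0.30 x 90 + 0.30 x 80 + 0.40 x 70 = 79.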

Academic Honesty

You are not allowed to collaborate on the homeworks and the lab assignments. The group projects require collaboration within each group, but no collaboration between teams is permitted. Please refer to the academic integrity policy linked from the course web page. This policy will be strictly enforced. If you're having significant trouble with an assignment, please contact me.
Academic Integrity Policy
Credit: I'd like to thank Prof. Narahari for the first versions of this academic honesty policy.
