Composite Component Invocations

Posted on January 17, 2016 by Gabe Parmer

In the previous post I discussed some of the main dimensions that distinguish system communication mechanisms. I concluded that most existing systems use either synchronous communication on the same core, or asynchronous communication across cores, both with limited data-movement capabilities (i.e., messages have a relatively small maximum size).

In this post, I’ll discuss how component invocations are implemented in Composite. Composite focuses on the fine-grained decomposition of system software into separate, hardware-isolated components, each exporting an interface of functions that other components can invoke. The invocation path must be fast, as inter-component interactions are frequent. This implementation falls firmly into the synchronous, intra-core category with restricted data movement. The dominant paradigm (since Liedtke) for this type of communication is Synchronous Rendezvous Between Threads (SRBT). This is a well-studied, well-optimized, and established paradigm for Inter-Process Communication (IPC) [1].

Synchronous Rendezvous between Threads

The L4 family of microkernels is the main prototype for this communication paradigm. A client thread in the calling component invokes the kernel to both activate the server thread, and block awaiting a reply. The server thread invokes the kernel to both reply to a client and block waiting for the next client. The entire call (from client to server) and return (from server to client) is implemented with only two system calls (compare to UNIX pipes). Given the history of this approach, one would likely assume that it is strictly better than many of the alternative implementations. As mentioned in the previous post, once you change some of the system’s goals or assumptions, it is possible for a different design to be beneficial. To understand the design space around this mechanism, let’s list a number of the design decisions inherent in SRBT systems, along with their consequences.

Each of these is more completely and precisely characterized in a paper comparing implementations of synchronous IPC between threads with Composite IPC.

A number of pieces of research have investigated alterations to many of these factors. These include:

The main emphasis of SRBT is on the efficiency of the IPC mechanism. It does a very good job of this; specifically, the Nova IPC path takes this to a practical conclusion. However, any design makes trade-offs and assumptions, and a different design might alleviate some of the assumptions listed above.
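To make the call and reply-wait pattern from the start of this section concrete, here is a toy Python model. It is only a sketch: the `Thread` and `Kernel` classes and the `call` and `reply_wait` operations are hypothetical names invented for illustration, not the API of L4 or any other kernel. It shows that a full round trip costs two system calls and two thread switches.

```python
# Toy model of synchronous rendezvous between threads (SRBT).
# Every name here is hypothetical; this is not a real kernel API.

class Thread:
    def __init__(self, name):
        self.name = name
        self.state = "runnable"   # "runnable" or "blocked"
        self.inbox = None         # last message delivered to this thread


class Kernel:
    def __init__(self):
        self.syscalls = 0         # number of kernel entries
        self.switches = 0         # number of thread switches
        self.current = None       # thread currently on the CPU

    def _switch_to(self, thread):
        thread.state = "runnable"
        self.current = thread
        self.switches += 1

    def call(self, client, server, msg):
        """Client side: activate the server and block awaiting a reply."""
        self.syscalls += 1
        server.inbox = (client, msg)
        client.state = "blocked"
        self._switch_to(server)

    def reply_wait(self, server, reply):
        """Server side: reply to the client and block for the next one."""
        self.syscalls += 1
        client, _ = server.inbox
        server.inbox = None
        client.inbox = reply
        server.state = "blocked"
        self._switch_to(client)


kernel = Kernel()
client, server = Thread("client"), Thread("server")
server.state = "blocked"                 # server starts out waiting
kernel.current = client

kernel.call(client, server, 20)          # system call 1: client -> server
_, request = server.inbox                # the server thread handles it
kernel.reply_wait(server, request + 1)   # system call 2: server -> client
result = client.inbox
```

Note that two distinct threads are involved, so the scheduler has two entities to account for even though only one of them can make progress at any time.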

IPC via Thread Migration

Anyone familiar with modern microkernels is likely aware of SRBT designs. Thread migration, maybe not. It has a long history and is summarized in that paper as:

…during [IPC], a single thread abstraction moves between tasks with the logical flow of control, and “server” code is passively executed. … The key element of our design is a decoupling of the thread abstraction into the execution context and the schedulable thread of control, consisting of a chain of contexts.

Note that this thread migration is different from the identical term that denotes the movement of a thread’s context between cores in a multicore system. Here we are talking about the IPC design sweet spot on a single core.

It should be noted that thread migration is pervasively used for communication between user-level and kernel-level in monolithic systems. A flow of execution at user-level, executing within an application, continues in kernel-level via a system call. There is no SRBT, and thus no switching between threads; we think of a single application thread that executes at some points in the application, and at others in the kernel. The time utility reports these different execution times along with total time. As pointed out in the quote, a key aspect is the decoupling of the execution context and the scheduling context (which probably sounds familiar from Credo).

Execution context. Isolation between user-level and kernel-level is necessary, and supported by dual-mode execution. An implication of this is that the stack used for execution at kernel-level must be isolated from the stack at user-level to avoid corruption and information leakage. Additionally, registers must be carefully managed when transitioning between modes. Calling conventions are protocols that specify how they should be managed (related: cdecl for function calls). So just as with SRBT, different execution contexts are required in different protection domains.

Scheduling context. The scheduler does not treat the executions at user-level and kernel-level as different entities. It only schedules the single thread that migrates between user- and kernel-levels.

This decoupling of the execution context and scheduling context is the key concept to thread migration. Note that this concept alone removes a few of the design assumptions on the previous list.
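The decoupling can be sketched as a small Python model. This is an illustration under assumptions, not how any particular kernel is written: `ExecutionContext`, `MigratingThread`, `run_queue`, and the tick accounting are all hypothetical names. The scheduler's run queue holds one entity, while that entity carries a chain of per-domain execution contexts and is charged for time in whichever domain it is currently executing.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionContext:
    """Per-protection-domain state: an isolated stack for that domain."""
    domain: str
    stack: list = field(default_factory=list)


@dataclass
class MigratingThread:
    """The single entity that the scheduler sees and accounts for."""
    name: str
    priority: int
    chain: list = field(default_factory=list)   # chain of execution contexts
    ticks: dict = field(default_factory=dict)   # time charged per domain

    def enter(self, domain):
        # Crossing into a domain pushes a fresh, isolated execution context.
        self.chain.append(ExecutionContext(domain))

    def leave(self):
        # Returning pops it, resuming the previous context.
        return self.chain.pop()

    def run(self, ticks):
        # Time is charged to this one thread, split by current domain.
        domain = self.chain[-1].domain
        self.ticks[domain] = self.ticks.get(domain, 0) + ticks


run_queue = []                         # what the scheduler looks at
t = MigratingThread("app", priority=5)
run_queue.append(t)

t.enter("user")
t.run(3)
t.enter("kernel")                      # system call: same thread, new context
t.run(2)
t.leave()                              # back at user level
t.run(1)
```

The per-domain tick counts play the role of the separate user and system times mentioned earlier, while the run queue never grows beyond the one thread.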

Composite Component Invocation: Motivation

So what is the design of component invocation in Composite? We deviate from the dominant SRBT designs, and instead focus on thread migration. Let’s look at some of the benefits that it enables.

Composite Component Invocation: Design

Composite uses a capability system to track which kernel resources each component has access to. The implementation of capabilities (as resource tables) is detailed in the Speck paper. One such resource is a synchronous invocation gate (SInv). Each SInv denotes a function (address) to activate in the server, which corresponds to a stub that calls a function in the server component’s interface. When the client activates its SInv capability, the stack and instruction pointers to return to in the client are saved on a stack (like the C execution stack) in the kernel. Returning to the client simply restores these values. Note that neither the client nor the server needs to trust the other, as the kernel mediates control transfer into and out of each component.

The calling convention is such that four registers’ worth of data is passed on call and return (on x86-32). The design explicitly separates generic data transfer from control flow and fixed message passing as motivated by the previous analysis. Future articles will cover the data-movement APIs in Composite.
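The invocation path described above can be modeled in a few lines of Python. This is a minimal sketch under assumptions, not Composite's kernel code: the names `SInv`, `Thread`, `Kernel`, `grant`, `invoke`, and `add_stub` are hypothetical, a Python callable stands in for the stub address, and the choice of server stack is left out entirely. It shows the capability lookup, the saved return state on a kernel-resident invocation stack, and the four-register convention on call and return.

```python
from dataclasses import dataclass, field
from typing import Callable

NUM_REGS = 4   # registers' worth of data passed on call and return


@dataclass
class Thread:
    component: str                 # component the thread is executing in
    sp: int                        # stack pointer
    ip: int                        # instruction pointer
    inv_stack: list = field(default_factory=list)   # lives in the kernel


@dataclass
class SInv:
    server: str                    # component to switch into
    entry: Callable                # stands in for the stub's address


class Kernel:
    def __init__(self):
        self.captbls = {}          # component -> {capability id -> SInv}

    def grant(self, component, cap_id, sinv):
        self.captbls.setdefault(component, {})[cap_id] = sinv

    def invoke(self, thread, cap_id, regs):
        sinv = self.captbls.get(thread.component, {}).get(cap_id)
        if sinv is None:
            raise PermissionError("component holds no such capability")
        if len(regs) != NUM_REGS:
            raise ValueError("a call passes exactly four registers")
        # Call: save the client's return state where only the kernel
        # can touch it, then switch components.
        thread.inv_stack.append((thread.component, thread.sp, thread.ip))
        thread.component = sinv.server
        # The same thread now executes the server's stub.
        reply = tuple(sinv.entry(thread, *regs))
        # Return: simply restore the saved values.
        thread.component, thread.sp, thread.ip = thread.inv_stack.pop()
        if len(reply) != NUM_REGS:
            raise ValueError("a return passes exactly four registers")
        return reply


seen = []                          # records where the stub executed


def add_stub(thread, a, b, c, d):
    seen.append(thread.component)
    return (a + b, 0, 0, 0)


kernel = Kernel()
kernel.grant("client", 1, SInv("server", add_stub))
thread = Thread(component="client", sp=0x1000, ip=0x400)
reply = kernel.invoke(thread, 1, (2, 3, 0, 0))
```

Because the return state lives on the kernel's invocation stack rather than in either component, a misbehaving server cannot redirect where the client resumes, and a client cannot enter the server anywhere other than the entry point named by its capability.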

Scheduling context. Note that the same schedulable thread executes through the entire IPC path much as a thread in a monolithic kernel traverses user- and kernel-mode. Composite’s IPC is an optimized version of thread migration. Note that the optimizations to SRBT made by Credo and timeslice donation move SRBT closer to thread migration by attempting to decouple scheduling and execution contexts.

Execution contexts. This is where it gets a little tricky. The question is this: which stack should we use for execution within the server component? There are a couple of easy edge cases:

Intelligent Management of Execution Contexts

Blocking waiting for an execution context (stack) in a server increases the latency for an invocation. We’ve investigated two different policies for assigning execution contexts throughout the system. We’ve considered the hard real-time, worst-case latency introduced by the blocking on execution context contention. Alternatively, we’ve investigated adapting the allocation of stacks throughout the system in response to measured latency. Such policies are possible due to the configurable user-level stack manager that monitors where and when contention occurs, and adapts execution context allocations accordingly.
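As a rough illustration of the adaptive idea, the following Python sketch models a stack manager that records contention per component and shifts a stack toward the most contended one. The policy here is an assumption made up for this example; the post does not specify the actual policies, and the `StackManager` class with its `acquire`, `release`, and `rebalance` methods is hypothetical.

```python
class StackManager:
    """Toy user-level stack manager; the policy is purely illustrative."""

    def __init__(self, allocation):
        # component -> number of free stacks
        self.free = dict(allocation)
        # component -> requests that found no free stack
        self.contention = {c: 0 for c in allocation}

    def acquire(self, component):
        """Return True if a stack was granted, False if the caller blocks."""
        if self.free[component] > 0:
            self.free[component] -= 1
            return True
        self.contention[component] += 1
        return False

    def release(self, component):
        self.free[component] += 1

    def rebalance(self):
        """Move one free stack from the least to the most contended component."""
        needy = max(self.contention, key=self.contention.get)
        if self.contention[needy] == 0:
            return None                      # nothing to adapt to
        donors = [c for c in self.free if c != needy and self.free[c] > 0]
        if not donors:
            return None                      # nobody has a stack to spare
        donor = min(donors, key=self.contention.get)
        self.free[donor] -= 1
        self.free[needy] += 1
        self.contention = {c: 0 for c in self.contention}
        return donor, needy


mgr = StackManager({"fs": 1, "net": 3})
first = mgr.acquire("fs")      # granted: fs had one free stack
second = mgr.acquire("fs")     # not granted: the invoking thread would block
moved = mgr.rebalance()        # shift a spare stack toward the contention
third = mgr.acquire("fs")      # granted after the adaptation
```

A hard real-time policy would instead bound the worst-case blocking time ahead of time rather than react to measurements; that analysis is not modeled here.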

Summary

Component invocation via thread migration is the means for IPC in Composite. It is an implementation of one of the design sweet spots for communication: same-core, synchronous IPC with restricted data movement. As opposed to the prevalent implementation that uses synchronous rendezvous between threads, thread migration affords a number of benefits when controlling end-to-end latency is a primary concern. Composite’s implementation also focuses on enabling the component-based implementation of policies for scheduling, contention, and thread block/wakeup. We’ve also seen that work has been done to move SRBT implementations toward thread migration to enjoy some of its benefits. In future posts, we’ll elaborate on the asynchronous IPC facilities in Composite to finish the theme of high-level IPC design.


  1. This post will mainly discuss inter-component communication. However, we’ll still use the traditional terminology of IPC. The important similarity between components and processes is that they are mutually isolated, for example via page-table hardware.