Design of the Genode OS Architecture: Core - the root of the process tree

Core is the first user-level program that takes control when starting up the system. It has access to the raw physical resources and converts them to abstractions that enable multiple programs to use these resources. In particular, core converts the physical address space to higher-level containers called dataspaces. A dataspace represents a contiguous physical address space region with an arbitrary size (at page-size granularity). Multiple processes can make the same dataspace accessible in their local address spaces. The system on top of core never deals with physical memory pages but uses this uniform abstraction to work with memory, memory-mapped I/O regions, and ROM areas.

Note: Using only contiguous dataspaces may lead to fragmentation of the physical address space. Physical contiguity is, however, required only in a few rare cases (e.g., DMA transfers). Therefore, later versions of the design will support non-contiguous dataspaces.

Furthermore, core provides all prerequisites to bootstrap the process tree. These prerequisites comprise services for creating processes and threads, for allocating memory, for accessing boot-time-present files, and for managing address-space layouts. Core is almost free from policy. There are no configuration options. The only policy of core is the startup of the init process to which core grants all available resources.

In the following, we explain the session interfaces of core's services in detail.

RAM - allocator for physical memory

A RAM session is a quota-bounded allocator of blocks from physical memory. There are no RAM-specific session-construction arguments. Immediately after the creation of a RAM session, its quota is zero. To make the RAM session functional, it must be loaded with quota from another already existing RAM session, which we call the reference account. The reference account of a RAM session can be defined initially via:

int ref_account(Ram_session_capability ram_session_cap);

Once the reference account is defined, quota can be transferred back and forth between the reference account and the new RAM session with:

int transfer_quota(Ram_session_capability ram_session_cap,
                   size_t amount);

Provided the RAM session has enough quota, a dataspace of a given size can be allocated with:

Ram_dataspace_capability alloc(size_t size);

The return value of alloc is a capability to the RAM-dataspace object implemented in core. This capability can be communicated to other processes and can be used to make the dataspace's physical-memory region accessible from these processes. An allocated dataspace can be released with:

void free(Ram_dataspace_capability ds_cap);

The alloc and free calls track the used-quota information of the RAM session accordingly. Current statistical information about the quota limit and the used quota can be retrieved by:

size_t quota();
size_t used();

Closing a RAM session implicitly destroys all allocated dataspaces.
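The quota rules above can be made concrete with a small in-process model. This is only a sketch under stated assumptions: Ram_session_model is a hypothetical class, a dataspace capability is reduced to an integer id, transfer_quota is assumed to be called on the donating session, and page-size rounding is left out.

```cpp
#include <cstddef>
#include <map>

// Toy model of the RAM session rules described above (not core's code).
class Ram_session_model
{
    std::size_t _quota = 0;                 // quota limit, zero after creation
    std::size_t _used  = 0;                 // quota consumed by live dataspaces
    Ram_session_model *_ref = nullptr;      // reference account, defined once
    std::map<int, std::size_t> _dataspaces; // dataspace id -> size
    int _next_id = 1;

public:
    // Only the root of the model is created with quota (assumption).
    explicit Ram_session_model(std::size_t initial_quota = 0)
    : _quota(initial_quota) { }

    int ref_account(Ram_session_model &ref)
    {
        if (_ref || &ref == this) return -1;
        _ref = &ref;
        return 0;
    }

    // Move 'amount' of unused quota from this session to 'to'. The two
    // sessions must be linked by a reference-account relationship.
    int transfer_quota(Ram_session_model &to, std::size_t amount)
    {
        bool related = (to._ref == this) || (_ref == &to);
        if (!related || amount > _quota - _used) return -1;
        _quota    -= amount;
        to._quota += amount;
        return 0;
    }

    // Returns a dataspace id, or 0 if the remaining quota does not suffice.
    int alloc(std::size_t size)
    {
        if (size == 0 || size > _quota - _used) return 0;
        _used += size;
        _dataspaces[_next_id] = size;
        return _next_id++;
    }

    void free(int ds)
    {
        auto it = _dataspaces.find(ds);
        if (it == _dataspaces.end()) return;
        _used -= it->second;
        _dataspaces.erase(it);
    }

    std::size_t quota() const { return _quota; }
    std::size_t used()  const { return _used; }
};
```

In this model, a freshly created session cannot allocate anything until it has named a reference account and received quota from it, and quota that backs a live dataspace cannot be transferred away.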

ROM - boot-time-file access

A ROM session represents a boot-time-present read-only file. This may be a module provided by the boot loader or a part of a static ROM image. On session construction, a file identifier must be specified as a session argument using the tag filename. The available filenames are not fixed but depend on the actual deployment. On some platforms, core may provide logical files for special memory objects such as the GRUB multiboot info structure or a kernel info page. The ROM session enables the actual read access to the file by exporting the file as a dataspace:

Rom_dataspace_capability dataspace();

IO_MEM - memory-mapped I/O access

With IO_MEM, core provides a dataspace abstraction for non-memory parts of the physical address space such as memory-mapped I/O regions or BIOS areas. In contrast to a memory block, which stores information whose physical location does not matter, a non-memory object has special semantics attached to its location within the physical address space. Its location is either fixed (by a standard) or can be determined at runtime, for example by scanning the PCI bus for PCI resources. If the physical location of such a non-memory object is known, an IO_MEM session can be created by specifying base and size as session-construction arguments. The IO_MEM session then provides the specified physical memory area as a dataspace:

Io_mem_dataspace_capability dataspace();

IO_PORT - access to I/O ports

For platforms that rely on I/O ports for device access, core's IO_PORT service enables fine-grained assignment of port ranges to individual processes. Each IO_PORT session corresponds to the exclusive access right to a port range as specified with the io_port_base and io_port_size session-construction arguments. Core creates the new IO_PORT session only if the specified port range does not overlap with the range of an already existing session. This ensures that each I/O port is driven by only one process at a time. The IO_PORT session interface resembles the physical I/O-port access instructions. Reading from an I/O port can be performed via an 8-bit, 16-bit, or 32-bit access:

unsigned char  inb(unsigned short address);
unsigned short inw(unsigned short address);
unsigned       inl(unsigned short address);

Conversely, there are functions for writing to an I/O port via an 8-bit, 16-bit, or 32-bit access:

void outb(unsigned short address, unsigned char value);
void outw(unsigned short address, unsigned short value);
void outl(unsigned short address, unsigned value);

The address argument of each I/O-port access function is an absolute port address that must lie within the port range of the session.
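The two checks described in this section, non-overlapping session ranges and in-range port addresses, can be sketched as follows. Port_range, Io_port_registry_model, and access_allowed are hypothetical names used only for illustration. The 64 KiB limit is an assumption derived from the 16-bit address type of the interface, and the rule that a multi-byte access must fit entirely into the range is likewise an assumption of this sketch.

```cpp
#include <vector>

// Half-open port range [base, base + size).
struct Port_range { unsigned base; unsigned size; };

// Two half-open ranges overlap if each one starts before the other ends.
inline bool overlaps(Port_range a, Port_range b)
{
    return a.base < b.base + b.size && b.base < a.base + a.size;
}

// Toy model of core's bookkeeping for IO_PORT sessions (not core's code).
class Io_port_registry_model
{
    std::vector<Port_range> _granted;

public:
    // Grants the range only if it is valid and overlaps no existing session.
    bool create_session(Port_range r)
    {
        if (r.size == 0 || r.base > 0x10000 || r.size > 0x10000 - r.base)
            return false;
        for (Port_range g : _granted)
            if (overlaps(g, r)) return false;
        _granted.push_back(r);
        return true;
    }
};

// Checks that an access of 'width' bytes (1, 2, or 4) at the absolute port
// 'address' lies completely within the session's range.
inline bool access_allowed(Port_range session, unsigned short address,
                           unsigned width)
{
    return address >= session.base
        && address + width <= session.base + session.size;
}
```

For example, after a session for the eight ports starting at 0x3f8 has been created, a request for a range starting at 0x3fc is rejected, whereas a request starting at 0x400 succeeds.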

IRQ - handling device interrupts

The IRQ service of core provides processes with an interface to device interrupts. Each IRQ session corresponds to an attached interrupt. The physical interrupt number is specified via the irq_number session-construction argument. A physical interrupt number can be attached to only one session. The IRQ session interface provides a blocking function to wait for the next interrupt:

void wait_for_irq();

While the wait_for_irq function blocks, core unmasks the interrupt corresponding to the IRQ session. On function return, the corresponding interrupt line is masked and acknowledged.

RM - managing address-space layouts

RM is a region-manager service that allows for constructing address-space layouts (region maps) from dataspaces and that supports assigning region maps to processes by paging the processes' threads. Each RM session corresponds to one region map. After creating a new RM session, dataspaces can be attached to the region map via:

void *attach(Dataspace_capability ds_cap,
             size_t size=0, off_t offset=0,
             bool use_local_addr = false,
             addr_t local_addr = 0);

The attach function inserts the specified dataspace into the region map and returns the actually used start position within the region map. By using the default arguments, the region manager chooses an appropriate position that is large enough to hold the whole dataspace. Alternatively, the caller of attach can attach any sub-range of the dataspace at a specified target position to the region map by enabling use_local_addr and specifying an argument for local_addr. Note that the interface allows for the same dataspace to be attached not only to multiple region maps but also multiple times to the same region map. As the counterpart to attach, detach removes dataspaces from the region map:

void detach(void *local_addr);

The region manager determines the dataspace at the specified local_addr (not necessarily the start address) and removes the whole dataspace from the region map. To enable the use of an RM session by a process, we must associate it with each thread running in the process. The function

Thread_capability add_client(Thread_capability thread);

returns a thread capability for a pager that handles the page faults of the specified thread according to the region map. With subsequent page faults caused by the thread, the address-space layout described by the region map becomes valid for the process that is executing the thread.
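The attach and detach semantics can be illustrated with a toy region map. The sketch below makes several assumptions that are not part of the interface: Region_map_model is a hypothetical class, a dataspace is identified by an integer id and its size is passed explicitly, addresses are plain integers instead of pointers, free space is searched first-fit, and a return value of 0 signals failure.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

using addr_t = std::uintptr_t;

// Toy model of one region map (not core's code).
class Region_map_model
{
    struct Region { int ds; std::size_t size; std::size_t offset; };

    std::map<addr_t, Region> _regions; // keyed by region start address
    addr_t _base, _end;                // managed range [_base, _end)

    bool _is_free(addr_t start, std::size_t size) const
    {
        if (start < _base || start > _end || size > _end - start)
            return false;
        for (auto it = _regions.begin(); it != _regions.end(); ++it)
            if (start < it->first + it->second.size
             && it->first < start + size)
                return false;
        return true;
    }

public:
    Region_map_model(addr_t base, addr_t end) : _base(base), _end(end) { }

    addr_t attach(int ds, std::size_t ds_size, std::size_t size = 0,
                  std::size_t offset = 0, bool use_local_addr = false,
                  addr_t local_addr = 0)
    {
        if (offset >= ds_size) return 0;
        if (size == 0) size = ds_size - offset;   // default: rest of dataspace
        if (size > ds_size - offset) return 0;

        addr_t start = local_addr;
        if (!use_local_addr) {
            // First fit: walk the regions in address order and take the
            // first gap that is large enough.
            start = _base;
            for (auto it = _regions.begin(); it != _regions.end(); ++it) {
                if (it->first - start >= size) break;
                start = it->first + it->second.size;
            }
        }
        if (!_is_free(start, size)) return 0;
        _regions[start] = Region{ds, size, offset};
        return start;
    }

    // Removes the whole region that contains local_addr, if any.
    void detach(addr_t local_addr)
    {
        auto it = _regions.upper_bound(local_addr);
        if (it == _regions.begin()) return;
        --it;
        if (local_addr - it->first < it->second.size) _regions.erase(it);
    }

    std::size_t region_count() const { return _regions.size(); }
};
```

Because regions are keyed by their start address and not by dataspace, the model permits attaching the same dataspace several times, and detach accepts any address inside a region.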

CPU - allocator for processing time

A CPU session is an allocator for processing time that allows for the creation, the control, and the destruction of threads of execution. No session arguments are used. The functionality of starting and killing threads is provided by two functions:

Thread_capability create_thread(const char* name);
void kill_thread(Thread_capability thread_cap);

The create_thread function takes a symbolic thread name (that is only used for debugging purposes) and returns a capability to the new thread. Furthermore, the CPU session provides the following functions for operating on threads:

int set_pager(Thread_capability thread_cap,
              Thread_capability pager_cap);

int cancel_blocking(Thread_capability thread_cap);

int start(Thread_capability thread_cap,
          addr_t ip, addr_t sp);

int state(Thread_capability thread,
          Thread_state     *out_state);

The set_pager function registers the thread's pager, where pager_cap (obtained by calling add_client on an RM session) refers to the RM session to be used as the address-space layout. To start the actual execution of the thread, its initial instruction pointer (ip) and stack pointer (sp) must be specified for the start operation. In turn, the state function provides the current thread state, including the current instruction pointer and stack pointer. The cancel_blocking function causes the specified thread to cancel a currently executed blocking operation such as waiting for an incoming message or acquiring a lock. This function is used by the framework for gracefully destructing threads.

Note: Future versions of the CPU service will provide means to further control the thread during execution (e.g., pause, execution of only one instruction), acquiring more comprehensive thread state (current registers), and configuring scheduling parameters.

PD - providing protection domains

A PD session corresponds to a memory protection domain. Together with one or more threads and an address-space layout (RM session), it forms a process. There are no session arguments. After session creation, the PD contains no threads. Once a new thread has been created from a CPU session, it can be assigned to the PD by calling:

int bind_thread(Thread_capability thread);
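How the CPU, RM, and PD operations fit together when bringing up a thread can be sketched with a single mock object. Everything here is a simplification for illustration: Core_model, Thread_model, Cap, and launch are hypothetical names, capabilities are reduced to table indices, and the rule that start fails for a thread that is unbound or has no pager is an assumption of the model, not a documented property of core. The order of the bind_thread and set_pager steps is also just one plausible choice.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a capability: a table index (-1 = invalid).
using Cap = int;

struct Thread_model
{
    std::string   name;
    Cap           pager   = -1;    // set by set_pager
    bool          bound   = false; // set by bind_thread
    bool          started = false; // set by start
    unsigned long ip = 0, sp = 0;
};

// One object plays the roles of the CPU, RM, and PD sessions.
struct Core_model
{
    std::vector<Thread_model> threads;
    std::vector<Cap>          pager_client; // pager cap -> paged thread

    bool valid(Cap t) const { return t >= 0 && t < (Cap)threads.size(); }

    // CPU session: create a thread that is neither bound nor running.
    Cap create_thread(const char *name)
    {
        Thread_model t;
        t.name = name;
        threads.push_back(t);
        return (Cap)threads.size() - 1;
    }

    // RM session: create a pager for the thread and return its capability.
    Cap add_client(Cap thread)
    {
        if (!valid(thread)) return -1;
        pager_client.push_back(thread);
        return (Cap)pager_client.size() - 1;
    }

    // CPU session: register the pager obtained from add_client.
    int set_pager(Cap thread, Cap pager)
    {
        if (!valid(thread) || pager < 0 || pager >= (Cap)pager_client.size()
            || pager_client[pager] != thread) return -1;
        threads[thread].pager = pager;
        return 0;
    }

    // PD session: assign the thread to the protection domain.
    int bind_thread(Cap thread)
    {
        if (!valid(thread) || threads[thread].bound) return -1;
        threads[thread].bound = true;
        return 0;
    }

    // CPU session: start execution at ip with stack pointer sp.
    int start(Cap thread, unsigned long ip, unsigned long sp)
    {
        if (!valid(thread)) return -1;
        Thread_model &t = threads[thread];
        if (!t.bound || t.pager < 0 || t.started) return -1; // model rule
        t.ip = ip; t.sp = sp; t.started = true;
        return 0;
    }
};

// Create, bind, page, and start one thread; returns 0 on success.
int launch(Core_model &core, const char *name,
           unsigned long ip, unsigned long sp)
{
    Cap thread = core.create_thread(name);
    if (core.bind_thread(thread) != 0) return -1;
    Cap pager = core.add_client(thread);
    if (pager < 0 || core.set_pager(thread, pager) != 0) return -1;
    return core.start(thread, ip, sp);
}
```

The launch helper mirrors the text: the thread comes from the CPU session, is bound to the PD, receives a pager from the RM session through add_client and set_pager, and only then is started with its initial ip and sp.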

CAP - allocator for capabilities

A capability is a system-wide unique object identity that typically refers to a remote object implemented by a service. For each object to be made remotely accessible, the service creates a new capability associated with the local object. CAP is a service to allocate and free capabilities:

Capability alloc(Capability ep_cap);
void free(Capability cap);

The alloc function takes an entrypoint capability as argument, which is the communication receiver for invocations of the new capability's RPC interface.
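The association between a capability and its entrypoint can be sketched with a small table. This is a toy model only: Cap_session_model and its entrypoint lookup are hypothetical, the Capability type is reduced to a numeric id, and uniqueness is modelled with a simple counter.

```cpp
#include <map>

// Hypothetical, minimal capability: just a unique numeric identity.
struct Capability { unsigned long id; };

// Toy model of the CAP service (not core's code).
class Cap_session_model
{
    std::map<unsigned long, unsigned long> _entrypoint_of; // cap id -> ep id
    unsigned long _next = 1;

public:
    // Create a new, unique capability whose invocations are received by
    // the entrypoint referred to by ep_cap.
    Capability alloc(Capability ep_cap)
    {
        Capability cap{_next++};
        _entrypoint_of[cap.id] = ep_cap.id;
        return cap;
    }

    void free(Capability cap) { _entrypoint_of.erase(cap.id); }

    // Model-only helper: id of the receiving entrypoint, or 0 if the
    // capability is unknown or has been freed.
    unsigned long entrypoint(Capability cap) const
    {
        auto it = _entrypoint_of.find(cap.id);
        return it == _entrypoint_of.end() ? 0 : it->second;
    }
};
```

A service would call alloc once per object it wants to make remotely accessible, so several capabilities may share one entrypoint while each keeps its own identity.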

LOG - debug output facility

The LOG service is used by the lowest-level system components such as the init process for printing debug output. Each LOG session takes a label string as session argument, which is used to prefix the debug output of this session. This enables developers to distinguish multiple producers of debug output. The function

size_t write(const char *string);

outputs the specified string to the debug-output backend of core.
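The effect of the session label can be shown with a small model. Log_session_model is a hypothetical class, the backend is a plain std::string instead of core's debug output, and both the "[label] " prefix format and the return value (the length of the string passed in) are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Toy model of a LOG session (not core's code).
class Log_session_model
{
    std::string  _label;   // session argument, fixed at construction
    std::string &_backend; // stands in for core's debug-output backend

public:
    Log_session_model(const char *label, std::string &backend)
    : _label(label), _backend(backend) { }

    // Appends the label-prefixed string to the backend and returns the
    // number of characters of 'string' that were written.
    std::size_t write(const char *string)
    {
        _backend += "[" + _label + "] " + string;
        return std::strlen(string);
    }
};
```

With two sessions labelled "init" and "driver" that share one backend, each output line carries the label of the session that produced it, which is what lets a developer tell the producers apart.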
