Bringing the Genode OS Framework to the OKL4 kernel

This article documents the process of bringing the Genode OS Framework to a new kernel platform, namely the OKL4 kernel developed by OK-Labs. OKL4 is an industry-grade kernel that is deployed in millions of mobile phones.

For our work, we went for the OKL4 version 2.1 for two reasons. First, whereas this version officially supports the x86 architecture, the later version 3 is pretty much focused on the ARM architecture. At present, the x86 architecture is our primary platform for Genode development. Second, we like to follow the evolution of OKL4 from its genesis (L4ka::Pistachio) to the capability-based kernel design as pursued with the later versions. On this path, the version 2.1 is an important milestone, which we wont like to miss. Nevertheless of having chosen version 2.1 to begin with, we plan to bring Genode to later versions of OKL4 as well.

In the article, we face numerous challenges such as integrating OKL4 support into Genode's build system, exploring the OKL4 kernel interface and the boot procedure, adapting Genode's framework libraries to the feature set provided by the new kernel, and accessing interrupts and other hardware resources.

The intended audience are developers interested in exploring the realms of the L4-microkernel world and kernel developers who consider running Genode as user-land infrastructure on top of their kernel. For the latter group, we laid out the article as a rough step-by-step guide providing our proposed methodology for approaching the port of Genode to a new kernel platform. At many places, the article refers to the source code of Genode, in particular the base-okl4 repository. You can read the code online via our subversion repository:

Browse the Genode subversion repository...

Build-system support

The first step is to create a simple hello-world program that can be executed directly on the OKL4 kernel as roottask-replacement. This program does not rely on any kernel features but uses port I/O to output some characters to the serial interface. We need to understand the following things:

  • We need a program that outputs some characters to the serial interface. This program can be developed on a known kernel platform. Once we have a working hello program, we only need to port it to the new kernel platform but can assume that the test program itself is correct.

  • How must the OKL4 rootask be linked in order to be executed by the kernel?

  • How does the OKL4 boot procedure work? OKL4 relies on a tool called elfweaver, which creates a bootable ELF-image (often called single image) from multiple binaries, in particular the kernel and roottask. We need to create a minimalist elfweaver configuration file that just starts the kernel and our hello example.

The result of this first step can be found in src/test/okl4_01_hello_raw:

crt0

is the assembly startup code taken from the L4/Fiasco version of Genode. This code defines the initial stack, contains the entry point of the hello program, which calls a C function called _main.

hello.cc

is the implementation of the _main function, which outputs some characters directly via the serial interface of a PC. It does not contain any kernel-specific code nor it depends on any include files.

genode.ld

is the linker script that we already use for Genode programs on other base platforms.

weaver.xml

is the description file of the single image to be created by OKL4's elfweaver tool. It is useful to take a close look at this file. The most important bits are the filename of the kernel specified in the <kernel> tag and the filename of the hello program specified in the <rootprogram> tag.

Makefile

contains the steps needed to compile the hello program and invoke elfweaver to create the bootable single image.

To boot the single image, you can use your favorite boot loader such as Grub. The single-image file must be specified as kernel. When booted, the program should print a message over the serial line.

The next step is the proper integration of the hello example into the Genode build system. For this, we create a new source-code repository called base-okl4 with the following structure:

 base-okl4/lib/mk/x86/startup.mk
 base-okl4/mk/spec-okl4.mk
 base-okl4/mk/spec-okl4_x86.mk
 base-okl4/src/test/okl4_02_hello/target.mk
 base-okl4/src/test/okl4_02_hello/hello.cc
 base-okl4/src/platform/x86/_main.cc
 base-okl4/src/platform/x86/crt0.s
 base-okl4/src/platform/genode.ld
 base-okl4/etc/specs.conf

The OKL4-specific build-system support is contained in the files specs.conf, spec-okl4.mk, and spec-okl_x86.mk. The specs.conf file steers the build process once the base-okl4 repository is specified in the REPOSITORIES declaration in the etc/build.conf file in the build directory. The spec-okl4_x86.mk file describes the build specifics via the mechanism described in Genode's getting-started documentation:

 SPECS = genode okl4_x86

Driven by the content of this SPECS declaration, the build system first includes the spec files for spec-genode.mk (found in the base/ repository) and spec-okl4_x86.mk (found in the base-okl4/ repository). The latter file contains all build options for OKL4 on the x86 architecture, extends the SPECS declaration by the platform specifics x86_32 and okl4 (which both apply for okl4_x86), and aggregates the corresponding spec files:

 SPECS += x86_32 okl4

 LD_SCRIPT    ?= $(call select_from_repositories,src/platform/genode.ld)
 CXX_LINK_OPT += -Wl,-T$(LD_SCRIPT) -Wl,-Ttext=0x01000000

 include $(call select_from_repositories,mk/spec-x86_32.mk)
 include $(call select_from_repositories,mk/spec-okl4.mk)

The spec file for x86_32 is contained in the base/ repository. The one for okl4 is provided by base-okl4/. It contains all build options that are independent from the hardware platform, OKL4 is deployed on:

 -include $(call select_from_repositories,etc/okl4.conf)
 -include $(BUILD_BASE_DIR)/etc/okl4.conf

 INC_DIR += $(OKL4_DIR)/build/iguana/include
 INC_DIR += $(REP_DIR)/include

 PRG_LIBS += startup

 CC_OPT_NOSTDINC += -nostdinc
 CXX_LINK_OPT    += -static -nostdlib -Wl,-nostdlib
 EXT_OBJECTS     += $(shell $(CUSTOM_CXX_LIB) -print-file-name=libsupc++.a) \
                    $(shell $(CUSTOM_CXX_LIB) -print-file-name=libgcc_eh.a) \
                    $(shell $(CUSTOM_CXX_LIB) -print-libgcc-file-name)

 EXT_OBJECTS += $(OKL4_DIR)/build/iguana/lib/libl4.a

The most interesting point is that this file reads an OKL4-specific config file from the etc/ subdirectory of the build directory. From this file, it obtains the location of the OKL4 distribution via the OKL4_DIR declaration. The spec-okl4.mk file above adds the build/iguana/include path to the default include search locations. We need this path for including the headers from the l4/ subdirectory. Unfortunately, build/iguana/include/ contains a lot of further includes, which we don't want to use. In contrary, these includes pollute our include-search space. This is particularly problematic for headers such as stdio.h, which will inevitably collide with Genode's own libC headers. Hence we need to find a way, to isolate the l4/ headers from the remaining Iguna headers. One elegant way is to shadow the build/iguana/include/l4 directory in our local Genode build directory. This can be accomplished either manually by creating a symbolic link from OKL4's build/iguana/include/l4 to an include file within our Genode build directory, or by letting make create such a link automatically. The corresponding rules for this approach can be found in the spec-okl4.mk file.

On Genode, the startup code is encapsulated in a library called startup, which is linked to each program by default. This library essentially consists of a little snipped of assembly startup code crt0.s, which calls a platform- independent C startup function called _main implemented in _main.cc. The library-description file for the startup library is called startup.mk and has the following content:

 REQUIRES = okl4 x86
 SRC_S    = crt0.s
 SRC_CC   = _main.cc

 vpath crt0.s   $(REP_DIR)/src/platform/x86
 vpath _main.cc $(REP_DIR)/src/platform/x86

We will use a _main.cc from another platform as template for the OKL4- specific startup code but strip it down to an absolute minimum (leaving out everything except the call the actual main function. Note that for this simple setup, we need to explicitly reference a symbol of crt0.s from _main.cc to prevent the linker from discarding the otherwise unreferenced object file (which only contains our entry point). The easiest way is to reference the __dso_handle variable, which is defined in crt0.s. However, this is an intermediate work-around, which we will remove in the next step. Alternatively, we could rely on the -u option of the linker to prevent the entry symbol (_start) from being discarded.

The implementation of the hello program equals the version of okl4_01_hello_raw except that the main function is actually called main rather than _main. The corresponding target description file target.mk is straight forward:

 TARGET   = hello
 REQUIRES = okl4
 SRC_CC   = hello.cc

Creating dummy versions of the env and cxx libraries

So far, the hello program does rely neither on OKL4-specific nor Genode-specific code. The goal of the next step is to remove the differences between the _main.cc file in our repository and the _main.cc file of the other base platforms. We will add proper C++ initialization, the calling of static constructors, and a proper console implementation.

The first step is to include the cxx library to our target. This is a Genode-specific C++ support library, which contains functions used as back end of the GCC's libsupc++ and libgcc_eh. To include the cxx library for building our hello program, we add the following declaration to the target.mk file:

 LIBS = cxx

On a rebuild, the build system will try to compile the cxx library, which, in turn, depends on a number of Genode header files. Most of these header files are generic and hence contained in the base/ repositories. However, the following header files are specific for the actual base platform and, therefore, must be provided by ourself:

base/capability.h

This file defines the representation of an object capability on the actual platform. For now, we can use the following version, which we will expand later on (at the current stage, the Capability class is not actually used but we need its definition for successful compilation. The OKL4-specific capability.h file must be placed in include/base/ of the base-okl4/ repository.

 #ifndef _INCLUDE__BASE__CAPABILITY_H_
 #define _INCLUDE__BASE__CAPABILITY_H_

 namespace Genode {
   class Capability {
     public: bool valid() const { return false; }
   }
   typedef int Connection_state;
 }

 #endif /* _INCLUDE__BASE__CAPABILITY_H_ */
base/native_types.h

This file defines platform representations of thread IDs, locks etc. Please take a look at the native_types.h file of another platform to get an overview on these types. For now, the following simple version suffices:

 #ifndef _INCLUDE__BASE__NATIVE_TYPES_H_
 #define _INCLUDE__BASE__NATIVE_TYPES_H_

 namespace Genode {
   typedef volatile int Native_lock;
   typedef          int Native_thread_id;
   typedef          int Native_thread;
 }

 #endif /* _INCLUDE__BASE__NATIVE_TYPES_H_ */

In fact, at this point, the types are just dummies, which we will replace later when porting further parts of the framework.

base/ipc.h

This is a platform-specific wrapper for Genode's IPC API. Usually, this file just includes base/ipc_generic.h. Optionally, it can host platform-specific IPC functionality.

 #ifndef _INCLUDE__BASE__IPC_H_
 #define _INCLUDE__BASE__IPC_H_

 #include <base/ipc_generic.h>

 #endif /* _INCLUDE__BASE__IPC_H_ */
base/ipc_msgbuf.h

This file defines the IPC message-buffer layout. Naturally, it is highly platform specific. For now, the following dummy message-buffer layout will do:

 #ifndef _INCLUDE__BASE__IPC_MSGBUF_H_
 #define _INCLUDE__BASE__IPC_MSGBUF_H_

 namespace Genode {
   class Msgbuf_base { };

   template <unsigned BUF_SIZE>
   class Msgbuf : public Msgbuf_base { };
 }

 #endif /* _INCLUDE__BASE__IPC_MSGBUF_H_ */

Once, we have created these platform-specific header files, the cxx libary should compile successfully. However, there are a number of unresolved symbols when linking the hello program. The cxx library uses Genode's env()->heap() as back end for its local malloc implementation. But so far, we do not have ported Genode's env library. Furthermore, there are unresolved references to Genode::printf as provided by Genodes console implementation and some functions of the IPC framework.

Let us first resolve the Genode::printf references by creating an OKL4-specific version of Genode's console library. For this, we create a new back end in src/base/console/okl4_console.cc that uses the serial output mechanism that we employed for our first hello_raw program. The corresponding library description file lib/mk/printf_okl4.mk looks as follows:

 SRC_CC = okl4_console.cc
 LIBS   = cxx console

 vpath %.cc $(REP_DIR)/src/base/console

Now, we can add printf_okl4 to the LIBS declaration of hello's target.mk file. When recompiling the hello program, the new printf_okl4 library will be built and resolve the Genode::printf symbols. There remain the unresolved references to Genode::env() and parts of the IPC framework.

The IPC implementation in src/base/ipc/ipc.cc is not straight forward and we defer it for now. Hence, we place only the following dummy functions into the ipc.cc file:

 #include <base/ipc.h>

 using namespace Genode;

 Ipc_ostream::Ipc_ostream(Capability dst, Msgbuf_base *snd_msg) :
   Ipc_marshaller(0, 0) { }

 void Ipc_istream::_wait() { }

 Ipc_istream::Ipc_istream(Msgbuf_base *rcv_msg) :
   Ipc_unmarshaller(0, 0) { }

 Ipc_istream::~Ipc_istream() { }

 void Ipc_client::_call() { }

 Ipc_client::Ipc_client(Capability &srv, Msgbuf_base *snd_msg,
                                         Msgbuf_base *rcv_msg) :
   Ipc_istream(rcv_msg), Ipc_ostream(srv, snd_msg), _result(0) { }

 void Ipc_server::_wait() { }

 void Ipc_server::_reply() { }

 void Ipc_server::_reply_wait() { }

 Ipc_server::Ipc_server(Msgbuf_base *snd_msg,
                        Msgbuf_base *rcv_msg) :
   Ipc_istream(rcv_msg), Ipc_ostream(Capability(), snd_msg) { }

The corresponding library-description file lib/mk/ipc.mk looks as follows:

 SRC_CC = ipc.cc
 vpath ipc.cc $(REP_DIR)/src/base/ipc

By adding ipc to the LIBS declaration in hello's target.mk file, the IPC-related linker errors should disappear and only the reference to Genode::env() remains. To resolve this symbol, we add the following dummy function directly into the code of hello.cc.

 namespace Genode {
   void *env() { return 0; }
 }

Before we can use the Genode framework, which is written in C++, we need to make sure that all static constructors are executed in the startup code (_main). Therefore, we add the following code to the _main function:

 void (**func)();
 for (func = &_ctors_end; func != &_ctors_start; (*--func)());

The referenced symbols _ctors_start and _ctors_end are created by the linker script. The corresponding declarations are provided by base/include/base/crt0..

Now, its time to replace the direct I/O port access in hello.cc by Genode's printf implementation. Just add the following line to the main function of hello.cc and make sure to include <base/printf.h>:

 Genode::printf("This is Genode's printf\n");

When starting the resulting program, this message should appear via the serial interface comport 0.

Initializing the C++ exception handling

The Genode OS Framework makes use of C++ exceptions. Hence, we need to make sure to properly initialize the libsupc++. This initialization comes down to calling the function

 __register_frame(__eh_frame_start__);

which is performed by the function init_exception_handling as provided by the generic cxx library. Normally, init_exception_handling is called from _main. It is important to know that the initialization code does use malloc, which is mapped to Genode's env()->heap() by the cxx library. Consequently, we need a working heap to successfully initialize the exception handling.

Therefore, we have to replace the dummy env() function in our hello program with something more useful. The header file src/test/minimal_env.h provides the heap functionality by using a minimalistic custom environment, which contains a heap with static pool of memory. With such an environment in place, we can safely call init_exception_handling from the _main startup code. The test okl4_02_hello is the result of this step. It first prints some text via Genode's printf implementation and then triggers a C++ exception.

Thread creation

So far, we have not performed any OKL4 system call. The first system call that we will explore is the L4_ThreadControl to create a thread. A corresponding test for this functionality is implemented in the test/okl4_03_thread example. This example creates a new thread with the thread number 1. Note that the matching L4 thread ID uses the lowest 14 bits as version number, which is always set to 1. Hence, the L4 thread ID of thread number 1 will be 0x4001. If you happen to need to look up this thread in OKL4's kernel debugger, you will find its thread control block (TCB) via this number.

Another important thing to note is that rootask's main thread runs initially at the priority of 255 whereas newly created threads get assigned a default priority of 100. To make OKL4's preemtive scheduling to work as expected, we need to assign the same priority to both threads by calling L4_Set_Priority.

IPC framework

Now that we can start multiple threads, we can fill Genode's IPC framework with life.

However, before we can get started with communication between threads, the communication partners must have a way to get to know each other. In particular, a receiver of IPC communication needs a way to make its communication address known to a sender. OKL4 uses L4_ThreadId_t as communication address. The thread's ID is assigned to each thread by its creator. The thread itself however, does not know its own identity when started up. In contrast to other L4 kernels that provide a way for thread to determine its own identity via a L4_Myself call, this functionality is not supported on OKL4. Therefore, the creator of a new thread must communicate the assigned thread ID to the new thread via a startup protocol. We use OKL4's UserDefinedHandle for this purpose. This is an entry of the threads UTCB that can be remotely accessed by the creating thread. Before starting the new thread, the creator writes the assigned thread ID to the new thread's user-defined handle. In turn, the startup code of the new thread copies the supplied value from the user-defined handle to a thread-local entry of the UTCB (a designated ThreadWord). In the following, the thread can always determine its own global ID by reading this ThreadWord from its UTCB. We declare the convention about which ThreadWord to use for this purpose in Genode's base/native_types.h (UTCB_TCR_THREAD_WORD_MYSELF).

IPC send and wait

The test program okl4_04_ipc_send_wait sends an IPC messages via Genode's Ipc_istream and Ipc_ostream framework. To make this example functional, we have to work on the following parts of the base-okl4/ repository.

include/base/capability.h

Genode uses the Capability class to address an IPC communication and a referenced object. Therefore, we must provide a valid representation of these information. Because all IPC operations on OKL4 always address threads, we use L4_ThreadId_t as representation of communication address. There are no kernel objects representing user-level objects in OKL4 (version 2). So we need to manage object identities on the user level, unprotected by the kernel. For now, we simply use a globally unique object ID for this purpose.

include/base/ipc_msgbuf.h

The message-buffer representation used for OKL4 does not use any kernel-specific layout because IPC payload is always transferred through the communicating thread's UTCBs. Hence, the Msgbuf template does only need to provide some space for storing messages but no control information.

src/base/ipc/ipc.cc

For the send-and-wait test, we need to implement the Ipc_istream and Ipc_ostream class functions: the constructors of Ipc_istream and Ipc_ostream, the _wait function, and the _send function. It is useful to take a look at the other platform's implementations for reference. Because the Genode IPC Framework provides the functionality for marshalling and unmarshalling of messages, we skip OKL4 message.h convenience abstraction in favor of addressing UTCB message registers ipc.h directly.

IPC call

The test program okl4_05_ipc_call performs IPC communication using Genode's Ipc_client and Ipc_server classes. To make this test work, the corresponding functions in src/base/ipc/ipc.cc must be implemented, in particular the functions _reply_wait and _call.

Address-space creation and page-fault handling

There are the following Peculiarities of OKL4 with regard to address-spaces.

OKL4 does not use IPC to establish memory mappings but an independent system call L4_MapControl to configure the local or an remote address space. In the line of other L4 kernels, page faults are handled via an IPC-based pager protocol. The typical mode of operation of a pager looks like:

  1. A page fault occurs, the kernel translates the page fault into a page-fault message delivered to the pager of the faulting thread.

  2. The pager receives a page-fault message, decodes the page-fault address, the fault type (read, write, execute), and the instruction pointer of the faulter from the page-fault message.

  3. The pager resolves the page fault by populating the faulter's address spaces with valid pages via L4_MapControl.

  4. The pager answers the page-fault message with an empty IPC to resume the operation of the faulter.

In contrast to L4/Fiasco and L4ka::Pistachio, which incorporate the memory mapping into the reply message, this procedure involves an additional system call. However, it is more flexible and allows the construction of a fully populated address space without employing an IPC-based protocol. Furthermore, the permissions for establishing memory mappings are well separated from IPC-communication rights.

In contrast to the L4/Fiasco and L4ka::Pistachio kernels, which take a virtual address of the mapper as argument, the OKL4 map operation always refers to a physical page. This enables the configuration of a remote address space without having all the used pages locally mapped as well. For specifying a local virtual address for a mapping, we can use the L4_ReadFpage function to look up a physical-memory descriptor for a given virtual address.

The test okl4_06_pager creates an address space to be one-to-one mapped with roottask. In the new address space, a thread is created. For the new thread, we use the roottask thread as pager. Once started, the new that raises a number page faults:

  1. Reading the first instruction of the entry point

  2. Accessing the first stack element

  3. Reading data

  4. Writing data

The pager receives the corresponding page-fault messages, prints the decoded information, and resolves the page faults accordingly.

Determining the memory configuration and boot modules

OKL4 provides its boot information to roottask via a boot-info structure, which is located at the address provided in roottask's UTCB message register 1. This structure is created by OKL4's elfweaver during the creation of the boot image. It has no fixed layout but it contains a batch of operations such as "add memory pool" or "create protection domain". In short, it (loosely) resembles the content of the elfweaver XML config file in binary form. Most of elfweaver's features will remain unused when running Genode on OKL4. However, there are some important bits of information we need to know:

  • Memory configuraion

  • Information on the boot modules

For parsing the boot-info structure, there exists a convenient library located in the OKL4 source tree at libs/bootinfo. The test program okl4_07_boot_info uses this library to obtain the information we are interested in.

Note that we link the library directly to the test program by using the EXT_OBJECTS declaration in the target.mk file. We are not adding this library to the global spec-okl4.mk file because we need the bootinfo-library only at a very few places (this test program and core).

We obtain the memory configuration by assigning a callback function to the init_mem entry of the bi_callbacks_t structure supplied to the parser library. There are indeed two init_mem function called init_mem and init_mem2. The second instance is called during a second parsing stage. However, both functions seem to be called with the same values. So we just disregard the values supplied to init_mem2 at this point.

To include other modules than the rootprogram to the boot image, we use the help of elfweaver's <pd> declaration. We create a pseudo protection domain as a container for several memory sections, each section loaded with the content of a file. An example declaration for including the files init and config into the boot image looks like this:

<pd name="modules">
  <memsection name="init"   file="init"   direct="true" />
  <memsection name="config" file="config" direct="true" />
</pd>

The direct="true" attribute has the effect that the memory section will have equal physical and virtual addresses.

When observing the output of okl4_07_boot_info, the relevant information are the new_ms (new memory section) lines with owner != 0 (another PD than roottask) and virtpool != 1. These memory sections correspond to the files. However, the association of the memory sections with their file names is still missing at this point. To resolve this problem, we also observe the export_object calls. For each memory section, export_object gets called with the type parameter set to BI_EXPORT_MEMSECTION_CAP and the key parameter set to the name of the file. Note that the file name is converted to upper case. For associating memory sections with file names, we assume that the order of new_ms calls corresponds to the order of matching export_object calls.

Interrupt handling and time source

In contrast to most of the classical L4 kernels, OKL4 provides no means for accessing wall-clock time from the user land. Internally, OKL4 uses a scheduling timer to perform preemptive scheduling but it does not expose a time source to the user land via IPC timeouts. Hence, we need an alternative way to obtain a user-level time source. We follow the same path as Iguana by driving the programmable interval timer (PIT) directly from a user-level service. Because OKL4 uses the more modern APIC timer, which is completely independent of the PIT, both the kernel and the user land can use entirely different timer devices as their respective time source.

The PIT is connected to the interrupt line 0 of the programmable interrupt controller (PIC). The test program okl4_08_timer_pit switches the PIT into one-shot mode and waits for timer interrupts. Each time a timer interrupt occurs, the next one-shot is scheduled. The program tests two important things: How does the interrupt handling work on OKL4 and how to provide a user-level time source?

The following things are worth mentioning with regard to IRQ handling:

  • By default, no one (roottask included) has the right to handle interrupts. We have to explicitly grant ourself the right to handle a particular interrupt by calling L4_AllowInterruptControl.

  • When calling L4_RegisterInterrupt, the kernel expects a real global thread ID, not the magic ID returned by L4_Myself().

  • Interrupts are delivered in an asynchronous fashion by using OKL4's notification mechanism. To block for incoming asynchronous messages, the corresponding notification bit must be unmasked and notifications must be accepted.

  • The interrupt-handler loop invokes two system calls per interrupt, L4_ReplyWait for blocking for the next interrupt and L4_AcknowledgeInterrupt for interrupt acknowledgement. Both syscalls could be consolidated into a call of L4_AcknowledgeWaitInterrupt.

Porting core

Now that we have discovered the most functional prerequisites for running Genode on OKL4, we can start porting Genode's core. I suggest to take another platform's core version as a template. For OKL4, the base-pistachio version becomes handy. First, make a copy of src/core to the base-okl4/ repository. Then we revisit all individual files and remove all platform-specific code with the goal to create a skeleton of core that compiles successfully. Thereby, we can already apply some simple type substitutions, for example by using the types declared in native_types.h we can avoid using platform-specific types such as L4_ThreadId_t.

By trying to compile core, we will see that there are still a few framework libraries missing, namely pager, lock, and raw_signal. For resolving the dependency on the lock library, we can use a simple spinlock implementation as an intermediate step. The implementation at src/base/lock/lock.cc looks like this:

#include <base/cancelable_lock.h>
#include <cpu/atomic.h>

using namespace Genode;

Cancelable_lock::Cancelable_lock(Cancelable_lock::State initial)
: _native_lock(UNLOCKED)
{
  if (initial == LOCKED)
    lock();
}

void Cancelable_lock::lock()
{
  while (!cmpxchg(&_native_lock, UNLOCKED, LOCKED));
}

void Cancelable_lock::unlock()
{
  _native_lock = UNLOCKED;
}

Note that this implementation does not fully implement the Cancelable_lock semantics but it is useful to get things started. The corresponding lib/mk/lock.mk can be based on another platform's variant:

SRC_CC   = lock.cc
vpath lock.cc $(REP_DIR)/src/base/lock

The OKL4-specific signal library can be taken almost unmodified from base-pistachio/. The pager library is a bit more complicated because it depends on ipc_pager.h and the corresponding part of the ipc library, which we have not yet implemented yet. However, based on the knowledge gained from the okl4_06_pager test, the adaption of another platform's implementation of src/base/ipc/pager.cc becomes straight-forward. For now, it actually suffices to leave the functions in pager.cc blank.

Once, we get the skeleton of core linked, we can work on the OKL4-specific code, starting with core's platform initialization in platform.cc. Configuring core's memory allocators:

region_alloc

This is the allocator containing the virtual address regions that are usable within core. The boot-info parser reports these regions via the callbacks init_mem and add_virt_mem.

ram_alloc

This is the allocator containing the available physical memory pages. It must be initialized with the physical-memory ranges provided via the init_mem and add_phys_mem callbacks.

core_mem_alloc

This is an allocator for available virtual address ranges within core. In contrast to region_alloc and ram_alloc, which both are operating at page-granularity, core_mem_alloc can be used to allocate arbitrarily-sized memory objects. The implementation uses region_alloc and ram_alloc as back ends. The core-local mapping of physical memory pages to core's virtual address space is done in a similar way as practiced in the okl4_06_pager test program.

For implementing the allocators, special care must be taken to make their interfaces thread safe as they may be used concurrently by different core threads. With the memory configuration in place, core will pass the first initialization steps and tries to initialize Core_env, which is a core-specific variant of the Genode environment. A part of Core_env is a server-activation, which is indeed a thread. Upon the creation of this thread, the main thread of core will stop executing until the new thread's startup protocol is finished. So we have to implement core's thread-creating facility, which is platform_thread.cc.

After core successfully creates its secondary threads (called activation and pager), and finishes the initialization of Core_env(), it starts executing the main function, which uses plain Genode APIs such as the env()->heap(). The heap however relies on a working env()->rm_session() and env()->ram_session(). To make env()->rm_session() functional, we need to provide a working implementation of the Core_rm_session::attach() function, which maps the content of a dataspace to core's local address space. Once, core starts using its Env, it will try to use env()->rm_session() to attach dataspaces into its local address space. Therefore, we need an implementation of a core version of the Rm_session interface, which we call Core_rm_session. This implementation uses the OKL4 kernel API to map the physical pages of a dataspace into core's local address space. With the working core environment, core will look for the binary of the init process. Init is supplied to core as a boot module via the elfweaver mechanism we just explored with the okl4_07_boot_info test. Within core, all boot modules are registered to an instance of the Rom_fs class. Hence, we will need to call OKL4's boot-info parser with the right callback functions supplied and put the collected information into Rom_fs. It is useful to take the other platforms as reference.

Starting init

To enable core to successfully load and start the init process, we first need to build the init binary. For compiling init we have to implement the still missing functionality of determining the parent capability at the startup code. The needed function is called parent_cap() and should be implemented in the _main function. For OKL4, the implementation looks exactly like the Pistachio version. On both kernels, the parent capability is supplied at predefined locations declared in the linker script. The corresponding symbols are called _parent_cap_thread_id and _parent_cap_local_name.

After successfully having started init, we can proceed with starting further instances of init as a children of the first instance. This can be achieved by the following config file:

<config>
  <parent-provides>
    <service name="ROM"/>
    <service name="RAM"/>
    <service name="CAP"/>
    <service name="PD"/>
    <service name="RM"/>
    <service name="CPU"/>
    <service name="LOG"/>
  </parent-provides>
  <default-route>
    <any-service> <parent/> </any-service>
  </default-route>
  <start name="init.1">
    <binary name="init"/>
    <resource name="RAM" quantum="5M"/>
  </start>
  <start name="init.2">
    <binary name="init"/>
    <resource name="RAM" quantum="5M"/>
    <config>
      <parent-provides>
        <service name="ROM"/>
        <service name="RAM"/>
        <service name="CAP"/>
        <service name="PD"/>
        <service name="RM"/>
        <service name="CPU"/>
        <service name="LOG"/>
      </parent-provides>
      <default-route>
        <any-service> <parent/> </any-service>
      </default-route>
      <start name="init.2.1">
        <binary name="init"/>
        <resource name="RAM" quantum="2M"/>
      </start>
      <start name="init.2.2">
        <binary name="init"/>
        <resource name="RAM" quantum="2M"/>
      </start>
    </config>
  </start>
</config>

To successfully execute the creation of this nested process tree, we need a correct implementation of unmap functionality within core. Furthermore, if starting multiple processes, we will soon run into the problem of starting too many threads in core. This is caused by the default implementation of Genode's signal API. Within core, each Rm_session_component within core is a signal transmitter, used for signalling address-space faults. With the default implementation, each signal transmitter employs one thread. Because OKL4's roottask is limited to 8 threads, the number of RM sessions becomes quite limited. Therefore, we disable signal support on OKL4 for now by the means of a dummy implementation of the signal interface. Later, we can create a OKL4-specific signal implementation, which will hopefully be able to utilize OKL4's asynchronous notification mechanism.

Hardware access and the Genode demo scenario

The default demo scenario of Genode requires hardware access performed by the following components:

  • The timer driver needs access to a hardware timer. On x86, the programmable interval timer (PIT) is available for this use case. However, for the first version of Genode on OKL4, we can use a simple dummy driver that ignores the argument of msleep and just returns.

  • The PS/2 driver and the timer driver rely on interrupts. We already exercised interrupt handling in okl4_08_timer_pit. So it is relatively straight-forward to implement the IRQ service in core. (taking the other platforms such as Pistachio as reference)

  • The VESA driver requires several hardware facilities, in particular access to the VGA registers via I/O ports, the frame buffer via memory-mapped I/O and other resources such as the PIC (at least some VESA BIOSes rely on the PIT to implement proper delays during the PLL initialization). However, with a working implementation of the I/O-port service and I/O-memory service in core, these requirements become satisfied.

If all the hardware-access services within core are in place, we should be able to start fb_drv, ps2_drv, nitpicker, launchpad. Furthermore starting and killing of an additional testnit process via the launchpad should work. However, we will observe that starting another instance of testnit after killing it will not work. In order to fully support restartable components, we have to implement thread destruction, and the cancel-blocking mechanism within core. The interesting bits about thread destruction are Platform_thread::unbind and Platform_pd::_destroy_pd. For implementing the cancel-blocking mechanism, we have to revisit core's Platform_thread::cancel_blocking, the IPC framework (src/base/ipc/ipc.cc) and the lock implementation (src/base/lock/lock.cc).

With this work done, we are able to run the full Genode demonstration scenario including the Scout tutorial browser, user-level device drivers for PS/2 input and video, and the dynamic creation and destruction of process trees.

Outlook

We consider the result of the porting work as described in this article as the first working version of Genode on OKL4. Of course, there are several areas for possible improvements, which we will address in a demand-driven way. The following list gives some hints:

  • Exploring OKL4's kernel mutex for Genode's lock implementation, paying special attention to the cancel-blocking semantics

  • Increasing the flexibility of the UTCB allocator in core. Right now, the UTCB area of each PD is equally sized, defined by the THREAD_BITS definition. In the future, we could support differently sized UTCB areas to tailor the number of threads per protection domain.

  • Checking the privileges of non-core tasks

  • Supporting RM faults and nested region-manager sessions

  • Replacing the dummy timer implementation with a proper PIT-based timer

  • Virtualizing the PIT in the VESA frame-buffer driver, otherwise the PIT-based timer service won't be usable because of both components needing access to the PIT. Fortunately, the VESA BIOS of Qemu does not access the PIT but we are aware that other BIOSes do.

  • Eventually optimize I/O port access. Right now, we perform an RPC call to core for each I/O port access, which is ok for the other platforms because I/O ports are rarely used (mostly for the PS/2 driver, but at a low rate). On OKL4 however, we provide the user-level time source via the timer driver that accesses the PIT via I/O ports. We could optimize these accesses by lazily mapping the I/O ports from core to the timer driver the first time, an RPC call to the I/O service is performed.