Design of the Genode OS Architecture: Interfaces and Mechanisms
The system is structured as a tree whose nodes are processes. A node that has sub-nodes is called the parent of these sub-nodes (its children). The parent creates children out of its own resources and defines their execution environment. Each process can announce services to its parent. The parent, in turn, can mediate such a service to its other children. When a child is created, its parent provides the initial contact to the outer world via the following interface:
  void exit(int exit_value);
  Session_capability session(String service_name, String args);
  void close(Session_capability session_cap);
  int announce(String service_name, Root_capability service_root_cap);
  int transfer_quota(Session_capability to_session_cap, String amount);
- exit is called by a child to request its own termination.
- session is called by a child to request a connection to the specified service as known by its parent, where service_name is the name of the desired service interface. Whether and how a session request is resolved or denied depends on the policy of the parent. The args parameter contains construction arguments for the session to be created. In particular, args contains a specification of the resources that the process is willing to donate to the server during the session lifetime.
- close is called by a child to inform its parent that the specified session is no longer needed. The parent should close the session and hand back donated resources to the child.
- announce is called by a child to register a locally implemented service at its parent. Hence, this child is a server.
- transfer_quota enables a child to extend its resource donation to the server that provides the specified session.
We provide a detailed description and motivation for the different functions in Sections Servers and Quota.
Servers
Each process may implement services and announce them via the announce function of the parent interface. When announcing a service, the server specifies a root capability for the implemented service. The interface of the root capability enables the parent to create, configure, and close sessions of the service:
  Session_capability session(String args);
  int transfer_quota(Session_capability to_session_cap, String amount);
  void close(Session_capability session_cap);
Figure 1 illustrates an announcement of a service. Initially, each child has a capability to its parent. After Child1 announces its service "Service", its parent knows the root capability of this service under the local name srv1_r and stores the root capability with the announced service name in its root_list. The root capability is intended to be used and kept by the parent only.
When a parent calls the session function of the root interface of a server child, the server creates a new client session and returns the corresponding client_session capability. This session capability provides the actual service-specific interface. The parent can use it directly or pass it to other processes, in particular to another child that requested the session. In Figure 2, Child2 initiates the creation of a "Service" session by a session call at its parent capability (1). The parent uses its root list to look up the root capability that matches the service name "Service" (2) and calls the session function at the server (3). Child1, being the server, creates a new session (session1) and returns the session capability as the result of the session call (4). The parent now knows the new session under the local name srv1_s1 (5) and passes the session capability as the return value of Child2's initial session call (6). The parent maintains a session_list, which stores the relation between children and their created sessions. Now, Child2 has a direct communication channel to session1 provided by the server (Child1) (7).
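The mediation steps (2) to (6) can be sketched in a few lines of C++. This is a minimal model under stated assumptions, not the actual implementation: capabilities are represented by plain strings, an empty string stands for a denied request, and the names Parent, Root, and session_count are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a capability: an opaque label. An empty label
// models a denied request.
using Session_capability = std::string;

// Root interface of a server, reduced to its session function.
struct Root {
    std::function<Session_capability(const std::string &args)> session;
};

class Parent {
    std::map<std::string, Root> root_list;  // service name -> root capability
    std::vector<std::pair<std::string, Session_capability>> session_list;

public:
    // Called by a server child to register a locally implemented service.
    void announce(const std::string &service_name, Root root) {
        assert(root.session && "root capability must be callable");
        root_list[service_name] = std::move(root);
    }

    // Called by a client child. The numbers refer to the steps in the text.
    Session_capability session(const std::string &child,
                               const std::string &service_name,
                               const std::string &args) {
        auto it = root_list.find(service_name);             // (2)
        if (it == root_list.end())
            return Session_capability();  // policy of this sketch: deny
        Session_capability cap = it->second.session(args);  // (3), (4)
        session_list.emplace_back(child, cap);              // (5)
        return cap;                                         // (6)
    }

    std::size_t session_count() const { return session_list.size(); }
};
```

A server would call announce once with a Root whose session function creates its sessions. Afterwards, every session call by another child is resolved through the parent's root list and recorded in its session list.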
The close function of the root interface instructs the server to destroy the specified session and to release all session-specific resources.
Even though the prior examples involved only one parent, the announce-request mechanism can be used recursively for tree structures of any depth. It thus allows for partitioning the system into subsystems that can cooperate with each other, while parents always remain in complete control over the communication and resource usage of their children (and their subsystems).
Figure 3 depicts a nested subsystem on the left. Child1 announces its service named "Service" at its parent that, in turn, announces a service named "Service" at the Grandparent. The service names do not need to be identical. A name is meaningful only to the immediate parent, and there may be a name remapping on each hierarchy level. Each parent can decide for itself whether or not to announce the services of its children to the outer world. The parent can announce Child1's service to the grandparent by creating a new root capability to a local service that forwards session-creation and closing requests to Child1. Both Parent and Grandparent keep their local root lists. In a second step, Parent2 initiates the creation of a session to the service by issuing a session request at the Grandparent (1). Grandparent uses its root list to look up the service-providing child (from Grandparent's local view) Parent1 (2). Parent1, in turn, does not implement the service itself but delegates the session request to Child1 by calling the session function of the actual "Service" root interface (3). The session capability, created by Child1 (4), can now be passed to Parent2 as the return value of the nested session calls (5, 6). Each involved node keeps local knowledge about the created session so that the session can later be closed in the same nested fashion.
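The nested announcement with name remapping can be illustrated as follows. The sketch assumes that capabilities are plain strings and that a root capability is just a callable. Node, forward_to_parent, and the remapped name "Renamed" are hypothetical and serve only to show how a forwarding root delegates to the real server.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

using Session_capability = std::string;  // hypothetical stand-in
using Root = std::function<Session_capability(const std::string &args)>;

struct Node {
    std::string name;
    Node *parent;
    std::map<std::string, Root> root_list;  // local service name -> root

    explicit Node(std::string name, Node *parent = nullptr)
        : name(std::move(name)), parent(parent) {}

    // Register a service announced by a child of this node.
    void announce(const std::string &service_name, Root root) {
        root_list[service_name] = std::move(root);
    }

    // Re-announce a locally known service at our own parent, possibly
    // under a different name. The forwarding root captures 'this', so
    // this node must outlive its parent's root list entry.
    void forward_to_parent(const std::string &local_name,
                           const std::string &outer_name) {
        assert(parent != nullptr);
        parent->announce(outer_name, [this, local_name](const std::string &args) {
            return root_list.at(local_name)(args);  // delegate to real server
        });
    }

    // Resolve a session request locally or delegate it towards the root.
    // An empty capability means the request was denied.
    Session_capability session(const std::string &service_name,
                               const std::string &args) {
        auto it = root_list.find(service_name);
        if (it != root_list.end())
            return it->second(args);
        if (parent != nullptr)
            return parent->session(service_name, args);
        return Session_capability();
    }
};
```

In this model, Parent1 forwards Child1's "Service" under another name to the Grandparent, and Parent2 reaches it only through that outer name. The original name stays local to Parent1.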
Quota
Each process that provides services to other processes consumes resources on behalf of its clients. Such a server requires memory to maintain session-specific state, processing time to perform the actual service function, and possibly further system resources (e.g., bus bandwidth), depending on client requests. To avoid denial-of-service problems, a server must not allocate such resources from its own budget but must let the client pay. Therefore, a mechanism for donating resource quotas from the client to the server is required. Both client and server may be arbitrary nodes in the process tree. In the following, we examine the trading of resource quotas within the recursive system structure using memory as an example.
When creating a child, the parent assigns a part of its own memory quota to the new child. During the lifetime of the child, the parent can further transfer quota back and forth between the child's account and its own. Because the parent creates its children out of its own resources, it has a natural interest in managing child quotas correctly. When a child requests a session to a service, it can bind a part of its quota to the new session by specifying a resource donation as an argument. When receiving a session request, the parent has to distinguish three cases, depending on where the corresponding server resides:
- Parent provides service: If the parent provides the requested service itself, it transfers the donated amount of memory quota from the requesting child's account to its own account to compensate for the session-specific memory allocation on behalf of its own child.
- Server is another child: If there exists a matching entry in the parent's root list, the requested service is provided by another child (or a node within the child subsystem). In this case, the parent transfers the donated memory quota from the requesting child to the service-providing child.
- Delegation to grandparent: The parent may decide to delegate the session request to its own parent because the requested service is provided by a lower node of the process tree. Thus, the parent will request a session on behalf of its child. The grandparent neither knows nor cares about the actual origin of the request and will simply decrease the memory quota of the parent. For this reason, the parent transfers the donated memory quota from the requesting child to its own account before calling the grandparent.
This algorithm works recursively. Once the server receives the session request, it checks whether the donated memory quota suffices for storing the session-specific data and, on success, creates the session. If the initial quota donation turns out to be insufficient during the lifetime of a session, the client may make further donations via the transfer_quota function of the parent interface, which works analogously.
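The bookkeeping for the three cases can be sketched as follows. This is an illustrative model, assuming that memory quota is a plain byte count per account. Account, Server_location, and donate are hypothetical names, and the actual call to the server or grandparent is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct Account {
    std::size_t quota = 0;  // memory quota in bytes
};

// Move 'amount' from one account to another. Fails without side effects
// if the source account does not hold enough quota.
bool transfer(Account &from, Account &to, std::size_t amount) {
    if (from.quota < amount)
        return false;
    const std::size_t total = from.quota + to.quota;
    from.quota -= amount;
    to.quota += amount;
    assert(from.quota + to.quota == total);  // quota is moved, never created
    return true;
}

enum class Server_location { PARENT, OTHER_CHILD, GRANDPARENT };

struct Parent {
    Account own;
    std::map<std::string, Account> children;  // child name -> account

    // Book the donation of 'client' depending on where the server resides.
    // 'server_child' is only used for OTHER_CHILD. Unknown child names
    // make std::map::at throw std::out_of_range.
    bool donate(const std::string &client, Server_location where,
                const std::string &server_child, std::size_t amount) {
        Account &from = children.at(client);
        switch (where) {
        case Server_location::PARENT:       // parent serves the session itself
        case Server_location::GRANDPARENT:  // parent will pay the grandparent
            return transfer(from, own, amount);
        case Server_location::OTHER_CHILD:
            return transfer(from, children.at(server_child), amount);
        }
        return false;
    }
};
```

The first and third cases book the donation identically because, from the grandparent's point of view, the parent is the client and pays out of its own account.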
If a child requests to close a session, the parent must distinguish the same three cases as above. Once the server receives the session-close request from its parent, it is responsible for releasing all resources that were used for this session. After the server releases the session-specific resources, the server's quota can be decreased to the prior state. However, an ill-behaving server may fail to release those resources, out of malice or because of a bug.
If the misbehaving service was provided by the parent itself, the parent has the full authority to not hand back the session quota to its child. If the misbehaving service was provided by the grandparent, the parent (and its whole subsystem) has to accept this. If, however, the service was provided by another child and that child refuses to release resources, decreasing its quota after closing the session will fail. It is up to the policy of the parent to handle such a failure either by punishing the server (e.g., killing it) or by granting more of its own quota. Generally, misbehavior is against the server's own interests, and each server would obey the parent's close request to avoid intervention.
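The case of a sibling server that fails to release its resources can be modelled as follows. The sketch rests on several assumptions that the text does not spell out: an account tracks both a quota limit and the amount in use, lowering a quota below the used amount fails, killing a server returns its whole quota to the parent, and "granting more of its own quota" means that the parent pays the refund itself. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Account {
    std::size_t quota;  // upper limit assigned to the process
    std::size_t used;   // amount currently allocated by the process

    explicit Account(std::size_t quota = 0, std::size_t used = 0)
        : quota(quota), used(used) {}

    // Lower the limit by 'amount'. Refuses if the result would fall below
    // the amount that is still in use.
    bool withdraw(std::size_t amount) {
        if (amount > quota || quota - amount < used)
            return false;
        quota -= amount;
        assert(used <= quota);
        return true;
    }
};

enum class Policy { KILL_SERVER, GRANT_OWN_QUOTA };
enum class Close_result { REFUNDED, SERVER_KILLED, PARENT_PAID, FAILED };

// Close a session whose server is another child of 'parent' and hand the
// donated quota back to the client.
Close_result close_session(Account &parent, Account &client, Account &server,
                           std::size_t donated, Policy policy) {
    if (server.withdraw(donated)) {  // server released the session resources
        client.quota += donated;
        return Close_result::REFUNDED;
    }
    if (policy == Policy::KILL_SERVER) {
        parent.quota += server.quota;  // destroying the server frees its quota
        server = Account();
    }
    // Under both policies the refund now comes out of the parent's account.
    if (!parent.withdraw(donated))
        return Close_result::FAILED;
    client.quota += donated;
    return policy == Policy::KILL_SERVER ? Close_result::SERVER_KILLED
                                         : Close_result::PARENT_PAID;
}
```

A well-behaved server takes the first branch, so the parent's own account is touched only when a server misbehaves.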
Successive policy management
To support a wide variety of security policies for access control, we require a way to bind properties and restrictions to sessions. For example, a file service may want to restrict the access to files according to an access-control policy that is specific to each client session. On session creation, the session call takes an args argument that can be used for that purpose. It is a list of tag-value pairs describing the session properties. By convention, the list is ordered by attribute priority, starting with the most important property. The server uses these args as construction arguments for the new session and enforces the security policy expressed by args accordingly. While the client defines its desired session-construction arguments, each node that is involved in the session creation can alter these arguments in any way and may add further properties. This effectively enables each parent to impose any desired restrictions on sessions created by its children. This concept works recursively and enables each node in the process hierarchy to control exactly the properties that it knows and cares about. As a side note, the specification of resource donations as described in Section Quota is performed with the same mechanism. A resource donation is a property of a session.
Figure 4 shows an example scenario. A user application requests the creation of a new session to the GUI server and specifies its wish for reading user input and using the string "Terminal" as window label (1). The parent of the user application is the user manager, which introduces user identities into the system and wants to ensure that each displayed window gets tagged with the user and the executed program. Therefore, it overrides the label attribute with more accurate information (2). Note that the modified argument is now the head of the argument list. The parent of the user manager, in turn, implements further policies. In the example, Init's policy prohibits the user-manager subtree from reading input (for example, to disable access to the system beyond official working hours) by redefining the input attribute and leaving all other attributes unchanged (3). The actual GUI server observes the final result of the successively changed session-construction arguments (4) and is responsible for enforcing the specified policy for the lifetime of the session. Once a session has been established, its properties are fixed and cannot be changed.
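The successive rewriting of session arguments can be sketched as a small list manipulation. The sketch assumes that args is an ordered list of tag-value string pairs and that overriding an attribute removes its previous definition and places the new one at the head of the list. The text only states that the modified argument becomes the head, so the removal is an assumption. The helper names and the concrete label value in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Tag-value pairs, ordered by priority with the most important one first.
using Arg = std::pair<std::string, std::string>;
using Args = std::vector<Arg>;

// Impose a policy on session arguments: drop any existing definition of
// 'tag' and put the new definition at the head of the list.
Args override_arg(Args args, const std::string &tag, const std::string &value) {
    assert(!tag.empty());
    args.erase(std::remove_if(args.begin(), args.end(),
                              [&tag](const Arg &a) { return a.first == tag; }),
               args.end());
    args.insert(args.begin(), std::make_pair(tag, value));
    return args;
}

// Return the value of the first (highest-priority) definition of 'tag',
// or an empty string if the tag is not present.
std::string lookup(const Args &args, const std::string &tag) {
    for (const Arg &a : args)
        if (a.first == tag)
            return a.second;
    return std::string();
}
```

In the scenario above, the user manager would call override_arg(args, "label", ...) with its more accurate label, and Init would then call override_arg(args, "input", "no"). The GUI server finally evaluates the list with lookup and sees only the values imposed by the nodes closest to it.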