String processing

The utilities in util/string.h comprise functions for parsing and processing text strings.

Span

The parsing utilities operate on spans of bytes represented by the Span type, which combines a pointer to the first byte with the number of bytes belonging to the span. The type is suitable for propagating references to parts of a text buffer along a chain of callers, where the view on the buffer can be narrowed at each level.

To aid the use of Span for parsing, the type is equipped with the following utilities. Span::cut and Span::split allow for the easy chopping of strings into substrings at specific characters. Whereas cut splits a span into two parts separated by a given character, split cuts the span into all separated pieces, e.g., for iterating over lines separated by \n. The Span::trimmed method accesses the span without leading and trailing spaces. Finally, the Span::equals, starts_with, and ends_with methods ease string-matching code. To conveniently combine these utilities with String<N> objects, the String::with_span method allows for safely accessing the string's content as a Span.

Genode::Span
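The chopping operations described above can be illustrated with a minimal, self-contained sketch. It does not use the Genode API: std::string_view stands in for Span, and the cut, split, trimmed, and keys functions in the sketch namespace are hypothetical names whose signatures are assumptions made for this example. The actual Span methods may differ in form.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-ins for illustration only, NOT Genode's Span API.
namespace sketch {

using View = std::string_view;  // pointer plus byte count, like a span

// Cut 'v' at the first occurrence of 'sep' and pass both parts to 'fn'.
// If 'sep' is absent, the head is the whole view and the tail is empty.
template <typename FN>
void cut(View v, char sep, FN const &fn)
{
    std::size_t const pos = v.find(sep);
    if (pos == View::npos) { fn(v, View{}); return; }
    fn(v.substr(0, pos), v.substr(pos + 1));
}

// Call 'fn' for each piece of 'v' separated by 'sep'.
template <typename FN>
void split(View v, char sep, FN const &fn)
{
    for (;;) {
        std::size_t const pos = v.find(sep);
        if (pos == View::npos) { fn(v); return; }
        fn(v.substr(0, pos));
        v.remove_prefix(pos + 1);
    }
}

// Return 'v' without leading and trailing space characters.
inline View trimmed(View v)
{
    while (!v.empty() && v.front() == ' ') v.remove_prefix(1);
    while (!v.empty() && v.back()  == ' ') v.remove_suffix(1);
    return v;
}

// Collect the trimmed part before the first ':' of each line,
// skipping lines where that part is empty.
inline std::vector<View> keys(View text)
{
    std::vector<View> result;
    split(text, '\n', [&] (View line) {
        cut(line, ':', [&] (View key, View) {
            View const k = trimmed(key);
            if (!k.empty()) result.push_back(k); }); });
    return result;
}

} // namespace sketch
```

Each level only narrows the view it was given. No bytes are copied, so the returned views remain valid only as long as the underlying text buffer exists.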

Byte_range_ptr

The Byte_range_ptr is a span of mutable bytes. The type is intended as an argument for functions that write into a byte buffer, tying the start pointer and the buffer length together into a single argument. The as_output method provides a convenient way for rendering printable objects into a byte buffer.

Genode::Byte_range_ptr

Parse functions

The framework addresses the common problem of obtaining an internal data representation from text by the convention of parse functions. By providing a parse method, a class can accompany the logic for importing text into an object.

 size_t parse(Span const &s);

The return value denotes the number of parsed bytes. It allows the caller to infer the presence of trailing erroneous bytes that could not be parsed completely.
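As a sketch of this convention, the following hypothetical Decimal type implements a parse method that consumes leading decimal digits. To keep the example self-contained, std::string_view stands in for Span; the type name and the parsed_completely helper are assumptions made for illustration and are not part of the framework.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical example type; std::string_view stands in for Span.
struct Decimal
{
    unsigned long value = 0;

    // Import leading decimal digits from 's'. Returns the number of
    // parsed bytes, which is 0 if 's' does not start with a digit.
    // Overflow is not handled in this sketch.
    std::size_t parse(std::string_view const &s)
    {
        unsigned long v = 0;
        std::size_t   n = 0;
        while (n < s.size() && s[n] >= '0' && s[n] <= '9') {
            v = v*10 + static_cast<unsigned long>(s[n] - '0');
            n++;
        }
        if (n > 0) value = v;
        return n;
    }
};

// True if the whole text was consumed, i.e., no trailing unparsed bytes.
inline bool parsed_completely(Decimal &d, std::string_view text)
{
    return !text.empty() && d.parse(text) == text.size();
}
```

Comparing the return value with the size of the input is how a caller can detect trailing bytes that were not consumed.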

Basic string operations

There exists a small set of string-manipulation operations as global functions in the Genode namespace.

repos/base/include/util/string.h

To cover the common case of embedding a string buffer as a member variable in a class, there exists the String class template, which alleviates the need for C-style arrays in such situations.

The String constructor takes any number of arguments, which will appear concatenated in the constructed String. Each argument must be printable as explained in Section Diagnostic output.

Genode::String

There exist a number of printable helper classes that cover typical use cases for producing formatted text output.

Num_bytes

wraps an integer value and produces an output suffixed with K, M, or G whenever the value is a multiple of a kilobyte, megabyte, or gigabyte.

Cstring

wraps a plain C character array to make it printable. There exist two constructors. The constructor with one argument expects a null-terminated character array. The other constructor takes the number of to-be-printed characters as a second argument.

Hex

wraps an integer value and produces hexadecimal output.

Char

produces a character corresponding to the ASCII value of the wrapped integer argument.
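The suffix rule of Num_bytes can be sketched as a plain function. The function name num_bytes_text is hypothetical, and the sketch assumes binary units (1 K = 1024 bytes) and that the largest matching unit is chosen. It mimics the described behavior and is not the framework's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper mimicking the described behavior, not Genode code.
// Assumption: K, M, and G denote powers of 1024.
inline std::string num_bytes_text(std::uint64_t n)
{
    constexpr std::uint64_t KB = 1024, MB = KB*1024, GB = MB*1024;

    if (n == 0)      return "0";
    if (n % GB == 0) return std::to_string(n/GB) + "G";
    if (n % MB == 0) return std::to_string(n/MB) + "M";
    if (n % KB == 0) return std::to_string(n/KB) + "K";

    return std::to_string(n);  // not a multiple of any unit
}
```

Because a suffix is used only for exact multiples, the printed text never loses precision: 1536 stays 1536 instead of being rounded to a fractional number of kilobytes.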

To improve safety in situations that require raw byte-wise access of memory, the two utilities Byte_range_ptr and Const_byte_range_ptr hold a pointer together with a size limit in bytes. They should be used instead of traditional C-style pairs of pointer and size arguments to equip each pointer with its legitimate range of access. Note that the utilities are meant for ephemeral arguments only. They are deliberately not copyable to prevent the accidental storing of the embedded pointer values.
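The design idea of a non-copyable pointer-and-size pair can be sketched as follows. Range_ptr and fill are hypothetical names for this example and do not reflect the actual declaration of Byte_range_ptr. The point is merely that deleting the copy operations prevents a callee from storing the range beyond the call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in, not the Genode type.
struct Range_ptr
{
    char * const      start;
    std::size_t const num_bytes;

    Range_ptr(char *start, std::size_t num_bytes)
    : start(start), num_bytes(num_bytes) { }

    // Deliberately not copyable: usable as ephemeral argument only
    Range_ptr(Range_ptr const &)            = delete;
    Range_ptr &operator=(Range_ptr const &) = delete;
};

static_assert(!std::is_copy_constructible<Range_ptr>::value,
              "range must not be copyable");

// Copy 'text' into 'dst', truncated to the legitimate range.
// Returns the number of bytes written.
inline std::size_t fill(Range_ptr const &dst, char const *text)
{
    std::size_t const len = std::strlen(text);
    std::size_t const n   = len < dst.num_bytes ? len : dst.num_bytes;
    std::memcpy(dst.start, text, n);
    return n;
}
```

A caller constructs the range in place, for example fill(Range_ptr(buf, sizeof(buf)), "hello"), so the pointer never travels without its size limit.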

Diagnostic output

To enable components to produce diagnostic output like errors, warnings, and log messages, Genode offers a simple Output interface for sequentially writing single characters or character sequences.

Genode::Output

Functions for generating output for different types are named print and take an Output & as first argument. The second argument is a const & to the value to print. Overloads of the print function for commonly used basic types are provided. Furthermore, there is a function template that is used if none of the type-specific overloads match. This function template expects the argument to be an object with a print method. In contrast to a plain print function overload, such a method is able to incorporate private state into the output.

repos/base/include/base/output.h
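The interplay of print overloads and the fallback function template can be sketched in a self-contained form. All declarations below live in a hypothetical sketch namespace and only approximate the pattern described above; the Output stand-in, the String_output back end, the Point class, and the to_text helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

namespace sketch {  // hypothetical stand-ins, not the Genode declarations

// Minimal output interface: a sink for single characters
struct Output
{
    virtual void out_char(char c) = 0;
    virtual ~Output() { }
};

// Type-specific overloads for basic types
inline void print(Output &out, char const *s)
{
    for (; *s; s++) out.out_char(*s);
}

inline void print(Output &out, unsigned long const &v)
{
    if (v >= 10) print(out, v/10);  // emit higher digits first
    out.out_char(static_cast<char>('0' + v % 10));
}

// Fallback used if no overload matches: expects a 'print' method
template <typename T>
void print(Output &out, T const &obj) { obj.print(out); }

class Point
{
    unsigned long _x, _y;  // private state, reachable by the method

  public:

    Point(unsigned long x, unsigned long y) : _x(x), _y(y) { }

    void print(Output &out) const
    {
        sketch::print(out, "(");
        sketch::print(out, _x);
        sketch::print(out, ",");
        sketch::print(out, _y);
        sketch::print(out, ")");
    }
};

// Output back end that collects all characters in a std::string
struct String_output : Output
{
    std::string text;
    void out_char(char c) override { text += c; }
};

template <typename T>
std::string to_text(T const &v)
{
    String_output out;
    print(out, v);
    return out.text;
}

} // namespace sketch
```

In this sketch, to_text(Point(3, 14)) reaches Point::print through the fallback template, whereas an unsigned long argument is handled by the type-specific overload.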

The component's execution environment provides an implementation of the Output interface that targets a LOG session. This output back end is offered to the component in the form of the log, warning, error, and trace functions that accept an arbitrary number of arguments that are printed in a concatenated fashion. Each message is implicitly finalized with a newline character.

repos/base/include/base/log.h

Obtaining backtraces

As a debugging aid, it is sometimes insightful to obtain call graphs of executed code. Such backtraces can be generated via the utilities provided by os/backtrace.h. As a precondition for getting useful output, make sure to compile your executable binary with frame pointers. By adding the following line to the etc/build.conf file of the build directory, one can instruct the build system to produce binaries in the needed form.

 CC_OPT += -fno-omit-frame-pointer

The general mechanism for generating a backtrace has the form of the printable Backtrace class. An object of this type can be passed to any of the log, warning, error, or trace functions to output the backtrace of the point of call. As a convenient shortcut for the common case of printing a backtrace to the log, one can call the backtrace() function instead. The output looks like the following example.

 [init -> test-log] backtrace "ep"
 [init -> test-log]   401ff89c   10014f4
 [init -> test-log]   401ff90c   1001637
 [init -> test-log]   401ff94c   10006e2
 [init -> test-log]   401ffaec  5008aa9f
 [init -> test-log]   401ffc6c  50048dbb
 [init -> test-log]   401ffc8c  5004be41
 [init -> test-log]   401ffcdc  5003a04d
 [init -> test-log]   401ffe6c  50065218
 [init -> test-log]   401fff7c  50079d54

The first line contains the thread name of the caller, which is followed by one line per stack frame. The first number is the stack address whereas the second number is the return address of the stack frame, both given in hexadecimal format. The latter can be correlated to source code by inspecting the disassembled binary using the objdump utility. In practice, however, one may prefer the convenience of the tool/backtrace utility to streamline this procedure.

  1. Execute the backtrace tool with the debug version of your executable as argument. For example, after having observed a backtrace printed by the test-log program, one may issue:

     build/x86_64$ ../../tool/backtrace debug/test-log
    
  2. Once started, the tool waits for you to paste the logged backtrace into the terminal. For each stack frame, it then prints the corresponding function name and source-code location.
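For readers who want to post-process logged backtraces themselves, the line format described above (log label, stack address, return address) can be picked apart with a few lines of code. The following sketch is independent of the tool/backtrace utility and says nothing about how that tool works internally. The Frame type and the parse_hex and parse_frame functions are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

struct Frame { std::uint64_t stack_addr, return_addr; };

// Convert a token consisting solely of hex digits. Returns false otherwise.
inline bool parse_hex(std::string const &token, std::uint64_t &out)
{
    if (token.empty() || token.size() > 16) return false;

    std::uint64_t v = 0;
    for (char c : token) {
        unsigned digit = 0;
        if      (c >= '0' && c <= '9') digit = static_cast<unsigned>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned>(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned>(c - 'A') + 10;
        else return false;
        v = (v << 4) | digit;
    }
    out = v;
    return true;
}

// Parse one logged line such as "[init -> test-log]   401ff89c   10014f4".
// Yields no value for lines that do not consist of exactly two hexadecimal
// numbers after the label, e.g., the header line with the thread name.
inline std::optional<Frame> parse_frame(std::string const &line)
{
    std::size_t const label_end = line.find(']');
    std::istringstream in(label_end == std::string::npos
                          ? line : line.substr(label_end + 1));
    std::string a, b, extra;
    Frame f { };
    if ((in >> a >> b) && !(in >> extra)
     && parse_hex(a, f.stack_addr) && parse_hex(b, f.return_addr))
        return f;

    return std::nullopt;
}
```

The extracted return addresses are the values to look up in the disassembled debug binary.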

Unicode handling

The string-handling utilities described in Section Basic string operations operate on ASCII-encoded character strings where each character is encoded as one byte. It goes without saying that ASCII is unsuitable for user-facing components that are ultimately expected to support the display of international characters. The Utf8_ptr utility accommodates such components with an easy way to extract a sequence of Unicode codepoints from a UTF-8-encoded string.

Genode::Utf8_ptr
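To illustrate what extracting codepoints from UTF-8 involves, the following self-contained sketch decodes a byte string into a vector of codepoints. It does not use or reproduce the Utf8_ptr interface; the codepoints function is a hypothetical helper. For brevity, it stops at the first malformed sequence and does not reject overlong encodings or surrogate values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper, not the Utf8_ptr interface.
// Decodes until the end of 's' or the first malformed sequence.
inline std::vector<std::uint32_t> codepoints(std::string_view s)
{
    std::vector<std::uint32_t> result;

    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char const b = static_cast<unsigned char>(s[i]);

        // The lead byte determines the sequence length and the first bits
        std::size_t   len = 0;
        std::uint32_t cp  = 0;
        if      (b < 0x80)           { len = 1; cp = b; }
        else if ((b & 0xe0) == 0xc0) { len = 2; cp = b & 0x1fu; }
        else if ((b & 0xf0) == 0xe0) { len = 3; cp = b & 0x0fu; }
        else if ((b & 0xf8) == 0xf0) { len = 4; cp = b & 0x07u; }
        else break;                     // invalid lead byte

        if (i + len > s.size()) break;  // truncated sequence

        // Each continuation byte (10xxxxxx) contributes six bits
        bool ok = true;
        for (std::size_t k = 1; k < len; k++) {
            unsigned char const c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xc0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3fu);
        }
        if (!ok) break;

        result.push_back(cp);
        i += len;
    }
    return result;
}
```

The number of decoded codepoints generally differs from the number of bytes, which is why byte-wise string utilities are insufficient for text with international characters.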