10.18 Garbage collection for dummies

Note: Reading this section is strongly recommended before attempting complex C++ programming.

Within LilyPond, interaction with Guile is ubiquitous. LilyPond is written in C++ and Guile Scheme. Even in C++, most of the code uses Guile APIs to interface with the outside Scheme world, both with user and internal Scheme code.

Scheme is a garbage-collected language. This means that once in a while, a so-called garbage collector scans the memory for values that are no longer being used, and reclaims them. This process ensures that the memory is given back to the computer and made available for other uses. The garbage collector implementation used in Guile 2 and later is the Boehm-Demers-Weiser garbage collector (BDWGC).

C++, on the other hand, usually frees values at determined points of time (although most of the time they remain implicit, through the use of the famous “RAII” or “scope-bound resource management” technique). It has no direct support for garbage collection. This can make memory management of Scheme values in C++ a challenge (or a headache). Whenever you are using a value whose memory is managed by Guile, you must keep an eye on its lifetime.

To be more precise, the garbage collector works in a mark phase and a sweep phase. During marking, the collector scans values that the program is currently using, then asks these values for containing references to other values, and continues following references until all reachable objects have been found. Objects that are unreachable can logically no longer be used in the program, so they are freed in the sweep phase.

In Schemeland, the interpreter takes care of marking values for you. For instance, if you store a list in a variable, then during garbage collection, this list is automatically marked, and this causes all elements of the list to be marked in turn, which ensures they remain alive. In Cppland, you need to be very careful to keep values allocated on Guile’s heap as visible to the garbage collector if they cannot be reached from the Scheme side.

Understanding which values are under garbage-collected management

To begin with, which values are allocated on the Guile heap? The basic Guile API type is the SCM type, which represents a value boxed for usage in Scheme. The SCM type is pointer-sized piece of data. It is either a pointer to Scheme data structures (e.g. pair, double pair, etc.) – in this case, the pointer is 64-bit aligned and has its lower bits set to 0 –, or it is an immediate value (short integer, boolean, '(), etc.) – in which case the lower order bits are non-zero. Smobs, vectors, strings and many other Scheme data structures are represented as pairs, where the car holds a tag value (non-aligned, lower order bits set) and the cdr holds the pointer to data. From the scheme side, the fact that these types are represented using pairs is invisible.

Thus, for immediate SCM values, all the value is contained in the SCM itself. There is no concept of freeing these values, as they are never heap-allocated: they just keep being copied around, and dropped by normal C++ lifetime mechanisms when done (such as dropping local variables of a function when it returns). On the other hand, all other values point to memory allocated on the Guile heap. It is the lifetime of this memory that you need to care about.

LilyPond adds its own object types to Guile as well. They as called “smobs”, which depending on sources means “Scheme objects” or “small objects”. Smobs come in two flavors:

“Simple smobs” are objects that can be passed around by copy without changing the meaning. Their classes derive from Simple_smob. Pitch and Duration are good examples. The usual way to create them is just like a normal C++ object (e.g., Smob_type variable (constructor parameters);). When created in this way, simple smobs are allocated on the stack like any other C++ automatic variable, and dropped in the same way too. When you need to send a simple smob to Schemeland, you should call the member function smobbed_copy (). This calls the smob’s copy constructor to make a copy under garbage collection control, packed in an SCM value.

“Complex smobs” are objects with an identity, such as Music, Context and Grob. Their classes derive from Smob. They are always created via the C++ new operator. After allocating, their memory is put under the control of the garbage collector. A complex smob has a field containing its SCM identity, which points back to itself. You can access this field using the member function self_scm ().

The function to convert a SCM value back into the C++ smob type is unsmob<Smob_type *> (value) (which returns a null pointer if the SCM was not a value of the smob type in question). Because of the dual nature of simple smobs, you need to be mindful that if Smob_type derives from Simple_smob, the memory referred to by the result of unsmob<Smob_type> (if non-null) may either be on the stack or on the Guile heap, even though most of the time it will be on the Guile heap. On the other hand, for a complex smob, it will systematically be on the Guile heap.

How values are protected

When the garbage collector starts a collection, it first scans all memory being used by the program at the current point of time. This is called the root set. For Scheme, it includes all global variables of all modules and local variables of the function being executed. C++ adds everything that is on the stack and in registers (FIXME: investigate global variables). The dependencies of these values are then marked, etc.

Marking roots

The marking of the C++ function stack is very simple: scan the stack and treat every value as a possible pointer. This principle is called “conservative garbage collection”, and has a few consequences. One is that there may be some false positives, if random values on the stack happen to look the same as pointers to memory in the Guile heap. These values will be held longer than necessary, which is harmless.

Another, much more nasty consequence is that values are only kept alive while they have an SCM presence on the stack. Here is an example of what not to do:

Complex_smob_type *
func ()
{
  Complex_smob_type *object = new Complex_smob_type ();
  object->unprotect ();
  return object;
}

When the caller of this function receives the object pointer, there is no reason for the object’s SCM identity (what would be returned by its self_scm () method) to be present on the stack or in registers. Only the pointer to the C++ object is. This does not work to protect the object from garbage collection. The object could be freed if a GC pass occurs. The fix is to unprotect later if possible, at a point where the object’s self_scm () is placed in a long-lived reachable Scheme data structure. Alternatively, if this is impractical, return an SCM to keep the object protected. The unprotect () method actually returns the SCM for convenience.

SCM
func ()
{
  Complex_smob_type *object = new Complex_smob_type ();
  return object->unprotect ();
}

A different, even nastier trap can be illustrated with this example:

LY_DEFINE (ly_func, "ly:func",
           1, 0, 0, (SCM param),
           R"(
Doc
           )")
{
  Smob_type *object = unsmob<Smob_type> (param);
  // do some stuff here, including
  scm_cons (a, b)
  // ...
  return to_scm (object->some_field_);
}

At first glance, this looks fine. The SCM value param should remain on the stack until the end of the function, keeping the smob protected. This is not always true, however. If the compiler does a clever optimization, it might reuse the memory of the param variable for something else. If this happens, the object is unprotected while the memory of the cons cell is being allocated, which could cause the smob to be collected. The access object->some_field_ is then use-after-free.

The solution to this is to use scm_remember_upto_here, which allows to forcefully keep the object alive:

LY_DEFINE (ly_func, "ly:func",
           1, 0, 0, (SCM param),
           R"(
Doc
           )")
{
  Smob_type *object = unsmob<Smob_type> (param);
  // do some stuff here, including
  scm_cons (a, b)
  // ...
  SCM field = to_scm (object->some_field_);
  scm_remember_upto_here (param);
  return field;
}

GC marking for smobs

Guile automatically marks the elements contained in compound values of the types it provides, like lists and vectors. LilyPond’s smobs must do the same in order to keep elements they refer to alive while they are themselves alive. This is done by implementing the member function SCM mark_smob () const. This function must call scm_gc_mark on every Scheme value that needs to be kept alive with the object. It can return an SCM value, which is marked in the same way. (This dates back to Guile 1, which used the C++ function stack to mark objects. It was necessary to keep the stack depth constant when marking objects such as lists, or stack overflows would have easily ensued. It is no longer very relevant in Guile 2.)

For many smob types, mark_smob needs to add marking to the implementation of the superclass. This is usually done using a derived_mark method. This is the case for translators, for example. The child class should thus just implement derived_mark and not override mark_smob.

For simple smobs allocated as automatic variables, i.e., outside of Guile’s control, mark_smob is not called during garbage collection. In this case, the only marking that the object receives is conservative scanning of the stack. This has the strong implication that a simple smob must contain all SCM values it refers to in its memory image on the stack. Anything that needs more complex marking behavior should be a complex smob. For example, it’s not OK for a simple smob to contain an std::vector<SCM>. On the other hand, that would be OK for a complex smob as long as its mark_smob function iterates over the vector to mark each element. The simplest solution is storing a Guile vector, of SCM type, which is OK even in simple smobs because the memory image on the stack is an SCM vector value, which during marking causes the marking of all vector elements, unlike an std::vector<SCM>.

Initial protection for complex smobs

When you create a complex smob, it receives an initial GC protection, which should be removed with its unprotect () method once the complex smob enters an area where it is protected by other means.

There is no such protection for a smobbed_copy () of a simple smob because those tend to be more short-lived and are often just returned to Scheme after being created.

TODO: expand on smob constructors, especially the need for Preinit classes. See lily/include/smobs.hh.

TODO: explain the quirks of finalization (non-)ordering. See commit 6555b3841a.


LilyPond Contributor’s Guide v2.25.21 (development-branch).