Assignment 9: Garter:   Garbage Collection
1 Preparing for future assignments
2 Garbage-collection
2.1 Setting up GC
2.2 Finding the root set:   walking the stack
2.3 Copying values:   the core of the algorithm (depth-first version)
2.4 Copying values:   breadth-first version
2.5 Running with multiple heap sizes
2.6 Testing
2.7 No builtin functions
3 List of Deliverables
4 Grading Standards
5 Submission
8.12

Assignment 9: Garter: Garbage Collection🔗

Due: Fri 03/29 Sat 03/30 at 8:59pm

git clone

In this assignment you’ll implement a GARbage-collecTEd Runtime.

1 Preparing for future assignments🔗

Through the prior assignment, we’ve used naive_stack_allocation to determine where to store each variable on the stack. That function has had the signature

naive_stack_allocation : tag aprogram -> (tag aprogram * arg name_envt)

(where name_envt is a name-to-whatever environment; we just renamed it for clarity and to contrast with another possibility, below) and the returned environment contained an arg for every unique name in the program being compiled. In Diamondback and Egg-eater, this worked fine, because we had already alpha-renamed all names in the program to make them unique. However in Fer-de-lance, this environment was overly simplistic, since a given name could appear at (at least) two different locations: as a local variable in the function where it was defined, and as a closed-over variable in any closures that use that name. For each function, though, a name should only be bound to one unique location. Accordingly, as a preparatory step before the rest of this assignment, refactor this function to have the signature

naive_stack_allocation : tag aprogram -> (tag aprogram * arg name_envt name_envt)

Instead of one global environment containing all the names, create a nested collection of environments that map closure names to their internal environments. When you compile a function, you should look up the appropriate environment within this nested collection and use that.

Note: the environment you use to compile the body of a closure is not the same environment as the one you use to fill in the slots of the closure itself — why not? Make sure you use the correct two environments to collectively compile a closure.

Note: You need to determine the appropriate name to use for each closure, to use as the key in the nested collection of environments — depending on the details of how ANF works, not every closure has a name. You can resolve this either by tweaking your ANF translation, or by producing the following signature instead:

naive_stack_allocation : tag aprogram -> (tag aprogram * arg name_envt tag_envt)

Here, name_envt is the same name-to-whatever mapping as above, while tag_envt is a mapping from tags to whatever you want, where tags come from the tagged AST. This approach also has risks, since it requires that you never re-tag the AST once you’ve produced this environment.

You can choose which of these two approaches to use, and you’ll continue with that signature in the next assignment. In a comment in your code, explain which signature you chose, and explain why you prefered that signature choice.

2 Garbage-collection🔗

2.1 Setting up GC🔗

Now that you have mutation, garbage collection becomes interesting: you have the potential for cycles and for unreachable data, i.e. garbage to be collected. You therefore need to implement garbage collection. The garbage collector should be run on every single allocation attempt: in other words, instead of only manually bumping R15 every time you want to allocate some memory, you will first call try_gc, which will collect garbage if necessary so that you can do the allocation.1You may optimize this, by using the HEAP_END variable to check whether a garbage-collection is needed and skipping it when possible. But this optimization is not mandatory. It will abort your program if garbage collection could not allocate a new semispace heap, so try_gc should never return NULL.

/*
  Try to reserve the desired number of bytes of memory, and free garbage if
  needed.  Fail (and exit the program) if there is insufficient memory.  Does
  not actually allocate the desired number of bytes of memory; the caller
  will do that.

  Arguments:

    uint64_t* alloc_ptr - the current top of the heap (which we store in R15), where
                          the next allocation should occur, if possible
    int bytes_needed - the number of bytes of memory we want to allocate
                       (including padding)
    uint64_t* cur_frame - the base pointer of the topmost stack frame of our code
                          (i.e., RBP)
    uint64_t* cur_stack_top - the stack pointer of the topmost stack frame of our
                              code (i.e., RSP)

  Returns:
    The new top of the heap (i.e. the new value of R15) after garbage collection.
    Does not actually allocate bytes_needed space.

  Side effect:
    Also updates HEAP and HEAP_END to point to the new location of the heap
*/
uint64_t* try_gc(uint64_t* alloc_ptr, int bytes_needed,
                 uint64_t* cur_frame, uint64_t* cur_stack_top);

The provided C code includes a dummy implementation of a garbage collector: it does not collect anything, but does check whether there is sufficient space for the desired allocation. The provided ML code includes a reserve helper function for you to call try_gc correctly.

Note: the provided C code does not “split the heap in half”, as we discussed in class, but rather allocates a new heap area and triggers your gc function on the extant and the newly-alocated heap areas, then frees the old heap once gc is complete. This makes it marginally more likely for your code to segfault if accessing memory out of bounds (rather than just accidentally accessing the other half of the already-allocated heap), so this tweak mostly should help you find errors in your code more quickly. It does not affect the overall “two equal-sized heaps” concept underlying the algorithm.

2.2 Finding the root set: walking the stack🔗

You will need to be able to walk the Garter stack frames to find any pointers in those frames. We will assume that the Garter stack is contiguous, and that there are no C frames in there (as might happen if Garter code called C code which called back into Garter code). We therefore need to mark the beginning and end of the Garter stack. To do so, we’ll employ a trick. In main.c, we will declare a global variable uint64_t* STACK_BOTTOM. As part of the prelude of our_code_starts_here, once we’ve set RBP correctly, we will store that value into STACK_BOTTOM, and thereby inform our runtime about exactly where the Garter stack will end, by calling a runtime function set_stack_bottom to do the assignment in C for us.

We have provided you with some skeleton (and untested) code in gc that will walk the Garter stack frames, from the provided stack-top down to STACK_BOTTOM. Along the way, it will try to copy any heap-allocated values, if needed.

2.3 Copying values: the core of the algorithm (depth-first version)🔗

At its heart, Cheney’s semispace collector works with two values: the address of a value to be copied if needed, and the address where to copy values if needed. Note: the description of Cheney’s algorithm online is a breadth-first traversal of the heap, whereas the description in this section is depth-first. The breadth-first approach is presented next. Implement either of them.

Let’s call the value to be copied garter_val, and its address garter_val_addr. Call the destination address heap_top.

Be careful to keep straight the differences between garter_val_addr, which is where the Garter value is currently stored and which must be updated; garter_val, which may be a primitive value or a tagged heap-allocated thing; and heap_thing, which is the data pointed-to by garter_val and which may be a tuple, a closure or a forwarding pointer.

I recommend running your program with very small heap sizes (20 words or so), and calling smart_print_heap after every call to copy_if_needed, to see how your heap is changing over the course of the algorithm.

2.4 Copying values: breadth-first version🔗

(Note: the signature of copy_if_needed and the implementation of gc will need to change to support breadth-first traversal.)

Instead of calling copy_if_needed recursively, we separate the two steps. First, we walk the stack and copy each value if needed, without recurring. That is, we copy the contents of each heap_thing, update the value at the old garter_val location to be a forwarding pointer, and return the newly allocated location, which we use to update the pointer in garter_val_addr. Once we’ve done this, our roots will all point into the to-space, and the to-space will contain heap-allocated values that all point into the from-space.

Next, we loop over the to-space, one heap-thing at a time, until we run out of values in the to-space. (This is a tricky iteration, since it modifies the number of values in the to-space as the iteration proceeds. Nevertheless it must terminate, when implemented correctly, since every step moves one heap-thing into the to-space, and there are finite number of heap-things to move.) For each value within the heap-thing, copy it to the to-space if needed. If the value points to a forwarding pointer, just update the value to point to the forwarded value directly.

The advantage of the breadth-first traversal is that it is non-recursive and uses no stack space, and you can easily inspect the state of the heap after each iteration of the loop. The disadvantage is that it does not keep linked items near each other in contiguous memory the way the depth-first traversal does.

2.5 Running with multiple heap sizes🔗

We’ve extended the command line for main.c to accept a heap size (as an integer in words) as its first command-line argument. If none is provided, a default heap size of 100K words is allocated.

2.6 Testing🔗

This works mostly as before, except that there are a few additional forms for checking things relative to the garbage collector. As mentioned, the main program is parameterized over an integer argument that allows you to select the size of the heap in terms of (8-byte) words. This is exposed through the testing library as well, so you can write:

tgc "gctest" 10 "(1, 2)" "(1, 2)"

and this will run the test with a heap size of 10.

You can also test for specific errors, for example in the case that there will never be enough memory to fit the required data:

tgcerr "gctest" 10 "(1, (3, (4, 5)))" "Out of memory"

Finally, you can use tvgc to run a tgc test with valgrind, to improve errors on segfaults and check memory.

2.7 No builtin functions🔗

The compiler’s command-line argument -no-builtins should be used to omit all the built-in functions (print, equals, etc) that you may have desugared and produced lambdas for. This is useful to ensure that when we test your programs, they only include memory that the test program’s source requires them to allocate. (Naturally, those programs will not use print etc!) You can create unit tests in your input/do_pass directory using this parameter; read the README file in that directory for full information.

3 List of Deliverables🔗

Again, please ensure the makefile builds your code properly. The black-box tests will give you an automatic 0 if they cannot compile your code!

DO NOT SUBMIT YOUR .git DIRECTORY! For that matter, don’t submit your output or _build directories. Basically, run make clean and then submit a zip of the remaining directory.

4 Grading Standards🔗

For this assignment, you will be graded on

5 Submission🔗

Wait! Please read the assignment again and verify that you have not forgotten anything!

Please submit your homework to https://handins.khoury.northeastern.edu/ by the above deadline.

1You may optimize this, by using the HEAP_END variable to check whether a garbage-collection is needed and skipping it when possible. But this optimization is not mandatory.