Lists in Racket#
Every programming language has some method to store a collection of items. These generally fall into two broad categories lists and arrays. An array tends to have a fixed size. A list can be resized as needed. Racket has a special kind of list designed specifically for recursive programming.
What is a List#
The list is a fundamental data type in Racket and other Lisp related languages. It is inspired by a traditional pen and paper concepts, matched to the limits of computers. A list is just a collection of items stored in a specific order.
Take a grocery store list as an example.
1lb carrots
3 boxes of cereal
4 chicken thighs
1 gallon milk
The items in the list are always in an order. The first item is the one at the top of the list. In this case, 1lb carrots is the first item. The rest of the items are everything in the list with the first element removed.
3 boxes of cereal
4 chicken thighs
1 gallon milk
If we put the first item into our shopping cart, we are left with the rest of the items to find in the store.
With just these two terms, we can refer to any point in the list.
First: The current first item in a list.
Rest: All remaining elements after the first item.
This gives us a recursive understanding of lists.
How would we talk about the second element in the list? It is clearly not the first element, so it is part of the rest. Specifically, it is the first element of the rest of the list. We remove the real first element by saying rest, then the original second element becomes the new first element.
1lb carrots First Element
3 boxes of cereal First of the Rest
4 chicken thighs First of the Rest of the Rest
1 gallon milk First of the Rest of the Rest of the Rest
Why is the third element in the list called the First of the Rest of the Rest? We unpack this statement backwards. The of the rest command means remove the first element from the list.
3 boxes of cereal
4 chicken thighs
1 gallon milk
There is a second of the rest, we again remove the first element.
4 chicken thighs
1 gallon milk
Lastly, we take the first element of the remaining values. In this case, the first element is now “4 chicken thighs”. It was originally third, but since we removed two elements it has become the first.
To get the \(n\)-the element, we need to take the rest of the list \(n-1\) times. Then the element we want will have moved to the first position.
What if we want to add to a list? In Racket, this always happens in the front. The only place we can insert new elements is into the very first position. This pushes all the other values back by one.
The command to add new elements is the cons command. It is short for construct. We always construct a new list by adding one existing element to an old list. We start with our original list of four items.
1lb carrots
3 boxes of cereal
4 chicken thighs
1 gallon milk
We want to construct a new list by adding “1 box rice” to the list. The new element goes in the front. All other elements are pushed back one position.
1 box rice
1lb carrots
3 boxes of cereal
4 chicken thighs
1 gallon milk
We always add to the front for two reasons. First, it is computationally efficient. Using pointers, we can put a new element in front of the old elements without moving anything else around. The old list stays exactly where is was, the new value just needs to store a pointer to the old data. Secondly, it is more naturally recursive. When building lists and taking lists apart, we always work from the front.
How Lists Work#
The commands available for lists are very closely related to hardware. The are based on pointers.
Note
A pointer is a memory address on a computer. It tells us where in our memory a value or data structure lives. Pointers are generally very small compared to data structures. We can always travel to the memory location and get any data we need.
If a list is empty, that means it has no elements in it. We call the empty list null. The null value is a non-existent place in memory. It is traditionally equal to 0, but not always. It is used to say “we don’t actually have anything in memory yet.” Since the empty list has no values to store in memory, it meets this definition. It is a list whose pointer doesn’t get us to any data.
If a list is not null, then it has values. At any given time, we only store two pointers related to the list. Everything else can be accessed by using these values.
first - a pointer to the first value in the list.
rest - a pointer to the rest of the list.
The idea of thinking about lists in terms of only the first and rest was inspired by hardware. On the IBM 704 a single word of memory was broken up into two component parts [Wikipediacontributors21a]. The first part was used for the first part of the list.. The second part was used for a pointer to the rest of the list. When the word was loaded into memory, these two values would go to their respective registers on the processor.
The developers of the original LISP realized this could be used to store a list. We could store information about the first value in the first section and a pointer to the rest of the list in the rest section. This would allow programmers to generate lists in a way that naturally matched the hardware of the system.
Computers are much bigger and have more memory now, but these efficiencies still exist. By only needing to load one element at a time, we can work with massively long lists without wasting memory. Regardless of how big the list is, the process is always only focused on a very small area.
A basic implementation of the Racket list functions are built in Python below. Of course, the practical implementation is defined at a most lower level in assembly in Racket. These functions need to be exceptionally efficient. They are the foundation on which almost every program is built. This Python implementation should give you a feel for the underlying concepts.
We start out by making a class to replicate the word of memory. Since everything in Python can be a pointer, we just need two attributes for our two values.
class node:
def __init__(self):
self.__first = None
self.__rest = None
To work with the list, we need some predicates. It is useful to know if the list is null or not. We can check this by seeing if the value is set or not. Python calls the empty value None
.
def isNull(L):
return L==None
We might also want to ask if the list contains any nodes. Python has a command isinstance
that can solve this problem. We call this command isCons
because nodes can only be created by using the construct command. We are asking “was this built by construct?”
def isCons(L):
return isinstance(L,node)
Python and Racket are both dynamically typed languages. The programmer never tells the code what type a variable has. It has to be determined dynamically. We might want to ask “is this a list?” A variable could also store a string or number. A value is a list if it is either null or contains a node.
def isList(L):
return isNull(L) or isCons(L)
We also want to be able to ask what the first value is. As long as the list is not null, this is easy to get from our attribute.
def first(L):
if isCons(L):
return L.__first
raise error("No first!")
The code to get the rest of the list is almost a mirror image. We just select a different attribute.
def rest(L):
if isCons(L):
return L.__rest
raise error("No rest!")
When we want to create a node, we allocate the memory then set values. The construct command has two inputs. We give it a value we want added to the list and the current list. It places this value at the front of the old list. Since all we have for a list is a pointer, we just need to store the value and place the pointer in the rest section of the node.
def cons(a,L):
n = node()
n.__first = a
n.__rest = L
return n
These six commands form the basis for all list work in Racket. As you can see, none contain any loops. They are very simple functions that can easily be implemented at a very low level for a specific processor. These means they can all run exceptionally fast.
We can use these functions to build more complex tools. What if we wanted to create a list? We set a variable and used construct repeatedly.
L = None
L=cons(1,L)
L=cons(2,L)
L=cons(3,L)
If we want to determine the length of this list, we can do so recursively. The empty list has no elements, so its length is trivially zero. What about longer lists? When we are looking at the list, all we have is the current first element. We have the first element, so there is at least one element. We just need to know how many values are left in the list. This can be handled by rest.
The length function has two cases.
Base Case: The empty list has a size of zero.
Recursive Case: The length of a list with values is 1 + the length of the rest of the list.
The recursive case counts the first element, then recursively counts the rest. We can implement this in Python using our functions.
def length(L):
if isNull(L):
return 0
else:
return 1+length(rest(L))
We can run a few tests on our code.
print("Is the list null?")
print(isNull(L)) #False
print("Does the list have elements?")
print(isCons(L)) #True
print("Is it a list?")
print(isList(L)) #True
print("What is the first element?")
print(first(L)) #3
print("What is the second element?")
print(first(rest(L))) # 2
print("What is the length?")
print(length(L))
The output shows we have created a list.
Is the list null?
False
Does the list have elements?
True
Is it a list?
True
What is the first element?
3
What is the second element?
2
What is the length?
3
Racket Lists#
The Racket language provides the basic list commands we saw in Python. In the case of Racket, these commands are very efficiently implemented. We can pretend they are perfect implementation and build our function on them.
We need to start with the empty list. There are two ways to write the null list. We can explicitly say null
. We can also use a single quote followed by empty parenthesis '()
. This second method visually represents an empty list.
The construct command is implemented as the cons
function. The cons
function takes two inputs. The first is the element to put in front of the list. The second is the list to add the value to. The null
list is always our starting point.
We build lists from the back towards the front. If we want to make a list with the elements \([2,4,6,8]\), we need to start with the last element.
We start by making a list with just the number 8, the last element in our list. Since there is nothing after it, we add this number to the empty list.
(cons 8 null) ;A list with just 8
When Racket displays a list, it shows it in parenthesis. To distinguish between functions and lists, a single quote is put at the start of a list. The return value of the previous cons
command in '(8)
.
If we want to put 6 before the 8, we need two cons
commands.
(cons 6 (cons 8 null)) ;The list '(6 8)
The single quote is critical to differentiate between applying the function named 6 to the input 8 and a list containing 6 and 8. There is no function named 6. The command (6 8)
would be nonsense because we can’t apply the function 6.
To build our retire target list, we just need to cons
all the values backwards. Each time we add the new value to a list. Except for the starting point, the list is the return value of the nested call.
(cons 2 (cons 4 (cons 6 (cons 8 null)))) ;The list '(2 4 6 8)
The cons
command can construct lists our of anything. Racket is dynamically typed. We can put an integer, float, string, or another list into a cons
.
Lists is Racket are always displayed with a single quote and parenthesis. The values are separated by spaces.
Once we have built a list, we can define it as a constant.
(define X (cons 4 (cons 6 (cons 7 null)))) ;The list '(4 6 7)
There is one very big difference between this list and arrays or Python lists. You cannot change the values in the list X
. You could make a new list with modified values, but you cannot modify the values in place.
If we ran (cons 5 X)
the list X
will not change. We made a new list where 5 is the first element, but that does not change X
. This may seem like a poor use of memory, but it is not. Everything is managed with pointers. Because the list positions cannot be changed, many lists can overlap in the same areas of memory.
We may want to ask questions about lists. There are three primary predicates about lists.
(null? X) ;Returns True if X is an empty list
(cons? X) ;Returns True if X is a list with one or more values
(list? X) ;Returns True if X is of the data type list
Once we have a list, we can get the first element out of that list. The first
command gets the first element out of a list. The first
command includes error checking and will give you nice warnings.
Both the following commands will get the first value in the list.
;Get First Value in List
(first X) ;Modern Format Returns 4
We can also get the rest of a list. In practice, this means we get the a pointer to the list with the first element removed.
(rest X) ;Modern Format Returns '(6 7)
We need to create lists often. There are two short-cuts that can be used. One is the quoted list. We can write the list out exactly as we want it using a single quote.
(define L '(1 2 3 4 5))
This can cause some unexpected behavior.
(define M '(+ 1 9))
You might think the list M
would contain the number 9. It does not. It contains exactly what was typed. It is a list with three values.
The values are
The addition function
The number `
The number 9
This method is useful for symbolic reasoning, but may not be what we want. If we want the evaluation to happen, we can use the list
command. This does all evaluations first, then makes the list.
(define N (list (+ 1 9)))
Both these can be implemented using the cons
command, but they save us time.
Note
Which command you use should depend on the situation. Use the quoted list when you want to type exactly what you want in the list. Use the list
command when you need the list elements evaluated first. Use the cons
command when you need to programmatically build the list.
If we create a multi-element list, we can access each element using just first
and rest
.
(define Q '(1 2 3 4 5))
(first Q) ; the first element 1
(first (rest Q)) ; The second element 2
(first (rest (rest Q))) ; The third element 3
(first (rest (rest (rest Q)))) ; The fourth element 4
(first (rest (rest (rest (rest Q))))) ; The fifth element 5
We can build very nested list using just these commands. We can build '(1 (2 (3)) (4) () 5)
using just cons
and null
. We always start from the end. Since this example has nested lists, it is useful to build them before added them.
The last element is 5.
(cons 5 null) ;Makes '(5)
Before the 5, we have a null list. We can add a null
as the element at the first on the list.
(cons null (cons 5 null)) ;Makes '(() 5)
The next value working backwards is a list containing the number 4. First, we test building that list. Then we add it to our command sequence.
(cons 4 null) ;Makes '(4)
(cons (cons 4 null) (cons null (cons 5 null))) ;Makes '((4) () 5)
The next element is a nested list with two values '(2 (3))
. The number 2 followed by a list containing the number 3.
(cons 2 (cons (cons 3 null) null)) ; Makes '(2 (3))
(cons (cons 2 (cons (cons 3 null) null))
(cons (cons 4 null)
(cons null (cons 5 null))
)
) ; Makes '((2 (3)) (4) () 5)
The first value is just the number 1. The final command to generate the list is tabbed in for better readability.
(cons 1
(cons (cons 2 (cons (cons 3 null) null))
(cons (cons 4 null)
(cons null
(cons 5 null)
)
)
)
) ; Makes '(1 (2 (3)) (4) () 5)
Using just the seven most basic list commands, we can build any list. The last piece we need is recursive functions to make our lists.
(null? A)
is A a null list(cons? A)
is A a list with at least one element(list? A)
is A a list(cons x A)
add element A to the front of list A(first A)
get the first element of A(rest A)
get the rest of list Anull
a constant for the empty list.
Note
Racket has a feature called a dotted pair. It should be avoided unless you have a very good reason to use it. Since the rest position in the list is a pointer, you can technically ask it to point to anything. This works up to a point. The data is stored, but you no longer have a list.
If we execute (cons 7 6)
we have made a dotted pair '(7 . 8)
. Notice the dot between the values. This is not a list. You cannot use first
or rest
on it. Most list operations will fail. Dotted pairs should be avoided. They have some special uses, but should be avoided in most cases.
Recursive Functions on Lists#
Making and using lists in Racket requires recursive thinking. We implement a few common list methods to show different ways to recursively interact with a list.
List Length#
The first function we look at determines the length of a list. This is a build in command, but it provides a good case study. This function only takes lists as inputs. The input-contract
line has no meaning within the language, it is just commenting style. It tells every only programmer that this function only works in situations where (list? myList)
returns true. If that is not the case, there is no promise this code will work. The output-contract
tells the programmer what to expect as output. In this case, an integer. That integer is the length of the list.
To compute the length of a list recursively, we need a base case. The smallest list in existence is the null list. We know for a fact it as a length of 0 because it contains nothing. We can start with (if (null? myList) 0 ...)
.
What about the recursive case? We need to think of the list in terms of first and rest. When we are concerned about length, all that matters about first is its existence. The first element counts for 1 unit of length. The rest of the list also has a length. We can compute this recursively. We can just ask (length (rest myList))
. We then need to add one to the result to account for the first value.
Racket already has a length
function build it, so we use numItems
instead. This is just to make it clear we are not messing with the built-in length
function. We could overwrite it, but in general it is bad practice to overwrite a built-in. You never know what effects you might have.
The full function is given below.
;Compute Size of a List
;input-contract: (list? myList)
;output-contract: (int? (numItems myList))
; and it is the size of list
(define (numItems myList)
(if (null? myList)
0
(+ 1 (numItems (rest myList)))
)
)
(numItems '(1 2 3 4 5 6)) ;Returns 6
(numItems '()) ;Returns 0
Total List Values#
What if we want to look at a problem where the first value matters? We want to sum up all the values in a list. This time our function acts on a list which must only contain integers. If this is not the case, we make no promises about it working. We might guess that it would probably also work for floats, but might not for strings. This isn’t relevant. The input-contract
only promises it will work on integers. If there are any other data types that happen to work, it is pure luck and not promised.
The function says it will return to us an integer that is the sum of all values. Even if the function worked with other data types, the comment is telling us it is only tested and confirmed in this situation.
Designing the function requires looking at the base case and recursive case. Again, the smallest possible list is the empty list. We need some value for the total of no elements. The most obvious solution is to return 0.
When the list has elements, we need to look at the first and rest. What do we do with each? The first value needs to be added to the running total. If we add all first values until we run out, we will have totaled the entire list. For the rest of the list, we also need to know the total. We can recursively ask what the total of the rest of the list is.
;Total Values in List
;input-contract: (lists? myList) and all elements are integers
;output-contract: (int? (total myList))
;and returns sum of all items
(define (total myList)
(if (null? myList)
0
(+ (first myList) (total (rest myList)))
))
We can expand out the recursive calls one at a time to see how the function works.
(total '(2 4 6 8 10))
(+ 2 (total '(4 6 8 10)))
(+ 2 (+ 4 (total '(6 8 10))))
(+ 2 (+ 4 (+ 6 (total '(8 10)))))
(+ 2 (+ 4 (+ 6 (+ 8 (total '(10))))))
(+ 2 (+ 4 (+ 6 (+ 8 (+ 10 (total null))))))
(+ 2 (+ 4 (+ 6 (+ 8 (+ 10 0)))))
30
Create a List of Even Numbers#
We also need to create lists recursively. This will be built on the cons
command. This function creates a list of even numbers. Generating a sequence of \(n\) even numbers requires us to start at zero and add 2 with each step.
\( \begin{align} & 0, (0+2), (0+2+2), (0+2+2+2), \cdots \\ & 0, 2, 4, 6, \cdots \end{align} \)
This function always starts at zero, but how can we implement that in Racket? We have no global or local variables to assign to zero and update. We need to use function inputs. We can create two functions, the first function (evennums n)
just takes the number of values we need. It defaults our second value in a helper (evennums_h n 0)
.
What is the smallest list of even numbers? It is the empty list which contains zero even numbers. This gives us a base case. If \(n\)=0 then we can just return the null list.
To come up with the recursive case think about how (cons a B)
works. It takes a value and puts that in the front of the list. Think about the list '(0 2 4 6)
. It is a list of the first 4 even numbers. We could also think about it as “the number zero followed by three more even numbers.”
This approach leads us to a recursive definition. We need to (cons currentValue (...))
. To make the recursive list, we need to decrease \(n\) by one because we have added one element to the list. Since we want even numbers, we also increase currentValue
by 2 so that it has the next even value.
The full implementation is given below.
;Create a list of even numbers
;input-contract: (int? n)
;output-contract: (list? (evennums n))
;with the first n even numbers
(define (evennums n)
(evennums_h n 0))
;This function actually solves the problem
;using the start define above
(define (evennums_h n currentValue)
(if (= n 0)
null
(cons currentValue
(evennums_h (- n 1) (+ currentValue 2))))
)
(evennums 200)
Since these functions are pure, we can easily break up each step and evaluate it individually to see what is happening.
(evennums 4)
(evennums_h 4 0)
(cons 0 (evennums_h 3 2))
(cons 0 (cons 2 (evennums_h 2 4)))
(cons 0 (cons 2 (cons 4 (evennums_h 1 6))))
(cons 0 (cons 2 (cons 4 (cons 6 (evennums_h 0 8)))))
(cons 0 (cons 2 (cons 4 (cons 6 null))))
'(0 2 4 6)
Get Element#
A common issue with Racket is the programmers design to have arrays where each index can be accessed. This is possible in Racket, but it is less efficient then a recursive pattern.
This example uses a new command called cond
. The condition allows us to chain together if statements. It is similar to a switch
or elif
sequence in a iterative language. We give a sequence of pairs commands [(condition) (return value when condition is true)]
. The code goes down the chain until it finds a case that is true. There is also a special keywork else
which tells Racket to execute this code if none of the previous conditions were true.
The reason a condition helps us in this example is there are two base cases. We want to access the value at index \(x\) from the list \(A\). It is possible the list will be empty. If the list contains no elements, it doesn’t matter what index you asked for. That index does not exist and we need to return nothing.
That was one base case, what is the other? If we are asked to get the value at index zero, traditionally the first value, this is trivial. We already have a command called first
. We can just call it.
If neither of these cases happen, we need recursion. If we remove the first index from the list, we can imagine all the indexes shifting down by one. In Python, imagine we have L=[2,4,6]
. Right now L[0]
is two and L[1]
is four. If we remove the first element, we are left with L=[4,6]
. The value that used to be at index one is not at index zero. It has shifted down by one.
Each time we take the rest of the list, we can shift the target index down by one. Eventually, it will reach zero or the list will run out of elements. This gives us a recursive call (getElement (- x 1) (rest A))
.
The full implementation is given below.
;Gets the element at index x (starting from 0)
;out of the list A
;input-contract: (int? x) and (list? A)
;output-contact: element at index x or null if index does not exist
(define (getElement x A)
(cond
;No element to return if list is null
[(null? A) null ]
;The first element has a command
[(equal? x 0) (first A)]
;For the rest, remove the first value and decrease counter
[else (getElement (- x 1) (rest A))]
)
)
;the following code will return 5
(getElement 4 '(1 2 3 4 5 6 7 8 9 10) )