4 Programming in Curry

4.2 Lists

4.2.1 Notation

A List is a simple algebraic polymorphic datatype defined by two constructors conventionally referred to as Nil and Cons. Within the Curry language, the datatype “List of a” would be declared as:

data List a = Nil | Cons a (List a)

Because lists are one of the most frequently used types in functional, logic and functional logic programs, many languages offer several special notations for lists. In Curry, the type “List of a”, where a is a type variable that stands for any type, is predefined and denoted by [a]. Likewise, [] denotes the constructor Nil, the empty list, and “:” denotes the constructor Cons, which takes an element of type a and a list of a’s. Thus, with a syntax that is not legal in Curry, but is quite expressive, the above declaration would look like:

data [a] = [] | a : [a]

The expression (u:v) denotes the list with the first element u followed by the list v. The infix operator “:”, which read “cons”, is predefined, right associative and has precedence 5. This implies that u:v:w is parsed as u:(v:w).

A list can also be denoted by enumerating its elements, e.g., “[u,v,w]” is a list containing three elements, “u”, “v” and “w”, i.e., it is just another notation for “u:v:w:[]”. This notation can be used with any number of elements. The elements are enclosed in brackets and separated by commas. This notation has several advantages over the standard algebraic notation: lists stand out in a program and references to lists are textually shorter. In particular, the number of parentheses occurring in the text is reduced. This claim can be easily verified by comparing the built-in notation with the ordinary notation.

The type list is polymorphic, which means that different lists can have elements of different types. However, all the elements of a particular list must have the same type. The following annotated examples show this point [Browse Program][Download Program]:

-- list of integers
digits = [0,1,2,3,4,5,6,7,8,9]
-- list of characters, equivalent to "Pakcs", print with putStr
string = [’P’,’a’,’k’,’c’,’s’]
-- list of list of integers
matrix = [[1,0,2],[3,7,2],[2,8,1],[3,3,4]]

Other special notations available for lists are described in Sections 4.2.3 and 4.2.4.

4.2.2 Inductive Definitions

Many elementary functions on lists are defined by an induction similar to that available for the naturals. The cases of the induction are conveniently defined by different rules using pattern matching. For lists, the base case involves defining a function for [] whereas the inductive case involves defining the function for a list (u:v) under the assumption that the value of the function for v is available. In a program, this is expressed by a recursive call. The function that counts the number of elements of a list is emblematic in this respect:

len []    = 0
len (u:v) = 1 + len v

For computing the length of a list, the value of u is irrelevant and u should be replaced by an anonymous variable in the above definition.

Exercise 7

Code an inductively defined function that takes a list of integers and returns the sum of all the integers in the list. Hint: the function should return 0 for an empty list. [Browse Answer][Download Answer]

The prelude defines many useful functions on lists, e.g., “++” for concatenation, “!!” for indexing, i.e., (l!!i) is the i-th (starting from 0) element of l, etc. We will use some of these functions, after providing a brief explanation, in this section. We might also re-define some functions already available in the prelude or other libraries when they make good examples. E.g., the function len discussed above is equivalent to the function length of the prelude. In Section 4.2.5, we will present the most important list functions available in the prelude.

Functions inductively defined are easy to code, understand and evaluate. Sometimes they may be inefficient. Below are two definitions of a function to reverse a list. For long lists, the second one is much more efficient.

slowRev [] = []
slowRev (u:v) = slowRev v ++ [u]
fastRev l = aux l []
  where aux [] r = r
        aux (u:v) r = aux v (u:r)

A function inductively defined performs a “traversal” of its argument. During this traversal some computation is performed on each element of the list—this is referred to visiting a cons—and the result combined with a recursive invocation of the function. Loosely speaking, the visit can be performed either before the recursive call, or after, or both. The following example shows how to subtract the minimum element of a list of integers from all the elements of the list. The function performs a single traversal of its argument. The minimum of the list is computed (as much as feasible) before the recursive call. The subtraction is computed after the recursive calls (otherwise the minimum could not be known) [Browse Program][Download Program]:

submin []     = []
submin (x:xs) = fst (aux (x:xs) x)
  where aux []     m = ([],m)
        aux (y:ys) m = let (zs,n) = aux ys (min y m)
                        in (y-n:zs,n)

The function fst, which returns the first element of a pair, is defined in the prelude. The function min, which returns the minimum of two integers, is defined in the standard prelude.

More complicated computations may lead to more complicated inductive definitions. A discussion on the structure and the design of inductively defined function is in [7].

Exercise 8

Code an inductively defined function that transposes a matrix represented by a list of lists (all of the same length). [Browse Answer][Download Answer]

There are a couple of noteworthy alternatives to directly defining inductive functions. One involves higher-order list functions. Some of these functions are presented in Section 4.2.6. The other involves narrowing. Lists are a particularly fertile ground for narrowing. Below are two definitions of the function that computes the last element of a list. The first definition is inductive, whereas the second is narrowing-based.

inductLast [x]     = x
inductLast (x:y:z) = inductLast (y:z)
narrowLast (_++[e]) = e

Observe that narrowLast computes the result by solving the equation “l == _++[e]” in which “l” is the argument of the call.

4.2.3 Ranges

A special notation is available to define lists containing ranges of integers. The most common of this notation is “[e_1 .. e_2]” which denotes the list “[e_1,e_1+1,e_1+2,,e_2]”. For example:


Prelude> [2 .. 5]
Result: [2,3,4,5] ?
Prelude>

Similarly, the expression “[e ..]” denotes the infinite list of all the integers starting from e. This list cannot be printed in its entirety, but it can be used in a program if only a finite portion of the list is needed, because the evaluation strategy is lazy.

The elements in the lists defined by the above expressions are consecutive, i.e., the distance between adjacent elements is one. The above expressions can be generalized to produce lists where the distance between adjacent elements is a constant greater than one. This distance is inferred from the first two elements of the expression. For example:


Prelude> [2, 6 .. 20]
Result: [2,6,10,14,18] ?
Prelude>

Likewise, “[2, 6 ..]” generates the infinite list “[2,6,10,14,]”.

Ranges can be defined using ordinary functions. The prelude defines four functions whose names start with enumFrom. These functions define in the ordinary syntax the notations for ranges.

4.2.4 Comprehensions

Another useful notation involving lists goes under the name of list comprehension. A list comprehension is a notation to construct a list from one or more other lists called generators. It goes without saying that ranges are simple generators. For example, the infinite sequence of square and triangular numbers are obtained as follows [Browse Program][Download Program]:

squares   = [x * x | x <- [0 ..]]
triangles = [x * (x+1) div 2 | x <- [0 ..]]

A generator is an expression of the form var <- list. Generators can be nested and/or combined with guards. A guard is a Boolean expression that filters the elements produced by the generator. For example, if isPrime is a predicate telling whether an integer greater than 2 is a prime number, the following comprehension is the sequence of the prime numbers [Browse Program][Download Program]:

primes = [x | x <- [2 ..], isPrime x]

In this example, the guard is the Boolean expression (isPrime x). The elements produced by the generator are passed to the comprehension if and only if the guard holds.

Generators are considered to be nested from left to right. The following example shows how to compute pairs where the second component is not greater than the first [Browse Program][Download Program]:

lexPairs = [(x,y) | x <- [0 .. 3], y <- [x .. 3]]

This simple example shows that the second generator (y<-[x .. 3]) is nested within the first one, since it references the generated elements.

Exercise 9

Compute the Fibonacci sequence using a list comprehension. Hint: compute a list of pairs of numbers where each pair contains two consecutive Fibonacci numbers. [Browse Answer][Download Answer]

4.2.5 Basic Functions

The PAKCS compiler/interpreter of Curry is distributed with the prelude, a collection of primitive and fundamental types and functions, and with several libraries. The prelude and some of these libraries contain useful list functions. In this section, we informally discuss some of these functions. The currydoc documentation utility, which is distributed with PAKCS, should be used for an exhaustive up-to-date consultation of the content of these libraries.

Name Description Example(s)
head First element of a list head [1,2]=1; head [] fails
tail All the elements but the first tail [1,2]=[2]; tail [] fails
length Length length [1,2]=2
null Tell whether it is nil null [1,2]=False
++ Concatenate two lists [1,2]++[3]=[1,2,3]
!! n-th element of a list [1,2]!!1=[2]; [1,2]!!4 fails
reverse Reverse the order of the elements reverse [1,2]=[2,1]
concat Concatenate all the lists of a list concat [[1,2],[3]]=[1,2,3]
take List of the first n elements take  2 [1,2,3]=[1,2]
drop All elements but the first n drop  2 [1,2,3]=[3]
and Boolean conjunction and [True,False,True]=False
or Boolean disjunction or [True,False,True]=True
elem Whether a value is in a list elem  2 [1,3,5]=False
nub Remove duplicates nub [1,2,2]=[1,2]
delete Remove the first occurrence of a value delete  2 [2,1,2]=[1,2];
delete  2 [1]=[1]

Many more functions that operate on lists are defined in the libraries of the PAKCS distribution (e.g., see the library Data.List which contains the definition of nub and delete discussed above). The above table is intended to give only a feeling of what is available.

4.2.6 Higher-order Functions

Lists are commonly used to represent collections of elements. Some computations of a list can be expressed by repeatedly applying another, somewhat simpler, computation to all the elements of the collection. This section discusses some frequently occurring situations of this kind.

The simplest case is when a list, which we refer to as the result list, is obtained from another list, which we refer to as the argument list, by applying the same function, say f, to all the elements of the argument list. This is easily accomplished by defining a new function, say flist since its analogy to f, as follows:

flist []     = []
flist (x:xs) = f x : flist xs

Although trivial, the definition of flist can be avoided altogether using the function map, provided by the prelude. The function map is higher-order in that it takes as an argument the function, in this example f, that is applied to all the arguments of the list. Thus, the function flist defined above is the same as map  f.
The following code, taken from the prelude, shows the type and the definition of map:

map          :: (a->b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

It can be seen that the first argument of map is a function from any type a to any type b. The second argument of map is a list whose elements must have, of course, type a. The result is a list of type b. For example, suppose that isEven is a function telling whether an integer is even. Then, the expression (map isEven [0,1,2,3]) evaluates to [True,False,True,False].

A second frequently used higher-order function on lists is filter. As the name suggests, filter is used to filter the elements of a list that satisfy some criterion expressed by a predicate.
The following code, taken from the prelude, shows the type and the definition of filter:

filter          :: (a -> Bool) -> [a] -> [a]
filter _ []     = []
filter p (x:xs) = if p x then x : filter p xs else filter p xs

It can be seen that the first argument of filter is a function from any type a to Bool, i.e., a predicate. The second argument of map is a list whose elements must have, of course, type a. The result is again a list of type a. The elements of the result are the elements of the second argument that satisfy the predicate. For example, as before, suppose that isEven is a function telling whether an integer is even. Then, the expression (filter  isEven  [0,1,2,3]) evaluates to [0,2].

The last higher-order function operating on lists that we describe in this section is used to “combine together” all the elements of a list. For example, a function that adds all the elements of a list of integers can be defined using a higher-order function and the ordinary addition on integers. Several options should be considered, e.g., whether the elements of a list are composed starting with the first or the last one, whether the list can be empty and thus a default value must be supplied, etc. The prelude contains a family of functions, referred to as folds for this purpose. The names of these functions starts with “fold”.
The following code, taken from the prelude, shows the type and the definition of foldr:

foldr            :: (a->b->b) -> b -> [a] -> b
foldr _ z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

For example, functions that compute the sum, product and maximum of all the elements of a list of integers are easily defined through foldr as follows [Browse Program][Download Program]:

sumList  = foldr (+) 0
prodList = foldr (*) 1
maxList  = \l -> foldr max (head l) (tail l)

The last function is more complicated than the previous two, because it is meaningful only for non-empty lists. The function foldr1, defined in the prelude, would simplify our definition of maxList.

4.2.7 Set Functions

A non-deterministic function f may return different results for some combination of arguments. The programmer has no control over which of these results an invocation of f returns. Any result returned by f is completely unrelated to any other result. For some problems, it may be convenient to consider all the results returned by f as a whole. The device that returns all the results returned by f for some combination of arguments is called the set function of f and is typically denoted f_𝒮. For example, given:

f x = x ? x+1

the value of f_𝒮 x is the set {x,x+1} where the details of the representation of this set are irrelevant to our discussion.

The definition of the set function of a function f separates the non-determinism of f from any non-determinism of the arguments to which f_𝒮 is applied [4]. In particular, f_𝒮 is deterministic for any f. Referring to the example above, f_𝒮 (2 ? 4) evaluates to {2,3} and {4,5} in which the non-determinism originates entirely from the argument.

To ease the definition of set functions, there is a library Control.Search.SetFunctions. This library defines both a suitable abstraction of a type set to represent the results of a set function application, and a family of functions set0, set1,   Gives a function f of arity n, the set function of f is provided by the expression setn f. The first argument of setn must be the identifier of a function of arity n.

For example, consider the problem of computing all the subsets of a set S. This is called the powerset of S. Let us represent a set with a list. This representation requires some care to ensure that duplicate elements in a list and the order of the elements in a list are not observable. We ignore these issues since they are irrelevant to our discussion. The following non-deterministic function returns a subset of its argument, a set represented as a list [Browse Program][Download Program]:

subset []     = []
subset (x:xs) = x:subset xs
subset (_:xs) =   subset xs

In PAKCS, with the help of the library Control.Search.SetFunctions, we define the power set of a set as follows:

powerset s = set1 subset s  -- powerset is the set function of subset

For example, the prerequisites for the undergraduate Computer Science courses at Portland State are abstracted by 16 rules as follows [Browse Program][Download Program]:

isPrereqOf 162 = 161
isPrereqOf 163 = 162
isPrereqOf 200 = 162
...
isPrereqOf 303 = 252
isPrereqOf 303 = 300
isPrereqOf 350 = 252

The meaning is that, e.g., 162 is a direct prerequisite of both 163 and 200 and that, e.g., both 252 and 300 are direct prerequisites of 303. Observe that IsPrereqOf is a many-to-many relation.

The function to compute all the direct prerequisites of a course is shown below [Browse Program][Download Program]:


isPrereqOf_𝒮 course = set1 IsPrereqOf course

4.2.8 Narrowing

Narrowing is a convenient programming feature when dealing with lists. Lists are frequently used to represent collections of elements. Sometimes the problem is to find in a list either elements or sublists that satisfy certain relationships. The programmer can either code functions to compute these elements or express the relationships using variables for these elements and let narrowing compute the elements by instantiating the variables. Generally, the latter leads to simpler and more declarative programs.

For example, consider a program that plays the game of poker. A hand is represented by a list of 5 cards. Suppose that the problem is to find whether 4 out of the 5 cards have the same rank, i.e., the hand is a four-of-a-kind. A narrowing-based solution removes one card from the hand so that the remaining 4 cards have the same rank. The following function takes a hand and returns a Maybe type. If the hand is a four-of-a-kind, the function returns just the rank of the four cards, otherwise it returns nothing. [Browse Program][Download Program]:

isFour (x++[_]++z) | map rank (x++z) == [r,r,r,r] = Just r where r free
isFourdefault _ = Nothing

The card removed from the hand is represented by the anonymous variable in the functional pattern. This card is non-deterministically selected. The remaining cards are represented by x and z. Additionally, the condition of the first rule imposes that all the cards in x and z have the same rank, r. The rank, too, is non-deterministically selected by solving the equation “map rank (x++z) == [r,r,r,r]”. If the condition succeeds, there is obviously a unique value for all these variables. If the first rule cannot be applied, it must be that there are no four cards of the same rank in the hand. The default rule reports this condition

The advantage of the narrowing-based approach over more conventional approaches is that no instructions need to be coded both to isolate the card that does not contribute to the four-of-a-kind nor to find the rank of the four-of-a-kind when it exists.

The above example is typical of situations in which a collection contains elements that must satisfy a certain conditions. Since lists are implicitly ordered, conditions involving the position of elements in a collection can also be conveniently expressed using narrowing.

Exercise 10

Similar to the example just discussed, code a function that tells whether a hand in a game of poker is a full house. Hint: [Browse Cards.curry][Download Cards.curry] defines suits, ranks, etc. [Browse Answer][Download Answer].