Computing Partitions
In an earlier article on optimization, we evaluated all the different ways of splitting 13 items into sets of 5+4+4 and, applying a cost function to each partition, found one that minimised the total cost. It turned out that the partition 13=5+4+4 gave an optimal solution, but we didn't know in advance that this would be the case: perhaps splitting the items into 13=6+4+3 would have been better.
The careful reader may also have noticed that the algorithm we used to generate the partitions produced duplicate entries. This didn't matter in that particular application, as we were just interested in finding an optimal split, but there are times when we want a unique list. In this article we develop an algorithm that generates a list of all the partitions of a given set into n subsets.
Wikipedia gives the following definition of a partition:
A partition of a set X is a set of nonempty subsets of X such that every element x in X is in exactly one of these subsets.
For example, these are all the partitions of the set {1,2,3}:
Partitions of {1,2,3} |
---|
{ {1,2,3} } |
{ {1}, {2,3} } |
{ {1,2}, {3} } |
{ {1,3}, {2} } |
{ {1}, {2}, {3} } |
Note that { {1,3}, {2,3} } is not a partition of {1,2,3} because the element 3 appears in two of the subsets.
We are going to use a recursive algorithm to compute the partitions. There is no way to partition a non-empty set into zero non-empty subsets, and there is no way to partition a set of n elements into more than n non-empty subsets. In these two cases, our function should return an empty list.
There is only one way to split a set of n elements into n subsets (the set of n sets, each containing a single element), and there is only one way of splitting it into exactly 1 subset (namely, the set itself). This gives us the beginnings of our recursive function:
(defn partitions
"Return all partitions of the set `X` into `n` non-empty subsets."
[X n]
(cond
(< n 1) '()
(> n (count X)) '()
(= n 1) (list #{X})
(= n (count X)) (list (set (map hash-set X)))
:else :TODO))
We have the terminal conditions for our recursion, but how do we implement the recursive part? If we remove the first element from the set, we have a set of n-1 elements on which we can recur, so we just have to work out what to do with the first element. It is either in a subset on its own, or in a subset with at least one other element.
To compute partitions where the first element is in a subset on its own, we compute partitions of the remaining elements of size n-1 and build partitions of size n by adding the singleton set of the first element to each partition.
For example, if we are partitioning the set {a,b,c,d} into 3, we start by computing all partitions of {b,c,d} of size 2, then add the set {a} to each partition;
Partitions of {b,c,d} of size 2 | After adding the set {a} |
---|---|
{ {b}, {c,d} } | { {a}, {b}, {c,d} } |
{ {b,c}, {d} } | { {a}, {b,c}, {d} } |
{ {b,d}, {c} } | { {a}, {b,d}, {c} } |
The code to do this looks like:
(let [x (first X)
ps (partitions (disj X x) (dec n))]
(map (fn [p] (conj p #{x})) ps))
To compute partitions where the first element is in a subset with at least one other element, we partition the remaining elements into n subsets and, for each partition, generate n new partitions by adding the first element to each of these n subsets in turn. Continuing the example above:
Partitions of {b,c,d} of size 3
After adding the element a to each subset in turn
{ {b}, {c}, {d} }
{ {a,b}, {c}, {d} }
{ {b}, {a,c}, {d} }
{ {b}, {c}, {a,d} }
It will be useful to introduce a helper function that takes a partition of size n and an element, and returns a list of n partitions as in the above table.
(defn expand-partition
"Given a partition of size n and an element, generate n new
partitions by adding `element` to each subset in turn."
[partition element]
(let [partition (vec partition)]
(for [i (range (count partition))]
(set (update-in partition [i] #(conj % element))))))
With this in hand, we can compute the partitions where the first element is not in a set on its own:
(let [x (first X)
ps (partitions (disj X x) n)]
(mapcat (fn [p] (expand-partition p x)) ps))
To compute all partitions into n subsets, we concatenate the two lists of partitions with lazy-cat
. Putting this all together into our partitons
function:
(defn partitions
"Return all partitions of the set `X` into `n` non-empty subsets."
[X n]
(cond
(< n 1) '()
(> n (count X)) '()
(= n 1) (list #{X})
(= n (count X)) (list (set (map hash-set X)))
:else (let [x (first X)
xs (disj X x)]
(lazy-cat (map (fn [p] (conj p #{x})) (partitions xs (dec n)))
(mapcat (fn [p] (expand-partition p x)) (partitions xs n))))))
(partitions #{"a" "b" "c" "d"} 3)
;; => (#{#{"a"} #{"b"} #{"c" "d"}}
;; #{#{"a"} #{"d"} #{"b" "c"}}
;; #{#{"a"} #{"c"} #{"b" "d"}}
;; #{#{"c"} #{"a" "b"} #{"d"}}
;; #{#{"b"} #{"d"} #{"a" "c"}}
;; #{#{"b"} #{"c"} #{"a" "d"}})
So far, we have only computed partitions of a specified size. We can easily generate a list of all partitions of a given set by calling our partitions
function repeatedly:
(defn all-partitions
[X]
(mapcat (partial partitions X) (range 1 (inc (count X)))))
(all-partitions #{1 2 3})
;; => (#{#{1 2 3}}
;; #{#{1} #{2 3}}
;; #{#{1 2} #{3}}
;; #{#{2} #{1 3}}
;; #{#{1} #{2} #{3}})
A nice feature of Clojure lets us define a function with a different body according to the number of arguments, so we can bring the functionality of all-partitions
into our partitions
function:
(defn partitions
"Return all partitions of the set `X` into `n` non-empty subsets.
When `n` is not specified, return all partitions of `X`"
([X n]
(cond
(< n 1) '()
(> n (count X)) '()
(= n 1) (list #{X})
(= n (count X)) (list (set (map hash-set X)))
:else (let [x (first X)
xs (disj X x)]
(lazy-cat (map (fn [p] (conj p #{x})) (partitions xs (dec n)))
(mapcat (fn [p] (expand-partition p x)) (partitions xs n))))))
([X]
(mapcat (partial partitions X) (range 1 (inc (count X))))))
Coming full circle to our optimization example, when we used the clojure.math.combinatorics/combinations
function to compute partitions of 13=5+4+4, we evaluated (13 5) * (8 4) = (13!/(8!5!))*(8!/(4!4!) = 90090 partitions. Using our new approach, and evaluating all partitions of 13 items into 3, we'd have to check (count (partitions (set (range 13)) 3)
, or 261625 possibilities.
The source code from this article is also available as a Gist.