Linear programming duality

The conventional statement of linear programming duality is completely inscrutable.

  • Primal: maximise b^T x subject to Ax \leq c and x \geq 0.
  • Dual: minimise c^T y subject to A^T y \geq b and y \geq 0.

If either problem has a finite optimum then so does the other, and the optima agree.

But I do understand concrete examples.  Suppose we want to pack the maximum number of vertex-disjoint copies of a graph F into a graph G.  In the fractional relaxation, we want to assign each copy of F a weight \lambda_F \in [0,1] such that the total weight of the copies of F containing each vertex is at most 1, and the sum of all the weights is as large as possible.  Formally, we want to

maximise \sum \lambda_F subject to \sum_{F \ni v} \lambda_F \leq 1 and \lambda_F \geq 0,

which dualises to

minimise \sum \mu_v subject to \sum_{v \in F} \mu_v \geq 1 and \mu_v \geq 0.

That is, we want to weight vertices as cheaply as possible so that every copy of F contains at least 1 (fractional) vertex.

To get from the primal to the dual, all we had to do was change a max to a min, swap the variables indexed by F for variables indexed by v, and flip one inequality.  This is so easy that I never get it wrong when I’m writing on paper or a board!  But I thought for years that I didn’t understand linear programming duality.

(There are some features of this problem that make things particularly easy: the vectors b and c in the conventional statement both have all their entries equal to 1, and the matrix A is 0/1-valued.  This is very often the case for problems coming from combinatorics.  It also matters that I chose not to make explicit that the inequalities should hold for every v (or F, as appropriate).)

Returning to the general statement, I think I’d be happier with

  • Primal: maximise \sum_j b_j x_j subject to \sum_j A_{ij}x_j \leq c_i and x \geq 0.
  • Dual: minimise \sum_i c_i y_i subject to \sum_i A_{ij} y_i \geq b_j and y \geq 0.

My real objection might be to matrix transposes and a tendency to use notation for matrix multiplication just because it’s there.  In this setting a matrix is just a function that takes arguments of two different types (v and F or, if you must, i and j), and I’d rather label the types explicitly than rely on an arbitrary convention.
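Strong duality is also easy to check by hand on any small instance. Here is a minimal Python sketch with a hand-picked 2x2 example (the matrix and vectors are my own, not from the text above): we verify that an explicit primal–dual pair is feasible on both sides and has equal objective values, which by weak duality certifies that both are optimal.

```python
# Hand-picked 2x2 instance (my own example):
# primal: maximise x1 + x2 subject to x1 + x2 <= 2, x1 <= 1, x >= 0
# dual:   minimise 2y1 + y2 subject to y1 + y2 >= 1, y1 >= 1, y >= 0
A = [[1, 1],
     [1, 0]]
b = [1, 1]  # primal objective / dual constraint bounds
c = [2, 1]  # primal constraint bounds / dual objective

x = [1, 1]  # claimed primal optimum
y = [1, 0]  # claimed dual optimum

# Primal feasibility: Ax <= c and x >= 0.
assert all(sum(A[i][j] * x[j] for j in range(2)) <= c[i] for i in range(2))
assert min(x) >= 0

# Dual feasibility: A^T y >= b and y >= 0.
assert all(sum(A[i][j] * y[i] for i in range(2)) >= b[j] for j in range(2))
assert min(y) >= 0

# Weak duality gives b.x <= c.y for any feasible pair;
# equality certifies that both are optimal.
primal_value = sum(b[j] * x[j] for j in range(2))
dual_value = sum(c[i] * y[i] for i in range(2))
assert primal_value == dual_value == 2
```

The same check works for the F-packing example: a fractional packing and a fractional vertex cover with equal total weight certify each other.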

Random Structures and Algorithms 2017

A partial, chronologically ordered, list of talks I attended at RSA in Gniezno, Poland. Under construction until the set of things I can remember equals the set of things I’ve written about.

Shagnik Das

A family of subsets of [n] that shatters a k-set has at least 2^k elements. How many k-sets can we shatter with a family of size 2^k? A block construction achieves (n/k)^k \approx e^{-k} \binom n k. (Random is much worse.) We can in fact shatter a constant fraction of all k-sets. When n = 2^k-1, identify the ground set with \mathbb F_2^k \setminus \{0\}, and colour by \chi_w(v) = v \cdot w for w \in \mathbb F_2^k.

Claim. A k-set is shattered if and only if it is a basis for \mathbb F_2^k.

Proof. First suppose that v_1, \ldots, v_k is a basis. Then for any sequence \epsilon_i, there is a unique vector w such that v_i \cdot w = \epsilon_i. (We are just solving a system of full rank equations mod 2.)

Next suppose that v_1, \ldots, v_k are linearly dependent; that is, they are contained in a proper subspace U of \mathbb F_2^k. Choose w \neq 0 orthogonal to U. Then for any u \in U and any w' we have u \cdot w' = u \cdot (w+w'), so two of our colourings agree on v_1, \ldots, v_k. \Box

We finish with the observation that random sets of k vectors are fairly likely to span \mathbb F_2^k: the probability is

    \[ (1 - 1/2^k) \cdot (1 - 1/2^{k-1}) \cdot \cdots \cdot (1 - 1/2) \geq 1 - \sum_{j=1}^k 1/2^j > 0. \]

Blowing up this colouring gives a construction that works for larger n.
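For small k the claim is easy to verify exhaustively. A quick Python sketch (my own, with k = 3, so n = 7) checks that a 3-set of nonzero vectors is shattered by the colourings \chi_w exactly when it is a basis, and counts the shattered fraction:

```python
from itertools import combinations, product

k = 3
vectors = [v for v in product((0, 1), repeat=k) if any(v)]  # F_2^k \ {0}

def dot(v, w):
    return sum(a * b for a, b in zip(v, w)) % 2

def shattered(S):
    # S is shattered iff the colourings chi_w realise all 2^k patterns on S
    patterns = {tuple(dot(v, w) for v in S) for w in product((0, 1), repeat=k)}
    return len(patterns) == 2 ** k

def is_basis(S):
    # S spans F_2^k iff its 2^k subset-sums are pairwise distinct
    sums = {tuple(sum(e * v[i] for e, v in zip(eps, S)) % 2 for i in range(k))
            for eps in product((0, 1), repeat=k)}
    return len(sums) == 2 ** k

triples = list(combinations(vectors, k))
assert all(shattered(S) == is_basis(S) for S in triples)
shattered_count = sum(shattered(S) for S in triples)
```

Here 28 of the 35 triples are bases, so this family of 8 colourings shatters a 4/5 fraction of all 3-sets.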

At the other end of the scale, we can ask how large a family is required to shatter every k-set from [n]. The best known lower bound is \Omega(2^k \log n), and the best known upper bound is O(k2^k \log n), which comes from a random construction. Closing the gap between these bounds, or derandomising the upper bound, would both be of significant interest.

Andrew McDowell

At the start of his talk in Birmingham earlier this summer, Peter Hegarty played two clips from Terminator in which a creature first dissolved into liquid and dispersed, then later reassembled, stating that it had prompted him to wonder how independent agents can meet up without any communication. Andrew tackled the other half of this question: how can non-communicating agents spread out to occupy distinct vertices of a graph? He was able to analyse some strategies using Markov chains in a clever way.

Tássio Naia

He gave a sufficient condition for embedding an oriented tree on n vertices into every tournament on n vertices, which implies that almost all oriented trees are unavoidable in this sense.

Matchings without Hall’s theorem

In practice matchings are found not by following the proof of Hall’s theorem but by starting with some matching and improving it by finding augmenting paths.  Given a matching M in a bipartite graph on vertex classes X and Y, an augmenting path is a path P from x \in X \setminus V(M) to y \in Y \setminus V(M) such that every other edge of P is an edge of M.  Replacing P \cap M by P \setminus M produces a matching M' with |M'| = |M| + 1.
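This procedure is short to implement. Below is a minimal Python sketch (the names are my own) of the standard augmenting-path algorithm for bipartite matching: try_augment searches for an augmenting path starting at an unmatched vertex of X and flips it if one is found.

```python
def maximum_matching(adj, n_right):
    """adj[u] lists the neighbours in Y (labelled 0..n_right-1) of vertex u
    in X.  Returns (size, match_r), where match_r[v] is the X-partner of
    v in Y, or -1 if v is unmatched."""
    match_r = [-1] * n_right

    def try_augment(u, seen):
        # look for an augmenting path from u, avoiding Y-vertices in seen
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                # v is free, or its current partner can be rematched elsewhere
                if match_r[v] == -1 or try_augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    size = sum(try_augment(u, set()) for u in range(len(adj)))
    return size, match_r
```

For instance, maximum_matching([[0], [0, 1]], 2) returns (2, [0, 1]): vertex 1 of X displaces nothing because vertex 0 of Y is claimed first by vertex 0 of X.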

Theorem.  Let G be a spanning subgraph of K_{n,n}.  If (i) \delta(G) \geq n/2 or (ii) G is k-regular, then G has a perfect matching.

Proof. Let M = \{x_1y_1, \ldots, x_ty_t\} be a maximum matching in G, and suppose for contradiction that M is not perfect, so that V(M) \subsetneq V(G).

(i) Choose x \in X \setminus V(M), y \in Y \setminus V(M).  We have N(x) \subseteq V(M) \cap Y and N(y) \subseteq V(M) \cap X.  Since \delta(G) \geq n/2 and t < n, the index sets \{i : xy_i \in E(G)\} and \{i : x_iy \in E(G)\} each have size at least n/2, so they intersect: there is an i such that x is adjacent to y_i and y is adjacent to x_i.  Then xy_ix_iy is an augmenting path.

(ii) Without loss of generality, G is connected.  Form the directed graph D on v_1, \ldots, v_t by taking the directed edge \vec{v_iv_j}  (i \neq j) whenever x_iy_j is an edge of G.  Add directed edges arbitrarily to D to obtain a k-regular digraph D', which might contain multiple edges; since G is connected we have to add at least one directed edge.  The edge set of D' decomposes into directed cycles.  Choose a cycle C containing at least one new edge of D', and let P be a maximal sub-path of C containing only edges of D.  Let v_i, v_j be the start- and endpoints of P respectively.  Then we can choose y \in N(x_i) \setminus V(M) and x \in N(y_j) \setminus V(M), whence yQx is an augmenting path, where Q is the result of “pulling back” P from D to G, replacing each visit to a v_s in D by use of the edge y_sx_s of G. \square


Matchings and minimum degree

A Tale of Two Halls

(Philip) Hall’s theorem.  Let G be a bipartite graph on vertex classes X, Y.  Suppose that,  for every S \subseteq X, |N(S)| \geq |S|.  Then there is a matching from X to Y.

This is traditionally called Hall’s marriage theorem.  The picture is that the people in X are all prepared to marry some subset of the people in Y.  If some k people in X are only prepared to marry into some set of k-1 people, then we have a problem; but this is the only problem we might have.  There is no room in this picture for the preferences of people in Y.

Proof. Suppose first that |N(S)| > |S| for every non-empty S \subseteq X.  Then we can match any element of X arbitrarily to a neighbour in Y and obtain a smaller graph on which Hall’s condition holds, so we are done by induction.

Otherwise |N(S)| = |S| for some non-empty S \subset X.  By induction there is a matching from S to N(S).  Let T = X \setminus S.  Then for any U \subseteq T we have

    \[|N(U) \setminus N(S)| + |N(S)| = |N(U \cup S)| \geq |U \cup S| = |U| + |S|\]

hence |N(U) \setminus N(S)| \geq |U| and Hall’s condition holds on (T, Y \setminus N(S)).  So by induction there is a matching from T to Y \setminus N(S), which together with the matching from S to N(S) is a matching from X to Y. \square

Corollary 1.  Every k-regular bipartite graph has a perfect matching.

Proof. Counted with multiplicity, |N(S)| = k|S|.  But each element of Y is hit at most k times, so |N(S)| \geq k|S| / k = |S| in the conventional sense. \square

Corollary 2. Let G be a spanning subgraph of K_{n,n} with minimum degree at least n/2.  Then G has a perfect matching.

This is very slightly more subtle.

Proof. If |N(S)| = n there is nothing to check.  Otherwise there is a y \in Y \setminus N(S).  Then N(y) \subseteq X \setminus S and |N(y)| \geq n/2, so |S| \leq n/2.  But |N(S)| \geq n/2 for every S. \square

Schrijver proved that a k-regular bipartite graph with n vertices in each class in fact has at least \left(\frac {(k-1)^{k-1}} {k^{k-2}}\right)^n perfect matchings.  For minimum degree k we have the following.

(Marshall) Hall’s theorem.  If each vertex of X has degree at least k and there is at least one perfect matching then there are at least k! of them.

This turns out to be easy if we were paying attention during the proof of (Philip) Hall’s theorem.

Proof. By (Philip) Hall’s theorem the existence of a perfect matching means that (Philip) Hall’s condition holds.  Choose a minimal S on which it is tight.  Fix x \in S and match it arbitrarily to a neighbour y \in Y.  (Philip) Hall’s condition still holds on (S - x, Y-y) and the minimum degree on this subgraph is at least k-1, so by induction we can find at least (k-1)! perfect matchings.  Since there were at least k choices for y we have at least k! perfect matchings from S to N(S).  These extend to perfect matchings from X to Y as in the proof of (Philip) Hall’s theorem. \square
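Both bounds are easy to sanity-check by brute force on tiny graphs. The following Python sketch (my own) counts perfect matchings of a bipartite graph as the permanent of its biadjacency matrix and compares against k!:

```python
from itertools import permutations
from math import factorial

def count_perfect_matchings(biadj):
    """Brute-force permanent of a 0/1 biadjacency matrix: each permutation
    p matches x_i to y_{p(i)}, and contributes 1 if every pair is an edge."""
    n = len(biadj)
    return sum(all(biadj[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

K33 = [[1, 1, 1]] * 3                   # 3-regular: exactly 3! matchings
C6 = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]  # a 6-cycle, 2-regular: exactly 2!
assert count_perfect_matchings(K33) == factorial(3)
assert count_perfect_matchings(C6) == factorial(2)
```

Both examples are tight for (Marshall) Hall's bound, and comfortably above Schrijver's bound (for k = 3, n = 3 that gives (4/3)^3 \approx 2.37).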

Thanks to Gjergji Zaimi for pointing me to (Marshall) Hall’s paper via MathOverflow.

Block partitions of sequences

Let L be a line segment of length l broken into pieces of length at most 1. It’s easy to break L into k blocks (using the preexisting breakpoints) that differ in length by at most 2 (break at the nearest available point to l/k, 2l/k etc.).

In the case where each piece has length 1 and the number of pieces isn’t divisible by k, we can’t possibly do better than a maximum difference of 1 between blocks.  Is this achievable in general?

In today’s combinatorics seminar Imre Bárány described joint work with Victor Grinberg which says that the answer is yes.
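The easy bound from the first paragraph takes only a few lines of Python to realise (the names are my own; I assume l/k > 1 so that the chosen breakpoints are increasing):

```python
from itertools import accumulate

def greedy_blocks(pieces, k):
    """Break a sequence of piece lengths (each at most 1) into k blocks,
    cutting at the pre-existing breakpoint nearest each target i*l/k."""
    l = sum(pieces)
    breakpoints = [0.0] + list(accumulate(pieces))
    cuts = [min(breakpoints, key=lambda b: abs(b - i * l / k))
            for i in range(k + 1)]
    return [cuts[i + 1] - cuts[i] for i in range(k)]

blocks = greedy_blocks([1] * 10, 3)  # ten unit pieces into three blocks
assert max(blocks) - min(blocks) <= 2
assert sum(blocks) == 10
```

Since consecutive breakpoints are at most 1 apart, each cut lands within 1/2 of its target, so every block length is within 1 of l/k and any two blocks differ by at most 2.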

Continue reading “Block partitions of sequences”

The Namer-Claimer game

Consider the following game played on the integers from 1 to n.  In each round Namer names a forbidden distance d, then Claimer claims a subset of [n] that does not contain any two integers at distance d.  After finitely many rounds, Claimer will have claimed sets that cover the whole of [n], at which point the game ends.  How many rounds will there be with best play?

I’ve posed this question at a number of workshops and open problems sessions, and haven’t yet heard of a success.  I’ll explain the known upper and lower bounds below the fold, but encourage you to spend a few minutes thinking about it before taking a look.

Continue reading “The Namer-Claimer game”

Nowhere zero 6-flows

A flow on a graph G is an assignment to each edge of G of a direction and a non-negative integer (the flow in that edge) such that the flows into and out of each vertex agree.  A flow is nowhere zero if every edge is carrying a positive flow and (confusingly) it is a k-flow if the flows on each edge are all less than k.  Tutte conjectured that every bridgeless graph has a nowhere zero 5-flow (so the possible flow values are 1, 2, 3, 4).  This is supposed to be a generalisation of the 4-colour theorem.  Given a plane graph G and a proper colouring of its faces by 1, 2, 3, 4, push a flow of value i anticlockwise around each face coloured i.  Adding the flows in the obvious way gives a flow on G in which each edge carries a total flow of |i-j| \neq 0 in some direction, where i and j are the colours of the two faces it borders.

Seymour proved that every bridgeless graph has a nowhere zero 6-flow.  Thomas Bloom and I worked this out at the blackboard, and I want to record the ideas here.  First, a folklore observation that a minimal counterexample G must be cubic and 3-connected.  We will temporarily allow graphs to have loops and multiple edges.

We first show that G is 3-edge-connected.  It is certainly 2-edge-connected, else there is a bridge.  (If G is not even connected then look at a component.)  If it were only 2-edge-connected then the graph would look like this.



Contract the top cross edge.


If the new graph has a nowhere zero 6-flow then so will the old one, as the flow in the bottom cross edge tells us how much flow is passing between the left and right blobs at the identified vertices, and so the flow that we must put on the edge we contracted.  So G is 3-edge-connected.

Next we show that G is 3-regular.  A vertex of degree 1 forces a bridge; a vertex of degree 2 forces the incident edges to have equal flows, so the two edges can be regarded as one.  So suppose there is a vertex v of degree at least 4.


We want to replace two edges xv and yv by a single edge xy to obtain a smaller graph that is no easier to find a flow for.  The problem is that in doing so we might produce a bridge.

The 2-edge-connected components of  G-v are connected by single edges in a forest-like fashion.  If any of the leaves of this forest contains only one neighbour of v then there is a 2-edge-cut, so each leaf contains at least two neighbours of v.


If there is a component of the forest with two leaves then choose x and y to be neighbours of v from different leaves of that component.


Otherwise the 2-edge-connected components of G-v are disconnected from each other.  Now any such component C must contain at least 3 neighbours of v, else there is a 2-edge-cut.  If some C contains 4 neighbours of v then we can choose x and y to be any two of them.  Otherwise all such C contain exactly 3 neighbours of v, in which case there must be at least two of them and we can choose x and y to be neighbours of v in different components.


So G is 3-regular and 3-edge-connected.  If G is only 1-connected then there is no flow between 2-connected components, so one of the components is a smaller graph with no nowhere zero 6-flow.  If G is only 2-connected then because 3 is so small we can also find a 2-edge-cut.

Finally, we want to get rid of any loops and multiple edges we might have introduced.  But loops make literally no interesting contribution to flows and double edges all look like


and the total flow on the pair just has to agree with the edges on either side.

We’ll also need one piece of magic (see postscript).

Theorem. (Tutte) Given any integer flow of G there is a k-flow of G that agrees with the original flow mod k.  (By definition, flows of j in one direction and k-j in the other direction agree mod k.)

So we only need to worry about keeping things non-zero mod k.

The engine of Seymour’s proof is the following observation.

Claim. Suppose that G = G_0 \cup C_1 \cup \cdots \cup C_t where each C_i is a cycle and the number of new edges when we add C_i to G_0 \cup C_1 \cup \cdots \cup C_{i-1} is at most 2.  Then G has a 3-flow which is non-zero outside G_0.  

Write E_i for the set of edges added at the ith stage.  We assign flows to C_t, \ldots, C_1 in that order.  Assign a flow of 1 in an arbitrary direction to C_t; now the edges in E_t have non-zero flow and will never be touched again.  At the next stage, the edges in E_{t-1} might already have some flow; but since |E_{t-1}| \leq 2, at most two of the three possible values (mod 3) for the flow we send around C_{t-1} would make the flow on some edge of E_{t-1} zero.  So there is some choice of flow we can put on C_{t-1} to ensure that the flows on E_{t-1} are non-zero.  Keep going to obtain the desired 3-flow, applying Tutte’s result as required to bring values back in range.

Finally, we claim that the graphs G we are considering have the above form with G_0 being a vertex disjoint union of cycles.  Then G_0 trivially has a nowhere-zero 2-flow, and 3 times this 2-flow plus the 3-flow constructed above is a nowhere-zero 6-flow on G.

For F \subseteq G, write [F] for the largest subgraph of G that can be obtained as above by adding cycles in turn, using at most two new edges at each stage.  Let G_0 be a maximal collection of vertex disjoint cycles in G with [G_0] connected, and let H = G - V(G_0).  We claim that V(H) is empty.  If not, then the 2-connected blocks of H are connected in a forest-like fashion; let H_0 be one of the leaves.


By 3-connectedness there are three vertex disjoint paths from H_0 to G_0. At most one of these paths travels through H - H_0; let x and y be endpoints of two paths that do not.  These paths must in fact be single edges, as the only other way to get to G_0 would be to travel through H - H_0.  Finally, since H_0 is 2-connected it contains a cycle through x and y, contradicting the choice of G_0.

Postscript. It turns out that Tutte’s result is far from magical; in fact its proof is exactly what it should be.  Obtain a directed graph H from G by forgetting about the magnitude of flow in each edge (if an edge contains zero flow then delete it).  We claim that every edge is in a directed cycle.


Indeed, choose a directed edge \vec{xy}.  Let Y be the set of vertices that can be reached by a directed path from y and let X be the set of vertices that can reach x by following a directed path.  If \vec{xy} is not in any directed cycle then X and Y are disjoint and there is no directed path from Y to X.  But then there can be no flow in \vec{xy}, contradicting the definition of H.

So as long as there are edges with flow value at least k, find a directed cycle containing one of those edges and push a flow of k through it in the opposite direction.  The total flow in edges with flow at least k strictly decreases, so we eventually obtain a k-flow.
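This proof is effective, and translates directly into an algorithm. Here is a Python sketch (the representation is my own: a list of directed edges with non-negative integer flows, conservation assumed, so that the backward path the proof promises always exists):

```python
def reduce_mod_k(edges, flow, k):
    """Repeatedly push k units against a directed cycle through an
    over-full edge until every flow value is less than k.  A flow of -f
    on (u, v) is rewritten as a flow of f on (v, u), matching the
    'agree mod k' convention in Tutte's theorem."""
    while any(f >= k for f in flow):
        i = next(j for j, f in enumerate(flow) if f >= k)
        u0, v0 = edges[i]
        # DFS for a path of positive-flow edges from v0 back to u0;
        # one exists because every positive-flow edge lies on a directed cycle
        parent = {v0: None}
        stack = [v0]
        while stack and u0 not in parent:
            x = stack.pop()
            for j, (a, b) in enumerate(edges):
                if a == x and flow[j] > 0 and b not in parent:
                    parent[b] = j
                    stack.append(b)
        # subtract k around the whole cycle (edge i plus the path back)
        flow[i] -= k
        x = u0
        while parent[x] is not None:
            j = parent[x]
            flow[j] -= k
            x = edges[j][0]
        # reinterpret negative flows as flows in the reversed direction
        for j, f in enumerate(flow):
            if f < 0:
                edges[j] = (edges[j][1], edges[j][0])
                flow[j] = -f
    return edges, flow

# example: a triangle carrying 4 units around a directed cycle, k = 3
edges, flow = reduce_mod_k([(0, 1), (1, 2), (2, 0)], [4, 4, 4], 3)
assert flow == [1, 1, 1]  # agrees with 4 mod 3, and every value is < 3
```

Each pass strictly decreases the total flow on over-full edges, which is the termination argument from the proof above.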

Ultrafilter quantifiers and Hindman’s theorem

A filter on \mathbb N is a consistent notion of largeness for subsets of \mathbb N. “Largeness” has the following properties.

  • if A is large and A \subseteq B then B is large
  • if A and B are large then A \cap B is large
  • the empty set is not large

At most one of A and A^\text{c} is large; an ultrafilter is a filter which always has an opinion about which it is.

  • either A or A^\text{c} is large

The usual notation for “A is large” is A \in \U, where \U is the ultrafilter. This casts ultrafilters as sets, rather than notions of largeness. To bring the notion of largeness to the foreground we can use ultrafilter quantifiers; we write \U(x)\ A(x), read “for \U-most x, A(x) holds” (where we have also identified A with the predicate “is a member of A“).

  • if \U(x)\ \phi(x) and \phi \implies \psi then \U(x)\ \psi(x)
  • if \U(x)\ \phi(x) and \U(x)\ \psi(x) then \U(x)\ \phi(x) \wedge \psi(x)
  • \forall x\ \phi(x) \implies \U(x)\ \phi(x) \implies \exists x\ \phi(x)
  • \neg [\U(x)\ \phi(x)] \iff \U(x)\ \neg \phi(x)

From this point of view \forall x\ \phi(x) says that the set of elements with property \phi is everything, and \exists x\ \phi(x) that the set of elements with property \phi is non-empty. \U behaves like a mixture of \forall and \exists, with the considerable advantage that logical operations pass through \U unchanged without having to worry about De Morgan’s laws.

Adding ultrafilters

It turns out that the set of ultrafilters on \mathbb N is a model for the Stone–Čech compactification \beta \mathbb N of (\mathbb N, +). \mathbb N is embedded as the set of principal ultrafilters (“A is large if and only if n \in A“), and the addition on \mathbb N extends to \beta \mathbb N:

    \[\U + \V = \{A : \{x : A-x \in \V\} \in \U\}.\]

(Here A-x should be interpreted as (A-x) \cap \mathbb N.) But to use this addition with quantifiers all we need to know is that (\U+\V)\ \phi(x) \iff \U(x)\ \V(y)\ \phi(x+y). If you can see that from the first definition then your brain is wired differently from mine.

It is a fact that there exist idempotent ultrafilters, with \U + \U = \U. Given such a \U, we can play the following game. Suppose that \U(x)\ \phi(x). Then (\U+\U)(x)\ \phi(x), so \U(x)\ \U(y)\ \phi (x+y) and therefore \U(x)\ \U(y)\ \phi(x) \wedge \phi(y) \wedge \phi (x+y) (by ANDing with the original assertion, and the original assertion with the dummy variable x replaced by y). In particular, \exists x\ \U(y)\ \phi(x) \wedge \phi(y) \wedge \phi (x+y). But now we can fix a good choice of x and repeat the whole process with \psi(y) = \phi(x) \wedge \phi(y) \wedge \phi (x+y) in place of \phi to get that

    \[\exists x\ \exists y\ \U(z)\ \phi(x) \wedge \phi(y) \wedge \phi (x+y) \wedge \phi(z) \wedge \phi(x+z) \wedge \phi(y+z) \wedge \phi (x+y+z).\]

Iterating, we eventually obtain an infinite sequence x_1, x_2, \ldots such that \phi holds for every sum of finitely many of the x_i. Together with the observation that whenever \mathbb N is finitely coloured exactly one of the colour classes is \U-large this proves Hindman’s theorem, that we can always find an infinite sequence x_1, x_2, \ldots such that every sum of finitely many terms has the same colour.

A version of this argument with more of the details filled in is on the Tricki.

Piecewise syndetic and van der Waerden

Joel Moreira has just proved that whenever the natural numbers are finitely coloured we can find x and y such that x, x+y and xy are all the same colour.  He actually proves a much more general result via links to topological dynamics, but he includes a direct proof of this special case assuming only a consequence of van der Waerden’s theorem related to piecewise syndetic sets.  I want to record the details here so that I remember them in future.

Syndetic, thick and piecewise syndetic

A set of natural numbers is thick if it contains (arbitrarily) long intervals, and syndetic if it has bounded gaps. It is piecewise syndetic if it has bounded gaps on long intervals (the same bound for each interval).

Piecewise syndetic sets are partition regular:

Claim. If S = A \cup B is piecewise syndetic then either A or B is. (This extends to partitions into more parts by induction.)

Proof. Choose intervals I_n of length n on which the gaps in S are bounded by c. Suppose that A is not piecewise syndetic. Then for every d there is an n_d such that A \cap I_{n_d} has a gap more than d points wide. So B \cap I_{n_d} contains an interval of length d on which its gaps are bounded by c. \ \Box

It follows that, whenever the natural numbers are finitely coloured, one of the colour classes is piecewise syndetic. So to prove van der Waerden’s theorem it would suffice to show that every piecewise syndetic set has long arithmetic progressions. We’ll prove this apparent strengthening, assuming van der Waerden’s theorem itself.

Claim. If S is piecewise syndetic, then it contains long APs.

Proof. If S has long intervals with gaps bounded by c, then the union of the c translates S+1, \ldots, S+c of S contains long intervals. So S+1, \ldots, S+c can be viewed as a c-colouring of a set of long intervals, hence one of them contains long APs by van der Waerden’s theorem. But they are all translates of S, so S contains long APs. \ \Box
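The first step of that proof, that the union of the c translates covers long intervals, is concrete enough to check mechanically. A small Python sketch (my own), with S the multiples of 3, so gaps bounded by c = 3:

```python
def translates_cover_interval(S, c, lo, hi):
    # does (S+1) u (S+2) u ... u (S+c) contain all of [lo, hi)?
    covered = set()
    for t in range(1, c + 1):
        covered |= {s + t for s in S}
    return all(x in covered for x in range(lo, hi))

S = {x for x in range(300) if x % 3 == 0}  # gaps bounded by c = 3
assert translates_cover_interval(S, 3, 1, 300)
```

With fewer translates than the gap bound the cover fails, as expected: translates_cover_interval(S, 1, 1, 300) is False.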

In fact we can say much more: for every k there is a d such that the set of a such that the AP of length k and common difference d starting at a is contained in S is piecewise syndetic. To see this, let B be a block of consecutive integers sufficiently long that whenever it is c-coloured it contains an AP of length k. Pack translates of B greedily into the intervals in S+1, \ldots, S+c. For each translate B_i of B there is a triple (a_i, d, j) such that B_i \cap S_j contains the AP of length k and common difference d starting at a_i. Now the set of a_i is piecewise syndetic (it has long intervals with gaps bounded by 2|B|). And there are only finitely many choices for d (at most |B|/(k-1)) and j (at most c). So by the first claim there is some choice of (d,j) such that the corresponding a_i are piecewise syndetic. That is, the set S \cap (S - d) \cap \cdots \cap (S-(k-1)d) is piecewise syndetic. This is Theorem 5.1 in Moreira’s paper, with F = [k] and n = d. The case of general F follows by taking k = \max F.