Random Structures and Algorithms 2017

A partial, chronologically ordered, list of talks I attended at RSA in Gniezno, Poland. Under construction until the set of things I can remember equals the set of things I’ve written about.

Shagnik Das

A family of subsets of [n] that shatters a k-set has at least 2^k elements. How many k-sets can we shatter with a family of size 2^k? A block construction achieves (n/k)^k \approx e^{-k} \binom n k. (Random is much worse.) One can in fact shatter a constant fraction of all k-sets. When n = 2^k-1, identify the ground set with \mathbb F_2^k \setminus \{0\}, and colour by \chi_w(v) = v \cdot w for w \in \mathbb F_2^k.

Claim. A k-set is shattered if and only if it is a basis for \mathbb F_2^k.

Proof. First suppose that v_1, \ldots, v_k is a basis. Then for any sequence \epsilon_i, there is a unique vector w such that v_i \cdot w = \epsilon_i. (We are just solving a system of full rank equations mod 2.)

Next suppose that v_1, \ldots, v_k are linearly dependent; that is, that they are contained in a proper subspace U of \mathbb F_2^k. Choose a non-zero w orthogonal to U. Then for any u \in U and any w' we have u \cdot w' = u \cdot (w+w'), so two of our colourings agree on v_1, \ldots, v_k. \Box
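The claim is easy to sanity-check by brute force for small k. The sketch below does this for k = 3; the helper names (dot, is_shattered, is_basis) are my own, and this is only a check of one small case, not a proof.

```python
from itertools import combinations, product

def dot(v, w):
    """Inner product over F_2 of two 0/1 tuples."""
    return sum(a * b for a, b in zip(v, w)) % 2

def is_shattered(vectors, k):
    """True if the colourings chi_w (w in F_2^k) show every 0/1 pattern on `vectors`."""
    patterns = {tuple(dot(v, w) for v in vectors) for w in product((0, 1), repeat=k)}
    return len(patterns) == 2 ** len(vectors)

def is_basis(vectors, k):
    """True if the vectors span F_2^k, i.e. their span has 2^k elements."""
    span = {tuple([0] * k)}
    for v in vectors:
        span |= {tuple((a + b) % 2 for a, b in zip(s, v)) for s in span}
    return len(span) == 2 ** k

k = 3
ground = [v for v in product((0, 1), repeat=k) if any(v)]  # F_2^3 minus the origin
k_sets = list(combinations(ground, k))
shattered = [s for s in k_sets if is_shattered(s, k)]

# Shattered k-sets and bases coincide, as the claim predicts.
assert all(is_shattered(s, k) == is_basis(s, k) for s in k_sets)
print(len(shattered), "of", len(k_sets))  # 28 of 35
```

The 7 unshattered triples are exactly the lines of the Fano plane.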

We finish with the observation that random sets of k vectors are fairly likely to span \mathbb F_2^k: the probability is

    \[ 1 \cdot (1 - 1/2^k) \cdot (1 - 1/2^{k-1}) \cdot \cdots \cdot (1-1/2) \geq 1 - \sum_{j=1}^k 1/2^j > 0. \]
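The lower bound in the display is very weak (the right-hand side is 2^{-k}), but the product itself stays bounded away from zero. A small sketch, assuming we read the product as \prod_{j=1}^k (1 - 2^{-j}); the function name is my own.

```python
from fractions import Fraction

def span_probability(k):
    """Exact value of the product of (1 - 2^-j) for j = 1..k."""
    p = Fraction(1)
    for j in range(1, k + 1):
        p *= 1 - Fraction(1, 2 ** j)
    return p

print(span_probability(3))          # 21/64
print(float(span_probability(20)))  # roughly 0.2888
```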

Blowing up this colouring gives a construction that works for larger n.

At the other end of the scale, we can ask how large a family is required to shatter every k-set from [n]. The best known lower bound is \Omega(2^k \log n), and the best known upper bound is O(k2^k \log n), which comes from a random construction. Closing the gap between these bounds and derandomising the upper bound would both be of significant interest.

Andrew McDowell

At the start of his talk in Birmingham earlier this summer, Peter Hegarty played two clips from Terminator in which a creature first dissolved into liquid and dispersed, then later reassembled, stating that it had prompted him to wonder how independent agents can meet up without any communication. Andrew tackled the other half of this question: how can non-communicating agents spread out to occupy distinct vertices of a graph? He was able to analyse some strategies using Markov chains in a clever way.

Tássio Naia

A sufficient condition for embedding an oriented tree on n vertices into every tournament on n vertices that implies that almost all oriented trees are unavoidable in this sense.

Clique decompositions of multipartite graphs and completion of Latin squares

Ben Barber, Daniela Kühn, Allan Lo, Deryk Osthus and Amelia Taylor, Journal of Combinatorial Theory, Series A, Volume 151, October 2017, Pages 146–201 PDF

A Latin square of order n is an n \times n grid of cells, each of which contains one of n distinct symbols, such that no symbol appears twice in any row or column.  There is a natural correspondence between Latin squares of order n and partitions of E(K_{n,n,n}) into triangles.  We identify the three vertex classes of K_{n,n,n} with the n rows, n columns and n symbols of the Latin square.  A triangle ijx corresponds to the symbol x appearing in the intersection of row i and column j of the Latin square.  Since each cell contains exactly one symbol, and each symbol appears exactly once in each row and each column, the triangles corresponding to a Latin square do indeed partition E(K_{n,n,n}).
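The correspondence is concrete enough to check by machine. The sketch below builds the triangles of a Latin square and verifies that every edge of K_{n,n,n} is covered exactly once; the function names and the vertex labels ("row", "col", "sym") are my own choices for illustration.

```python
from itertools import product

def latin_square_triangles(square):
    """Triangles (row i, column j, symbol x) of K_{n,n,n}, one per cell of the square."""
    n = len(square)
    return [(("row", i), ("col", j), ("sym", square[i][j]))
            for i, j in product(range(n), repeat=2)]

def is_triangle_decomposition(triangles, n):
    """True if the 3n^2 edges used by the triangles are all distinct.

    K_{n,n,n} has exactly 3n^2 edges, so distinctness means each is used once.
    Assumes the symbols are 0..n-1.
    """
    covered = []
    for r, c, s in triangles:
        covered += [(r, c), (r, s), (c, s)]
    return len(covered) == 3 * n * n and len(set(covered)) == 3 * n * n

n = 5
cyclic = [[(i + j) % n for j in range(n)] for i in range(n)]
print(is_triangle_decomposition(latin_square_triangles(cyclic), n))  # True
```

A grid that is not a Latin square fails the check because some row-symbol or column-symbol edge gets used twice.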

What about partitions of K_{n,n,n,n} into K_4’s?  Identify the vertex classes of K_{n,n,n,n} with n rows, n columns, n red symbols and n blue symbols.  Then a K_4 ijxy corresponds to a red symbol x and a blue symbol y in the intersection of row i and column j.  If we look at just the red symbols or just the blue symbols then we see a Latin square.  But we also have the extra property that each pair xy of red and blue symbols appears in exactly one cell of the grid.  Two Latin squares with this property are called orthogonal.  So pairs of orthogonal Latin squares of order n correspond to decompositions of K_{n,n,n,n} into K_4’s.  More generally, a sequence of r-2 mutually orthogonal Latin squares of order n corresponds to a partition of the complete r-partite graph on vertex classes of size n into K_r’s.
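Here is the same kind of check for a pair of squares. The example pair (i+j and i+2j mod 5) is my own choice of a standard orthogonal pair, and the function names are hypothetical.

```python
from itertools import combinations, product

def k4_cliques(red, blue):
    """One K_4 (row, column, red symbol, blue symbol) per cell of the grid."""
    n = len(red)
    return [(("row", i), ("col", j), ("red", red[i][j]), ("blue", blue[i][j]))
            for i, j in product(range(n), repeat=2)]

def decomposes_complete_4_partite(cliques, n):
    """True if the 6n^2 edges used are distinct, so each edge of K_{n,n,n,n} is used once."""
    edges = [pair for clique in cliques for pair in combinations(clique, 2)]
    return len(edges) == 6 * n * n and len(set(edges)) == 6 * n * n

n = 5
red = [[(i + j) % n for j in range(n)] for i in range(n)]
blue = [[(i + 2 * j) % n for j in range(n)] for i in range(n)]
print(decomposes_complete_4_partite(k4_cliques(red, blue), n))  # True
print(decomposes_complete_4_partite(k4_cliques(red, red), n))   # False: not orthogonal
```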

Suppose now that we have a partial Latin square of order n, that is, a partially filled in n \times n grid of cells obeying the rules for a Latin square.  Can it be completed to a Latin square?  In the early 1980s several researchers proved that the answer is yes provided at most n-1 cells have been filled in total.  This is best possible, as if we place n-1 x’s and a single y on the main diagonal there is no legal cell in which to place the nth x.  The same example shows that it is not enough to ask only that each row and each column contains only a small number of non-empty cells.  But what if each row, column and symbol has been used at most \epsilon n times?  Can we then complete to a Latin square?  Daykin and Häggkvist conjectured that we can, provided \epsilon \leq 1/4.
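The diagonal example can be confirmed by exhaustive search for small n. This is a naive backtracking sketch, only sensible for tiny grids; the function names are my own, with symbol 0 playing the role of x and symbol 1 the role of y.

```python
def complete(square):
    """Return a completion of a partial Latin square (None = empty cell), or None.

    Fills `square` in place while searching and restores cells on failure.
    """
    n = len(square)
    for i in range(n):
        for j in range(n):
            if square[i][j] is None:
                for x in range(n):
                    row_ok = all(square[i][c] != x for c in range(n))
                    col_ok = all(square[r][j] != x for r in range(n))
                    if row_ok and col_ok:
                        square[i][j] = x
                        result = complete(square)
                        if result is not None:
                            return result
                        square[i][j] = None
                return None  # no symbol fits in this cell
    return [row[:] for row in square]  # no empty cells left

def diagonal_example(n):
    """n-1 copies of symbol 0 and a single symbol 1 on the main diagonal."""
    square = [[None] * n for _ in range(n)]
    for i in range(n):
        square[i][i] = 0 if i < n - 1 else 1
    return square

print(complete(diagonal_example(4)))  # None
```

Emptying the last diagonal cell leaves n-1 filled cells, and the search then succeeds.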

What does this mean on the graph side?  Let G be a subgraph of K_{n,n,n} obtained by deleting a set of edge-disjoint triangles such that no vertex is in more than \epsilon n triangles.  Then G should have a triangle-decomposition if \epsilon \leq 1/4.  In this paper we prove that G has a triangle-decomposition provided \epsilon \leq 3/104 - o(1).

In fact we prove something more general.  The G’s obtained in this way have the properties that (i) each vertex has the same number of neighbours in each other vertex class and (ii) each vertex sees at least a 1-\epsilon proportion of the vertices in each other class (a partite minimum degree condition).  We prove that all such graphs have triangle-decompositions when \epsilon \leq 3/104 - o(1).

The proof is based on that of the similar result for non-partite graphs in Edge decompositions of graphs with high minimum degree.  However, it is not simply a translation of that proof to the partite setting.  In the partite case we not only have to ensure that all of our gadgets can be embedded in partite graphs, we must also take care to ensure a stronger notion of divisibility is preserved throughout our decomposition process.  This makes the proof extremely technical.

We also prove the analogous result for K_r-decompositions of complete r-partite graphs, with the less impressive bound \epsilon \leq 1/(10^6r^3).  The connection to mutually orthogonal Latin squares is more complicated for r \geq 4, as the partially filled in cells only correspond to K_r’s in the case where each non-empty cell contains one of each symbol, but we still show that there is some \epsilon such that if each row, column and coloured symbol is used at most \epsilon n times then partial mutually orthogonal Latin squares can be completed.

As for the non-partite case, the bounds on \epsilon are currently limited by available fractional or approximate decomposition results.  Improvements to these would lead automatically to improvements of the bounds in this paper.

Matchings without Hall’s theorem

In practice matchings are found not by following the proof of Hall’s theorem but by starting with some matching and improving it by finding augmenting paths.  Given a matching M in a bipartite graph on vertex classes X and Y, an augmenting path is a path P from x \in X \setminus V(M) to y \in Y \setminus V(M) such that every other edge of P is an edge of M.  Replacing P \cap M by P \setminus M produces a matching M' with |M'| = |M| + 1.
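For concreteness, here is a minimal sketch of the standard depth-first search for augmenting paths. The input format (adjacency lists indexed by X, with Y numbered from 0) and the names are my own assumptions.

```python
def maximum_matching(adj, n_right):
    """adj[x] lists the neighbours in Y = {0..n_right-1} of vertex x of X.

    Returns (size, match_of_y) where match_of_y[y] is the partner of y or None.
    """
    match_of_y = [None] * n_right

    def augment(x, seen):
        """Search for an augmenting path starting at the unmatched vertex x."""
        for y in adj[x]:
            if y in seen:
                continue
            seen.add(y)
            # Either y is free, or its current partner can be re-matched elsewhere.
            if match_of_y[y] is None or augment(match_of_y[y], seen):
                match_of_y[y] = x  # flip the edges along the path
                return True
        return False

    size = sum(augment(x, set()) for x in range(len(adj)))
    return size, match_of_y

adj = [[0, 1], [0], [1, 2]]
print(maximum_matching(adj, 3))  # (3, [1, 0, 2])
```

Each successful call to augment increases the size of the matching by one, exactly as in the replacement of P \cap M by P \setminus M.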

Theorem.  Let G be a spanning subgraph of K_{n,n}.  If (i) \delta(G) \geq n/2 or (ii) G is k-regular, then G has a perfect matching.

Proof. Let M = \{x_1y_1, \ldots, x_ty_t\} be a maximal matching in G with V(M) \subset V(G).

(i) Choose x \in X \setminus V(M), y \in Y \setminus V(M).  We have N(x) \subseteq V(M) \cap Y and N(y) \subseteq V(M) \cap X.  Since \delta(G) \geq n/2 there is an i such that x is adjacent to y_i and y is adjacent to x_i.  Then xy_ix_iy is an augmenting path.

(ii) Without loss of generality, G is connected.  Form the directed graph D on v_1, \ldots, v_t by taking the directed edge \vec{v_iv_j}  (i \neq j) whenever x_iy_j is an edge of G.  Add directed edges arbitrarily to D to obtain a k-regular digraph D', which might contain multiple edges; since G is connected we have to add at least one directed edge.  The edge set of D' decomposes into directed cycles.  Choose a cycle C containing at least one new edge of D', and let P be a maximal sub-path of C containing only edges of D.  Let v_i, v_j be the start- and endpoints of P respectively.  Then we can choose y \in N(x_i) \setminus V(M) and x \in N(y_j) \setminus V(M), whence yQx is an augmenting path, where Q is the result of “pulling back” P from D to G, replacing each visit to a v_s in D by use of the edge y_sx_s of G. \square

 

Matchings and minimum degree

A Tale of Two Halls

(Philip) Hall’s theorem.  Let G be a bipartite graph on vertex classes X, Y.  Suppose that,  for every S \subseteq X, |N(S)| \geq |S|.  Then there is a matching from X to Y.

This is traditionally called Hall’s marriage theorem.  The picture is that the people in X are all prepared to marry some subset of the people in Y.  If some k people in X are only prepared to marry into some set of k-1 people, then we have a problem; but this is the only problem we might have.  There is no room in this picture for the preferences of people in Y.

Proof. Suppose first that |N(S)| > |S| for all non-empty proper subsets S \subset X.  Then we can match any element of X arbitrarily to a neighbour in Y and obtain a smaller graph on which Hall’s condition holds, so are done by induction.

Otherwise |N(S)| = |S| for some non-empty proper subset S \subset X.  By induction there is a matching from S to N(S).  Let T = X \setminus S.  Then for any U \subseteq T we have

    \[|N(U) \setminus N(S)| + |N(S)| = |N(U \cup S)| \geq |U \cup S| = |U| + |S|\]

hence |N(U) \setminus N(S)| \geq |U| and Hall’s condition holds on (T, Y \setminus N(S)).  So by induction there is a matching from T to Y \setminus N(S), which together with the matching from S to N(S) is a matching from X to Y. \square
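Hall’s condition can be tested directly on small examples. The sketch below simply tries every subset of X, so it takes exponential time and is only meant as an illustration; the function name and input format are my own.

```python
from itertools import combinations

def hall_violation(adj):
    """Return a set S of X with |N(S)| < |S|, or None if Hall's condition holds.

    adj[x] lists the neighbours in Y of vertex x of X.
    """
    xs = range(len(adj))
    for size in range(1, len(adj) + 1):
        for S in combinations(xs, size):
            neighbourhood = set().union(*(adj[x] for x in S))
            if len(neighbourhood) < size:
                return set(S)
    return None

print(hall_violation([[0, 1], [0], [1, 2]]))  # None
print(hall_violation([[0], [0], [1, 2]]))     # {0, 1}
```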

Corollary 1.  Every k-regular bipartite graph has a perfect matching.

Proof. Counted with multiplicity, |N(S)| = k|S|.  But each element of Y is hit at most k times, so |N(S)| \geq k|S| / k = |S| in the conventional sense. \square

Corollary 2. Let G be a spanning subgraph of K_{n,n} with minimum degree at least n/2.  Then G has a perfect matching.

This is very slightly more subtle.

Proof. If |N(S)| = n there is nothing to check.  Otherwise there is a y \in Y \setminus N(S).  Then N(y) \subseteq X \setminus S and |N(y)| \geq n/2, so |S| \leq n/2.  But |N(S)| \geq n/2 for every non-empty S, so |N(S)| \geq |S|. \square

Schrijver proved that a k-regular bipartite graph with n vertices in each class in fact has at least \left(\frac {(k-1)^{k-1}} {k^{k-2}}\right)^n perfect matchings.  For minimum degree k we have the following.

(Marshall) Hall’s theorem.  If each vertex of X has degree at least k and there is at least one perfect matching then there are at least k! perfect matchings.

This turns out to be easy if we were paying attention during the proof of (Philip) Hall’s theorem.

Proof. By (Philip) Hall’s theorem the existence of a perfect matching means that (Philip) Hall’s condition holds.  Choose a minimal S on which it is tight.  Fix x \in S and match it arbitrarily to a neighbour y \in Y.  (Philip) Hall’s condition still holds on (S - x, Y-y) and the minimum degree on this subgraph is at least k-1, so by induction we can find at least (k-1)! perfect matchings.  Since there were at least k choices for y we have at least k! perfect matchings from S to N(S).  These extend to perfect matchings from X to Y as in the proof of (Philip) Hall’s theorem. \square
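A brute-force count shows the bound k! in action on small cases. The examples are my own: an 8-cycle (k = 2), where the bound is attained, and K_{3,3} (k = 3).

```python
from itertools import permutations
from math import factorial

def count_perfect_matchings(adj):
    """Count bijections p from X to Y with p(x) adjacent to x (needs |X| = |Y|)."""
    n = len(adj)
    return sum(all(p[x] in adj[x] for x in range(n))
               for p in permutations(range(n)))

cycle = [[x, (x + 1) % 4] for x in range(4)]  # the 8-cycle, every degree 2
complete = [[0, 1, 2] for _ in range(3)]      # K_{3,3}, every degree 3
print(count_perfect_matchings(cycle), factorial(2))     # 2 2
print(count_perfect_matchings(complete), factorial(3))  # 6 6
```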

Thanks to Gjergji Zaimi for pointing me to (Marshall) Hall’s paper via MathOverflow.

Partition regularity and other combinatorial problems

This is the imaginative title of my PhD thesis.  It contains four unrelated pieces of work.  (I was warned off using this phrasing in the thesis itself, where the chapters are instead described as “self-contained”.)

The first and most substantial concerns partition regularity.  It is a coherent presentation of all of the material from Partition regularity in the rationals, Partition regularity with congruence conditions and Partition regularity of a system of De and Hindman.

The remaining three chapters are expanded versions of Maximum hitting for n sufficiently large, Random walks on quasirandom graphs and A note on balanced independent sets in the cube.1

1 “A note … ” was fairly described by one of my examiners as a “potboiler”.  It was also my first submitted paper, and completed in my second of three years.  Perhaps this will reassure anybody in the first year of a PhD who is worrying that they have yet to publish.

Nowhere zero 6-flows

A flow on a graph G is an assignment to each edge of G of a direction and a non-negative integer (the flow in that edge) such that the flows into and out of each vertex agree.  A flow is nowhere zero if every edge is carrying a positive flow and (confusingly) it is a k-flow if the flows on each edge are all less than k.  Tutte conjectured that every bridgeless graph has a nowhere zero 5-flow (so the possible flow values are 1, 2, 3, 4).  This is supposed to be a generalisation of the 4-colour theorem.  Given a plane graph G and a proper colouring of its faces by 1, 2, 3, 4, push a flow of value i anticlockwise around each face coloured i.  Adding the flows in the obvious way gives a flow on G in which each edge has a total flow of |i-j| \neq 0 in some direction, where i and j are the colours of the two faces meeting at that edge.
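To pin down the definitions, here is a small checker together with a hand-built example on K_4. Both the representation (a dictionary from directed edges to values) and the example flow are my own illustrative choices.

```python
def is_nowhere_zero_k_flow(flow, k):
    """flow maps directed edges (u, v) to integers.

    Checks that flow is conserved at every vertex and that 0 < value < k on every edge.
    """
    net = {}
    for (u, v), value in flow.items():
        net[u] = net.get(u, 0) - value  # flow leaving u
        net[v] = net.get(v, 0) + value  # flow entering v
    conserved = all(total == 0 for total in net.values())
    return conserved and all(0 < value < k for value in flow.values())

# A flow on K_4 built by adding three directed triangles through vertex 0.
k4_flow = {(0, 1): 2, (1, 2): 1, (1, 3): 1, (0, 2): 1, (2, 3): 2, (3, 0): 3}
print(is_nowhere_zero_k_flow(k4_flow, 4))  # True
print(is_nowhere_zero_k_flow(k4_flow, 3))  # False: one edge carries flow 3
```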

Seymour proved that every bridgeless graph has a nowhere zero 6-flow.  Thomas Bloom and I worked this out at the blackboard, and I want to record the ideas here.  First, a folklore observation that a minimal counterexample G must be cubic and 3-connected.  We will temporarily allow graphs to have loops and multiple edges.

We first show that G is 3-edge-connected.  It is certainly 2-edge-connected, else there is a bridge.  (If G is not even connected then look at a component.)  If it were only 2-edge-connected then the graph would look like this.

[Figure: two blobs joined by exactly two cross edges, one at the top and one at the bottom.]

 

Contract the top cross edge.

[Figure: the same graph after contracting the top cross edge.]

If the new graph has a nowhere zero 6-flow then so will the old one, as the flow in the bottom cross edge tells us how much flow is passing between the left and right blobs at the identified vertices, and so the flow that we must put on the edge we contracted.  So G is 3-edge-connected.

Next we show that G is 3-regular.  A vertex of degree 1 forces a bridge; a vertex of degree 2 forces the incident edges to have equal flows, so the two edges can be regarded as one.  So suppose there is a vertex v of degree at least 4.

[Figure: a vertex v of degree at least 4.]

We want to replace two edges xv and yv by a single edge xy to obtain a smaller graph that is no easier to find a flow for.  The problem is that in doing so we might produce a bridge.

The 2-edge-connected components of  G-v are connected by single edges in a forest-like fashion.  If any of the leaves of this forest contains only one neighbour of v then there is a 2-edge-cut, so each leaf contains at least two neighbours of v.

[Figure: the 2-edge-connected components of G-v, with each leaf containing at least two neighbours of v.]

If there is a component of the forest with two leaves then choose x and y to be neighbours of v from different leaves of that component.

[Figure: x and y chosen from different leaves of the same component of the forest.]

Otherwise the 2-edge-connected components of G-v are disconnected from each other.  Now any such component C must contain at least 3 neighbours of v, else there is a 2-edge-cut.  If some C contains 4 neighbours of v then we can choose x and y to be any two of them.  Otherwise all such C contain exactly 3 neighbours of v, in which case there must be at least two of them and we can choose x and y to be neighbours of v in different components.

[Figure: x and y chosen from different 2-edge-connected components of G-v.]

So G is 3-regular and 3-edge-connected.  If G is only 1-connected then there is no flow between 2-connected components, so one of the components is a smaller graph with no nowhere zero 6-flow.  If G is only 2-connected then because 3 is so small we can also find a 2-edge-cut.

Finally, we want to get rid of any loops and multiple edges we might have introduced.  But loops make literally no interesting contribution to flows and double edges all look like

[Figure: a double edge with a single edge on either side.]

and the total flow on the pair just has to agree with the edges on either side.

We’ll also need one piece of magic (see postscript).

Theorem. (Tutte) Given any integer flow of G there is a k-flow of G that agrees with the original flow mod k.  (By definition, flows of j in one direction and k-j in the other direction agree mod k.)

So we only need to worry about keeping things non-zero mod k.

The engine of Seymour’s proof is the following observation.

Claim. Suppose that G = G_0 \cup C_1 \cup \cdots \cup C_t where each C_i is a cycle and the number of new edges when we add C_i to G_0 \cup C_1 \cup \cdots \cup C_{i-1} is at most 2.  Then G has a 3-flow which is non-zero outside G_0.  

Write E_i for the set of edges added at the ith stage.  We assign flows to C_t, \ldots, C_1 in that order.  Assign a flow of 1 in an arbitrary direction to C_t; now the edges in E_t have non-zero flow and will never be touched again.  At the next stage, the edges in E_{t-1} might already have some flow; but since |E_{t-1}| \leq 2 there are only two possible values for these flows mod 3.  So there is some choice of flow we can put on C_{t-1} to ensure that the flows on E_{t-1} are non-zero.  Keep going to obtain the desired 3-flow, applying Tutte’s result as required to bring values back in range.

Finally, we claim that the G we are considering have the above form with G_0 being a vertex disjoint union of cycles.  Then G_0 trivially has a 2-flow, and 3 times this 2-flow plus the 3-flow constructed above is a nowhere-zero 6-flow on G.
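To spell out why this combination works, fix an orientation of G and record flows as signed integers.  The names f_2, f_3 and f are my own notation: f_2 is the 2-flow around the cycles of G_0, extended by zero to the rest of G, and f_3 is the 3-flow from the Claim, which is non-zero outside G_0.

    \[ f = 3 f_2 + f_3, \qquad |f_2(e)| = 1 \text{ for } e \in E(G_0), \quad f_2(e) = 0 \text{ for } e \notin E(G_0), \quad |f_3(e)| \leq 2. \]

    \[ e \in E(G_0): \ 1 = 3 - 2 \leq |f(e)| \leq 3 + 2 = 5; \qquad e \notin E(G_0): \ 1 \leq |f(e)| = |f_3(e)| \leq 2. \]

So every edge carries a flow of absolute value between 1 and 5, which is what a nowhere-zero 6-flow requires.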

For F \subseteq G, write [F] for the largest subgraph of G that can be obtained as above by adding cycles in turn, using at most two new edges at each stage.  Let G_0 be a maximal collection of vertex disjoint cycles in G with [G_0] connected, and let H = G - V(G_0).  We claim that V(H) is empty.  If not, then the 2-connected blocks of H are connected in a forest-like fashion; let H_0 be one of the leaves.

[Figure: the leaf block H_0 of H and its paths to G_0.]

By 3-connectedness there are three vertex disjoint paths from H_0 to G_0. At most one of these paths travels through H - H_0; let x and y be endpoints of two paths that do not.  These paths must in fact be single edges, as the only other way to get to G_0 would be to travel through H - H_0.  Finally, since H_0 is 2-connected it contains a cycle through x and y, contradicting the choice of G_0.

Postscript. It turns out that Tutte’s result is far from magical; in fact its proof is exactly what it should be.  Obtain a directed graph H from G by forgetting about the magnitude of flow in each edge (if an edge contains zero flow then delete it).  We claim that every edge is in a directed cycle.

[Figure: the directed edge \vec{xy} together with the sets X and Y.]

Indeed, choose a directed edge \vec{xy}.  Let Y be the set of vertices that can be reached by a directed path from y and let X be the set of vertices that can reach x by following a directed path.  If \vec{xy} is not in any directed cycle then X and Y are disjoint and there is no directed path from Y to X.  But then there can be no flow in \vec{xy}, contradicting the definition of H.

So as long as there are edges with flow value at least k, find a directed cycle containing one of those edges and push a flow of k through it in the opposite direction.  The total flow in edges with flow at least k strictly decreases, so we eventually obtain a k-flow.
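The reduction is short enough to write down. This is a sketch under my own assumptions: the graph is simple, the flow is a dictionary from directed edges to positive integers with at most one direction stored per edge, and the input really is a flow (otherwise the search for a cycle can fail).

```python
from collections import deque

def reduce_to_k_flow(flow, k):
    """Return a flow with all values below k that agrees with `flow` mod k."""
    flow = dict(flow)  # work on a copy
    while True:
        heavy = next((e for e, f in flow.items() if f >= k), None)
        if heavy is None:
            return flow
        x, y = heavy
        # Breadth-first search from y to x along edges carrying flow.
        parent = {y: None}
        queue = deque([y])
        while x not in parent:
            u = queue.popleft()
            for (a, b) in flow:
                if a == u and b not in parent:
                    parent[b] = u
                    queue.append(b)
        # The edge x -> y plus the path y -> ... -> x is a directed cycle.
        cycle = [heavy]
        v = x
        while parent[v] is not None:
            cycle.append((parent[v], v))
            v = parent[v]
        # Push k units the wrong way round the cycle.
        for (a, b) in cycle:
            f = flow.pop((a, b)) - k
            if f > 0:
                flow[(a, b)] = f
            elif f < 0:
                flow[(b, a)] = -f  # the edge now carries k - (old flow) backwards

print(reduce_to_k_flow({(0, 1): 5, (1, 2): 5, (2, 0): 5}, 3))
```

On the example, a flow of 5 around a triangle becomes a flow of 2 in the same direction. Edges whose flow drops to exactly zero are deleted, which is harmless here because we only care about values mod k.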

Fractional clique decompositions of dense graphs and hypergraphs

Ben Barber, Daniela Kühn, Allan Lo, Richard Montgomery and Deryk Osthus, Journal of Combinatorial Theory, Series B, Volume 127, November 2017, Pages 148–186 PDF

Together with Daniela Kühn, Allan Lo and Deryk Osthus I proved that for every graph F there is a constant c_F < 1 such that every “F-divisible” graph G on n vertices with minimum degree at least (c_F + o(1))n has an F-decomposition. In practice, the current obstacle to improving the bounds on c_F is usually our knowledge of another quantity, the fractional decomposition threshold for cliques.

A graph G has a fractional F-decomposition if we can assign a non-negative weight to each copy of F in G such that the total weight of the copies of F containing each fixed edge of G is exactly 1. We prove that every graph with minimum degree at least (1-1/(10000r^{3/2}))n has a fractional K_r-decomposition. This greatly improves the previous bound of (1-2/(9r^4))n for large r. We also prove a similar result for hypergraphs.

The proof begins with an approximate fractional K_r-decomposition obtained by weighting every r-clique in our graph equally. We then use small gadgets to make local adjustments to the total weight over each edge until we end up with a genuine fractional K_r-decomposition.
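In the complete graph the equal weighting is already exact, since every edge lies in the same number of r-cliques. The sketch below checks this for small cases; it illustrates the starting point only, not the gadgets, and the function name is my own.

```python
from fractions import Fraction
from itertools import combinations

def uniform_weights_cover(n, r):
    """Total weight over each edge of K_n when all r-cliques get the same weight.

    The weight is the reciprocal of the number of r-cliques containing a fixed edge.
    """
    cliques = list(combinations(range(n), r))
    per_edge = sum(1 for c in cliques if 0 in c and 1 in c)
    weight = Fraction(1, per_edge)
    cover = {e: Fraction(0) for e in combinations(range(n), 2)}
    for c in cliques:
        for e in combinations(c, 2):
            cover[e] += weight
    return cover

print(set(uniform_weights_cover(6, 3).values()))  # {Fraction(1, 1)}
```

In a graph that is merely dense, different edges lie in different numbers of cliques, so the same weighting only gives total weight close to 1 over each edge.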

Edge-decompositions of graphs with high minimum degree

Ben Barber, Daniela Kühn, Allan Lo and Deryk Osthus, Advances in Mathematics, Volume 288, 22 January 2016, Pages 337–385 PDF

When can the edge set of a graph G be partitioned into triangles? Two obvious necessary conditions are that the total number of edges is divisible by 3 and the degree of every vertex is even. We call these conditions triangle divisibility. Triangle divisibility is not a sufficient condition for triangle decomposition (consider C_6), but it is sufficient if G is complete. So we would like to know how far from complete G can be and triangle divisibility still remain sufficient for triangle decomposition. Nash-Williams conjectured that minimum degree 3n/4 (where n is the number of vertices of G) should suffice for large n. In this paper we prove that every triangle divisible graph with minimum degree 9n/10 + o(n) has a triangle decomposition. We also prove similar results with any graph F in place of triangles.
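Both halves of the C_6 remark are easy to check mechanically. The function names are my own, and vertices are assumed to be labelled 0..n-1.

```python
from itertools import combinations

def is_triangle_divisible(n, edges):
    """True if the number of edges is divisible by 3 and every degree is even."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return len(edges) % 3 == 0 and all(d % 2 == 0 for d in degree)

def triangles(n, edges):
    """All triangles of the graph, as sorted vertex triples."""
    edge_set = {frozenset(e) for e in edges}
    return [t for t in combinations(range(n), 3)
            if all(frozenset(p) in edge_set for p in combinations(t, 2))]

c6 = [(i, (i + 1) % 6) for i in range(6)]
print(is_triangle_divisible(6, c6))  # True
print(triangles(6, c6))              # []
```

So C_6 is triangle divisible but contains no triangles at all, let alone a triangle decomposition.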

The proof uses the absorbing method. It is very easy to remove triangles at the beginning of the process, but very hard at the end. So we make use of the flexibility we have at the beginning to make a plan for dealing with a small remainder. The key idea is that given a possible remainder R we can find a graph A such that A and A \cup R both have triangle decompositions. By reserving sufficiently many such A at the start of the process we know that we will be able to solve our problems at the end.

Random walks on quasirandom graphs

Ben Barber and Eoin Long, The Electronic Journal of Combinatorics, 20(4) (2013), #P25 PDF

Take a long (proportional to n^2) random walk W in a quasirandom graph G. Must the subgraph of edges traversed by W be quasirandom? We’d like to say yes, for the following reason: W visits every vertex about the same number of times, so we pick up the same number of random edges at every vertex. In the case where the minimum degree of G is large, this argument is essentially correct. If G has some vertices of very low degree then it breaks down because the random walk can get stuck in clusters of low degree vertices. However, a more sophisticated argument can recover a result that is almost as strong.

The proofs both fall into two parts: first show that the random walk does not differ too much from a process that has much more independence, then exploit that independence by applying standard concentration results to show that things work with high probability. It turns out that our results can be tweaked to apply to the more general case of random homomorphisms of trees (rather than paths) provided the maximum degree of the tree isn’t too large, so we indicate the necessary changes at the end of the paper.