Scrapbook – Ben Barber https://babarber.uk mathematical storytelling Sat, 09 May 2020 15:25:29 +0000 en-US hourly 1 https://wordpress.org/?v=5.9.2 Diffy naive https://babarber.uk/546/diffy-naive/ https://babarber.uk/546/diffy-naive/#respond Sat, 09 May 2020 15:25:29 +0000 https://babarber.uk/?p=546 A week or so ago Rob Eastaway posted about the game Diffy on Twitter. Diffy begins with four numbers arranged around a cycle. Taking the (absolute values of) differences between adjacent pairs produces four more numbers around a cycle. If you start with positive integers then iterating this process eventually reaches 0, 0, 0, 0. How many iterations does it take?

A quick search (and the existence of Rob Eastaway’s talk on the subject) reveals that a fair amount is known about Diffy. I have deliberately not read any of it in detail. I don’t even know why you always reach zero, which is not as obvious as I had assumed: consider 0,0,0,1, say, which reaches zero in four steps, but increases the sum of the four numbers along the way. The question I looked at was “What is the maximum number of steps this process can take, beginning with numbers from [n] = \{1,\ldots,n\}?” Call this f(n).
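To make the 0,0,0,1 example concrete, here is a minimal Haskell sketch of a single Diffy round and the resulting trajectory. The names step and trajectory are my own choices for illustration and are not part of the listings later in this post.

```haskell
-- One round of Diffy: absolute differences of cyclically adjacent entries.
step :: [Int] -> [Int]
step xs = zipWith (\x y -> abs (x - y)) xs (tail xs ++ [head xs])

-- All states visited, from the start up to and including the first all-zero state.
trajectory :: [Int] -> [[Int]]
trajectory xs = before ++ take 1 after
  where (before, after) = span (any (/= 0)) (iterate step xs)
```

Evaluating trajectory [0,0,0,1] visits 0,0,0,1 then 0,0,1,1 then 0,1,0,1 then 1,1,1,1 then 0,0,0,0: four steps, with the sums running 1, 2, 2, 4, 0.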

Rob begins by asking us to show that f(12)+1 = 10. (+1 by counting the cycles we see along the way rather than the number of steps we take.) There are only 12^4 starting points so this takes no time at all if we break the spirit of the puzzle and use a computer. But there’s no need to be pointlessly inefficient. The length of a Diffy “game” is unchanged if we

  • add a constant to every number
  • rotate the numbers around the cycle
  • reverse the order of the numbers around the cycle
  • (multiply the numbers by a non-zero constant)

By applying the first three observations we can say that there will be a witness for the value of f(n) using numbers from [n] with the first number being 1, and the fourth number being at least as large as the second.
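The first three symmetries are easy to spot-check by machine. The following is a sketch only: steps counts rounds until the all-zero state, and the helpers shift, rotate and symmetric are hypothetical names introduced here. One caveat I am assuming away: adding a constant changes the count for a constant starting position (0,0,0,0 takes no steps, c,c,c,c takes one), so the check is only meaningful for non-constant inputs.

```haskell
step :: [Int] -> [Int]
step xs = zipWith (\x y -> abs (x - y)) xs (tail xs ++ [head xs])

-- Number of rounds needed to reach the all-zero state.
steps :: [Int] -> Int
steps = length . takeWhile (any (/= 0)) . iterate step

shift :: Int -> [Int] -> [Int]
shift c = map (+ c)

rotate :: [Int] -> [Int]
rotate xs = tail xs ++ [head xs]

-- True when shifting, rotating and reversing all leave the game length unchanged.
symmetric :: [Int] -> Bool
symmetric xs =
  all (== steps xs) [steps (shift 3 xs), steps (rotate xs), steps (reverse xs)]
```

For example, symmetric [1,5,12,3] compares four game lengths that all start from essentially the same cycle of differences.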

f' [x,y,z,w] = [abs (x-y), abs (y-z), abs (z-w), abs (w-x)]

g' xs = takeWhile (/=[0,0,0,0]) (iterate f' xs)

h' xs = length (g' xs) + 1

i' n = maximum [(h' [a,b,c,d], [a,b,c,d]) | a <- [1], b <- [a..n], c <- [a..n], d <- [b..n]] 

(Not pictured: good choice of variable names.)

So an n^4 exhaust is down to n^3/2 using symmetry.

My next thought in these situations is that we’re doing an awful lot of recomputation. If I’ve already scored 0,1,0,1, why should I reevaluate it when scoring 0,0,0,1? Why not store the score somewhere so we can look it up later?

f n = maximum (elems table) where
  table = array ((0,0,0),(n-1,n-1,n-1)) [((a,b,c), g a b c) | a <- [0..n-1], b <- [0..n-1], c <- [0..n-1]]
  g 0 0 0 = 1 -- represents c c c c for c > 0 the first time you reach it
  g a b c | a > c = g c b a
          | otherwise = (table ! normalise a (abs (b-a)) (abs (c-b)) c) + 1

normalise a b c d = let m = minimum [a,b,c,d]
                    in  normalise2 (a-m) (b-m) (c-m) (d-m)

normalise2 0 b c d = (b,c,d)
normalise2 a 0 c d = (c,d,a)
normalise2 a b 0 d = (d,a,b)
normalise2 a b c 0 = (a,b,c)

This is what I ended up with. It isn’t pretty. The thing that wasn’t, but should have been, immediately obvious is that the states I encounter along the way won’t be in the nicely normalised form we’ve just decided we want to work with, so the naive lookup table increases our time back up to n^4. On the other hand, normalising things like this is expensive compared to the computation we’re trying to save. It’s a very bad deal, and even if it weren’t I can’t afford n^3 memory for very large values of n.

So we’re back to computing f(n) honestly for each n in time O(n^3). The next observation is that if we want f(n) for lots of n we can do better: if f(n) is to be larger than f(n-1) then the pattern witnessing that had better use both 1 and n, taking us to O(n^2) for each new n. In fact, we can say a bit more. The 1 and n would either have to be adjacent or opposite.

    1 -- b        1 -- n
    |    |        |    |
    d -- n        d -- c

Then we still have some symmetries to play with. In the opposite case, all of the edges are related by symmetries so we can assume that b-1 is the smallest difference. In the adjacent case there is less symmetry, but we can still assume that d-1 \leq n-c.

direct n = maximum ( [score 1 b n d | b <- [1..n`quot`2], d <- [b..n+1-b]]
                  ++ [score 1 n c d | c <- [1..n], d <- [1..n+1-c]] )

score 0 0 0 0 = 0
score a b c d = 1 + (score (abs (a-b)) (abs  (b-c)) (abs (c-d)) (abs (d-a)))

That’s not bad and will get you well up into the thousands without difficulty.

(Everything above is a mostly accurate representation of my progress through this problem, with light editing to fix errors and wholesale removal of dollar signs from my Haskell, since they confuse the LaTeX plugin. What follows is ahistorical, presenting an endpoint without all the dead ends along the way.)

So far we haven’t thought about the actual operation we’re iterating, so let’s do that now. Suppose that we’re in the opposite case with b and d strictly between 1 and n. Then replacing 1 by 2 and n by n-1 produces a pattern that goes to the same place as the original pattern under two rounds of iteration, so such patterns aren’t interesting in our search; they were considered in previous rounds. Similarly, in the adjacent case where c > d we can decrease c and n to obtain an equivalent-after-two-iterations pattern with a smaller value of n, so we don’t need to consider it. Finally, if we’re in the opposite case but b=1, say, then we could alternatively have viewed ourselves as being in the adjacent case. So we can reduce our search to

direct n = maximum [score 1 a b n | a <- [1..n], b <- [a..n]]

for n^2/2 or, after further assuming that a-1 \leq n-b,

direct n = maximum [score 1 a b n | a <- [1..(n+1)`quot`2], b <- [a..n-a+1]]

for n^2/4. (Once these computations are taking hours or days the constant factor improvements shouldn’t be undervalued.)

We’ve already seen that storing lots of information about previously computed results is not helpful, but we can store the known values of f(n) and its “inverse” g(k), the least n such that f(n) \geq k. Then when testing whether a,b,c,d scores at least k it might be worth checking whether \max(a,b,c,d) - \min(a,b,c,d) \geq g(k), which is the absolute minimum requirement to last at least k rounds. But our scoring function is so cheap that you don’t have to do very much at all of this sort of thing before it becomes too expensive, and in practice the optimal amount of checking seems to be zero.

Unless we can somehow do the checking without actually doing the checking? If we’re currently trying to check whether f(n) \geq k+1 then we’d better have the smallest and largest differences differing by at least g(k)-1. That -1 is the point at which I throw my hands up and switch to the interval [0,n] rather than [n] = [1,n], which means you have to keep an eye out for off by one errors when comparing earlier results with what comes next. We already have

    \[0 \leq a \leq b \leq n, \qquad a \leq n-b.\]

The largest difference at the beginning is n, so we additionally require that either n-a \geq g(k) or n-(b-a) \geq g(k). Rearranging, either a \leq n - g(k) or a \geq g(k) - n + b. This takes us to

newRecord n = maximum [score 0 a b n | b <- [0..n], a <- h b]
  where
    gk = firstTimeWeCanGet ! (recordUpTo ! (n-1)) -- see full listing
    h b = let top = min b (n-b) in
          if   n-gk < b-n+gk && n-gk < top
          then [0..n-gk] ++ [b-n+gk..top]
          else [0..top]

with a cheap improvement from doing the first round of the iteration by hand.

newRecord n = maximum [score a (b-a) (n-b) n | b <- [0..n], a <- h b] + 1

There’s at least one more idea worth considering. We haven’t used the fourth symmetry, of multiplying by non-zero constants. The practical use would be to divide out any common factors of a,b,n, but doing the gcd each time is too expensive. I had greater hopes for checking that at least one of them is odd, which should save a quarter of the work half of the time, for a total saving of about 12%, but it doesn’t seem to help, even if you enforce it by never generating the bad pairs a, b, the same way we are able to do for the size consideration in the current listing.

This will produce the (k,g(k)) pairs up to (28,19513) in 100 minutes on my 3.6GHz machine. It hasn’t found the next pair yet after a few days of searching. It’s possible that some of the optimisation considerations (especially for what should be the cheap cases like eliminating a \equiv b \equiv n \equiv 0 \pmod 2) change for large k as naive scoring becomes more expensive, but my haphazard trials have had inconsistent results, both algorithmically and in terms of apparently making the compiler stop favouring certain optimisations.

(0,0)
(1,1)
(2,1)
(3,1)
(4,1)
(5,3)
(6,3)
(7,4)
(8,9)
(9,11)
(10,13)
(11,31)
(12,37)
(13,44)
(14,105)
(15,125)
(16,149)
(17,355)
(18,423)
(19,504)
(20,1201)
(21,1431)
(22,1705)
(23,4063)
(24,4841)
(25,5768)
(26,13745)
(27,16377)
(28,19513)
Concentration inequalities https://babarber.uk/480/concentration-inequalities/ https://babarber.uk/480/concentration-inequalities/#respond Mon, 11 Feb 2019 11:28:33 +0000 https://babarber.uk/?p=480 In early 2019 I gave three lectures on concentration inequalities from a combinatorial perspective to the postgraduate reading group SPACE (Sum-Product, Additive-Combinatorics Etc.) at the University of Bristol. I prepared some very rough notes on what was covered.

You might also be interested in a scan of my undergraduate lecture notes on the same topic.

Chromatic number of the plane https://babarber.uk/399/chromatic-number-of-the-plane/ https://babarber.uk/399/chromatic-number-of-the-plane/#respond Wed, 11 Apr 2018 17:01:58 +0000 http://babarber.uk/?p=399 The unit distance graph on \mathbb R^2 has edges between those pairs of points at Euclidean distance 1.  The chromatic number of this graph lies between 4 (by exhibiting a small subgraph on 7 vertices with chromatic number 4) and 7 (by an explicit colouring based on a hexagonal tiling of the plane).  Aubrey de Grey has just posted a construction of a unit distance graph with chromatic number 5, raising the lower bound on \chi(\mathbb R^2) by 1.  This MathOverflow post is a good jumping off point into the discussion online.

I was explaining this problem to a colleague and they asked whether this graph was connected (it is) and whether that was still true if we restricted to rational coordinates.  It turns out this was addressed by Kiran B. Chilakamarri in 1988, and the answer is that the rational unit distance graph is connected from dimension 5 onwards.

To see that \mathbb Q^4 is not connected, consider a general unit vector x = (a_1/b, a_2/b, a_3/b, a_4/b) where b is coprime to \gcd(a_1, a_2, a_3, a_4).  Then a_1^2 + a_2^2 + a_3^2 + a_4^2 = b^2.

Claim.  b is divisible by 2 at most once.

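Proof. Squares mod 8 are either 0, 1 or 4.  If b is divisible by 4 then b^2 is divisible by 8.  Also b is even and coprime to \gcd(a_1, a_2, a_3, a_4), so one of the a_i is odd, hence squares to 1 mod 8.  But then a_1^2 + a_2^2 + a_3^2 + a_4^2 is 1 plus a sum of three values from \{0, 1, 4\} mod 8, which is never divisible by 8.  This is a contradiction.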
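The mod 8 step is a finite check, so it can be confirmed by brute force. A small sketch (the names residuesWithOddEntry and neverDivisibleByEight are mine): by symmetry it is enough to force the first entry to be odd and let the other three range over all residues.

```haskell
-- Residues of a^2 + b^2 + c^2 + d^2 mod 8 when a is odd.
residuesWithOddEntry :: [Int]
residuesWithOddEntry =
  [ (a*a + b*b + c*c + d*d) `mod` 8
  | a <- [1, 3, 5, 7], b <- [0 .. 7], c <- [0 .. 7], d <- [0 .. 7] ]

-- The claim used in the proof: such a sum is never 0 mod 8.
neverDivisibleByEight :: Bool
neverDivisibleByEight = 0 `notElem` residuesWithOddEntry
```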

So the entries of x in their reduced form have denominators that are not divisible by 4, and the same must hold for all sums of unit vectors.  Hence we can’t express, say, (1/4, 0, 0, 0) as a sum of unit vectors, and (1/4, 0, 0, 0) is not connected to 0.

Connectedness in dimension 5 (hence also later) uses Lagrange’s theorem on the sums of four squares.  We’ll show that (1/N, 0, 0, 0, 0) can be expressed as a sum of 2 unit vectors.  By Lagrange’s theorem, write 4N^2-1 = a_1^2 + a_2^2 + a_3^2 + a_4^2.  Then

    \[1 = \left(\frac 1 {2N}\right)^2 + \left(\frac {a_1} {2N}\right)^2+ \left(\frac {a_2} {2N}\right)^2+ \left(\frac {a_3} {2N}\right)^2+ \left(\frac {a_4} {2N}\right)^2\]

hence

    \[\left(\frac 1 {N},0,0,0,0\right) = \left(\frac 1 {2N},\frac {a_1} {2N},\frac {a_2} {2N},\frac {a_3} {2N},\frac {a_4} {2N}\right)+ \left(\frac 1 {2N}, -\frac {a_1} {2N}, -\frac {a_2} {2N}, -\frac {a_3} {2N}, -\frac {a_4} {2N}\right)\]

is a sum of 2 unit vectors.
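The construction is explicit enough to run. The sketch below finds a four-squares representation of 4N^2-1 by naive search (fine for small N; the function names fourSquares, unitPair and normSquared are hypothetical) and builds the two rational unit vectors.

```haskell
import Data.Ratio ((%))

-- Brute-force search for a, b, c, d with a^2 + b^2 + c^2 + d^2 = m.
-- Lagrange's theorem guarantees that a solution exists for every m >= 0.
fourSquares :: Integer -> [Integer]
fourSquares m = head
  [ [a, b, c, d]
  | a <- roots, b <- roots, c <- roots, d <- roots
  , a*a + b*b + c*c + d*d == m ]
  where roots = takeWhile (\t -> t * t <= m) [0 ..]

-- Two unit vectors in Q^5 whose sum is (1/N, 0, 0, 0, 0).
unitPair :: Integer -> ([Rational], [Rational])
unitPair n = (u, v)
  where
    u = map (% (2 * n)) (1 : fourSquares (4 * n * n - 1))
    v = head u : map negate (tail u)

normSquared :: [Rational] -> Rational
normSquared = sum . map (\t -> t * t)
```

With (u, v) = unitPair 5, both normSquared u and normSquared v should be 1, and zipWith (+) u v should be the vector with first entry 1/5 and zeros elsewhere.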

The number of maximal left-compressed intersecting families https://babarber.uk/374/maximal-left-compressed-intersecting-families/ https://babarber.uk/374/maximal-left-compressed-intersecting-families/#respond Fri, 23 Feb 2018 12:57:11 +0000 http://babarber.uk/?p=374 A family of sets \mathcal A \subseteq [n]^{(r)} (subsets of \{1, \ldots, n\} of size r) is intersecting if every pair of sets in \mathcal A have a common element.  If n < 2r then every pair of sets intersect, so |\mathcal A| can be as large as \binom n r.  If n \geq 2r then the Erdős–Ko–Rado theorem states that |\mathcal A| \leq \binom {n-1} {r-1}, which (up to relabelling of the ground set) is attained only by the star \mathcal S of all sets containing the element 1.

A hands-on proof of the Erdős–Ko–Rado theorem uses a tool called compression.  A family \mathcal A is left-compressed if for every A \in \mathcal A, any set obtained from A by deleting an element and replacing it by a smaller one is also in \mathcal A.  You can show by repeatedly applying a certain compression operator that for every intersecting family \mathcal A there is a left-compressed intersecting family \mathcal A' of the same size.  Thus it suffices to prove the Erdős–Ko–Rado theorem for left-compressed families, which is easy to do by induction.

There is a strong stability result for large intersecting families.  The Hilton–Milner family consists of all sets that contain 1 and at least one element of [2,r+1], together with [2,r+1] itself.  This is an intersecting family, and in fact is the largest intersecting family not contained in a star.  The Hilton–Milner family has size O(n^{r-2}), so any family that gets anything like close to the Erdős–Ko–Rado bound must be a subset of a star.

As part of an alternative proof of the Hilton–Milner theorem, Peter Borg partially answered the following question.

Let \mathcal A \subseteq [n]^{(r)} be an intersecting family and let X \subseteq [n].  Let \mathcal A(X) = \{A \in \mathcal A : A \cap X \neq \emptyset\}.  For which X is |\mathcal A(X)| \leq |\mathcal S(X)|?

Borg used the fact that this is true for X = [2,r+1] to reprove the Hilton–Milner theorem.  In Maximum hitting for n sufficiently large I completed the classification of X for which this is true for large n.  The proof used the apparently new observation that, for n \geq 2r, every maximal left-compressed intersecting family in [n]^{(r)} corresponds to a unique maximal left-compressed intersecting family of [2r]^{(r)}.  In particular, the number of maximal left-compressed intersecting families for n \geq 2r is independent of n.  For r=1, 2, 3, 4, 5, 6 there are 1, 2, 6, 72, 37145, 1081162102034 (OEIS) such families respectively.  In the rest of this post I’ll explain how I obtained these numbers.

We want to count maximal left-compressed intersecting families of [2r]^{(r)}.  The maximal part is easy: the only way to get two disjoint sets of size r from [2r] is to take a set and its complement, so we must simply choose one set from each complementary pair.  To make sure the family we generate in this way is left-compressed we must also ensure that whenever we choose a set A we must also choose every set B with B \leq A, where B \leq A means “B can be obtained from A by a sequence of compressions”.  The compression order has the following properties.

  • If A = \{a_1 < \cdots < a_r\} and B = \{b_1 < \cdots < b_r\} then A \leq B if and only if a_i \leq b_i for each i.
  • A \leq B if and only if B^c \leq A^c.
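Both properties are easy to test exhaustively for small r. The sketch below uses the first property as the definition of the order and checks the second over all pairs in [6]^{(3)}; the names sets, leftOf, complement and complementReverses are illustrative only.

```haskell
import Data.List (subsequences, (\\))

r :: Int
r = 3

-- All r-subsets of [1..2r], each as an increasing list.
sets :: [[Int]]
sets = [s | s <- subsequences [1 .. 2 * r], length s == r]

-- Pointwise comparison of the sorted elements (the first property).
leftOf :: [Int] -> [Int] -> Bool
leftOf a b = and (zipWith (<=) a b)

complement :: [Int] -> [Int]
complement a = [1 .. 2 * r] \\ a

-- The second property: A <= B if and only if B^c <= A^c.
complementReverses :: Bool
complementReverses =
  and [leftOf a b == leftOf (complement b) (complement a) | a <- sets, b <- sets]
```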

Here’s one concrete algorithm.

  1. Generate a list of all sets from [2r-1]^{(r)}.  This list has one set from each complementary pair.
  2. Put all A from the list with A < A^c into \mathcal A.  (These sets must be in every maximal left-compressed intersecting family.)  Note that we never have A^c < A as A^c contains 2r but A doesn’t.
  3. Let A be the first element of the list and branch into two cases depending on whether we take A or A^c.
    • If we take A, also take all B from the list with B < A and B^c for all B from the list with B^c < A. (Since B^c contains 2r and A doesn’t, the second condition will never actually trigger.)
    • If we take A^c, also take all B from the list with B < A^c and B^c for all B from the list with B^c < A^c. (It is cheaper to test A < B than the second condition to avoid taking complements.)
  4. Repeat recursively on each of the two lists generated in the previous step.  Stop on each branch whenever the list of remaining options is empty.

The following is a fairly direct translation of this algorithm into Haskell that makes no attempt to store the families generated and just counts the number of possibilities.  A source file with the necessary imports and the choose function is attached to the end of this post.

r = 5

simpleOptions = [a | a <- choose r [1..(2*r-1)], not $ a `simpleLeftOf` (simpleComplement a)]

simpleLeftOf xs ys = all id $ zipWith (<=) xs ys

simpleComplement a = [1..(2*r)] \\ a

simpleCount [] = 1
simpleCount (a:as) = simpleCount take + simpleCount leave
  where
    -- take a
    -- all pairs with b < a or b^c < a are forced
    -- second case never happens as b^c has 2r but a doesn't
    take = [b | b <- as, not $ b `simpleLeftOf` a]
    -- leave a, and so take a^c
    -- all pairs with b < a^c or b^c < a^c (equivalently, a < b) are forced
    c = simpleComplement a
    leave = [b | b <- as, not (b `simpleLeftOf` c || a `simpleLeftOf` b)]

This will compute the number of maximal left-compressed intersecting families for r \leq 5 in a fraction of a second.  For r=6 it would probably find the answer in less than a month.  I obtained the value for r=6 in a couple of days on a single core by using a better representation of the sets in our family.
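For readers who want to run the count without the attached source file, here is a self-contained variant of the same algorithm. Two things in it are assumptions of mine rather than the original code: choose is a guess at the omitted helper (all k-element sublists in lexicographic order), and r is passed as a parameter instead of being a top-level constant.

```haskell
import Data.List ((\\))

-- Hypothetical stand-in for the omitted helper: all k-element sublists, in order.
choose :: Int -> [a] -> [[a]]
choose 0 _      = [[]]
choose _ []     = []
choose k (x:xs) = map (x :) (choose (k - 1) xs) ++ choose k xs

leftOf :: [Int] -> [Int] -> Bool
leftOf xs ys = and (zipWith (<=) xs ys)

-- Number of maximal left-compressed intersecting families of [2r]^(r).
families :: Int -> Integer
families r = count options
  where
    complement a = [1 .. 2 * r] \\ a
    -- pairs that are not already forced by a < a^c
    options = [a | a <- choose r [1 .. 2 * r - 1], not (a `leftOf` complement a)]
    count [] = 1
    count (a:as) =
        count [b | b <- as, not (b `leftOf` a)]                    -- take a
      + count [b | b <- as, not (b `leftOf` c || a `leftOf` b)]    -- take a^c
      where c = complement a
```

Evaluating map families [1 .. 4] is a quick way to compare against the first few values quoted earlier in the post.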

The dream is to pack all of the elements of our list into a single machine word and perform each comparison in a small number of instructions.  For example, we could encode an element of [12]^{(6)} by writing each element as 4 binary digits then concatenating them in increasing order to obtain a 24 bit word.  But comparing two such words as integers compares the corresponding sets lexicographically rather than pointwise.  Edward Crane suggested that as the lists are so short and the elements are so small we can afford to be quite a lot more wasteful in our representation: we can write each element of our set in unary!  The rest of this section should be considered joint work with him.

The first iteration of the idea is to write each element x of [12] as a string of x 1’s followed by 12-x 0’s, then concatenate these strings to obtain a representation of our set.  This representation has the great advantage that we can compare sets pointwise by comparing strings bitwise, and we can do this using very few binary operations: a is contained in b if and only if a \& b = a.
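As a sketch of this first iteration (not the representation used for the final computation), the unary encoding can be tried out with an arbitrary-precision Integer, which sidesteps the word-size problem. The names encode, leftOfBits and leftOfList are made up for this example, and I put the 1’s in the low bits of each 12-bit block; which end they sit at makes no difference to the comparison.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))

-- Unary encoding: element number i of the set gets its own 12-bit block,
-- with the x lowest bits of that block set to 1.
encode :: [Int] -> Integer
encode xs = foldr (.|.) 0
  [ ((1 `shiftL` x) - 1) `shiftL` (12 * i) | (i, x) <- zip [0 ..] xs ]

-- Pointwise comparison done with a single AND and an equality test.
leftOfBits :: [Int] -> [Int] -> Bool
leftOfBits a b = encode a .&. encode b == encode a

-- Reference implementation on sorted lists.
leftOfList :: [Int] -> [Int] -> Bool
leftOfList a b = and (zipWith (<=) a b)
```

The two comparisons should agree on any pair of increasing 6-element lists drawn from [1..12].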

Unfortunately this representation uses 72 bits in total, so won’t fit into a 64-bit machine word. Observing that we never use 0 and encoding by x-1 1’s followed by 12-x 0’s saves only 6 bits. But we can do even better by encoding each element of the set differently. The first element is always at least 1, the second is always at least 2 and so on. Similarly, the first element is at most 7, the second at most 8 and so on. Working through the details we arrive at the following representation.

Identify each element of [12]^{(6)} by an “up-and-right” path from the bottom left to the top right corner of a 6 \times 6 grid: at the ith step move right if i is in your set and up if it isn’t. Then A \leq B if and only if the path corresponding to A never goes above the path corresponding to B. So we can compare sets by comparing the regions below the corresponding paths. Recording these regions can be done using 36 bits, which happily sits inside a machine word. This representation also has the helpful property that taking the complement of a set corresponds to reflecting a path about the up-and-right diagonal, so the representation of the complement of a set can be obtained by swapping certain pairs of bits followed by a bitwise NOT.

The value for r=6 was obtained using this new representation and the old algorithm, with one minor tweak.  It’s a bad idea to start with a lexicographically ordered list of sets, as the early decisions will not be meaningful and will not lead to much of a reduction in the length of the lists.  Optimal selection of which pair to decide at each stage is probably a complicated question.  As a compromise I randomised the order of the list at the start of the process, then took the first remaining pair at each stage.

The Haskell source is here.  There are a few more performance tricks to do with the exact bit representation of the sets, which I’m happy to discuss if anything is unclear.

Counting colourings with containers https://babarber.uk/369/counting-colourings-with-containers/ https://babarber.uk/369/counting-colourings-with-containers/#respond Thu, 30 Nov 2017 17:09:14 +0000 http://babarber.uk/?p=369 On the maximum number of integer colourings with forbidden monochromatic sums, Hong Liu, Maryam Sharifzadeh and Katherine Staden

Maryam spoke about this paper at this week’s combinatorics seminar.

The problem is as follows.  Let f(A, r) be the number of r-colourings of a subset A of [n] with no monochromatic sum x + y = z.  What is the maximum f(n,r) of f(A, r) over all A \subseteq [n]?

One possibility is that we take A to be sum-free, so that f(A, r) = r^{|A|}.  The maximum size of a sum-free set of [n] is around n/2, achieved by the set O of odd numbers and the interval I = [\lfloor n/2 \rfloor + 1, n], so f(n, r) \geq r^{n/2}.
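A quick sanity check of the two extremal examples, as a sketch with hypothetical helper names (sumFree, odds, upperHalf). Here sum-free means no solution to x + y = z inside the set, with x = y allowed.

```haskell
-- A set is sum-free if it contains no x, y, z (x = y allowed) with x + y = z.
sumFree :: [Int] -> Bool
sumFree a = null [() | x <- a, y <- a, (x + y) `elem` a]

-- The odd numbers in [n], and the interval [floor(n/2) + 1, n].
odds, upperHalf :: Int -> [Int]
odds n = [1, 3 .. n]
upperHalf n = [n `div` 2 + 1 .. n]
```

For n = 20 both sets have 10 elements and pass the check, while [1..20] itself does not.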

Another possibility is to choose r sum-free sets A_1, \ldots, A_r and take all colourings of A = A_1 \cup \cdots \cup A_r such that the elements of colour i are contained in A_i.  There are

    \[1^{n_1} \cdot 2^{n_2}  \cdots  r^{n_r}\]

such colourings, where n_j is the number of elements in exactly j of the A_i.  For example, we might take half of the A_i to be O and half to be I.  Then the odd numbers greater than n/2 are in every set, and the evens greater than n/2 and the odds less than n/2 are in half of the sets, so the number of colourings is around

    \[r^{n/4}(r/2)^{n/2}.\]

For r=4 this matches the previous lower bound; for r \geq 5 it is larger.
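The comparison between the two lower bounds is a short calculation, ignoring lower-order terms in the exponents:

    \[r^{n/4}(r/2)^{n/2} \geq r^{n/2} \iff (r/2)^{n/2} \geq r^{n/4} \iff (r/2)^2 \geq r \iff r \geq 4,\]

with equality exactly when r = 4, where both bounds equal 4^{n/2} = 2^n.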

It’s easy to see that this construction cannot improve the bound for r = 2: it only provides 2^{n_2} good colourings, but n_2 \leq n/2 as elements contributing to n_2 are in A_1 \cap A_2, which must be sum-free.

What about r=3?  Now we get 2^{n_2}3^{n_3} = 3^{n_3 + n_2 / \log_2 3} good colourings.  We also have that

    \[2n_2 + 3n_3 \leq  |A_1| + |A_2| + |A_3| \leq 3n/2.\]

But since \log_2(3) > 3/2 we have

    \[3^{n_3 + n_2 / \log_2 3} \leq 3^{n_3 + 2n_2 / 3} \leq 3^{n/2}.\]

Moreover, if n_2 is not tiny then we are some distance off this upper bound, so the only good constructions in this family come from having all the A_i substantially agree.

How can we get matching upper bounds?  If there weren’t very many maximal sum-free sets then we could say that every good colouring arises from a construction like this, and there aren’t too many such constructions to consider.  This is too optimistic, but the argument can be patched up using containers.

The container method is a relatively recent addition to the combinatorial toolset.  For this problem the key fact is that there is a set \mathcal F of 2^{o(n)} subsets of [n] such that

  • every sum-free set is contained in some F \in \mathcal F,
  • each F \in \mathcal F is close to sum-free.  Combined with a “sum-removal lemma” this means in particular that it has size not much larger than n/2.

We now consider running the above construction with each A_i an element of \mathcal F.  Since the containers are not themselves sum-free, this will produce some bad colourings.  But because every sum-free set is contained in some element of \mathcal F, every good colouring of a subset A of [n] will arise in this way.  And since there are at most |\mathcal F|^r choices for the sets A_i, the number of colourings we produce is at most a factor 2^{o(n)} greater than the biggest single example arising from the construction.

This is the big idea of the paper: it reduces counting colourings to the problem of optimising this one construction.  For r \leq 5 the authors are able to solve this new problem, and so the original.

 

Linear programming duality https://babarber.uk/342/linear-programming-duality/ https://babarber.uk/342/linear-programming-duality/#respond Tue, 29 Aug 2017 13:39:25 +0000 http://babarber.uk/?p=342 The conventional statement of linear programming duality is completely inscrutable.

  • Primal: maximise b^T x subject to Ax \leq c and x \geq 0.
  • Dual: minimise c^T y subject to A^T y \geq b and y \geq 0.

If either problem has a finite optimum then so does the other, and the optima agree.

I do understand concrete examples.  Suppose we want to pack the maximum number of vertex-disjoint copies of a graph F into a graph G.  In the fractional relaxation, we want to assign each copy of F a weight \lambda_F \in [0,1] such that the weight of all the copies of F at each vertex is at most 1, and the total weight is as large as possible.  Formally, we want to

maximise \sum \lambda_F subject to \sum_{F \ni v} \lambda_F \leq 1 and \lambda_F \geq 0,

which dualises to

minimise \sum \mu_v subject to \sum_{v \in F} \mu_v \geq 1 and \mu_v \geq 0.

That is, we want to weight vertices as cheaply as possible so that every copy of F contains 1 (fractional) vertex.
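A tiny instance shows the two programs meeting. Take F to be a triangle and G = K_4 (my choice of example, not one from the discussion above): there are four triangles, each vertex lies in three of them, and weight 1/3 on everything is feasible on both sides. The sketch below checks this with exact rationals; all names in it are illustrative.

```haskell
import Data.List (subsequences)
import Data.Ratio ((%))

vertices :: [Int]
vertices = [1 .. 4]

-- The copies of F: all triangles in K_4.
triangles :: [[Int]]
triangles = [t | t <- subsequences vertices, length t == 3]

-- Candidate solutions: weight 1/3 on every triangle and on every vertex.
lambda :: [Int] -> Rational
lambda _ = 1 % 3

mu :: Int -> Rational
mu _ = 1 % 3

primalFeasible, dualFeasible :: Bool
primalFeasible = all (\v -> sum [lambda t | t <- triangles, v `elem` t] <= 1) vertices
dualFeasible   = all (\t -> sum [mu v | v <- t] >= 1) triangles

primalValue, dualValue :: Rational
primalValue = sum (map lambda triangles)
dualValue   = sum (map mu vertices)
```

Both values come out as 4/3. Since any feasible dual value bounds any feasible primal value from above, equality certifies that both solutions are optimal.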

To get from the primal to the dual, all we had to do was change a max to a min, swap the variables indexed by F for variables indexed by v and flip one inequality.  This is so easy that I never get it wrong when I’m writing on paper or a board!  But I thought for years that I didn’t understand linear programming duality.

(There are some features of this problem that make things particularly easy: the vectors b and c in the conventional statement both have all their entries equal to 1, and the matrix A is 0/1-valued.  This is very often the case for problems coming from combinatorics.  It also matters that I chose not to make explicit that the inequalities should hold for every v (or F, as appropriate).)

Returning to the general statement, I think I’d be happier with

  • Primal: maximise \sum_j b_j x_j subject to \sum_j A_{ij}x_j \leq c_i and x \geq 0.
  • Dual: minimise \sum_i c_i y_i subject to \sum_i A_{ij} y_i \geq b_j and y \geq 0.
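Written with explicit indices, the easy half of duality (weak duality) is a one-line computation. As a sketch, suppose x is feasible for the primal (\sum_j A_{ij} x_j \leq c_i and x \geq 0) and y is feasible for the dual taken in the form \sum_i A_{ij} y_i \geq b_j and y \geq 0. Then

    \[\sum_j b_j x_j \leq \sum_j \Big(\sum_i A_{ij} y_i\Big) x_j = \sum_i y_i \Big(\sum_j A_{ij} x_j\Big) \leq \sum_i c_i y_i.\]

The first inequality uses the dual constraints and x \geq 0, the second uses the primal constraints and y \geq 0. So every feasible dual value is an upper bound on every feasible primal value; the duality theorem says that this bound is attained.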

My real objection might be to matrix transposes and a tendency to use notation for matrix multiplication just because it’s there.  In this setting a matrix is just a function that takes arguments of two different types (v and F or, if you must, i and j), and I’d rather label the types explicitly than rely on an arbitrary convention.

Random Structures and Algorithms 2017 https://babarber.uk/333/random-structures-and-algorithms-2017/ https://babarber.uk/333/random-structures-and-algorithms-2017/#respond Mon, 28 Aug 2017 15:49:22 +0000 http://babarber.uk/?p=333 A partial, chronologically ordered, list of talks I attended at RSA in Gniezno, Poland. Under construction until the set of things I can remember equals the set of things I’ve written about.

Shagnik Das

A family of subsets of [n] that shatters a k-set has at least 2^k elements. How many k-sets can we shatter with a family of size 2^k? A block construction achieves (n/k)^k \approx e^{-k} \binom n k. (Random is much worse.) Can in fact shatter constant fraction of all k-sets. When n = 2^k-1, identify the ground set with \mathbb F_2^k \setminus \{0\}, and colour by \chi_w(v) = v \cdot w for w \in \mathbb F_2^k.

Claim. A k-set is shattered if and only if it is a basis for \mathbb F_2^k.

Proof. First suppose that v_1, \ldots, v_k is a basis. Then for any sequence \epsilon_i, there is a unique vector w such that v_i \cdot w = \epsilon_i. (We are just solving a system of full rank equations mod 2.)

Next suppose that v_1, \ldots, v_k are linearly dependent; that is, that they are contained in a proper subspace U of \mathbb F_2^k. Choose a non-zero w orthogonal to U. Then for any u \in U and any w' we have u \cdot w' = u \cdot (w+w'), so two of our colourings agree on v_1, \ldots, v_k. \Box
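The claim can be checked exhaustively for k = 3, where the ground set has 7 points and the family has 8 colourings. In this sketch vectors of \mathbb F_2^3 are stored as bitmasks; the names are hypothetical, and the independence test isBasis is a shortcut that is only valid for three distinct non-zero vectors, so it should not be reused for other k.

```haskell
import Data.Bits (popCount, xor, (.&.))
import Data.List (nub, subsequences)

k :: Int
k = 3

-- Non-zero vectors of F_2^k as bitmasks, and all vectors w indexing the colourings.
ground, family :: [Int]
ground = [1 .. 2 ^ k - 1]
family = [0 .. 2 ^ k - 1]

-- chi_w(v) = v . w over F_2
chi :: Int -> Int -> Int
chi w v = popCount (v .&. w) `mod` 2

-- A set is shattered if the colourings realise every 0/1 pattern on it.
shattered :: [Int] -> Bool
shattered vs = length (nub [map (chi w) vs | w <- family]) == 2 ^ length vs

-- Three distinct non-zero vectors are dependent exactly when they sum to zero.
isBasis :: [Int] -> Bool
isBasis vs = foldr xor 0 vs /= 0

kSets :: [[Int]]
kSets = [s | s <- subsequences ground, length s == k]

claimHolds :: Bool
claimHolds = all (\s -> shattered s == isBasis s) kSets
```

Here 28 of the 35 three-element sets are bases, so this family of 8 sets shatters four fifths of all 3-sets.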

We finish with the observation that random sets of k vectors are fairly likely to span \mathbb F_2^k: the probability is

    \[ 1 \cdot (1 - 1/2^k) \cdot (1 - 1/2^{k-1}) \cdot \cdots \cdot (1-1/2) \geq 1 - \sum_{j=1}^k 1/2^j > 0. \]

Blowing up this colouring gives a construction that works for larger n.

At the other end of the scale, we can ask how large a family is required to shatter every k-set from [n]. The best known lower bound is \Omega(2^k \log n), and the best known upper bound is O(k2^k \log n), which comes from a random construction. Closing the gap between these bounds, or derandomising the upper bound, would both be of significant interest.

Andrew McDowell

At the start of his talk in Birmingham earlier this summer, Peter Hegarty played two clips from Terminator in which a creature first dissolved into liquid and dispersed, then later reassembled, stating that it had prompted him to wonder how independent agents can meet up without any communication. Andrew tackled the other half of this question: how can non-communicating agents spread out to occupy distinct vertices of a graph? He was able to analyse some strategies using Markov chains in a clever way.

Tássio Naia

A sufficient condition for embedding an oriented tree on n vertices into every tournament on n vertices that implies that almost all oriented trees are unavoidable in this sense.

Matchings without Hall’s theorem https://babarber.uk/214/matchings-without-halls-theorem/ https://babarber.uk/214/matchings-without-halls-theorem/#respond Tue, 07 Feb 2017 16:10:15 +0000 http://babarber.uk/?p=214 In practice matchings are found not by following the proof of Hall’s theorem but by starting with some matching and improving it by finding augmenting paths.  Given a matching M in a bipartite graph on vertex classes X and Y, an augmenting path is a path P from x \in X \setminus V(M) to y \in Y \setminus V(M) such that every other edge of P is an edge of M.  Replacing P \cap M by P \setminus M produces a matching M' with |M'| = |M| + 1.
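For concreteness, here is a minimal sketch of the augmenting path method as a depth-first search, written for clarity rather than speed. This is the standard textbook approach rather than anything specific to this post, and the names Matching, tryAugment and maxMatching are my own. A matching is stored as a list of pairs (y, x), and the graph is given by a function from each x to its neighbours in Y.

```haskell
type Matching = [(Int, Int)]   -- pairs (y, x): y in Y is matched to x in X

-- Look for an augmenting path starting at the unmatched vertex x.
-- seen lists the Y-vertices already explored; it is returned so that
-- failed branches are not explored twice.
tryAugment :: (Int -> [Int]) -> Matching -> [Int] -> Int -> ([Int], Maybe Matching)
tryAugment nbrs m seen x = go seen (nbrs x)
  where
    go sn [] = (sn, Nothing)
    go sn (y:ys)
      | y `elem` sn = go sn ys
      | otherwise = case lookup y m of
          -- y is unmatched: the path ends here
          Nothing -> (y : sn, Just ((y, x) : m))
          -- y is matched to x': try to rematch x' elsewhere, then give y to x
          Just x' -> case tryAugment nbrs m (y : sn) x' of
            (sn', Just m') -> (sn', Just ((y, x) : filter ((/= y) . fst) m'))
            (sn', Nothing) -> go sn' ys

-- Process the vertices of X one at a time, augmenting whenever possible.
maxMatching :: (Int -> [Int]) -> [Int] -> Matching
maxMatching nbrs = foldl extend []
  where
    extend m x = case tryAugment nbrs m [] x of
      (_, Just m') -> m'
      (_, Nothing) -> m
```

On the graph where 1 is adjacent to 1 and 2, 2 is adjacent to 1 only, and 3 is adjacent to 2 and 3, the greedy choice for vertex 1 has to be undone when vertex 2 arrives, which is exactly what the augmenting path does.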

Theorem.  Let G be a spanning subgraph of K_{n,n}.  If (i) \delta(G) \geq n/2 or (ii) G is k-regular, then G has a perfect matching.

Proof. Let M = \{x_1y_1, \ldots, x_ty_t\} be a maximal matching in G with V(M) \subset V(G).

(i) Choose x \in X \setminus V(M), y \in Y \setminus V(M).  We have N(x) \subseteq V(M) \cap Y and N(y) \subseteq V(M) \cap X.  Since \delta(G) \geq n/2 there is an i such that x is adjacent to y_i and y is adjacent to x_i.  Then xy_ix_iy is an augmenting path.

(ii) Without loss of generality, G is connected.  Form the directed graph D on v_1, \ldots, v_t by taking the directed edge \vec{v_iv_j}  (i \neq j) whenever x_iy_j is an edge of G.  Add directed edges arbitrarily to D to obtain a k-regular digraph D', which might contain multiple edges; since G is connected we have to add at least one directed edge.  The edge set of D' decomposes into directed cycles.  Choose a cycle C containing at least one new edge of D', and let P be a maximal sub-path of C containing only edges of D.  Let v_i, v_j be the start- and endpoints of P respectively.  Then we can choose y \in N(x_i) \setminus V(M) and x \in N(y_j) \setminus V(M), whence yQx is an augmenting path, where Q is the result of “pulling back” P from D to G, replacing each visit to a v_s in D by use of the edge y_sx_s of G. \square

 

]]>
https://babarber.uk/214/matchings-without-halls-theorem/feed/ 0
Matchings and minimum degree https://babarber.uk/201/matchings-and-minimum-degree/ https://babarber.uk/201/matchings-and-minimum-degree/#respond Thu, 02 Feb 2017 11:42:31 +0000 http://babarber.uk/?p=201 A Tale of Two Halls

(Philip) Hall’s theorem.  Let G be a bipartite graph on vertex classes X, Y.  Suppose that,  for every S \subseteq X, |N(S)| \geq |S|.  Then there is a matching from X to Y.

This is traditionally called Hall’s marriage theorem.  The picture is that the people in X are all prepared to marry some subset of the people in Y.  If some k people in X are only prepared to marry into some set of k-1 people, then we have a problem; but this is the only problem we might have.  There is no room in this picture for the preferences of people in Y.

Proof. Suppose first that |N(S)| > |S| for every non-empty proper subset S \subset X.  Then we can match any element of X arbitrarily to a neighbour in Y and obtain a smaller graph on which Hall’s condition holds, so are done by induction.

Otherwise |N(S)| = |S| for some non-empty proper subset S \subset X.  By induction there is a matching from S to N(S).  Let T = X \setminus S.  Then for any U \subseteq T we have

    \[|N(U) \setminus N(S)| + |N(S)| = |N(U \cup S)| \geq |U \cup S| = |U| + |S|\]

hence |N(U) \setminus N(S)| \geq |U| and Hall’s condition holds on (T, Y \setminus N(S)).  So by induction there is a matching from T to Y \setminus N(S), which together with the matching from S to N(S) is a matching from X to Y. \square
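Hall’s condition can be checked directly on small examples. The following is a brute-force sketch in Python (exponential in |X|, so only for toy graphs); the function name hall_violation and the adjacency-list input format are assumptions made for illustration.

```python
from itertools import combinations


def hall_violation(adj):
    """Return a tuple S of vertices of X with |N(S)| < |S|, or None.

    adj[x] lists the neighbours in Y of the vertex x of X.
    """
    xs = range(len(adj))
    for r in range(1, len(adj) + 1):
        for S in combinations(xs, r):
            # N(S) is the union of the neighbourhoods of the elements of S.
            neighbourhood = set().union(*(adj[x] for x in S))
            if len(neighbourhood) < len(S):
                return S
    return None


ok = hall_violation([[0, 1], [0], [1, 2]])   # Hall's condition holds
bad = hall_violation([[0], [0], [1, 2]])     # vertices 0 and 1 compete for 0
```

In the second graph two vertices of X are only prepared to marry the same single vertex of Y, which is the "only problem we might have".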

Corollary 1.  Every k-regular bipartite graph has a perfect matching.

Proof. Counted with multiplicity, |N(S)| = k|S|.  But each element of Y is hit at most k times, so |N(S)| \geq k|S| / k = |S| in the conventional sense. \square

Corollary 2. Let G be a spanning subgraph of K_{n,n} with minimum degree at least n/2.  Then G has a perfect matching.

This is very slightly more subtle.

Proof. If |N(S)| = n there is nothing to check.  Otherwise there is a y \in Y \setminus N(S).  Then N(y) \subseteq X \setminus S and |N(y)| \geq n/2, so |S| \leq n/2.  But |N(S)| \geq n/2 for every non-empty S, so |N(S)| \geq |S|. \square

Schrijver proved that a k-regular bipartite graph with n vertices in each class in fact has at least \left(\frac {(k-1)^{k-1}} {k^{k-2}}\right)^n perfect matchings.  For minimum degree k we have the following.

(Marshall) Hall’s theorem.  If each vertex of X has degree at least k and there is at least one perfect matching then there are at least k! perfect matchings.

This turns out to be easy if we were paying attention during the proof of (Philip) Hall’s theorem.

Proof. By (Philip) Hall’s theorem the existence of a perfect matching means that (Philip) Hall’s condition holds.  Choose a minimal S on which it is tight.  Fix x \in S and match it arbitrarily to a neighbour y \in Y.  (Philip) Hall’s condition still holds on (S - x, Y-y) and the minimum degree on this subgraph is at least k-1, so by induction we can find at least (k-1)! perfect matchings.  Since there were at least k choices for y we have at least k! perfect matchings from S to N(S).  These extend to perfect matchings from X to Y as in the proof of (Philip) Hall’s theorem. \square
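The k! bound is easy to test on small cases by counting perfect matchings directly. Here is a brute-force sketch in Python, assuming |X| = |Y| = n and that adj[x] is the set of neighbours of x; the name count_perfect_matchings and both example graphs are made up for illustration.

```python
from itertools import permutations
from math import factorial


def count_perfect_matchings(adj):
    """Count perfect matchings when X = Y = {0, ..., n-1} as index sets.

    adj[x] is the set of neighbours in Y of the vertex x of X.  A perfect
    matching is a permutation perm with perm[x] in adj[x] for every x.
    """
    n = len(adj)
    return sum(
        all(perm[x] in adj[x] for x in range(n))
        for perm in permutations(range(n))
    )


# A 3-regular example on 4 + 4 vertices: x is adjacent to x, x+1, x+2 mod 4.
circulant = [{x % 4, (x + 1) % 4, (x + 2) % 4} for x in range(4)]
# Minimum degree 2 in X, where the bound k! = 2 is attained.
tight = [{0, 1}, {0, 1}, {0, 1, 2}]

assert count_perfect_matchings(circulant) >= factorial(3)
assert count_perfect_matchings(tight) >= factorial(2)
```

The first graph has 9 perfect matchings (they correspond to derangements of 4 points), comfortably more than 3! = 6, while the second has exactly 2.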

Thanks to Gjergji Zaimi for pointing me to (Marshall) Hall’s paper via MathOverflow.

]]>
https://babarber.uk/201/matchings-and-minimum-degree/feed/ 0
Block partitions of sequences https://babarber.uk/194/block-partitions-of-sequences/ https://babarber.uk/194/block-partitions-of-sequences/#respond Tue, 31 Jan 2017 17:47:50 +0000 http://babarber.uk/?p=194 Let L be a line segment of length l broken into pieces of length at most 1. It’s easy to break L into k blocks (using the preexisting breakpoints) that differ in length by at most 2 (break at the nearest available point to l/k, 2l/k etc.).

In the case where each piece has length 1 and the number of pieces isn’t divisible by k, we can’t possibly do better than a maximum difference of 1 between blocks.  Is this achievable in general?

In today’s combinatorics seminar Imre Bárány described joint work with Victor Grinberg which says that the answer is yes.

The proof is algorithmic.  Start with any partition of L into k blocks.  If the difference in length between the shortest and longest blocks is at most 1 then we’re done, so assume not.  Fix a longest block B and let C be a shortest block.  Suppose without loss of generality that C is to the right of B.  Increase the length of C by stealing one piece from the block immediately to the left of C.  Repeat this process until either we find a good partition of L, or we steal a piece from B.  On each iteration a piece moves away from B, so one of these must occur in finite time.

At each stage we add a piece of length at most 1 to a block of length more than 1 shorter than B, so we never create new blocks of length at least that of B.  So when B is destroyed either the maximal length has decreased, or the number of blocks of maximal length has decreased.  There are only finitely many possible lengths for blocks, so by repeating with a new block of maximal length we are guaranteed to find a good partition in finite time.
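The procedure is short enough to write out. The following Python sketch is my reading of the argument above, not code from Bárány and Grinberg. The function name balance_blocks, the starting partition (everything in the last block) and the tie-breaking rules are my own choices: B is taken to be the first longest block, and C the shortest block nearest to B, which guarantees that the block C steals from is non-empty. Blocks are allowed to be empty, and lengths are compared exactly, so with floating point input it is safer to use fractions.Fraction or values that add without rounding.

```python
def balance_blocks(pieces, k):
    """Split `pieces` (lengths in [0, 1]) into k consecutive blocks whose
    total lengths differ by at most 1.  Returns the list of blocks."""
    m = len(pieces)
    # Block i is pieces[cuts[i]:cuts[i + 1]].  Start from an arbitrary
    # partition: here every piece is in the last block.
    cuts = [0] * k + [m]

    while True:
        lengths = [sum(pieces[cuts[i]:cuts[i + 1]]) for i in range(k)]
        longest, shortest = max(lengths), min(lengths)
        if longest - shortest <= 1:
            return [pieces[cuts[i]:cuts[i + 1]] for i in range(k)]
        b = lengths.index(longest)  # the first longest block B
        # A shortest block C, chosen as close to B as possible.
        c = min(
            (i for i in range(k) if lengths[i] == shortest),
            key=lambda i: abs(i - b),
        )
        if c > b:
            cuts[c] -= 1      # C steals the last piece of the block to its left
        else:
            cuts[c + 1] += 1  # C steals the first piece of the block to its right


pieces = [0.5, 1, 0.25, 0.75, 1, 0.5, 0.5, 1]
blocks = balance_blocks(pieces, 3)
sums = [sum(block) for block in blocks]
```

Since no step creates a new block as long as B, the first longest block stays the same until a piece is stolen from it, so recomputing B on every pass agrees with fixing B as in the proof.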

]]>
https://babarber.uk/194/block-partitions-of-sequences/feed/ 0