Integer programming for Set

Overview

We aim to answer the question How many Sets are possible in k cards? by applying integer programming.

An Integer Program

Denote Sets by S and cards by v.

max sum_S z_S
s.t. sum_v x_v <= k
     z_S <= x_v  for all S, for all v in S
     z_S in {0,1}  for all S
     x_v in {0,1}  for all v.

A corresponding LP relaxation is obtained by replacing "in {0,1}" constraints by "in [0,1]". This relaxation is far from tight.

Let n = 81. Let x_v := k/n for all v and z_S = k/n for all S. This has objective value k(n-1)/6. But the generic upper bound gives k(k-1)/6 as an upper bound on the IP optimum, which is a much better bound.

Relationship to Densest k-Subgraph

The problem being solved here is an extension of the Densest k-Subgraph problem to a hypergraph with hyperedges of size 3.

A corresponding naive LP relaxation for that problem, when applied to the complete graph K_n, has LPOPT=k(n-1)/2 and IPOPT=k(k-1)/2, for a gap of (n-1)/(k-1).

The following paper proposes inequalities to improve the situation:

@article{Bonomo2011,
title = "A polyhedral study of the maximum edge subgraph problem",
journal = "Discrete Applied Mathematics",
volume = "",
number = "0",
pages = " - ",
year = "2011",
note = "",
issn = "0166-218X",
doi = "10.1016/j.dam.2011.10.011",
url = "http://www.sciencedirect.com/science/article/pii/S0166218X11003702",
author = "Flavia Bonomo and Javier Marenco and Daniela Saban and Nicolás E. Stier-Moses",
keywords = "Polyhedral combinatorics",
keywords = "Quasi-cliques",
keywords = "Maximum edge subgraph problem"
}

In particular, their neighbour inequalities, when applied to K_n, result in an LP optimum of k(k-1)/2, same as the IP.

Neighbour Inequality for Set

A similar neighbour inequality can be derived for Set. If a point v is taken, k-1 points remain. In the best situation, they pair up to form Sets with v, so v is in at most (k-1)/2 Sets with the other chosen points.

The resulting inequality is sum_{S on v} z_S < floor((k-1)/2) x_v for all v.

Adding this inequality to our IP makes the LP optimum match our generic upper bound. Still, the IP takes prohibitively long to solve, even for k=11. But this inequality is necessary for the final formulation to run quickly.

Symmetry Breaking

Feasibility and objective value are preserved under affine automorphism. If we have a feasible solution with affine dimension d, then we may find d+1 affinely independent vectors and map them into {(0,0,0,0), (0,0,0,1), (0,0,1,0), (0,1,0,0), (1,0,0,0)}. Thus, if we know a lower bound on the affine dimension of the solution, we may force some of these zero and standard basis vectors to be taken by the IP.

A simple lower bound on dimension is: if a solution has affine dimension d, then it has at most 3^d cards. So we know d >= log_3(k).

Together with the neighbour inequality, this is enough to solve for k=12. Without the neighbour inequality, even k=11 is still infeasible.

TODO

Is the neighbour inequality facet-defining?

TODO: Can it be reduced to Densest k-Subgraph?

I don't think so. There's no particular reason to expect it, since maximum matching is easy but independent set is hard. But note that it's easy to reduce hypergraph isomorphism to graph isomorphism. (Consider the incidence graph, and attach a triangle to each of the sets to make sure set nodes get mapped to set nodes.)