Maria-Florina Balcan, Nicholas J. A. Harvey.
43rd Annual ACM Symposium on Theory of Computing (STOC '11),
San Jose, CA, June 2011.
arXiv version, August 2010.
There has recently been significant interest in the machine learning community in understanding
and using submodular functions. Despite this interest, little is known about submodular functions
from a learning theory perspective. Motivated by applications such as pricing goods in economics,
this paper considers PAC-style learning of submodular functions in a distributional setting.
A problem instance consists of a distribution on {0,1}^n and a real-valued
function on {0,1}^n that is non-negative, monotone, and submodular. We are
given poly(n) samples from this distribution, along
with the values of the function at those sample points. The task is to
approximate the value of the function to within a multiplicative factor at
subsequent sample points drawn from the same distribution, with sufficiently high
probability. We prove several results for this problem.
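
For concreteness, here is a minimal sketch of the setup just described. The target f, the hypothesis g, the dimension, and the factor α are illustrative placeholders (not from the paper), and the success criterion g(x) ≤ f(x) ≤ α·g(x) on a 1-ε fraction of fresh samples is one standard way to formalize multiplicative approximation:

```python
import random

n = 10

def f(x):
    # Placeholder target: min(|x|, 4), the rank function of a uniform
    # matroid, which is non-negative, monotone, and submodular.
    return min(sum(x), 4)

def g(x):
    # Hypothetical learned hypothesis; any real-valued predictor fits here.
    return max(1, sum(x) / 2)

def sample():
    # Product distribution: each coordinate is 1 independently w.p. 1/2.
    return [random.randint(0, 1) for _ in range(n)]

def empirical_success(alpha, trials=10000):
    # Fraction of fresh samples on which g approximates f to within a
    # multiplicative factor alpha, i.e. g(x) <= f(x) <= alpha * g(x).
    hits = 0
    for _ in range(trials):
        x = sample()
        if g(x) <= f(x) <= alpha * g(x):
            hits += 1
    return hits / trials

print(empirical_success(alpha=4.0))
```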
- If the function is Lipschitz and the distribution is a product distribution, such as the
  uniform distribution, then a good approximation is possible: there is an algorithm that
  approximates the function to within a factor of O(log(1/ε)) on a set of measure 1-ε,
  for any ε>0.
- If we do not assume that the distribution is a product distribution, then the approximation
  factor must be much worse: no algorithm can approximate the function to within a factor of
  O(n^{1/3}/log n) on a set of measure 1/2+ε, for any constant ε>0. This holds even if the
  function is Lipschitz.
- On the other hand, this negative result is nearly tight: for an arbitrary distribution,
  there is an algorithm that approximates the function to within a factor of n^{1/2} on a
  set of measure 1-ε.
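
All three results concern non-negative, monotone, submodular functions. For readers who want the definition in executable form, the following checks the diminishing-returns characterization of submodularity exhaustively on a small coverage function; this is a standard textbook example, not one of the paper's constructions:

```python
import itertools

universe_sets = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}, 3: {1, 5}}

def coverage(S):
    # f(S) = number of universe elements covered by the sets indexed by S.
    covered = set()
    for i in S:
        covered |= universe_sets[i]
    return len(covered)

def subsets(s):
    # All subsets of s, as sets.
    items = sorted(s)
    return [set(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def is_submodular(f, ground):
    # Diminishing returns: for all S ⊆ T and i ∉ T,
    # f(S ∪ {i}) - f(S) >= f(T ∪ {i}) - f(T).
    for T in subsets(ground):
        for S in subsets(T):
            for i in ground - T:
                if f(S | {i}) - f(S) < f(T | {i}) - f(T):
                    return False
    return True

print(is_submodular(coverage, set(universe_sets)))  # -> True
```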
Our work combines central issues in optimization (submodular functions and matroids) with central
topics in learning (distributional learning and PAC-style analyses) and with central concepts in pseudorandomness
(lossless expander graphs). Our analysis involves a twist on the usual learning theory models
and uncovers some interesting structural and extremal properties of submodular functions, which we
suspect will be useful in other contexts. In particular, to prove our general lower bound, we use
lossless expanders to construct a new family of matroids which can take wildly varying rank values on
superpolynomially many sets; no such construction was previously known.
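
As background, matroid rank functions are the canonical examples of non-negative, monotone, submodular functions. A partition matroid gives a simple instance (the paper's expander-based family is far more intricate, which is what lets its ranks vary so wildly):

```python
# A partition matroid: the ground set is split into blocks, and a set S is
# independent if it takes at most capacities[j] elements from block j.
# Its rank function, below, is monotone and submodular.
blocks = [{0, 1, 2}, {3, 4}, {5, 6, 7}]
capacities = [1, 2, 1]

def partition_rank(S):
    return sum(min(len(S & B), c) for B, c in zip(blocks, capacities))

print(partition_rank({0, 1, 3}))  # 1 from the first block + 1 from the second = 2
```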