Selected Publications

A complete list of my publications is also available.

Learning Submodular Functions

2011/06

Maria-Florina Balcan, Nicholas J. A. Harvey.
43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF.
arXiv, August 2010. PDF.

There has recently been significant interest in the machine learning community on understanding and using submodular functions. Despite this recent interest, little is known about submodular functions from a learning theory perspective. Motivated by applications such as pricing goods in economics, this paper considers PAC-style learning of submodular functions in a distributional setting.

A problem instance consists of a distribution on {0,1}ⁿ and a real-valued function on {0,1}ⁿ that is non-negative, monotone and submodular. We are given poly(n) samples from this distribution, along with the values of the function at those sample points. The task is to approximate the value of the function to within a multiplicative factor at subsequent sample points drawn from the same distribution, with sufficiently high probability. We prove several results for this problem.

If the function is Lipschitz and the distribution is a product distribution, such as the uniform distribution, then a good approximation is possible: there is an algorithm that approximates the function to within a factor O(log (1/ε)) on a set of measure 1-ε, for any ε>0.
If we do not assume that the distribution is a product distribution, then the approximation factor must be much worse: no algorithm can approximate the function to within a factor of O(n^1/3/log n) on a set of measure 1/2+ε, for any constant ε>0. This holds even if the function is Lipschitz.
On the other hand, this negative result is nearly tight: for an arbitrary distribution, there is an algorithm that approximates the function to within a factor of n^1/2 on a set of measure 1-ε.

Our work combines central issues in optimization (submodular functions and matroids) with central topics in learning (distributional learning and PAC-style analyses) and with central concepts in pseudorandomness (lossless expander graphs). Our analysis involves a twist on the usual learning theory models and uncovers some interesting structural and extremal properties of submodular functions, which we suspect are likely to be useful in other contexts. In particular, to prove our general lower bound, we use lossless expanders to construct a new family of matroids which can take wildly varying rank values on superpolynomially many sets; no such construction was previously known.

Keywords: Learning Theory, Submodular Functions, Matroids

Notes: In this paper we use Talagrand's inequality to prove a concentration inequality for Lipschitz, monotone submodular functions. We were unaware of prior work proving concentration for subadditive functions by Schechtman in 2003, using Talagrand's inequality, and by Hajiaghayi et al. in 2005, using self-bounding functions.

In 2009, Chekuri and Vondrak also proved a concentration inequality for Lipschitz, monotone submodular functions, using the FKG inequality. Their inequality achieves Chernoff-type concentration, which is somewhat better than the Talagrand-like concentration that we achieve. An earlier version of our paper only proved concentration for matroid rank functions. After learning of the work of Chekuri and Vondrak, we improved our inequality to its current form, in which it applies to arbitrary Lipschitz, monotone, submodular functions.

Perhaps the nicest way to prove such inequalities is using self-bounding functions. In particular, this approach yields concentration for non-monotone functions; it is not clear whether the FKG or Talagrand approaches do too. See the note by Vondrak from 2010.

	Graph Sparsification by Edge-Connectivity and Random Spanning Trees	2011/06
	Wai Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, Debmalya Panigrahi. 43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF. arXiv, May 2010. PDF. We present two new approaches to constructing graph sparsifiers -- weighted subgraphs for which every cut has the same value as the original graph, up to a factor of (1 ± ε). The first approach independently samples each edge uv with probability inversely proportional to the edge-connectivity between u and v. The fact that this approach produces a sparsifier resolves an open question of Benczur and Karger. Concurrent work of Hariharan and Panigrahi (2010) also resolves this question. The second approach samples uniformly random spanning trees. Both of our approaches produce sparsifiers with O(n log²(n)/ε²) edges. Our proofs are based on extensions of the Karger-Stein contraction algorithm which allow it to compute minimum "Steiner cuts".
	Keywords: Graph Algorithms, Random Sampling, Contraction Algorithm, Splitting Off, Random Walks

Notes: Many of the results in this paper were independently discovered by Fung & Harvey and by Hariharan & Panigrahi. Actually Hariharan & Panigrahi had many more algorithmic results than we did, including the linear time algorithm for unweighted graphs. They graciously offered to merge the papers for a joint STOC submission.

	Sketching and Streaming Entropy via Approximation Theory	2008/10
	Nicholas J. A. Harvey, Jelani Nelson, Krzysztof Onak. IEEE Information Theory Workshop (ITW 08), Porto, Portugal, May 2008. PDF Slides. IEEE Symposium on Foundations of Computer Science (FOCS 08), Philadelphia, PA, October 2008. PDF. We conclude a sequence of work by giving near-optimal sketching and streaming algorithms for estimating Shannon entropy in the most general streaming model, with arbitrary insertions and deletions. This improves on prior results that obtain suboptimal space bounds in the general model, and near-optimal bounds in the insertion-only model without sketching. Our high-level approach is simple: we give algorithms to estimate Rényi and Tsallis entropy, and use them to extrapolate an estimate of Shannon entropy. The accuracy of our estimates is proven using approximation theory arguments and extremal properties of Chebyshev polynomials, a technique which may be useful for other problems. Our work also yields the best-known and near-optimal additive approximations for entropy, and hence also for conditional entropy and mutual information.
	Keywords: Streaming Algorithms, Shannon Entropy, Renyi Entropy, Stable Distributions, Chebyshev Polynomials

	Algebraic Algorithms for Matching and Matroid Problems	2006/10
	Nicholas J. A. Harvey. IEEE Symposium on Foundations of Computer Science (FOCS 06), Berkeley, CA, October 2006. Received the Best Student Paper (Machtey) Award. PDF PS Slides. SIAM Journal on Computing, 39(2):679-702, 2009. PDF. We present new algebraic approaches for several well-known combinatorial problems, including non-bipartite matching, matroid intersection, and some of their generalizations. Our work yields new randomized algorithms that are the most efficient known. For non-bipartite matching, we obtain a simple, purely algebraic algorithm with running time O(n^ω) where n is the number of vertices and ω is the matrix multiplication exponent. This resolves the central open problem of Mucha and Sankowski (FOCS 2004). For matroid intersection, our algorithm has running time O(nr^ω-1) for matroids with n elements and rank r that satisfy some natural conditions.
	Keywords: Non-bipartite Matching, Matroid Intersection, Algebraic Algorithms, Fast Matrix Multiplication

Notes: I recommend reading the journal version of this paper, not the conference version. The algorithms in the journal version are different and simpler. The conference paper sketched algorithms for the basic path-matching and independent matching problems. To keep the journal paper as simple as possible, these are not discussed there. The conference papers also claims that graphic matroid intersection can be solved in O(n^2.575) time. When asked to provide the details, I could not reproduce them. Most likely I was mistaken, so I retract the claim.

	SkipNet: A Scalable Overlay Network with Practical Locality Properties	2003/03
	Nicholas J. A. Harvey, Michael B. Jones, Stefan Saroiu, Marvin Theimer, Alec Wolman. Fourth USENIX Symposium on Internet Technologies and Systems (USITS '03), Seattle, WA, March 2003. Received the Best Paper Award. PDF PS HTML Slides. Long version, October 2003. PS. Source code is available. Scalable overlay networks such as Chord, CAN, Pastry, and Tapestry have recently emerged as flexible infrastructure for building large peer-to-peer systems. In practice, such systems have two disadvantages: They provide no control over where data is stored and no guarantee that routing paths remain within an administrative domain whenever possible. SkipNet is a scalable overlay network that provides controlled data placement and guaranteed routing locality by organizing data primarily by string names. SkipNet allows for both fine-grained and coarse-grained control over data placement: Content can be placed either on a pre-determined node or distributed uniformly across the nodes of a hierarchical naming subtree. An additional useful consequence of SkipNet's locality properties is that partition failures, in which an entire organization disconnects from the rest of the system, can result in two disjoint, but well-connected overlay networks.
	Keywords: Peer-to-Peer, Scalable, Locality, Self-Configuring, Range Query, Distributed System

Notes: Essentially the same idea was independently discovered by Aspnes and Shah, and published in James Aspnes, Gauri Shah, "Skip graphs". SODA 2003, 384-393.