Deep Learning in Discrete Optimization

AMS 467/667, Spring 2020
Instructor: William Cook
Lectures: Maryland 110, Tuesday and Thursday, 9:00--10:15.
Office Hours: Whitehead 201-A, Tuesday 1PM - 3PM.
TA: Yuanya Ma


Course Description

The goal of the course is to examine research-level topics in the application of deep-learning techniques to the solution of computational problems in discrete optimization. The tools of deep learning, mixed-integer programming, and heuristic search will be studied, analyzed, and applied to a variety of models, including the traveling salemsan problem, vehicle routing, and graph coloring.

The first half of the course will cover background material, introducing students to deep learning (focusing on practical aspects) and heuristic search techniques, and covering major topics in applied mixed-integer programming, including modeling, linear-programming duality, cutting planes, column generation, and branch-and-bound. We will then make an in-depth study of recent papers where deep learning has been proposed as a solution-technique in discrete optimization and where mixed-integer programming has been adopted to study the performance of deep learning models. The presentations will be aimed towards discussions of open research questions.

Prerequisites

General mathematical maturity is expected: students should feel comfortable reading on their own Part 1 (Applied Math and Machine Learning Basics) in the text Deep Learning.

Although no specific background in the topics is required, students will get the most out of the course if they have previous knowledge of one of the three components: (1) discrete optimization, (2) machine learning, (3) computation (PyTorch or a good background in computer programming).

Course Work

There will be two homework assignments. The first is to modify an existing code to solve an instance of the traveling salesman problem, using lazy constraints and a MIP solver. The second will be for you to build, train, and test a neural network for a simple combinatorial problem, using the PyTorch framework. The two assignments together will count for 25% of the course grade.

The major part of the course work is a final project, applying deep learning to a discrete optimization problem, or using discrete optimization to annalyze a deep learning model. In this project students are encouraged to work in teams of up to four members. Each team will submit a proposal (1 to 2 pages), a final report (5 to 10 pages), computer code, and test data. Teams will also give a short presentation in the final week of lectures. The project counts for 75% of the course grade.

Textbooks

Deep Learning, Goodfellow, Bengio, and Courville (pdf freely available, hardcopy $55.00 at Amazon).
In Pursuit of the Traveling Salesman (hardcopy $12.95 at Amazon).
Model Building in Mathematical Programming, H. Paul Williams (online version at no cost while on Johns Hopkins network, hardcopy $50.07 at Amazon).




Outline of Lectures

Introduction (Week 1)
Deep Learning, Goodfellow, Bengio, Courville, Chapters 1-5.
In Pursuit of the Traveling Salesman, Chapter 1.

Linear-Programming Duality (Week 2a)
In Pursuit of the Traveling Salesman, Chapter 5. Very gentle introduction to linear programming.
Liner Programming, Vasek Chvatal. Fantastic introduction to LP, duality, and the simplex algorithm. Sadly, the book is out of print, but used copies are available for $13.
ORF 307: Optimization, Robert Vanderbei. Course notes based on his book Linear Programming: Foundations and Extensions.

The Cutting-Plane Method (Week 2b)
In Pursuit of the Traveling Salesman, Chapter 6. Cutting Planes.
Traveling salesman problem: exact solution with the cutting plane method (YouTube video.) A tutorial for the TSP cutting-plane method, created for the Concorde TSP app. Hope you don't mind the Siri voice.
The Traveling Salesman Problem with integer programming and Gurobi. A demo and sample code, implementing the MIP process we discussed in class.

Notes for Week 2
Please read Deep Learning, Goodfellow, Bengio, Courville, Chapter 6.
Also please install the Gurobi solver on a local machine. Gurobi will be used for the first HW assignment.

HW 1: Hello MIP World

Branch and Bound (Week 3)
In Pursuit of the Traveling Salesman, Chapter 7. Branching.
The branch-and-cut algorithm for solving mixed-integer optimization problems, Jim Luedtke. IMA New Directions Short Course, August 10, 2016.

Heuristic Search (Week 4a)
In Pursuit of the Traveling Salesman, Chapter 4. Searching for a Tour.
Experimental analysis of heuristics for the STSP, D. S. Johnson, L. McGeoch. Appeared in the book The Traveling Salesman Problem and its Variations, edited by Gutin and Punnen, 2002. Survey of classical heuristic methods for the (symmetric) TSP.

PyTorch Tutorial (Week 4b)
Tutorial 1, Jupyter notebook by Yuanye Ma, course TA.
graphs6.txt, data set for the tutorial.

HW 2: Hello DL World

Rectified linear units (Week 5a)
Understanding deep neural networks with rectified linear units, R. Arora, A. Basu, P. Mianjy, A. Mukherjee. 2018. Appeared in ICLR 2018.
Discrete Geometry meets Machine Learning, Amitabh Basu. Talk given at the 22nd Aussois Combinatorial Optimization Workshop, January 11, 2018.
Bounding and counting linear regions of deep neural networks, T. Serra, C. Tjandraatmadja, S. Ramalingam. 2018. Appeared in ICML 2018. A talk on this topic (given by one of the authors of the paper) can be viewed here.

Using MIP to study ReLU neural nets (Week 5b)
Deep neural networks and mixed integer linear optimizatio, M. Fischetti, J. Jo. 2018. Published in the journal Constraints, July 2018, Issue 3, pp. 296-309.
Evaluating robustness of neural networks with mixed integer programming, V. Tjeng, K. Xiao, R. Tedrake, 2019. Appeared in ICLR 2019. Computer code available here.
Strong mixed-integer programming formulations for trained neural networks, R. Anderson, J. Huchette, C. Tjandraatmadja, J. P. Vielma, 2018. Appeared in ICPCO 2019. Full version of paper, with W. Ma as additional co-author.

Q-Learning (Week 6)
Learning combinatorial optimization algorithms over graphs, H. Dai, E. B. Khalil, Y. Zhang, B. Dilkina, L. Song. 2017. Appeared in NIPS 2017. Computer code by Hanjun Dai, implementing the discusssed models, is available on GitHub. Hanjun Dai also has PowerPoint slides available for a talk on this topic.
Machine learning for combinatorial optimization: a methodological Tour de Horizon, Y. Bengio, A. Lodi, A. Prouvost, 2018. Nice survey paper.
Machine Learning for Humans, Part 5: Reinforcement Learning, V. Maini. 2017. Gentle introduction; good way to get accustomed to the terminology used in Q-learning.
In Pursuit of the Traveling Salesman, Chapter 9. Complexity. The dynamic-programming algorithm we discuessed in class can be found on pages 180--182.

Graph Neural Networks (GNNs) (Week 7a)
Benchmarking graph neural networks, V. P. Dwivedi, C. K. Joshi, T. Laurent, Y. Bengio, X. Bresson. Posted on arXiv.org March 2, 2020. Computer code is available on GitHub.
A comprehensive survey on graph nueral networks, Z. Wu, S. Pan, F. Chen, G Long, C. Zhang, P. Yu. Posted on arXiv.org December 4, 2019.
Graph neural networks: a review of methods and applications, J. Zhou et al. Posted on arXiv.org July 10, 2019.

COVID-19 and Explainable Machine Learning (Week 8a)
Kaggle's COVID-19 pages. Access to a massive collection of 44,000 scholarly articles. Challenge consists of 10 specific data anlaysis tasks.
Boolean decision rules via column generation, S. Dash, O. Gunluk, D. Wei. Winner of the $5000 FICO Explainable Machine Learning Challenge at NeurIPS 2018.

Simplex Method (Pivoting via a DNN?) (Week 8b)
Robert Bixby: Solving Linear Programs: The Dual Simplex Algorithm (1/3): Some Basic Theory (YouTube video). Bixby's lecture on the simplex algorithm.
Solving Linear and Integer Programs, Robert Bixby and Ed Rothberg. The description of the simplex method we covered in class is given on slide 16.
Liner Programming, Vasek Chvatal, Chapters 2, 3, and 7.

Sequence-to-Sequence Learning (Week 9a)
Neural machine translation by jointly learning to align and translate, D. Bahdanau, K. Cho, Y. Bengio. 2014.
Sequence to sequence learning with neural networks, I. Sutskever, O. Vinyals, Q. V. Le. 2014.
Pointer networks, O. Vinyals, M. Fortunato, N. Jaitly. 2017.

Attention (Week 9b)
Attention is all you need, A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. Gomez, L. Kaiser, I. Polosukhin. December 2017.
Masterclass: Attention is all your need. Great talk by Lukasz Kaiser (Google Brain) on the Vaswani et al. paper.
Transfomer: A novel neural network architecture for language understanding. Google AI Blog by Jakob Uszkoreit on the Vaswani et al. paper.
Attention, learn to solve routing problems!, W. Kool, H. van Hoof, M. Welling. Appeared in ICLR 2019. (Source code on GitHub.)

Team Meetings (Week 10)
Online meetings with each of the teams to dicuss project proposals.

Branch-and-Bound Rules via Machine Learning (Week 11a)
A machine learning-baded approximation of strong branching, A. Alvarez, Q. Louveaux, L. Wehenkel. INFORMS Journal of Computing (2017) 185-195.
Learning to branch in mixed integer programming, E. B. Khalil, P. Le Bodic, L. Song, G. Nemhauser, B. Dilkina. Appeared in AAAI 2016.
Exact combinatorial optimization with graph convolutional neural networks, M. Gasse, D. Chetelat, N. Ferroni, L. Charlin, A. Lodi. Appeared in NeurIPS 2019.
On learning and branching: a survey, A. Lodi, G. Zarpellon. Top (2017) 207-236.

Computational Testing (Week 11b)
Constraint Integer Programming, Tobias Achterberg, Ph. D. Thesis, Berlin 2009. Beautiful examples of careful computational testing.
Benchmarking optimization software with performance profiles, E. Dolan, J. More. Mathematical Programming (2002) 201-213.
A note on performance profiles for benchmarking software, Nicholas Gould, Jennifer Scott. ACM Transactions of Mathematical Software (2016) Article 15.
Perprof-py: A Python package for performance profile of mathematical optimization software, A. Soares Siqueira, R. Gaia Costa da Silva, L. Santos. Journal of Open Research Software. Code available on Github.
Experimental analysis of algorithms, Catherine McGeoch. Notices of the AMS, March 2001.
A theoretician's guide to the experimental analysis of algorithms, David S. Johnson, 2001.

PhD Programs in Discrete Optimization (Week 12a)
Discussion of PhD programs that have a discrete optimization focus.

Team Meetings (Week 12b)
Office hours from 9AM to Noon, Thursday, April 23.

Team Final Project Presentations (Week 13)
Schedule will be posted on Tuesday, April 21




Click here for the 2018 version of the course at the University of Waterloo.



General References

Deep Learning
Ian Goodfellow, Yoshua Bengio, Aaron Courville
MIT Press, 2016
The main course text for fundamentals of deep learning.

A Practical Guide to Discrete Optimization, Chapter 1, Chapter 7
David Applegate, William Cook, Sanjeeb Dash
Computational studies in discrete optimization.