# Working group “Machine learning and optimization”

This new working group will be held on Tuesday mornings. It is organized by Romual Elie, Walid Hachem and Gabriel Stoltz. There is a mailing list if you want (gt-appr-opt followed by @liste.enpc.fr).

The sessions of the working group will be held in the seminar room of CERMICS (2nd floor of Coriolis building, left wing, see here for instructions to reach the building).

January 8th, 2019:

• 10h-11h Olivier Fercoq (Telecom Paris), Smooth Primal-Dual Coordinate Descent Algorithms for Nonsmooth Convex Optimization
We propose a new randomized coordinate descent method for a convex optimization template with broad applications. Our analysis relies on a novel combination of four ideas applied to the primal-dual gap function: smoothing, acceleration, homotopy, and coordinate descent with non-uniform sampling. As a result, our method features the first convergence rate guarantees among the coordinate descent methods, that are the best-known under a variety of common structure assumptions on the template. We provide numerical evidence to support the theoretical results with a comparison to state-of-the-art algorithms.
• 11h15-12h15, Laurent Najman (ESIEE), A morphological point of view on graph-based classification and clustering
In this talk, we will introduce the maximum margin partition problem on graphs. This problem can be seen as the “dual” of the maximum margin boundary problem, of which SVM is a well-known solution. We will see that the maximum margin partition problem naturally leads to different variants of the watershed, the main tool developed in mathematical morphology for image segmentation. We will survey the main ideas underlying the watershed framework for data analysis and regularization: distances, minimum spanning tree, connected filtering and ensemble techniques. We will then present some applications (including for example GIS data) to clustering and semi-supervised learning. If time permits, we will clarify (with the help of the Gamma-convergence framework) the link between minimum-spanning tree optimization and other more classical optimization framework such as, for example, the random-walk approach based on graph-Laplacian.

November 6th, 2018:

• 10h-11h, Alexandre d’Aspremont (ENS Paris), Restarting Frank-Wolfe
Conditional Gradients (aka Frank-Wolfe algorithms) form a classical set of methods for constrained smooth convex minimization due to their simplicity, the absence of projection step, and competitive numerical performance. While the vanilla Frank-Wolfe algorithm only ensures a worst-case rate of O(1/\epsilon), various recent results have shown that for strongly convex functions, the method can be slightly modified to achieve linear convergence. However, this still leaves a huge gap between sublinear O(1/\epsilon) convergence and linear O(\log 1/\epsilon) convergence to reach an $\epsilon$-approximate solution.
Here, we present a new variant of Conditional Gradients, that can dynamically adapt to the function’s geometric properties using restarts and thus smoothly interpolates between the sublinear and linear regimes.
• 11h15-12h15, Guillaume Obozinski (Ecole des Ponts), Structured prediction
I will make an overview of structured prediction a.k.a. structured output learning. A number of problems in statistics and machine learning go beyond regression or multi-class classification, and require to make a more structured decision in the presence of uncertainty: some example include historically image segmentation, pose estimation, and a number of task in NLP such as part of speech tagging, word alignment of sentences from a source language to a target language in machine translation; it is still relevant today for alignment of knowledge bases, for applications in computational biology such as protein function prediction, or side-chain prediction, etc. These problems are very similar in structure to problems considered in operations research, except that the cost defining the objective, which is often a function of input data, is not known and must be learned. This type of formulation could be relevant for operations research problems in the presence of uncertainty when the distribution of the random phenomena to take into account is not known. Several frameworks and formulation have been proposed for structured prediction. I will present the main ones: graphical models and structured SVMs, and discuss some of the algorithms used for learning and prediction. Time permitting I will also talk about the “learn to search” paradigm and deep learning approaches.

There are other events/working groups/seminars on related topics on campus:

• Friday October 16th, Christian Soize, Approche probabiliste de machine learning pour les grandes simulations numériques (more details)
• Thursday, November 22nd, Robert Gower, Stochastic quasi-gradient methods (more details)
• Thursday, November 22nd, Maxime Sangnier, What can a statistician expect from GANs? (more details)