Monday: Day 1 (Sept 2, 2024)
8:45 - 9:15
Registration (V107-Schlumberger)
9:00 - 9:10
Santiago VELASCO-FORERO (Mines Paris, PSL Research University)
9:10 - 10:00
Nicolaas Kuiper Honorary Professor at the Institut des Hautes Études Scientifiques, Bures-sur-Yvette, France
Abstract
The Calculus of Variations played a key role in the historical development of Mathematics, both as a stimulus to solve spectacular problems and to develop new concepts, but also as a laboratory for the creation of new geometries.
This phenomenon continues unabated to this day in many different parts of Mathematics but also of Theoretical Physics, which has been a great provider of challenges for both the Calculus of Variations and Geometry.
One of the features that has played a key role in the most advanced refinements of the Calculus of Variations is that critical situations often correspond to situations of special geometric interest, e.g. in connection with the presence of non-compact invariance (or covariance) groups. This can be connected to topological conditions, which may be obstructions to solving a variational problem.
References
[1] Jean-Pierre BOURGUIGNON, Variational Calculus, Springer, Math. Monographs, 2022.
[2] Constantin CARATHEODORY, Variationsrechnung und partielle Differentialgleichungen erster Ordnung, B. G. Teubner, Berlin, 1935; 2nd edition in English: Calculus of Variations, Chelsea, New York, 1982.
[3] Dario CORDERO-ERAUSQUIN, Bruno NAZARET, and Cédric VILLANI, A mass-transportation approach to sharp Sobolev and Gagliardo-Nirenberg inequalities, Adv. Math. 182 (2) (2004), p. 307–332.
[4] Israel M. GELFAND, Sergei V. FOMIN, Calculus of variations, Prentice-Hall, Englewood Cliffs, 1963.
[5] Jürgen JOST, Xian-qing LI-JOST, Calculus of Variations, Cambridge Univ. Press, 1999.
[6] Joseph-Louis DE LA GRANGE, Méchanique Analitique, Veuve Desaint, Paris, 1788.
[7] Cornelius LANCZOS, The variational principles of mechanics, 4th edition, University of Toronto Press, Toronto, 1970; reprinted by Dover, New York, 1986.
[8] Jürgen MOSER, A sharp form of an inequality by N. Trudinger, Indiana Univ. Math. J., 20 (1971), p. 1077–1092.
[9] Giorgio TALENTI, Best constant in Sobolev inequality, Ann. Mat. Pura Appl. (4) 110 (1976), p. 353–372.
[10] Laurence C. YOUNG, Lectures on the calculus of variations and optimal control, 2nd edition, Chelsea, New York, 1980.
10:00 - 10:30
Coffee Break (V117)
10:30 - 11:30
(1) School of E.C.E., National Technical University of Athens, Greece.
(2) Robotics Institute, Athena Research Center, Greece.
Abstract
Tropical geometry is a relatively recent field in mathematics and computer science combining elements of algebraic geometry and polyhedral geometry. It has recently been applied successfully to the analysis and extension of several classes of problems and systems in both classical machine learning and deep learning. In this talk we will first summarize a few introductory ideas and tools of tropical geometry and its underlying max-plus arithmetic. Then, we will focus on how this new set of tools can aid in the analysis, design and understanding of several classes of neural networks and other machine learning systems, including deep neural networks with piecewise-linear activations, morphological neural networks, neural network minimization, and nonlinear regression with piecewise-linear functions. Our coverage will include studying the representation power, training and pruning of these networks and regressors through the lens of tropical geometry and max-plus algebra. More information and related papers can be found at http://robotics.ntua.gr.
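As a toy illustration of the max-plus viewpoint (our own sketch, not material from the talk; all names in it are ours), the following Python snippet evaluates a tropical polynomial and checks that a ReLU unit is itself a tropical expression.

```python
# Minimal sketch (illustrative only): in max-plus arithmetic, "addition"
# is max and "multiplication" is +, so a tropical polynomial is a
# maximum of affine functions.
import numpy as np

def tropical_poly(coeffs, exponents, x):
    """Evaluate p(x) = max_i (coeffs[i] + <exponents[i], x>)."""
    return np.max(coeffs + exponents @ x)

# A ReLU unit max(w.x + b, 0) is the tropical "sum" of the monomial
# (b, w) and the constant monomial (0, 0):
w, b = np.array([2.0, -1.0]), 0.5
x = np.array([1.0, 3.0])
relu = max(w @ x + b, 0.0)
trop = tropical_poly(np.array([b, 0.0]), np.stack([w, np.zeros(2)]), x)
assert np.isclose(relu, trop)  # here both equal 0.0, since w.x + b = -0.5
```

Deeper ReLU networks compute differences of such tropical polynomials, which is the starting point of the analyses mentioned above.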
11:30 - 12:30
Abstract
Tropical convex sets arise as log-limits of parametric families of classical convex sets, or as images under a non-archimedean valuation of convex sets over ordered non-archimedean fields. Feasibility problems in tropical convex programming reduce to classes of repeated zero-sum games that are solvable in practice in a highly scalable way, although their theoretical complexity remains unsettled. This leads to effective methods for solving problems of tropical linear regression and of tropical separation (analogues of SVMs). Such problems arise in particular in auction theory, in which the response of an agent to a price offer is represented by a tropical hypersurface, and one is interested in inferring a secret utility of an agent. We will finally discuss a recent interpretation of probabilistic language models in terms of metric and tropical geometry.
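For concreteness, here is the central object in a standard notation (ours, not necessarily the talk's exact conventions): a tropical hyperplane with parameters $a \in \mathbb{R}^n$ is the set

$$ H_a = \{ x \in \mathbb{R}^n : \max_{1 \le i \le n} (a_i + x_i) \ \text{is attained at least twice} \}, $$

and tropical linear regression [2] seeks the $a$ minimizing the worst-case distance $\max_k d(x^{(k)}, H_a)$ of given data points $x^{(1)}, \dots, x^{(m)}$ to $H_a$.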
References
[1] Xavier Allamigeon, Pascal Benchimol, Stéphane Gaubert, and Michael Joswig, What Tropical Geometry Tells Us about the Complexity of Linear Programming, SIAM Rev., 63(1), 123–164, 2021.
[2] Marianne Akian, Stéphane Gaubert, Yang Qi and Omar Saadi, Tropical linear regression and mean payoff games: or, how to measure the distance to equilibria, SIAM J. Disc. Math, 37(2), 2023.
[3] Elizabeth Baldwin and Paul Klemperer, Understanding Preferences: “Demand Types”, and the Existence of Equilibrium With Indivisibilities, Econometrica, 87(3), 2019.
[4] Stéphane Gaubert and Yiannis Vlassopoulos, Directed Metric Structures arising in Large Language Models, arXiv:2405.12264
12:30 - 14:00
Lunch (not included in the workshop)
14:00 - 15:00
Abstract
Mathematical morphology is a theory of non-linear operators with solid geometrical and topological foundations. The elementary morphological operators, which form the basis for many other powerful operators, commute with extrema operators in a complete lattice. Vector-valued mathematical morphology concerns operators defined on complete lattices obtained by endowing a vector space with a complete lattice structure. They are beneficial for processing multivariate objects, like 3D audio signals and hyperspectral and color images, partially because they better capture the intercorrelation between feature channels. Although deep learning models have emerged as the cutting-edge solution for many image and signal processing tasks, mathematical morphology remains relevant: elementary morphological operators can be found, for example, in activation functions and pooling layers of current deep-learning models. In this talk, we shall revisit the role of mathematical morphology in some deep neural network architectures. Furthermore, we shall bring concepts from vector-valued morphology to the deep-learning framework. By combining these rich research fields, we aim to develop deep-learning models that naturally incorporate geometric and topological properties and can effectively capture the interrelationships between feature channels.
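As a small, hedged illustration of these elementary operators (our own sketch, not code from the talk; it assumes NumPy and SciPy are available):

```python
# Flat grayscale dilation/erosion as max/min filters, plus two echoes
# of morphology inside deep networks mentioned in the abstract.
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

img = np.random.rand(64, 64)
dil = grey_dilation(img, size=(3, 3))  # sup over a 3x3 window
ero = grey_erosion(img, size=(3, 3))   # inf over a 3x3 window

# Dilation commutes with suprema (pixelwise max), the defining
# algebraic property of elementary morphological operators:
g = np.random.rand(64, 64)
lhs = grey_dilation(np.maximum(img, g), size=(3, 3))
rhs = np.maximum(grey_dilation(img, size=(3, 3)),
                 grey_dilation(g, size=(3, 3)))
assert np.allclose(lhs, rhs)

# 2x2 max-pooling is a flat dilation followed by subsampling:
pooled = grey_dilation(img, size=(2, 2))[1::2, 1::2]
```

$\mathrm{ReLU}(x) = \max(x, 0)$ is likewise an elementary max-plus operation of the same family.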
15:00 - 16:00
Abstract
The Matheron-Maragos-Banon-Barrera (MMBB) [Matheron75, Maragos89, Banon91] representation theorems provide an astonishingly general formulation for any nonlinear operator between complete lattices, based on combinations of infima of dilations and suprema of erosions. The theory is relevant when the basis (minimal kernel) of the operators can be learnt. In the case of non-increasing or non-translation-invariant operators, the constructive decomposition of operators becomes more complex but is still based on the basic morphological dilation, erosion, anti-dilation and anti-erosion. In this talk, I will first discuss the theoretical interest of the MMBB representation theory for modeling nonlinear layers and operators in deep learning networks, and highlight its potential for proposing more general nonlinear activations and layers [Velasco-Forero22a].
Any network architecture combining convolution, down/up-sampling, ReLUs, etc. could be seen at first sight as incompatible with a lattice-theoretic formulation. In fact, as shown by Keshet [Keshet02, Keshet03], low-pass filters, decimation/interpolation, Gaussian/Laplacian pyramids and other typical image processing operators admit an interpretation as erosions and adjunctions in the framework of (semi-)lattices. In addition, max-pooling and ReLUs are just dilation operators. The notion of depth or recurrence in a network can be seen as the composition and iteration of basic operators. In the second part of the talk, I will therefore discuss a complete theoretical formulation of deep learning networks in terms of morphological operators and point out some open questions on fixed points [Velasco-Forero22b] and the study of order stability in the corresponding (semi-)lattices [Heijmans92].
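For reference, a standard statement of the increasing, translation-invariant case reads, in our notation (the talk covers the more general variants with anti-dilations and anti-erosions):

$$ \psi \;=\; \bigvee_{B \in \mathrm{Bas}(\psi)} \varepsilon_B \;=\; \bigwedge_{B \in \mathrm{Bas}(\psi^{*})} \delta_{\check{B}}, $$

where $\varepsilon_B$ and $\delta_B$ denote erosion and dilation by the structuring element $B$, $\mathrm{Bas}(\psi)$ is the basis (minimal kernel) of $\psi$, $\psi^{*}$ is the dual operator, and $\check{B}$ is the reflected structuring element.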
References
[Banon91] G.J.F. Banon, J. Barrera. Minimal representations for translation-invariant set mappings by mathematical morphology. SIAM Journal Applied Mathematics, 51(6): 1782-1798, 1991.
[Heijmans92] H.J.A.M. Heijmans, J. Serra. Convergence, continuity, and iteration in mathematical morphology. Journal of Visual Communication and Image Representation, 3(1): 84-102, 1992.
[Keshet02] R. Keshet. A Morphological View on Traditional Signal Processing. In Mathematical Morphology and its Applications to Image and Signal Processing. Computational Imaging and Vision, Vol 18. Springer, 2002.
[Keshet03] R. Keshet, H.J.A.M. Heijmans. Adjunctions in Pyramids, Curve Evolution and Scale-Spaces. International Journal of Computer Vision, 52: 139-151, 2003.
[Matheron75] G. Matheron. Random sets and integral geometry. Wiley, New York, 1975.
[Maragos89] P. Maragos. A representation theory for morphological image and signal processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6): 586-599, 1989.
[Velasco-Forero22a] S. Velasco-Forero, J. Angulo. MorphoActivation: Generalizing ReLU Activation Function by Mathematical Morphology. In: Discrete Geometry and Mathematical Morphology. DGMM 2022. Lecture Notes in Computer Science, vol 13493. Springer, 2022
[Velasco-Forero22b] S. Velasco-Forero, A. Rhim, J. Angulo. Fixed point layers for geodesic morphological operations. In: BMVC. London, United Kingdom, 2022.
16:00 - 18:00
Tuesday: Day 2 (Sept 3, 2024)
9:00 - 9:10
9:10 - 10:00
Abstract
In 1824, Sadi Carnot (1796-1832) presented the basics of thermodynamics (the 2nd principle only) in his only published book, Réflexions sur la puissance motrice du feu (hereafter Réflexions). Sadi Carnot knew Watt's works. In his Réflexions he dealt with heat machines and gas theory through: a) the caloric hypothesis mixed with a weak heat concept, b) the 4-phase cycle, and c) a proof by reductio ad absurdum (atypical for a physical science at that time). The impossibility of perpetual motion was linked to the state of a system, reversible processes and the cycle (four phases). In his unpublished Notes sur les mathématiques, la physique et autres sujets (s.d.) he made slight and indirect use of the hypothesis on puissance motrice conservation/heat-work (cf. Carnot s.d., pp. 134–135). He also provided a cycle (three phases) in his unpublished Recherche d'une formule propre à représenter la puissance motrice de la vapeur d'eau (between November 1819 and March 1827). At the beginning of the discursive part of Réflexions, and at the end of his celebrated theorem, he claimed that work can be obtained whenever there is a difference of temperature across which heat flows (cf. Pisano's works). He proposed a different manner of closing his own cycle (displayed by cylinders only) and determined (erroneously) the mathematical function of the efficiency of a heat machine. The Italian scholar Paolo Ballada (1815-1888), also called Paul de Saint–Robert or Paul comte de Saint–Robert (living in Piedmont, Italy, beyond France), wrote on technological topics, almost exclusively about thermodynamics and mechanics. Among his writings, the following three works should be noted: Principes de thermodynamique (1865); a booklet (perhaps the first) on Sadi Carnot's biography (1868); and a 2nd edition of his Principes de thermodynamique (1870), which included both Sadi's biography and others, e.g., on Mayer (based on information he likely obtained directly from the author). In my talk I present a historical account of the genesis of Sadi Carnot's thermodynamics in his Réflexions (1824), and a reading of Saint–Robert's Principes de thermodynamique (1865-1870).
References
[1] Carnot L ([1783] 1786) Essai sur les machines en général. Defay, Dijon.
[2] Carnot L (1778) Mémoire sur la théorie des machines pour concourir au prix de 1779 propose par l’Académie Royale des Sciences de Paris. The manuscript is conserved in the Archives de l’Academie des sciences, Institut de France, and consists of 85 sections in 63 folios. Sections 27–60 are reproduced. In: Gillispie (1971), Appendix B, pp. 271–296 | See also: Carnot L (1780) Mémoire sur la théorie des machines pour concourir au prix que l’Académie Royale des Sciences de Paris doit adjuger en 1781. The manuscript is dated from Béthune 15 July 1780. It is conserved in the Archives de l’Académie des sciences, Institut de France, and consists of 191 sections in 106 folios. Sections 101–160 are reproduced. In: Gillispie (1971), Appendix C, pp. 299–343.
[3] Carnot S (1878) Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance. Gauthier–Villars, Paris.
[4] Clapeyron EBP (1834) Mémoire sur la puissance motrice de la chaleur. Journal de l'École Royale Polytechnique XXIII/XIV:153–190. [Collections École polytechnique Z 5 (1834)].
[5] Gillispie CC, Pisano R (2014) Lazare and Sadi Carnot. A Scientific and Filial Relationship. 2nd edition. Springer, Dordrecht [On St Robert, see Chap. X].
[6] Pisano R, Coppersmith J, Peake M (2021) Essay on Machines in General (1786). Text, Translations and Commentaries. Lazare Carnot's Mechanics – Vol. 1. Springer, Cham.
[7] Saint–Robert P (1865) Principes de Thermodynamique. 1st edn. Cassone, Torino [2nd edn 1870: Loescher, Turin et Florence; see also the same edition printed by Gauthier–Villars, Paris. The book was also reviewed in Les Mondes. Revue Hebdomadaire des Sciences, Tome VIII, 1865, pp. 514–515. In the secondary literature, a book by Saint–Robert entitled Traité de thermodynamique seems to have circulated; its existence is doubtful since any reference, frontispiece or edition seems to be lacking].
[8] Saint–Robert P (1868) Notice biographique sur Sadi Carnot. Atti della Reale Accademia delle scienze di Torino, Cl. Sci. Fisiche e Matematiche IV:151–170.
[9] Saint–Robert P (1869) Sadi Carnot. Notice biographique par le Comte Paul de Saint–Robert. This title is only cited in: Pubblicazione del Regno. Bibliografia d'Italia. Anno III. No. 1, Gennaio 1869. Loescher, Torino e Firenze. [It would refer to Saint–Robert 1868; others also cited Classe Scienze Fisiche e Matematiche VI. It is archived at the French Academy of Sciences, from which I have a copy]; see also (1870) Note I. Sadi Carnot. In: Principes de Thermodynamique. 2nd edn. Loescher, Turin et Florence [see also the edition printed by Gauthier–Villars, Paris], pp. 431–450.
[10] Volta A (1800) On the electricity excited by the mere contact of conducting substances of different kinds. In a letter from Mr. Alexander Volta […] to the Rt. Hon. Sir Joseph Banks […]. Philosophical Transactions of the Royal Society 90 (II):403–431.
10:00 - 10:30
Coffee Break (V117)
10:30 - 11:30
Abstract:
The geometric understanding of thermodynamics was initiated by the precursors Carnot, Gibbs, Duhem, Reeb and Carathéodory. Only recently has the symplectic foliation model, introduced in the domain of geometric statistical mechanics, opened the door to a solid bedrock, giving a geometric definition of entropy as an invariant Casimir function on symplectic leaves (the coadjoint orbits of the Lie group acting on the system, interpreted as level sets of entropy).
As observed by Georges Reeb: "Thermodynamics has long accustomed mathematical physics [see DUHEM P.] to the consideration of completely integrable Pfaff forms: the elementary heat dQ [notation of thermodynamicists] representing the elementary heat yielded in an infinitesimal reversible modification is such a completely integrable form. This point does not seem to have been explored since then." The notion of foliation in thermodynamics appears in C. Carathéodory's paper, where horizontal curves roughly correspond to adiabatic processes, performed in the language of Carnot cycles. The properties of pairs of Poisson manifolds were previously explored by C. Carathéodory in 1935, under the name of "function groups, polar to each other". This seminal work of C. Carathéodory leads to the concept of a Poisson structure, which was first defined independently by Lichnerowicz and Kirillov.
A symplectic foliation model of thermodynamics has been defined by Jean-Marie Souriau, based on his "Lie groups thermodynamics" model. This model gives a cohomological characterization of entropy as an invariant Casimir function in coadjoint representation. The dual space of the Lie algebra foliates into coadjoint orbits identified with the entropy level sets. In the framework of thermodynamics, a symplectic bifoliation structure is associated to describe non-dissipative dynamics on the symplectic leaves (on the level sets of entropy, which is a constant Casimir function on each leaf), and transversal dissipative dynamics, given by the transverse Poisson structure (entropy production from leaf to leaf). The symplectic foliation orthogonal to the level sets of the moment map is the foliation determined by the Hamiltonian vector fields generated by functions on the dual Lie algebra. The orbits of a Hamiltonian action and the level sets of its moment map are polar to each other. The space of Casimir functions on a neighborhood of a point is isomorphic to the space of Casimirs for the transverse Poisson structure. Souriau's model can then be interpreted via Libermann's foliations, clarified as dual to the Poisson Gamma-structure of Haefliger, which is the maximal extension of the notion of moment in the sense of J.-M. Souriau, as introduced by P. Molino, M. Condevaux and P. Dazord in papers of the "Séminaire Sud-Rhodanien de Géométrie". The symplectic duality associates an orthogonal foliation to a symplectically complete foliation in the sense of Libermann. We conclude with a link to Cartan foliations and Edmond Fedida's work on foliations based on Cartan's moving frame.
In the first part, we will present the theme "Statistical learning on Lie groups" [1,2] which extends statistics and machine learning to Lie groups based on the theory of representations and cohomology of Lie algebra. From the work of Jean-Marie Souriau on " Lie Groups Thermodynamics" [4] initiated within the framework of symplectic models of statistical mechanics, new geometric statistical tools have been developed to define probability densities (Gibbs density of Maximum Entropy) on Lie Groups or homogeneous manifolds for supervised methods, and the extension of the Fisher metric of Information Geometry for unsupervised methods (in metric spaces).
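As a compact pointer, and up to sign conventions (our notation): for a Hamiltonian action of $G$ on a symplectic manifold $M$ with moment map $J : M \to \mathfrak{g}^{*}$, Souriau's Gibbs density of maximum entropy reads

$$ p_\beta(m) = \exp\big(\Phi(\beta) - \langle J(m), \beta \rangle\big), \qquad \Phi(\beta) = -\log \int_M e^{-\langle J(m), \beta \rangle}\, d\lambda(m), $$

with geometric (Planck) temperature $\beta \in \mathfrak{g}$; the entropy $s(Q) = \langle Q, \beta \rangle - \Phi(\beta)$, the Legendre transform of $\Phi$ at the mean moment $Q = \mathbb{E}_{p_\beta}[J]$, is the Casimir function referred to above.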
In the second part, TINNs (Thermodynamics-Informed Neural Networks) [3,5] will be discussed for AI-augmented engineering applications. The geometric structures of TINNs are studied through their metriplectic flow (also called GENERIC flow), which models non-dissipative dynamics (1st thermodynamic principle of energy conservation) and dissipative dynamics (2nd thermodynamic principle of entropy production). The thermodynamics of Lie groups of Souriau makes it possible to characterize the metriplectic flow geometrically by a structure of "webs" composed of symplectic foliations and transversely Riemannian foliations. From the symmetries of the problem, the coadjoint orbits of the Lie group generate, via the moment map (a geometrization of Noether's theorem), the symplectic foliation (defined as the level sets of entropy, where entropy is an invariant Casimir function on these symplectic leaves). The metric on the symplectic leaves is given by the Fisher metric. The dynamics along these symplectic leaves, given by the Poisson bracket, characterizes the non-dissipative dynamics. The dissipative dynamics is then given by the transverse Poisson structure and a metric flow bracket, with an evolution from leaf to leaf constrained by the production of entropy. The transverse foliation is a Riemannian foliation whose metric is given by the dual of the Fisher metric (the Hessian of entropy).
These machine-learning tools on Lie groups and TINNs are addressed in two European projects: a European network COST CaLISTA [6] and the European Marie-Curie action MSCA CaLIGOLA [7].
References:
[1] Barbaresco, F. (2022) Symplectic theory of heat and information geometry, chapter 4, Handbook of Statistics, Volume 46, Pages 107-143, Elsevier, https://doi.org/10.1016/bs.host.2022.02.003 https://www.sciencedirect.com/science/article/abs/pii/S0169716122000062
[2] Barbaresco, F. (2023). Symplectic Foliation Transverse Structure and Libermann Foliation of Heat Theory and Information Geometry. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2023. Lecture Notes in Computer Science, vol 14072. Springer, Cham. https://doi.org/10.1007/978-3-031-38299-4_17 ; https://link.springer.com/chapter/10.1007/978-3-031-38299-4_17
[3] Barbaresco F. (2022). Symplectic Foliation Structures of Non-Equilibrium Thermodynamics as Dissipation Model: Application to Metriplectic Non-Linear Lindblad Quantum Master Equation, submitted to MDPI special Issue "Geometric Structure of Thermodynamics: Theory and Applications", 2022
[4] Souriau, J.M. (1969). Structure des systèmes dynamiques. Dunod. http://www.jmsouriau.com/structure_des_systemes_dynamiques.htm
[5] Cueto E. and Chinesta F. (2022), Thermodynamics of Learning of Physical Phenomena, arXiv:2207.12749v1 [cs.LG] 26 Jul 2022
[6] European COST network CA21109 : CaLISTA - Cartan geometry, Lie, Integrable Systems, quantum group Theories for Applications ; https://site.unibo.it/calista/en
[7] European HORIZON-MSCA-2021-SE-01-01 project CaLIGOLA - Cartan geometry, Lie and representation theory, Integrable Systems, quantum Groups and quantum computing towards the understanding of the geometry of deep Learning and its Applications; https://site.unibo.it/caligola/en
11:30 - 12:30
Joint work with Q. Hernández, P. Urdeitx, C. Bermejo, I. Alfaro, D. González & F. Chinesta
Abstract
Thermodynamics can be seen as an expression of physics at a high epistemic level. As such, its potential as an inductive bias that helps machine learning procedures attain accurate and credible predictions has recently been realized in many fields. We review how thermodynamics provides helpful insights into the learning process, and we study the influence of aspects such as the scale at which a given phenomenon is to be described, the choice of relevant variables for this description, and the different techniques available for the learning process.
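As background for the bracket formulations of [6, 7], a common way to encode both thermodynamic principles at once is the metriplectic (GENERIC) form, written here in our notation:

$$ \dot{z} = L(z)\, \nabla E(z) + M(z)\, \nabla S(z), $$

with $L$ skew-symmetric, $M$ symmetric positive semi-definite, and the degeneracy conditions $L \nabla S = 0$ and $M \nabla E = 0$, which guarantee $\dot{E} = 0$ (energy conservation) and $\dot{S} \ge 0$ (entropy production). Thermodynamics-aware networks such as those of [1, 8, 9] typically parameterize $L$, $M$, $E$ and $S$ so that these constraints hold by construction.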
References
[1] Quercus Hernandez, Alberto Badías, David González, Francisco Chinesta, and Elías Cueto. Deep learning of thermodynamics-aware reduced-order models from data. Computer Methods in Applied Mechanics and Engineering, 379:113763, jun 2021.
[2] Haijun Yu, Xinyuan Tian, Weinan E, and Qianxiao Li. OnsagerNet: Learning stable and interpretable dynamics using a generalized Onsager principle. Physical Review Fluids, 6(11), 2021.
[3] Anthony Gruber, Kookjin Lee, and Nathaniel Trask. Reversible and irreversible bracket-based dynamics for deep graph neural networks. may 2023.
[4] Elias Cueto and Francisco Chinesta. Thermodynamics of Learning Physical Phenomena. Archives of Computational Methods in Engineering, 30(8):4653–4666, nov 2023.
[5] David González, Francisco Chinesta, and Elías Cueto. Learning non-markovian physics from data. Journal of Computational Physics, 428:109982, 2021.
[6] Miroslav Grmela. Bracket formulation of dissipative fluid mechanics equations. Physics Letters A, 102(8):355–358, jun 1984.
[7] Philip J. Morrison. Bracket formulation for irreversible classical fields. Physics Letters A, 100(8):423–427, feb 1984.
[8] Quercus Hernandez, Alberto Badias, Francisco Chinesta, and Elias Cueto. Thermodynamics-informed Graph Neural Networks. IEEE Transactions on Artificial Intelligence, pages 1–1, 2022.
[9] Quercus Hernández, Alberto Badías, David González, Francisco Chinesta, and Elías Cueto. Structure-preserving neural networks. Journal of Computational Physics, 426:109950, feb 2021.
14:00 - 15:00
Professor emeritus, Lab. de Math. Jean Leray
Abstract
I choose, as a starting point, the time when foliations became a topological object. This happened in 1952, when G. Reeb gave a codimension-one foliation of the 3-sphere [1, p. 112]. This point of view culminated with the work of W. Thurston [5, 6] which, surprisingly enough, reduces the question of the existence of a foliation to a question of algebraic topology, namely the homotopy class of plane fields on the manifold under consideration. Meanwhile, in 1970, A. Haefliger made accessible to everybody (sic!) a notion of singular foliations [3] (I mean that this notion probably already existed in his thesis (1958), but the thesis is very hard to read, as is the subsequent paper [2]). In contrast, the 1970 paper is very clear. I intend to explain in detail what a Haefliger structure is. A basic property, not shared by foliations, is the following: if M has such a structure, say ξ, and if f : N → M is a smooth map, the pull-back f∗ξ of ξ by f is well defined. For foliations, some transversality condition is required. This new flexibility is due to the allowed singularities. The last step I would like to discuss in my talk will consist of enriching Haefliger structures with transverse geometric structures, such as symplectic/contact structures, when possible. This type of question is discussed in my 2016 paper [4].
References
[1] G. Reeb, Sur certaines propriétés topologiques des variétés feuilletées, p. 91–158 in: Pub. Inst. Math. Univ. Strasbourg, Actual. Sci. & Indust., vol. XI, Hermann Eds, Paris, 1952.
[2] A. Haefliger, Structures feuilletées et cohomologie à valeur dans un faisceau de groupoïdes, Comment. Math. Helv. 32 (1958), 248–329.
[3] A. Haefliger, Homotopy and integrability, in: Manifolds–Amsterdam 1970 (Proc. Nuffic Summer School), Lect. Notes in Math., vol. 197, Springer, Berlin, 1971, p. 133–163.
[4] F. Laudenbach & G. Meigniez, Haefliger structures and symplectic/contact structures, Journal de l'École polytechnique 3 (2016), 1–29.
[5] W. Thurston, The theory of foliations of codimension greater than one, Comm. Math. Helv. 49 (1974), 214–231.
[6] W. Thurston, Existence of codimension-one foliations, Annals of Math. 104 (1976), 249–268.
15:00 - 16:00
Abstract:
Central objects, for this talk, are rooted in the context of bounded convex homogeneous domains. Such domains have been extensively studied by Koszul [5]. They are of interest in statistics and information geometry (including machine learning); see [1]. Some of these domains parametrize finite-dimensional manifolds of probability distributions of exponential type. The case of Wishart laws will be discussed. It is interesting to stress the link between this class of Koszul domains and the Monge-Ampère equation, which implies that certain manifolds of probability distributions and the elliptic Monge-Ampère equation are tightly related. We prove that those specific manifolds are affine and that they form a non-compact symmetric space. On the other hand, these manifolds turn out to be pre-Frobenius manifolds containing a Frobenius manifold [2, 3]. To be precise, the Frobenius manifold here corresponds to exp(a), where a is a Cartan subalgebra of a Lie algebra t. Frobenius manifolds are central in the classification of 2d topological field theories and in mirror symmetry. Finally, we relate the results above to works of J.-M. Souriau. By [4], the Frobenius statistical manifolds are symplectic manifolds, equipped with a Poisson structure. J.-M. Souriau extended the concept of a Gibbs state to a Hamiltonian action of a Lie group G on a symplectic manifold [7]. Via Souriau's approach and using Souriau's generalisation of Gibbs states [6], we consider this in the context of (pre-)Frobenius statistical symplectic manifolds.
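For reference (a standard form, in our notation, not the talk's precise setting), the elliptic Monge-Ampère equation alluded to here is

$$ \det\left( \frac{\partial^2 u}{\partial x_i\, \partial x_j} \right) = f(x), \qquad u \ \text{convex}, $$

whose convex potentials $u$ are what equip such domains with Hessian metrics $g = \nabla^2 u$.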
References
[1] Andersson, S.A., Wojnar, G.G., Wishart Distributions on Homogeneous Cones. Journal of Theoretical Probability 17, 781–818 (2004).
[2] Combe, N., On Frobenius structures in symmetric cones, 2023. arXiv:2309.04334.
[3] Combe, N. and Manin, Y.I. (2020), F-Manifolds and geometry of information. Bull. London Math. Soc., 52: 777-792.
[4] Combe, N., Combe, P., Nencka, H. (2023). Poisson Geometry of the Statistical Frobenius Manifold. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2023. Lecture Notes in Computer Science, vol 14072. Springer, Cham, pp. 165-172.
[5] Koszul, J-L, Domaines bornés homogènes et orbites de groupes de transformations affines, Bulletin de la S. M. F., tome 89 (1961), p. 515-533
[6] Barbaresco, F., Koszul Information Geometry and Souriau Geometric Temperature/Capacity of Lie Group Thermodynamics, Entropy 2014, 16, 4521-4565.
[7] Souriau, J-M. Mécanique Classique et Géométrie Symplectique; Report ref. CPT-84/PE-1695 CNRS Centre de Physique Théorique: Marseille, France, 1984. (In French)
16:00 - 16:15
Coffee Break (V117)
16:15 - 17:15
Dipartimento di Matematica "Tullio Levi-Civita", Padova, Italy
Abstract
This talk reviews the geometric structure of (super)integrable Hamiltonian systems, which is what in symplectic geometry is variously called an isotropic-coisotropic dual pair, a bifoliation, or a symplectically complete foliation. The dynamical meaning of the two objects, the isotropic and the coisotropic foliations, is elucidated and illustrated in some classical mechanical systems. Time permitting, the connection with the existence of (large) symmetry groups is explained.
References
[1] Francesco Fassò, Superintegrable Hamiltonian systems: geometry and perturbations. Acta Applicandae Mathematicae 87, 93-121 (2005).
[2] Francesco Fassò, Perturbations of Superintegrable Hamiltonian Systems. In: Meyers, R.A. (ed.) Encyclopedia of Complexity and Systems Science. Springer, Berlin, Heidelberg (2022).
17:15 - 18:15
Keio University and the Mathematical Sciences Team of RIKEN AIP
Abstract:
Most applications of Lie groups in machine learning use them to capture the inherent symmetries of the data and to build models that are invariant/equivariant with respect to such symmetries. There is, however, another (dual, if you will) approach, where the group acts on the parameters of a model instead of acting on the data. In [1] we use the Lie group to perturb distributions** on the model parameters, and efficiently find the best perturbation direction via gradient methods. We are thus able to formulate an update rule that generalizes existing learning methods as well as producing novel update rules.
More precisely, positing that the candidate parameter distributions form an orbit under a Lie group action turns out to be highly fruitful: the setup naturally hands us pathwise gradient estimators (see [2]), since we can connect perturbations of the parameters to perturbations of the density functions, and can thus estimate the gradient of an expectation using the expectation of gradients. Using this, and other structures of Lie groups such as homogeneity and exponential maps, we arrive at the Lie-group Bayesian Learning Rule.
An already established method for updating distributions is the Bayesian Learning Rule (BLR, [3]), a natural gradient descent method on an exponential family with linear updates on the natural parameters which lie in a flat space. For exponential families that also have a Lie group action, both BLR and the group update method can be applied. Using the characterization of such families in [4] (which also gives us a recipe for constructing such families) we can show that the linearization of the Lie-group update gives BLR.
The specialization of our update method to simple cases, such as the multiplicative group of positive reals, yields interesting empirical phenomena to be explored. For the multiplicative group in an image classification task, the weights end up sparse and localized, despite there being no explicit regularization forcing this.
**Distribution learning: in variational inference, the goal is to learn not a single value of a parameter but a distribution over the parameters, thus capturing the inherent stochasticity of decision-making and increasing robustness. Given a model and some data, we can form a loss function on the model parameters. Classically, one tries to minimize the loss itself. The Bayesian learning problem is an objective on the probability density functions that balances the expected loss against the negative entropy. The objective is also known as the evidence lower bound or the negative variational free energy, and has both a Bayesian and a thermodynamical interpretation [5,6]. It can also be written in terms of the KL-divergence to a Gibbs distribution (with the loss interpreted as the energy).
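A toy sketch of the flavour of the update (our own illustration inspired by [1], not the paper's algorithm): the multiplicative group of positive reals acts on a scalar parameter, the candidate distributions are log-normals (an orbit of the group), and the pathwise estimator turns the gradient of an expectation into an expectation of gradients.

```python
# Hedged toy example: pathwise gradient for a multiplicative-group
# perturbation exp(eps) acting on theta, using
# d/d eps E[loss(exp(eps) * theta)] at eps = 0  ==  E[loss'(theta) * theta].
import numpy as np

rng = np.random.default_rng(0)

def loss(theta):              # toy loss, minimized at theta = 2
    return (theta - 2.0) ** 2

def dloss(theta):
    return 2.0 * (theta - 2.0)

log_mu = 0.0                  # log-location of a log-normal over theta
for step in range(200):
    theta = np.exp(log_mu + 0.1 * rng.standard_normal(256))
    grad_eps = np.mean(dloss(theta) * theta)   # pathwise estimator
    log_mu -= 0.05 * grad_eps                  # move along the group
print(np.exp(log_mu))         # close to the minimizer 2
```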
References
[1] Kıral, E. M., Möllenhoff, T., & Khan, M. E. (2023, April). The Lie-group Bayesian Learning Rule. In International Conference on Artificial Intelligence and Statistics (pp. 3331-3352). PMLR.
[2] Mohamed, S., Rosca, M., Figurnov, M., & Mnih, A. (2020). Monte Carlo gradient estimation in machine learning. Journal of Machine Learning Research, Vol. 21, No. 132, (pp. 1-62).
[3] Khan, M. E., & Rue, H. The Bayesian Learning Rule. (2023) Journal of Machine Learning Research. Vol. 24, No. 281, (pp. 1-46).
[4] Tojo, K., & Yoshino, T. (2021). Harmonic exponential families on homogeneous spaces. Information Geometry, Vol. 4 No. 1, (pp. 215-243).
[5] Zellner, A. (1988). Optimal information processing and Bayes's theorem. The American Statistician, Vol. 42, No. 4, (pp. 278-280).
[6] Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical Review, Vol. 106, No. 4, (pp. 620-630).
Wednesday: Day 3 (Sept 4, 2024)
9:00 - 9:10
Santiago VELASCO-FORERO (Mines Paris, PSL Research University)
9:10 - 10:00
Abstract:
In this presentation, we propose a new method to perform data augmentation in a reliable way in the High Dimensional Low Sample Size (HDLSS) setting using a geometry-based variational autoencoder (VAE). Our approach combines 1) a new VAE model, whose latent space is modeled as a Riemannian manifold and which combines both Riemannian metric learning and normalizing flows, and 2) a new generation scheme which produces more meaningful samples, especially in the context of small data sets. The method is tested through a wide experimental study where its robustness to data sets, classifiers and training sample sizes is stressed. It is also validated on a medical imaging classification task on the challenging ADNI database, where a small number of 3D brain magnetic resonance images (MRIs) are considered and augmented using the proposed VAE framework. In each case, the proposed method allows for a significant and reliable gain in the classification metrics. For instance, balanced accuracy jumps from 66.3% to 74.3% for a state-of-the-art convolutional neural network classifier trained with 50 MRIs of cognitively normal (CN) and 50 Alzheimer's disease (AD) patients, and from 77.7% to 86.3% when trained with 243 CN and 210 AD, while greatly improving sensitivity and specificity.
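For orientation, here is a generic VAE skeleton with latent resampling used as augmentation (our own hedged sketch in PyTorch; the talk's model additionally learns a Riemannian latent metric and normalizing flows, which are not reproduced here):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=784, d_lat=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * d_lat))
        self.dec = nn.Sequential(nn.Linear(d_lat, 256), nn.ReLU(),
                                 nn.Linear(256, d_in))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.
        return self.dec(z), mu, logvar

    @torch.no_grad()
    def augment(self, x, n=10):
        """Generate n synthetic variants of x by resampling its latent."""
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn(n, *mu.shape) * (0.5 * logvar).exp()
        return self.dec(z)
```

The geometry-aware ingredients of the talk replace the plain Gaussian sampling in `augment` with sampling along the learned Riemannian manifold.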
10:00 - 10:30
Coffee Break (V117)
10:30 - 11:30
Abstract:
Geometry-aware deep learning has significantly influenced fields ranging from 3D computer vision to scientific research. In recent years, the research community has explored geometric algebra (a.k.a. Clifford algebra) representations for deep learning. This approach offers a strong mathematical foundation for handling complex geometric transformations. In this talk, I will discuss several works along these lines, including recent ones that explore equivariance with respect to non-Euclidean geometries as found, e.g., in relativistic space-time.
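As a one-line reminder of the core operation (standard material, in our notation): for vectors $u, v$, the geometric product decomposes as

$$ uv = u \cdot v + u \wedge v, $$

a product, invertible for non-null vectors, that combines the symmetric inner part and the antisymmetric bivector part; with a suitable metric signature it encodes rotations, reflections and Lorentz transformations algebraically.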
11:30 - 12:30
Abstract:
Equivariance can enhance the data efficiency of machine learning models by incorporating prior knowledge about a problem. Thanks to their flexibility and generality, steerable CNNs are a popular design choice for equivariant networks. By leveraging concepts from harmonic analysis, these networks model symmetries through specific constraints on their learnable weights or filters. This framework facilitates the practical implementation of a wide variety of equivariant architectures, e.g. equivariant to most Euclidean isometries, including E(3), E(2) and their subgroups.

However, unknown or imperfect symmetries can sometimes lead to overconstrained weights and suboptimal performance. This challenge has motivated the study of strategies to enforce softer priors in the models. In the second half of this talk, we will discuss a novel probabilistic approach to learning the degrees of equivariance in steerable CNNs. The method replaces the equivariance constraint on the weights with an expectation over a learnable distribution, which is computed analytically by leveraging its Fourier decomposition.
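For reference, the weight constraint alluded to above is the steerable kernel constraint (standard notation, not specific to this talk): a kernel $k : \mathbb{R}^n \to \mathbb{R}^{c_\mathrm{out} \times c_\mathrm{in}}$ maps an input field of type $\rho_\mathrm{in}$ to an output field of type $\rho_\mathrm{out}$ equivariantly iff

$$ k(g \cdot x) = \rho_\mathrm{out}(g)\, k(x)\, \rho_\mathrm{in}(g)^{-1} \qquad \text{for all } g \in G, \ x \in \mathbb{R}^n. $$

The probabilistic approach described above relaxes this hard constraint to an expectation over a learnable distribution on the group.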
12:30 - 14:00
Lunch (not included in the workshop)
14:00 - 15:00
Abstract:
In deep learning, we would like to develop principled approaches to constructing neural network architectures. One important approach involves encoding symmetries into neural network architectures using group representations, such that the learned functions are equivariant to the group. In this talk, we show how certain group equivariant neural network architectures can be built using set partition diagrams. In many cases, we can establish a category-theoretic framework both for the set partition diagrams and for the equivariant linear maps between layer spaces. We extend this framework to characterise the weight matrices that appear in neural networks that are equivariant to the automorphism group of a graph.
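A minimal worked example of the objects involved (our own, using only standard facts): for the symmetric group $S_n$ acting on $\mathbb{R}^n$, the space of equivariant linear maps is two-dimensional, spanned by the identity and the all-ones matrix, matching the two set partition diagrams of a two-element set.

```python
# Hedged sketch: an S_n-equivariant linear layer W = a*I + b*(1 1^T)
# and a numerical equivariance check.
import numpy as np

def equivariant_layer(x, a, b):
    """(a*I + b*ones) x = a*x + b*sum(x)."""
    return a * x + b * x.sum() * np.ones_like(x)

n = 5
x = np.random.rand(n)
perm = np.random.permutation(n)
lhs = equivariant_layer(x, 1.3, -0.7)[perm]   # act on the output
rhs = equivariant_layer(x[perm], 1.3, -0.7)   # act on the input
assert np.allclose(lhs, rhs)                  # equivariance holds
```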
References:
[1] Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and Equivariant Graph Networks. In International Conference on Learning Representations, 2019. URL: https://openreview.net/forum?id=Syx72jC9tm.
[2] Edward Pearce-Crump. Connecting Permutation Equivariant Neural Networks and Partition Diagrams, 2022. arXiv:2212.08648.
[3] Charles Godfrey, Michael G. Rawson, Davis Brown, and Henry Kvinge. Fast computation of permutation equivariant layers with the partition algebra. In ICLR 2023 Workshop on Physics for Machine Learning, 2023. URL: https://openreview.net/forum?id=VXwts-IZFi.
[4] Edward Pearce-Crump. Brauer’s Group Equivariant Neural Networks. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 27461–27482. PMLR, 23–29 Jul 2023. URL: https://proceedings.mlr.press/v202/pearcecrump23a.html.
[5] Edward Pearce-Crump. How Jellyfish Characterise Alternating Group Equivariant Neural Networks. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 27483–27495. PMLR, 23–29 Jul 2023. URL: https://proceedings.mlr.press/v202/pearce-crump23b.html.
[6] Edward Pearce-Crump. Categorification of Group Equivariant Neural Networks, 2023. arXiv:2304.14144.
[7] Edward Pearce-Crump and William J. Knottenbelt. Graph Automorphism Group Equivariant Neural Networks, 2023. arXiv:2307.07810. To appear at ICML 2024.
15:00 - 16:00
Abstract:
Impressive probabilistic models of complex images and physical fields are obtained with score diffusion algorithms, where the score is estimated with deep neural networks. Are they memorising or generalising? How are such models computed in high dimension without facing a curse of dimensionality?
We address both problems by showing that neural network architectures are adapted to prior information on high-dimensional multiscale geometry. In physics, the curse of dimensionality is avoided by factorising probability distributions with a renormalisation group approach. Results are shown on natural scenes, turbulence and cosmological fields.
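For orientation (standard score-based diffusion notation, not specific to this talk), samples are drawn by reversing a noising process:

$$ dx = f(x,t)\,dt + g(t)\,dw \ \ \text{(forward)}, \qquad dx = \big[f(x,t) - g(t)^2\, \nabla_x \log p_t(x)\big]\,dt + g(t)\,d\bar{w} \ \ \text{(reverse)}, $$

where the score $\nabla_x \log p_t(x)$ is the quantity estimated by the deep network.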
16:00 - 18:00
Thursday: Day 4 (Sept 5, 2024)
chaired by Maurizio Parton
9:00 - 9:10
Maurizio Parton (Ud'A, Italy)
9:10 - 10:00
Abstract:
We discuss the manifold structure of the data space for popular datasets and give some ideas from geometry and physics on how to perform dimensionality reduction, both in Deep Learning and in Geometric Deep Learning.
10:00 - 11:00
Abstract: This talk and the related research lie at the intersection of two research areas, namely artificial intelligence and infinite-dimensional geometry. We will center our attention on the notion of gauge symmetries, so crucial in gauge theories in physics but very rarely taken into account in computer vision applications. We will focus on avoiding data augmentation, which is generally used to compensate for the fact that certain symmetries inherent in a given problem have not been taken into account in the design of the neural network architecture and the choice of loss function. The classification of 2D or 3D objects modulo translations and rotations is a typical example, where translated and rotated objects must be added to the dataset to obtain good classification performance. In this case, however, the symmetries are encoded in a finite-dimensional group, the Euclidean motion group $\operatorname{SE}(2)$ or $\operatorname{SE}(3)$. Reparameterization symmetries are more problematic, for example in the case of time-dependent signals, where a delay in acquisition may depend on network traffic. In this case, the group of symmetries is infinite-dimensional, like the diffeomorphism group of temporal reparameterizations. Gauge symmetries are nevertheless omnipresent, in particular in algorithms dealing with contours in images or objects in 3D scenes, which can be compared in a meaningful way only modulo the action of the gauge group of (possibly time-dependent) reparameterizations.
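To make the last point concrete (a standard construction from shape analysis, in our notation): two parameterized contours $c_1, c_2 : S^1 \to \mathbb{R}^d$ are compared modulo the gauge group $\mathrm{Diff}(S^1)$ of reparameterizations via

$$ d([c_1], [c_2]) = \inf_{\varphi \in \mathrm{Diff}(S^1)} d\big(c_1, c_2 \circ \varphi\big), $$

for a reparameterization-invariant metric $d$, so that losses and similarity scores descend to the quotient space of unparameterized shapes.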
11:00 - 11:30
Coffee Break
11:30 - 12:30
Abstract:
In recent years, traditional "wet" biology studies have been complemented by "in silico" research, which involves machine learning analyses of large biological datasets, often referred to as "omics". Numerous approaches exist, tailored to specific datasets or designed to be broadly applicable. However, in this context, traditional deep learning techniques, particularly autoencoders, often fail to achieve the state-of-the-art performance expected of them in other fields. One reason is that the input for these models, genes, is high-dimensional and lacks a clear structure. This issue is generally addressed through handcrafted feature extraction and the use of fully connected architectures. In this work, we introduce a graph neural network approach that leverages gene network information derived from a recent statistical model for RNA co-expression. This method offers a structured way to analyze cell types and cell functionalities, potentially enhancing model performance and insights.
12:30 - 14:00
Lunch (not included in the workshop)
14:00 - 15:00
Abstract:
The message-passing paradigm has been the workhorse of deep learning on graphs for several years, making graph neural networks a big success in a wide range of applications, from particle physics to protein design. From a theoretical viewpoint, it established the link to the Weisfeiler-Lehman hierarchy, allowing one to analyse the expressive power of GNNs. We argue that the very "node-and-edge"-centric mindset of current graph deep learning schemes may hinder future progress in the field. As an alternative, we propose physics-inspired "continuous" learning models that open up a new trove of tools from the fields of differential geometry, algebraic topology, and differential equations, so far largely unexplored in graph ML.
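For readers new to the paradigm being critiqued, a minimal message-passing layer looks like this (our own sketch in plain NumPy, not code from the talk):

```python
# One round of message passing: each node aggregates (here: averages)
# its neighbours' features and combines them with its own.
import numpy as np

def message_passing(H, A, W_self, W_neigh):
    """H: (n, d) node features; A: (n, n) 0/1 adjacency matrix."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)
    msgs = (A @ H) / deg                          # aggregate step
    return np.tanh(H @ W_self + msgs @ W_neigh)   # update step

n, d = 6, 4
A = (np.random.rand(n, n) < 0.4).astype(float)
A = np.maximum(A, A.T)            # undirected
np.fill_diagonal(A, 0)            # no self-loops
H = np.random.rand(n, d)
H_next = message_passing(H, A, np.eye(d), 0.5 * np.eye(d))
```

The "continuous" alternatives advocated above replace this discrete local update with learned differential equations on the graph.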
15:00 - 15:15