We assume that the reader has a working familiarity with real analysis, as in, say, this book by John Hunter and Bruno Nachtergaele, and some familiarity with basic learning theory, as in, say, these notes by Lorenzo Rosasco. What follows is a list of what I think are the most immediate references that a seriously motivated reader can get started with - the reader is encouraged to choose their most comfortable combination of resources for each of the 1+4 groups below. (I also keep this file (link) listing the best freely available expository references that I have found on topics of my interest, but I have stopped updating it.)
Note: Though the very beautiful lecture notes linked above are almost entirely self-contained, they might still be hard to follow without some familiarity with the kind of content covered in one of the "Mathematics of Machine Learning" courses listed below.
High-Dimensional Statistics/Geometrical Functional Analysis (Introduction)
Various "Mathematics of Machine Learning" courses:
by Afonso Bandeira, by Philippe Rigollet, by Yuxin Chen
High-Dimensional Statistics/Geometrical Functional Analysis (Details)
Lecture notes by Bodhisattva Sen on Non-Parametric Statistics
Lecture notes by Larry Wasserman ("Intermediate Statistics")
Learning Theory
Lecture notes by Francis Bach and his book
Personally, I am hugely indebted to the lecture notes of Sham Kakade and Ambuj Tewari for getting me started - I have hardly seen such a beautiful cruise directly into the core concepts - and fast! Roi Livni's notes are a very beautiful path through the subject, and their initial parts cover a lot of material that is not covered in the other sources mentioned above. Francis Bach's lectures, towards the end, cover some very modern topics which aren't covered in the rest of the references given here.
Continuous Optimization Theory
3. Lecture notes by Geoff Gordon and Ryan Tibshirani
4. Lecture notes by Robert Freund on constrained non-linear optimization
5. Lecture notes by Francis Bach
6. Lecture notes by Yuxin Chen
Sebastian Bubeck's above lectures are possibly the best first tour of the subject that I have seen yet!
---------------------------------------------------------------------------------------------------------------------
Niche Techniques
a.
One of the most succinct introductions to P.D.E.s is this set of two courses at Stanford, Math 220A and Math 220B,
and also see 18.152 and 18.303 at MIT.
(These lectures by Evy Kersale seem to be a more beginner-friendly introduction to P.D.E.s)
For what's often needed in research, see these P.D.E. notes:
by Gerald Teschl, by John Hunter, by Gustav Holzegel, by Lenya Ryzhik (general), by Lenya Ryzhik (fluids)
b.
For O.D.E.s, see these comprehensive lectures by Christopher P. Grant
(For a more beginner-friendly approach, see the lectures by Simon J Malham)
c.
A specialized book on measure concentration by Maxim Raginsky and Igal Sason
d.
Lecture notes by Philip Clement on gradient flows (based on the book by Luigi Ambrosio, Nicola Gigli, Giuseppe Savaré)
e.
Lecture notes by Bodhisattva Sen on empirical processes
f.
Lecture notes by Bruce Hajek on random processes
g.
h.
Lecture notes by John Thickstun on generative models
i.
Lecture notes by Zico Kolter, David Duvenaud, and Matt Johnson on deep implicit layers.