Summary of research

(For more on my research, you can look at my Research Statement in the Application Documents folder and also the preprints in the Selected Papers folder.)

My research lies in Complex and Algebraic Geometry, One and Several Complex Variables, and Dynamical Systems. My work on automorphisms of the affine space C^n has also led me to the field of Computer Algebra. I am also interested in applications of these fields to other areas. More recently, I have been working on Gradient Descent methods and their applications to Deep Learning.

Several of my results on automorphism groups, dynamical degrees, primitive meromorphic maps and (uni)rationality of some threefolds were presented in Keiji Oguiso's invited talk at ICM 2014. Tien-Cuong Dinh's upcoming invited talk at ICM 2018 also mentions some of my work.

I am an enthusiast of using computing power to solve problems, whether in mathematics, in science or in everyday life. We now live in a different age, in which everyone uses computers, cellphones and the internet, and so I think that the questions we ask and the way we do research (and many other aspects of life) must change as well. Pencil and paper, I think, now need the aid of computers for real questions. Recently, I have also been doing research on Gradient Descent methods and their applications in Deep Learning, with help from research in (Random) Dynamical Systems and Geometry. My joint work arXiv:1808.05160 demonstrated the feasibility and good performance of Backtracking Gradient Descent in Deep Neural Networks, and the results therein have been vindicated by subsequent work by other authors, such as arXiv:1905.09997.

In another recent paper arXiv:2006.01512, my collaborators and I proposed a new modification of Newton's method, roughly having the following property: if the sequence {x_n}, constructed by the new method from a random initial point x_0, converges, then the limit point is a local minimum, and the rate of convergence is quadratic. The complexity of the algorithm is O(m^3) at each step, where m is the dimension.
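The following is a minimal sketch of one step of a modified Newton's method of this kind. It rests on assumptions not spelled out above and may differ in details from the algorithm in arXiv:2006.01512: the Hessian is perturbed by a multiple of the identity, delta * ||grad f||^(1+alpha), until it is invertible, and the Newton direction is then reflected on the negative eigenspace, so that saddle points become repelling. The names new_q_newton_step, alpha, deltas and tol, their default values, and the test function are all hypothetical. The eigendecomposition is what gives the O(m^3) cost per step.

```python
import numpy as np


def new_q_newton_step(grad, hess, x, alpha=1.0, deltas=None, tol=1e-12):
    """One step of a modified Newton update (illustrative sketch only)."""
    g = grad(x)
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return x  # already at a critical point
    m = x.size
    if deltas is None:
        # Hypothetical choice: m + 1 distinct shifts, so that at least one
        # of the shifted matrices is invertible in exact arithmetic.
        deltas = [float(j) for j in range(m + 1)]
    H = hess(x)
    for delta in deltas:
        A = H + delta * gnorm ** (1.0 + alpha) * np.eye(m)
        eigvals, eigvecs = np.linalg.eigh(A)  # O(m^3) operations
        if np.min(np.abs(eigvals)) > tol:
            break
    else:
        raise np.linalg.LinAlgError("no shift produced an invertible matrix")
    # Replacing each eigenvalue by its absolute value is the same as taking
    # the Newton direction A^{-1} g and flipping its sign on the negative
    # eigenspace of A.
    coeffs = eigvecs.T @ g
    w = eigvecs @ (coeffs / np.abs(eigvals))
    return x - w


def run(grad, hess, x0, steps=20):
    """Iterate the step above a fixed number of times."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = new_q_newton_step(grad, hess, x)
    return x


# Hypothetical test function f(x, y) = x^2 + y^4/4 - y^2/2:
# a saddle point at (0, 0) and local minima at (0, 1) and (0, -1).
def grad_f(z):
    return np.array([2.0 * z[0], z[1] ** 3 - z[1]])


def hess_f(z):
    return np.array([[2.0, 0.0], [0.0, 3.0 * z[1] ** 2 - 1.0]])


x_final = run(grad_f, hess_f, [0.5, 0.1])
```

Started near the saddle point (0, 0), plain Newton's method would be attracted to it, whereas the reflected direction moves the iterate away from the saddle point and towards a local minimum.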

Since I am concerned about the correctness of the proofs of claims in mathematics (in many cases, indeed in most cases I think, people either do not have the competence or the time to check, and hence simply believe the claims, in particular if the claimants are famous), I am also doing research in Automated Proof Checking. There is growing interest in applying Machine Learning techniques to Automated Proof Checking. By the Curry-Howard correspondence, roughly speaking, checking the correctness of mathematical proofs is equivalent to verifying the correctness of computer programs. Therefore, if this idea works well, it will have an enormous influence on both mathematics as a whole and daily life (given that computers and software are now ubiquitous).

My collaborators:

Nina Abarenkova (Link), Eric Bedford (Link), Cao Xuan Phuong (Link), Fabrizio Catanese (Link), Dan Coman (Link), Dang Duc Trong (Link), Dinh Ngoc Thanh (Link), Tien-Cuong Dinh (Link), Duong Minh Duc (Link), Anastasiya Dykyy (Link), Paulo Ferreira (Link), Maged Helmy (Link), Fei Hu (Link), Eric Jul (Link), Shulim Kaliman (Link), Kyounghee Kim (Link), Frank Kutzschebauch (Link), Finnur Larusson (Link), Nam Le (Link), Jean-Marie Maillard (Link), Hoang Phuong Nguyen (Link), Thu Hang Nguyen (Link), Tuan Hang Nguyen (Link), Luc Nguyen (Link), Viet-Anh Nguyen (Link), Keiji Oguiso (Link), Alain Pham Ngoc Dinh (Link), Phan Thanh Nam (Link), Tat Dat To (Link).

Some highlights of my recent works:

In recent work, using ideas from New Q-Newton's method Backtracking, I propose new versions of New Q-Newton's method Backtracking and of the Levenberg-Marquardt method, for cost functions of the form f=||F||^2, where F: R^m -> R^n. These cost functions appear in various settings, such as solving systems of equations and the least squares fitting problem in statistics. Theoretical guarantees are given concerning global convergence, rate of convergence, and avoidance of saddle points.
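For orientation, the classical (textbook) Levenberg-Marquardt step for a cost function f=||F||^2 can be written as follows. This is the standard method, not the new versions proposed in the work above. The symbols J (the n x m Jacobian matrix of F), \lambda_k (the damping parameter) and p_k (the step) are notation introduced here.

```latex
\nabla f(x) = 2\, J(x)^{T} F(x), \qquad
\nabla^2 f(x) = 2\, J(x)^{T} J(x) + 2 \sum_{i=1}^{n} F_i(x)\, \nabla^2 F_i(x),
\\
\bigl( J(x_k)^{T} J(x_k) + \lambda_k I \bigr)\, p_k = - J(x_k)^{T} F(x_k), \qquad
x_{k+1} = x_k + p_k, \qquad \lambda_k > 0 .
```

The step drops the second-order terms F_i \nabla^2 F_i from the Hessian and adds the positive shift \lambda_k I, so the linear system is always solvable.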

In recent work, I propose a family of algorithms generalising both Backtracking gradient descent and New Q-Newton's method Backtracking. Some versions have good theoretical guarantees (convergence for functions having at most countably many critical points or those satisfying the Lojasiewicz gradient inequality). Some versions have the flavour of quasi-Newton methods.

In recent work, I propose a new iterative optimization method, named New Q-Newton's method Backtracking. This method incorporates Backtracking line search into New Q-Newton's method (the latter was proposed in one of my recent joint works). The new method has, as far as I know, for Morse functions, the best theoretical guarantee for iterative optimization methods in the literature: either the constructed sequence x_n goes to infinity, or it converges to a local minimum with quadratic rate of convergence. The method is easy to implement, and works well on various small scale problems.

In recent work, my collaborator and I showed that Conjecture G_r (which is stated in a recent paper of ours) offers a simpler route than the Standard Conjectures to proving the positive characteristic analogue of Serre's well-known result on eigenvalues of polarized endomorphisms (generalized Weil's Riemann hypothesis and semisimplicity). Indeed, if the Standard Conjectures hold, then Conjecture G_r follows (for graphs of polarized endomorphisms).

In recent work, my collaborators and I applied Deep Learning techniques to monitor capillaries. This allows a more automated treatment of this research topic than is currently available in the literature.

In recent work, my collaborator and I proposed a new conjecture, called Conjecture G_r, which is a quantitative version of Standard Conjecture C. This new conjecture helps in establishing the positive characteristic analogue of Serre's well-known result on eigenvalues of polarised morphisms in characteristic zero (itself a generalisation of Weil's Riemann hypothesis). More generally, it helps in solving a conjecture, applicable to effective correspondences, which I proposed about 4 years ago. This new conjecture and Standard Conjecture D can replace various positivity notions in complex manifolds.

In recent work, I defined the Riemannian manifold versions of (Local) Backtracking GD and New Q-Newton's method, and proved corresponding results on convergence to critical points and/or avoidance of saddle points.

In recent work, my collaborators and I proposed a new modification of Newton's method which avoids saddle points while still being fast. The complexity of the algorithm (at each step) is O(m^3), where m is the dimension. We are exploring efficient ways to implement it in large-scale optimisation.

In recent work, I provide further insights into the properness of maps of the form x+(Ax)^k. As an application, I show that the approach in arXiv:2002.10249 is invalid, by constructing a counterexample (where A is a 3x3 matrix) to the main claim of that paper.

In recent work, I provide various sufficient conditions and necessary conditions for maps of the form x+(Ax)^k to be proper on the real vector space R^m. These maps include the well known Druzkowski maps x+(Ax)^3. It is known that if all Druzkowski maps are proper, then the Jacobian Conjecture holds. My paper is inspired by arXiv:2002.10249, where all maps x+(Ax)^3, satisfying the property that both x+(Ax)^3 and x-(Ax)^3 have only one zero at x=0, are asserted to be proper. If arXiv:2002.10249 is correct, then all Druzkowski maps are proper, and hence the Jacobian conjecture follows. My work shows that it is very likely for a map x+(Ax)^3 to be proper, and hence it seems that the approach of showing these maps are proper is a promising one.
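As a small worked example of the objects involved: here (Ax)^3 denotes the component-wise cube of the vector Ax, which is the usual convention for Druzkowski maps, and the 2x2 matrix A below is chosen only for illustration.

```latex
F(x) = x + (Ax)^3, \qquad
(Ax)^3 := \bigl( (Ax)_1^3, \dots, (Ax)_m^3 \bigr), \qquad
JF(x) = I + 3\, \mathrm{diag}\bigl( (Ax)_1^2, \dots, (Ax)_m^2 \bigr)\, A .
\\
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}: \qquad
F(x_1, x_2) = (x_1 + x_2^3,\; x_2), \qquad
\det JF \equiv 1, \qquad
F^{-1}(y_1, y_2) = (y_1 - y_2^3,\; y_2) .
```

In this example F has a polynomial inverse, so it is a homeomorphism of R^2 and in particular a proper map.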

In recent work, I extended my previous work on Coordinate-wise Backtracking GD to the general setting of a C^1 function. This allows the method to adapt better when partial derivatives in different directions have different sizes, as for a|x|+y, x^3 sin(1/x) + y^3 sin (1/y), or Rosenbrock's function. I also argued that the ReLU function, while having a shape similar to |x|, is better suited to numerical optimisation algorithms, and that this may be the reason why it is successful in practice as an activation function for DNN.

In recent work, I extended several of my results on Backtracking GD to the setting of Banach spaces. The results are stated in terms of convergence in the weak topology only, and are valid for functions f having the following property: Condition C: If x_n weakly converges to x and ||\nabla f(x_n)|| converges to 0, then \nabla f(x)=0. Condition C is satisfied by quadratic functions, convex functions and functions in Class (S)_+ introduced by Browder in Nonlinear PDE.

In recent work, I showed that in Backtracking GD, the learning rate \delta (x) need not be bounded, but can be such that \delta (x) ||\nabla f(x)|| converges to 0 when x converges to a critical point of f.

In recent work, I proposed a coordinate-wise version of Armijo's condition in Gradient Descent methods, for special functions of the form f(x,y)=g(x)+h(y). I then extended my previous results on convergence for Backtracking GD and its variants (including joint work with Tuan Hang Nguyen) to this setting. As an example, I constructed specific iterations for finding minima of the function f(x,y)=x^3 \sin (1/x) + y^3 \sin (1/y) which, for all initial points z_0 = (x_0,y_0) outside a pre-determined set of Lebesgue measure 0, converge, and the limit point is either (0,0), (0,y) (where y is a local minimum of g(t)=t^3 \sin (1/t)), (x,0) (where x is a local minimum of g) or a minimum of f. As far as I know, current methods by other authors cannot be applied to this (quite singular) example.
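A minimal sketch of what a coordinate-wise Armijo search can look like for the separable example f(x,y)=g(x)+g(y) with g(t)=t^3 \sin (1/t). It assumes the simplest reading, namely that the one-dimensional Armijo condition is imposed on each coordinate separately, each with its own learning rate. The function names, the parameters delta0, alpha and beta, the cap on the number of halvings, and the small-|t| guard are hypothetical choices for illustration, not the specific iterations constructed in the paper.

```python
import math

TINY = 1e-300  # guard: 1/t would overflow for smaller |t|


def g(t):
    """g(t) = t^3 sin(1/t), extended by g(0) = 0."""
    if abs(t) < TINY:
        return 0.0
    return t ** 3 * math.sin(1.0 / t)


def dg(t):
    """g'(t) = 3 t^2 sin(1/t) - t cos(1/t), with g'(0) = 0."""
    if abs(t) < TINY:
        return 0.0  # the true value is at most about |t| here
    return 3.0 * t * t * math.sin(1.0 / t) - t * math.cos(1.0 / t)


def armijo_step_1d(func, dfunc, t, delta0=1.0, alpha=0.5, beta=0.5,
                   max_halvings=60):
    """One backtracking step in a single coordinate."""
    d = dfunc(t)
    if d == 0.0:
        return t
    ft = func(t)
    delta = delta0
    for _ in range(max_halvings):
        candidate = t - delta * d
        # One-dimensional Armijo condition for this coordinate only.
        if func(candidate) - ft <= -alpha * delta * d * d:
            return candidate
        delta *= beta
    return t  # no acceptable learning rate found numerically: stay put


def coordinatewise_backtracking(x0, y0, steps=200):
    """Update each coordinate with its own learning rate."""
    x, y = x0, y0
    values = [g(x) + g(y)]
    for _ in range(steps):
        x = armijo_step_1d(g, dg, x)
        y = armijo_step_1d(g, dg, y)
        values.append(g(x) + g(y))
    return x, y, values


x_end, y_end, values = coordinatewise_backtracking(0.3, 0.2)
```

Because each coordinate satisfies its own Armijo condition, the recorded values of f never increase, even when the two partial derivatives have very different sizes.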

In recent work, I defined a continuous version of Backtracking Gradient Descent x ->H(x)= x -\delta (x) \nabla f(x) for cost functions whose gradients are locally Lipschitz continuous. If f is moreover C^2 near its generalised saddle points, then I showed that if the initial point x_0 is outside a predetermined set of Lebesgue measure 0, then the sequence x_{n+1}=H(x_n) cannot converge to a generalised saddle point, and its cluster points cannot contain an isolated saddle point. If the local Lipschitz constants vary continuously, then I defined a new discrete version of Backtracking GD, and proved a similar result for this new algorithm.

In recent joint work with Tuan Hang Nguyen, I showed that the backtracking gradient descent method is a very good method for finding critical points of a C^1 function. In particular, we showed that any cluster point of the sequence {z_n} constructed from the method is a critical point of f. Moreover, if f has at most countably many critical points (for example, if it is a Morse function), then either the sequence z_n diverges to infinity or it converges. Based on this result, it is argued that in the long run the backtracking gradient descent method will stabilise to the usual gradient descent method. We also showed that in a sense it is very rare for the cluster points of z_n to contain a saddle point. Backtracking versions of Momentum and NAG are also proposed, and convergence is proven under the same assumptions. We then proposed a new method, mixing the usual gradient descent method and (a slight modification of) the backtracking gradient descent method, which aims to combine the good properties of these two methods. Our new method provides a good automatic fine-tuning of learning rates. Experiments on the CIFAR10 and CIFAR100 datasets show that the new method is better than state-of-the-art methods in machine learning, such as MMT, NAG, Adagrad, Adadelta, RMSProp and Adam. Accompanying source code is available on GitHub: [Link]
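For readers unfamiliar with the method, here is a minimal sketch of the basic backtracking gradient descent iteration with Armijo's condition, in its textbook form. It is not the accompanying source code, and it does not include the Momentum, NAG or mixed variants. The parameter names delta0, alpha and beta, their values, and the quadratic test function are assumptions made for illustration.

```python
def backtracking_gd(f, grad, z0, delta0=1.0, alpha=0.5, beta=0.5,
                    max_iter=500, max_halvings=60):
    """Gradient descent with the learning rate found by backtracking."""
    z = list(z0)
    for _ in range(max_iter):
        gz = grad(z)
        g2 = sum(gi * gi for gi in gz)  # squared norm of the gradient
        if g2 == 0.0:
            break  # exact critical point
        fz = f(z)
        delta = delta0
        for _ in range(max_halvings):
            candidate = [zi - delta * gi for zi, gi in zip(z, gz)]
            # Armijo's condition: sufficient decrease of f.
            if f(candidate) <= fz - alpha * delta * g2:
                z = candidate
                break
            delta *= beta  # shrink the learning rate and try again
        else:
            break  # no acceptable learning rate found numerically
    return z


# Hypothetical test function: f(x, y) = x^2 + 10 y^2, minimum at (0, 0).
def f_quad(z):
    return z[0] ** 2 + 10.0 * z[1] ** 2


def grad_quad(z):
    return [2.0 * z[0], 20.0 * z[1]]


z_final = backtracking_gd(f_quad, grad_quad, [1.0, 1.0])
```

No learning rate has to be tuned by hand: the search starts from delta0 at every step and shrinks the rate only as far as Armijo's condition requires.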

In recent work, I showed that the following weaker version of the birationality problem is computable, that is, there is an explicit algorithm to solve it. Bounded Birationality Problem: Given irreducible subvarieties X,Y of P^n, where n-2 >= dim(X)=dim(Y), is there a rational map F:P^n-->P^n, whose degree is bounded by a given number d and whose restriction f=F|_X to X is a birational map onto Y? Based on this, the paper also proposes an approach toward solving the Birationality Problem: are two given varieties X and Y birationally equivalent to each other? Similar results are also proven for the case where f is a dominant rational map, regular morphism or regular isomorphism. Similar results hold for general (not necessarily projective) algebraic varieties X and Y. A rough strategy for the birationality problem, via Iitaka's fibrations, is proposed, and is checked in some simple cases.

In recent work, I applied a generalisation of measures, the so-called sub-measures, to several questions in dynamics and complex geometry, for example: pullback and pushforward of measures by meromorphic maps, and intersection of positive closed (1,1) currents. I showed, for example, that while there is no reasonable way to pull back a measure to a measure, we can pull back a measure to a sub-measure, and any meromorphic map has a non-zero invariant sub-measure.

In joint work with Finnur Larusson, I studied analogues of Oka theory in the algebraic setting. Our results illustrate that algebraic subellipticity seems to be a good analogue, while the analogues of some other properties from the holomorphic setting, such as the interpolation property, are not interesting because they are not satisfied by any complete algebraic manifold (not necessarily projective) of positive dimension.

In recent work with Shulim Kaliman and Frank Kutzschebauch, I extended my previous work with Finnur Larusson to blowups of algebraic manifolds which, up to cross product with an affine space C^N, are locally flexible. The latter are manifolds X with a lot of automorphisms in a certain sense. More precisely, at any point x of X, there are groups G isomorphic to C acting on X, so that the tangent vectors at x to the orbits G.x generate the tangent space T_xX. These manifolds are intensively studied in affine algebraic geometry. A fundamental result by H. Flenner, S. Kaliman and M. Zaidenberg says that the complement of a subvariety of codimension at least 2 in a flexible manifold is again flexible.

Related to the two questions in the paragraph below is the following. Question 3: Can dynamical degrees be computed in terms of l-adic cohomology groups? Recently, Esnault and Srinivas showed that the answer to this question is affirmative for automorphisms of surfaces. Using consequences of Deligne's proof of Weil's Riemann hypothesis, I extended their result to a large class of endomorphisms of surfaces. I also answered affirmatively a question of theirs about whether two different notions of entropy, one defined using algebraic cycles and the other defined using l-adic cohomology groups, coincide. If Standard Conjecture D (numerical vs homological equivalence) holds, more general results are proven for rational maps and correspondences. I also showed that if some properties of dynamical degrees, known over the field of complex numbers C, extend to positive characteristic, then simple new proofs of Weil's Riemann hypothesis follow.

I proposed to use ideas from etale topology to resolve two seemingly unrelated questions. Question 1: What should be the topological entropy of a dynamical system (f,X,\Omega ), where (X,\Omega ) is non-compact? Question 2: What should be the topological entropy of a rational map, over a field of arbitrary characteristic, so that it is as closely related as possible to the pullback of (iterates of) the map on cohomology groups, a la the Entropy Conjecture and the Gromov-Yomdin theorem for holomorphic maps on compact Kahler manifolds? The main idea is that, for each map f, we also study its etale covers (which in general are not rational maps, only correspondences). The etale topological entropy is then the maximum of the topological entropies of all such covers. The (etale) topological entropy so defined may help in answering the puzzle of why the Gromov-Yomdin theorem does not hold in non-Archimedean dynamics, even in dimension 1.
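For reference, the Gromov-Yomdin theorem mentioned here can be stated as follows for a surjective holomorphic selfmap f of a compact Kahler manifold X. The notation \lambda_p(f) for the p-th dynamical degree and h_top(f) for the topological entropy is introduced here for illustration.

```latex
\lambda_p(f) = \lim_{n \to \infty}
  \bigl\| (f^n)^* : H^{p,p}(X) \to H^{p,p}(X) \bigr\|^{1/n},
  \qquad 0 \le p \le \dim X,
\\
h_{\mathrm{top}}(f) = \max_{0 \le p \le \dim X} \log \lambda_p(f) .
```

Gromov proved the upper bound on the entropy and Yomdin the lower bound. Together they say that the topological entropy is determined by the action of f on the cohomology groups.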

With Finnur Larusson, I proved that the blowup of the affine space along a smooth algebraic submanifold is algebraically subelliptic (as defined by M. Gromov), and hence is Oka. Our result may be viewed as a positive step toward resolving the question of whether the Oka property is invariant under birational transformations. As an application, it follows that if D is a smooth connected algebraic curve in C^n, then it is a holomorphically hypersurface retract. This consequence is related to the well-known open problem of whether every open Riemann surface can be embedded as a closed complex curve in C^2.

I extended the results of Tien Cuong Dinh and Nessim Sibony on dynamical degrees for correspondences and of Tien Cuong Dinh, Viet Anh Nguyen and myself on relative dynamical degrees for meromorphic selfmaps on compact Kahler manifolds to the setting of arbitrary varieties and fields of arbitrary characteristic. The main idea is to make use of de Jong's alterations, Roberts' version of Chow's moving lemma and semi-conjugacies of correspondences.

I obtained a new reduction of the Jacobian conjecture, which in particular shows that there are equivalent formulations of the Jacobian conjecture which hold with probability 100 percent. This gives more support to the truth of the conjecture. The work provided a fast method to check whether a given Druzkowski map satisfies the Jacobian conjecture, and proposed some approaches toward the resolution of the Jacobian conjecture with the help of Computer Algebra. It also gives a simpler proof of the fact that the Jacobian conjecture holds for symmetric Druzkowski maps.

I showed that an approach by J. H. Sampson toward the Hodge conjecture on Abelian varieties does not work. However, a revised version of it may work.

With Tien Cuong Dinh and Viet Anh Nguyen, I obtained results concerning dynamical degrees which can be used to show that some rational or meromorphic maps cannot have invariant fibrations over a nontrivial base (i.e. not a point or a manifold of the same dimension as the original one). We also proved that if the topological degree (i.e. the number of inverse images of a generic point) of a given meromorphic selfmap is large enough in a certain sense, then the map has good dynamical properties. We also obtained quite satisfactory results on the growth of the number of periodic points of meromorphic selfmaps of surfaces.

With Keiji Oguiso, I obtained a complete characterisation of complex 3-tori having interesting automorphisms. With Keiji Oguiso and subsequently with Fabrizio Catanese and Keiji Oguiso, I showed that some quotients of complex tori are (uni)rational. These gave the first two, and currently the only, examples of rational threefolds with interesting automorphisms. Our work was used by Colliot-Thelene to solve a question posed in 1975 by Kenji Ueno.

With Dan Coman, I showed that some upper level Lelong sets of positive closed currents on P^n and compact Kahler manifolds must be contained in some subvarieties with certain restricted geometric properties.

My work in dynamics led me to state the following conjecture:

Conjecture: a) There is no (finite composition of) smooth blowup(s) of a manifold of dimension three or higher and of Picard number 1 having an automorphism of positive entropy. b) There is no (finite composition of) smooth blowup(s) of a Fano manifold of dimension three or higher having a primitive (i.e. without invariant fibrations over a base which is not a point or a manifold of the same dimension as the original manifold) automorphism of positive entropy.

A geometric consequence of part b) of this conjecture is: The minimal resolution of singularities of Ueno's example, while being rational, cannot be obtained as a blowup of P^3 or P^2xP^1 or P^1xP^1xP^1.

Part b) of this conjecture has recently been confirmed for 3-folds by John Lesieutre. For the claim concerning Ueno's example, a different proof was also given by Cinzia Bisi, Paolo Casini and Luca Tasin.