An Intuitive Exploration of Artificial Intelligence: Theory and Applications of Deep Learning, Springer, 2021.
DOI: 10.1007/978-3-030-68624-6.
Links:
Springer Link (where the front matter and back matter can be downloaded; the eBook can be downloaded instantly)
Google Books (with some preview)
Excerpt from the book: see the arXiv article Expressive Power and Loss Surfaces of Deep Learning Models, arXiv:2108.03579 [cs.LG], 2021.
Author's Note for Instructors: If you are teaching a course, solutions to all the end-of-chapter exercises are available from Springer.
Author's Note for Readers:
Figure 4.2: The figure is meant as a general illustration of the carving mechanism in a general N-dimensional input space. In an actual 2-D case the carving is simpler: on certain convex polytopes the function may be identically zero, e.g., for the (I, I) activation pattern, which disconnects the graph, and for some neurons the carving lines may be parallel to the sides of a polytope. See the simpler carving below.
Section 4.7: In actual capsule networks and self-attention networks, there are other non-linear operations, such as SoftMax, in the middle of the network. The mathematical analysis in this section intentionally ignores these additional non-linearities for simplicity, though it generalizes easily to arbitrary functions and their level sets. In that case the carving of the manifold is more complex and the fitted functions are more general. See the arXiv article above.
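To make the level-set remark concrete, here is a brief sketch in my own notation (not the book's) of how the half-space picture generalizes:

    % For a ReLU neuron with weights w and bias b, the carving boundary
    % in input space is a hyperplane (the zero level set of a linear map):
    \[
      \{\, x : w^{\top} x + b = 0 \,\}.
    \]
    % Replacing the linear pre-activation by an arbitrary function g,
    % e.g. one involving SoftMax, the boundary becomes the level set
    \[
      \{\, x : g(x) = c \,\},
    \]
    % which is in general a curved hypersurface, so the carved regions
    % are no longer convex polytopes.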
Section 4.7: There is a typo on page 79: "Since y1" should read "Since each component of y1".
Section 4.9.5: The expectation of the random variable indicating when SGD gets stuck in a local minimum has the lower bound exp(kn^2); taking the reciprocal of an upper bound on the per-trial probability gives a lower bound on the expected waiting time, as sketched below.
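A minimal worked version of the argument, under the simplifying assumption that trials are independent and each gets stuck with the same probability p (the exact constants and setup in the book may differ):

    % If each independent trial gets stuck with probability p, the
    % number of trials T until the first such event is geometric, so
    \[
      \mathbb{E}[T] = \frac{1}{p}.
    \]
    % An upper bound on the probability therefore yields a lower
    % bound on the expectation:
    \[
      p \le e^{-kn^2} \quad\Longrightarrow\quad \mathbb{E}[T] \ge e^{kn^2}.
    \]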
[Figure: simpler carving by a 2-D FFN; see Figure 4.2 in the book]
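For readers who want to see such a carving numerically, here is a minimal Python sketch (my own illustration, not code from the book); the layer sizes, random weights, and grid are arbitrary choices:

    # Color each point of a 2-D grid by the ReLU activation pattern of a
    # small random feed-forward net; each distinct pattern corresponds to
    # one convex polytope of the carving.
    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)   # layer 1: 2 -> 3
    W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=2)   # layer 2: 3 -> 2

    def activation_pattern(x):
        """Return the on/off pattern of every ReLU in the net at input x."""
        h1 = W1 @ x + b1
        h2 = W2 @ np.maximum(h1, 0.0) + b2
        return tuple((np.concatenate([h1, h2]) > 0).astype(int))

    # Enumerate the patterns that actually occur on a grid over [-3, 3]^2.
    xs = np.linspace(-3, 3, 200)
    patterns = {activation_pattern(np.array([x, y])) for x in xs for y in xs}
    print(len(patterns), "linear regions found on the grid")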
Experience in Industry
I have been a specialist in the fields of Computer Vision, AI, and Machine Learning, and most of my career has been in industry, where I have worked in diverse application areas. Though the applications have been diverse, the underlying technology and ideas have been remarkably unifying, as is the way innovation happens across different problem domains.
Mathematical Beauty of the World
Natural objects exhibit a hierarchical structure: objects are made of parts, which are made of sub-parts, and so forth, and a scene is made of many objects. This idea is actually more universal. Complexity arises when parts interact with each other and build larger systems. Consider natural language: a text, say Homer's Odyssey, has a high-level plot and setting, characters and themes at different levels, sub-plots, and interrelationships between them. In fact, current AI models based on deep neural networks exploit this hierarchical nature to some extent for the inverse process of object recognition and text understanding. Combining this parts-whole concept with fractals in a graph-directed construction with some randomization leads to some pretty pictures; a sketch of such a construction appears below.
Hai Le Xuan, one of the students I guided on this idea in the early-to-mid 1990s, who is now Chairman & CEO of VietSoftware International, created these beautiful images in color.
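For the curious, here is a minimal Python sketch of a randomized graph-directed fractal construction via the chaos game (my illustration; the graph, maps, and randomization are arbitrary choices, not the ones used for those images):

    # Each node of a small directed graph carries affine contraction maps
    # on its outgoing edges; the chaos game follows a random edge at every
    # step and records the visited points, which approximate the attractor.
    import random

    # Directed graph: node -> list of (next_node, affine map).
    # A map (a, b, c, d, e, f) sends (x, y) to (a*x + b*y + e, c*x + d*y + f).
    GRAPH = {
        "A": [("A", (0.5, 0.0, 0.0, 0.5, 0.0, 0.0)),
              ("B", (0.5, 0.0, 0.0, 0.5, 0.5, 0.0))],
        "B": [("A", (0.5, 0.0, 0.0, 0.5, 0.0, 0.5)),
              ("B", (0.4, -0.3, 0.3, 0.4, 0.25, 0.25))],
    }

    def chaos_game(steps=50_000, seed=1):
        random.seed(seed)
        node, x, y = "A", 0.0, 0.0
        points = []
        for i in range(steps):
            node, (a, b, c, d, e, f) = random.choice(GRAPH[node])
            x, y = a * x + b * y + e, c * x + d * y + f
            if i > 100:                 # discard the initial transient
                points.append((x, y))
        return points

    pts = chaos_game()
    print(len(pts), "points of the attractor, e.g.", pts[0])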
Students
Besides Hai, who generated the images shown above, I have supervised two PhD students: Daryl received his PhD from UNSW and Li from the University of Iowa, in the fields of Image Generation and Image Processing. All three are doing great in their respective careers.
Publications
One of my favorite papers among those I have published, in which Statistics played a major role, can be downloaded as a PDF here and viewed online here.
Another paper I greatly enjoyed writing shows that certain problems in fractal geometry are undecidable (Complex Systems journal, Vol. 7, No. 6). For example, no algorithm exists to decide whether a fractal set intersects a line segment (see the figure for an illustration). In fact, there is a nice link between Turing machines and fractals which leads to this result.
Experience in Academia (2015-16)
After two decades of working in industry, primarily in Silicon Valley, I spent one year helping build a world-class Data Science and Statistics program at New College of Florida, the honors college of Florida.
Here are my notes on my teaching experience.
Notes from Fall 2015
I taught two courses:
"Statistical Inference" - Graduate Level Course
"Dealing with Data" - Undergraduate Level Course
My teaching philosophy was to flip the classroom, having students learn actively after they leave the classroom, and to make the courses more interesting by having them work on real datasets they can relate to.
Recommended books for the "Dealing with Data" undergraduate course:
Uncharted, Aiden and Michel. Riverhead, 2013.
Super Crunchers, Ayres. Bantam, 2007.
Statistics, Freedman, Pisani and Purves. Norton, 2007.
Data Computing, Kaplan. Project MOSAIC, forthcoming, 2015.
Notes from Spring 2016
In Spring 2016, I was involved in three courses: Machine Learning, Computer Vision, and Statistical Inference.
In Computer Vision, my topics were primarily around Deep Learning / AI. Topics covered:
Image Classification using Deep Learning
Deep Neural Networks as Feature Extractors for Computer Vision Problems
Why does Deep Learning work in Computer Vision?
I taught an introductory Machine Learning course to a mix of graduate and undergraduate students, covering topics such as Linear Regression, k-NN, the Bayes Classifier, LDA, QDA, CART, SVM, Neural Networks, and Deep Learning. We used the following references:
An Introduction to Statistical Learning: With Applications in R, by G. James, D. Witten, T. Hastie and R. Tibshirani, Springer Texts in Statistics, 2013
The Elements of Statistical Learning: Data Mining, Inference and Prediction, by T. Hastie, R. Tibshirani and J. Friedman, Springer Series in Statistics, 2011
Research papers in Machine Learning