About Me:

In August 2021, I entered the mathematics PhD program at Texas A&M, where I'm supported by a university merit fellowship. A few points of order:

Basically, my goal is to study machine learning from a theoretical and mathematical perspective. The holy grail would be to "do for machine learning what Kolmogorov did for statistics," though, of course, that's not a realistic goal as stated. For more, see "About My Research" above.

I'm increasingly prioritizing software development, as I anticipate transitioning into industry after my PhD. I'll always have much to learn, but my foundations in real analysis and probability theory are very solid, and I'm adequate with the standard tools from statistics and computer science, both of which my math background helps me pick up quickly. I also have a degree in economics and some work experience in business and agriculture. At networking events, I like to ask people, "Do you want to predict or estimate anything?" They obviously respond "yes," and I say, "Well, then I can help you."

On a personal level, all I'll say here is that I'm a super conversational person. Say hi!

About Data Science:

I'm not an authority on the matter, nor am I an expert (although I'm studying to be one), but these are my unofficial impressions, which officially reflect no one's views but my own. Please pardon the inevitable generalizations. Most people who say they "do data science" are either developing algorithms or implementing them. However, mostly in academia, a third group is trying to understand why existing algorithms work (as a mathematician would say, to "establish error bounds" or "prove convergence").

Our approach to data science has been much like our ancestors' approach to fire. We've discovered its usefulness before discovering why it works. So far, what can be said within the bounds of theoretical credibility is very limited. Drawing a conclusion based on the prediction of a neural network would, arguably, be like inferring causation based on correlation. A few problems with this are the following:

The good news is that, at bare minimum, data science is already a powerful brainstorming tool: it can detect subtle patterns, which a specialist can then investigate. This is already safe and useful, provided we rely only on the expert to draw conclusions and not on the data science algorithm itself.

The bad news is that data science (usually) cannot "safely" be used for much beyond brainstorming. My area of research is the aforementioned third, neglected area: trying to improve our understanding of what data science algorithms really "tell us." Historically, the short answer is "not much." No wonder this pessimistic area of research has been neglected! However, many of us are still interested in trying to discover inherent meaning in existing data science algorithms and, failing that, in inventing "better" data science tools: ones that do have inherent meaning.