I am currently a machine learning engineer at Qualtrics working on a team responsible for internal AI platforms. My team enables other teams to leverage AI effectively, and we have a variety of use cases. For example, all AI workflows within Qualtrics surveys flow through our model inference gateway, and we also enable internal science experimentation, agentic workflows through our LangGraph platform, and new business use cases!
So far, my biggest responsibilities have been the following:
1) Design and develop a new brand-based rate-limiting system that handles the high throughput we are currently seeing (10K+ LLM calls per minute) and addresses the flaws of the existing rate-limiting system (primarily over-provisioning and underutilizing capacity from our AI vendors for the models that we enable for our end users). This involved using Lua scripts within Redis to make rate-limit updates atomic (preventing race conditions when requests are processed concurrently) and constantly tracking high-usage use cases against the capacity we have on the various models we enable for our consumers.
2) Improve metadata for every team that uses our service so that we have better governance over the flow of LLM calls into our team's primary service. The challenge was that over 100 teams use our platform, and I had to coordinate communication across all of them.
3) Continue to develop and improve our team's LangGraph service so that teams can create scalable AI/agentic workflows. This goes beyond working with external LangChain engineers and internal Qualtrics engineers, as we are also exploring the future of AI platforms (such as Bedrock AgentCore) and constantly researching the emerging frontier of AI.
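
To illustrate the atomic rate-limiting idea from item 1, here is a minimal sketch of a fixed-window, per-brand limiter. It is not the production system: the fixed-window algorithm, the `ratelimit:{brand_id}` key naming, and every function and class name below are assumptions made for illustration. The Lua script runs as a single atomic step inside Redis, so two concurrent requests cannot both read the same counter and both slip under the limit. The in-memory class mirrors the same logic with a lock so the behavior can be tried without a Redis server.

```python
import threading
import time

# Lua executed atomically by Redis: check the counter, then increment it.
# KEYS[1] = per-brand counter key, ARGV[1] = limit, ARGV[2] = window in seconds.
FIXED_WINDOW_LUA = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
if current >= tonumber(ARGV[1]) then
  return 0
end
current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], tonumber(ARGV[2]))
end
return 1
"""


def make_redis_limiter(client, limit, window_seconds):
    """Return allow(brand_id) backed by a redis-py client passed in by the caller."""
    script = client.register_script(FIXED_WINDOW_LUA)

    def allow(brand_id):
        key = f"ratelimit:{brand_id}"  # hypothetical key naming
        return script(keys=[key], args=[limit, window_seconds]) == 1

    return allow


class InMemoryBrandLimiter:
    """Single-process stand-in that mirrors the Lua logic, for illustration only."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._lock = threading.Lock()
        self._windows = {}  # brand_id -> (count, expires_at)

    def allow(self, brand_id):
        # The lock plays the role of Redis running the script atomically.
        with self._lock:
            now = self.clock()
            count, expires_at = self._windows.get(
                brand_id, (0, now + self.window_seconds)
            )
            if now >= expires_at:
                # Window elapsed: start a fresh one, like the key expiring.
                count, expires_at = 0, now + self.window_seconds
            if count >= self.limit:
                return False
            self._windows[brand_id] = (count + 1, expires_at)
            return True
```

With a limit of 3 per window, the first three `allow("brand-a")` calls succeed and the fourth is rejected until the window expires; other brands are counted separately.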