Research and Consulting Gigs
June 2016
Topic: Analytics and Visualization on Time Series Data
Client: A major software provider building applications on top of a real-time data streaming system
Summary: I shared my experience working with time-series data (e.g., sensor and cybersecurity data), interactive data exploration, and building research tools based on these technologies in finance, oil exploration, and health. Although I have worked with teams building streaming applications, my personal work has focused on using extracts of data to find patterns, which can then be tested and implemented on larger, real-time systems. In this gig, I noted that commercial time-series applications have very simplistic interfaces, which either show multiple variables as a function of time or display simple dashboards combining line charts with pie charts and geographical plots (a la Tableau), and are weak on analytics. I focused my discussion on the integration of analysis and visualization tasks, such as finding patterns and insights in data, looking for relationships that are conditional on other factors (interaction effects), developing hypotheses about the underlying phenomena, and testing these hypotheses. Alternatively, the goal may be to find features in the data and then use these patterns to search for or predict other occurrences. I discussed the importance of having interactive tools that provide mathematical operations, e.g., to identify cross-correlations, compute principal components, and derive mathematical descriptors for operations, which can provide a more quantitative set of indicators. The choice of time scale and of particular methods depends on the types of features being sought and on the task at hand. A feature might also be a combination of factors that differs across time scales or across operational conditions. The skill in analysis lies in understanding the deeper question and using it to guide the selection of exploration methods that extract relevant features.
I described an example of work I did with a downstream oil refinery, in which we were able to identify a boiler that was driving excess pollutants into the atmosphere. To do so, we treated the data stream not just as a set of numbers, but as a set of parameters reflecting the performance of different operations within the plant. Because pollutant output was the factor the refinery wanted to reduce, we examined the functional sensor outputs from different components with respect to pollution output, which allowed us to quickly focus in on a particular boiler. Without a hypothesis to guide the exploration, the result would simply have been a jumble of correlations and relationships with no operational value. The key, then, is to drive the analysis process with human analysts skilled in visual analytics and visualization. Visualization is a key enabler for human problem solving, and, in conjunction with tools to dynamically create new functions and combined variables, can make for a very powerful user application.
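The hypothesis-driven screening described above can be illustrated with a minimal sketch. It is not the refinery analysis itself: the data are synthetic, and the channel names (`boiler_a`, `boiler_b`, `pump`) and the functions `lagged_correlation` and `rank_channels` are hypothetical. It assumes one target variable (pollutant output) and ranks each sensor channel by its strongest lagged Pearson correlation with that target.

```python
import numpy as np


def lagged_correlation(x, y, max_lag):
    """Return (lag, r) for the lag in 0..max_lag at which x, leading y
    by `lag` samples, has the largest absolute Pearson correlation."""
    best_lag, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # earlier samples of the sensor
        b = y[lag:]             # later samples of the target
        r = np.corrcoef(a, b)[0, 1]
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, r
    return best_lag, best_r


def rank_channels(channels, target, max_lag=10):
    """Rank sensor channels by strength of lagged correlation with target."""
    scores = {
        name: lagged_correlation(series, target, max_lag)
        for name, series in channels.items()
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1][1]), reverse=True)


# Synthetic example (assumption): the target follows boiler_b with a
# 3-sample delay plus noise; the other channels are unrelated noise.
rng = np.random.default_rng(0)
n = 500
channels = {
    "boiler_a": rng.normal(size=n),
    "boiler_b": rng.normal(size=n),
    "pump": rng.normal(size=n),
}
target = 0.3 * rng.normal(size=n)
target[3:] += 0.9 * channels["boiler_b"][:-3]

ranking = rank_channels(channels, target)
for name, (lag, r) in ranking:
    print(f"{name}: lag={lag}, r={r:.2f}")
```

Fixing the target variable first is the design choice that matters here: it turns an all-against-all correlation matrix into a short ranked list that an analyst can inspect visually and then test.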
June 2016
Topic: Augmented Reality: Object Detection and Tracking
Client: An international application services provider
Summary: This company is interested in building augmented reality tools for in-the-field repair of electronic systems, using machine learning algorithms to identify objects. My role in this gig was to shift interest to the human observer who is using the system, as a precursor to selecting machine-learning algorithms or designing the system. The guiding question was: "What are the features that are important for the human doing the task?" For every environment, there is a seemingly infinite number of parameters, and the best way to identify the few that are critical is to involve the human observer directly, not just to "train" the system, but to constrain the search problem. On the feature description side, I shared with them work I had done on image analysis and search, in which images were retrieved from a database based on semantic features. These semantic features were derived from experiments with human observers and then expressed as computer algorithms. This work produced 5 papers and 2 issued US patents, including:
Mojsilovic, A., Rogowitz, B.E. and Gomes, J. “System and method for measuring image similarity based on semantic meaning, Part I,” US Patent 7,478,091 (2006)
Mojsilovic, A., Rogowitz, B.E. and Gomes, J. “System and method for measuring image similarity based on semantic meaning, Part II,” US Patent 7,043,474 (2006)
Also, in a service task, the user may not solve the problem on the first attempt. One concept I introduced was a rule-based system that provides a set of interactive instructions to the user as he/she performs the task, treating each action as a kind of metadata that constrains the choices available for other operations. This idea is based on a patent:
Rogowitz, B.E., Treinish, L., and Rabenhorst, D. "Interactive rule-based system with selection feedback that parameterizes rules to constrain choices for multiple operations," US Patent 5,894,955 (1999)