Before Ph.D.

Since the beginning of my research work in 1986, I have always wished to apply computer science to a domain that I can share with specialists or master myself, namely agriculture, biology or music.

Before being a scientist, I am an engineer. For me, research in computer science is therefore not an end in itself: its value lies in applications that are useful to domain experts and end-users. This first part describes the research activities that I carried out before my Ph.D. and explains the experimental approach that should be used in natural sciences.

Agriculture

I specialized in knowledge engineering at Cognitech, the first AI services company in Europe, which developed expert systems as decision support systems for industrial domains. Their first contract, however, was with INRA in the domain of plant pathology. In 1986, they used a tool named Tigre-1, derived from Emycin, for building rule-based expert systems. It was applied to the diagnosis of plant diseases from their symptoms, thus following the medical application domain of the pioneering Mycin. The first expert system in agriculture, called TOM, helped to diagnose tomato diseases, and was followed by sixteen other diagnosis systems on wheat, maize, grapevine, peaches, etc. J. Le Renard led the team of knowledge engineers, and A. Coleno, Head of the Plant Pathology Department at INRA, coordinated the experts. As I had no experience in computer science, I worked for one year at INRA Avignon with the tomato expert D. Blancard to carry out social experiments with TOM. It was used on a Minitel in the field with technicians and market gardeners. I made a technical and socioeconomic study of the use of expert systems in the field, which was the subject of my end-of-studies engineering dissertation (Mémoire) at ISARA in 1987. It emphasized the use of drawings and pictures, within a structured questionnaire, for learning to observe symptoms and identify diseases. After the TOM experience, D. Blancard published a book on tomato diseases. It has been translated from French into English and is now a worldwide reference for diagnosis in plant pathology.

From Biology to AI

With this concrete experience in the agricultural field, I wished to develop the ideas that came from analyzing user requirements for building expert systems in Biology that are really used. So I applied for a Master of Advanced Studies in Artificial Intelligence, with ergonomic principles of knowledge acquisition in mind. The bottom-up approach has always guided my research philosophy, which is human-centered and based on the observation of practices and the description of domain objects. Being based on inductive logic, machine learning appeared better suited to the design of knowledge bases in natural sciences. Indeed, acquiring expert knowledge through elicitation techniques (interviews) is a bottleneck, and deductive rule-based expert systems could not be the right method, because in Biology the particular case is always the rule. Experts prefer describing real cases to giving rules. The only method was to iterate on the process: model knowledge bases as mock-ups, tune them as prototypes, and verify the consistency and use of the model at each stage with the end-users, i.e. expert producers on the one hand and technicians and farmers on the other.

At the same time, in 1988, computer scientists were mostly focused on numerical techniques for processing cases in data tables. With TOM, we could bring another dimension to the modeling of cases, i.e. more qualitative aspects combining numeric and symbolic representation, based on AI techniques. Biological objects are more complex to describe than human-made objects: they are structured, with different types of dependencies between components. These objects on the table need individual descriptions that cannot be represented in data tables, because the variables (in columns) are not independent of one another.

Consequently, the frame-based representation was better suited to manage these complex descriptions.
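
As an illustration only, the difference between a flat data table and a frame-based description can be sketched in Python. The `Frame` class, its fields and the example plant are hypothetical and are not taken from TOM or Tigre-1; the sketch only shows how the absence of a component makes the attributes below it inapplicable, a dependency that independent table columns cannot express.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

Value = Union[str, float, None]  # None stands for an unknown answer


@dataclass
class Frame:
    """A component of a described specimen, with attributes and sub-components."""
    name: str
    present: bool = True
    attributes: Dict[str, Value] = field(default_factory=dict)
    parts: List["Frame"] = field(default_factory=list)

    def applicable(self) -> List[str]:
        """Return the attribute paths that make sense for this description.

        An absent component contributes nothing, and neither does
        anything nested below it.
        """
        if not self.present:
            return []
        paths = [f"{self.name}.{attr}" for attr in self.attributes]
        for part in self.parts:
            paths.extend(f"{self.name}.{p}" for p in part.applicable())
        return paths


# Hypothetical tomato plant: the leaf carries a spot, the fruit is absent.
spot = Frame("spot", attributes={"colour": "brown", "diameter_mm": 4.0})
leaf = Frame("leaf", attributes={"wilting": "yes"}, parts=[spot])
fruit = Frame("fruit", present=False, attributes={"rot": None})
plant = Frame("plant", parts=[leaf, fruit])

paths = plant.applicable()
print(paths)
```

In a flat table, `fruit.rot` would still be a column that has to be filled for every case, even when the plant has no fruit; here it simply does not appear in the description.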

But decision tree algorithms such as ID3 needed some adaptation to process these structured descriptions. Moreover, the collected information about diseases is noisy: irrelevant observations, unknown responses, polymorphism of symptoms, etc. So I had the opportunity to take part in the first ESPRIT (European Strategic Program for Research and Development in Information Technology) project in Machine Learning, called INSTIL (Integration of Numeric and Symbolic Techniques in Learning), by contributing these complex data collected through a questionnaire. The partners of INSTIL were GEC Research (UK), LRI Orsay (F), and Cognitech (F).
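
To give an idea of what adapting ID3 to unknown responses can involve, here is a minimal Python sketch of one well-known option: computing the information gain on the cases where the attribute is known, then scaling it by the fraction of known cases, as in Quinlan's later C4.5. This is not the adaptation developed in INSTIL; the function names, attribute names and the four cases are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Optional

Case = Dict[str, Optional[str]]  # None marks an unknown response


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain_with_unknowns(cases: List[Case], attribute: str, target: str) -> float:
    """Information gain of `attribute`, penalised for unknown values.

    The gain is measured on cases where the attribute is known, then
    multiplied by the fraction of such cases (C4.5-style correction).
    """
    known = [c for c in cases if c.get(attribute) is not None]
    if not known:
        return 0.0
    base = entropy([c[target] for c in known])
    groups: Dict[str, List[str]] = {}
    for c in known:
        groups.setdefault(c[attribute], []).append(c[target])
    remainder = sum(len(g) / len(known) * entropy(g) for g in groups.values())
    return (len(known) / len(cases)) * (base - remainder)


# Hypothetical observations; "A" and "B" stand for two diseases.
cases = [
    {"spot_colour": "brown", "disease": "A"},
    {"spot_colour": "brown", "disease": "A"},
    {"spot_colour": "yellow", "disease": "B"},
    {"spot_colour": None, "disease": "B"},  # unknown response
]
gain = gain_with_unknowns(cases, "spot_colour", "disease")
print(round(gain, 4))
```

An attribute that is often left unanswered is thus ranked lower than one that separates the diseases equally well but is always observed.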

Interface programming

I really discovered computer programming in 1987 with the Macintosh (Mac SE) and the astonishing HyperCard, a high-level hypermedia system for human-computer interaction and pedagogy. This software was a revolution for the rest of us, because it made it possible for people who were not "serious" programmers (in C, Pascal or Lisp) to prototype the interface of applications as we imagined them. We could test the design with end-users and modify the code very quickly to get appealing results. HyperCard 2.0, in 1990, was a must for making questionnaires that integrated drawings, pictures and animations. It advantageously replaced databases, and its language, called HyperTalk, was object-oriented like Smalltalk! It embodied the original Macintosh dream of making the power of the personal computer accessible to individuals. HyperCard was real fun and a pleasure for creativity and easy-to-learn programming!

Thanks to C. Millier, head of the Informatics Department of INRA, I had the opportunity to do my civilian national service at INRA Guadeloupe as a VAT (Volontaire à l'Aide Technique). During sixteen months between 1989 and 1990, I designed a questionnaire for describing tropical tomato diseases on a Macintosh SE/30. Unfortunately, I was not able to test it in the field with Guadeloupe market gardeners, because the expert system had not stored the following disease in its knowledge base: Hurricane Hugo. All the tomato plants had disappeared! After returning to France, I mastered the HyperTalk language in order to design a questionnaire generator, called HyperQuest. Indeed, the scripting language makes it possible to generate graphical objects such as cards, fields and buttons that correspond to frame descriptions, the lists of attributes attached to them, and their values. This software sparked my desire to do a Ph.D. in knowledge acquisition for building inductive knowledge bases in natural sciences.
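
The principle of such a questionnaire generator can be sketched as follows. HyperQuest itself was written in HyperTalk; this Python version is only an illustration under assumed names (`Card`, `generate_cards`, the small `model` dictionary), mapping each frame to one card, each attribute to a question field, and each possible value to an answer button.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Card:
    """One questionnaire screen: a title, question fields, and answer buttons."""
    title: str
    fields: List[str]
    buttons: Dict[str, List[str]]


def generate_cards(model: Dict[str, Dict[str, List[str]]]) -> List[Card]:
    """Build one card per frame of the descriptive model.

    Each attribute becomes a question field, and each allowed value becomes
    a button; an "unknown" button is added so that missing answers can be
    recorded explicitly.
    """
    cards = []
    for frame_name, attributes in model.items():
        fields = [f"{frame_name}: {attr}?" for attr in attributes]
        buttons = {attr: values + ["unknown"] for attr, values in attributes.items()}
        cards.append(Card(title=frame_name, fields=fields, buttons=buttons))
    return cards


# Hypothetical descriptive model: frame -> attribute -> allowed values.
model = {
    "leaf": {"colour": ["green", "yellow"], "wilting": ["yes", "no"]},
    "fruit": {"rot": ["dry", "soft"]},
}
cards = generate_cards(model)
for card in cards:
    print(card.title, card.fields, card.buttons)
```

Because the cards are derived from the model, changing the model after a session with the expert regenerates the whole questionnaire without redrawing any screen by hand.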