Machine learning, data driven modelling

Traditionally, modelling in civil engineering is based on a good understanding of the underlying processes and uses so-called "physically-based" (or "knowledge-driven", behavioral) models. Physical principles describing the water dynamics are written down as equations (for example, the Navier-Stokes equations), transformed into a solvable form, and solved on a computer.

These could be, for example, models based on the Navier-Stokes equations that describe the behavior of water in particular circumstances. Examples are 1D surface (river) water models, 2D coastal models, groundwater models, etc. The equations are solved using finite-difference, finite-element or other schemes, and the results, normally water levels and discharges, are presented to decision makers. Knowledge-driven models can also be "social", "economic", etc. The observed data is used during model calibration. Such models are referred to as physically-based, simulation, or process models.
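To make the idea of a finite-difference scheme concrete, here is a minimal sketch in Python. It is not a Navier-Stokes solver: as a simple stand-in it solves a 1D diffusion equation, du/dt = D * d2u/dx2, with an explicit scheme. The function name `diffuse_1d`, the grid and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def diffuse_1d(u0, diffusivity, dx, dt, n_steps):
    """Explicit finite-difference solution of du/dt = D * d2u/dx2.

    The two boundary values are held fixed at their initial values.
    """
    r = diffusivity * dt / dx**2
    if r > 0.5:
        # The explicit scheme is only stable for r <= 0.5.
        raise ValueError("unstable: need diffusivity*dt/dx**2 <= 0.5")
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(n_steps):
        # Central difference in space, forward difference in time.
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u

# Hypothetical initial state: a single spike in the middle of the grid.
u0 = np.zeros(21)
u0[10] = 1.0
u = diffuse_1d(u0, diffusivity=1.0, dx=1.0, dt=0.25, n_steps=50)
print(u.round(3))
```

The spike spreads out and flattens over time, which is the behavior the equation describes. A real river or coastal model uses far more elaborate equations and schemes, but the principle of replacing derivatives with differences on a grid is the same.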

In contrast, a "data-driven" model of a system is defined as a model connecting the system state variables (input, internal and output variables) using only limited knowledge of the details of the "physical" behavior of the system. Probably the simplest data-driven model is a linear regression model.
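As an illustration, a linear regression model can be fitted in a few lines of Python. This is a minimal sketch under stated assumptions: the rainfall and discharge values are synthetic, generated from a known linear relation plus noise, and the variable names and the helper `predict_discharge` are hypothetical.

```python
import numpy as np

# Hypothetical observations: rainfall (mm/day) and river discharge (m^3/s).
# The data are synthetic: a known linear relation plus random noise.
rng = np.random.default_rng(seed=0)
rainfall = np.linspace(0.0, 50.0, 40)
discharge = 2.0 * rainfall + 10.0 + rng.normal(0.0, 1.0, size=rainfall.size)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([rainfall, np.ones_like(rainfall)])

# Ordinary least squares: minimise ||X @ coeffs - discharge||^2.
coeffs, *_ = np.linalg.lstsq(X, discharge, rcond=None)
slope, intercept = coeffs

def predict_discharge(rain_mm):
    """Predict discharge from rainfall with the fitted linear model."""
    return slope * rain_mm + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The model knows nothing about catchment hydrology. It only recovers the relation between the input and output that is present in the data.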

General knowledge of the physics is of course still needed (for the proper choice of relevant parameters), but this knowledge need not be as detailed as that required for physically-based models. "Hybrid models" combine both types of models.

Generally speaking, physically-based models are more accurate and more general. The problem is that sometimes it is not possible to build a trustworthy physically-based model. In such cases, if observation data is available, data-driven models may help. Data-driven modelling (DDM) complements simulation modelling and in some cases can replace it.

Machine learning

Machine learning (ML) is traditionally considered a part of Artificial Intelligence. It aims at building programs that improve with experience (that is, learn). The two most important problems that ML solves are:

  • classification – when an example (a point in the input space) has to be assigned to one of several classes;
  • regression (numerical prediction).
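The difference between the two problems can be sketched with one very simple learner. The example below uses a nearest-neighbour rule, chosen here only for brevity, on a small hypothetical training set. The same stored examples give a class label in one case and a number in the other.

```python
import numpy as np

# Hypothetical training set: one input variable, two kinds of target.
x_train = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])
class_labels = np.array(["low", "low", "low", "high", "high", "high"])
numeric_targets = np.array([1.1, 1.9, 3.2, 6.1, 6.8, 8.3])

def nearest_index(x_new):
    """Index of the training example closest to x_new."""
    return int(np.argmin(np.abs(x_train - x_new)))

def classify(x_new):
    """Classification: return the class of the nearest training example."""
    return str(class_labels[nearest_index(x_new)])

def regress(x_new):
    """Regression: return the numeric target of the nearest training example."""
    return float(numeric_targets[nearest_index(x_new)])

print(classify(2.4), regress(2.4))
print(classify(6.6), regress(6.6))
```

Adding more training examples changes the answers without changing the program, which is the sense in which such a program improves with experience.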

Important results were achieved in this area in the 1970s and 1980s. The researchers from Prof. Aizerman's group at the Institute of Control Problems of the Russian Academy of Sciences deserve mention here. One of the outstanding results of that group was the development by Vladimir Vapnik of statistical learning theory, currently used in techniques based on so-called support vector machines (not to be confused with vector computers!). This area has lately received a lot of attention (see Vapnik's book The Nature of Statistical Learning Theory, 1995).