Statistical inference is the branch of applied mathematics that we need when we want to learn something from data that have been collected experimentally. My own research on statistical inference, published in several papers, is summarized in the BayesianPoissonBasedInference twiki page (view the PDF at the bottom of this page), which may be a bit too technical for a general reader. Here, therefore, I provide a very simple introduction to the field.
Measurements are at the heart of the scientific method. They are necessary to identify phenomena, to isolate them from any source of disturbance, and to verify that our understanding of a system is correct by testing our predictions. The measurement of a constant quantity, in general, does not give a unique precise value. Instead, by repeating the measurement multiple times under identical conditions we obtain different results. Hopefully their spread is small; sometimes the instrumental resolution is so coarse that we always get the same result, but this is never the case with high-precision measurements, in which a number of possible interfering effects cause random fluctuations (and sometimes also systematic deviations, due to our imperfect knowledge of some aspect of the apparatus).
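As a toy illustration (my own sketch, with made-up numbers: true value 5.0, resolution 0.3), the following Python snippet simulates twenty repeated measurements of a constant quantity affected by purely random Gaussian fluctuations; each run gives a different list of results scattered around the true value.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

true_value = 5.0   # the constant quantity (unknown in real life; made-up number)
resolution = 0.3   # size of the random fluctuations (made-up number)
n_repeats = 20     # identically repeated measurements

# Each result is the true value plus a random fluctuation
measurements = true_value + resolution * rng.standard_normal(n_repeats)

print("individual results:", np.round(measurements, 2))
print(f"mean = {measurements.mean():.2f}, spread (std. dev.) = {measurements.std(ddof=1):.2f}")
```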
If the measurement result is not unique, what is the true value? We can only estimate it, using statistical inference procedures. There are two families of approaches to statistical inference, which differ in the definition of probability and in the interpretation of the result, although the two approaches very often give almost the same numerical output, especially when there is a large number of measurement results. In the frequentist approach, one always assumes that it is possible to repeat the same experiment identically many times (at least in principle; in practice this is seldom the case). Relying on the law of large numbers, the probability of an event is defined as its long-run frequency of occurrence. Moreover, the true value is taken as unknown and fixed, and statistical inference deals with the fluctuations of the measurement results in terms of coverage. This term quantifies the fraction of an ideally infinite sequence of identically repeated experiments whose result (a range of values) contains the true, but unknown, value. For example, if the result of the inference is that the quantity we are interested in lies in the range 5.5-7.1 with 95% confidence level, this means that we expect 95% of the identically repeated experiments to give a range which contains the true value (although in general that range will differ from 5.5-7.1), while 5% would give a range which does not contain it.
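The notion of coverage is easy to check numerically. The sketch below (again with made-up numbers: true value 6.0, known Gaussian resolution 1.0, ten measurements per experiment) repeats the same experiment many times, builds the standard 95% confidence interval for a Gaussian mean in each one, and counts how often the interval contains the true value; the empirical fraction comes out close to 0.95.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

true_value = 6.0         # fixed but "unknown" quantity (made-up number)
sigma = 1.0              # known Gaussian resolution (made-up number)
n_meas = 10              # measurements per experiment
n_experiments = 100_000  # identically repeated experiments

half_width = 1.96 * sigma / np.sqrt(n_meas)  # 95% interval half-width for a Gaussian mean
covered = 0
for _ in range(n_experiments):
    sample_mean = rng.normal(true_value, sigma, n_meas).mean()
    # The interval differs from experiment to experiment...
    if sample_mean - half_width <= true_value <= sample_mean + half_width:
        covered += 1  # ...but about 95% of them contain the true value

print(f"empirical coverage: {covered / n_experiments:.3f}")
```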
In the Bayesian approach, whose name indicates that inference is based on Bayes' theorem, probability is instead interpreted as our degree of belief in the occurrence of an event, based on the available information. This definition is subjective, because it implies the existence of an intelligent being able to process information and guess how likely it is that something will happen. However, not every probability assignment is allowed: by following rules based on fair and coherent probability assignment, everybody would end up with the same probability given the same information. One example of a fair and coherent probability assignment is offered by betting odds, as illustrated by Bruno de Finetti. Although a degree of belief sounds very different from the long-run frequency of occurrence of a given event, the two are actually the same whenever the latter is definable. This is guaranteed by the law of large numbers, a theorem which holds independently of the interpretation we give to probability (a quick numerical check is sketched below). Moreover, other definitions of probability, based for example on symmetry judgments, are also contained in the interpretation as degree of belief. Examples are the equal probabilities assigned to a coin's head and tail, and to the different faces of a die. Because we believe that the coin (or die) is fair, i.e. that there is no specific reason why one result should be more likely than any other, we decide to assign equal probability to all possible outcomes. In short, the degree of belief is the widest interpretation of probability, which includes all the others. For example, if we interpret probability in terms of long-run frequency, or assign it based on symmetry arguments, we cannot speak about the probability that it rains tomorrow; thanks to the concept of degree of belief, we can.
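Here is that quick numerical check of the law of large numbers at work on the coin example (a toy sketch of mine): the relative frequency of heads in simulated tosses of a fair coin approaches the 1/2 we assign by symmetry as the number of tosses grows.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Fair coin: by symmetry we assign P(heads) = 0.5.
# The long-run frequency converges to that same value.
for n_tosses in (10, 1_000, 100_000):
    heads = rng.integers(0, 2, n_tosses).sum()  # each toss is 0 (tail) or 1 (head)
    print(f"{n_tosses:>7} tosses: frequency of heads = {heads / n_tosses:.4f}")
```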
The role of Bayes' theorem in Bayesian statistical inference is to evolve our knowledge from whatever information we had before performing the experiment (called prior information) to the updated information available once the experimental results are known (a.k.a. posterior information). Thus Bayes' theorem is a model for the learning process: it tells us how our knowledge changes in view of the observed outcome.
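As a concrete, if hypothetical, example of this learning process, consider inferring the unknown bias p of a coin. With a Beta prior and binomial data, Bayes' theorem gives back a Beta posterior whose parameters are updated simply by adding the observed counts (the flat prior and the 7-heads-in-10-tosses data below are made up for illustration):

```python
from scipy import stats

# Prior belief about the coin bias p, encoded as Beta(a, b);
# Beta(1, 1) is flat: before the experiment, all values of p are equally believable.
a_prior, b_prior = 1.0, 1.0

# Experimental outcome (made-up data): 7 heads and 3 tails in 10 tosses
heads, tails = 7, 3

# Bayes' theorem: posterior(p) is proportional to likelihood(data|p) * prior(p).
# For a Beta prior and binomial data, the update is just parameter counting:
a_post, b_post = a_prior + heads, b_prior + tails

posterior = stats.beta(a_post, b_post)
print(f"posterior mean of p: {posterior.mean():.3f}")  # pulled from 0.5 towards 7/10
```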
The result of Bayesian statistical inference is also interpreted in terms of degree of belief. If the result is that our best estimate for a quantity is the interval 3.3-4.2 with 95% posterior probability (often called "credibility"), we believe that there is a 95% probability that the true value is contained within this interval. Note that an infinite repetition of the same experiment may have a coverage different from 95%. As coverage is a very important criterion, some care needs to be taken in choosing the prior, but I omit these details here (see BayesianPoissonBasedInference for additional references).
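Continuing the hypothetical coin example above, a credible interval is read directly off the posterior distribution; under the Bayesian interpretation, we believe there is a 95% probability that p lies in this range:

```python
from scipy import stats

# Posterior from the previous snippet: Beta(8, 4)
posterior = stats.beta(8, 4)

# Central 95% credible interval: 2.5% of the posterior probability on each side
low, high = posterior.interval(0.95)
print(f"95% credible interval for p: [{low:.2f}, {high:.2f}]")
```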
Finally, the result of frequentist statistical inference is always cast in the form P(data|model), the probability of observing some result given the model, in the case of an infinite identical replication of the same experiment. On the other hand, a Bayesian result has the form P(model|data), the probability of some model given the observed data. As such, a Bayesian result is much simpler to interpret, because it answers our natural question: "what is the probability that my model (i.e. my understanding of the system) is correct?"
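The two toy quantities below, reusing the hypothetical coin data from the previous snippets, make the contrast explicit: the first is a P(data|model) statement, the second a P(model|data) statement.

```python
from scipy import stats

# Frequentist-style statement, P(data|model):
# probability of observing exactly 7 heads in 10 tosses if the coin is fair
print(f"P(7 heads | fair coin) = {stats.binom.pmf(7, 10, 0.5):.3f}")

# Bayesian-style statement, P(model|data):
# posterior probability that the coin favours heads (p > 0.5),
# given 7 heads in 10 tosses and a flat prior (Beta(8, 4) posterior, as above)
print(f"P(p > 0.5 | 7 heads)   = {stats.beta(8, 4).sf(0.5):.3f}")
```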
If this sounds interesting and you want to look for introductory material, you may check Wikipedia or the "Introduction to Probability" course by Grinstead and Snell. I also recommend Jaynes' book "Probability Theory: The Logic of Science", which develops probability theory as extended logic, if you are prepared for a more technical reading. Finally, physicists may also find the proceedings of the PhyStat conferences interesting: PhyStat2003, PhyStat2005, PhyStat2007, PhyStat2011, PhyStat2016. And of course have a look at the PDF below, presenting a summary of my work on objective Bayesian statistics :-)