Innumerable computer languages have been proposed since the 1960s, and the popularity of languages has evolved in sometimes strange ways. Fundamental work on data structures, algorithm analysis, program control, and code intelligibility by Dijkstra, Knuth, Wirth, Kahan, and many others has vastly improved the expressiveness of languages, greatly improved the efficiency of algorithms and the coding cycle, and radically broadened the accessibility of coding across engineering and scientific disciplines. Exemplars of coding power and simplicity, such as R and MATLAB, have unleashed a revolution in science, and especially in engineering, which has been utterly transformed from an individual paper-and-pencil endeavor into a priesthood consulting computer algorithms and index calculators that have been socially validated by practitioners.
But the most common programming languages are still poorly suited to practical work in science and engineering. They lack convenient structures and procedures for several basic tasks needed in essentially all such work:
Dimension and units checking and propagation. Errors like adding meters and seconds, or trying to raise a number to the two-inch power, can end in catastrophic failure. Famously, the Mars Climate Orbiter was lost because NASA's different contractors used different units. Such errors pervade quantitative work, and there is a clear need for automatic checking and, in some situations, automated correction. It is possible to define a programming language that understands and robustly handles units as a natural extension of current common practice among scientists and engineers. Dimension and units checking and correction constitutes a kind of typing that is actually useful rather than merely a burden on coders. Several computational environments, such as F#, Frink, and RAMAS Risk Calc, and several libraries for Matlab and Python, offer partial solutions, but their often awkward syntax imposes a burden that many programmers and practicing engineers are unwilling to tolerate.
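As a rough illustration of how natural such checking could feel, the following Python sketch tags each value with a dimension vector and rejects incompatible operations. It is only a sketch: the Quantity class, its dimension vocabulary, and its error messages are invented for this example, and real libraries (such as pint for Python) differ in syntax and scope.

# Illustrative sketch only: values tagged with a dimension vector so that
# incompatible operations are rejected.  The Quantity class and its error
# messages are invented for this example.
class Quantity:
    def __init__(self, value, dims):
        self.value = value                                      # numeric magnitude
        self.dims = {d: p for d, p in dims.items() if p != 0}   # e.g. {'m': 1, 's': -2}

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        dims = dict(self.dims)
        for d, p in other.dims.items():
            dims[d] = dims.get(d, 0) + p
        return Quantity(self.value * other.value, dims)

    def __pow__(self, exponent):
        if isinstance(exponent, Quantity) and exponent.dims:
            raise TypeError("exponent must be dimensionless")
        e = exponent.value if isinstance(exponent, Quantity) else exponent
        return Quantity(self.value ** e, {d: p * e for d, p in self.dims.items()})

metres  = Quantity(3.0, {'m': 1})
seconds = Quantity(2.0, {'s': 1})
area = metres * metres                  # fine: dimensions become {'m': 2}
try:
    metres + seconds                    # adding metres and seconds is rejected
except TypeError as err:
    print(err)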
Elementary, extensible uncertainty propagation and sensitivity analysis. Virtually no numerical value in science or engineering is a perfect mathematical number. Every physically measured value has some imprecision, which often has non-negligible implications that limit the reliability of all numerical and decision results calculated from those values. Ignoring uncertainty in mathematical calculations leads to a variety of profound problems in scientific computing. Despite broad acknowledgement of this fact, very few analysts equip their codes with appropriate uncertainty and sensitivity analyses, simply because doing so can be quite cumbersome. In the absence of careful uncertainty quantification, the specious precision of numerical results can beguile and mislead even sophisticated analysts. These uncertainty analyses are thus too important to be left to analysts; programs should undertake them automatically. Compilers and interpreters should invisibly propagate uncertainty and variability through calculations without requiring any special skill in uncertainty analysis. Because best-possible characterization of uncertainty is generally NP-hard and complex to organize, a weaker strategy that bounds uncertainties from above and below is most useful in practice. Input schemes allow programmers to say whether the uncertainty represents variability or imprecision (ignorance) and whether input numbers are measurements, guessed values, or just placeholders for a family of possible calculations. Using smart dependency tracking to determine functional and stochastic dependence among uncertain variables, the ordinary floating-point calculations in scientific functions and engineering simulations are mirrored by ancillary computations that account for the projected uncertainty of results and their sensitivities to changes in inputs. Outputs are automatically shaped according to their uncertainties so that misinterpretations are minimized.
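One simple way to bound uncertainty from above and below is naive interval arithmetic, sketched below in Python. The toy Interval class is purely illustrative: it ignores the dependency tracking, outward rounding, and the variability/ignorance distinction described above, and the straddles_zero method is an invented stand-in for the straddles function that appears in the syntax examples later in this section.

# Illustrative sketch: naive interval arithmetic that carries rigorous lower
# and upper bounds alongside an ordinary calculation.  It ignores dependency
# tracking, outward rounding, and the variability/ignorance distinction.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = min(lo, hi), max(lo, hi)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def straddles_zero(self):
        return self.lo <= 0.0 <= self.hi

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

length = Interval(1.9, 2.1)               # a measurement of 2.0 +/- 0.1
width  = Interval(2.8, 3.2)               # a measurement of 3.0 +/- 0.2
print(length * width)                     # roughly [5.32, 6.72], bracketing the true area
print((length - width).straddles_zero())  # False: the difference is bounded away from zero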
Automated sensitivity analyses. <<Marco: do you think the language should allow for the dual numbers (x, x') suggested by Fischer? If so, should it be a light implementation of forward-mode AD, or a deeper implementation with facilities for reverse-mode AD and full integration with intervals and automatically verified calculations? [Fischer, H.-C., 1993. Automatic differentiation and applications. Scientific Computing with Automatic Result Verification, E. Adams and U. Kulisch (eds.), Academic Press].>>
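For reference, a light forward-mode implementation of the dual numbers (x, x') could be as small as the Python sketch below. It is only a toy: the Dual class is invented here, and it makes no attempt at the interval integration or verified computation that Fischer's formulation addresses.

# Illustrative sketch: dual numbers (value, derivative) as a light
# forward-mode automatic differentiation.  The Dual class is invented here
# and makes no attempt at interval integration or verified computation.
class Dual:
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)
    __rmul__ = __mul__

x = Dual(2.0, 1.0)            # seed the input's derivative with 1
f = x * x + 3 * x             # f(x) = x^2 + 3x
print(f.value, f.deriv)       # 10.0 7.0, since f'(x) = 2x + 3 = 7 at x = 2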
Tracking justifications, data provenance, and statistical assumptions. There is an acute and growing need to reliably link code to the underlying justifications, arguments, evidence, and assumptions embodied in its development, and to trackably maintain records of the underlying data on which it is based, including sources, manipulations, transformations, analyses, and resulting inferences. Chains of such links connect programming choices about user inputs, parameters, and even algorithms to justifications and evidence sources. Knuth has called for “literate” code, but developments in this direction have been rather modest. Until quite recently, the art of commenting code had hardly changed since the days of Fortran. Although Jupyter, Knitr for R, the integration of code snippets into documents with Markdown, and various metadata schemes and documentation generators are very promising, practicality often restricts code commenting to unlinked, unformattable annotations at a single level, which compilers and interpreters usually ignore entirely (apart from some often poorly conceived and idiosyncratic compiler switches). The level of detail needed to properly document the thinking and analyses invested in scientific programs and engineering simulations would obscure the functionality of the underlying code. Much like security in web design, commenting is a poorly attended afterthought. Yet these histories and connections are actually the most important parts of scientific and statistical code. Several facilities to enrich comments and self-documentation would be useful, including structure that reveals the kind of comment; color and other formatting options; cross-code links and URLs; embeddable images, audio, and video; multiply expandable and collapsible parts; and time-stamping and auto-commenting to encode available provenance and tracking data.
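One direction such facilities might take is to make annotations machine-readable rather than free text. The Python sketch below attaches a provenance record to a function with a decorator; the decorator name, the metadata fields, and the example function are all invented for illustration (the example function simply uses the standard-atmosphere sea-level temperature and lapse rate).

# Illustrative sketch: machine-readable provenance attached to code rather
# than free-text comments.  The decorator name and metadata fields are
# invented for this example.
import datetime

def provenance(source, assumption=None):
    """Attach a provenance record to a function and time-stamp it."""
    def wrap(fn):
        fn.provenance = {
            "source": source,
            "assumption": assumption,
            "recorded": datetime.datetime.now().isoformat(),
        }
        return fn
    return wrap

@provenance(source="ISO 2533:1975 standard atmosphere",
            assumption="constant 6.5 K/km lapse rate below 11 km")
def temperature_at(altitude_m):
    return 288.15 - 0.0065 * altitude_m      # kelvin

print(temperature_at(1000.0))                # 281.65
print(temperature_at.provenance["source"])   # retrievable by tools, not just by humans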
Flexible syntax supporting common conventions. Programmers often say they have learned to think in their favorite programming languages. This makes the prospect of switching to another language onerous, because each language has made seemingly idiosyncratic choices for critical syntax details, even though it has been clear for decades that standardizing such choices would be useful. For instance, many symbols are used as assignment operators, including =, :=, <-, ~, <=, and the word 'is'. Likewise, the conventions for blocking, looping, subroutine definition, commenting, and the specification of strings, arrays, and lists differ across languages. Some examples are given below.
Python-like statements
a = 3 + b
for i in range(10):
    a = a + rand(1,1)
if not straddles(a):
    print(b/a)
else:
    print('inf')
c = 'a string'
# this is a comment
Matlab-like statements
a = 3 + b;
for i = 1:10
    a = a + rand(1,1);
end
if not straddles(a)
    say b/a
else
    print 'inf';
end
c = 'a string';
% this is a comment
C-like statements
a = 3 + b;
if (! straddles(a)) {print b/a} else {print "inf"}
while (a > b) {print a; a = a - 1;}
c = "a string";
// this is a comment
Pascal-like statements
a := 3 + b;
for i := 1 to 10 do a := a + random;
if not straddles(a) then say b/a else say 'inf';
while a > b do begin
    print a;
    a := a - 1;
end;
c := 'a string';
What is perhaps surprising is that it is possible to support more than a single convention for each of these purposes within one well-defined programming language. This would allow programmers to carry their high-level preferences from other languages to the new, multiply conversant language.
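To make the idea concrete, the Python sketch below shows one crude way a single front end might normalize a few competing surface conventions into a canonical internal form. The accepted spellings and the normalize function are invented for this example; a real grammar would have to resolve genuine ambiguities, not just spelling differences.

# Illustrative sketch: a crude front end that accepts several surface
# conventions and maps them to one canonical internal form.  The accepted
# spellings and the normalize function are invented for this example.
import re

ASSIGN  = re.compile(r":=|<-|=")          # accepted assignment spellings
COMMENT = re.compile(r"(#|//|%).*$")      # accepted comment spellings

def normalize(line):
    """Strip any comment and rewrite the first assignment canonically."""
    line = COMMENT.sub("", line).strip()
    return ASSIGN.sub(" <assign> ", line, count=1) if line else ""

for src in ["a := 3 + b;    % Matlab-style comment",
            "a = 3 + b      # Python-style comment",
            "a <- 3 + b     // C-style comment"]:
    print(normalize(src))
# each line prints as: a <assign> 3 + b   (the first keeps its trailing semicolon)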
Each of these five capacities would certainly enrich the usefulness of computing across applications in science and engineering. Although schemes and algorithms for each have been in service for many years, they have not yet been assembled as native features of a general computing platform or computer language.
See also Humane Algorithms (Units) and the references therein
See also Uncertainty projection in engineered systems
See also Exeter's Literate Programming project
See also Uncertainty calculus, Thomas Kirchner's QS Calc, FuziCalc and its relations, Applied Biomathematics' Risk Calc, Nick Gray's Puffin and his recent paper, the DigiTwin project's DTOP, and the rest of this website