A metalanguage is a language that describes the syntax and format of another language. In this course, the examples we are looking at are Extended Backus-Naur Form (EBNF) and Railroad Diagrams.
EBNF describes the grammar of a language using rules called 'Production Rules' (meaning definitions), which are built from terminal and non-terminal symbols. These rules govern how the terminal and non-terminal symbols are organised and sequenced.
Non-Terminal | Definition /Terminal
For the example above, 'DIGIT' is defined as one of the digits from one to nine. 'DIGIT' is the non-terminal, and the individual numerical values are the terminals.
A non-terminal is essentially a collection of terminals and other non-terminals that are organised and sequenced in a particular way; put more simply, it is a predefined product.
Non-Terminal | Terminal | Non-Terminal
In this example, the integer is the non-terminal being defined. Its definition is an ALTERNATION between the terminal 0 and a sequence made up of an OPTIONAL terminal negative sign linked to the non-terminal natural number.
These non-terminals can then be reused in the definitions of other non-terminals to further develop and define your grammar and language. For example, the identifier uses two non-terminals in its product: it must have an alphabetical character at the start, followed by a repetition of either more alphabetical characters or an integer.
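The two rules described above can be sketched as small hand-written recognisers. This is a minimal sketch only: the rule text in the comments is an assumed reconstruction from the descriptions above (following the common textbook form), and every function and constant name is hypothetical.

```python
import string

DIGITS_EXCLUDING_ZERO = set("123456789")    # terminals "1" to "9"
DIGITS = {"0"} | DIGITS_EXCLUDING_ZERO      # assumed: digit = "0" | digit excluding zero
LETTERS = set(string.ascii_letters)         # assumed alphabet for this sketch


def is_natural_number(text: str) -> bool:
    # assumed rule: natural number = digit excluding zero, { digit } ;
    return (
        len(text) >= 1
        and text[0] in DIGITS_EXCLUDING_ZERO
        and all(ch in DIGITS for ch in text[1:])
    )


def is_integer(text: str) -> bool:
    # assumed rule: integer = "0" | [ "-" ], natural number ;
    if text == "0":
        return True
    if text.startswith("-"):
        text = text[1:]          # the optional "-" was present, so consume it
    return is_natural_number(text)


def is_identifier(text: str) -> bool:
    # assumed rule: identifier = letter, { letter | digit } ;
    return (
        len(text) >= 1
        and text[0] in LETTERS
        and all(ch in LETTERS or ch in DIGITS for ch in text[1:])
    )
```

Note how `is_integer` reuses `is_natural_number`, in the same way that one non-terminal reuses another inside its definition.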
TERMINALS can be seen within programming languages such as C#, and the same kinds of terminals are typically seen within EBNF. These include identifiers, which are the names used for functions, methods, variables, classes, etc. Different programming languages typically have their own rules about how identifiers may be written. For example, with C# there are three rules to check when determining an identifier for a variable.
First, it cannot be the same as a C# keyword, so an identifier such as bool is not allowed.
Secondly, an identifier must begin with a letter, an @ symbol or an underscore symbol.
Finally, it must not contain white spaces or symbols other than the underscore; the @ symbol is only allowed at the beginning, as a prefix.
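The three rules above can be sketched as a short check. This is a simplified illustration, not the full C# specification: it only considers ASCII letters and digits, the keyword set is a small hand-picked subset, and the function name is hypothetical. It also assumes the @ prefix makes a keyword usable as an identifier, which is how C# verbatim identifiers behave.

```python
import string

# Small illustrative subset of C# keywords (not the complete list).
CSHARP_KEYWORDS = {"bool", "public", "private", "switch", "foreach", "enum"}

FIRST_CHARS = set(string.ascii_letters) | {"_"}
OTHER_CHARS = FIRST_CHARS | set(string.digits)


def is_valid_identifier(name: str) -> bool:
    # Rule 2 and 3: an @ is only allowed once, as a prefix.
    has_prefix = name.startswith("@")
    body = name[1:] if has_prefix else name
    if not body:
        return False
    # Rule 2: the first character must be a letter or an underscore.
    if body[0] not in FIRST_CHARS:
        return False
    # Rule 3: no white space or other symbols anywhere in the name.
    if not all(ch in OTHER_CHARS for ch in body):
        return False
    # Rule 1: keywords are rejected unless prefixed with @.
    if body in CSHARP_KEYWORDS and not has_prefix:
        return False
    return True
```

For instance, `is_valid_identifier("total_count")` is accepted, while `is_valid_identifier("my name")` is rejected because of the space.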
Another terminal is the keyword. Using the example of C#, keywords are exact, reserved strings (written in lowercase) that often identify the start of a definition. Some of these keywords include:
public,
private,
switch,
foreach,
enum,
etc.
The final three common kinds of terminal start with separators and delimiters, which include colons, semicolons, commas, parentheses, curly brackets, square brackets, etc.
Then there are white spaces, new lines, and tabs, and finally code comments, which most modern programming languages support.
The '=' symbol is used to define an expression. In the example below, the '=' symbol defines the term 'DIGIT' to be a number from one to nine.
Within EBNF the comma symbol is named 'Concatenation'. This symbol is used to represent linking statements. In the below example, the statement of 'digit excluding zero' is linked with the next statement of the possible repetition of 'digit'.
Semicolons, which are known as 'Terminations' in EBNF, are used in a similar way to their use in C# code. A Termination marks the end of a definition or product; in this example, the integer definition ends after the non-terminal natural number.
In EBNF a simple vertical line is labelled an 'Alternation', and is used as an OR statement to separate two possible outcomes of the statement. In this example, the Alternation symbol separates the possible outcomes of an integer, which can either equal zero or be a positive or negative natural number.
Square brackets are used to define an Optional statement, where the terminals or non-terminals within can either occur or not occur. In this example, if the integer is a Natural Number, the Optional segment contains the negative symbol, meaning that the Natural Number may be positive OR negative.
Curly brackets in EBNF are used to indicate Repetition: the terminals or non-terminals inside them may occur any number of times, including not at all. Repetition is useful for definitions that may have a large numerical value or contain parts that are going to be repeated. In the example below, the number is defined by the Optional segment that makes it positive or negative, then the first digit, and finally, with the use of Repetition, any amount of following digits (including 0).
Parentheses (round brackets) are used to Group in EBNF, grouping multiple terminals and non-terminals together and isolating them from the others in the definition. In the example below, 'Min' is defined as a number from 0 to 5, using the Grouping symbol to combine these values and separate them from the 'Digit'.
Inverted commas or speech marks are used to represent Terminals in EBNF. They enclose the literal characters that must appear exactly as written, which can include a negative symbol, letters from A-Z, numbers from 0-9, etc. In the example below, the quotation marks enclose the negative sign within the Optional brackets.
A combination of open/close brackets and asterisks, written (* ... *), creates a Comment which, in a similar fashion to comments in C# code, contains text explaining the purpose of the EBNF.
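One way to see how the symbols above fit together is to model each one as a small function. The sketch below is an illustration under assumptions, not a standard library or an official EBNF tool: all names are hypothetical, and the integer rule at the end is an assumed reconstruction of the example described in this section. Grouping needs no function of its own here, because nesting one call inside another plays the role of the parentheses.

```python
from typing import Callable, Set

# A matcher takes the text and a start position and returns every
# position where a match could end (an empty set means "no match").
Matcher = Callable[[str, int], Set[int]]


def terminal(literal: str) -> Matcher:            # "..."
    def match(text: str, pos: int) -> Set[int]:
        return {pos + len(literal)} if text.startswith(literal, pos) else set()
    return match


def concatenation(*parts: Matcher) -> Matcher:    # a , b
    def match(text: str, pos: int) -> Set[int]:
        ends = {pos}
        for part in parts:
            ends = {e2 for e in ends for e2 in part(text, e)}
        return ends
    return match


def alternation(*options: Matcher) -> Matcher:    # a | b
    def match(text: str, pos: int) -> Set[int]:
        return set().union(*(option(text, pos) for option in options))
    return match


def optional(inner: Matcher) -> Matcher:          # [ a ]
    def match(text: str, pos: int) -> Set[int]:
        return {pos} | inner(text, pos)
    return match


def repetition(inner: Matcher) -> Matcher:        # { a }
    def match(text: str, pos: int) -> Set[int]:
        ends, frontier = {pos}, {pos}
        while frontier:
            frontier = {e2 for e in frontier for e2 in inner(text, e)} - ends
            ends |= frontier
        return ends
    return match


def accepts(rule: Matcher, text: str) -> bool:
    # The whole text must be consumed for the rule to accept it.
    return len(text) in rule(text, 0)


# Assumed example rules, written with the functions above.
digit_excluding_zero = alternation(*(terminal(d) for d in "123456789"))
digit = alternation(terminal("0"), digit_excluding_zero)
natural_number = concatenation(digit_excluding_zero, repetition(digit))
integer = alternation(
    terminal("0"),
    concatenation(optional(terminal("-")), natural_number),
)
```

With these definitions, `accepts(integer, "-42")` succeeds, while `accepts(integer, "007")` fails because a natural number may not start with zero.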
Also known as Syntax Diagrams
A graphical alternative to EBNF; they are also a metalanguage.
Visual representations of grammar rules of a language or data types.
Starts off with the basics, defining the simplest units of the programming language.
For example, digits and letters come first, then words, numbers and variables, and finally statements, which are defined using the terms already introduced.
To read a Railroad Diagram, always start from the top left and follow the flow of the diagram to the bottom right.
Railroad diagrams cannot go backwards unless there is an arrow indicating it, which is mainly seen in a loop.
To have an Optional choice within the Railroad Diagram, a bypass line is drawn around the terminal or non-terminal, so the path can either go around it or take the other option and pass through the statement within the circle or rectangle.
EBNF Example 1: < K > = D [ M ]
In this example, the Optional choice allows for multiple outputs, such as K = DM and K = D.
EBNF Example 2: < Q > = [ - ] < Digit >
In this example, the final output can either be a Digit between zero and nine, or a minus symbol followed by any digit between zero and nine.
For Repetition to occur within a Railroad Diagram, there is either a curved line looping back under the terminal or non-terminal statement, or the terminal or non-terminal is placed within the curved line itself.
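The two Optional examples above are small enough that every possible output can be listed. The sketch below does this by treating each part of a rule as a set of strings; the helper names are hypothetical, and `<Digit>` is assumed to mean the ten digits zero to nine, as described above.

```python
from itertools import product


def sequence(*parts):
    # Every way of picking one string from each part, joined in order.
    return {"".join(choice) for choice in product(*parts)}


def optional(part):
    # The bypass line contributes the empty string; the other track
    # contributes whatever is inside the square brackets.
    return {""} | set(part)


DIGIT = set("0123456789")                   # assumed meaning of <Digit>

K = sequence({"D"}, optional({"M"}))        # <K> = D [ M ]
Q = sequence(optional({"-"}), DIGIT)        # <Q> = [ - ] <Digit>

print(sorted(K))
print(len(Q))
```

The set `K` contains exactly the two outputs named above, D and DM, and `Q` contains each digit once with and once without the minus symbol.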
EBNF Example 1: < B > = E { E }
In this example, the terminal letter “E” is on the main railroad, with a curved line coming off it that allows the letter E to be repeated. Since the terminal is located on the main line, it must appear at least once before it can be repeated.
EBNF Example 2: < T > = { E }
In this example, there is no terminal on the main line; the terminal letter E is placed within the curved loop line instead. This means that E may be repeated but does not have to appear at all. There is no limit declared on the number of repetitions, so the loop can be taken an unlimited number of times.
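The difference between the two Repetition examples can be sketched as two small recognisers. The function names are hypothetical and the code is only an illustration of the two diagrams: one where E sits on the main line, and one where E sits only inside the loop.

```python
def matches_T(text: str) -> bool:
    # <T> = { E } : the loop may be taken zero or more times.
    pos = 0
    while pos < len(text) and text[pos] == "E":
        pos += 1                 # go around the loop once more
    return pos == len(text)      # nothing other than E may remain


def matches_B(text: str) -> bool:
    # <B> = E { E } : one E on the main line, then the same loop.
    return text.startswith("E") and matches_T(text[1:])
```

The only input on which the two disagree is the empty string: `matches_T("")` accepts it, while `matches_B("")` does not, because the main line of B always passes through one E.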
For Grouping to occur in a Railroad Diagram, there can be separate curved lines that detach from the main line, which allow the terminal or non-terminal statement to possibly be called.
EBNF Example 1: R = ( + | - ) < digit >
In this example, either a plus or a minus sign is chosen before the digit is selected. Some of the outputs for these railroads include “R = - digit” and “R = + digit”.
EBNF Example 2: P = [ ( + | - ) ] < digit >
In this example, there is an option for none of the grouping terminals to be used and only for a digit to be selected. Some of the outputs for this example would be “P = digit”, “P = + digit” and lastly “P = - digit”.
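The two Grouping examples above can be sketched as recognisers that differ only in whether the sign is required. The names below are hypothetical, and `<digit>` is assumed to mean a single character from zero to nine.

```python
DIGITS = set("0123456789")      # assumed meaning of <digit>
SIGNS = {"+", "-"}              # the group ( + | - )


def matches_R(text: str) -> bool:
    # R = ( + | - ) <digit> : a sign is required, then exactly one digit.
    return len(text) == 2 and text[0] in SIGNS and text[1] in DIGITS


def matches_P(text: str) -> bool:
    # P = [ ( + | - ) ] <digit> : the whole group may be bypassed.
    if text[:1] in SIGNS:
        text = text[1:]          # the optional group was used
    return len(text) == 1 and text in DIGITS
```

A bare digit such as "5" is accepted by `matches_P` but rejected by `matches_R`, which mirrors the extra bypass track in the second diagram.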
Used to form the alphabet of the language for the building of statements.
If an item cannot be defined into any more detail then it is put into a circle.
In EBNF the digits are written in between quotation marks; on a Railroad Diagram each of them appears within its own circle.
Allows for an item to be abbreviated where the item has more information included.
This statement will always be seen within a rectangle.
When a “digit” is called in the diagram, another diagram is consulted to find a suitable Digit. Once a Digit is called there is an option for a loop. If this loop does occur, another Digit will be included.