haskell_compiler2
Ghazi Bouselmi's Pages
Here you can find out who I am. Well, also my CV, my publications
LinkedIn : http://www.linkedin.com/in/ghazibouselmi
The challenge definition can be found here "While language". The code is a bit too dense, it had to be compressed to fit in the maximum 50KB allowed by HakerRank.
07-06-2017 : first draft : grammar definition and input tokenization.
08-06-2017 : grammar change : unary prefix & postfix increment operators, token types, function definition parsing.
08-06-2017_bis : statements & expressions parsing.
09-06-2017 : statements & expressions parsing, factorized binary & unary operators parsing.
09-06-2017_bis : compressed the code : with only empty parsers, it exceeded the maximum 50KB allowed by HakerRank.
11-06-2017 : 'int' variables declaration, parsing data structure, printing of parsed program.
12-06-2017 : instructions data structures, stack & stack frames, dynamic variable creation, execution functions.
13-06-2017 : first accepted version : Proper storing for left-2-right expression tree; compilation directives to disable function declaration & enable stack frames (disabled by default to comply with HakerRank problem definition).
13-06-2017_bis : function calls, full proper evaluation of expressions (postfix & prefix increment operators).
TODO: add these statements : "switch () {}", "break", "return".
Sample program:
<< allow_stack_frames >>
int i, j, k, m;
for (int i := 0; i < 10; ++i) {
j += i < 5 ? 1 : 10
}
k:=0
while (k < 10) do { k++ }
int fibonacci0 := 0; fibonacci1 :=1;
do {
k:= fibonacci0 + fibonacci1;
fibonacci0 := fibonacci1;
fibonacci1 := k;
}
while (fibonacci1 < 50);
Sample program:
<< allow_stack_frames >>
function factorial(n) {
if ( n <= 1 )
then {
1;
} else {
n * factorial(n-1);
}
}
f3 := factorial( 3 );
f5 := factorial( 5 );
f10 := factorial( 10 );
f20 := factorial( 20 );
Output: the value of the variables at program end:
f10 3628800 f20 2432902008176640000 f3 6 f5 120
Program parsing output:
all input done: lines count = 15 tokens count = 61 chars=233 parsed 61 out of 61tokens. instruction count=68 IP=0 : Program IP=2 : Function definition list IP=3 FuncDefList : Function definition 'factorial' IP=5 FuncDefList FuncDef factorial( : Name list IP=6 FuncDefList FuncDef factorial( : Var 'n' IP=7 FuncDefList FuncDef factorial() { : Statement block IP=8 FuncDefList FuncDef factorial() { { : Statement sequence IP=9 FuncDefList FuncDef factorial() { { SS : IF IP=10 FuncDefList FuncDef factorial() { { SS if( : Expression list IP=12 FuncDefList FuncDef factorial() { { SS if( EL- : Binary operator '<=' L2R=1 PRIO=7 IP=11 FuncDefList FuncDef factorial() { { SS if( EL-opB<= : Var 'n' IP=13 FuncDefList FuncDef factorial() { { SS if( EL-opB<= : Number '1' IP=14 FuncDefList FuncDef factorial() { { SS if( EL=> : Expression list IP=15 FuncDefList FuncDef factorial() { { SS if() then { : Statement block IP=16 FuncDefList FuncDef factorial() { { SS if() then { { : Statement sequence IP=17 FuncDefList FuncDef factorial() { { SS if() then { { SS : Expression list IP=18 FuncDefList FuncDef factorial() { { SS if() then { { SS EL- : Number '1' IP=19 FuncDefList FuncDef factorial() { { SS if() then { { SS EL=> : Expression list IP=20 FuncDefList FuncDef factorial() { { SS if() then {} else { : Statement block IP=21 FuncDefList FuncDef factorial() { { SS if() then {} else { { : Statement sequence IP=22 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS : Expression list IP=24 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL- : Binary operator '*' L2R=1 PRIO=10 IP=23 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* : Var 'n' IP=25 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* : Function-call 'factorial' IP=26 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( : Expression list IP=28 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL- : Binary operator '-' L2R=1 PRIO=9 IP=27 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL-opB- : Var 'n' IP=29 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL-opB- : Number '1' IP=30 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL=> : Expression list IP=31 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL=> : Expression list IP=32 : Statement sequence IP=33 SS : Expression list IP=34 SS EL- : Assignment ':=' IP=35 SS EL-:= : Var 'f3' IP=36 SS EL-:= : Function-call 'factorial' IP=37 SS EL-:= Call-factorial( : Expression list IP=38 SS EL-:= Call-factorial( EL- : Number '3' IP=39 SS EL-:= Call-factorial( EL=> : Expression list IP=40 SS EL=> : Expression list IP=41 SS, : Statement sequence IP=42 SS, SS : Expression list IP=43 SS, SS EL- : Assignment ':=' IP=44 SS, SS EL-:= : Var 'f5' IP=45 SS, SS EL-:= : Function-call 'factorial' IP=46 SS, SS EL-:= Call-factorial( : Expression list IP=47 SS, SS EL-:= Call-factorial( EL- : Number '5' IP=48 SS, SS EL-:= Call-factorial( EL=> : Expression list IP=49 SS, SS EL=> : Expression list IP=50 SS, SS, : Statement sequence IP=51 SS, SS, SS : Expression list IP=52 SS, SS, SS EL- : Assignment ':=' IP=53 SS, SS, SS EL-:= : Var 'f10' IP=54 SS, SS, SS EL-:= : Function-call 'factorial' IP=55 SS, SS, SS EL-:= Call-factorial( : Expression list IP=56 SS, SS, SS EL-:= Call-factorial( EL- : Number '10' IP=57 SS, SS, SS EL-:= Call-factorial( EL=> : Expression list IP=58 SS, SS, SS EL=> : Expression list IP=59 SS, SS, SS, : Statement sequence IP=60 SS, SS, SS, SS : Expression list IP=61 SS, SS, SS, SS EL- : Assignment ':=' IP=62 SS, SS, SS, SS EL-:= : Var 'f20' IP=63 SS, SS, SS, SS EL-:= : Function-call 'factorial' IP=64 SS, SS, SS, SS EL-:= Call-factorial( : Expression list IP=65 SS, SS, SS, SS EL-:= Call-factorial( EL- : Number '20' IP=66 SS, SS, SS, SS EL-:= Call-factorial( EL=> : Expression list IP=67 SS, SS, SS, SS EL=> : Expression list Stack-Frames 1 BP=99999 SP=99995
The grammar:
Program ::= Compilation_Directive_List Function_list Statement_sequence
Compilation_Directive_List ::= '<<' Compilation_Directive '>>' Compilation_Directive_List
| <NOTHING>
Compilation_Directive ::= 'allow_functions' | 'allow_stack_frames'
Function_list ::= Function Function_list
| <NOTHING> <== special care needed for <NOTHING>
Function ::= 'function' Name '(' Name_list ')' '{' Statement_sequence '}'
Name_list ::= Name Name_list_tail
| <NOTHING> <== special care needed for <NOTHING>
Name_list_tail ::= ',' Name Name_list_tail
| <NOTHING> <== special care needed for <NOTHING>
Name ::= [a..z_]Name_tail
Name_tail ::= [a..z_0..9]Name_tail
| <NOTHING> <== special care needed for <NOTHING>
Statement_sequence ::= Statement
| Statement ';' Statement_sequence
| ';' Statement_sequence
| '{' Statement_sequence '}' Statement_sequence
| <NOTHING> <== special care needed for <NOTHING>
-- FIXME: add a return statement
Statement ::= Expression_list
| 'if' '(' Expression_list ')' 'then' '{' Statement_sequence '}' else '{' Statement_sequence '}'
| 'while' '(' Expression_list ')' 'do' '{' Statement_sequence '}'
| 'do' '{' Statement_sequence '}' 'while' '(' Expression_list ')'
| 'for' '(' Expression_list_Or_Declaration ';' Expression_list_empty ';' Expression_list_empty ')' '{' Statement_sequence '}'
| Statement_declaration <= create the variable in the current depth block, init with 0
Expression_list_Or_Declaration ::= Expression_list
| Statement_declaration
| <NOTHING>
Statement_declaration ::= 'int' Statement_declaration_body
Statement_declaration_body ::= Name
| Name ':=' Expression
| Name ':=' Expression ',' Statement_declaration_body
| Name ',' Statement_declaration_body
Expression_list ::= Expression Expression_list_tail
Expression_list_empty ::= Expression Expression_list_tail
| <NOTHING> <== special care needed for <NOTHING>
Expression_list_tail ::= ',' Expression Expression_list_tail
| <NOTHING> <== special care needed for <NOTHING>
-- ternary operator => lowest priority
-- Right-to-left
Expression ::= LeftValue
| LeftValue '?' LeftValue ':' Expression
--
-- A proper POSTFIX & PREFIX increment ++ &-- operators, as well as proper assignment (based on complex LValue expressions) will-- need either a much more complicated grammar, or a 2 pass compilation
-- too much for now :D
--
-- Right-to-left
LeftValue ::= p_RV_log_OR
| Name [:= += -= *= /= %= >>= <<= |= &= ^= ||= &&= ~=] Expression
-- Left-to-right
p_RV_log_OR ::= RightValue_log_AND
| RightValue_log_AND ['||' 'or']RightValue
-- Left-to-right
RightValue_log_AND ::= RightValue_bit_OR
| RightValue_bit_OR ['&&' 'and']RightValue_log_AND
-- Left-to-right
RightValue_bit_OR ::= RightValue_bit_X_OR
| RightValue_bit_X_OR '|' RightValue_bit_AND
-- Left-to-right
RightValue_bit_X_OR ::= RightValue_bit_AND
| RightValue_bit_AND '^' RightValue_bit_AND
-- Left-to-right
RightValue_bit_AND ::= RightValue_EQUAL
| RightValue_EQUAL '&' RightValue_bit_AND
-- Left-to-right
RightValue_EQUAL ::= RightValue_REL
| RightValue_REL [== !=]RightValue_EQUAL
-- Left-to-right
RightValue_REL ::= RightValue_SHIFT
| RightValue_SHIFT [< <= > >=]RightValue_REL
-- Left-to-right
RightValue_SHIFT ::= RightValue_ADDITION
| RightValue_ADDITION [<< >>]RightValue_SHIFT
-- Left-to-right
RightValue_ADDITION ::= RightValue_MULTIPLY
| RightValue_MULTIPLY [+ -]RightValue_ADDITION
-- Left-to-right
RightValue_MULTIPLY ::= RightValue_UNARY
| RightValue_UNARY [* / %]RightValue_MULTIPLY
-- Right-to-left
RightValue_UNARY ::= RightValue_PREFIX_INC
| [+ - ~ !]RightValue_UNARY
-- Right-to-left
RightValue_PREFIX_INC ::= RightValue_ATOM
| ['++' '--'] RightValue_PREFIX_INC_TAIL
RightValue_PREFIX_INC_TAIL ::= RightValue_Name_and_POSTFIX_INC
| ['++' '--'] RightValue_PREFIX_INC_TAIL
-- left to right
RightValue_Name_and_POSTFIX_INC ::= Name RightValue_POSTFIX_INC_TAIL
-- left to right
RightValue_POSTFIX_INC_TAIL ::= ['++' '--']RightValue_POSTFIX_INC_TAIL
| <NOTHING> <== special care needed for <NOTHING>
RightValue_ATOM ::= Number
| RightValue_Name_and_POSTFIX_INC
| Name '(' Expression_list_empty ')' <= function call
| 'true'
| 'false'
| '(' Expression ')' <= reccursive expression