haskell_compiler2

Ghazi Bouselmi's Pages

Here you can find out who I am. Well, also my CV, my publications

LinkedIn : http://www.linkedin.com/in/ghazibouselmi

The challenge definition can be found here "While language". The code is a bit too dense, it had to be compressed to fit in the maximum 50KB allowed by HakerRank.

07-06-2017 : first draft : grammar definition and input tokenization.

08-06-2017 : grammar change : unary prefix & postfix increment operators, token types, function definition parsing.

08-06-2017_bis : statements & expressions parsing.

09-06-2017 : statements & expressions parsing, factorized binary & unary operators parsing.

09-06-2017_bis : compressed the code : with only empty parsers, it exceeded the maximum 50KB allowed by HakerRank.

11-06-2017 : 'int' variables declaration, parsing data structure, printing of parsed program.

12-06-2017 : instructions data structures, stack & stack frames, dynamic variable creation, execution functions.

13-06-2017 : first accepted version : Proper storing for left-2-right expression tree; compilation directives to disable function declaration & enable stack frames (disabled by default to comply with HakerRank problem definition).

13-06-2017_bis : function calls, full proper evaluation of expressions (postfix & prefix increment operators).

TODO: add these statements : "switch () {}", "break", "return".

Sample program:

<< allow_stack_frames >>

int i, j, k, m;

for (int i := 0; i < 10; ++i) {

j += i < 5 ? 1 : 10

}

k:=0

while (k < 10) do { k++ }

int fibonacci0 := 0; fibonacci1 :=1;

do {

k:= fibonacci0 + fibonacci1;

fibonacci0 := fibonacci1;

fibonacci1 := k;

}

while (fibonacci1 < 50);

Sample program:

<< allow_stack_frames >>

function factorial(n) {

if ( n <= 1 )

then {

1;

} else {

n * factorial(n-1);

}

}

f3 := factorial( 3 );

f5 := factorial( 5 );

f10 := factorial( 10 );

f20 := factorial( 20 );

Output: the value of the variables at program end:

f10 3628800 f20 2432902008176640000 f3 6 f5 120

Program parsing output:

all input done: lines count = 15 tokens count = 61 chars=233 parsed 61 out of 61tokens. instruction count=68 IP=0 : Program IP=2 : Function definition list IP=3 FuncDefList : Function definition 'factorial' IP=5 FuncDefList FuncDef factorial( : Name list IP=6 FuncDefList FuncDef factorial( : Var 'n' IP=7 FuncDefList FuncDef factorial() { : Statement block IP=8 FuncDefList FuncDef factorial() { { : Statement sequence IP=9 FuncDefList FuncDef factorial() { { SS : IF IP=10 FuncDefList FuncDef factorial() { { SS if( : Expression list IP=12 FuncDefList FuncDef factorial() { { SS if( EL- : Binary operator '<=' L2R=1 PRIO=7 IP=11 FuncDefList FuncDef factorial() { { SS if( EL-opB<= : Var 'n' IP=13 FuncDefList FuncDef factorial() { { SS if( EL-opB<= : Number '1' IP=14 FuncDefList FuncDef factorial() { { SS if( EL=> : Expression list IP=15 FuncDefList FuncDef factorial() { { SS if() then { : Statement block IP=16 FuncDefList FuncDef factorial() { { SS if() then { { : Statement sequence IP=17 FuncDefList FuncDef factorial() { { SS if() then { { SS : Expression list IP=18 FuncDefList FuncDef factorial() { { SS if() then { { SS EL- : Number '1' IP=19 FuncDefList FuncDef factorial() { { SS if() then { { SS EL=> : Expression list IP=20 FuncDefList FuncDef factorial() { { SS if() then {} else { : Statement block IP=21 FuncDefList FuncDef factorial() { { SS if() then {} else { { : Statement sequence IP=22 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS : Expression list IP=24 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL- : Binary operator '*' L2R=1 PRIO=10 IP=23 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* : Var 'n' IP=25 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* : Function-call 'factorial' IP=26 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( : Expression list IP=28 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL- : Binary operator '-' L2R=1 PRIO=9 IP=27 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL-opB- : Var 'n' IP=29 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL-opB- : Number '1' IP=30 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL-opB* Call-factorial( EL=> : Expression list IP=31 FuncDefList FuncDef factorial() { { SS if() then {} else { { SS EL=> : Expression list IP=32 : Statement sequence IP=33 SS : Expression list IP=34 SS EL- : Assignment ':=' IP=35 SS EL-:= : Var 'f3' IP=36 SS EL-:= : Function-call 'factorial' IP=37 SS EL-:= Call-factorial( : Expression list IP=38 SS EL-:= Call-factorial( EL- : Number '3' IP=39 SS EL-:= Call-factorial( EL=> : Expression list IP=40 SS EL=> : Expression list IP=41 SS, : Statement sequence IP=42 SS, SS : Expression list IP=43 SS, SS EL- : Assignment ':=' IP=44 SS, SS EL-:= : Var 'f5' IP=45 SS, SS EL-:= : Function-call 'factorial' IP=46 SS, SS EL-:= Call-factorial( : Expression list IP=47 SS, SS EL-:= Call-factorial( EL- : Number '5' IP=48 SS, SS EL-:= Call-factorial( EL=> : Expression list IP=49 SS, SS EL=> : Expression list IP=50 SS, SS, : Statement sequence IP=51 SS, SS, SS : Expression list IP=52 SS, SS, SS EL- : Assignment ':=' IP=53 SS, SS, SS EL-:= : Var 'f10' IP=54 SS, SS, SS EL-:= : Function-call 'factorial' IP=55 SS, SS, SS EL-:= Call-factorial( : Expression list IP=56 SS, SS, SS EL-:= Call-factorial( EL- : Number '10' IP=57 SS, SS, SS EL-:= Call-factorial( EL=> : Expression list IP=58 SS, SS, SS EL=> : Expression list IP=59 SS, SS, SS, : Statement sequence IP=60 SS, SS, SS, SS : Expression list IP=61 SS, SS, SS, SS EL- : Assignment ':=' IP=62 SS, SS, SS, SS EL-:= : Var 'f20' IP=63 SS, SS, SS, SS EL-:= : Function-call 'factorial' IP=64 SS, SS, SS, SS EL-:= Call-factorial( : Expression list IP=65 SS, SS, SS, SS EL-:= Call-factorial( EL- : Number '20' IP=66 SS, SS, SS, SS EL-:= Call-factorial( EL=> : Expression list IP=67 SS, SS, SS, SS EL=> : Expression list Stack-Frames 1 BP=99999 SP=99995

The grammar:

Program ::= Compilation_Directive_List Function_list Statement_sequence

Compilation_Directive_List ::= '<<' Compilation_Directive '>>' Compilation_Directive_List

| <NOTHING>

Compilation_Directive ::= 'allow_functions' | 'allow_stack_frames'

Function_list ::= Function Function_list

| <NOTHING> <== special care needed for <NOTHING>

Function ::= 'function' Name '(' Name_list ')' '{' Statement_sequence '}'

Name_list ::= Name Name_list_tail

| <NOTHING> <== special care needed for <NOTHING>

Name_list_tail ::= ',' Name Name_list_tail

| <NOTHING> <== special care needed for <NOTHING>

Name ::= [a..z_]Name_tail

Name_tail ::= [a..z_0..9]Name_tail

| <NOTHING> <== special care needed for <NOTHING>

Statement_sequence ::= Statement

| Statement ';' Statement_sequence

| ';' Statement_sequence

| '{' Statement_sequence '}' Statement_sequence

| <NOTHING> <== special care needed for <NOTHING>

-- FIXME: add a return statement

Statement ::= Expression_list

| 'if' '(' Expression_list ')' 'then' '{' Statement_sequence '}' else '{' Statement_sequence '}'

| 'while' '(' Expression_list ')' 'do' '{' Statement_sequence '}'

| 'do' '{' Statement_sequence '}' 'while' '(' Expression_list ')'

| 'for' '(' Expression_list_Or_Declaration ';' Expression_list_empty ';' Expression_list_empty ')' '{' Statement_sequence '}'

| Statement_declaration <= create the variable in the current depth block, init with 0

Expression_list_Or_Declaration ::= Expression_list

| Statement_declaration

| <NOTHING>

Statement_declaration ::= 'int' Statement_declaration_body

Statement_declaration_body ::= Name

| Name ':=' Expression

| Name ':=' Expression ',' Statement_declaration_body

| Name ',' Statement_declaration_body

Expression_list ::= Expression Expression_list_tail

Expression_list_empty ::= Expression Expression_list_tail

| <NOTHING> <== special care needed for <NOTHING>

Expression_list_tail ::= ',' Expression Expression_list_tail

| <NOTHING> <== special care needed for <NOTHING>

-- ternary operator => lowest priority

-- Right-to-left

Expression ::= LeftValue

| LeftValue '?' LeftValue ':' Expression

--

-- A proper POSTFIX & PREFIX increment ++ &-- operators, as well as proper assignment (based on complex LValue expressions) will-- need either a much more complicated grammar, or a 2 pass compilation

-- too much for now :D

--

-- Right-to-left

LeftValue ::= p_RV_log_OR

| Name [:= += -= *= /= %= >>= <<= |= &= ^= ||= &&= ~=] Expression

-- Left-to-right

p_RV_log_OR ::= RightValue_log_AND

| RightValue_log_AND ['||' 'or']RightValue

-- Left-to-right

RightValue_log_AND ::= RightValue_bit_OR

| RightValue_bit_OR ['&&' 'and']RightValue_log_AND

-- Left-to-right

RightValue_bit_OR ::= RightValue_bit_X_OR

| RightValue_bit_X_OR '|' RightValue_bit_AND

-- Left-to-right

RightValue_bit_X_OR ::= RightValue_bit_AND

| RightValue_bit_AND '^' RightValue_bit_AND

-- Left-to-right

RightValue_bit_AND ::= RightValue_EQUAL

| RightValue_EQUAL '&' RightValue_bit_AND

-- Left-to-right

RightValue_EQUAL ::= RightValue_REL

| RightValue_REL [== !=]RightValue_EQUAL

-- Left-to-right

RightValue_REL ::= RightValue_SHIFT

| RightValue_SHIFT [< <= > >=]RightValue_REL

-- Left-to-right

RightValue_SHIFT ::= RightValue_ADDITION

| RightValue_ADDITION [<< >>]RightValue_SHIFT

-- Left-to-right

RightValue_ADDITION ::= RightValue_MULTIPLY

| RightValue_MULTIPLY [+ -]RightValue_ADDITION

-- Left-to-right

RightValue_MULTIPLY ::= RightValue_UNARY

| RightValue_UNARY [* / %]RightValue_MULTIPLY

-- Right-to-left

RightValue_UNARY ::= RightValue_PREFIX_INC

| [+ - ~ !]RightValue_UNARY

-- Right-to-left

RightValue_PREFIX_INC ::= RightValue_ATOM

| ['++' '--'] RightValue_PREFIX_INC_TAIL

RightValue_PREFIX_INC_TAIL ::= RightValue_Name_and_POSTFIX_INC

| ['++' '--'] RightValue_PREFIX_INC_TAIL

-- left to right

RightValue_Name_and_POSTFIX_INC ::= Name RightValue_POSTFIX_INC_TAIL

-- left to right

RightValue_POSTFIX_INC_TAIL ::= ['++' '--']RightValue_POSTFIX_INC_TAIL

| <NOTHING> <== special care needed for <NOTHING>

RightValue_ATOM ::= Number

| RightValue_Name_and_POSTFIX_INC

| Name '(' Expression_list_empty ')' <= function call

| 'true'

| 'false'

| '(' Expression ')' <= reccursive expression