_build/content/articles/xerl-0.3-atomic-expressions.asciidoc


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135

+++
date = "2013-02-18T00:00:00+01:00"
title = "Xerl: atomic expressions"

+++

We will be adding atomic integer expressions to our language.
These look as follow in Erlang:

[source,erlang]
42.

And the result of this expression is of course 42.

We will be running this expression at compile time, since we
don't have the means to run code at runtime yet. This will of
course result in no module being compiled, but that's OK, it will
allow us to discuss a few important things we'll have to plan for
later on.

First, we must of course accept integers in the tokenizer.

[source,erlang]
{D}+ : {token, {integer, TokenLine, list_to_integer(TokenChars)}}.

We must then accept atomic integer expressions in the parser.
This is a simple change. The integer token is terminal so we need
to add it to the list of terminals, and then we only need to add
it as a possible expression.

[source,erlang]
expr -> integer : '$1'.

A file containing only the number 42 (with no terminating dot)
will give the following result when parsing it. This is incidentally
the same result as when tokenizing.

[source,erlang]
----
[{integer,1,42}]
----

We must then evaluate it. We're going to interpret it for now.
Since the result of this expression is not stored in a variable,
we are going to simply print it on the screen and discard it.

[source,erlang]
----
execute(Filename, [{integer, _, Int}|Tail], Modules) ->
    io:format("integer ~p~n", [Int]),
    execute(Filename, Tail, Modules).
----

You might think by now that what we've done so far this time
is useless. It brings up many interesting questions though.

* What happens if a file contains two integers?
* Can we live without expression separators?
* Do we need an interpreter for the compile step?

This is what happens when we create a file that contains two
integers on two separate lines:

[source,erlang]
----
[{integer,1,42},{integer,2,43}]
----

And on the same lines:

[source,erlang]
----
[{integer,1,42},{integer,1,43}]
----

Does this mean we do not need separators between expressions?
Not quite. The `+` and `-` operators are an
example of why we can't have nice things. They are ambiguous. They
have two different meanings: make an atomic integer positive or
negative, or perform an addition or a substraction between two
integers. Without a separator you won't be able to know if the
following snippet is one or two expressions:

[source,erlang]
42 - 12

Can we use the line ending as an expression separator then?
Some languages make whitespace important, often the line
separator becomes the expression separator. I do not think this
is the best idea, it can lead to errors. For example the following
snippet would be two expressions:

[source,erlang]
----
Var = some_module:some_function() + some_module:other_function()
    + another_module:another_function()
----

It is not obvious what would happen unless you are a veteran
of the language, and so we will not go down that road. We will use
an expression separator just like in Erlang: the comma. We will
however allow a trailing comma to make copy pasting code easier,
even if this means some old academics guy will go nuts about it
later on. This trailing comma will be optional and simply discarded
by the parser when encountered. We will implement this next.

The question as to how we will handle running expressions
remains. We have two choices here: we can write an interpreter,
or we can compile the code and run it. Writing an interpreter
would require us to do twice the work, and we are lazy, so we will
not do that.

You might already know that Erlang does not use the same code
for compiling and for evaluating commands in the shell. The main
reason for this is that in Erlang everything isn't an expression.
Indeed, the compiler compiles forms which contain expressions,
but you can't have forms in the shell.

How are we going to compile the code that isn't part of a module
then? What do we need to run at compile-time, anyway? The body of
the file itself, of course. The body of module declarations. That's
about it.

For the file itself, we can simply compile it as a big function
that will be executed. Then, everytime we encounter a module
declaration, we will run the compiler on its body, making its body
essentially a big function that will be executed. The same mechanism
will be applied when we encounter a module declaration at runtime.

At runtime there's nothing else for us to do, the result of this
operation will load all the compiled modules. At compile time we
will also want to save them to a file. We'll see later how we can
do that.

* https://github.com/extend/xerl/blob/0.3/[View the source]