COMPILATEUR TIGER
This document presents the EPITA version of the Tiger project. It was
last updated 5 February 2003.
Copyright © 2000, 2001, 2002, 2003 Akim Demaille.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover texts and with the no Back-Cover
Texts. A copy of the license is included in the section entitled "GNU
Free Documentation License."
Reports
Everything exposed in this document is expected to be known.
Compilation: Projects and Reports
This document was last edited on 5 February 2003. Make sure the sections have
been updated for your promotion.
This document (split and
beautiful or in
one simple piece) details the various tasks the "Compilation" students must
complete. It includes both the practical part (Tiger compiler) and the
theoretical part (reports).
Tarballs
There are a few mandatory requirements over the tarballs.
Delivery Dates
The tarballs must be made via make distcheck, and put in the
requested location (see Student
Tarballs). This tarball will be copied at the exact moment of the first
delivery date, and the list of project which were found/not found is
reported in epita.cours.compile.
If your tarball was not fetched, send it immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]:
bardec_f-tc-n.tar.bz2 as Subject. If you are
quick enough (within a couple of hours), there won't be penalties. Run this: mutt -c akim@epita.fr \
-s "[Tiger]: bardec_f-tc-n.tar.bz2" \
-a bardec_f-tc-n.tar.bz2 \
yaka@epita.fr </dev/null
Warning: Do not use elm to send your tarballs.
Each time someone tried, it failed. And anyway, just never use
elm.
Within a couple of hours after delivery date, a program evaluates your
tarball. The results are available on the web, as
bardec_f-tc-n.log. You should read this
file as soon as possible, do not wait for the marks to be computed to
react. If you realize there were stupid mistakes (such as displaying
/* escapes */ instead of /* escaping */), send
immediately (the day of the first delivery date) another tarball. This will cost
you a patch, which is typically -1.
If you have massive failures, there is a second delivery date, which costs
super_patch, typically -3. Beyond the second delivery date, no
additional change will be accepted. Not that the second delivery is not
automated: you have to send immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]:
bardec_f-tc-n.tar.bz2 as Subject.
Given Tarballs
The naming scheme for provided tarballs is different from the scheme you must
follow (see Student
Tarballs). Our naming scheme looks like 2004-tc-2.0.tar.bz2.
If we update the tarballs, they will be named 2004-tc-2.x.tar.bz2.
But your tarball must be named
login-tc-2.tar.bz2, even if you send a
second version of your project.
We also provide some test cases. There are few of them, and you will have to
write your own tests. Writing tests is part of the project. Do not just
copy test cases from other groups, as you will not understand why they were
written.
The initial test suite is available for download at tests.tgz.
It contains the following directories:
good
- This programs are correct.
scan
- These programs have lexical errors.
parse
- These programs have syntax errors.
type
- These programs contain type mismatches.
Project Layout
Here is the mandatory layout of the tarball:
AUTHORS
- In the top level of the distribution, there must be a file
AUTHORS which contents is as follows: Fabrice Bardèche <bardec_f@epita.fr>
Jean-Paul Sartre <sartre_j@epita.fr>
Jean-Paul Deux <deux_j@epita.fr>
Jean-Paul Belmondo <belmon_j@epita.fr>
The group leader is the first in the list. Do not include emails other than
those of EPITA. I repeat: give the 6_1@epita.fr address. Note
that the file AUTHORS is automatically distributed, but pay
attention to the spelling.
ChangeLog
- Optional. The list of the changes made in the compiler, with the dates and
names of the people who worked on it. See the Emacs key binding
C-x 4
a.
README
- Various free information.
src/
- All the sources are in this directory.
tests/
- Test files. Do not include this directory in the tarball, I
already have my own test files, there is no need for you to give it to me
again. Save my disk!
src/tc
- Your compiler.
src/tc.cc
- Main entry, option processing. Called, the driver.
src/symbol/
- Namespace
symbol:
symbol.hh
- The handling of the symbols.
table.hh
- The handling of generic symbol tables, i.e., it is independent of
functions, types and variables.
src/ast/
- Namespace
ast. The abstract syntax.
src/parse/
- Namespace
parse: parsetiger.yy,
scantiger.ll, libparse.hh which prototypes what
tc.cc needs to know about the module parse.
src/type/
- Namespace
type. Type checking.
libtype.hh
- The interface for the driver.
types.hh
- The definition of all the types. You are free to use whatever layout you
wish (several files); we have a single
types.hh file.
type-entry.hh
- Definitions of
type::TypeEntry,
type::VarEntry, and type::FunEntry, used in
type::TypeEnv to associate data to types, variables, and
functions (obviously).
type-env.hh
- The types environment, comprising three symbol tables: types, functions,
and variables.
src/translate/
- Namespace
translate. Translation to intermediate code
translation. It includes:
libtranslate.hh
- The interface.
libtranslate.cc
- The compiled module.
translation.hh
- Helping routines to perform the translations. For instance, it contains
Exp *simpleVar (const Access &access, const Level
&level), Exp *callExp (const temp::Label &label,
std::list<Exp *> args) etc. which are routines that produce
some Tree::Exp.
translate-visitor.hh
- Implements the class
TranslateVisitor which performs the IR
generation thanks to translation.hh. It must not be polluted
with translation details: it is only coordinating the AST traversal with the
invocation of translation routines. For instance, here is the translation of
a ast::SimpleVar: virtual void visit (const SimpleVar& e)
{
const Access &access = _env.var_access_get (e.name_get ());
_exp = simpleVar (access, *_level);
}
Student Tarballs
Each group must provide a tarball. If bardec_f is the head of
your group, the tarball must be
~bardec_f/c/rendu/tiger/bardec_f-tc-n.tar.bz2
where n is the number of the "release". The following commands must
work properly: $ bunzip2 -cd bardec_f-tc-n.tar.bz2 | tar xvf -
$ cd bardec_f-tc-n
$ ./configure CC=gcc-3.2 CXX=g++-3.2
$ make
$ cd src
$ ./tc /tmp/test.tig
where gcc-3.2 and g++-3.2 are the name of the GNU C and
C++ Compiler 3.2 executables.
Your tarball must be done via make distcheck. If you just run
make dist, then you might forget to include some files in the
distribution. If you don't even run make dist, then not only some
files might be missing, but you have no guarantee that the tarball will compile
elsewhere (not to mention that we don't care about object files etc.).
Do not deliver the tests that we gave: we already have them.
Any tarball which is not built thanks to make distcheck (this is
easy to see: they include files we don't want, and don't contain some files we
need...) will be penalized with at least ### tarball_not_clean.
Code Sanity
The code you deliver must be clean. In particular, when some code is
provided, and you have to fill in the blanks denoted by FIXME: Some code
has been deleted.. Sometimes you will have to write the code from
scratch.
In any case, dead code and dead comments must be removed. You are
free to leave comments spotting places where you fixed a FIXME:,
but never leave a fixed FIXME: in your code. Nor any irrelevant
comment.
The official compiler for this project, is GNU C++ Compiler, 3.2 or higher.
Evaluation
Some stages are evaluated only by a program, and others are evaluated both by
humans, and a program.
Automated Evaluation
Each stage of the compiler will be evaluated by an automatic corrector. As
soon as the tarball are delivered, the logs are available on
http://www.lrde.epita.fr/~akim/compil, in the directory
corresponding to your promotion and stage. For instance, 2004 students ought to
read
http://www.lrde.epita.fr/~akim/compil/2004/4/bardec_f-tc-4.log.
We stress that automated evaluation enforces the requirements: you
must stick to what is being asked. For instance, for T3 it is
explicitly asked to display something like: var /* escaping */ i : int := 2
so if you display any of the following outputs var i : int /* escaping */ := 2
var i /* escaping */ : int := 2
var /* Escapes */ i : int := 2
be sure to fail all the tests, even if the computation is correct.
If you find some unexpected errors (your project does compile with the
reference compiler, some files are missing, your output is slightly incorrect
etc.) immediately send a new tarball to yaka@epita.fr with [Tiger] as
prefix of the subject. This corresponds to ### patch.
Do not wait for the final marks to be computed, this is extremely irritating,
and doomed to failure. You must understand that (i) you increase our workload,
and (ii) anyway this is the wrong approach, the Tiger Compiler is a big project
which must be continuously improved.
If, anyway, you send a tarball to fix your problems long after the initial
date, you will be flagged as ### super_late, which impact on the
mark is quite bad...
During the Examination
When you are defending your projects, here are a few rules to follow:
- Don't talk
- Don't talk unless you are asked to: when a person is asked a question,
s/he is the only one to answer. You must not talk to each other either: often,
when one cannot answer a question, the question is asked to another member. It
is then obvious why the members of the group shall not talk.
- Don't touch the screen
- Don't touch my display! You have nice fingers, but I don't need their
prints on my screen.
- Tell the truth
- If there is something the examiner must know (someone did not work on the
project at all, some files are coming from another group etc.), say it
immediately, for, if we discover that by ourselves, you will be severely
sanctioned.
- Learn
- It is explicitly stated that you can not have worked on a stage
provided this was an agreement with the group. But it is also
explicitly stated that you must have learned what was to be learned from
that compiler stage, which includes C++ techniques, Bison and Flex
mastering, object oriented concepts, design patterns and so forth.
- Complain now!
- If you don't agree with the notation, say it immediately. Private messages
about "this is unfair: I worked much more than bardec_f but his grade is
better than mine" are thrown away.
Conversely, there is something I wish to make clear: I, Akim, and the other
examiners, will probably be harsh (maybe even very harsh), but this does not
mean I disrespect you, or judge you badly.
You are here to defend your project and knowledge, I'm here to stress them,
to make sure they are right. Learning to be strong under pressure is part of the
exercise. Don't burst into tears, react! Don't be shy, that's not the proper
time: you are selling me something, and I will never buy something from someone
who cries when I'm criticizing his product.
You should also understand that human examination is the moment where we try
to evaluate who, or what group, needs help. We are here to diagnose your project
and provide solutions to your problems. If you know there is a problem in your
project, but you failed to fix it, tell it to the examiner! Work with
her/him to fix your project.
Human Evaluation
The point of this evaluation is to measure:
- the quality of the code
- How clean it is, amount of code duplication, bad hacks, standards
violations (e.g.,
stderr is forbidden in proper C++ code) and so
forth. It also aims at detecting cheaters, who will be severely punished (mark
= -42).
- the knowledge each member acquired
- While we do not require that each member worked on a stage, we do require
that each member (i) knows how the stage works and (ii) has perfectly
understood the (C++, Bison etc.) techniques needed to implement the stage.
In results in an evaluation file
(login-tc-stage.eval,
e.g., bardec_f-tc-2.eval), structured as follows.
The format is free, except some lines starting with specific markers,
when they are at the first column. The first marker is *,
used to denote the login of the group head: * bardec_f
----------------------------------------
2002-03-07: T2 par akim
tc-check: Summary:
tc-check: == Testing ./tc --parse-trace -l --ast-display ==
Successes: 177/177
Suppléments:
- --stdin pour parser stdin.
- Le parser fait attention à ses leaks, même sur les Symbol.
- --gcc-ast
- ChangeLog
- Ne meurt pas quand on lui donne un répertoire en entrée.
Le code est joli, et en 80 col !
Plutôt que des listes vides, passent 0.
As you can see, the person who evaluated is allowed to put whatever comment
comes to her mind. Specific features, specific failures, explanations to help
the students to fix their project should be included. Each time an entry is
made, there should be the date, and who wrote it (2002-03-07: T2 par
akim).
Then, the mark of the project, aka, the note of the group: ### Note T2 = 20
and optionally:
### late
- The code was delivered reasonably late (the same day).
### super_late
- More than one day.
### patch
- Within 24 hours from the first delivery date, the group submitted another
(hopefully) fixed tarball.
### super_patch
- Before the second delivery date, the group submitted another tarball.
### not_compile
- It doesn't compile, and the examiner had to change the code.
### tarball_not_clean
- Clear enough: bad name, bad content etc.
### cheater
- The group has cheated. The examiner is then asked to write as many details
as possible, and to flag the group as cheaters. The mark is likely to be
negative, and exclusion from EPITA is possible.
### reevaluation
- The group had a mark < 8, and is now being reexamined.
### bonus
- The group has implemented additional features, such as XML
output etc. and the project works properly. We do not care
about extra features on top of a broken project.
Important note to the examiners:
Note field.
The examiner should not take (too much) the automated tests into account to
decide the mark. This is because the mark is computed later, taking this into
account, so don't do it twice. Similarly, a very nice project, which is super
late, shall be flagged as super late, but should have a good ##
Note entry.
Important note to the examiners: broken tarballs.
If you fixed the tarball (### not_compile, or ###
tarball_not_clean or whatever modification, you must
run make distcheck again, and replace the tarball they delivered
with the new one. Do not keep the old tarball, do not install it in a special
place: just replace the first tarball with it, but say so in the
eval file.
The rationale is simple: only tarballs pass the tests, and every tarball
must be able to pass the tests. If you don't do that, then someone else will
have to do it again.
The next bits is the evaluation of each member of the group: ----------------------------------------
## bardec_f:100:120
Parser, locations, scanner. A pratiquement e'crit tout T2.
##sartre_j:100:40
AST, Visitor
Pourtant ne comprend rien au Visitor. Se fout de ma gueule.
##deux_j:80:80
Pas de problème C++.
FIXME: Sa note de T1 est à revoir (resoutenance).
##belmon_j:100:100
Actions, les delete dans le parser.
A implémenté la sortie en AST compatible avec GCC.
The formalism is ##
login:bonus1:bonus2.
If the stage is T2, then bonus1 refers to a bonus on T1, and
bonus2 on T2. For T4, it covers T3 and T4, and so forth.
The values can be "negative" (80, 60,
40, 0), or positive if a member deserves it
(100, 120, or even more if justifiable).
There should be information about who did what.
Marks Computation
Because the Tiger Compiler is a project with stages, the computation of the
marks depends on the stages too. To spell it out explicitly:
A stage is penalized by bad results on tests performed for
previous stages.
It means, for instance, that a T3 compiler will be exercised on T1, T2, and
T3. If there are still errors on T1 and T2 tests, they will pessimize the result
of T3 tests. The older the errors are, the more expensive they are.
As an example, here are the formulas to compute the global success rate of T3
and T5: global-rate-T3 := rate-T3 * (+ 2 * rate-T1
+ 1 * rate-T2) / 3
global-rate-T5 := rate-T5 * (+ 4 * rate-T1
+ 3 * rate-T2
+ 2 * rate-T3
+ 1 * rate-T4) / 10
Because a project which fail half of the time is not a project that deserves
half of 20, the global-rate is elevated to 1.7 before computing the mark: mark-T3 := roundup (power (global-rate-T3, 1.7) * 20 - malus-T3, 1)
where roundup (x, 1) is x
rounded up to one decimal (roundup (15, 1) = 15, roundup
(15.01, 1) = 15.1).
When the project is also evaluated by a human, power is not
used. Rather, the success rate modifies the mark given by the examiner: mark-T2 := roundup (eval-T2 * global-rate-T2 - malus-T2, 1)
Reevaluation
It happens that some groups have big problems with their project. Here is how
to solve these issues.
But first of all, you must know that cheating, stealing another group's
project, is not a solution, because:
- you won't learn. Do not forget that the Tiger Compiler is only an
exercise: having a good grade is less important than understanding and being
efficient when you will be hired by a company.
- we will certainly notice you are not able to answer our questions.
- when we find cheaters, they pay it very severely (
###
cheater).
Because the Tiger Compiler is a long project, you continuously have to
improve it, but marks starting from 8 to higher are definitive. If your
mark is less than 8, can be re-examined: you must provide a better tarball, and
in the case of human evaluated stages, another audition will be made.
This new audition will be flagged as ### reevaluation. It cannot
be better than 12/20, even if your project is excellent: a good project out of
date cannot be judged better than a reasonable good project on time.
Pay attention that when providing an updated tarball for reexamination of
stage n, it must follow the naming scheme of stage n, even
if the contents goes further. For instance, it is perfectly valid to ask for a
T2 reevaluation with a compiler that implements T4, but the tarball must be
bardec_f-tc-2.tar.bz2.
You have to ask for a reevaluation, we will not look after you.
Compiler Stages
The compiler will be written in several steps, described below.
T0, Naive Scanner and Parser
This section has been updated for EPITA-2005.
T0 is a weak form of T1: the scanner and the parser are written, but there is
a set of simplifications:
- the hierarchy
- We don't need several directories, you can program in the top level of the
package.
- the build
- The GNU Build System is not used: there is no need for Autoconf, Automake
etc.
- no driver
- The driver is reduced to its simplest form:
int
main (int argc, const char *argc[])
{
assert (argc == 1);
yyin = fopen (argv[1]);
assert (yyin);
return !!yyparse ();
}
i.e., there is no support for options at all.
SCAN, PARSE
- There is no support for options, but that does not mean that you cannot
use the tracing/debugging features from Flex/Bison. It is required that you
bind tracing to the environment variables
SCAN and
PARSE. I.e., running PARSE=1 ./tc foo.tig
will set yydebug to 1, which causes the traces of the parsing
to be displayed.
YYPRINT
- There is no requirement to implement
YYPRINT support.
yylval
- All the values are supported: strings, integers and even symbols.
Nevertheless, symbols are returned as plain strings for the time being: the
class
symbol::Symbol is to be implemented in T1.
T0 Code to Write
You must write
scantiger.ll
- The scanner.
parsetiger.yy
- The parser, and maybe
main if you wish.
tc.cc (optionally)
- The driver, i.e.,
main. Putting it into
parsetiger.yy is OK in T0.
Makefile
- This file is mandatory. Running
make must build the
binary tc.
The requirements on the tarball are the same as usual, see Tarballs.
T1, Scanner and Parser
T1 delivery is Friday, February 14th 2003 at
noon. This section is updated for EPITA-2005.
Scanner and parser are properly running, but the abstract syntax tree is not
built yet. Differences with T0 include:
- GNU Build System
- Autoconf, Automake are used.
- Options, Tasks
- The compiler supports basic options via in the Task module.
- locations
- The locations are computed are properly reported in the error messages
T1 Samples
The only information the compiler will give is about lexical and syntax
errors.
If there are no errors, the compiler shuts up, and exits successfully: /* an array type and an array variable */
let
type arrtype = array of int
var arr1 : arrtype := arrtype [10] of 0
in
arr1[2]
end
File 1: test01.tig $ tc test01.tig
=>0
Example 2: tc test01.tig
If there are lexical errors, the exit status is 2, and a an error message is
output on the standard error output. Note that its format is standard: file,
(precise) location, and then the message. 1
/* This comments starts at /* 2.2 */
File 3: unterminated-comment.tig $ tc unterminated-comment.tig
error-->unterminated-comment.tig:2.2-3.0: unexpected end of file in a comment
=>2
Example 4: tc unterminated-comment.tig
If there are syntax errors, the exit status is set to 3: let var a : nil := ()
in
1
end
File 5: type-nil.tig $ tc type-nil.tig
error-->type-nil.tig:1.13-15: syntax error, unexpected "nil", expecting "identifier"
error-->Parsing Failed
=>3
Example 6: tc type-nil.tig
If there are errors which are non lexical, nor syntactic (Windows will not
pass by me): $ tc C:/TIGER/SAMPLE.TIG
error-->cannot open `C:/TIGER/SAMPLE.TIG': No such file or directory
=>1
Example 7: tc C:/TIGER/SAMPLE.TIG
The option --parse-trace, which relies on Bison's
%debug directive, and the use of YYPRINT, must work
properly: $ cat foo.tig
a + "a"
$ ./tc --parse-trace foo.tig
Starting parse
Entering state 0
Reading a token: Next token is 258 (ID a)
Shifting token 258 (ID), Entering state 2
Reading a token: Next token is 270 (PLUS)
Reducing via rule 74 (line 318), ID -> varid
state stack now 0
Entering state 16
Reducing via rule 34 (line 196), varid -> lvalue
state stack now 0
Entering state 13
Next token is 270 (PLUS)
Reducing via rule 33 (line 192), lvalue -> exp
state stack now 0
Entering state 12
Next token is 270 (PLUS)
Shifting token 270 (PLUS), Entering state 30
Reading a token: Next token is 257 (STRING a)
Shifting token 257 (STRING), Entering state 1
Reducing via rule 15 (line 151), STRING -> exp
state stack now 0 12 30
Entering state 65
Reading a token: Now at end of input.
Reducing via rule 27 (line 182), exp PLUS exp -> exp
state stack now 0
Entering state 12
Now at end of input.
Reducing via rule 1 (line 106), exp -> program
state stack now 0
Entering state 159
Now at end of input.
Shifting token 0 ($), Entering state 160
Now at end of input.
$ echo $?
=>0
Note that (i), it cannot see that the variable is not declared nor that there
is a type checking error, since type checking... is not implemented, and (ii),
the output might be slightly different, depending upon the version of Bison you
use. But what matters is that one can see the items: ID a,
STRING a.
T1 Given Code
Some code is provided: 2005-tc-1.0.tar.bz2.
It contains:
src/common.hh
- Used throughout the project.
src/tc.cc
- The compiler driver (option handling etc.).
src/parse/scantiger.ll
- The scanner.
src/parse/parsetiger.yy
- The parser.
src/ast/position.hh
- Keeping track of a point (cursor) in a file.
src/ast/location.hh
- Keeping track of a range (two cursors) in a (or two) file.
T1 Code to Write
Be sure to read Flex and Bison documentations and tutorials, see Flex
& Bison.
src/scantiger.ll
- The scanner must be completed. It must be able to read strings,
identifiers etc. Strings and identifiers will be stored as C++
std::string. See the following code for the basics. ...
\" yylval->str = new std::string (); BEGIN STATE_STRING;
<STATE_STRING>{ /* Handling of the strings. Initial " is eaten. */
\" {
BEGIN INITIAL;
return STRING;
}
...
\\x[0-9a-fA-F]{2} {
yylval->str->append (1, strtol (yytext + 2, 0, 16));
}
...
}
The locations are tracked.
src/parsetiger.yy
- The grammar must be complete, but do not implement actions. You have to
implement support for
--parse-trace (see T1
Samples). Pay special attention to the display of strings and identifiers.
Bison will certainly complain because of a type clash for some actions. For
instance, if you have given a type to STRING, but none to
exp, then it will choke on: exp: STRING;
because it actually means exp: STRING { $$ = $1; };
which is not type coherent. So write this instead: exp: STRING {};
src/ast/position.hh
- The class
ast::Position is completed.
src/ast/location.hh
- The class
ast::Location must be completed.
src/symbol/symbol.hxx
- The class
symbol::Symbol keeps a single copy of identifiers,
so that (i) we save space, and (ii) symbol comparison is fast. The file
src/symbol/symbol.hh describes the interface of the class
symbol::Symbol, but the implementation is to be written in
src/symbol/symbol.hxx.
T1 FAQ
- Options
- The first version of argp we used (1.1) was buggy: on
./tc foo.tig
-A, it considered -A was a file, not an option. This bug
is now fixed. The patch is from Niels Möller: diff -u -r1.16 -r1.17
--- argp-parse.c 18 Feb 2001 22:40:03 -0000 1.16
+++ argp-parse.c 4 Feb 2003 19:52:30 -0000 1.17
@@ -1021,6 +1021,8 @@
*arg_ebadkey = 1;
if (parser->first_nonopt != parser->last_nonopt)
{
+ exchange(parser);
+
/* Start processing the arguments we skipped previously. */
parser->state.next = parser->first_nonopt;
Go into argp/, run patch -p0, paste the patch,
type <CTRL>-d.
T2, Building the Abstract Syntax Tree
This section has been updated for EPITA-2004. Last update: 2002-02-14.
At the end of this stage, the compiler can build abstract syntax trees of
Tiger programs and pretty-print them.
T2 Samples
Here are a few examples of expected features.
The parser builds abstract syntax trees that can be output by a
pretty-printing module: /* define a recursive function */
let
/* calculate n! */
function fact (n : int) : int =
if n = 0
then 1
else n * fact (n - 1)
in
fact (10)
end
File 8: simple-fact.tig $ tc --ast-display simple-fact.tig
/* == Abstract Syntax Tree. == */
let
function fact (n : int) : int =
if (n = 0)
then 1
else (n * fact ((n - 1)))
in
fact (10)
end
Example 9: tc -ast-display simple-fact.tig
The output from your pretty-printer must be valid Tiger code, and be
equivalent to the input.
By valid, we mean that any Tiger compiler must be able to parse with
success your output. Pay attention to the banners such as ==
Abstract...: you should use comments: /* == Abstract... */.
There is one exception: you are not asked to escape quotes and backslashes when
printing a literal string.
By equivalent, we mean that except for syntactic sugar, the output
and the input are equal. Syntactic sugar refers to &,
|, unary -, and in some cases if then: % cat boolean-1.tig
1 = 1 & 2 = 2
% ./tc --ast-display boolean-1.tig
/* == Abstract Syntax Tree. == */
if (1 = 1)
then (2 = 2)
else 0
For loops must be properly displayed, i.e., although we use a
ast::VarDec for the index of the loop, you must not display
var: /* valid for and let */
let
var a := 0
in
for i :=0 to 100 do (a :=a+1; ())
end
File 10: for-loop.tig $ tc --ast-display for-loop.tig
/* == Abstract Syntax Tree. == */
let
var a := 0
in
for i := 0 to 100 do
(
a := (a + 1);
()
)
end
Example 11: tc -ast-display for-loop.tig
T2 Given Code
Some code is provided: 2004-tc-2.0.tar.bz2.
It contains:
src/ast
- Implementation of the abstract syntax tree. The file
README
gives an overview of the involved class hierarchy.
src/ast/location.hh, position.hh
- Classes used to hold locations computed from Tiger programs.
src/ast/visitor.hh
- Abstract base class of the compiler's visitor hierarchy.
src/ast/default-visitor.hh
- Implementation of the DefaultVisitor class, which walks the abstract
syntax tree, doing nothing. It is mainly used as a basis for deriving other
visitors.
src/ast/print-visitor.hh
- Implementation of the PrintVisitor class, which performs pretty-printing
in the tiger compiler.
src/symbol.*
- Have a look at how this class is implemented. Ask questions if you don't
understand it. You will not be given questions about it in T2, but T3, there
will be.
T2 Code to Write
What is to be done:
src/parse/scantiger.ll
- The scanner must now use
symbol::Symbol instead of
std::string for identifiers. Of course, the parser must be
adjusted too.
The scanner must be updated to keep track of locations of tokens in Tiger
programs. To adjust your scanner, you are strongly encouraged to use
YY_USER_ACTION, and also the yylex prologue: ...
%%
%{
// Everything here is run each time yylex is invoked.
%}
"if" return IF;
...
%%
...
Have a look at the scanner and parser chapters of this draft.
src/parse/parsetiger.yy
- The parser must be updated to ensure error recovery. The grammar must be
changed to process declarations by chunks.
In Tiger, the following program is invalid: let function foo () = ()
function foo () = ()
var foo := 0
in
()
end
while the following code is valid: let function foo () = ()
var foo := 0
function foo () = ()
in
()
end
this is because declarations are cut in "chunks" of declarations of the
same kind. In the first example, there are two chunks: one chunk of two
function declarations and one variable declaration. In the second example:
there are three chunks: one chunk containing only one function declaration,
one with one variable declaration, and the third one, with a single function
declaration again.
The rule is "a chunk cannot define twice the same name".
In order to implement this easily, you must adjust your grammar so that
declarations are parsed by chunks. Pay special attention to the
implementation of ast::FunctionDecs, ast::VarDecs,
and ast::TypeDecs (which are all implemented thanks to
ast::AnyDecs): they are these chunks of declaration. Therefore,
an ast::LetExp uses a list of chunks.
src/ast
- The abstract syntax tree module must be completed. Basically, this means
there should remain no
FIXME: anywhere in the code we gave.
Several files are missing (fieldvar.hh nilexp.hh
intexp.hh stringexp.hh callexp.hh
assignexp.hh whileexp.hh breakexp.hh
arrayexp.hh). See src/ast/README for additional
information on the missing classes.
src/ast/location.hh, position.hh
- The classes Position and Location must be completed.
src/ast/default-visitor.hh
- The DefaultVisitor class must be completed, and must be able to walk whole
abstract syntax trees. Do not forget that your DefaultVisitor must be a sound
basis for your further work on the Tiger compiler.
src/ast/print-visitor.hh
- The PrintVisitor class must be completed, and able to pretty-print any
Tiger program.
T2 FAQ
bison
- Be sure to read its dedicated section: Flex
& Bison.
- escapes (
escapes_get, etc.)
- These are place holders for T3. You don't have to complete them. Actually,
the easiest is probably just to comment these functions out. Or better yet:
#if 0/#endif.
T3, Computing the Escaping Variables
This section was updated for Tiger 2004. The project will be taken on Friday,
March 15th, at noon.
At the end of this stage, the compiler must be able to compute and display
the escaping variables. Be sure to read the chapter "Escapes" in the lecture
notes. Its parser is equipped with error recovery.
T3 Samples
This example demonstrates the computation and display of escaping
variables/formals. let
var escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(not_escaping; print_escaping ())
end
File 12: variable-escapes.tig $ tc --escapes-display variable-escapes.tig
/* == Escapes. == */
let
var /* escaping */ escaping := "I rule the world!"
var /* escaping */ not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
Example 13: tc -escapes-display
variable-escapes.tig $ tc --ast-display --escapes-compute --escapes-display variable-escapes.tig
/* == Abstract Syntax Tree. == */
let
var escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
/* == Escapes. == */
let
var /* escaping */ escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
Example 14: tc -ast-display -escapes-compute
-escapes-display variable-escapes.tig
You are strongly encouraged to run your compiler on merge.tig
and to study its output.
Another part of T3 is the improvement of your parser: it must be robust to
some forms of errors. On the following input: (
1;
(2, 3);
(4, 5);
6
)
it finds two errors: it does not stop on the first one: % ./tc error.tig
error-->error.tig:3.5: parse error, unexpected ",", expecting ";"
error-->error.tig:4.5: parse error, unexpected ",", expecting ";"
Of course, the exit status still reveals the parse error.
Be sure that your error recovery does not break the rest of the compiler...
T3 Code To Write
src/parse/parsetiger.yy
- Implement error recovery. There should be at least three uses of the token
error. Read the Bison documentation about it.
src/parse/parsetc-class.hh
- The current version of Bison no longer outputs this file. Be sure to
remove it from your project:
src/parse/parsetiger.hh alone is
enough.
ast::VarExp
- What is it used for? What is its meaning? Find another means to do the
same things, but without it. Hence destroy it, and remove its file:
ast must no longer use it.
ast::PrintVisitor
- Be sure to display the
/* escaping */ flag where needed, and
only where needed. If you don't pay attention, you might display
meaningless flags due to implementation details.
symbol::Table< class Entry_T >
- This item is irrelevant for Tiger 2004, as the code was accidently
delivered to you. Nevertheless, if you're willing to understand templates in
C++, you are encouraged to try writing your own version, without taking
inspiration from the code we gave.
Write the class template symbol::Table in
src/symbol/table.hh which is a table of symbols dedicated to
storing some data which type is Entry_T * (i.e., it maps a
symbol::Symbol to an Entry_T *). You are encouraged
to implement something simple, based on stacks (see std::list) of
maps (see std::map). We used to promote hash tables, but STL does
not include hash tables: this is a GNU (SGI actually) extension.
symbol::Table is implemented as a class template because we
will use it with several other kinds of Entry_T in both the
type::TypeVisitor and translate::TranslateVisitor.
symbol::Table must provide this interface:
Close the last scope, forgetting everything since the latest
scope_begin (). |
| put (Symbol key, Entry_T &
value) |
void |
| Associate value to key in the current scope.
|
| get (Symbol key) const |
Entry_T * |
If key was associated to some Entry_T in the
open scopes, return the most recent insertion. Otherwise return the
empty pointer. |
| print (std::ostream & ostr)
const |
void |
| Send the content of this table on ostr in a readable
manner, the top of the stack being displayed last.
|
ast::EscapeVisitor
- Write the class
ast::EscapeVisitor in
src/ast/escape-visitor.hh.
You are suggested to implement three additional classes in this file:
Definition
- An abstract class which is used to instantiate the template class
symbol::Table into Table <Definition>.
| escape_set (void) |
virtual void |
| Depth at which this object has been created. |
Returns the depth associated to this Definition
object. |
VariableDefinition
- Inherits from
Definition. It has one additional attribute,
a VarDec &. The method escape_set is
implemented, and when invoked, set the escapes flags of the
corresponding VarDec.
FormalDefinition
- Inherits from
Definition. To be designed by yourself. Do
not forget that the ast class used to register formals is used
elsewhere, and it would be a pity that your implementation makes no
difference... Be sure to write a test that verifies that your implementation
is not abused. I have one such test...
- Equip
ast
- All the sites where variables and formals (i.e., the arguments of the
functions being defined, not being used) are introduced must be
equipped with the
escape_get and escape_set methods.
Most probably the code was already given, and is using
const_casts; try to use mutable instead.
Modify the code so that each definition of an escaping variable/formal is
preceded by the comment /* escaping */ if the flag
display_escapes_p is true. See the item "Driver" for an example.
- Driver
- The driver,
src/tc.cc, must be enhanced to support two
additional options: --escapes-compute, which triggers the
ast::EscapeVisitor, and --escapes-display, which
sets display_escapes_p to true. This is not a modification of
--ast-display, it is a genuine second
ast::PrintVisitor which is invoked.
Please, note that --escapes-display does not imply
--escapes-compute (see T3
Samples).
T4, Type Checking
T4 delivery is Friday, April 12th 2002 at
noon. At the end of this stage, the compiler type checks Tiger
programs. Clear error messages are required.
T4 Samples
Here are a few examples: $ echo '1 + "2"' | ./tc -
$ echo '1 + "2"' | ./tc - --types-check
error-->standard input:1.1-7: type mismatch
error--> right operand type: string
error--> expected type: int
$ echo '1 + a' | ./tc -
$ echo '1 + a' | ./tc - --types-check
error-->standard input:1.5: undeclared variable: a
$ echo 'if 1 then 2' | ./tc -
$ echo 'if 1 then 2' | ./tc - --types-check
error-->standard input:1.1-11: type mismatch
error--> then clause type: int
error--> else clause type: void
T4 Given Code
src/type/libtype.hh
-
src/type/libtype.cc
- The interface of the Type module. It exports a simple procedure,
type_check.
src/type/types.hh
- The definition of all the types. The type
type::Nil is given
in full.
src/type/type-env.hh
- The environment used by the
type::TypeVisitor.
T4 Code to Write
What is to be done.
src/type
- New directory. Don't forget to update the
SUBDIRS variable in
src/Makefile.am, and the AC_CONFIG_FILES invocation
in configure.ac.
src/type/types.hh
type::String, type::Int, and
type::Void are to be implemented. It's dead easy. Using templates
would be particularly appreciated to factor the code between the four
singleton classes.
type::Named is almost entirely given.
type::Array is even simpler than the four Singletons.
type::Record is somewhat incomplete.
Pay attention to the implementation of
type::Type::assignable_to and
type::Type::comparable_to.
src/type/type-entry.hh
- This file is really a empty nutshell, so we give it complete so that you
concentrate on things that matter. Nonetheless you will be asked questions on
this file, so have a look at it.
src/type/type-env.hh
- The constructor of
type::TypeEnv must be completed: it must
fill the environment with the definition of builtin types and functions. See
the Tiger Reference Manual.
The handling of types is left as an example, you still have to implement
the variables and functions support.
type::TypeVisitor
- Of course this is the most tricky part. I hope there are enough comments
in there so that you understand what is to be done. Please, post your
questions and help me improve it.
- Driver
- The driver must run the type checker when given
--types-check. Note the plural, it spells
--types-check, not --type-check.
T5, Translating to the High Level Intermediate
Representation
At the end of this stage, all the errors of previous stages must have been
fixed, and the compiler translates the AST into the high level
intermediate representation, HIR for short.
T5 first delivery is Friday, June 14th, at
noon. Second delivery date is Monday, June 17th, at noon.
T5 Samples
T5 can be started (and should be started if you don't want to finish it in a
hurry) by first making sure your compiler can handle code that uses no
variables. Then, you can complete your compiler to support more and more Tiger
features.
T5 Primitive Samples
This example is probably the simplest Tiger program. 0
File 15: 0.tig $ tc --hir-display 0.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
const 0
# Epilogue
label end
Example 16: tc -hir-display 0.tig
You should then probably try to make more difficult programs with literals
only. Arithmetics is one of the easiest tasks. 1 + 2 * 3
File 17: arith.tig $ tc --hir-display arith.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
binop (+)
const 1
binop (*)
const 2
const 3
# Epilogue
label end
Example 18: tc -hir-display arith.tig
You should use havm to exercise your output. $ tc --hir-display arith.tig >arith.hir
=>0
Example 19: tc -hir-display arith.tig
>arith.hir $ havm arith.hir
=>0
Example 20: havm arith.hir
Unfortunately, without actually printing something, you won't see the final
result, which means you need to implement calls. Fortunately, you can ask
havm for a verbose execution: $ havm --trace arith.hir
error-->call ( name Main )
error-->const 1
error-->const 2
error-->const 3
error-->binop (*) 2 3
error-->binop (+) 1 6
error-->sxp 7
Example 21: havm -trace arith.hir
If you look carefully, you will find an sxp 7 in there...
Then you are encouraged to implement control structures. if 101 then 102 else 103
File 22: if-101.tig $ tc --hir-display if-101.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
eseq
seq
cjump ne
const 101
const 0
name l0
name l1
label l0
move
temp t0
const 102
jump
name l2
label l1
move
temp t0
const 103
jump
name l2
label l2
seq end
temp t0
# Epilogue
label end
Example 23: tc -hir-display if-101.tig
And even more difficult control structure uses: while 101 & 102
do (if 103 then break)
File 24: while-101.tig $ tc --hir-display while-101.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l7
cjump ne
eseq
seq
cjump ne
const 101
const 0
name l0
name l1
label l0
move
temp t0
const 102
jump
name l2
label l1
move
temp t0
const 0
jump
name l2
label l2
seq end
temp t0
const 0
name l8
name l3
label l8
seq
cjump ne
const 103
const 0
name l4
name l5
label l4
jump
name l3
jump
name l6
label l5
sxp
const 0
jump
name l6
label l6
seq end
jump
name l7
label l3
seq end
# Epilogue
label end
Example 25: tc -hir-display while-101.tig
T5 Builtin Calls Samples
But the game becomes more interesting when you implement function calls
(which is easier than compiling functions). print_int is probably
the first builtin to implement: (print_int (101); print ("\n"))
File 26: print-101.tig $ tc --hir-display print-101.tig > print-101.hir
=>0
Example 27: tc -hir-display print-101.tig >
print-101.hir $ havm print-101.hir
101
Example 28: havm print-101.hir
Complex values, arrays and records, also need calls to the runtime system: let type list = { h: int, t: list }
var list := list { h = 1, t = list { h = 2, t = nil } }
in
print_int (list.t.h); print ("\n")
end
File 29: print-list.tig $ tc --hir-display print-list.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
move
temp t2
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 4
# Body
seq
move
mem
temp fp
eseq
seq
move
temp t1
call
name l`malloc'
const 8
call end
move
mem
binop (+)
temp t1
const 0
const 1
move
mem
binop (+)
temp t1
const 4
eseq
seq
move
temp t0
call
name l`malloc'
const 8
call end
move
mem
binop (+)
temp t0
const 0
const 2
move
mem
binop (+)
temp t0
const 4
const 0
seq end
temp t0
seq end
temp t1
seq
sxp
call
name l`print_int'
mem
binop (+)
mem
binop (+)
mem
temp fp
const 4
const 0
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t2
label end
Example 30: tc -hir-display print-list.tig $ tc --hir-display print-list.tig > print-list.hir
=>0
Example 31: tc -hir-display print-list.tig >
print-list.hir $ havm print-list.hir
2
Example 32: havm print-list.hir
T5 Samples with Variables
Here is an example which demonstrates the usefulness of information about
escapes: in the first case, all the variables are expected to escape, hence are
stored on the stack, while in the second case, we know they don't
escape, hence they can be stored in a temporary. let var a := 1
var b := 2
var c := 3
in
a := 2;
c := a + b + c;
print_int (c);
print ("\n")
end
File 33: vars.tig $ tc --hir-display vars.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
move
temp t0
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 12
# Body
seq
move
mem
temp fp
const 1
move
mem
binop (+)
temp fp
const -4
const 2
move
mem
binop (+)
temp fp
const -8
const 3
seq
move
mem
temp fp
const 2
move
mem
binop (+)
temp fp
const -8
binop (+)
binop (+)
mem
temp fp
mem
binop (+)
temp fp
const -4
mem
binop (+)
temp fp
const -8
sxp
call
name l`print_int'
mem
binop (+)
temp fp
const -8
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t0
label end
Example 34: tc -hir-display vars.tig $ tc --escapes-compute --hir-display vars.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
move
temp t0
const 1
move
temp t1
const 2
move
temp t2
const 3
seq
move
temp t0
const 2
move
temp t2
binop (+)
binop (+)
temp t0
temp t1
temp t2
sxp
call
name l`print_int'
temp t2
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
label end
Example 35: tc -escapes-compute -hir-display
vars.tig $ tc --escapes-compute --hir-display vars.tig >vars.hir
=>0
Example 36: tc -escapes-compute -hir-display
vars.tig >vars.hir $ havm vars.hir
7
Example 37: havm vars.hir
The second example uses a simple function. let function fact (i: int) : int =
if i = 0 then 1
else i * fact (i - 1)
in
print_int (fact (15));
print ("\n")
end
File 38: fact15.tig $ tc --hir-display fact15.tig
/* == High Level Intermediate representation. == */
# Routine: fact
label l0
# Prologue
move
temp t1
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 8
move
mem
temp fp
temp i0
move
mem
binop (+)
temp fp
const -4
temp i1
# Body
move
temp rv
eseq
seq
cjump eq
mem
binop (+)
temp fp
const -4
const 0
name l1
name l2
label l1
move
temp t0
const 1
jump
name l3
label l2
move
temp t0
binop (*)
mem
binop (+)
temp fp
const -4
call
name l0
mem
temp fp
binop (-)
mem
binop (+)
temp fp
const -4
const 1
call end
jump
name l3
label l3
seq end
temp t0
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t1
label end
label l4
"
"
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
sxp
call
name l`print_int'
call
name l0
temp fp
const 15
call end
call end
sxp
call
name l`print'
name l4
call end
seq end
# Epilogue
label end
Example 39: tc -hir-display fact15.tig $ tc --hir-display fact15.tig >fact15.hir
=>0
Example 40: tc -hir-display fact15.tig
>fact15.hir $ havm fact15.hir
2004310016
Example 41: havm fact15.hir
T5 Given Code
Code that was delivered:
src/temp/temp.hh
- So called temporaries are pseudo-registers: we may allocate as
many temporaries as we want. Eventually the register allocator will map those
temporaries to either an actual register, or it will allocate a slot in the
activation block (aka frame) of the current function.
src/temp/label.hh
- We need labels for JUMPs, for functions, strings etc.
src/tree
- The implementation of the intermediate representation. The file
README should give enough explanations to understand how it
works.
Reading the corresponding explanations in Appel's book is mandatory.
It is worth noting that contrary to A. Appel, just as we did for
ast, we use n-ary structures. For instance, where Appel uses a
binary SEQ, we have an n-ary SEQ which allows us
to put as many statements as we want.
To avoid gratuitous name clashes, what Appel denotes EXP is
denoted SXP (Statement Expression), implemented in
translate::Sxp.
Please, pay extra attention to the fact that there are
temp::Temp used to create unique temporaries (similar to
symbol::Symbol), and tree::Temp which is the
intermediate representation instruction denoting a temporary (hence a
tree::Temp needs a temp::Temp). Similarly, on the
one hand, there is temp::Label which is used to create unique
labels, and on the other hand there are tree::Label which is the
IR statement to define to a label, and tree::Name used
to refer to a label (typically, a tree::Jump needs a
tree::Name which in turn needs a temp::Label).
src/translate/level.hh, src/translate/level.cc
- They implement
translate::Level, which are wrappers around
frame::Frame. A Frame knows only what are the
"variables" it contains, but for a language such as Tiger, where you may need
to access the variables of the "parent function", you must be able to go into
the parent's frame: this is the role of Level.
Level are Frame with, in addition, a link
(parallel to the static link) to the level of the enclosing
function.
src/translate/level-entry.hh
- All the information that the environment must keep about variables and
functions.
src/translate/level-env.hh
- The levels environment, containing
LevelVarEntry's and
LevelFunEntry's. We don't need to store information related to
types here.
src/translate/translation.hh
- This file contains functions used by the
translate::TranslateVisitor to translate the AST into IR. They
handle all the unCx etc. magic.
src/translate/translate-visitor.hh
- The
translate::TranslateVisitor which walks the AST and
builds the IR tree.
T5 Code to Write
You are encouraged to try first very simple examples: nil,
1 + 2, "foo" < "bar" etc. Then consider supporting
variables, and finally handle the case of the functions.
- Driver
- The driver must always perform the translation, but displays the result
iff the option
--hir-display was given. Obviously, an input that
has not been type-checked cannot be translated, so all the T5 options
(--hir-display, and optionally --hir-use-ifexp)
imply --types-check.
- Type Checking
- Beware that there is a new builtin:
print_int.
src/translate/fragment.hh
- It implements
translate::Fragment, an abstract class,
translate::DataFrag to store the literal strings, and
translate::ProcFrag to store the routines.
There remains to implement translate::ProcFrag::print which
outputs the routine themselves plus the glue code (allocating the
frame etc.).
src/translate/level-env.hh
- Code is missing. In particular, bear in mind that the initial environment
is not empty...
src/translate/translation.hh
- There are many holes to fill.
src/translate/translate-visitor.hh
- There are holes to fill.
T5 Options
This section documents possible extensions you could implement in T5.
T5 Optimizing Cascading If
This optimization aims at optimizing the number of jumps needed to compute
nested if. It is easy to implement, and therefore will be granted
with a reasonable bonus.
It consists in implementing the option --hir-use-ifexp, which
causes the compiler to use translate::IfThenElseExp instead of the
simpler scheme based on translate::Cx, Nx, and
Ex only.
Warning: If you support the option
--hir-use-ifexp then you will be tested on
translate::IfThenElseExp. If this options does nothing in your
compiler, then be sure to get 0 on all the corresponding tests.
First, here is the output without --hir-use-ifexp, and the
performances reported by havm: 1 = 1 & 2 = 2
File 42: boolean.tig $ tc --hir-display boolean.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
eseq
seq
cjump eq
const 1
const 1
name l0
name l1
label l0
move
temp t0
eseq
seq
move
temp t1
const 1
cjump eq
const 2
const 2
name l3
name l4
label l4
move
temp t1
const 0
label l3
seq end
temp t1
jump
name l2
label l1
move
temp t0
const 0
jump
name l2
label l2
seq end
temp t0
# Epilogue
label end
Example 43: tc -hir-display boolean.tig $ tc --hir-display boolean.tig >boolean-1.hir
=>0
Example 44: tc -hir-display boolean.tig
>boolean-1.hir $ havm --profile boolean-1.hir
error-->/* Profiling. */
error-->fetches from temporary : 2
error-->fetches from memory : 0
error-->binary operations : 0
error-->function calls : 1
error-->stores to temporary : 2
error-->stores to memory : 0
error-->jumps : 1
error-->conditional jumps : 2
error-->/* Execution time. */
error-->number of cycles : 17
Example 45: havm -profile boolean-1.hir
and here is the same, but with --hir-use-ifexp (pay attention
that it is passed before --hir-display): $ tc --hir-use-ifexp --hir-display boolean.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
cjump eq
const 1
const 1
name l0
name l1
label l0
seq
sxp
const 2
sxp
const 2
seq end
jump
name l2
label l1
sxp
const 0
label l2
seq end
# Epilogue
label end
Example 46: tc -hir-use-ifexp -hir-display
boolean.tig $ tc --hir-use-ifexp --hir-display boolean.tig >boolean-2.hir
=>0
Example 47: tc -hir-use-ifexp -hir-display
boolean.tig >boolean-2.hir $ havm --profile boolean-2.hir
error-->/* Profiling. */
error-->fetches from temporary : 0
error-->fetches from memory : 0
error-->binary operations : 0
error-->function calls : 1
error-->stores to temporary : 0
error-->stores to memory : 0
error-->jumps : 1
error-->conditional jumps : 1
error-->/* Execution time. */
error-->number of cycles : 12
Example 48: havm -profile boolean-2.hir
Of course, if you did not implement translate::IfThenElseExp
support, --hir-use-ifexp must cause tc to exit 64.
T5 Bound Checking
Implementing bound checking is quite simple: it consists in having the
program die when the program accesses an invalid subscript in an array. For
instance, the following code is "succeeds" with a non-bound-checking compiler. let type int_array = array of int
var size := 2
var arr1 := int_array [size] of 0
var arr2 := int_array [size] of 0
var two := 2
var m_one := -1
in
arr1[two] := 3;
arr2[m_one] := -1;
print_int (arr1[1]);
print ("\n");
print_int (arr2[0]);
print ("\n")
end
File 49: bound-violation.tig $ tc --hir-display bound-violation.tig >bound-violation.hir
=>0
Example 50: tc -hir-display bound-violation.tig
>bound-violation.hir $ havm bound-violation.hir
-1
3
Example 51: havm bound-violation.hir
When run with --bound-checking, have your compiler produce code
that dies on such cases.
T5 Optimizing Static Links
Warning: this optimization is difficult to do it perfectly, and
therefore, expect a big bonus.
In a first and conservative extension, the compiler considers that all the
functions (but the builtins!) need a static link. This is correct, but
inefficient: for instance, the traditional fact function will spend
almost as much time handling the static link, than its real argument.
Some functions need a static link, but don't need to save it on the stack.
For instance, in the following: let var foo := 1
function foo () : int = foo
in
foo ()
end
the function foo does need a static link to access the variable
foo, but does not need to store its static link on the stack.
It is suggested to address these problems in the following order:
- Implement the detection of functions that do not need a static
link (see exercise 6.5 in "Modern Implementation of Compilers"), but still
consider any static link escapes.
- Adjust the output of
--escapes-display to display /*
escaping sl */ before the first formal argument of the
functions (declarations) that need the static link: $ tc --escapes-display fact.tig
/* == Escapes. == */
let
function fact (/* escaping sl *//* escaping */ n : int) : int =
if (n = 0)
then 1
else (n * fact ( (n - 1)))
in
fact (10)
end
$ tc --escapes-display --escapes-com fact.tig
/* == Escapes. == */
let
function |