général
curriculum vitae
contact

CARDIWEB
2007
2006
2005

Etudes
les projets
les sites web
les exposés

liens
OTB World
EPITA
PCN
CARDIWEB

 

COMPILATEUR TIGER

This document presents the EPITA version of the Tiger project. It was last updated 5 February 2003.

Copyright © 2000, 2001, 2002, 2003 Akim Demaille.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover texts and with the no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License."

Reports

Everything exposed in this document is expected to be known.


Node:Top, Next:, Up:(dir)

Compilation: Projects and Reports

This document was last edited on 5 February 2003. Make sure the sections have been updated for your promotion.

This document (split and beautiful or in one simple piece) details the various tasks the "Compilation" students must complete. It includes both the practical part (Tiger compiler) and the theoretical part (reports).


Node:Tarballs, Next:, Previous:Top, Up:Top

Tarballs

There are a few mandatory requirements over the tarballs.

Delivery Dates

The tarballs must be made via make distcheck, and put in the requested location (see Student Tarballs). This tarball will be copied at the exact moment of the first delivery date, and the list of project which were found/not found is reported in epita.cours.compile.

If your tarball was not fetched, send it immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]: bardec_f-tc-n.tar.bz2 as Subject. If you are quick enough (within a couple of hours), there won't be penalties. Run this:

     mutt -c akim@epita.fr \
          -s "[Tiger]: bardec_f-tc-n.tar.bz2" \
          -a bardec_f-tc-n.tar.bz2 \
          yaka@epita.fr </dev/null
     

Warning: Do not use elm to send your tarballs. Each time someone tried, it failed. And anyway, just never use elm.

Within a couple of hours after delivery date, a program evaluates your tarball. The results are available on the web, as bardec_f-tc-n.log. You should read this file as soon as possible, do not wait for the marks to be computed to react. If you realize there were stupid mistakes (such as displaying /* escapes */ instead of /* escaping */), send immediately (the day of the first delivery date) another tarball. This will cost you a patch, which is typically -1.

If you have massive failures, there is a second delivery date, which costs super_patch, typically -3. Beyond the second delivery date, no additional change will be accepted. Not that the second delivery is not automated: you have to send immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]: bardec_f-tc-n.tar.bz2 as Subject.

Given Tarballs

The naming scheme for provided tarballs is different from the scheme you must follow (see Student Tarballs). Our naming scheme looks like 2004-tc-2.0.tar.bz2. If we update the tarballs, they will be named 2004-tc-2.x.tar.bz2. But your tarball must be named login-tc-2.tar.bz2, even if you send a second version of your project.

We also provide some test cases. There are few of them, and you will have to write your own tests. Writing tests is part of the project. Do not just copy test cases from other groups, as you will not understand why they were written.

The initial test suite is available for download at tests.tgz. It contains the following directories:

good
This programs are correct.
scan
These programs have lexical errors.
parse
These programs have syntax errors.
type
These programs contain type mismatches.

Project Layout

Here is the mandatory layout of the tarball:

AUTHORS
In the top level of the distribution, there must be a file AUTHORS which contents is as follows:
          Fabrice Bardèche     <bardec_f@epita.fr>
          Jean-Paul Sartre     <sartre_j@epita.fr>
          Jean-Paul Deux       <deux_j@epita.fr>
          Jean-Paul Belmondo   <belmon_j@epita.fr>
          

The group leader is the first in the list. Do not include emails other than those of EPITA. I repeat: give the 6_1@epita.fr address. Note that the file AUTHORS is automatically distributed, but pay attention to the spelling.

ChangeLog
Optional. The list of the changes made in the compiler, with the dates and names of the people who worked on it. See the Emacs key binding C-x 4 a.
README
Various free information.
src/
All the sources are in this directory.
tests/
Test files. Do not include this directory in the tarball, I already have my own test files, there is no need for you to give it to me again. Save my disk!
src/tc
Your compiler.
src/tc.cc
Main entry, option processing. Called, the driver.
src/symbol/
Namespace symbol:
symbol.hh
The handling of the symbols.
table.hh
The handling of generic symbol tables, i.e., it is independent of functions, types and variables.

src/ast/
Namespace ast. The abstract syntax.
src/parse/
Namespace parse: parsetiger.yy, scantiger.ll, libparse.hh which prototypes what tc.cc needs to know about the module parse.
src/type/
Namespace type. Type checking.
libtype.hh
The interface for the driver.
types.hh
The definition of all the types. You are free to use whatever layout you wish (several files); we have a single types.hh file.
type-entry.hh
Definitions of type::TypeEntry, type::VarEntry, and type::FunEntry, used in type::TypeEnv to associate data to types, variables, and functions (obviously).
type-env.hh
The types environment, comprising three symbol tables: types, functions, and variables.

src/translate/
Namespace translate. Translation to intermediate code translation. It includes:
libtranslate.hh
The interface.
libtranslate.cc
The compiled module.
translation.hh
Helping routines to perform the translations. For instance, it contains Exp *simpleVar (const Access &access, const Level &level), Exp *callExp (const temp::Label &label, std::list<Exp *> args) etc. which are routines that produce some Tree::Exp.
translate-visitor.hh
Implements the class TranslateVisitor which performs the IR generation thanks to translation.hh. It must not be polluted with translation details: it is only coordinating the AST traversal with the invocation of translation routines. For instance, here is the translation of a ast::SimpleVar:
               virtual void visit (const SimpleVar& e)
               {
                 const Access &access = _env.var_access_get (e.name_get ());
                 _exp = simpleVar (access, *_level);
               }
               

Student Tarballs

Each group must provide a tarball. If bardec_f is the head of your group, the tarball must be ~bardec_f/c/rendu/tiger/bardec_f-tc-n.tar.bz2 where n is the number of the "release". The following commands must work properly:

     $ bunzip2 -cd bardec_f-tc-n.tar.bz2 | tar xvf -
     $ cd bardec_f-tc-n
     $ ./configure CC=gcc-3.2 CXX=g++-3.2
     $ make
     $ cd src
     $ ./tc /tmp/test.tig
     

where gcc-3.2 and g++-3.2 are the name of the GNU C and C++ Compiler 3.2 executables.

Your tarball must be done via make distcheck. If you just run make dist, then you might forget to include some files in the distribution. If you don't even run make dist, then not only some files might be missing, but you have no guarantee that the tarball will compile elsewhere (not to mention that we don't care about object files etc.).

Do not deliver the tests that we gave: we already have them.

Any tarball which is not built thanks to make distcheck (this is easy to see: they include files we don't want, and don't contain some files we need...) will be penalized with at least ### tarball_not_clean.

Code Sanity

The code you deliver must be clean. In particular, when some code is provided, and you have to fill in the blanks denoted by FIXME: Some code has been deleted.. Sometimes you will have to write the code from scratch.

In any case, dead code and dead comments must be removed. You are free to leave comments spotting places where you fixed a FIXME:, but never leave a fixed FIXME: in your code. Nor any irrelevant comment.

The official compiler for this project, is GNU C++ Compiler, 3.2 or higher.


Node:Evaluation, Next:, Previous:Tarballs, Up:Top

Evaluation

Some stages are evaluated only by a program, and others are evaluated both by humans, and a program.

Automated Evaluation

Each stage of the compiler will be evaluated by an automatic corrector. As soon as the tarball are delivered, the logs are available on http://www.lrde.epita.fr/~akim/compil, in the directory corresponding to your promotion and stage. For instance, 2004 students ought to read http://www.lrde.epita.fr/~akim/compil/2004/4/bardec_f-tc-4.log.

We stress that automated evaluation enforces the requirements: you must stick to what is being asked. For instance, for T3 it is explicitly asked to display something like:

     var /* escaping */ i : int := 2
     

so if you display any of the following outputs

     var i : int /* escaping */ := 2
     var i /* escaping */ : int := 2
     var /* Escapes */ i : int := 2
     

be sure to fail all the tests, even if the computation is correct.

If you find some unexpected errors (your project does compile with the reference compiler, some files are missing, your output is slightly incorrect etc.) immediately send a new tarball to yaka@epita.fr with [Tiger] as prefix of the subject. This corresponds to ### patch.

Do not wait for the final marks to be computed, this is extremely irritating, and doomed to failure. You must understand that (i) you increase our workload, and (ii) anyway this is the wrong approach, the Tiger Compiler is a big project which must be continuously improved.

If, anyway, you send a tarball to fix your problems long after the initial date, you will be flagged as ### super_late, which impact on the mark is quite bad...

During the Examination

When you are defending your projects, here are a few rules to follow:

Don't talk
Don't talk unless you are asked to: when a person is asked a question, s/he is the only one to answer. You must not talk to each other either: often, when one cannot answer a question, the question is asked to another member. It is then obvious why the members of the group shall not talk.
Don't touch the screen
Don't touch my display! You have nice fingers, but I don't need their prints on my screen.
Tell the truth
If there is something the examiner must know (someone did not work on the project at all, some files are coming from another group etc.), say it immediately, for, if we discover that by ourselves, you will be severely sanctioned.
Learn
It is explicitly stated that you can not have worked on a stage provided this was an agreement with the group. But it is also explicitly stated that you must have learned what was to be learned from that compiler stage, which includes C++ techniques, Bison and Flex mastering, object oriented concepts, design patterns and so forth.
Complain now!
If you don't agree with the notation, say it immediately. Private messages about "this is unfair: I worked much more than bardec_f but his grade is better than mine" are thrown away.

Conversely, there is something I wish to make clear: I, Akim, and the other examiners, will probably be harsh (maybe even very harsh), but this does not mean I disrespect you, or judge you badly.

You are here to defend your project and knowledge, I'm here to stress them, to make sure they are right. Learning to be strong under pressure is part of the exercise. Don't burst into tears, react! Don't be shy, that's not the proper time: you are selling me something, and I will never buy something from someone who cries when I'm criticizing his product.

You should also understand that human examination is the moment where we try to evaluate who, or what group, needs help. We are here to diagnose your project and provide solutions to your problems. If you know there is a problem in your project, but you failed to fix it, tell it to the examiner! Work with her/him to fix your project.

Human Evaluation

The point of this evaluation is to measure:

the quality of the code
How clean it is, amount of code duplication, bad hacks, standards violations (e.g., stderr is forbidden in proper C++ code) and so forth. It also aims at detecting cheaters, who will be severely punished (mark = -42).
the knowledge each member acquired
While we do not require that each member worked on a stage, we do require that each member (i) knows how the stage works and (ii) has perfectly understood the (C++, Bison etc.) techniques needed to implement the stage.

In results in an evaluation file (login-tc-stage.eval, e.g., bardec_f-tc-2.eval), structured as follows.

The format is free, except some lines starting with specific markers, when they are at the first column. The first marker is *, used to denote the login of the group head:

     * bardec_f
     ----------------------------------------
     2002-03-07: T2 par akim
     tc-check: Summary:
     tc-check: == Testing ./tc --parse-trace -l --ast-display ==
     Successes: 177/177

     Suppléments:
     - --stdin pour parser stdin.
     - Le parser fait attention à ses leaks, même sur les Symbol.
     - --gcc-ast
     - ChangeLog
     - Ne meurt pas quand on lui donne un répertoire en entrée.

     Le code est joli, et en 80 col !
     Plutôt que des listes vides, passent 0.
     

As you can see, the person who evaluated is allowed to put whatever comment comes to her mind. Specific features, specific failures, explanations to help the students to fix their project should be included. Each time an entry is made, there should be the date, and who wrote it (2002-03-07: T2 par akim).

Then, the mark of the project, aka, the note of the group:

     ### Note T2 = 20
     

and optionally:

### late
The code was delivered reasonably late (the same day).
### super_late
More than one day.
### patch
Within 24 hours from the first delivery date, the group submitted another (hopefully) fixed tarball.
### super_patch
Before the second delivery date, the group submitted another tarball.
### not_compile
It doesn't compile, and the examiner had to change the code.
### tarball_not_clean
Clear enough: bad name, bad content etc.
### cheater
The group has cheated. The examiner is then asked to write as many details as possible, and to flag the group as cheaters. The mark is likely to be negative, and exclusion from EPITA is possible.
### reevaluation
The group had a mark < 8, and is now being reexamined.
### bonus
The group has implemented additional features, such as XML output etc. and the project works properly. We do not care about extra features on top of a broken project.
Important note to the examiners: Note field.

The examiner should not take (too much) the automated tests into account to decide the mark. This is because the mark is computed later, taking this into account, so don't do it twice. Similarly, a very nice project, which is super late, shall be flagged as super late, but should have a good ## Note entry.

Important note to the examiners: broken tarballs.

If you fixed the tarball (### not_compile, or ### tarball_not_clean or whatever modification, you must run make distcheck again, and replace the tarball they delivered with the new one. Do not keep the old tarball, do not install it in a special place: just replace the first tarball with it, but say so in the eval file.

The rationale is simple: only tarballs pass the tests, and every tarball must be able to pass the tests. If you don't do that, then someone else will have to do it again.

The next bits is the evaluation of each member of the group:

     ----------------------------------------
     ## bardec_f:100:120
     Parser, locations, scanner.  A pratiquement e'crit tout T2.

     ##sartre_j:100:40
     AST, Visitor
     Pourtant ne comprend rien au Visitor.  Se fout de ma gueule.

     ##deux_j:80:80
     Pas de problème C++.
     FIXME: Sa note de T1 est à revoir (resoutenance).

     ##belmon_j:100:100
     Actions, les delete dans le parser.
     A implémenté la sortie en AST compatible avec GCC.
     

The formalism is ## login:bonus1:bonus2. If the stage is T2, then bonus1 refers to a bonus on T1, and bonus2 on T2. For T4, it covers T3 and T4, and so forth.

The values can be "negative" (80, 60, 40, 0), or positive if a member deserves it (100, 120, or even more if justifiable).

There should be information about who did what.

Marks Computation

Because the Tiger Compiler is a project with stages, the computation of the marks depends on the stages too. To spell it out explicitly:

A stage is penalized by bad results on tests performed for previous stages.

It means, for instance, that a T3 compiler will be exercised on T1, T2, and T3. If there are still errors on T1 and T2 tests, they will pessimize the result of T3 tests. The older the errors are, the more expensive they are.

As an example, here are the formulas to compute the global success rate of T3 and T5:

     global-rate-T3 := rate-T3 * (+ 2 * rate-T1
                                  + 1 * rate-T2) / 3
     global-rate-T5 := rate-T5 * (+ 4 * rate-T1
                                  + 3 * rate-T2
                                  + 2 * rate-T3
                                  + 1 * rate-T4) / 10
     

Because a project which fail half of the time is not a project that deserves half of 20, the global-rate is elevated to 1.7 before computing the mark:

     mark-T3 := roundup (power (global-rate-T3, 1.7) * 20 - malus-T3, 1)
     

where roundup (x, 1) is x rounded up to one decimal (roundup (15, 1) = 15, roundup (15.01, 1) = 15.1).

When the project is also evaluated by a human, power is not used. Rather, the success rate modifies the mark given by the examiner:

     mark-T2 := roundup (eval-T2 * global-rate-T2 - malus-T2, 1)
     

Reevaluation

It happens that some groups have big problems with their project. Here is how to solve these issues.

But first of all, you must know that cheating, stealing another group's project, is not a solution, because:

  • you won't learn. Do not forget that the Tiger Compiler is only an exercise: having a good grade is less important than understanding and being efficient when you will be hired by a company.
  • we will certainly notice you are not able to answer our questions.
  • when we find cheaters, they pay it very severely (### cheater).

Because the Tiger Compiler is a long project, you continuously have to improve it, but marks starting from 8 to higher are definitive. If your mark is less than 8, can be re-examined: you must provide a better tarball, and in the case of human evaluated stages, another audition will be made.

This new audition will be flagged as ### reevaluation. It cannot be better than 12/20, even if your project is excellent: a good project out of date cannot be judged better than a reasonable good project on time.

Pay attention that when providing an updated tarball for reexamination of stage n, it must follow the naming scheme of stage n, even if the contents goes further. For instance, it is perfectly valid to ask for a T2 reevaluation with a compiler that implements T4, but the tarball must be bardec_f-tc-2.tar.bz2.

You have to ask for a reevaluation, we will not look after you.


Node:Compiler Stages, Next:, Previous:Evaluation, Up:Top

Compiler Stages

The compiler will be written in several steps, described below.


Node:T0, Next:, Up:Compiler Stages

T0, Naive Scanner and Parser

This section has been updated for EPITA-2005.

T0 is a weak form of T1: the scanner and the parser are written, but there is a set of simplifications:

the hierarchy
We don't need several directories, you can program in the top level of the package.
the build
The GNU Build System is not used: there is no need for Autoconf, Automake etc.
no driver
The driver is reduced to its simplest form:
     int
     main (int argc, const char *argc[])
     {
       assert (argc == 1);
       yyin = fopen (argv[1]);
       assert (yyin);
       return !!yyparse ();
     }
     

i.e., there is no support for options at all.

SCAN, PARSE
There is no support for options, but that does not mean that you cannot use the tracing/debugging features from Flex/Bison. It is required that you bind tracing to the environment variables SCAN and PARSE. I.e., running
     PARSE=1 ./tc foo.tig
     

will set yydebug to 1, which causes the traces of the parsing to be displayed.

YYPRINT
There is no requirement to implement YYPRINT support.
yylval
All the values are supported: strings, integers and even symbols. Nevertheless, symbols are returned as plain strings for the time being: the class symbol::Symbol is to be implemented in T1.


Node:T0 Code to Write, Up:T0

T0 Code to Write

You must write

scantiger.ll
The scanner.
parsetiger.yy
The parser, and maybe main if you wish.
tc.cc (optionally)
The driver, i.e., main. Putting it into parsetiger.yy is OK in T0.
Makefile
This file is mandatory. Running make must build the binary tc.

The requirements on the tarball are the same as usual, see Tarballs.


Node:T1, Next:, Previous:T0, Up:Compiler Stages

T1, Scanner and Parser

T1 delivery is Friday, February 14th 2003 at noon.
This section is updated for EPITA-2005.

Scanner and parser are properly running, but the abstract syntax tree is not built yet. Differences with T0 include:

GNU Build System
Autoconf, Automake are used.
Options, Tasks
The compiler supports basic options via in the Task module.
locations
The locations are computed are properly reported in the error messages


Node:T1 Samples, Next:, Up:T1

T1 Samples

The only information the compiler will give is about lexical and syntax errors.

If there are no errors, the compiler shuts up, and exits successfully:

     /* an array type and an array variable */
     let
       type  arrtype = array of int
       var arr1 : arrtype := arrtype [10] of 0
     in
       arr1[2]
     end
     
File 1: test01.tig
     $ tc test01.tig

     =>0
     
Example 2: tc test01.tig

If there are lexical errors, the exit status is 2, and a an error message is output on the standard error output. Note that its format is standard: file, (precise) location, and then the message.

     1
      /* This comments starts at /* 2.2 */
     
File 3: unterminated-comment.tig
     $ tc unterminated-comment.tig

     error-->unterminated-comment.tig:2.2-3.0: unexpected end of file in a comment
     =>2
     
Example 4: tc unterminated-comment.tig

If there are syntax errors, the exit status is set to 3:

     let var a : nil := ()
     in
       1
     end
     
File 5: type-nil.tig
     $ tc type-nil.tig

     error-->type-nil.tig:1.13-15: syntax error, unexpected "nil", expecting "identifier"
     error-->Parsing Failed
     =>3
     
Example 6: tc type-nil.tig

If there are errors which are non lexical, nor syntactic (Windows will not pass by me):

     $ tc C:/TIGER/SAMPLE.TIG

     error-->cannot open `C:/TIGER/SAMPLE.TIG': No such file or directory
     =>1
     
Example 7: tc C:/TIGER/SAMPLE.TIG

The option --parse-trace, which relies on Bison's %debug directive, and the use of YYPRINT, must work properly:

     $ cat foo.tig
     a + "a"
     $ ./tc --parse-trace foo.tig
     Starting parse
     Entering state 0
     Reading a token: Next token is 258 (ID a)
     Shifting token 258 (ID), Entering state 2
     Reading a token: Next token is 270 (PLUS)
     Reducing via rule 74 (line 318), ID  -> varid
     state stack now 0
     Entering state 16
     Reducing via rule 34 (line 196), varid  -> lvalue
     state stack now 0
     Entering state 13
     Next token is 270 (PLUS)
     Reducing via rule 33 (line 192), lvalue  -> exp
     state stack now 0
     Entering state 12
     Next token is 270 (PLUS)
     Shifting token 270 (PLUS), Entering state 30
     Reading a token: Next token is 257 (STRING a)
     Shifting token 257 (STRING), Entering state 1
     Reducing via rule 15 (line 151), STRING  -> exp
     state stack now 0 12 30
     Entering state 65
     Reading a token: Now at end of input.
     Reducing via rule 27 (line 182), exp PLUS exp  -> exp
     state stack now 0
     Entering state 12
     Now at end of input.
     Reducing via rule 1 (line 106), exp  -> program
     state stack now 0
     Entering state 159
     Now at end of input.
     Shifting token 0 ($), Entering state 160
     Now at end of input.
     $ echo $?
     =>0
     

Note that (i), it cannot see that the variable is not declared nor that there is a type checking error, since type checking... is not implemented, and (ii), the output might be slightly different, depending upon the version of Bison you use. But what matters is that one can see the items: ID a, STRING a.


Node:T1 Given Code, Next:, Previous:T1 Samples, Up:T1

T1 Given Code

Some code is provided: 2005-tc-1.0.tar.bz2. It contains:

src/common.hh
Used throughout the project.
src/tc.cc
The compiler driver (option handling etc.).
src/parse/scantiger.ll
The scanner.
src/parse/parsetiger.yy
The parser.
src/ast/position.hh
Keeping track of a point (cursor) in a file.
src/ast/location.hh
Keeping track of a range (two cursors) in a (or two) file.


Node:T1 Code to Write, Next:, Previous:T1 Given Code, Up:T1

T1 Code to Write

Be sure to read Flex and Bison documentations and tutorials, see Flex & Bison.

src/scantiger.ll
The scanner must be completed. It must be able to read strings, identifiers etc. Strings and identifiers will be stored as C++ std::string. See the following code for the basics.
          ...
          \"          yylval->str = new std::string (); BEGIN STATE_STRING;

          <STATE_STRING>{ /* Handling of the strings.  Initial " is eaten. */
               \" {
                 BEGIN INITIAL;
                 return STRING;
               }
          ...
               \\x[0-9a-fA-F]{2}  {
                 yylval->str->append (1, strtol (yytext + 2, 0, 16));
               }
          ...
          }
          

The locations are tracked.

src/parsetiger.yy
The grammar must be complete, but do not implement actions. You have to implement support for --parse-trace (see T1 Samples). Pay special attention to the display of strings and identifiers.

Bison will certainly complain because of a type clash for some actions. For instance, if you have given a type to STRING, but none to exp, then it will choke on:

          exp: STRING;
          

because it actually means

          exp: STRING { $$ = $1; };
          

which is not type coherent. So write this instead:

          exp: STRING {};
          

src/ast/position.hh
The class ast::Position is completed.
src/ast/location.hh
The class ast::Location must be completed.
src/symbol/symbol.hxx
The class symbol::Symbol keeps a single copy of identifiers, so that (i) we save space, and (ii) symbol comparison is fast. The file src/symbol/symbol.hh describes the interface of the class symbol::Symbol, but the implementation is to be written in src/symbol/symbol.hxx.


Node:T1 FAQ, Previous:T1 Code to Write, Up:T1

T1 FAQ

Options
The first version of argp we used (1.1) was buggy: on ./tc foo.tig -A, it considered -A was a file, not an option. This bug is now fixed. The patch is from Niels Möller:
     diff -u -r1.16 -r1.17
     --- argp-parse.c	18 Feb 2001 22:40:03 -0000	1.16
     +++ argp-parse.c	4 Feb 2003 19:52:30 -0000	1.17
     @@ -1021,6 +1021,8 @@
      	  *arg_ebadkey = 1;
      	  if (parser->first_nonopt != parser->last_nonopt)
      	    {
     +	      exchange(parser);
     +
      	      /* Start processing the arguments we skipped previously. */
      	      parser->state.next = parser->first_nonopt;

     

Go into argp/, run patch -p0, paste the patch, type <CTRL>-d.


Node:T2, Next:, Previous:T1, Up:Compiler Stages

T2, Building the Abstract Syntax Tree

This section has been updated for EPITA-2004. Last update: 2002-02-14.

At the end of this stage, the compiler can build abstract syntax trees of Tiger programs and pretty-print them.


Node:T2 Samples, Next:, Up:T2

T2 Samples

Here are a few examples of expected features.

The parser builds abstract syntax trees that can be output by a pretty-printing module:

     /* define a recursive function */
     let
       /* calculate n! */
       function fact (n : int) : int =
         if  n = 0
           then 1
           else n * fact (n - 1)
     in
       fact (10)
     end
     
File 8: simple-fact.tig
     $ tc --ast-display simple-fact.tig

     
     /* == Abstract Syntax Tree. == */
     let
        function fact (n : int) : int =
           if (n = 0)
              then 1
              else (n * fact ((n - 1)))
     in
        fact (10)
     end
     
Example 9: tc -ast-display simple-fact.tig

The output from your pretty-printer must be valid Tiger code, and be equivalent to the input.

By valid, we mean that any Tiger compiler must be able to parse with success your output. Pay attention to the banners such as == Abstract...: you should use comments: /* == Abstract... */. There is one exception: you are not asked to escape quotes and backslashes when printing a literal string.

By equivalent, we mean that except for syntactic sugar, the output and the input are equal. Syntactic sugar refers to &, |, unary -, and in some cases if then:

     % cat boolean-1.tig
     1 = 1 & 2 = 2
     % ./tc --ast-display boolean-1.tig
     /* == Abstract Syntax Tree. == */
     if  (1 = 1)
        then  (2 = 2)
        else 0
     

For loops must be properly displayed, i.e., although we use a ast::VarDec for the index of the loop, you must not display var:

     /* valid for and let */
     let
             var a := 0
     in
             for i :=0 to 100 do (a :=a+1; ())
     end
     
File 10: for-loop.tig
     $ tc --ast-display for-loop.tig

     
     /* == Abstract Syntax Tree. == */
     let
        var a := 0
     in
        for i := 0 to 100 do
           (
              a := (a + 1);
              ()
           )
     end
     
Example 11: tc -ast-display for-loop.tig


Node:T2 Given Code, Next:, Previous:T2 Samples, Up:T2

T2 Given Code

Some code is provided: 2004-tc-2.0.tar.bz2. It contains:

src/ast
Implementation of the abstract syntax tree. The file README gives an overview of the involved class hierarchy.
src/ast/location.hh, position.hh
Classes used to hold locations computed from Tiger programs.
src/ast/visitor.hh
Abstract base class of the compiler's visitor hierarchy.
src/ast/default-visitor.hh
Implementation of the DefaultVisitor class, which walks the abstract syntax tree, doing nothing. It is mainly used as a basis for deriving other visitors.
src/ast/print-visitor.hh
Implementation of the PrintVisitor class, which performs pretty-printing in the tiger compiler.
src/symbol.*
Have a look at how this class is implemented. Ask questions if you don't understand it. You will not be given questions about it in T2, but T3, there will be.


Node:T2 Code to Write, Next:, Previous:T2 Given Code, Up:T2

T2 Code to Write

What is to be done:

src/parse/scantiger.ll
The scanner must now use symbol::Symbol instead of std::string for identifiers. Of course, the parser must be adjusted too.

The scanner must be updated to keep track of locations of tokens in Tiger programs. To adjust your scanner, you are strongly encouraged to use YY_USER_ACTION, and also the yylex prologue:

          ...
          %%
          %{
            // Everything here is run each time yylex is invoked.
          %}
          "if"    return IF;
          ...
          %%
          ...
          

Have a look at the scanner and parser chapters of this draft.

src/parse/parsetiger.yy
The parser must be updated to ensure error recovery. The grammar must be changed to process declarations by chunks.

In Tiger, the following program is invalid:

          let function foo () = ()
              function foo () = ()
              var      foo    := 0
          in
            ()
          end
          

while the following code is valid:

          let function foo () = ()
              var      foo    := 0
              function foo () = ()
          in
            ()
          end
          

this is because declarations are cut in "chunks" of declarations of the same kind. In the first example, there are two chunks: one chunk of two function declarations and one variable declaration. In the second example: there are three chunks: one chunk containing only one function declaration, one with one variable declaration, and the third one, with a single function declaration again.

The rule is "a chunk cannot define twice the same name".

In order to implement this easily, you must adjust your grammar so that declarations are parsed by chunks. Pay special attention to the implementation of ast::FunctionDecs, ast::VarDecs, and ast::TypeDecs (which are all implemented thanks to ast::AnyDecs): they are these chunks of declaration. Therefore, an ast::LetExp uses a list of chunks.

src/ast
The abstract syntax tree module must be completed. Basically, this means there should remain no FIXME: anywhere in the code we gave. Several files are missing (fieldvar.hh nilexp.hh intexp.hh stringexp.hh callexp.hh assignexp.hh whileexp.hh breakexp.hh arrayexp.hh). See src/ast/README for additional information on the missing classes.
src/ast/location.hh, position.hh
The classes Position and Location must be completed.
src/ast/default-visitor.hh
The DefaultVisitor class must be completed, and must be able to walk whole abstract syntax trees. Do not forget that your DefaultVisitor must be a sound basis for your further work on the Tiger compiler.
src/ast/print-visitor.hh
The PrintVisitor class must be completed, and able to pretty-print any Tiger program.


Node:T2 FAQ, Previous:T2 Code to Write, Up:T2

T2 FAQ

bison
Be sure to read its dedicated section: Flex & Bison.
escapes (escapes_get, etc.)
These are place holders for T3. You don't have to complete them. Actually, the easiest is probably just to comment these functions out. Or better yet: #if 0/#endif.


Node:T3, Next:, Previous:T2, Up:Compiler Stages

T3, Computing the Escaping Variables

This section was updated for Tiger 2004. The project will be taken on Friday, March 15th, at noon.

At the end of this stage, the compiler must be able to compute and display the escaping variables. Be sure to read the chapter "Escapes" in the lecture notes. Its parser is equipped with error recovery.


Node:T3 Samples, Next:, Up:T3

T3 Samples

This example demonstrates the computation and display of escaping variables/formals.

     let
        var escaping := "I rule the world!"
        var not_escaping := "Peace on Earth for humans of good will."
        function print_escaping () =
           print (escaping)
     in
        (not_escaping; print_escaping ())
     end
     
File 12: variable-escapes.tig
     $ tc --escapes-display variable-escapes.tig

     
     /* == Escapes. == */
     let
        var /* escaping */ escaping := "I rule the world!"
        var /* escaping */ not_escaping := "Peace on Earth for humans of good will."
        function print_escaping () =
           print (escaping)
     in
        (
           not_escaping;
           print_escaping ()
        )
     end
     
Example 13: tc -escapes-display variable-escapes.tig
     $ tc --ast-display --escapes-compute --escapes-display variable-escapes.tig

     
     /* == Abstract Syntax Tree. == */
     let
        var escaping := "I rule the world!"
        var not_escaping := "Peace on Earth for humans of good will."
        function print_escaping () =
           print (escaping)
     in
        (
           not_escaping;
           print_escaping ()
        )
     end
     /* == Escapes. == */
     let
        var /* escaping */ escaping := "I rule the world!"
        var not_escaping := "Peace on Earth for humans of good will."
        function print_escaping () =
           print (escaping)
     in
        (
           not_escaping;
           print_escaping ()
        )
     end
     
Example 14: tc -ast-display -escapes-compute -escapes-display variable-escapes.tig

You are strongly encouraged to run your compiler on merge.tig and to study its output.

Another part of T3 is the improvement of your parser: it must be robust to some forms of errors. On the following input:

     (
       1;
       (2, 3);
       (4, 5);
       6
     )
     

it finds two errors: it does not stop on the first one:

     % ./tc error.tig
     error-->error.tig:3.5: parse error, unexpected ",", expecting ";"
     error-->error.tig:4.5: parse error, unexpected ",", expecting ";"
     

Of course, the exit status still reveals the parse error.

Be sure that your error recovery does not break the rest of the compiler...


Node:T3 Code To Write, Previous:T3 Samples, Up:T3

T3 Code To Write

src/parse/parsetiger.yy
Implement error recovery. There should be at least three uses of the token error. Read the Bison documentation about it.
src/parse/parsetc-class.hh
The current version of Bison no longer outputs this file. Be sure to remove it from your project: src/parse/parsetiger.hh alone is enough.
ast::VarExp
What is it used for? What is its meaning? Find another means to do the same things, but without it. Hence destroy it, and remove its file: ast must no longer use it.
ast::PrintVisitor
Be sure to display the /* escaping */ flag where needed, and only where needed. If you don't pay attention, you might display meaningless flags due to implementation details.
symbol::Table< class Entry_T >
This item is irrelevant for Tiger 2004, as the code was accidently delivered to you. Nevertheless, if you're willing to understand templates in C++, you are encouraged to try writing your own version, without taking inspiration from the code we gave.

Write the class template symbol::Table in src/symbol/table.hh which is a table of symbols dedicated to storing some data which type is Entry_T * (i.e., it maps a symbol::Symbol to an Entry_T *). You are encouraged to implement something simple, based on stacks (see std::list) of maps (see std::map). We used to promote hash tables, but STL does not include hash tables: this is a GNU (SGI actually) extension.

symbol::Table is implemented as a class template because we will use it with several other kinds of Entry_T in both the type::TypeVisitor and translate::TranslateVisitor.

symbol::Table must provide this interface:

scope_begin () void
Open a new scope.

scope_end () void
Close the last scope, forgetting everything since the latest scope_begin ().

put (Symbol key, Entry_T & value) void
Associate value to key in the current scope.

get (Symbol key) const Entry_T *
If key was associated to some Entry_T in the open scopes, return the most recent insertion. Otherwise return the empty pointer.

print (std::ostream & ostr) const void
Send the content of this table on ostr in a readable manner, the top of the stack being displayed last.

ast::EscapeVisitor
Write the class ast::EscapeVisitor in src/ast/escape-visitor.hh.

You are suggested to implement three additional classes in this file:

Definition
An abstract class which is used to instantiate the template class symbol::Table into Table <Definition>.

escape_set (void) virtual void
Sets the escape to true.

int _depth Variable
Depth at which this object has been created.

depth_get () const int
Returns the depth associated to this Definition object.

VariableDefinition
Inherits from Definition. It has one additional attribute, a VarDec &. The method escape_set is implemented, and when invoked, set the escapes flags of the corresponding VarDec.
FormalDefinition
Inherits from Definition. To be designed by yourself. Do not forget that the ast class used to register formals is used elsewhere, and it would be a pity that your implementation makes no difference... Be sure to write a test that verifies that your implementation is not abused. I have one such test...

Equip ast
All the sites where variables and formals (i.e., the arguments of the functions being defined, not being used) are introduced must be equipped with the escape_get and escape_set methods. Most probably the code was already given, and is using const_casts; try to use mutable instead.

Modify the code so that each definition of an escaping variable/formal is preceded by the comment /* escaping */ if the flag display_escapes_p is true. See the item "Driver" for an example.

Driver
The driver, src/tc.cc, must be enhanced to support two additional options: --escapes-compute, which triggers the ast::EscapeVisitor, and --escapes-display, which sets display_escapes_p to true. This is not a modification of --ast-display, it is a genuine second ast::PrintVisitor which is invoked.

Please, note that --escapes-display does not imply --escapes-compute (see T3 Samples).


Node:T4, Next:, Previous:T3, Up:Compiler Stages

T4, Type Checking

T4 delivery is Friday, April 12th 2002 at noon.
At the end of this stage, the compiler type checks Tiger programs. Clear error messages are required.


Node:T4 Samples, Next:, Up:T4

T4 Samples

Here are a few examples:

     $ echo '1 + "2"' | ./tc -
     $ echo '1 + "2"' | ./tc - --types-check
     error-->standard input:1.1-7: type mismatch
     error-->  right operand type: string
     error-->  expected type: int
     
     $ echo '1 + a' | ./tc -
     $ echo '1 + a' | ./tc - --types-check
     error-->standard input:1.5: undeclared variable: a
     
     $ echo 'if 1 then 2' | ./tc -
     $ echo 'if 1 then 2' | ./tc - --types-check
     error-->standard input:1.1-11: type mismatch
     error-->  then clause type: int
     error-->  else clause type: void
     


Node:T4 Given Code, Next:, Previous:T4 Samples, Up:T4

T4 Given Code

src/type/libtype.hh
src/type/libtype.cc
The interface of the Type module. It exports a simple procedure, type_check.
src/type/types.hh
The definition of all the types. The type type::Nil is given in full.
src/type/type-env.hh
The environment used by the type::TypeVisitor.


Node:T4 Code to Write, Previous:T4 Given Code, Up:T4

T4 Code to Write

What is to be done.

src/type
New directory. Don't forget to update the SUBDIRS variable in src/Makefile.am, and the AC_CONFIG_FILES invocation in configure.ac.
src/type/types.hh
type::String, type::Int, and type::Void are to be implemented. It's dead easy. Using templates would be particularly appreciated to factor the code between the four singleton classes.

type::Named is almost entirely given.

type::Array is even simpler than the four Singletons.

type::Record is somewhat incomplete.

Pay attention to the implementation of type::Type::assignable_to and type::Type::comparable_to.

src/type/type-entry.hh
This file is really a empty nutshell, so we give it complete so that you concentrate on things that matter. Nonetheless you will be asked questions on this file, so have a look at it.
src/type/type-env.hh
The constructor of type::TypeEnv must be completed: it must fill the environment with the definition of builtin types and functions. See the Tiger Reference Manual.

The handling of types is left as an example, you still have to implement the variables and functions support.

type::TypeVisitor
Of course this is the most tricky part. I hope there are enough comments in there so that you understand what is to be done. Please, post your questions and help me improve it.
Driver
The driver must run the type checker when given --types-check. Note the plural, it spells --types-check, not --type-check.


Node:T5, Next:, Previous:T4, Up:Compiler Stages

T5, Translating to the High Level Intermediate Representation

At the end of this stage, all the errors of previous stages must have been fixed, and the compiler translates the AST into the high level intermediate representation, HIR for short.

T5 first delivery is Friday, June 14th, at noon.
Second delivery date is Monday, June 17th, at noon.


Node:T5 Samples, Next:, Up:T5

T5 Samples

T5 can be started (and should be started if you don't want to finish it in a hurry) by first making sure your compiler can handle code that uses no variables. Then, you can complete your compiler to support more and more Tiger features.

T5 Primitive Samples

This example is probably the simplest Tiger program.

     0
     
File 15: 0.tig
     $ tc --hir-display 0.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     sxp
         const 0
     # Epilogue
     label end

     
Example 16: tc -hir-display 0.tig

You should then probably try to make more difficult programs with literals only. Arithmetics is one of the easiest tasks.

     1 + 2 * 3
     
File 17: arith.tig
     $ tc --hir-display arith.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     sxp
         binop (+)
             const 1
             binop (*)
                 const 2
                 const 3
     # Epilogue
     label end

     
Example 18: tc -hir-display arith.tig

You should use havm to exercise your output.

     $ tc --hir-display arith.tig >arith.hir

     =>0
     
Example 19: tc -hir-display arith.tig >arith.hir
     $ havm arith.hir

     =>0
     
Example 20: havm arith.hir

Unfortunately, without actually printing something, you won't see the final result, which means you need to implement calls. Fortunately, you can ask havm for a verbose execution:

     $ havm --trace arith.hir

     error-->call ( name Main )
     error-->const 1
     error-->const 2
     error-->const 3
     error-->binop (*) 2 3
     error-->binop (+) 1 6
     error-->sxp 7
     
Example 21: havm -trace arith.hir

If you look carefully, you will find an sxp 7 in there...

Then you are encouraged to implement control structures.

     if 101 then 102 else 103
     
File 22: if-101.tig
     $ tc --hir-display if-101.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     sxp
         eseq
         seq
             cjump ne
                 const 101
                 const 0
                 name l0
                 name l1
             label l0
             move
                 temp t0
                 const 102
             jump
                 name l2
             label l1
             move
                 temp t0
                 const 103
             jump
                 name l2
             label l2
         seq end
             temp t0
     # Epilogue
     label end

     
Example 23: tc -hir-display if-101.tig

And even more difficult control structure uses:

     while 101 & 102
       do (if 103 then break)
     
File 24: while-101.tig
     $ tc --hir-display while-101.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l7
         cjump ne
             eseq
             seq
                 cjump ne
                     const 101
                     const 0
                     name l0
                     name l1
                 label l0
                 move
                     temp t0
                     const 102
                 jump
                     name l2
                 label l1
                 move
                     temp t0
                     const 0
                 jump
                     name l2
                 label l2
             seq end
                 temp t0
             const 0
             name l8
             name l3
         label l8
         seq
             cjump ne
                 const 103
                 const 0
                 name l4
                 name l5
             label l4
             jump
                 name l3
             jump
                 name l6
             label l5
             sxp
                 const 0
             jump
                 name l6
             label l6
         seq end
         jump
             name l7
         label l3
     seq end
     # Epilogue
     label end

     
Example 25: tc -hir-display while-101.tig
T5 Builtin Calls Samples

But the game becomes more interesting when you implement function calls (which is easier than compiling functions). print_int is probably the first builtin to implement:

     (print_int (101); print ("\n"))
     
File 26: print-101.tig
     $ tc --hir-display print-101.tig > print-101.hir

     =>0
     
Example 27: tc -hir-display print-101.tig > print-101.hir
     $ havm print-101.hir

     
     101
     
Example 28: havm print-101.hir

Complex values, arrays and records, also need calls to the runtime system:

let type list = { h: int, t: list }
    var list := list { h = 1, t = list { h = 2, t = nil } }
in
  print_int (list.t.h); print ("\n")
end
File 29: print-list.tig
     $ tc --hir-display print-list.tig

     
     /* == High Level Intermediate representation. == */

     label l0
     	"
     "
     # Routine: Main Program
     label l`Main'
     # Prologue
     move
         temp t2
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 4
     # Body
     seq
         move
             mem
                 temp fp
             eseq
             seq
                 move
                     temp t1
                     call
                         name l`malloc'
                         const 8
                     call end
                 move
                     mem
                         binop (+)
                             temp t1
                             const 0
                     const 1
                 move
                     mem
                         binop (+)
                             temp t1
                             const 4
                     eseq
                     seq
                         move
                             temp t0
                             call
                                 name l`malloc'
                                 const 8
                             call end
                         move
                             mem
                                 binop (+)
                                     temp t0
                                     const 0
                             const 2
                         move
                             mem
                                 binop (+)
                                     temp t0
                                     const 4
                             const 0
                     seq end
                         temp t0
             seq end
                 temp t1
         seq
             sxp
                 call
                     name l`print_int'
                     mem
                         binop (+)
                             mem
                                 binop (+)
                                     mem
                                         temp fp
                                     const 4
                             const 0
                 call end
             sxp
                 call
                     name l`print'
                     name l0
                 call end
         seq end
     seq end
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t2
     label end

     
Example 30: tc -hir-display print-list.tig
     $ tc --hir-display print-list.tig > print-list.hir

     =>0
     
Example 31: tc -hir-display print-list.tig > print-list.hir
     $ havm print-list.hir

     
     2
     
Example 32: havm print-list.hir
T5 Samples with Variables

Here is an example which demonstrates the usefulness of information about escapes: in the first case, all the variables are expected to escape, hence are stored on the stack, while in the second case, we know they don't escape, hence they can be stored in a temporary.

     let var a := 1
         var b := 2
         var c := 3
     in
       a := 2;
       c := a + b + c;
       print_int (c);
       print ("\n")
     end
     
File 33: vars.tig
     $ tc --hir-display vars.tig

     
     /* == High Level Intermediate representation. == */

     label l0
     	"
     "
     # Routine: Main Program
     label l`Main'
     # Prologue
     move
         temp t0
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 12
     # Body
     seq
         move
             mem
                 temp fp
             const 1
         move
             mem
                 binop (+)
                     temp fp
                     const -4
             const 2
         move
             mem
                 binop (+)
                     temp fp
                     const -8
             const 3
         seq
             move
                 mem
                     temp fp
                 const 2
             move
                 mem
                     binop (+)
                         temp fp
                         const -8
                 binop (+)
                     binop (+)
                         mem
                             temp fp
                         mem
                             binop (+)
                                 temp fp
                                 const -4
                     mem
                         binop (+)
                             temp fp
                             const -8
             sxp
                 call
                     name l`print_int'
                     mem
                         binop (+)
                             temp fp
                             const -8
                 call end
             sxp
                 call
                     name l`print'
                     name l0
                 call end
         seq end
     seq end
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t0
     label end

     
Example 34: tc -hir-display vars.tig
     $ tc --escapes-compute --hir-display vars.tig

     
     /* == High Level Intermediate representation. == */

     label l0
     	"
     "
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         move
             temp t0
             const 1
         move
             temp t1
             const 2
         move
             temp t2
             const 3
         seq
             move
                 temp t0
                 const 2
             move
                 temp t2
                 binop (+)
                     binop (+)
                         temp t0
                         temp t1
                     temp t2
             sxp
                 call
                     name l`print_int'
                     temp t2
                 call end
             sxp
                 call
                     name l`print'
                     name l0
                 call end
         seq end
     seq end
     # Epilogue
     label end

     
Example 35: tc -escapes-compute -hir-display vars.tig
     $ tc --escapes-compute --hir-display vars.tig >vars.hir

     =>0
     
Example 36: tc -escapes-compute -hir-display vars.tig >vars.hir
     $ havm vars.hir

     
     7
     
Example 37: havm vars.hir

The second example uses a simple function.

     let function fact (i: int) : int =
         if i = 0 then 1
                  else i * fact (i - 1)
     in
       print_int (fact (15));
       print ("\n")
     end
     
File 38: fact15.tig
     $ tc --hir-display fact15.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: fact
     label l0
     # Prologue
     move
         temp t1
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 8
     move
         mem
             temp fp
         temp i0
     move
         mem
             binop (+)
                 temp fp
                 const -4
         temp i1
     # Body
     move
         temp rv
         eseq
         seq
             cjump eq
                 mem
                     binop (+)
                         temp fp
                         const -4
                 const 0
                 name l1
                 name l2
             label l1
             move
                 temp t0
                 const 1
             jump
                 name l3
             label l2
             move
                 temp t0
                 binop (*)
                     mem
                         binop (+)
                             temp fp
                             const -4
                     call
                         name l0
                         mem
                             temp fp
                         binop (-)
                             mem
                                 binop (+)
                                     temp fp
                                     const -4
                             const 1
                     call end
             jump
                 name l3
             label l3
         seq end
             temp t0
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t1
     label end


     label l4
     	"
     "
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         sxp
             call
                 name l`print_int'
                 call
                     name l0
                     temp fp
                     const 15
                 call end
             call end
         sxp
             call
                 name l`print'
                 name l4
             call end
     seq end
     # Epilogue
     label end

     
Example 39: tc -hir-display fact15.tig
     $ tc --hir-display fact15.tig >fact15.hir

     =>0
     
Example 40: tc -hir-display fact15.tig >fact15.hir
     $ havm fact15.hir

     
     2004310016
     
Example 41: havm fact15.hir


Node:T5 Given Code, Next:, Previous:T5 Samples, Up:T5

T5 Given Code

Code that was delivered:

src/temp/temp.hh
So called temporaries are pseudo-registers: we may allocate as many temporaries as we want. Eventually the register allocator will map those temporaries to either an actual register, or it will allocate a slot in the activation block (aka frame) of the current function.
src/temp/label.hh
We need labels for JUMPs, for functions, strings etc.
src/tree
The implementation of the intermediate representation. The file README should give enough explanations to understand how it works.

Reading the corresponding explanations in Appel's book is mandatory.

It is worth noting that contrary to A. Appel, just as we did for ast, we use n-ary structures. For instance, where Appel uses a binary SEQ, we have an n-ary SEQ which allows us to put as many statements as we want.

To avoid gratuitous name clashes, what Appel denotes EXP is denoted SXP (Statement Expression), implemented in translate::Sxp.

Please, pay extra attention to the fact that there are temp::Temp used to create unique temporaries (similar to symbol::Symbol), and tree::Temp which is the intermediate representation instruction denoting a temporary (hence a tree::Temp needs a temp::Temp). Similarly, on the one hand, there is temp::Label which is used to create unique labels, and on the other hand there are tree::Label which is the IR statement to define to a label, and tree::Name used to refer to a label (typically, a tree::Jump needs a tree::Name which in turn needs a temp::Label).

src/translate/level.hh, src/translate/level.cc
They implement translate::Level, which are wrappers around frame::Frame. A Frame knows only what are the "variables" it contains, but for a language such as Tiger, where you may need to access the variables of the "parent function", you must be able to go into the parent's frame: this is the role of Level.

Level are Frame with, in addition, a link (parallel to the static link) to the level of the enclosing function.

src/translate/level-entry.hh
All the information that the environment must keep about variables and functions.
src/translate/level-env.hh
The levels environment, containing LevelVarEntry's and LevelFunEntry's. We don't need to store information related to types here.
src/translate/translation.hh
This file contains functions used by the translate::TranslateVisitor to translate the AST into IR. They handle all the unCx etc. magic.
src/translate/translate-visitor.hh
The translate::TranslateVisitor which walks the AST and builds the IR tree.


Node:T5 Code to Write, Next:, Previous:T5 Given Code, Up:T5

T5 Code to Write

You are encouraged to try first very simple examples: nil, 1 + 2, "foo" < "bar" etc. Then consider supporting variables, and finally handle the case of the functions.

Driver
The driver must always perform the translation, but displays the result iff the option --hir-display was given. Obviously, an input that has not been type-checked cannot be translated, so all the T5 options (--hir-display, and optionally --hir-use-ifexp) imply --types-check.
Type Checking
Beware that there is a new builtin: print_int.
src/translate/fragment.hh
It implements translate::Fragment, an abstract class, translate::DataFrag to store the literal strings, and translate::ProcFrag to store the routines.

There remains to implement translate::ProcFrag::print which outputs the routine themselves plus the glue code (allocating the frame etc.).

src/translate/level-env.hh
Code is missing. In particular, bear in mind that the initial environment is not empty...
src/translate/translation.hh
There are many holes to fill.
src/translate/translate-visitor.hh
There are holes to fill.


Node:T5 Options, Previous:T5 Code to Write, Up:T5

T5 Options

This section documents possible extensions you could implement in T5.

T5 Optimizing Cascading If

This optimization aims at optimizing the number of jumps needed to compute nested if. It is easy to implement, and therefore will be granted with a reasonable bonus.

It consists in implementing the option --hir-use-ifexp, which causes the compiler to use translate::IfThenElseExp instead of the simpler scheme based on translate::Cx, Nx, and Ex only.

Warning: If you support the option --hir-use-ifexp then you will be tested on translate::IfThenElseExp. If this options does nothing in your compiler, then be sure to get 0 on all the corresponding tests.

First, here is the output without --hir-use-ifexp, and the performances reported by havm:

     1 = 1 & 2 = 2
     
File 42: boolean.tig
     $ tc --hir-display boolean.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     sxp
         eseq
         seq
             cjump eq
                 const 1
                 const 1
                 name l0
                 name l1
             label l0
             move
                 temp t0
                 eseq
                 seq
                     move
                         temp t1
                         const 1
                     cjump eq
                         const 2
                         const 2
                         name l3
                         name l4
                     label l4
                     move
                         temp t1
                         const 0
                     label l3
                 seq end
                     temp t1
             jump
                 name l2
             label l1
             move
                 temp t0
                 const 0
             jump
                 name l2
             label l2
         seq end
             temp t0
     # Epilogue
     label end

     
Example 43: tc -hir-display boolean.tig
     $ tc --hir-display boolean.tig >boolean-1.hir

     =>0
     
Example 44: tc -hir-display boolean.tig >boolean-1.hir
     $ havm --profile boolean-1.hir

     error-->/* Profiling.  */
     error-->fetches from temporary : 2
     error-->fetches from memory    : 0
     error-->binary operations      : 0
     error-->function calls         : 1
     error-->stores to temporary    : 2
     error-->stores to memory       : 0
     error-->jumps                  : 1
     error-->conditional jumps      : 2
     error-->/* Execution time.  */
     error-->number of cycles : 17
     
Example 45: havm -profile boolean-1.hir

and here is the same, but with --hir-use-ifexp (pay attention that it is passed before --hir-display):

     $ tc --hir-use-ifexp --hir-display boolean.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         cjump eq
             const 1
             const 1
             name l0
             name l1
         label l0
         seq
             sxp
                 const 2
             sxp
                 const 2
         seq end
         jump
             name l2
         label l1
         sxp
             const 0
         label l2
     seq end
     # Epilogue
     label end

     
Example 46: tc -hir-use-ifexp -hir-display boolean.tig
     $ tc --hir-use-ifexp --hir-display boolean.tig >boolean-2.hir

     =>0
     
Example 47: tc -hir-use-ifexp -hir-display boolean.tig >boolean-2.hir
     $ havm --profile boolean-2.hir

     error-->/* Profiling.  */
     error-->fetches from temporary : 0
     error-->fetches from memory    : 0
     error-->binary operations      : 0
     error-->function calls         : 1
     error-->stores to temporary    : 0
     error-->stores to memory       : 0
     error-->jumps                  : 1
     error-->conditional jumps      : 1
     error-->/* Execution time.  */
     error-->number of cycles : 12
     
Example 48: havm -profile boolean-2.hir

Of course, if you did not implement translate::IfThenElseExp support, --hir-use-ifexp must cause tc to exit 64.

T5 Bound Checking

Implementing bound checking is quite simple: it consists in having the program die when the program accesses an invalid subscript in an array. For instance, the following code is "succeeds" with a non-bound-checking compiler.

     let type int_array = array of int
         var  size  := 2
         var  arr1  := int_array [size] of 0
         var  arr2  := int_array [size] of 0
         var  two   := 2
         var  m_one := -1
     in
       arr1[two]   := 3;
       arr2[m_one] := -1;

       print_int (arr1[1]);
       print ("\n");
       print_int (arr2[0]);
       print ("\n")
     end
     
File 49: bound-violation.tig
     $ tc --hir-display bound-violation.tig >bound-violation.hir

     =>0
     
Example 50: tc -hir-display bound-violation.tig >bound-violation.hir
     $ havm bound-violation.hir

     
     -1
     3
     
Example 51: havm bound-violation.hir

When run with --bound-checking, have your compiler produce code that dies on such cases.

T5 Optimizing Static Links

Warning: this optimization is difficult to do it perfectly, and therefore, expect a big bonus.

In a first and conservative extension, the compiler considers that all the functions (but the builtins!) need a static link. This is correct, but inefficient: for instance, the traditional fact function will spend almost as much time handling the static link, than its real argument.

Some functions need a static link, but don't need to save it on the stack. For instance, in the following:

     let var foo := 1
         function foo () : int = foo
     in
       foo ()
     end
     

the function foo does need a static link to access the variable foo, but does not need to store its static link on the stack.

It is suggested to address these problems in the following order:

  1. Implement the detection of functions that do not need a static link (see exercise 6.5 in "Modern Implementation of Compilers"), but still consider any static link escapes.
  2. Adjust the output of --escapes-display to display /* escaping sl */ before the first formal argument of the functions (declarations) that need the static link:
              $ tc --escapes-display fact.tig
              /* == Escapes. == */
              let
                 function fact (/* escaping sl *//* escaping */ n : int) : int =
                    if  (n = 0)
                       then 1
                       else  (n * fact ( (n - 1)))
              in
                 fact (10)
              end
    
              $ tc --escapes-display --escapes-com fact.tig
              /* == Escapes. == */
              let
                 function fact (n : int) : int =
                    if  (n = 0)
                       then 1
                       else  (n * fact ( (n - 1)))
              in
                 fact (10)
              end
              
  3. Adjust your call and progFrag prologues.
  4. Improve your computation so that non escaping static links are detected:
              $ tc --escapes-compute --escapes-display escaping-sl.tig
              /* == Escapes. == */
              let
                 var      toto := 1
                 function outer (/* escaping sl */) : int =
                   let function inner (/* sl */) : int = toto
                   in inner () end
              in
                outer ()
              end
              

    Watch out, it is not trivial to find the minimum. What do you think about the static link of the function sister below?

              let
                 var      toto := 1
                 function outer () : int =
                   let function inner () : int = toto
                   in inner () end
                 function sister () : int = outer ()
              in
                sister ()
              end
              


Node:T6, Previous:T5, Up:Compiler Stages

T6, Translating to the Low Level Intermediate Representation

T6 first delivery is Monday, July 15th at noon.
Second delivery date is Wednesday, July 17th at noon.

There will be no additional code: there is no "holes" to fill, you have to write the whole thing. Consequently, you may start T6 as soon as you want.

At the end of this stage, the compiler produces low level intermediate representation: LIR. LIR is a subset of the HIR: some patterns are forbidden. This is why it is also named canonicalization.


Node:T6 Samples, Next:, Up:T6

T6 Samples

There are several stages in T6.

T6 Canonicalization Samples

The first task in T6 is getting rid of all the eseq. To do this, you have to move the statement part of an eseq at the end of the current sequence point, and keeping the expression part in place.

Compare for instance the HIR to the LIR in the following case:

     let function print_ints (a: int, b: int) =
         (print_int (a); print (", "); print_int (b); print ("\n"))
         var a := 0
     in
       print_ints (1,  (a := a + 1; a))
     end
     
File 52: preincr-1.tig

One possible HIR translation is:

     $ tc --escapes-compute --hir-display preincr-1.tig

     
     /* == High Level Intermediate representation. == */

     label l1
     	", "

     label l2
     	"
     "
     # Routine: print_ints
     label l0
     # Prologue
     move
         temp t2
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 4
     move
         mem
             temp fp
         temp i0
     move
         temp t0
         temp i1
     move
         temp t1
         temp i2
     # Body
     seq
         sxp
             call
                 name l`print_int'
                 temp t0
             call end
         sxp
             call
                 name l`print'
                 name l1
             call end
         sxp
             call
                 name l`print_int'
                 temp t1
             call end
         sxp
             call
                 name l`print'
                 name l2
             call end
     seq end
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t2
     label end

     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         move
             temp t3
             const 0
         sxp
             call
                 name l0
                 temp fp
                 const 1
                 eseq
                     move
                         temp t3
                         binop (+)
                             temp t3
                             const 1
                     temp t3
             call end
     seq end
     # Epilogue
     label end

     
Example 53: tc -escapes-compute -hir-display preincr-1.tig

A possible canonicalization is then:

     $ tc --escapes-compute --lir-display preincr-1.tig

     
     /* == Low Level Intermediate representation. == */

     label l1
     	", "

     label l2
     	"
     "
     # Routine: print_ints
     label l0
     # Prologue
     move
         temp t2
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 4
     move
         mem
             temp fp
         temp i0
     move
         temp t0
         temp i1
     move
         temp t1
         temp i2
     # Body
     seq
         label l3
         sxp
             call
                 name l`print_int'
                 temp t0
             call end
         sxp
             call
                 name l`print'
                 name l1
             call end
         sxp
             call
                 name l`print_int'
                 temp t1
             call end
         sxp
             call
                 name l`print'
                 name l2
             call end
         label l4
     seq end
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t2
     label end

     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l5
         move
             temp t3
             const 0
         move
             temp t5
             temp fp
         move
             temp t3
             binop (+)
                 temp t3
                 const 1
         sxp
             call
                 name l0
                 temp t5
                 const 1
                 temp t3
             call end
         label l6
     seq end
     # Epilogue
     label end

     
Example 54: tc -escapes-compute -lir-display preincr-1.tig

But please note the example above is simple because 1 commutes with (a := a + 1; a): the order does not matter. But if you change the 1 into a, then you cannot exchange a and (a := a + 1; a), so the translation is different. Compare the previous LIR with the following, and pay attention to

     let function print_ints (a: int, b: int) =
         (print_int (a); print (", "); print_int (b); print ("\n"))
         var a := 0
     in
       print_ints (a,  (a := a + 1; a))
     end
     
File 55: preincr-2.tig
     $ tc --escapes-compute --lir-display preincr-2.tig

     
     /* == Low Level Intermediate representation. == */

     label l1
     	", "

     label l2
     	"
     "
     # Routine: print_ints
     label l0
     # Prologue
     move
         temp t2
         temp fp
     move
         temp fp
         temp sp
     move
         temp sp
         binop (-)
             temp sp
             const 4
     move
         mem
             temp fp
         temp i0
     move
         temp t0
         temp i1
     move
         temp t1
         temp i2
     # Body
     seq
         label l3
         sxp
             call
                 name l`print_int'
                 temp t0
             call end
         sxp
             call
                 name l`print'
                 name l1
             call end
         sxp
             call
                 name l`print_int'
                 temp t1
             call end
         sxp
             call
                 name l`print'
                 name l2
             call end
         label l4
     seq end
     # Epilogue
     move
         temp sp
         temp fp
     move
         temp fp
         temp t2
     label end

     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l5
         move
             temp t3
             const 0
         move
             temp t5
             temp fp
         move
             temp t6
             temp t3
         move
             temp t3
             binop (+)
                 temp t3
                 const 1
         sxp
             call
                 name l0
                 temp t5
                 temp t6
                 temp t3
             call end
         label l6
     seq end
     # Epilogue
     label end

     
Example 56: tc -escapes-compute -lir-display preincr-2.tig

As you can see, the output is the same for the HIR and the LIR:

     $ tc --escapes-compute --hir-display preincr-2.tig >preincr-2.hir

     =>0
     
Example 57: tc -escapes-compute -hir-display preincr-2.tig >preincr-2.hir
     $ havm preincr-2.hir

     
     0, 1
     
Example 58: havm preincr-2.hir
     $ tc --escapes-compute --lir-display preincr-2.tig >preincr-2.lir

     =>0
     
Example 59: tc -escapes-compute -lir-display preincr-2.tig >preincr-2.lir
     $ havm preincr-2.lir

     
     0, 1
     
Example 60: havm preincr-2.lir

Be very careful when dealing with mem. For instance, rewriting something like:

     call (foo, eseq (move (temp t, const 51), temp t))
     

into

     move temp t1, temp t
     move temp t, const 51
     call (foo, temp t)
     

is dead wrong: temp t is a subexpression: it is being defined here. You should produce:

     move temp t, const 51
     call (foo, temp t)
     

Another danger is the handling of move (mem, ). For instance:

     move (mem foo, x)
     

must be rewritten into:

     move (temp t, foo)
     move (mem (temp t), x)
     

not as:

     move (temp t, mem (foo))
     move (temp t, x)
     

In other words, the first subexpression of move (mem (foo), ) is foo, not mem (foo). The following example is a good crash test against this problem:

     let type int_array = array of int
         var tab := int_array [2] of 51
     in
       tab[0] := 100;
       tab[1] := 200;
       print_int (tab[0]); print ("\n");
       print_int (tab[1]); print ("\n")
     end
     
File 61: move-mem.tig
     $ tc --escapes-compute --lir-display move-mem.tig >move-mem.lir

     =>0
     
Example 62: tc -escapes-compute -lir-display move-mem.tig >move-mem.lir
     $ havm move-mem.lir

     
     100
     200
     
Example 63: havm move-mem.lir

You also ought to get rid of nested calls:

     print (chr (ord ("\n")))
     
File 64: nested-calls.tig
     $ tc --lir-display nested-calls.tig

     
     /* == Low Level Intermediate representation. == */

     label l0
     	"
     "
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l1
         move
             temp t1
             call
                 name l`ord'
                 name l0
             call end
         move
             temp t2
             call
                 name l`chr'
                 temp t1
             call end
         sxp
             call
                 name l`print'
                 temp t2
             call end
         label l2
     seq end
     # Epilogue
     label end

     
Example 65: tc -lir-display nested-calls.tig

In fact there are only two valid call forms: sxp (call (...)), and move (temp (...), call (...)).

Note that, contrary to C, the HIR and LIR always denote the same value. For instance the following Tiger code:

     let var a := 1
         function a (t: int) : int =
            (a := a + 1;
             print_int (t); print (" -> "); print_int (a); print ("\n");
             a)
         var b := a (1) + a (2) * a (3)
     in
       print_int (b); print ("\n")
     end
     

should always produce:

     1 -> 2
     2 -> 3
     3 -> 4
     14
     

independently of the what IR you ran. Note that it has nothing to do with the precedence of the operators!

In C, you have no such guarantee:

#include <stdio.h>

int _a = 1;
int a (int t)
{
  ++_a;
  printf ("%d -> %d\n", t, _a);
  return _a;
}

int main (void)
{
  int b = a (1) + a (2) * a (3);
  printf ("%d\n", b);
  return 0;
}

will always give the same result on all compilers and architectures.

T6 Scheduling Samples

Once you have canonicalized your eseq and call, you have to canonicalize cjumps: they must always be followed by their "false" label. This goes in two steps:

  1. Split in basic blocks.

    A basic block is a sequence of code starting with a label, ending with a jump (conditional or not), and with no jumps, no labels inside.

  2. Build the traces.

    Now put all the basic blocks into a single sequence.

In the following, the result of the whole conversion is visible.

The following examples highlights the need for new labels: at least one for the entry point, and one for the exit point:

     1 & 2
     
File 66: 1-and-2.tig
     $ tc --lir-display 1-and-2.tig

     
     /* == Low Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l3
         cjump ne
             const 1
             const 0
             name l0
             name l1
         label l1
         move
             temp t0
             const 0
         label l2
         jump
             name l4
         label l0
         move
             temp t0
             const 2
         jump
             name l2
         label l4
     seq end
     # Epilogue
     label end

     
Example 67: tc -lir-display 1-and-2.tig

The following example contains many jumps. Compare the hir to the lir:

     while 10 | 20 do if 30 | 40 then break else break
     
File 68: broken-while.tig
     $ tc --hir-display broken-while.tig

     
     /* == High Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l10
         cjump ne
             eseq
             seq
                 cjump ne
                     const 10
                     const 0
                     name l0
                     name l1
                 label l0
                 move
                     temp t0
                     const 1
                 jump
                     name l2
                 label l1
                 move
                     temp t0
                     const 20
                 jump
                     name l2
                 label l2
             seq end
                 temp t0
             const 0
             name l11
             name l3
         label l11
         seq
             cjump ne
                 eseq
                 seq
                     cjump ne
                         const 30
                         const 0
                         name l4
                         name l5
                     label l4
                     move
                         temp t1
                         const 1
                     jump
                         name l6
                     label l5
                     move
                         temp t1
                         const 40
                     jump
                         name l6
                     label l6
                 seq end
                     temp t1
                 const 0
                 name l7
                 name l8
             label l7
             jump
                 name l3
             jump
                 name l9
             label l8
             jump
                 name l3
             jump
                 name l9
             label l9
         seq end
         jump
             name l10
         label l3
     seq end
     # Epilogue
     label end

     
Example 69: tc -hir-display broken-while.tig
     $ tc --lir-display broken-while.tig

     
     /* == Low Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l12
         label l10
         cjump ne
             const 10
             const 0
             name l0
             name l1
         label l1
         move
             temp t0
             const 20
         label l2
         cjump ne
             temp t0
             const 0
             name l11
             name l3
         label l3
         jump
             name l13
         label l11
         cjump ne
             const 30
             const 0
             name l4
             name l5
         label l5
         move
             temp t1
             const 40
         label l6
         cjump ne
             temp t1
             const 0
             name l7
             name l8
         label l8
         jump
             name l3
         label l7
         jump
             name l3
         label l4
         move
             temp t1
             const 1
         jump
             name l6
         label l0
         move
             temp t0
             const 1
         jump
             name l2
         label l13
     seq end
     # Epilogue
     label end

     
Example 70: tc -lir-display broken-while.tig

See also that --hir-use-ifexp is really much more pleasant:

     $ tc --hir-use-ifexp --lir-display broken-while.tig

     
     /* == Low Level Intermediate representation. == */
     # Routine: Main Program
     label l`Main'
     # Prologue
     # Body
     seq
         label l10
         label l1
         cjump ne
             const 10
             const 0
             name l8
             name l9
         label l9
         cjump ne
             const 20
             const 0
             name l2
             name l0
         label l0
         jump
             name l11
         label l2
         cjump ne
             const 30
             const 0
             name l6
             name l7
         label l7
         cjump ne
             const 40
             const 0
             name l3
             name l4
         label l4
         jump
             name l0
         label l3
         jump
             name l0
         label l6
         cjump ne
             const 1
             const 0
             name l3
             name l13
         label l13
         jump
             name l4
         label l8
         cjump ne
             const 1
             const 0
             name l2
             name l0
         label l11
     seq end
     # Epilogue
     label end

     
Example 71: tc -hir-use-ifexp -lir-display broken-while.tig


Node:T6 Given Code, Next:, Previous:T6 Samples, Up:T6

T6 Given Code

No additional code will be given.


Node:T6 Code to Write, Previous:T6 Given Code, Up:T6

T6 Code to Write

Everything you need.


Node:Tools, Next:, Previous:Compiler Stages, Up:Top

Tools

This chapter aims at providing some helpful information about the various tools that you are likely to use to implement tc. It does not replace the reading of the genuine documentation, nevertheless, helpful tips are given. Feel free to contribute additional information.

Modern Compiler Implementation

The single most important tool for implementing the Tiger Project is the original book, Modern Compiler Implementation in C/Java/ML, by Andrew W. Appel, published by Cambridge University Press (New York, Cambridge). ISBN 0-521-58388-8/

There are three flavors of this book:

C
The code samples are written in C. I strongly recommend that you avoid this edition, as C is not appropriate to describe the elaborate algorithms involved: most of the time, the simple ideas are destroyed with longuish unpleasant lines of code.
Java
The samples are written in Java. This book is the closest to the EPITA Tiger Project, since it is written in an object oriented language. Nevertheless, the modelisation is very poor, and therefore, don't be surprised if the EPITA project is significantly different. For a start, there is no Visitors at all. Of course the main purpose of the book is compilers, but it is not a reason for such a poor modelisation.
ML
This book, which is the "original", provides code samples in ML, which is a very adequate language to write compilers. Therefore it is very readable, even if you are not fluent in ML. I recommend this edition, unless you have severe problems with functional programming.

This book addresses many more issues than the sole Tiger Project as we implement it. In other words, it is an extremely interesting book which provides insights on garbage collection, object oriented and functional languages etc.

There is a dozen copies at the EPITA library, but buying it is a good idea.

Pay extra attention: there are several errors in the books, some of which are reported on Andrew Appel's pages (C Java, and ML), some which are not.

The GNU Build System

Automake is used to facilitate the writing of power Makefile. Autoconf is required by Automake: we don't not address portability issues for this project.

You may read the Autoconf documentation, and the Automake documentation. Using info is pleasant: info autoconf on any properly set up system. The Goat Book covers the whole GNU Build System: Autoconf, Automake and Libtool.

Package Name and Version

To set the name and version of your package, change the AC_INIT invocation. For instance, T4 for the bardec_f group gives AC_INIT(bardec_f-tiger, 4). Warning: Autoconf 2.53 smashes the underscores into dashes. To workaround this misfeature, use:

     AC_INIT([bardec_f-tiger], 4, [bardec_f@epita.fr], [bardec_f-tiger])
     

Bootstrapping the Package

If something goes wrong, or if it is simply the first time yu create configure.ac or a Makefile.am, you need to set up the GNU Build System. The simplest invocation is:

     $ autoreconf -fvi
     

The various files (configure, Makefile.in, etc.) are created. Then invoke configure and make:

     $ ./configure CC=gcc-3.2 CXX=g++-3.2
     $ make
     

where gcc-3.2 and g++-3.2 are the name of the GNU C and C++ Compiler 3.2 executables.

At this stage, if running make distcheck does not create bardec_f-tc-4.tar.bz2, then something is wrong in your package. Do not rename it, do not create the tarball by hand: something is rotten and be sure it will break on the examiner's machine.


Node:Flex & Bison, Next:, Previous:The GNU Build System, Up:Tools

Flex & Bison

We use Bison 1.875a which is able to produce a C++ parser. This Bison is unpublished, as the maintainers still have issues to fix. Nevertheless, it is usable, and perfectly functional for Tiger. It is installed in ~akim/bin, under the name bison. Be aware that Bison 1.875 produces buggy C++ parsers.

If you don't use this Bison, you will be in trouble. If you are willing to work at home, use bison-1.875a.tar.bz2.

The original papers on Lex and Yacc are:

Johnson, Stephen C. [1975].
Yacc: Yet Another Compiler Compiler. Computing Science Technical Report No. 32, Bell Laboratories, Murray hill, New Jersey.
Lesk, M. E. and E. Schmidt [1975].
Lex: A Lexical Analyzer Generator. Computing Science Technical Report No. 39, Bell Laboratories, Murray Hill, New Jersey.

The following introduction guides can help beginners:

Thomas Niemann.
A Compact Guide to Lex & Yacc.

An introduction to Lex and Yacc.

Collective Work
Programming with GNU Software.

Contains information about Autoconf, Automake, Gperf, Flex, Bison, and GCC.

The Bison documentation, and the Flex documentation are available for browsing.


Node:Doxygen, Previous:Flex & Bison, Up:Tools

Doxygen

We use Doxygen as the standard tool for producing the developer's documentation of the project. Its features must be used to produce good documentation, with an explanation of the role of the arguments etc. The quality of the documentation will be part of the notation. Details on how to use proper comments are given in the Doxygen Manual.

The documentation produced by Doxygen must not be included, but the target html must produce the html documentation in the doc/html directory.


Node:Reports, Next:, Previous:Tools, Up:Top

Reports

This chapter is meaningless to 2003 students.

Each group will have to deliver two reports in English, and will defend them in English too.

Dates for the Reports

The 2003 are not yet concerned by this section. Reports will be announced in epita.cours.compil first.

The reports will have to be given at:

Sometime
Computability.
Some other time
You must have sent your proposal of a subject for the second report to Akim.
Later
Second report.

Requirements over the Reports

The reports you have to write are a training for a task you will regularly have to fulfill as an engineer. Therefore, there are a few strong requirements. Being requirements, they will not figure in the mark, nevertheless, a report that does not meet the requirements will be penalized.

This report will of course include the usual items:

cover
Upper-right corner must list the members of the group, the leader of the team being listed first.

Include a title, the date, and an abstract below the title. An abstract is a < 20 line-long summary of the report.

table of contents
With page numbers of course. Needless to say that the table of contents must be accurate, and therefore shall point to the right pages.
introduction or foreword
Closely related to the abstract, it details the issue the document is covering, and highlights its structure.
body
The report must be structured. In particular, you shall not simply present chapters named after the questions: you must find the logic in the topic of the report, and intelligently organize your report. As a rule of thumb, imagine your report is `lecture notes': you are teaching someone.
index
Admittedly useless for such short reports, the index is required so that you learn how to make indexes.
bibliography
Any source must be documented, the reader must be able to find whatever document you used.

In the body of text, do not give all the details over a book or a Web page, instead, leave an anchor, aka a key, such as [Richie77], or [1], and in the bibliography, map each key to a detailed description of the source.

Note that this rule shall be observed for urls too: don't drop urls in the body, instead leave a key, and detail the key in the bibliography. Also, by no means a web page shall be mentioned with just its address! That's about the same thing as just leaving the ISBN for a book: that's useless. Instead, report the title of the page, the name of its author(s), and its address.

Don't hesitate to produce an annotated bibliography, i.e., entries might be enriched with a few words which describe its purpose, but also your ranking on the page (old information, very technical, completely wrong etc.).

glossary
When they talk about science in English, most French people just map their French words to an approximative English writing. Of course this is wrong. Therefore, you will be noted on the accuracy of your glossary: there, you will list of the unusual words you needed, and their definition.

No glossary shall be smaller than 20 words. No entry shall duplicate the content of the report. For instance, if you have a chapter on regular languages, the glossary shall not redefine the concept again.

In addition, don't provide an encyclopedia in this glossary, rather pay attention to the definitions of very simple and common technical words (e.g., un ensemble, inclus dans etc.).

Computability

This report is both an initiation to scientific reading and writing --in English--, and to the limitations of computer science.

Your report should explain:

P
What does it mean that a problem is a P-problem? What is the definition?
NP and NP completeness

Decidability
But also undecidability, and semi-decidability.
Computability.

We expect a report, not just a list of "question-answer", i.e., we expect something which is structured. It does not need to be long, but your knowledge must be accurate. The understanding of each member of the group will be evaluated.

Formal Languages

This report is dedicated to formal languages and their processing; this includes from programming languages and their compilation/interpretation but you are not limited to it.

Ranging from 20 to 50 pages this report should explain the purpose of the language, its designers, its advantages, its flaws, what inspired its inception, how it is meeting its intentions, its associated tools, the specific problems it gives to the user (implementor of a tool for this language, and/or someone programming in this language), standardization, versions etc.

It is expected from you that you understand the spirit of this report, not the letter, so do not follow exactly what is said in the previous paragraph.

You are encouraged to choose a language which is not too much widespread, such as C, Pascal etc. but if you do, beware that we will be extremely exigent with you.

Interesting languages includes Claire, Pliant, Erlang, OCaml, Small etc. Documents on old languages are absolutely welcome: SmallTalk, Snobol, Cobol, etc.

You must write a proposal (less than a single page) for this report (see Dates for the Reports). We might accept, require changes, or simply refuse your proposal. Your abstract must explain why you are interested in this topic, and explain why your work will be pleasant and enriching for both you and us.

Of course, there is a default subject:

Not available yet. Will be given in a near future.


Node:Appendices, Previous:Reports, Up:Top

Appendices

GNU Free Documentation License

Version 1.1, March 2000
     Copyright © 2000 Free Software Foundation, Inc.
     59 Temple Place, Suite 330, Boston, MA  02111-1307, USA

     Everyone is permitted to copy and distribute verbatim copies
     of this license document, but changing it is not allowed.
     
  1. PREAMBLE

    The purpose of this License is to make a manual, textbook, or other written document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

    This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

    We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

  2. APPLICABILITY AND DEFINITIONS

    This License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you".

    A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

    A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (For example, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

    The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License.

    The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License.

    A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup has been designed to thwart or discourage subsequent modification by readers is not Transparent. A copy that is not "Transparent" is called "Opaque".

    Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML produced by some word processors for output purposes only.

    The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.

  3. VERBATIM COPYING

    You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

    You may also lend copies, under the same conditions stated above, and you may publicly display copies.

  4. COPYING IN QUANTITY

    If you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

    If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

    If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

    It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

  5. MODIFICATIONS

    You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

    1. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
    2. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has less than five).
    3. State on the Title page the name of the publisher of the Modified Version, as the publisher.
    4. Preserve all the copyright notices of the Document.
    5. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
    6. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
    7. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice.
    8. Include an unaltered copy of this License.
    9. Preserve the section entitled "History", and its title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
    10. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
    11. In any section entitled "Acknowledgments" or "Dedications", preserve the section's title, and preserve in the section all the substance and tone of each of the contributor acknowledgments and/or dedications given therein.
    12. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
    13. Delete any section entitled "Endorsements". Such a section may not be included in the Modified Version.
    14. Do not retitle any existing section as "Endorsements" or to conflict in title with any Invariant Section.

    If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.

    You may add a section entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

    You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

    The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

  6. COMBINING DOCUMENTS

    You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice.

    The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

    In the combination, you must combine any sections entitled "History" in the various original documents, forming one section entitled "History"; likewise combine any sections entitled "Acknowledgments", and any sections entitled "Dedications". You must delete all sections entitled "Endorsements."

  7. COLLECTIONS OF DOCUMENTS

    You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

    You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

  8. AGGREGATION WITH INDEPENDENT WORKS

    A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an "aggregate", and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document.

    If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate.

  9. TRANSLATION

    Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, the original English version will prevail.

  10. TERMINATION

    You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

  11. FUTURE REVISIONS OF THIS LICENSE

    The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

    Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

       Copyright (C)  year  your name.
       Permission is granted to copy, distribute and/or modify this document
       under the terms of the GNU Free Documentation License, Version 1.1
       or any later version published by the Free Software Foundation;
       with the Invariant Sections being list their titles, with the
       Front-Cover Texts being list, and with the Back-Cover Texts being list.
       A copy of the license is included in the section entitled ``GNU
       Free Documentation License''.
     

If you have no Invariant Sections, write "with no Invariant Sections" instead of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being list"; likewise for Back-Cover Texts.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.

Index

 

 
 
     

| Copyright 2002 © OTB World Conception |
.:: version du site : v2.0 ::.