COMPILATEUR TIGER
This document presents the EPITA version of the Tiger project. It was
last updated 5 February 2003.
Copyright © 2000, 2001, 2002, 2003 Akim Demaille.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover texts and with the no Back-Cover
Texts. A copy of the license is included in the section entitled "GNU
Free Documentation License."
Reports
Everything exposed in this document is expected to be known.
Compilation: Projects and Reports
This document was last edited on 5 February 2003. Make sure the sections have
been updated for your promotion.
This document (split and
beautiful or in
one simple piece) details the various tasks the "Compilation" students must
complete. It includes both the practical part (Tiger compiler) and the
theoretical part (reports).
Tarballs
There are a few mandatory requirements over the tarballs.
Delivery Dates
The tarballs must be made via make distcheck, and put in the
requested location (see Student
Tarballs). This tarball will be copied at the exact moment of the first
delivery date, and the list of project which were found/not found is
reported in epita.cours.compile.
If your tarball was not fetched, send it immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]:
bardec_f-tc-n.tar.bz2 as Subject. If you are
quick enough (within a couple of hours), there won't be penalties. Run this: mutt -c akim@epita.fr \
-s "[Tiger]: bardec_f-tc-n.tar.bz2" \
-a bardec_f-tc-n.tar.bz2 \
yaka@epita.fr </dev/null
Warning: Do not use elm to send your tarballs.
Each time someone tried, it failed. And anyway, just never use
elm.
Within a couple of hours after delivery date, a program evaluates your
tarball. The results are available on the web, as
bardec_f-tc-n.log. You should read this
file as soon as possible, do not wait for the marks to be computed to
react. If you realize there were stupid mistakes (such as displaying
/* escapes */ instead of /* escaping */), send
immediately (the day of the first delivery date) another tarball. This will cost
you a patch, which is typically -1.
If you have massive failures, there is a second delivery date, which costs
super_patch, typically -3. Beyond the second delivery date, no
additional change will be accepted. Not that the second delivery is not
automated: you have to send immediately to yaka@epita.fr and akim@epita.fr, with [Tiger]:
bardec_f-tc-n.tar.bz2 as Subject.
Given Tarballs
The naming scheme for provided tarballs is different from the scheme you must
follow (see Student
Tarballs). Our naming scheme looks like 2004-tc-2.0.tar.bz2.
If we update the tarballs, they will be named 2004-tc-2.x.tar.bz2.
But your tarball must be named
login-tc-2.tar.bz2, even if you send a
second version of your project.
We also provide some test cases. There are few of them, and you will have to
write your own tests. Writing tests is part of the project. Do not just
copy test cases from other groups, as you will not understand why they were
written.
The initial test suite is available for download at tests.tgz.
It contains the following directories:
good
- This programs are correct.
scan
- These programs have lexical errors.
parse
- These programs have syntax errors.
type
- These programs contain type mismatches.
Project Layout
Here is the mandatory layout of the tarball:
AUTHORS
- In the top level of the distribution, there must be a file
AUTHORS which contents is as follows: Fabrice Bardèche <bardec_f@epita.fr>
Jean-Paul Sartre <sartre_j@epita.fr>
Jean-Paul Deux <deux_j@epita.fr>
Jean-Paul Belmondo <belmon_j@epita.fr>
The group leader is the first in the list. Do not include emails other than
those of EPITA. I repeat: give the 6_1@epita.fr address. Note
that the file AUTHORS is automatically distributed, but pay
attention to the spelling.
ChangeLog
- Optional. The list of the changes made in the compiler, with the dates and
names of the people who worked on it. See the Emacs key binding
C-x 4
a.
README
- Various free information.
src/
- All the sources are in this directory.
tests/
- Test files. Do not include this directory in the tarball, I
already have my own test files, there is no need for you to give it to me
again. Save my disk!
src/tc
- Your compiler.
src/tc.cc
- Main entry, option processing. Called, the driver.
src/symbol/
- Namespace
symbol:
symbol.hh
- The handling of the symbols.
table.hh
- The handling of generic symbol tables, i.e., it is independent of
functions, types and variables.
src/ast/
- Namespace
ast. The abstract syntax.
src/parse/
- Namespace
parse: parsetiger.yy,
scantiger.ll, libparse.hh which prototypes what
tc.cc needs to know about the module parse.
src/type/
- Namespace
type. Type checking.
libtype.hh
- The interface for the driver.
types.hh
- The definition of all the types. You are free to use whatever layout you
wish (several files); we have a single
types.hh file.
type-entry.hh
- Definitions of
type::TypeEntry,
type::VarEntry, and type::FunEntry, used in
type::TypeEnv to associate data to types, variables, and
functions (obviously).
type-env.hh
- The types environment, comprising three symbol tables: types, functions,
and variables.
src/translate/
- Namespace
translate. Translation to intermediate code
translation. It includes:
libtranslate.hh
- The interface.
libtranslate.cc
- The compiled module.
translation.hh
- Helping routines to perform the translations. For instance, it contains
Exp *simpleVar (const Access &access, const Level
&level), Exp *callExp (const temp::Label &label,
std::list<Exp *> args) etc. which are routines that produce
some Tree::Exp.
translate-visitor.hh
- Implements the class
TranslateVisitor which performs the IR
generation thanks to translation.hh. It must not be polluted
with translation details: it is only coordinating the AST traversal with the
invocation of translation routines. For instance, here is the translation of
a ast::SimpleVar: virtual void visit (const SimpleVar& e)
{
const Access &access = _env.var_access_get (e.name_get ());
_exp = simpleVar (access, *_level);
}
Student Tarballs
Each group must provide a tarball. If bardec_f is the head of
your group, the tarball must be
~bardec_f/c/rendu/tiger/bardec_f-tc-n.tar.bz2
where n is the number of the "release". The following commands must
work properly: $ bunzip2 -cd bardec_f-tc-n.tar.bz2 | tar xvf -
$ cd bardec_f-tc-n
$ ./configure CC=gcc-3.2 CXX=g++-3.2
$ make
$ cd src
$ ./tc /tmp/test.tig
where gcc-3.2 and g++-3.2 are the name of the GNU C and
C++ Compiler 3.2 executables.
Your tarball must be done via make distcheck. If you just run
make dist, then you might forget to include some files in the
distribution. If you don't even run make dist, then not only some
files might be missing, but you have no guarantee that the tarball will compile
elsewhere (not to mention that we don't care about object files etc.).
Do not deliver the tests that we gave: we already have them.
Any tarball which is not built thanks to make distcheck (this is
easy to see: they include files we don't want, and don't contain some files we
need...) will be penalized with at least ### tarball_not_clean.
Code Sanity
The code you deliver must be clean. In particular, when some code is
provided, and you have to fill in the blanks denoted by FIXME: Some code
has been deleted.. Sometimes you will have to write the code from
scratch.
In any case, dead code and dead comments must be removed. You are
free to leave comments spotting places where you fixed a FIXME:,
but never leave a fixed FIXME: in your code. Nor any irrelevant
comment.
The official compiler for this project, is GNU C++ Compiler, 3.2 or higher.
Evaluation
Some stages are evaluated only by a program, and others are evaluated both by
humans, and a program.
Automated Evaluation
Each stage of the compiler will be evaluated by an automatic corrector. As
soon as the tarball are delivered, the logs are available on
http://www.lrde.epita.fr/~akim/compil, in the directory
corresponding to your promotion and stage. For instance, 2004 students ought to
read
http://www.lrde.epita.fr/~akim/compil/2004/4/bardec_f-tc-4.log.
We stress that automated evaluation enforces the requirements: you
must stick to what is being asked. For instance, for T3 it is
explicitly asked to display something like: var /* escaping */ i : int := 2
so if you display any of the following outputs var i : int /* escaping */ := 2
var i /* escaping */ : int := 2
var /* Escapes */ i : int := 2
be sure to fail all the tests, even if the computation is correct.
If you find some unexpected errors (your project does compile with the
reference compiler, some files are missing, your output is slightly incorrect
etc.) immediately send a new tarball to yaka@epita.fr with [Tiger] as
prefix of the subject. This corresponds to ### patch.
Do not wait for the final marks to be computed, this is extremely irritating,
and doomed to failure. You must understand that (i) you increase our workload,
and (ii) anyway this is the wrong approach, the Tiger Compiler is a big project
which must be continuously improved.
If, anyway, you send a tarball to fix your problems long after the initial
date, you will be flagged as ### super_late, which impact on the
mark is quite bad...
During the Examination
When you are defending your projects, here are a few rules to follow:
- Don't talk
- Don't talk unless you are asked to: when a person is asked a question,
s/he is the only one to answer. You must not talk to each other either: often,
when one cannot answer a question, the question is asked to another member. It
is then obvious why the members of the group shall not talk.
- Don't touch the screen
- Don't touch my display! You have nice fingers, but I don't need their
prints on my screen.
- Tell the truth
- If there is something the examiner must know (someone did not work on the
project at all, some files are coming from another group etc.), say it
immediately, for, if we discover that by ourselves, you will be severely
sanctioned.
- Learn
- It is explicitly stated that you can not have worked on a stage
provided this was an agreement with the group. But it is also
explicitly stated that you must have learned what was to be learned from
that compiler stage, which includes C++ techniques, Bison and Flex
mastering, object oriented concepts, design patterns and so forth.
- Complain now!
- If you don't agree with the notation, say it immediately. Private messages
about "this is unfair: I worked much more than bardec_f but his grade is
better than mine" are thrown away.
Conversely, there is something I wish to make clear: I, Akim, and the other
examiners, will probably be harsh (maybe even very harsh), but this does not
mean I disrespect you, or judge you badly.
You are here to defend your project and knowledge, I'm here to stress them,
to make sure they are right. Learning to be strong under pressure is part of the
exercise. Don't burst into tears, react! Don't be shy, that's not the proper
time: you are selling me something, and I will never buy something from someone
who cries when I'm criticizing his product.
You should also understand that human examination is the moment where we try
to evaluate who, or what group, needs help. We are here to diagnose your project
and provide solutions to your problems. If you know there is a problem in your
project, but you failed to fix it, tell it to the examiner! Work with
her/him to fix your project.
Human Evaluation
The point of this evaluation is to measure:
- the quality of the code
- How clean it is, amount of code duplication, bad hacks, standards
violations (e.g.,
stderr is forbidden in proper C++ code) and so
forth. It also aims at detecting cheaters, who will be severely punished (mark
= -42).
- the knowledge each member acquired
- While we do not require that each member worked on a stage, we do require
that each member (i) knows how the stage works and (ii) has perfectly
understood the (C++, Bison etc.) techniques needed to implement the stage.
In results in an evaluation file
(login-tc-stage.eval,
e.g., bardec_f-tc-2.eval), structured as follows.
The format is free, except some lines starting with specific markers,
when they are at the first column. The first marker is *,
used to denote the login of the group head: * bardec_f
----------------------------------------
2002-03-07: T2 par akim
tc-check: Summary:
tc-check: == Testing ./tc --parse-trace -l --ast-display ==
Successes: 177/177
Suppléments:
- --stdin pour parser stdin.
- Le parser fait attention à ses leaks, même sur les Symbol.
- --gcc-ast
- ChangeLog
- Ne meurt pas quand on lui donne un répertoire en entrée.
Le code est joli, et en 80 col !
Plutôt que des listes vides, passent 0.
As you can see, the person who evaluated is allowed to put whatever comment
comes to her mind. Specific features, specific failures, explanations to help
the students to fix their project should be included. Each time an entry is
made, there should be the date, and who wrote it (2002-03-07: T2 par
akim).
Then, the mark of the project, aka, the note of the group: ### Note T2 = 20
and optionally:
### late
- The code was delivered reasonably late (the same day).
### super_late
- More than one day.
### patch
- Within 24 hours from the first delivery date, the group submitted another
(hopefully) fixed tarball.
### super_patch
- Before the second delivery date, the group submitted another tarball.
### not_compile
- It doesn't compile, and the examiner had to change the code.
### tarball_not_clean
- Clear enough: bad name, bad content etc.
### cheater
- The group has cheated. The examiner is then asked to write as many details
as possible, and to flag the group as cheaters. The mark is likely to be
negative, and exclusion from EPITA is possible.
### reevaluation
- The group had a mark < 8, and is now being reexamined.
### bonus
- The group has implemented additional features, such as XML
output etc. and the project works properly. We do not care
about extra features on top of a broken project.
Important note to the examiners:
Note field.
The examiner should not take (too much) the automated tests into account to
decide the mark. This is because the mark is computed later, taking this into
account, so don't do it twice. Similarly, a very nice project, which is super
late, shall be flagged as super late, but should have a good ##
Note entry.
Important note to the examiners: broken tarballs.
If you fixed the tarball (### not_compile, or ###
tarball_not_clean or whatever modification, you must
run make distcheck again, and replace the tarball they delivered
with the new one. Do not keep the old tarball, do not install it in a special
place: just replace the first tarball with it, but say so in the
eval file.
The rationale is simple: only tarballs pass the tests, and every tarball
must be able to pass the tests. If you don't do that, then someone else will
have to do it again.
The next bits is the evaluation of each member of the group: ----------------------------------------
## bardec_f:100:120
Parser, locations, scanner. A pratiquement e'crit tout T2.
##sartre_j:100:40
AST, Visitor
Pourtant ne comprend rien au Visitor. Se fout de ma gueule.
##deux_j:80:80
Pas de problème C++.
FIXME: Sa note de T1 est à revoir (resoutenance).
##belmon_j:100:100
Actions, les delete dans le parser.
A implémenté la sortie en AST compatible avec GCC.
The formalism is ##
login:bonus1:bonus2.
If the stage is T2, then bonus1 refers to a bonus on T1, and
bonus2 on T2. For T4, it covers T3 and T4, and so forth.
The values can be "negative" (80, 60,
40, 0), or positive if a member deserves it
(100, 120, or even more if justifiable).
There should be information about who did what.
Marks Computation
Because the Tiger Compiler is a project with stages, the computation of the
marks depends on the stages too. To spell it out explicitly:
A stage is penalized by bad results on tests performed for
previous stages.
It means, for instance, that a T3 compiler will be exercised on T1, T2, and
T3. If there are still errors on T1 and T2 tests, they will pessimize the result
of T3 tests. The older the errors are, the more expensive they are.
As an example, here are the formulas to compute the global success rate of T3
and T5: global-rate-T3 := rate-T3 * (+ 2 * rate-T1
+ 1 * rate-T2) / 3
global-rate-T5 := rate-T5 * (+ 4 * rate-T1
+ 3 * rate-T2
+ 2 * rate-T3
+ 1 * rate-T4) / 10
Because a project which fail half of the time is not a project that deserves
half of 20, the global-rate is elevated to 1.7 before computing the mark: mark-T3 := roundup (power (global-rate-T3, 1.7) * 20 - malus-T3, 1)
where roundup (x, 1) is x
rounded up to one decimal (roundup (15, 1) = 15, roundup
(15.01, 1) = 15.1).
When the project is also evaluated by a human, power is not
used. Rather, the success rate modifies the mark given by the examiner: mark-T2 := roundup (eval-T2 * global-rate-T2 - malus-T2, 1)
Reevaluation
It happens that some groups have big problems with their project. Here is how
to solve these issues.
But first of all, you must know that cheating, stealing another group's
project, is not a solution, because:
- you won't learn. Do not forget that the Tiger Compiler is only an
exercise: having a good grade is less important than understanding and being
efficient when you will be hired by a company.
- we will certainly notice you are not able to answer our questions.
- when we find cheaters, they pay it very severely (
###
cheater).
Because the Tiger Compiler is a long project, you continuously have to
improve it, but marks starting from 8 to higher are definitive. If your
mark is less than 8, can be re-examined: you must provide a better tarball, and
in the case of human evaluated stages, another audition will be made.
This new audition will be flagged as ### reevaluation. It cannot
be better than 12/20, even if your project is excellent: a good project out of
date cannot be judged better than a reasonable good project on time.
Pay attention that when providing an updated tarball for reexamination of
stage n, it must follow the naming scheme of stage n, even
if the contents goes further. For instance, it is perfectly valid to ask for a
T2 reevaluation with a compiler that implements T4, but the tarball must be
bardec_f-tc-2.tar.bz2.
You have to ask for a reevaluation, we will not look after you.
Compiler Stages
The compiler will be written in several steps, described below.
T0, Naive Scanner and Parser
This section has been updated for EPITA-2005.
T0 is a weak form of T1: the scanner and the parser are written, but there is
a set of simplifications:
- the hierarchy
- We don't need several directories, you can program in the top level of the
package.
- the build
- The GNU Build System is not used: there is no need for Autoconf, Automake
etc.
- no driver
- The driver is reduced to its simplest form:
int
main (int argc, const char *argc[])
{
assert (argc == 1);
yyin = fopen (argv[1]);
assert (yyin);
return !!yyparse ();
}
i.e., there is no support for options at all.
SCAN, PARSE
- There is no support for options, but that does not mean that you cannot
use the tracing/debugging features from Flex/Bison. It is required that you
bind tracing to the environment variables
SCAN and
PARSE. I.e., running PARSE=1 ./tc foo.tig
will set yydebug to 1, which causes the traces of the parsing
to be displayed.
YYPRINT
- There is no requirement to implement
YYPRINT support.
yylval
- All the values are supported: strings, integers and even symbols.
Nevertheless, symbols are returned as plain strings for the time being: the
class
symbol::Symbol is to be implemented in T1.
T0 Code to Write
You must write
scantiger.ll
- The scanner.
parsetiger.yy
- The parser, and maybe
main if you wish.
tc.cc (optionally)
- The driver, i.e.,
main. Putting it into
parsetiger.yy is OK in T0.
Makefile
- This file is mandatory. Running
make must build the
binary tc.
The requirements on the tarball are the same as usual, see Tarballs.
T1, Scanner and Parser
T1 delivery is Friday, February 14th 2003 at
noon. This section is updated for EPITA-2005.
Scanner and parser are properly running, but the abstract syntax tree is not
built yet. Differences with T0 include:
- GNU Build System
- Autoconf, Automake are used.
- Options, Tasks
- The compiler supports basic options via in the Task module.
- locations
- The locations are computed are properly reported in the error messages
T1 Samples
The only information the compiler will give is about lexical and syntax
errors.
If there are no errors, the compiler shuts up, and exits successfully: /* an array type and an array variable */
let
type arrtype = array of int
var arr1 : arrtype := arrtype [10] of 0
in
arr1[2]
end
File 1: test01.tig $ tc test01.tig
=>0
Example 2: tc test01.tig
If there are lexical errors, the exit status is 2, and a an error message is
output on the standard error output. Note that its format is standard: file,
(precise) location, and then the message. 1
/* This comments starts at /* 2.2 */
File 3: unterminated-comment.tig $ tc unterminated-comment.tig
error-->unterminated-comment.tig:2.2-3.0: unexpected end of file in a comment
=>2
Example 4: tc unterminated-comment.tig
If there are syntax errors, the exit status is set to 3: let var a : nil := ()
in
1
end
File 5: type-nil.tig $ tc type-nil.tig
error-->type-nil.tig:1.13-15: syntax error, unexpected "nil", expecting "identifier"
error-->Parsing Failed
=>3
Example 6: tc type-nil.tig
If there are errors which are non lexical, nor syntactic (Windows will not
pass by me): $ tc C:/TIGER/SAMPLE.TIG
error-->cannot open `C:/TIGER/SAMPLE.TIG': No such file or directory
=>1
Example 7: tc C:/TIGER/SAMPLE.TIG
The option --parse-trace, which relies on Bison's
%debug directive, and the use of YYPRINT, must work
properly: $ cat foo.tig
a + "a"
$ ./tc --parse-trace foo.tig
Starting parse
Entering state 0
Reading a token: Next token is 258 (ID a)
Shifting token 258 (ID), Entering state 2
Reading a token: Next token is 270 (PLUS)
Reducing via rule 74 (line 318), ID -> varid
state stack now 0
Entering state 16
Reducing via rule 34 (line 196), varid -> lvalue
state stack now 0
Entering state 13
Next token is 270 (PLUS)
Reducing via rule 33 (line 192), lvalue -> exp
state stack now 0
Entering state 12
Next token is 270 (PLUS)
Shifting token 270 (PLUS), Entering state 30
Reading a token: Next token is 257 (STRING a)
Shifting token 257 (STRING), Entering state 1
Reducing via rule 15 (line 151), STRING -> exp
state stack now 0 12 30
Entering state 65
Reading a token: Now at end of input.
Reducing via rule 27 (line 182), exp PLUS exp -> exp
state stack now 0
Entering state 12
Now at end of input.
Reducing via rule 1 (line 106), exp -> program
state stack now 0
Entering state 159
Now at end of input.
Shifting token 0 ($), Entering state 160
Now at end of input.
$ echo $?
=>0
Note that (i), it cannot see that the variable is not declared nor that there
is a type checking error, since type checking... is not implemented, and (ii),
the output might be slightly different, depending upon the version of Bison you
use. But what matters is that one can see the items: ID a,
STRING a.
T1 Given Code
Some code is provided: 2005-tc-1.0.tar.bz2.
It contains:
src/common.hh
- Used throughout the project.
src/tc.cc
- The compiler driver (option handling etc.).
src/parse/scantiger.ll
- The scanner.
src/parse/parsetiger.yy
- The parser.
src/ast/position.hh
- Keeping track of a point (cursor) in a file.
src/ast/location.hh
- Keeping track of a range (two cursors) in a (or two) file.
T1 Code to Write
Be sure to read Flex and Bison documentations and tutorials, see Flex
& Bison.
src/scantiger.ll
- The scanner must be completed. It must be able to read strings,
identifiers etc. Strings and identifiers will be stored as C++
std::string. See the following code for the basics. ...
\" yylval->str = new std::string (); BEGIN STATE_STRING;
<STATE_STRING>{ /* Handling of the strings. Initial " is eaten. */
\" {
BEGIN INITIAL;
return STRING;
}
...
\\x[0-9a-fA-F]{2} {
yylval->str->append (1, strtol (yytext + 2, 0, 16));
}
...
}
The locations are tracked.
src/parsetiger.yy
- The grammar must be complete, but do not implement actions. You have to
implement support for
--parse-trace (see T1
Samples). Pay special attention to the display of strings and identifiers.
Bison will certainly complain because of a type clash for some actions. For
instance, if you have given a type to STRING, but none to
exp, then it will choke on: exp: STRING;
because it actually means exp: STRING { $$ = $1; };
which is not type coherent. So write this instead: exp: STRING {};
src/ast/position.hh
- The class
ast::Position is completed.
src/ast/location.hh
- The class
ast::Location must be completed.
src/symbol/symbol.hxx
- The class
symbol::Symbol keeps a single copy of identifiers,
so that (i) we save space, and (ii) symbol comparison is fast. The file
src/symbol/symbol.hh describes the interface of the class
symbol::Symbol, but the implementation is to be written in
src/symbol/symbol.hxx.
T1 FAQ
- Options
- The first version of argp we used (1.1) was buggy: on
./tc foo.tig
-A, it considered -A was a file, not an option. This bug
is now fixed. The patch is from Niels Möller: diff -u -r1.16 -r1.17
--- argp-parse.c 18 Feb 2001 22:40:03 -0000 1.16
+++ argp-parse.c 4 Feb 2003 19:52:30 -0000 1.17
@@ -1021,6 +1021,8 @@
*arg_ebadkey = 1;
if (parser->first_nonopt != parser->last_nonopt)
{
+ exchange(parser);
+
/* Start processing the arguments we skipped previously. */
parser->state.next = parser->first_nonopt;
Go into argp/, run patch -p0, paste the patch,
type <CTRL>-d.
T2, Building the Abstract Syntax Tree
This section has been updated for EPITA-2004. Last update: 2002-02-14.
At the end of this stage, the compiler can build abstract syntax trees of
Tiger programs and pretty-print them.
T2 Samples
Here are a few examples of expected features.
The parser builds abstract syntax trees that can be output by a
pretty-printing module: /* define a recursive function */
let
/* calculate n! */
function fact (n : int) : int =
if n = 0
then 1
else n * fact (n - 1)
in
fact (10)
end
File 8: simple-fact.tig $ tc --ast-display simple-fact.tig
/* == Abstract Syntax Tree. == */
let
function fact (n : int) : int =
if (n = 0)
then 1
else (n * fact ((n - 1)))
in
fact (10)
end
Example 9: tc -ast-display simple-fact.tig
The output from your pretty-printer must be valid Tiger code, and be
equivalent to the input.
By valid, we mean that any Tiger compiler must be able to parse with
success your output. Pay attention to the banners such as ==
Abstract...: you should use comments: /* == Abstract... */.
There is one exception: you are not asked to escape quotes and backslashes when
printing a literal string.
By equivalent, we mean that except for syntactic sugar, the output
and the input are equal. Syntactic sugar refers to &,
|, unary -, and in some cases if then: % cat boolean-1.tig
1 = 1 & 2 = 2
% ./tc --ast-display boolean-1.tig
/* == Abstract Syntax Tree. == */
if (1 = 1)
then (2 = 2)
else 0
For loops must be properly displayed, i.e., although we use a
ast::VarDec for the index of the loop, you must not display
var: /* valid for and let */
let
var a := 0
in
for i :=0 to 100 do (a :=a+1; ())
end
File 10: for-loop.tig $ tc --ast-display for-loop.tig
/* == Abstract Syntax Tree. == */
let
var a := 0
in
for i := 0 to 100 do
(
a := (a + 1);
()
)
end
Example 11: tc -ast-display for-loop.tig
T2 Given Code
Some code is provided: 2004-tc-2.0.tar.bz2.
It contains:
src/ast
- Implementation of the abstract syntax tree. The file
README
gives an overview of the involved class hierarchy.
src/ast/location.hh, position.hh
- Classes used to hold locations computed from Tiger programs.
src/ast/visitor.hh
- Abstract base class of the compiler's visitor hierarchy.
src/ast/default-visitor.hh
- Implementation of the DefaultVisitor class, which walks the abstract
syntax tree, doing nothing. It is mainly used as a basis for deriving other
visitors.
src/ast/print-visitor.hh
- Implementation of the PrintVisitor class, which performs pretty-printing
in the tiger compiler.
src/symbol.*
- Have a look at how this class is implemented. Ask questions if you don't
understand it. You will not be given questions about it in T2, but T3, there
will be.
T2 Code to Write
What is to be done:
src/parse/scantiger.ll
- The scanner must now use
symbol::Symbol instead of
std::string for identifiers. Of course, the parser must be
adjusted too.
The scanner must be updated to keep track of locations of tokens in Tiger
programs. To adjust your scanner, you are strongly encouraged to use
YY_USER_ACTION, and also the yylex prologue: ...
%%
%{
// Everything here is run each time yylex is invoked.
%}
"if" return IF;
...
%%
...
Have a look at the scanner and parser chapters of this draft.
src/parse/parsetiger.yy
- The parser must be updated to ensure error recovery. The grammar must be
changed to process declarations by chunks.
In Tiger, the following program is invalid: let function foo () = ()
function foo () = ()
var foo := 0
in
()
end
while the following code is valid: let function foo () = ()
var foo := 0
function foo () = ()
in
()
end
this is because declarations are cut in "chunks" of declarations of the
same kind. In the first example, there are two chunks: one chunk of two
function declarations and one variable declaration. In the second example:
there are three chunks: one chunk containing only one function declaration,
one with one variable declaration, and the third one, with a single function
declaration again.
The rule is "a chunk cannot define twice the same name".
In order to implement this easily, you must adjust your grammar so that
declarations are parsed by chunks. Pay special attention to the
implementation of ast::FunctionDecs, ast::VarDecs,
and ast::TypeDecs (which are all implemented thanks to
ast::AnyDecs): they are these chunks of declaration. Therefore,
an ast::LetExp uses a list of chunks.
src/ast
- The abstract syntax tree module must be completed. Basically, this means
there should remain no
FIXME: anywhere in the code we gave.
Several files are missing (fieldvar.hh nilexp.hh
intexp.hh stringexp.hh callexp.hh
assignexp.hh whileexp.hh breakexp.hh
arrayexp.hh). See src/ast/README for additional
information on the missing classes.
src/ast/location.hh, position.hh
- The classes Position and Location must be completed.
src/ast/default-visitor.hh
- The DefaultVisitor class must be completed, and must be able to walk whole
abstract syntax trees. Do not forget that your DefaultVisitor must be a sound
basis for your further work on the Tiger compiler.
src/ast/print-visitor.hh
- The PrintVisitor class must be completed, and able to pretty-print any
Tiger program.
T2 FAQ
bison
- Be sure to read its dedicated section: Flex
& Bison.
- escapes (
escapes_get, etc.)
- These are place holders for T3. You don't have to complete them. Actually,
the easiest is probably just to comment these functions out. Or better yet:
#if 0/#endif.
T3, Computing the Escaping Variables
This section was updated for Tiger 2004. The project will be taken on Friday,
March 15th, at noon.
At the end of this stage, the compiler must be able to compute and display
the escaping variables. Be sure to read the chapter "Escapes" in the lecture
notes. Its parser is equipped with error recovery.
T3 Samples
This example demonstrates the computation and display of escaping
variables/formals. let
var escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(not_escaping; print_escaping ())
end
File 12: variable-escapes.tig $ tc --escapes-display variable-escapes.tig
/* == Escapes. == */
let
var /* escaping */ escaping := "I rule the world!"
var /* escaping */ not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
Example 13: tc -escapes-display
variable-escapes.tig $ tc --ast-display --escapes-compute --escapes-display variable-escapes.tig
/* == Abstract Syntax Tree. == */
let
var escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
/* == Escapes. == */
let
var /* escaping */ escaping := "I rule the world!"
var not_escaping := "Peace on Earth for humans of good will."
function print_escaping () =
print (escaping)
in
(
not_escaping;
print_escaping ()
)
end
Example 14: tc -ast-display -escapes-compute
-escapes-display variable-escapes.tig
You are strongly encouraged to run your compiler on merge.tig
and to study its output.
Another part of T3 is the improvement of your parser: it must be robust to
some forms of errors. On the following input: (
1;
(2, 3);
(4, 5);
6
)
it finds two errors: it does not stop on the first one: % ./tc error.tig
error-->error.tig:3.5: parse error, unexpected ",", expecting ";"
error-->error.tig:4.5: parse error, unexpected ",", expecting ";"
Of course, the exit status still reveals the parse error.
Be sure that your error recovery does not break the rest of the compiler...
T3 Code To Write
src/parse/parsetiger.yy
- Implement error recovery. There should be at least three uses of the token
error. Read the Bison documentation about it.
src/parse/parsetc-class.hh
- The current version of Bison no longer outputs this file. Be sure to
remove it from your project:
src/parse/parsetiger.hh alone is
enough.
ast::VarExp
- What is it used for? What is its meaning? Find another means to do the
same things, but without it. Hence destroy it, and remove its file:
ast must no longer use it.
ast::PrintVisitor
- Be sure to display the
/* escaping */ flag where needed, and
only where needed. If you don't pay attention, you might display
meaningless flags due to implementation details.
symbol::Table< class Entry_T >
- This item is irrelevant for Tiger 2004, as the code was accidently
delivered to you. Nevertheless, if you're willing to understand templates in
C++, you are encouraged to try writing your own version, without taking
inspiration from the code we gave.
Write the class template symbol::Table in
src/symbol/table.hh which is a table of symbols dedicated to
storing some data which type is Entry_T * (i.e., it maps a
symbol::Symbol to an Entry_T *). You are encouraged
to implement something simple, based on stacks (see std::list) of
maps (see std::map). We used to promote hash tables, but STL does
not include hash tables: this is a GNU (SGI actually) extension.
symbol::Table is implemented as a class template because we
will use it with several other kinds of Entry_T in both the
type::TypeVisitor and translate::TranslateVisitor.
symbol::Table must provide this interface:
Close the last scope, forgetting everything since the latest
scope_begin (). |
| put (Symbol key, Entry_T &
value) |
void |
| Associate value to key in the current scope.
|
| get (Symbol key) const |
Entry_T * |
If key was associated to some Entry_T in the
open scopes, return the most recent insertion. Otherwise return the
empty pointer. |
| print (std::ostream & ostr)
const |
void |
| Send the content of this table on ostr in a readable
manner, the top of the stack being displayed last.
|
ast::EscapeVisitor
- Write the class
ast::EscapeVisitor in
src/ast/escape-visitor.hh.
You are suggested to implement three additional classes in this file:
Definition
- An abstract class which is used to instantiate the template class
symbol::Table into Table <Definition>.
| escape_set (void) |
virtual void |
| Depth at which this object has been created. |
Returns the depth associated to this Definition
object. |
VariableDefinition
- Inherits from
Definition. It has one additional attribute,
a VarDec &. The method escape_set is
implemented, and when invoked, set the escapes flags of the
corresponding VarDec.
FormalDefinition
- Inherits from
Definition. To be designed by yourself. Do
not forget that the ast class used to register formals is used
elsewhere, and it would be a pity that your implementation makes no
difference... Be sure to write a test that verifies that your implementation
is not abused. I have one such test...
- Equip
ast
- All the sites where variables and formals (i.e., the arguments of the
functions being defined, not being used) are introduced must be
equipped with the
escape_get and escape_set methods.
Most probably the code was already given, and is using
const_casts; try to use mutable instead.
Modify the code so that each definition of an escaping variable/formal is
preceded by the comment /* escaping */ if the flag
display_escapes_p is true. See the item "Driver" for an example.
- Driver
- The driver,
src/tc.cc, must be enhanced to support two
additional options: --escapes-compute, which triggers the
ast::EscapeVisitor, and --escapes-display, which
sets display_escapes_p to true. This is not a modification of
--ast-display, it is a genuine second
ast::PrintVisitor which is invoked.
Please, note that --escapes-display does not imply
--escapes-compute (see T3
Samples).
T4, Type Checking
T4 delivery is Friday, April 12th 2002 at
noon. At the end of this stage, the compiler type checks Tiger
programs. Clear error messages are required.
T4 Samples
Here are a few examples: $ echo '1 + "2"' | ./tc -
$ echo '1 + "2"' | ./tc - --types-check
error-->standard input:1.1-7: type mismatch
error--> right operand type: string
error--> expected type: int
$ echo '1 + a' | ./tc -
$ echo '1 + a' | ./tc - --types-check
error-->standard input:1.5: undeclared variable: a
$ echo 'if 1 then 2' | ./tc -
$ echo 'if 1 then 2' | ./tc - --types-check
error-->standard input:1.1-11: type mismatch
error--> then clause type: int
error--> else clause type: void
T4 Given Code
src/type/libtype.hh
-
src/type/libtype.cc
- The interface of the Type module. It exports a simple procedure,
type_check.
src/type/types.hh
- The definition of all the types. The type
type::Nil is given
in full.
src/type/type-env.hh
- The environment used by the
type::TypeVisitor.
T4 Code to Write
What is to be done.
src/type
- New directory. Don't forget to update the
SUBDIRS variable in
src/Makefile.am, and the AC_CONFIG_FILES invocation
in configure.ac.
src/type/types.hh
type::String, type::Int, and
type::Void are to be implemented. It's dead easy. Using templates
would be particularly appreciated to factor the code between the four
singleton classes.
type::Named is almost entirely given.
type::Array is even simpler than the four Singletons.
type::Record is somewhat incomplete.
Pay attention to the implementation of
type::Type::assignable_to and
type::Type::comparable_to.
src/type/type-entry.hh
- This file is really a empty nutshell, so we give it complete so that you
concentrate on things that matter. Nonetheless you will be asked questions on
this file, so have a look at it.
src/type/type-env.hh
- The constructor of
type::TypeEnv must be completed: it must
fill the environment with the definition of builtin types and functions. See
the Tiger Reference Manual.
The handling of types is left as an example, you still have to implement
the variables and functions support.
type::TypeVisitor
- Of course this is the most tricky part. I hope there are enough comments
in there so that you understand what is to be done. Please, post your
questions and help me improve it.
- Driver
- The driver must run the type checker when given
--types-check. Note the plural, it spells
--types-check, not --type-check.
T5, Translating to the High Level Intermediate
Representation
At the end of this stage, all the errors of previous stages must have been
fixed, and the compiler translates the AST into the high level
intermediate representation, HIR for short.
T5 first delivery is Friday, June 14th, at
noon. Second delivery date is Monday, June 17th, at noon.
T5 Samples
T5 can be started (and should be started if you don't want to finish it in a
hurry) by first making sure your compiler can handle code that uses no
variables. Then, you can complete your compiler to support more and more Tiger
features.
T5 Primitive Samples
This example is probably the simplest Tiger program. 0
File 15: 0.tig $ tc --hir-display 0.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
const 0
# Epilogue
label end
Example 16: tc -hir-display 0.tig
You should then probably try to make more difficult programs with literals
only. Arithmetics is one of the easiest tasks. 1 + 2 * 3
File 17: arith.tig $ tc --hir-display arith.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
binop (+)
const 1
binop (*)
const 2
const 3
# Epilogue
label end
Example 18: tc -hir-display arith.tig
You should use havm to exercise your output. $ tc --hir-display arith.tig >arith.hir
=>0
Example 19: tc -hir-display arith.tig
>arith.hir $ havm arith.hir
=>0
Example 20: havm arith.hir
Unfortunately, without actually printing something, you won't see the final
result, which means you need to implement calls. Fortunately, you can ask
havm for a verbose execution: $ havm --trace arith.hir
error-->call ( name Main )
error-->const 1
error-->const 2
error-->const 3
error-->binop (*) 2 3
error-->binop (+) 1 6
error-->sxp 7
Example 21: havm -trace arith.hir
If you look carefully, you will find an sxp 7 in there...
Then you are encouraged to implement control structures. if 101 then 102 else 103
File 22: if-101.tig $ tc --hir-display if-101.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
eseq
seq
cjump ne
const 101
const 0
name l0
name l1
label l0
move
temp t0
const 102
jump
name l2
label l1
move
temp t0
const 103
jump
name l2
label l2
seq end
temp t0
# Epilogue
label end
Example 23: tc -hir-display if-101.tig
And even more difficult control structure uses: while 101 & 102
do (if 103 then break)
File 24: while-101.tig $ tc --hir-display while-101.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l7
cjump ne
eseq
seq
cjump ne
const 101
const 0
name l0
name l1
label l0
move
temp t0
const 102
jump
name l2
label l1
move
temp t0
const 0
jump
name l2
label l2
seq end
temp t0
const 0
name l8
name l3
label l8
seq
cjump ne
const 103
const 0
name l4
name l5
label l4
jump
name l3
jump
name l6
label l5
sxp
const 0
jump
name l6
label l6
seq end
jump
name l7
label l3
seq end
# Epilogue
label end
Example 25: tc -hir-display while-101.tig
T5 Builtin Calls Samples
But the game becomes more interesting when you implement function calls
(which is easier than compiling functions). print_int is probably
the first builtin to implement: (print_int (101); print ("\n"))
File 26: print-101.tig $ tc --hir-display print-101.tig > print-101.hir
=>0
Example 27: tc -hir-display print-101.tig >
print-101.hir $ havm print-101.hir
101
Example 28: havm print-101.hir
Complex values, arrays and records, also need calls to the runtime system: let type list = { h: int, t: list }
var list := list { h = 1, t = list { h = 2, t = nil } }
in
print_int (list.t.h); print ("\n")
end
File 29: print-list.tig $ tc --hir-display print-list.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
move
temp t2
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 4
# Body
seq
move
mem
temp fp
eseq
seq
move
temp t1
call
name l`malloc'
const 8
call end
move
mem
binop (+)
temp t1
const 0
const 1
move
mem
binop (+)
temp t1
const 4
eseq
seq
move
temp t0
call
name l`malloc'
const 8
call end
move
mem
binop (+)
temp t0
const 0
const 2
move
mem
binop (+)
temp t0
const 4
const 0
seq end
temp t0
seq end
temp t1
seq
sxp
call
name l`print_int'
mem
binop (+)
mem
binop (+)
mem
temp fp
const 4
const 0
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t2
label end
Example 30: tc -hir-display print-list.tig $ tc --hir-display print-list.tig > print-list.hir
=>0
Example 31: tc -hir-display print-list.tig >
print-list.hir $ havm print-list.hir
2
Example 32: havm print-list.hir
T5 Samples with Variables
Here is an example which demonstrates the usefulness of information about
escapes: in the first case, all the variables are expected to escape, hence are
stored on the stack, while in the second case, we know they don't
escape, hence they can be stored in a temporary. let var a := 1
var b := 2
var c := 3
in
a := 2;
c := a + b + c;
print_int (c);
print ("\n")
end
File 33: vars.tig $ tc --hir-display vars.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
move
temp t0
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 12
# Body
seq
move
mem
temp fp
const 1
move
mem
binop (+)
temp fp
const -4
const 2
move
mem
binop (+)
temp fp
const -8
const 3
seq
move
mem
temp fp
const 2
move
mem
binop (+)
temp fp
const -8
binop (+)
binop (+)
mem
temp fp
mem
binop (+)
temp fp
const -4
mem
binop (+)
temp fp
const -8
sxp
call
name l`print_int'
mem
binop (+)
temp fp
const -8
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t0
label end
Example 34: tc -hir-display vars.tig $ tc --escapes-compute --hir-display vars.tig
/* == High Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
move
temp t0
const 1
move
temp t1
const 2
move
temp t2
const 3
seq
move
temp t0
const 2
move
temp t2
binop (+)
binop (+)
temp t0
temp t1
temp t2
sxp
call
name l`print_int'
temp t2
call end
sxp
call
name l`print'
name l0
call end
seq end
seq end
# Epilogue
label end
Example 35: tc -escapes-compute -hir-display
vars.tig $ tc --escapes-compute --hir-display vars.tig >vars.hir
=>0
Example 36: tc -escapes-compute -hir-display
vars.tig >vars.hir $ havm vars.hir
7
Example 37: havm vars.hir
The second example uses a simple function. let function fact (i: int) : int =
if i = 0 then 1
else i * fact (i - 1)
in
print_int (fact (15));
print ("\n")
end
File 38: fact15.tig $ tc --hir-display fact15.tig
/* == High Level Intermediate representation. == */
# Routine: fact
label l0
# Prologue
move
temp t1
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 8
move
mem
temp fp
temp i0
move
mem
binop (+)
temp fp
const -4
temp i1
# Body
move
temp rv
eseq
seq
cjump eq
mem
binop (+)
temp fp
const -4
const 0
name l1
name l2
label l1
move
temp t0
const 1
jump
name l3
label l2
move
temp t0
binop (*)
mem
binop (+)
temp fp
const -4
call
name l0
mem
temp fp
binop (-)
mem
binop (+)
temp fp
const -4
const 1
call end
jump
name l3
label l3
seq end
temp t0
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t1
label end
label l4
"
"
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
sxp
call
name l`print_int'
call
name l0
temp fp
const 15
call end
call end
sxp
call
name l`print'
name l4
call end
seq end
# Epilogue
label end
Example 39: tc -hir-display fact15.tig $ tc --hir-display fact15.tig >fact15.hir
=>0
Example 40: tc -hir-display fact15.tig
>fact15.hir $ havm fact15.hir
2004310016
Example 41: havm fact15.hir
T5 Given Code
Code that was delivered:
src/temp/temp.hh
- So called temporaries are pseudo-registers: we may allocate as
many temporaries as we want. Eventually the register allocator will map those
temporaries to either an actual register, or it will allocate a slot in the
activation block (aka frame) of the current function.
src/temp/label.hh
- We need labels for JUMPs, for functions, strings etc.
src/tree
- The implementation of the intermediate representation. The file
README should give enough explanations to understand how it
works.
Reading the corresponding explanations in Appel's book is mandatory.
It is worth noting that contrary to A. Appel, just as we did for
ast, we use n-ary structures. For instance, where Appel uses a
binary SEQ, we have an n-ary SEQ which allows us
to put as many statements as we want.
To avoid gratuitous name clashes, what Appel denotes EXP is
denoted SXP (Statement Expression), implemented in
translate::Sxp.
Please, pay extra attention to the fact that there are
temp::Temp used to create unique temporaries (similar to
symbol::Symbol), and tree::Temp which is the
intermediate representation instruction denoting a temporary (hence a
tree::Temp needs a temp::Temp). Similarly, on the
one hand, there is temp::Label which is used to create unique
labels, and on the other hand there are tree::Label which is the
IR statement to define to a label, and tree::Name used
to refer to a label (typically, a tree::Jump needs a
tree::Name which in turn needs a temp::Label).
src/translate/level.hh, src/translate/level.cc
- They implement
translate::Level, which are wrappers around
frame::Frame. A Frame knows only what are the
"variables" it contains, but for a language such as Tiger, where you may need
to access the variables of the "parent function", you must be able to go into
the parent's frame: this is the role of Level.
Level are Frame with, in addition, a link
(parallel to the static link) to the level of the enclosing
function.
src/translate/level-entry.hh
- All the information that the environment must keep about variables and
functions.
src/translate/level-env.hh
- The levels environment, containing
LevelVarEntry's and
LevelFunEntry's. We don't need to store information related to
types here.
src/translate/translation.hh
- This file contains functions used by the
translate::TranslateVisitor to translate the AST into IR. They
handle all the unCx etc. magic.
src/translate/translate-visitor.hh
- The
translate::TranslateVisitor which walks the AST and
builds the IR tree.
T5 Code to Write
You are encouraged to try first very simple examples: nil,
1 + 2, "foo" < "bar" etc. Then consider supporting
variables, and finally handle the case of the functions.
- Driver
- The driver must always perform the translation, but displays the result
iff the option
--hir-display was given. Obviously, an input that
has not been type-checked cannot be translated, so all the T5 options
(--hir-display, and optionally --hir-use-ifexp)
imply --types-check.
- Type Checking
- Beware that there is a new builtin:
print_int.
src/translate/fragment.hh
- It implements
translate::Fragment, an abstract class,
translate::DataFrag to store the literal strings, and
translate::ProcFrag to store the routines.
There remains to implement translate::ProcFrag::print which
outputs the routine themselves plus the glue code (allocating the
frame etc.).
src/translate/level-env.hh
- Code is missing. In particular, bear in mind that the initial environment
is not empty...
src/translate/translation.hh
- There are many holes to fill.
src/translate/translate-visitor.hh
- There are holes to fill.
T5 Options
This section documents possible extensions you could implement in T5.
T5 Optimizing Cascading If
This optimization aims at optimizing the number of jumps needed to compute
nested if. It is easy to implement, and therefore will be granted
with a reasonable bonus.
It consists in implementing the option --hir-use-ifexp, which
causes the compiler to use translate::IfThenElseExp instead of the
simpler scheme based on translate::Cx, Nx, and
Ex only.
Warning: If you support the option
--hir-use-ifexp then you will be tested on
translate::IfThenElseExp. If this options does nothing in your
compiler, then be sure to get 0 on all the corresponding tests.
First, here is the output without --hir-use-ifexp, and the
performances reported by havm: 1 = 1 & 2 = 2
File 42: boolean.tig $ tc --hir-display boolean.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
sxp
eseq
seq
cjump eq
const 1
const 1
name l0
name l1
label l0
move
temp t0
eseq
seq
move
temp t1
const 1
cjump eq
const 2
const 2
name l3
name l4
label l4
move
temp t1
const 0
label l3
seq end
temp t1
jump
name l2
label l1
move
temp t0
const 0
jump
name l2
label l2
seq end
temp t0
# Epilogue
label end
Example 43: tc -hir-display boolean.tig $ tc --hir-display boolean.tig >boolean-1.hir
=>0
Example 44: tc -hir-display boolean.tig
>boolean-1.hir $ havm --profile boolean-1.hir
error-->/* Profiling. */
error-->fetches from temporary : 2
error-->fetches from memory : 0
error-->binary operations : 0
error-->function calls : 1
error-->stores to temporary : 2
error-->stores to memory : 0
error-->jumps : 1
error-->conditional jumps : 2
error-->/* Execution time. */
error-->number of cycles : 17
Example 45: havm -profile boolean-1.hir
and here is the same, but with --hir-use-ifexp (pay attention
that it is passed before --hir-display): $ tc --hir-use-ifexp --hir-display boolean.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
cjump eq
const 1
const 1
name l0
name l1
label l0
seq
sxp
const 2
sxp
const 2
seq end
jump
name l2
label l1
sxp
const 0
label l2
seq end
# Epilogue
label end
Example 46: tc -hir-use-ifexp -hir-display
boolean.tig $ tc --hir-use-ifexp --hir-display boolean.tig >boolean-2.hir
=>0
Example 47: tc -hir-use-ifexp -hir-display
boolean.tig >boolean-2.hir $ havm --profile boolean-2.hir
error-->/* Profiling. */
error-->fetches from temporary : 0
error-->fetches from memory : 0
error-->binary operations : 0
error-->function calls : 1
error-->stores to temporary : 0
error-->stores to memory : 0
error-->jumps : 1
error-->conditional jumps : 1
error-->/* Execution time. */
error-->number of cycles : 12
Example 48: havm -profile boolean-2.hir
Of course, if you did not implement translate::IfThenElseExp
support, --hir-use-ifexp must cause tc to exit 64.
T5 Bound Checking
Implementing bound checking is quite simple: it consists in having the
program die when the program accesses an invalid subscript in an array. For
instance, the following code is "succeeds" with a non-bound-checking compiler. let type int_array = array of int
var size := 2
var arr1 := int_array [size] of 0
var arr2 := int_array [size] of 0
var two := 2
var m_one := -1
in
arr1[two] := 3;
arr2[m_one] := -1;
print_int (arr1[1]);
print ("\n");
print_int (arr2[0]);
print ("\n")
end
File 49: bound-violation.tig $ tc --hir-display bound-violation.tig >bound-violation.hir
=>0
Example 50: tc -hir-display bound-violation.tig
>bound-violation.hir $ havm bound-violation.hir
-1
3
Example 51: havm bound-violation.hir
When run with --bound-checking, have your compiler produce code
that dies on such cases.
T5 Optimizing Static Links
Warning: this optimization is difficult to do it perfectly, and
therefore, expect a big bonus.
In a first and conservative extension, the compiler considers that all the
functions (but the builtins!) need a static link. This is correct, but
inefficient: for instance, the traditional fact function will spend
almost as much time handling the static link, than its real argument.
Some functions need a static link, but don't need to save it on the stack.
For instance, in the following: let var foo := 1
function foo () : int = foo
in
foo ()
end
the function foo does need a static link to access the variable
foo, but does not need to store its static link on the stack.
It is suggested to address these problems in the following order:
- Implement the detection of functions that do not need a static
link (see exercise 6.5 in "Modern Implementation of Compilers"), but still
consider any static link escapes.
- Adjust the output of
--escapes-display to display /*
escaping sl */ before the first formal argument of the
functions (declarations) that need the static link: $ tc --escapes-display fact.tig
/* == Escapes. == */
let
function fact (/* escaping sl *//* escaping */ n : int) : int =
if (n = 0)
then 1
else (n * fact ( (n - 1)))
in
fact (10)
end
$ tc --escapes-display --escapes-com fact.tig
/* == Escapes. == */
let
function fact (n : int) : int =
if (n = 0)
then 1
else (n * fact ( (n - 1)))
in
fact (10)
end
- Adjust your
call and progFrag prologues.
- Improve your computation so that non escaping static links are detected:
$ tc --escapes-compute --escapes-display escaping-sl.tig
/* == Escapes. == */
let
var toto := 1
function outer (/* escaping sl */) : int =
let function inner (/* sl */) : int = toto
in inner () end
in
outer ()
end
Watch out, it is not trivial to find the minimum. What do you think about
the static link of the function sister below? let
var toto := 1
function outer () : int =
let function inner () : int = toto
in inner () end
function sister () : int = outer ()
in
sister ()
end
T6, Translating to the Low Level Intermediate
Representation
T6 first delivery is Monday, July 15th at
noon. Second delivery date is Wednesday, July 17th at noon.
There will be no additional code: there is no "holes" to fill, you have to
write the whole thing. Consequently, you may start T6 as soon as you want.
At the end of this stage, the compiler produces low level intermediate
representation: LIR. LIR is a subset of the
HIR: some patterns are forbidden. This is why it is also named
canonicalization.
T6 Samples
There are several stages in T6.
T6 Canonicalization Samples
The first task in T6 is getting rid of all the eseq. To do this,
you have to move the statement part of an eseq at the end of the
current sequence point, and keeping the expression part in place.
Compare for instance the HIR to the LIR in the
following case: let function print_ints (a: int, b: int) =
(print_int (a); print (", "); print_int (b); print ("\n"))
var a := 0
in
print_ints (1, (a := a + 1; a))
end
File 52: preincr-1.tig
One possible HIR translation is: $ tc --escapes-compute --hir-display preincr-1.tig
/* == High Level Intermediate representation. == */
label l1
", "
label l2
"
"
# Routine: print_ints
label l0
# Prologue
move
temp t2
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 4
move
mem
temp fp
temp i0
move
temp t0
temp i1
move
temp t1
temp i2
# Body
seq
sxp
call
name l`print_int'
temp t0
call end
sxp
call
name l`print'
name l1
call end
sxp
call
name l`print_int'
temp t1
call end
sxp
call
name l`print'
name l2
call end
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t2
label end
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
move
temp t3
const 0
sxp
call
name l0
temp fp
const 1
eseq
move
temp t3
binop (+)
temp t3
const 1
temp t3
call end
seq end
# Epilogue
label end
Example 53: tc -escapes-compute -hir-display
preincr-1.tig
A possible canonicalization is then: $ tc --escapes-compute --lir-display preincr-1.tig
/* == Low Level Intermediate representation. == */
label l1
", "
label l2
"
"
# Routine: print_ints
label l0
# Prologue
move
temp t2
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 4
move
mem
temp fp
temp i0
move
temp t0
temp i1
move
temp t1
temp i2
# Body
seq
label l3
sxp
call
name l`print_int'
temp t0
call end
sxp
call
name l`print'
name l1
call end
sxp
call
name l`print_int'
temp t1
call end
sxp
call
name l`print'
name l2
call end
label l4
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t2
label end
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l5
move
temp t3
const 0
move
temp t5
temp fp
move
temp t3
binop (+)
temp t3
const 1
sxp
call
name l0
temp t5
const 1
temp t3
call end
label l6
seq end
# Epilogue
label end
Example 54: tc -escapes-compute -lir-display
preincr-1.tig
But please note the example above is simple because 1
commutes with (a := a + 1; a): the order does not
matter. But if you change the 1 into a, then you
cannot exchange a and (a := a + 1; a), so the
translation is different. Compare the previous LIR with the
following, and pay attention to let function print_ints (a: int, b: int) =
(print_int (a); print (", "); print_int (b); print ("\n"))
var a := 0
in
print_ints (a, (a := a + 1; a))
end
File 55: preincr-2.tig $ tc --escapes-compute --lir-display preincr-2.tig
/* == Low Level Intermediate representation. == */
label l1
", "
label l2
"
"
# Routine: print_ints
label l0
# Prologue
move
temp t2
temp fp
move
temp fp
temp sp
move
temp sp
binop (-)
temp sp
const 4
move
mem
temp fp
temp i0
move
temp t0
temp i1
move
temp t1
temp i2
# Body
seq
label l3
sxp
call
name l`print_int'
temp t0
call end
sxp
call
name l`print'
name l1
call end
sxp
call
name l`print_int'
temp t1
call end
sxp
call
name l`print'
name l2
call end
label l4
seq end
# Epilogue
move
temp sp
temp fp
move
temp fp
temp t2
label end
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l5
move
temp t3
const 0
move
temp t5
temp fp
move
temp t6
temp t3
move
temp t3
binop (+)
temp t3
const 1
sxp
call
name l0
temp t5
temp t6
temp t3
call end
label l6
seq end
# Epilogue
label end
Example 56: tc -escapes-compute -lir-display
preincr-2.tig
As you can see, the output is the same for the HIR and the
LIR: $ tc --escapes-compute --hir-display preincr-2.tig >preincr-2.hir
=>0
Example 57: tc -escapes-compute -hir-display
preincr-2.tig >preincr-2.hir $ havm preincr-2.hir
0, 1
Example 58: havm preincr-2.hir $ tc --escapes-compute --lir-display preincr-2.tig >preincr-2.lir
=>0
Example 59: tc -escapes-compute -lir-display
preincr-2.tig >preincr-2.lir $ havm preincr-2.lir
0, 1
Example 60: havm preincr-2.lir
Be very careful when dealing with mem. For instance, rewriting
something like: call (foo, eseq (move (temp t, const 51), temp t))
into move temp t1, temp t
move temp t, const 51
call (foo, temp t)
is dead wrong: temp t is a subexpression: it is being
defined here. You should produce: move temp t, const 51
call (foo, temp t)
Another danger is the handling of move (mem, ). For instance: move (mem foo, x)
must be rewritten into: move (temp t, foo)
move (mem (temp t), x)
not as: move (temp t, mem (foo))
move (temp t, x)
In other words, the first subexpression of move (mem (foo), ) is
foo, not mem (foo). The following example is a good
crash test against this problem: let type int_array = array of int
var tab := int_array [2] of 51
in
tab[0] := 100;
tab[1] := 200;
print_int (tab[0]); print ("\n");
print_int (tab[1]); print ("\n")
end
File 61: move-mem.tig $ tc --escapes-compute --lir-display move-mem.tig >move-mem.lir
=>0
Example 62: tc -escapes-compute -lir-display
move-mem.tig >move-mem.lir $ havm move-mem.lir
100
200
Example 63: havm move-mem.lir
You also ought to get rid of nested calls: print (chr (ord ("\n")))
File 64: nested-calls.tig $ tc --lir-display nested-calls.tig
/* == Low Level Intermediate representation. == */
label l0
"
"
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l1
move
temp t1
call
name l`ord'
name l0
call end
move
temp t2
call
name l`chr'
temp t1
call end
sxp
call
name l`print'
temp t2
call end
label l2
seq end
# Epilogue
label end
Example 65: tc -lir-display nested-calls.tig
In fact there are only two valid call forms: sxp (call (...)),
and move (temp (...), call (...)).
Note that, contrary to C, the HIR and LIR
always denote the same value. For instance the following Tiger code: let var a := 1
function a (t: int) : int =
(a := a + 1;
print_int (t); print (" -> "); print_int (a); print ("\n");
a)
var b := a (1) + a (2) * a (3)
in
print_int (b); print ("\n")
end
should always produce: 1 -> 2
2 -> 3
3 -> 4
14
independently of the what IR you ran. Note that it has
nothing to do with the precedence of the operators!
In C, you have no such guarantee: #include <stdio.h>
int _a = 1;
int a (int t)
{
++_a;
printf ("%d -> %d\n", t, _a);
return _a;
}
int main (void)
{
int b = a (1) + a (2) * a (3);
printf ("%d\n", b);
return 0;
}
will always give the same result on all compilers and architectures.
T6 Scheduling Samples
Once you have canonicalized your eseq and call, you
have to canonicalize cjumps: they must always be followed by their
"false" label. This goes in two steps:
- Split in basic blocks.
A basic block is a sequence of code starting with a label, ending with a
jump (conditional or not), and with no jumps, no labels inside.
- Build the traces.
Now put all the basic blocks into a single sequence.
In the following, the result of the whole conversion is visible.
The following examples highlights the need for new labels: at least one for
the entry point, and one for the exit point: 1 & 2
File 66: 1-and-2.tig $ tc --lir-display 1-and-2.tig
/* == Low Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l3
cjump ne
const 1
const 0
name l0
name l1
label l1
move
temp t0
const 0
label l2
jump
name l4
label l0
move
temp t0
const 2
jump
name l2
label l4
seq end
# Epilogue
label end
Example 67: tc -lir-display 1-and-2.tig
The following example contains many jumps. Compare the hir to
the lir: while 10 | 20 do if 30 | 40 then break else break
File 68: broken-while.tig $ tc --hir-display broken-while.tig
/* == High Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l10
cjump ne
eseq
seq
cjump ne
const 10
const 0
name l0
name l1
label l0
move
temp t0
const 1
jump
name l2
label l1
move
temp t0
const 20
jump
name l2
label l2
seq end
temp t0
const 0
name l11
name l3
label l11
seq
cjump ne
eseq
seq
cjump ne
const 30
const 0
name l4
name l5
label l4
move
temp t1
const 1
jump
name l6
label l5
move
temp t1
const 40
jump
name l6
label l6
seq end
temp t1
const 0
name l7
name l8
label l7
jump
name l3
jump
name l9
label l8
jump
name l3
jump
name l9
label l9
seq end
jump
name l10
label l3
seq end
# Epilogue
label end
Example 69: tc -hir-display broken-while.tig $ tc --lir-display broken-while.tig
/* == Low Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l12
label l10
cjump ne
const 10
const 0
name l0
name l1
label l1
move
temp t0
const 20
label l2
cjump ne
temp t0
const 0
name l11
name l3
label l3
jump
name l13
label l11
cjump ne
const 30
const 0
name l4
name l5
label l5
move
temp t1
const 40
label l6
cjump ne
temp t1
const 0
name l7
name l8
label l8
jump
name l3
label l7
jump
name l3
label l4
move
temp t1
const 1
jump
name l6
label l0
move
temp t0
const 1
jump
name l2
label l13
seq end
# Epilogue
label end
Example 70: tc -lir-display broken-while.tig
See also that --hir-use-ifexp is really much more pleasant: $ tc --hir-use-ifexp --lir-display broken-while.tig
/* == Low Level Intermediate representation. == */
# Routine: Main Program
label l`Main'
# Prologue
# Body
seq
label l10
label l1
cjump ne
const 10
const 0
name l8
name l9
label l9
cjump ne
const 20
const 0
name l2
name l0
label l0
jump
name l11
label l2
cjump ne
const 30
const 0
name l6
name l7
label l7
cjump ne
const 40
const 0
name l3
name l4
label l4
jump
name l0
label l3
jump
name l0
label l6
cjump ne
const 1
const 0
name l3
name l13
label l13
jump
name l4
label l8
cjump ne
const 1
const 0
name l2
name l0
label l11
seq end
# Epilogue
label end
Example 71: tc -hir-use-ifexp -lir-display
broken-while.tig
T6 Given Code
No additional code will be given.
T6 Code to Write
Everything you need.
Tools
This chapter aims at providing some helpful information about the various
tools that you are likely to use to implement tc. It does not
replace the reading of the genuine documentation, nevertheless, helpful tips are
given. Feel free to contribute additional information.
Modern Compiler Implementation
The single most important tool for implementing the Tiger Project is the
original book, Modern
Compiler Implementation in C/Java/ML, by Andrew W. Appel, published by
Cambridge University Press (New York, Cambridge). ISBN 0-521-58388-8/
There are three flavors of this book:
- C
- The code samples are written in C. I strongly recommend that you avoid
this edition, as C is not appropriate to describe the elaborate algorithms
involved: most of the time, the simple ideas are destroyed with longuish
unpleasant lines of code.
- Java
- The samples are written in Java. This book is the closest to the EPITA
Tiger Project, since it is written in an object oriented language.
Nevertheless, the modelisation is very poor, and therefore, don't be surprised
if the EPITA project is significantly different. For a start, there is no
Visitors at all. Of course the main purpose of the book is compilers, but it
is not a reason for such a poor modelisation.
- ML
- This book, which is the "original", provides code samples in ML, which is
a very adequate language to write compilers. Therefore it is very readable,
even if you are not fluent in ML. I recommend this edition, unless you have
severe problems with functional programming.
This book addresses many more issues than the sole Tiger Project as we
implement it. In other words, it is an extremely interesting book which provides
insights on garbage collection, object oriented and functional languages etc.
There is a dozen copies at the EPITA library, but buying it is a good idea.
Pay extra attention: there are several errors in the books, some of which are
reported on Andrew Appel's pages (C Java, and
ML), some
which are not.
The GNU Build System
Automake is used to facilitate the writing of power Makefile.
Autoconf is required by Automake: we don't not address portability issues for
this project.
You may read the Autoconf documentation,
and the Automake
documentation. Using info is pleasant: info
autoconf on any properly set up system. The Goat Book covers the whole
GNU Build System: Autoconf, Automake and Libtool.
Package Name and Version
To set the name and version of your package, change the AC_INIT
invocation. For instance, T4 for the bardec_f group gives
AC_INIT(bardec_f-tiger, 4). Warning: Autoconf 2.53 smashes the
underscores into dashes. To workaround this misfeature, use: AC_INIT([bardec_f-tiger], 4, [bardec_f@epita.fr], [bardec_f-tiger])
Bootstrapping the Package
If something goes wrong, or if it is simply the first time yu create
configure.ac or a Makefile.am, you need to set up the
GNU Build System. The simplest invocation is: $ autoreconf -fvi
The various files (configure, Makefile.in, etc.)
are created. Then invoke configure and make: $ ./configure CC=gcc-3.2 CXX=g++-3.2
$ make
where gcc-3.2 and g++-3.2 are the name of the GNU C and
C++ Compiler 3.2 executables.
At this stage, if running make distcheck does not create
bardec_f-tc-4.tar.bz2, then something is wrong in your package. Do
not rename it, do not create the tarball by hand: something is rotten and be
sure it will break on the examiner's machine.
Flex & Bison
We use Bison 1.875a which is able to produce a C++ parser. This Bison is
unpublished, as the maintainers still have issues to fix. Nevertheless, it is
usable, and perfectly functional for Tiger. It is installed in
~akim/bin, under the name bison. Be aware that Bison
1.875 produces buggy C++ parsers.
If you don't use this Bison, you will be in trouble. If you are willing to
work at home, use bison-1.875a.tar.bz2.
The original papers on Lex and Yacc are:
- Johnson, Stephen C. [1975].
- Yacc: Yet
Another Compiler Compiler. Computing Science Technical Report No. 32, Bell
Laboratories, Murray hill, New Jersey.
- Lesk, M. E. and E. Schmidt [1975].
- Lex: A
Lexical Analyzer Generator. Computing Science Technical Report No. 39,
Bell Laboratories, Murray Hill, New Jersey.
The following introduction guides can help beginners:
- Thomas Niemann.
- A Compact Guide
to Lex & Yacc.
An introduction to Lex and Yacc.
- Collective Work
- Programming with
GNU Software.
Contains information about Autoconf, Automake, Gperf, Flex, Bison, and GCC.
The Bison
documentation, and the Flex documentation are
available for browsing.
Doxygen
We use Doxygen as the
standard tool for producing the developer's documentation of the project. Its
features must be used to produce good documentation, with an
explanation of the role of the arguments etc. The quality of the documentation
will be part of the notation. Details on how to use proper comments are given in
the Doxygen
Manual.
The documentation produced by Doxygen must not be included, but the target
html must produce the html documentation in the
doc/html directory.
Reports
This chapter is meaningless to 2003 students.
Each group will have to deliver two reports in English, and will defend them
in English too.
Dates for the Reports
The 2003 are not yet concerned by this section. Reports will be announced
in epita.cours.compil first.
The reports will have to be given at:
- Sometime
- Computability.
- Some other time
- You must have sent your proposal of a subject for the second report to Akim.
- Later
- Second report.
Requirements over the Reports
The reports you have to write are a training for a task you will regularly
have to fulfill as an engineer. Therefore, there are a few strong requirements.
Being requirements, they will not figure in the mark, nevertheless, a report
that does not meet the requirements will be penalized.
This report will of course include the usual items:
- cover
- Upper-right corner must list the members of the group, the leader of the
team being listed first.
Include a title, the date, and an abstract below the title. An abstract is
a < 20 line-long summary of the report.
- table of contents
- With page numbers of course. Needless to say that the table of contents
must be accurate, and therefore shall point to the right pages.
- introduction or foreword
- Closely related to the abstract, it details the issue the document is
covering, and highlights its structure.
- body
- The report must be structured. In particular, you shall
not simply present chapters named after the questions: you must find the logic
in the topic of the report, and intelligently organize your report. As a rule
of thumb, imagine your report is `lecture notes': you are teaching someone.
- index
- Admittedly useless for such short reports, the index is required so that
you learn how to make indexes.
- bibliography
- Any source must be documented, the reader must be able to
find whatever document you used.
In the body of text, do not give all the details over a book or a Web page,
instead, leave an anchor, aka a key, such as [Richie77], or [1], and in the
bibliography, map each key to a detailed description of the source.
Note that this rule shall be observed for urls too: don't drop urls in the
body, instead leave a key, and detail the key in the bibliography. Also, by no
means a web page shall be mentioned with just its address! That's about the
same thing as just leaving the ISBN for a book: that's useless. Instead,
report the title of the page, the name of its author(s), and its address.
Don't hesitate to produce an annotated bibliography, i.e., entries might be
enriched with a few words which describe its purpose, but also your ranking on
the page (old information, very technical, completely wrong etc.).
- glossary
- When they talk about science in English, most French people just map their
French words to an approximative English writing. Of course this is wrong.
Therefore, you will be noted on the accuracy of your glossary: there, you will
list of the unusual words you needed, and their definition.
No glossary shall be smaller than 20 words. No entry shall duplicate the
content of the report. For instance, if you have a chapter on regular
languages, the glossary shall not redefine the concept again.
In addition, don't provide an encyclopedia in this glossary, rather pay
attention to the definitions of very simple and common technical words (e.g.,
un ensemble, inclus dans etc.).
Computability
This report is both an initiation to scientific reading and writing --in
English--, and to the limitations of computer science.
Your report should explain:
- P
- What does it mean that a problem is a P-problem? What is the definition?
- NP and NP completeness
- Decidability
- But also undecidability, and semi-decidability.
- Computability.
We expect a report, not just a list of "question-answer",
i.e., we expect something which is structured. It does not need to be
long, but your knowledge must be accurate. The understanding of each member of
the group will be evaluated.
Formal Languages
This report is dedicated to formal languages and their processing;
this includes from programming languages and their
compilation/interpretation but you are not limited to it.
Ranging from 20 to 50 pages this report should explain the purpose of the
language, its designers, its advantages, its flaws, what inspired its inception,
how it is meeting its intentions, its associated tools, the specific problems it
gives to the user (implementor of a tool for this language, and/or someone
programming in this language), standardization, versions etc.
It is expected from you that you understand the spirit of this report, not
the letter, so do not follow exactly what is said in the previous paragraph.
You are encouraged to choose a language which is not too much widespread,
such as C, Pascal etc. but if you do, beware that we will be extremely exigent
with you.
Interesting languages includes Claire, Pliant, Erlang, OCaml, Small etc. Documents on old languages
are absolutely welcome: SmallTalk, Snobol, Cobol, etc.
You must write a proposal (less than a single page) for this report (see Dates
for the Reports). We might accept, require changes, or simply refuse your
proposal. Your abstract must explain why you are interested in this topic, and
explain why your work will be pleasant and enriching for both you and us.
Of course, there is a default subject:
Not available yet. Will be given in a near
future.
Appendices
GNU Free Documentation License
Version 1.1, March 2000 Copyright © 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
- PREAMBLE
The purpose of this License is to make a manual, textbook, or other written
document free in the sense of freedom: to assure everyone the
effective freedom to copy and redistribute it, with or without modifying it,
either commercially or noncommercially. Secondarily, this License preserves
for the author and publisher a way to get credit for their work, while not
being considered responsible for modifications made by others.
This License is a kind of "copyleft", which means that derivative works of
the document must themselves be free in the same sense. It complements the GNU
General Public License, which is a copyleft license designed for free
software.
We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free program
should come with manuals providing the same freedoms that the software does.
But this License is not limited to software manuals; it can be used for any
textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose is
instruction or reference.
- APPLICABILITY AND DEFINITIONS
This License applies to any manual or other work that contains a notice
placed by the copyright holder saying it can be distributed under the terms of
this License. The "Document", below, refers to any such manual or work. Any
member of the public is a licensee, and is addressed as "you".
A "Modified Version" of the Document means any work containing the Document
or a portion of it, either copied verbatim, or with modifications and/or
translated into another language.
A "Secondary Section" is a named appendix or a front-matter section of the
Document that deals exclusively with the relationship of the publishers or
authors of the Document to the Document's overall subject (or to related
matters) and contains nothing that could fall directly within that overall
subject. (For example, if the Document is in part a textbook of mathematics, a
Secondary Section may not explain any mathematics.) The relationship could be
a matter of historical connection with the subject or with related matters, or
of legal, commercial, philosophical, ethical or political position regarding
them.
The "Invariant Sections" are certain Secondary Sections whose titles are
designated, as being those of Invariant Sections, in the notice that says that
the Document is released under this License.
The "Cover Texts" are certain short passages of text that are listed, as
Front-Cover Texts or Back-Cover Texts, in the notice that says that the
Document is released under this License.
A "Transparent" copy of the Document means a machine-readable copy,
represented in a format whose specification is available to the general
public, whose contents can be viewed and edited directly and straightforwardly
with generic text editors or (for images composed of pixels) generic paint
programs or (for drawings) some widely available drawing editor, and that is
suitable for input to text formatters or for automatic translation to a
variety of formats suitable for input to text formatters. A copy made in an
otherwise Transparent file format whose markup has been designed to thwart or
discourage subsequent modification by readers is not Transparent. A copy that
is not "Transparent" is called "Opaque".
Examples of suitable formats for Transparent copies include plain
ASCII without markup, Texinfo input format, LaTeX input format,
SGML or XML using a publicly available
DTD, and standard-conforming simple HTML
designed for human modification. Opaque formats include PostScript,
PDF, proprietary formats that can be read and edited only by
proprietary word processors, SGML or XML for
which the DTD and/or processing tools are not generally
available, and the machine-generated HTML produced by some word
processors for output purposes only.
The "Title Page" means, for a printed book, the title page itself, plus
such following pages as are needed to hold, legibly, the material this License
requires to appear in the title page. For works in formats which do not have
any title page as such, "Title Page" means the text near the most prominent
appearance of the work's title, preceding the beginning of the body of the
text.
- VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially
or noncommercially, provided that this License, the copyright notices, and the
license notice saying this License applies to the Document are reproduced in
all copies, and that you add no other conditions whatsoever to those of this
License. You may not use technical measures to obstruct or control the reading
or further copying of the copies you make or distribute. However, you may
accept compensation in exchange for copies. If you distribute a large enough
number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you
may publicly display copies.
- COPYING IN QUANTITY
If you publish printed copies of the Document numbering more than 100, and
the Document's license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover Texts:
Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover.
Both covers must also clearly and legibly identify you as the publisher of
these copies. The front cover must present the full title with all words of
the title equally prominent and visible. You may add other material on the
covers in addition. Copying with changes limited to the covers, as long as
they preserve the title of the Document and satisfy these conditions, can be
treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly,
you should put the first ones listed (as many as fit reasonably) on the actual
cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more
than 100, you must either include a machine-readable Transparent copy along
with each Opaque copy, or state in or with each Opaque copy a
publicly-accessible computer-network location containing a complete
Transparent copy of the Document, free of added material, which the general
network-using public has access to download anonymously at no charge using
public-standard network protocols. If you use the latter option, you must take
reasonably prudent steps, when you begin distribution of Opaque copies in
quantity, to ensure that this Transparent copy will remain thus accessible at
the stated location until at least one year after the last time you distribute
an Opaque copy (directly or through your agents or retailers) of that edition
to the public.
It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to give them a
chance to provide you with an updated version of the Document.
- MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the
conditions of sections 2 and 3 above, provided that you release the Modified
Version under precisely this License, with the Modified Version filling the
role of the Document, thus licensing distribution and modification of the
Modified Version to whoever possesses a copy of it. In addition, you must do
these things in the Modified Version:
- Use in the Title Page (and on the covers, if any) a title distinct from
that of the Document, and from those of previous versions (which should, if
there were any, be listed in the History section of the Document). You may
use the same title as a previous version if the original publisher of that
version gives permission.
- List on the Title Page, as authors, one or more persons or entities
responsible for authorship of the modifications in the Modified Version,
together with at least five of the principal authors of the Document (all of
its principal authors, if it has less than five).
- State on the Title page the name of the publisher of the Modified
Version, as the publisher.
- Preserve all the copyright notices of the Document.
- Add an appropriate copyright notice for your modifications adjacent to
the other copyright notices.
- Include, immediately after the copyright notices, a license notice
giving the public permission to use the Modified Version under the terms of
this License, in the form shown in the Addendum below.
- Preserve in that license notice the full lists of Invariant Sections and
required Cover Texts given in the Document's license notice.
- Include an unaltered copy of this License.
- Preserve the section entitled "History", and its title, and add to it an
item stating at least the title, year, new authors, and publisher of the
Modified Version as given on the Title Page. If there is no section entitled
"History" in the Document, create one stating the title, year, authors, and
publisher of the Document as given on its Title Page, then add an item
describing the Modified Version as stated in the previous sentence.
- Preserve the network location, if any, given in the Document for public
access to a Transparent copy of the Document, and likewise the network
locations given in the Document for previous versions it was based on. These
may be placed in the "History" section. You may omit a network location for
a work that was published at least four years before the Document itself, or
if the original publisher of the version it refers to gives permission.
- In any section entitled "Acknowledgments" or "Dedications", preserve the
section's title, and preserve in the section all the substance and tone of
each of the contributor acknowledgments and/or dedications given therein.
- Preserve all the Invariant Sections of the Document, unaltered in their
text and in their titles. Section numbers or the equivalent are not
considered part of the section titles.
- Delete any section entitled "Endorsements". Such a section may not be
included in the Modified Version.
- Do not retitle any existing section as "Endorsements" or to conflict in
title with any Invariant Section.
If the Modified Version includes new front-matter sections or appendices
that qualify as Secondary Sections and contain no material copied from the
Document, you may at your option designate some or all of these sections as
invariant. To do this, add their titles to the list of Invariant Sections in
the Modified Version's license notice. These titles must be distinct from any
other section titles.
You may add a section entitled "Endorsements", provided it contains nothing
but endorsements of your Modified Version by various parties--for example,
statements of peer review or that the text has been approved by an
organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a
passage of up to 25 words as a Back-Cover Text, to the end of the list of
Cover Texts in the Modified Version. Only one passage of Front-Cover Text and
one of Back-Cover Text may be added by (or through arrangements made by) any
one entity. If the Document already includes a cover text for the same cover,
previously added by you or by arrangement made by the same entity you are
acting on behalf of, you may not add another; but you may replace the old one,
on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give
permission to use their names for publicity for or to assert or imply
endorsement of any Modified Version.
- COMBINING DOCUMENTS
You may combine the Document with other documents released under this
License, under the terms defined in section 4 above for modified versions,
provided that you include in the combination all of the Invariant Sections of
all of the original documents, unmodified, and list them all as Invariant
Sections of your combined work in its license notice.
The combined work need only contain one copy of this License, and multiple
identical Invariant Sections may be replaced with a single copy. If there are
multiple Invariant Sections with the same name but different contents, make
the title of each such section unique by adding at the end of it, in
parentheses, the name of the original author or publisher of that section if
known, or else a unique number. Make the same adjustment to the section titles
in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections entitled "History" in the
various original documents, forming one section entitled "History"; likewise
combine any sections entitled "Acknowledgments", and any sections entitled
"Dedications". You must delete all sections entitled "Endorsements."
- COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this License
in the various documents with a single copy that is included in the
collection, provided that you follow the rules of this License for verbatim
copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it
individually under this License, provided you insert a copy of this License
into the extracted document, and follow this License in all other respects
regarding verbatim copying of that document.
- AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate and
independent documents or works, in or on a volume of a storage or distribution
medium, does not as a whole count as a Modified Version of the Document,
provided no compilation copyright is claimed for the compilation. Such a
compilation is called an "aggregate", and this License does not apply to the
other self-contained works thus compiled with the Document, on account of
their being thus compiled, if they are not themselves derivative works of the
Document.
If the Cover Text requirement of section 3 is applicable to these copies of
the Document, then if the Document is less than one quarter of the entire
aggregate, the Document's Cover Texts may be placed on covers that surround
only the Document within the aggregate. Otherwise they must appear on covers
around the whole aggregate.
- TRANSLATION
Translation is considered a kind of modification, so you may distribute
translations of the Document under the terms of section 4. Replacing Invariant
Sections with translations requires special permission from their copyright
holders, but you may include translations of some or all Invariant Sections in
addition to the original versions of these Invariant Sections. You may include
a translation of this License provided that you also include the original
English version of this License. In case of a disagreement between the
translation and the original English version of this License, the original
English version will prevail.
- TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as
expressly provided for under this License. Any other attempt to copy, modify,
sublicense or distribute the Document is void, and will automatically
terminate your rights under this License. However, parties who have received
copies, or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.
- FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU
Free Documentation License from time to time. Such new versions will be
similar in spirit to the present version, but may differ in detail to address
new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If
the Document specifies that a particular numbered version of this License "or
any later version" applies to it, you have the option of following the terms
and conditions either of that specified version or of any later version that
has been published (not as a draft) by the Free Software Foundation. If the
Document does not specify a version number of this License, you may choose any
version ever published (not as a draft) by the Free Software Foundation.
ADDENDUM: How to use this License for your
documents
To use this License in a document you have written, include a copy of the
License in the document and put the following copyright and license notices just
after the title page: Copyright (C) year your name.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation;
with the Invariant Sections being list their titles, with the
Front-Cover Texts being list, and with the Back-Cover Texts being list.
A copy of the license is included in the section entitled ``GNU
Free Documentation License''.
If you have no Invariant Sections, write "with no Invariant Sections" instead
of saying which ones are invariant. If you have no Front-Cover Texts, write "no
Front-Cover Texts" instead of "Front-Cover Texts being list";
likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend
releasing these examples in parallel under your choice of free software license,
such as the GNU General Public License, to permit their use in free software.
Index
|