Implementation Details of the Syntax Tree

Used Structures

The so-called syntax_tree is in reality a list of real trees. Each tree's root is a Fun_node_ptr. Its children are Expr_node_ptr's, which may themselves have Expr_node_ptr's as children.

Fun_node_ptr

typedef struct _Fun_node_struct {
  gchar*    fun_id;
  Nodetype  nodetype;    //only e_invalid, e_userfun and 
                         //e_constructor are used
  MLType    resulttype;  //type of the value which is returned or 
                         //constructed
  GList*    paramslist;  //list of Expr_node_ptr's for each clause
  GList*    bindingslist;//list of Symboltable_entry_ptr's for each clause
                         //stores the type of id's occurring in parameters
  GList*    bodylist;    //unused in case of e_constructor, otherwise a
                         //list of Expr_node_ptr's for each clause
} Fun_node_struct, *Fun_node_ptr;
   

Expr_node_ptr

typedef struct _Expr_node_struct{
  Nodetype   nodetype;  //the kind of this node
  gchar*     strrep;    //string representation of node
  MLValue    value;     //the value represented by this node 
                        //(may be unused)
  MLType     valuetype; //type of the value found/computed by the typechecker
  MLType     valuetype_expected; //expected type of the value
  gpointer   misc;      //additional stuff for later use 
                        //(e.g. the position on screen)
  //Maybe the following data should be moved into a misc-structure.
  //So use ACCESSMACROS!!!
  gint       lineno;    //line where the token was read
  gint       colno;     //col where the token was read
  struct _Expr_node_struct *child[EXPR_NODE_MAX_CHILDREN];
} Expr_node_struct, *Expr_node_ptr;
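
The comment above asks for access macros so that lineno and colno can later move into a misc structure without touching every caller. A minimal sketch of that idea follows. The macro names, the reduced Expr_node_sketch type, the value of EXPR_NODE_MAX_CHILDREN and the convention that unused child slots are NULL are all assumptions made for illustration; none of them is taken from the real sources.

```c
#include <stddef.h>

/* Assumed value; the real EXPR_NODE_MAX_CHILDREN is not given in this text. */
#define EXPR_NODE_MAX_CHILDREN 3

/* Reduced stand-in for Expr_node_struct with only the fields used here. */
typedef struct Expr_node_sketch {
    int lineno;
    int colno;
    struct Expr_node_sketch *child[EXPR_NODE_MAX_CHILDREN];
} Expr_node_sketch;

/* Hypothetical access macros: if the position data moves into another
 * structure later, only these definitions have to change. */
#define EXPR_NODE_LINENO(n)   ((n)->lineno)
#define EXPR_NODE_COLNO(n)    ((n)->colno)
#define EXPR_NODE_CHILD(n, i) ((n)->child[(i)])

/* Counts the nodes of a subtree, assuming unused child slots are NULL. */
static int expr_node_count(const Expr_node_sketch *node)
{
    int total = 1;
    int i;

    if (node == NULL)
        return 0;
    for (i = 0; i < EXPR_NODE_MAX_CHILDREN; i++)
        total += expr_node_count(EXPR_NODE_CHILD(node, i));
    return total;
}
```

Code that reads positions or children only through the macros does not need to know where the fields are stored.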
   

Syntax Checking

Syntax checking is done by the code generated automatically by (LEX &) YACC.

Symbol Checking

Within a datatype-system, only those types may be used that are already defined or that will be defined later within the same datatype-system. This makes it somewhat harder to check whether the type currently being read is valid. It is therefore necessary to remember a type that is still undefined and to check its definition again once the whole datatype-system has been parsed.
Function-systems are similar. Within a body, only those functions may occur that are already defined or will be defined within the same system.

Datatype-systems

The current datatype within a datatype-system is stored in the symbol_table if its identifier is not already stored there. If it is already in the symbol_table, the identifier has been defined elsewhere.
The same is done for each constructor. Each type occurring in the definition of a constructor is checked. If it is undefined, that is not yet an error. To remember the identifier, it is stored in a special list called undefined_symbols.
Once the whole datatype-system has been parsed, each identifier in undefined_symbols is checked again to see whether it is defined now. If it is not, that is an error. Afterwards undefined_symbols is cleared for the next system.
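
The deferred check described above can be sketched in plain C. This is only an illustration under assumptions: the real code presumably uses GList and Symboltable_entry_ptr, whereas the Name_list type and the functions declare_type, reference_type and resolve_undefined are hypothetical stand-ins that store bare identifier strings.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 32   /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for both symbol_table and undefined_symbols. */
typedef struct {
    const char *names[MAX_NAMES];
    size_t count;
} Name_list;

static int name_list_contains(const Name_list *list, const char *id)
{
    size_t i;
    for (i = 0; i < list->count; i++)
        if (strcmp(list->names[i], id) == 0)
            return 1;
    return 0;
}

/* Returns 1 on success, 0 if the list is full. */
static int name_list_add(Name_list *list, const char *id)
{
    if (list->count >= MAX_NAMES)
        return 0;
    list->names[list->count++] = id;
    return 1;
}

/* Defines a datatype or constructor; returns 0 if it is already defined. */
static int declare_type(Name_list *symbol_table, const char *id)
{
    if (name_list_contains(symbol_table, id))
        return 0;
    return name_list_add(symbol_table, id);
}

/* A type used in a constructor definition: remember it if still undefined. */
static void reference_type(const Name_list *symbol_table,
                           Name_list *undefined_symbols, const char *id)
{
    if (!name_list_contains(symbol_table, id) &&
        !name_list_contains(undefined_symbols, id))
        name_list_add(undefined_symbols, id);
}

/* End of the datatype-system: counts identifiers that are still undefined
 * and clears undefined_symbols for the next system. */
static size_t resolve_undefined(const Name_list *symbol_table,
                                Name_list *undefined_symbols)
{
    size_t errors = 0;
    size_t i;
    for (i = 0; i < undefined_symbols->count; i++)
        if (!name_list_contains(symbol_table, undefined_symbols->names[i]))
            errors++;
    undefined_symbols->count = 0;
    return errors;
}
```

A type that is referenced before its definition ends up in undefined_symbols first and only counts as an error if it is still missing when resolve_undefined runs.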

Function-systems

If the current function within a function-system is not already stored in the symbol_table, it is stored now. Otherwise it is an error. Each parameter identifier is stored in a binding_table.
Finally, the body is parsed. Each identifier that is not a function application has to be found in the symbol_table or the binding_table. If it is not, that is an error. Each identifier of a function application that is not already in the symbol_table is stored in undefined_symbols. After the whole function-system has been parsed, these identifiers are checked again to see whether they are now defined in the symbol_table.
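
The three possible outcomes for an identifier in a function body can be sketched as a single decision function. The names Id_status, check_body_identifier and the array-based tables are hypothetical and stand in for the real symbol_table and binding_table, whose types are not shown in this text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result of checking one identifier in a function body. */
typedef enum {
    ID_OK,        /* found in symbol_table or binding_table */
    ID_DEFERRED,  /* function application, to be stored in undefined_symbols */
    ID_ERROR      /* plain identifier that is defined nowhere */
} Id_status;

static int contains(const char *const *names, size_t count, const char *id)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (strcmp(names[i], id) == 0)
            return 1;
    return 0;
}

/* Applies the rule described above: function applications may be resolved
 * later, every other identifier must be known right away. */
static Id_status check_body_identifier(const char *id, int is_application,
                                       const char *const *symbol_table,
                                       size_t n_symbols,
                                       const char *const *binding_table,
                                       size_t n_bindings)
{
    if (contains(symbol_table, n_symbols, id))
        return ID_OK;
    if (is_application)
        return ID_DEFERRED;
    if (contains(binding_table, n_bindings, id))
        return ID_OK;
    return ID_ERROR;
}
```

With "even" in the symbol table and "n" bound as a parameter, an application of the not yet defined "odd" is deferred, while an unbound plain identifier is rejected immediately.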

Error messages

To present the user with a meaningful error message, the line and column number of each identifier must be stored along with it in undefined_symbols.
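
One way to keep the position together with the identifier is a small record per entry in undefined_symbols. The Undefined_symbol type, the function format_undefined_error and the wording of the message are assumptions for illustration; the real entry type and message text are not shown in this text.

```c
#include <stdio.h>

/* Hypothetical entry of undefined_symbols: the identifier plus the
 * position where it was read. */
typedef struct {
    const char *id;
    int lineno;
    int colno;
} Undefined_symbol;

/* Writes an error message for one unresolved identifier into buf.
 * Returns the value of snprintf, i.e. the length the full message needs. */
static int format_undefined_error(char *buf, size_t size,
                                  const Undefined_symbol *sym)
{
    return snprintf(buf, size,
                    "line %d, column %d: undefined identifier '%s'",
                    sym->lineno, sym->colno, sym->id);
}
```

Because the position is captured when the identifier is first seen, the message can still point at the right place even though the error is only detected after the whole system has been parsed.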

Type Checking

Type checking is done in a separate step by traversing the whole syntax tree. That keeps the YACC file from becoming too complex; building the syntax tree and checking the symbols makes it complex enough. Another reason is that the syntax tree is built bottom-up, whereas type checking is done top-down.
**** to be done ****

Memory Issues