Checker - nponeccop/HNC GitHub Wiki

The Checker assigns types to all subterms of a term.

Source files

Type inference is mostly contained in:

  • SPL.Types.T - data type for types
  • HN.Intermediate.Root - data type for terms
  • AG\Root.ag - propagates @finalSubstitution and @library down
  • AG\TypeInference.ag - calculates @s, @tau for all subterms and @templateArgs for expressions
  • AG\TemplateArgs.ag - calculates @templateArgs and @contextTemplateArgs for definitions

Data Type for Untyped Terms

The terms are represented by HN.Intermediate.Root Haskell type.

  • A term is an ordered collection of definitions.
  • A definition is a triple of a name, formal arguments and let-in block.
  • A let-in block is a possibly empty sequence of "lets" followed by a single "in".
  • A "let" is recursively defined as a definition.
  • An "in" is an expression.
  • An expression consists of identifiers, numerals, literals and applications.

A little more formally:

term = definition+
definition = name argument* letin
letin = let* in
let = definition
in = expression
expression = identifier | application | literal | numeral
application = expression expression+

Notes:

  • A term cannot contain type declarations
  • An expression doesn't contain lambda abstractions or let-in blocks.
  • Variables are represented by definitions with 0 arguments.
  • Variables always have 0 lets
  • Functions are represented by definitions with arguments.
  • A function must have a name - there are no anonymous functions.
  • Functions are uncurried and cannot be partially applied.
  • If a function identifier is mentioned in an expression, it has either 0 actual arguments (function reference) or a full list of actual arguments (full function application).

Data Type for Types

The types are represented by SPL.Types.T Haskell type. However, only first 5 constructors are used: T, TT, TU, TV and TD. The rest are either unused or used within SPL.

  • T is a primitive type
  • TT is a function type
  • TV is a generic type variable
  • TU is a non-generic type variable
  • TD is an application of a type list to a parametrized type
type = T string
     | TT type type+
     | TV string
     | TU string
     | TD string type+

Representation of Typed Terms

Typed terms are not represented by a Haskell type but by an attributed grammar over terms.

Each definition in a term has @definitionType, @templateArgs and @contextTemplateArgs local attributes.

Each (sub)expression in a term has @templateArguments local attribute.

They play the following roles generation of C++ variables:

@definitionType foo = bar(baz<@templateArguments>(2, "quux"));

With C++ functions it is more complex. Function return type and argument types are extracted from @definitionType (which is always TT for functions)

template <@contextTemplateArgs>
struct foo_impl
{
   ...
}

template <@templateArgs>
returnType foo(arg0type a, arg1type b)
{
    typedef foo_impl<@contextTemplateArgs> local;
    local impl;
    ...
}

Note that formal and actual template arguments for foo_impl are textually the same, so @contextTemplateArgs is used in two places.

Type Inference

Classical algorithm W from the Damas paper is used. Unfortunately the paper contained typos in critical places, so I had to use Cardelli's paper to correct them.

The @library inherited attribute provides definition types for free identifiers in a term.

Given @library and a term, the algorithm calculates @s and @tau attributes
for each definition and (sub)expression.

Since some extra information is required to generate C++ types besides @s and @tau, after genuine Damas-Milner-Cardelli inference I do some extra work:

  • All @s are propagated up and composed to create a giant @finalSubstitution for whole term.
  • This @finalSubstitution is then propagated back down to every subterm.
  • These three values (@s, @tau and @finalSubstitution) are then used to calculate the 4 attributes mentioned in the previous chapter.

References

  1. Damas, L. M. M. Type Assignment in Programming Languages. PhD thesis, University of Edinburgh, 1985

  2. Damas, L., and Milner, R. Principal type schemes for functional programs. Proceedings of the 9th Annual Symposium on Principles of Programming Languages (January 1982), 207-212

  3. L.Cardelli, P.Wegner. On understanding types, data abstraction and polymorphism, Computing Surveys, Vol 17 n. 4, pp 471-522, December 1985.