README - UCSC-CSE-114A/05-types GitHub Wiki
In Assignment 4, you wrote an interpreter for the Nano programming language, which resembles a (very, very small) cousin of Haskell. This interpreter lets you execute Nano programs and see what their results are.
However, some Nano programs don't really make sense: for example, what should a program like 3 && True
, or (\x -> x) && False
, or 3 + head
mean? Should these programs even have a meaning? In fact, our Nano interpreter will throw an error for them. Wouldn't it be great if we could tell if a Nano program was going to fail before we ran it?
By now, you will have seen Haskell report type errors for this kind of situation. For example, you can use the :t
command in GHCi to ask what the type of any Haskell expression is -- without actually evaluating the expression. This is called type checking, and it's a form of static analysis: static, because it occurs before runtime, and analysis, because we are studying a program to obtain some information about it. Type checking can't catch all errors; but if your program is considered well-typed by the type checker, then you're guaranteed that it won't exhibit the kinds of errors that your type system is meant to catch.
In this assignment, you will implement a type checker for Nano, just like the :t
command in GHCi. If a program fails to type-check, you will be able to identify the error (e.g., if your program tried to add 3
and False
); and if a program does type-check, you will be able to say what its type actually is.
As a bonus, your checker is designed in such a way that it can support type polymorphism, so that generic functions like \x -> x
and map
can be defined without having to write down specialized copies for every type of data you want to invoke them on.
Note: Start early! Type inference has many subtle points that can take a while to get right.
This assignment may be done either individually, or as a team of two.
If you choose to work as a team of two, both teammates will be expected to be able to understand and explain all of the submitted code. So you need to work on all the problems collaboratively (e.g., Alice and Bob pair program on all problems), rather than splitting it up (e.g., Alice does problem 1, Bob does problem 2, ...). As a result, working on a team may actually be harder than working alone (e.g., if your teammate isn't great at communication and collaboration). We recommend working alone unless you have already identified a teammate who you know well and are confident that you can communicate and collaborate with well.
When you accept the assignment on GitHub Classroom, you'll be asked to create a team or join an existing team.
- If you want to work individually, just create a new team for yourself (and name your team however you like).
- If you want to work together with another student, then decide which person will create the team, and then have the other person join the newly created team. Both of you will need to accept the assignment on GitHub Classroom, but the first person will create the team and the second person will join the team. The team will then work together in a single repository. Do not join someone's team without getting their permission first!
Even if you work on a team, each teammate still needs to individually submit the assignment to the autograder and will individually receive a grade.
When you are ready to submit your solutions to the autograder, follow these instructions step-by-step:
- You must create a file
INTEGRITY.md
at the top level of your assignment repository in which you give credit to all sources you used while working on the assignment. Thorough citation is the way to avoid running afoul of accusations of misconduct.- If you work as a team, then your team's INTEGRITY.md file should include the names of both teammates, and should explain how you collaborated on all the assignment problems.
- So that you have an idea of the level of detail that is expected, there is an example
INTEGRITY.md
file that you can reference. - If you didn't reference any external sources, just say so in
INTEGRITY.md
.
- Commit your changes and push your Git repository. (You should be doing this incrementally as you work on the assignment anyway.)
- Submit your solutions using the autograder client:
where
python3 -m autograder.run.submit \ INTEGRITY.md \ src/Language/Nano/TypeCheck.hs \ --user <email> \ --pass <password> \ --course <course>
<email>
is your UCSC email address,<password>
is your autograder password, and<course>
is an identifier specific to this course and section, which you can find on the Canvas assignment description for this assignment.
Running the autograder.run.submit
command will submit your code to the autograder for evaluation and print out the results. Have patience -- it might take a minute or two to see output from the autograder. You can submit as many times as you like before the deadline, and only the last submission will be counted.
The autograder client has other commands that you can use to interact with the autograder. In particular, you can run python3 -m autograder.run.peek
to see the results of your last submission, and python3 -m autograder.run.history
to see the results of all your submissions. Refer to the autograder client documentation for more information.
If you cannot get your implementation of a function to compile by the time you submit your solutions, leave the broken function defined as error ...
, with your partial solution enclosed below as a comment. Since all of your solutions are are compiled together, a compile-time error with one solution would prevent all of them from being graded!
In this assignment, you will edit the src/Language/Nano/TypeCheck.hs file. This file contains a set of functions with bodies that simply throw an error (error "TBD: ..."
); your mission is to complete those functions according to the problem descriptions below. You should only need to modify the parts of the files that say error "TBD: ..."
with suitable Haskell implementations. However, you may define your own helper functions if it helps you; and if you're asked to fill in a function definition, such as:
f xs = error "TBD: ..."
you are also allowed to split this definition into multiple equations, like so:
f [] = ...
f (x:xs) = ...
Most problems require relatively little code ranging from 2 to 10 lines. Some functions, particularly unify
and infer
, will have many clauses due to pattern matching; then each clause should individually require relatively little code. If any clause requires more than that, you can be sure that you need to rethink your solution.
As you complete each problem, you should run stack test
to check for various possible issues with your solution. If any tests fail, you will not get full credit. Conversely, you should assume that the autograder may run some additional tests that are not included with the assignment; so having a clean stack test
does not necessarily mean that your assignment will get full points. You should only trust the numbers reported by the autograder when you submit your assignment.
When you run stack test
, the end of the output should include the lines
All N tests passed (...)
OVERALL SCORE = ... / ...
or
K out of N tests failed
OVERALL SCORE = ... / ...
If your output does not have one of the above, your code will receive a zero.
The other lines will give you a readout for each test. You are encouraged to try to understand the testing code, but you will not be graded on this.
You may also find it useful to use the Haskell interpreter, GHCi, to test your code by hand. To do this, run the stack ghci
command, which should wait for further input on a ghci>
prompt. You can enter Haskell expressions at this prompt to evaluate them, including expressions that use functions defined in the assignment. The ghci
documentation has a fuller introduction to GHCi.
This assignment builds upon our previous implementation of Nano from HW4,
so most data types in Types.hs
are the same as they were before.
We have added the following definitions to support type inference.
We will use the following Haskell data types to represent Nano's types and poly-types. (A "poly-type" is a polymorphic type, like the type of the identity function, \x -> x
: its type is forall a . a -> a
. (Haskell also has polymorphic types, but you don't usually write the forall a .
part.)
data Type
= TInt -- Int
| TBool -- Bool
| Type :=> Type -- function type: T1 -> T2
| TVar TId -- type variable: a, b, c
| TList Type -- list type: [T]
data Poly = Mono Type -- mono-type
| Forall TVar Poly -- polymorphic type
Above, TId
is just a type alias for String
, used to represent type variable names:
type TId = String
For example, a Nano type Int -> Int
is represented inside your type checker as:
TInt :=> TInt
whereas a polymorphic Nano type forall a . List a -> a
is represented as:
Forall "a" (Mono (TList (TVar "a") :=> TVar "a"))
(or simply forall "a" (list "a" :=> "a")
using convenience functions forall
and list
, also defined in Types.hs
).
A type environment is a dictionary that maps program variables to poly-types. As before, the dictionary is implemented as a list of pairs:
type TypeEnv = [(Id, Poly)]
For example, a type environment that maps the program variable x
to the type Int
, and maps the program variable f
to the type Int -> Int
, is represented as:
[("x", Mono TInt), ("f", Mono (TInt :=> TInt))]
A type substitution (or just "substitution" for short) is also a dictionary, but it maps type variables to types they should be replaced with.
type Subst = [(TId, Type)]
For example, a substitution that maps the type variable a
to the type Int
, maps the type variable b
to the type Int -> [Int]
, and maps the type variable c
to the type d
(where d
is itself a type variable), is represented as:
[("a", TInt), ("b", TInt :=> TList TInt), ("c", TVar "d")]
When you implement operations on substitutions, you can assume that the keys (type variables) in a substitution are unique. Conversely, in code that uses operations on substitutions, you should guarantee that this is the case.
In this problem, you will implement some helper functions for your type checker. These functions will be useful in later parts of the assignment.
First, implement two functions that compute the list of free type variables in a type and in a poly-type, in the order they appear in the type.
A type variable is free if it is not bound by an enclosing forall
binder. For example, in the poly-type forall a . a -> [b]
, the type variable a
is bound, but b
is free, so freeTVars
should return ["b"]
:
ghci> freeTVars (forall "a" ("a" :=> "b"))
["b"]
The list that freeTVars
returns should not contain duplicate elements. For example, even though a
occurs free twice in a -> a -> b
, freeTVars
should still return only ["a", "b"]
.
To enable overloading (using a function with the same name for types and poly-types),
the freeTVars
function is part of a type class HasTVars
.
-- | Type variables of a type
instance HasTVars Type where
freeTVars :: Type -> [TId]
freeTVars t = error "TBD: type freeTVars"
-- | Free type variables of a poly-type (remove forall-bound vars)
instance HasTVars Poly where
freeTVars :: Poly -> [TId]
freeTVars s = error "TBD: poly freeTVars"
As always, you are allowed (and encouraged) to replace freeTVars t
and freeTVars s
with multiple equations for different patterns.
Hint
For the Type
instance of freeTVars
, You should pattern-match against each of the constructors for the Type
type: TInt
, TBool
, :=>
, and so on. Not all type constructors have free type variables. For example, TInt
doesn't. In that case, you should return []
.
The trickiest case is the :=>
constructor; in that case, you'll need to make a couple of recursive calls to freeTVars
and combine the results somehow, remembering to handle duplicates.
Hint
For the Poly
instance of freeTVars
, You should pattern-match against each of the constructors for the Poly
type. Call freeTVars
recursively as needed. forall
-bound variables aren't free, so remember to remove them!
Hint
Throughout this assignment, you are allowed to use any library functions,
as long as you don't add new import
statements.
In particular, some useful functions from the List
library include union
and delete
(look them up on Hoogle). Also, keep in mind that the List
library is imported qualified:
import qualified Data.List as L
So whenever you want to use a function from that library, you have to prefix its name with L.
(e.g., L.delete
).
When you are done, you should get the following behavior:
ghci> freeTVars TInt
[]
ghci> freeTVars (forall "a" (list "a" :=> "a"))
[]
ghci> freeTVars (TVar "a")
["a"]
ghci> freeTVars (list (list "a"))
["a"]
ghci> freeTVars (TVar "a" :=> TVar "b")
["a", "b"]
ghci> freeTVars (TVar "a" :=> TVar "a")
["a"]
ghci> freeTVars (forall "a" (TVar "a" :=> TVar "b"))
["b"]
ghci> freeTVars (forall "a" (TVar "a" :=> TVar "a"))
[]
Recall from the description above that a substitution maps type variables to types. For example, a substitution that maps the type variable a
to the type Int
, maps the type variable b
to the type Int -> [Int]
, and maps the type variable c
to the type d
(where d
is itself a type variable), is represented as:
[("a", TInt), ("b", TInt :=> TList TInt), ("c", TVar "d")]
Implement a function to look up what a type variable maps to in a substitution:
lookupTVar :: TVar -> Subst -> Type
and a function to remove the mapping for a given type variable from a substitution (recall that all type variables in a substitution are supposed to be unique):
removeTVar :: TVar -> Subst -> Subst
When you are done you should get the following behavior:
ghci> lookupTVar "a" [("a", TInt)]
Int
-- if the specified type variable isn't in the substitution,
-- lookupTVar should just return it (as a type)
ghci> lookupTVar "a" [("b", TInt)]
a
ghci> removeTVar "a" [("a", TInt)]
[]
ghci> removeTVar "a" [("b", TInt)]
[("b",Int)]
Hint
A Subst
is just a list. You know how to traverse these in Haskell. Hint: use recursion!
Hint
In removeTVar
, you're building a list that (potentially) leaves out an element from the original list.
Next, use lookupTVar
and removeTVar
to write
two functions that apply a substitution
to a type and to a poly-type
(both named apply
), and return the resulting type. Applying a substitution to a type means replacing any free type variables in the type with whatever the substitution maps them to. You should take care to leave bound type variables alone. For example, if you're applying the substitution [("a",TInt)]
to the type forall a . [a]
, then the resulting type is still forall a . [a]
.
instance Substitutable Type where
apply :: Subst -> Type -> Type
apply sub t = error "TBD: ..."
instance Substitutable Poly where
apply :: Subst -> Poly -> Poly
apply sub p = error "TBD: ..."
Once you have implemented this functionality, you should get the following behavior:
ghci> apply [("a",TInt)] (list "a")
[Int]
ghci> apply [("a",TInt)] (list "b")
[b]
ghci> apply [("a",list TInt), ("b", TInt)] ("b" :=> "a")
(Int) -> [Int]
ghci> apply [("a",TInt)] (forall "a" (list "a"))
forall a . [a]
ghci> apply [("b", TInt)] (forall "a" ("a" :=> "b"))
forall a . (a) -> Int
Hint
Use removeTVar
to deal with bound type variables in the implementation of apply
.
Finally, use apply
to implement the function extendSubst :: Subst -> TVar -> Type -> Subst
,
which extends a substitution with a new mapping from a type variable to a type.
The behavior of extendSubst
is rather subtle:
- If the existing substitution contains a mapping for a given type variable, and the type you're adding contains free occurrences of that type variable, then
extendSubst
should replace those occurrences with whatever the existing substitution maps them to. For example, if the current substitution is[("a", Int)]
and you are adding a mapping fromb
to[a]
, then the substitution becomes[("b",[Int]),("a",Int)]
-- not[("b",[a]),("a",Int)]
- If you are adding a mapping for a given type variable, and any types in the existing substitution contain occurrences of that type variable, then
extendSubst
should replace those occurrences with whatever the new mapping says they should be. For example, if the current substitution is[("a", [b]]
and you are adding a mapping fromb
toBool
, then the substitution becomes[("b",Bool),("a",[Bool])]
-- not[("b",Bool),("a",[b])]
.
When you are done, you should get the following behavior:
ghci> extendSubst [("a", TInt)] "b" TBool
[("b",Bool), ("a",Int)]
ghci> extendSubst [("a", TInt)] "b" (list "a")
[("b",[Int]),("a",Int)]
ghci> extendSubst [("a", list "b")] "b" TBool
[("b",Bool), ("a",[Bool])]
Hint
Your implementation of extendSubst
should call apply
twice.
Unification is a crucial part of type inference. Unification means coming up with a substitution that makes two types the "same".
For instance, consider the types a -> Int
and Bool -> b
. To make these types the same, the type variable a
should be replaced with Bool
, and the type variable b
should be replaced with Int
. Therefore, the substitution [(a, Bool), (b, Int)]
is a unifier for these two type.
Of course, some types can't be unified. For example, there is no substitution that is a unifier for the types Int
and Bool
. More interestingly, there is no substitution that is a unifier for the types a
and [a]
. Therefore, some attempts at unification will fail.
In a nutshell, the type inference algorithm that we'll implement works by:
- recursively traversing the program, assembling expression types from the types of their sub-expressions;
- when some type is not yet known, it picks a fresh type variable to represent this type;
- whenever two types must be the same, the algorithm tries to unify them and figure out what these type variables actually stand for. If unification fails, then type inference fails.
A fresh type variable is a type variable that is distinct from all type variables the algorithm has generated before.
Our implementation will name the fresh type variables a0, a1, ...
.
In order to remember the index of the first unused fresh variable,
we introduce a new data type called InferState
:
data InferState = InferState {
stSub :: Subst -- ^ current substitution
, stCnt :: Int -- ^ number of fresh type variables generated so far
}
An InferState
is a Haskell record, which we haven't seen before. A record is like a tuple, but its fields have names. An InferState
has two named fields: stSub
, which represents the current substitution, and stCnt
, which represents the number of fresh type variables represented so far. It represents the "current state" of the unification and inference algorithms.
It remembers how many fresh type variables the algorithm has generated so far,
alongside the "current substitution".
The provided code defines initInferState
for you, which contains an empty substitution and a stCnt
of 0.
Implement a function unifyTVar :: InferState -> TId -> Type -> InferState
. unifyTVar
takes an InferState
(containing a substitution) st
, a type variable a
, and a type t
, and tries to produce a new InferState
(containing an updated substitution that is a unifier for a
and t
.
If it succeeds, it records the new type assignment by extending the state st
.
If it fails, it throws an exception.
You can use the provided helper function extendState
to extend a substitution inside a state.
Assumption:
In the implementation of unifyTVar
, you can assume that a
does not appear in the domain of stSub st
(i.e., among the type variables it maps);
remember that code that uses unifyTVar
should guarantee this property.
When you are done, you should get the following behavior (recall that initInferState
contains an empty substitution):
ghci> stSub (unifyTVar initInferState "a" (list "b"))
[("a",[b])]
ghci> stSub (unifyTVar initInferState "a" "a")
[]
ghci> stSub (unifyTVar initInferState "a" (TInt :=> TBool))
[("a",(Int) -> Bool)]
ghci> stSub (unifyTVar initInferState "a" ("a" :=> TBool))
*** Exception: Error {errMsg = "type error: cannot unify a and (a) -> Bool (occurs check)"}
ghci> stSub (unifyTVar initInferState "a" (list "a"))
*** Exception: Error {errMsg = "type error: cannot unify a and [a] (occurs check)"}
Hint
The implementation of unifyTVar
is quite simple. There are three cases to handle:
- Unifying a type variable with itself (for instance, unifying
a
witha
) succeeds, and doesn't change the existing substitution. - Attempting to unify a type variable with a type containing a free occurrence of that type variable somewhere in its structure (for instance, unifying
a
with[a]
, or unifyinga
witha -> Int
) fails, and is known as the "occurs check", so you should throw an exception. To get an exception like the one above, you could usethrow (Error ("type error: cannot unify " ++ a ++ " and " ++ show t ++ " (occurs check)"))
(but any error message containing the string"type error"
will pass tests). - Unifying a type variable with a type not containing a free occurrence of that type variable succeeds, and you should use
extendState
.
With unifyTVar
done, you can now implement a function unify :: InferState -> Type -> Type -> InferState
. unify
takes an InferState
containing a substitution, and two types t1
and t2
. It tries to unify t1
and t2
, extending the substitution as needed to unify them, or fails and throws an exception.
Assumption:
In the implementation of unify
, you can assume that any type variables in t1
and t2
do not appear in the domain of stSub st
;
remember that code that uses unify
should guarantee this property.
Here are some examples of how unify
should behave:
ghci> stSub (unify initInferState TInt TInt)
[]
ghci> stSub (unify initInferState TInt TBool)
*** Exception: Error {errMsg = "type error: cannot unify Int and Bool"}
ghci> stSub (unify initInferState "a" "a")
[]
ghci> stSub (unify initInferState "a" "b")
[("a",b)]
ghci> stSub (unify initInferState ("a" :=> TInt) (TBool :=> "b"))
[("b",Int),("a",Bool)]
ghci> stSub (unify initInferState (list "a") (list (list "a")))
*** Exception: Error {errMsg = "type error: cannot unify a and [a] (occurs check)"}
ghci> stSub (unify initInferState (list "a" :=> "b") ("c" :=> TBool))
[("b",Bool),("c",[a])]
ghci> stSub (unify initInferState (list "a" :=> "b") (TInt :=> TBool))
*** Exception: Error {errMsg = "type error: cannot unify [a] and Int"}
ghci> stSub (unify initInferState ("a" :=> "b") (TInt :=> TBool))
[("b",Bool),("a",Int)]
ghci> stSub (unify initInferState ("a" :=> (TBool :=> "b")) ((TInt :=> "c") :=> ("d" :=> list "e")))
[("b",[e]),("d",Bool),("a",(Int) -> c)]
Hint
Here are some hints on how to implement the trickiest parts of `unify`:- When either
Type
argument is aTVar
, then delegate tounifyTVar
. - The trickiest case is when both
Type
arguments are functions, e.g.,argType :=> resultType
. In this situation, you should:- Unify the
argType
s. - Apply the newfound substitution to the
resultType
s. You should already know what function does this. - Unify the
resultType
s.
- Unify the
Now we have everything in place to implement type inference!
Let's first consider the fragment of the Nano language without:
- binary operators
- conditionals
- polymorphism
- recursion
- lists
Implement the function infer :: InferState -> TypeEnv -> Expr -> (InferState, Type)
for this fragment.
The infer
function takes arguments st
(an InferState
), gamma
(a TypeEnv
), and e
(an Expr
). It attempts to infer the type of e
in type environment gamma
given that the current state is st
.
If e
imposes constraints on the types of its sub-expressions,
infer
calls unify
to enforce these constraints.
If all unification steps succeed, infer
returns a tuple of the InferState
(possibly extended with new type assignments generated during unification) and the inferred type of e
.
For this part of the assignment, you can assume that the type environment maps all variables to mono-types, i.e., Mono t
.
In particular, you should observe the following behavior:
ghci> typeOfString "True"
Bool
ghci> typeOfString "1"
Int
ghci> typeOfString "(\\x -> x) 5"
Int
ghci> typeOfString "(\\x -> x + 1) True"
*** Exception: Error {errMsg = "type error: cannot unify Int and Bool"}
ghci> typeOfString "let x = 5 in x + 3"
Int
Hint
There are six cases of infer
that you'll need to implement: EInt
, EBool
, EVar
, ELam
, EApp
, and ELet
.
-
EInt
andEBool
should be easy. -
EVar
will require you to look up the type of a variable in the type environment, usinglookupVarType
. The only tricky thing to deal with here is that aTypeEnv
containsPoly
types, and you just want aType
. To pass test cases like those shown above (and all of the other test cases for 3(a)), you can write a simple helper function that takes aPoly
and returns aType
. In theMono
case, this function should return the inner value of typeType
, and in theForall
case, you could throw an error or leave it asundefined
for now. (You'll need to revisit this decision when implementing part 3(b) and beyond, but for this part of the assignment, it's fine.) - For the
(ELam x body)
case, remember what the typing rule for lambda expressions looks like. Your result type will be something of the shapet1 :=> t2
. You'll need to:
- Create a fresh type variable for the type of the function's argument,
x
. (Remember to update theInferState
to count this newly created type variable.) - Extend the type environment with
x
and the newly created type variable. - Infer the type of the function body in the extended type environment.
- Since step 3 may update the substitution, you'll need to apply the updated substitution to the argument type variable to get the final argument type. (For example, consider a function like
\x -> x + 1
. This function has typeInt -> Int
, but the only way that we know that the type of the argumentx
isInt
is because of the constraint imposed by+
in the body, which says that both of its arguments must be of typeInt
. So this last step is crucial. Without it, you'll end up inferring something like(a0) -> Int
as the type of\x -> x + 1
. That's a sure sign that you've forgotten this last step.)
- For the
(EApp e1 e2)
case, remember what the typing rule for application expressions looks like.e1
should be of function type,e2
should be the same as the argument type ofe1
, and the type of the entire expression should be the same as the result type ofe1
. You'll need to:
- Infer the type of
e1
, the function. - Infer the type of
e2
, the argument (using theInferState
that you got from step 1). - Create a fresh type variable for the function's result type (which will be the type of the entire application expression). (Remember to update the
InferState
to count this newly created type variable.) - Unify the inferred type of
e1
with the function type that it ought to be (making use of the inferred type ofe2
and the type variable that you created). This step will ensure that the specified argument is of the appropriate type to be called with the specified function. If you leave out this step, then your type checker will not reject programs like(\x -> x + 1) True
as ill-typed! - Since the previous steps may update the substitution, you'll need to apply the updated substitution to the type variable you created to get the type of the entire application expression. (For example, consider an application like
(\x -> True) 3
. This expression has typeBool
, but the only reason we know that is because of the constraint imposed byTrue
in the body of the function. If you leave out this step, the type that you infer for this expression will still be an uninstantiated type variable.
- For the
(ELet x e1 e2)
case, remember what the typing rule forlet
expressions looks like. You'll need to:
- Infer the type of
e1
. - Create a fresh type variable for
x
's type. (Remember to update theInferState
to count this newly created type variable.) - Extend the type environment with the newly created type variable.
- Unify the inferred type of
e1
with the type ofx
. - Since the previous steps may update the substitution, you'll need to apply the updated substitution to the type environment.
- Finally, infer the type of
e2
in the extended type environment, giving you the type of the entirelet
expression.
When you are finished with this part of the assignment, you should be passing tests 3atest1.hs
through 3atest4.hs
, and the above examples should also work.
In this part of the assignment, you will add support for polymorphism.
-
Variables in the type environment can now have polymorphic types, like
forall a . a -> a
. -
Whenever we use a variable, we have to instantiate its poly-type, i.e., replace all bound type variables with fresh free type variables.
-
Whenever we define a variable using a
let
-expression, we have to generalize its type into a poly-type.
First implement the function generalize :: TypeEnv -> Type -> Poly
.
generalize gamma t
generalizes the type t
into a poly-type,
i.e., binds all its type variables that are not free in gamma
with a Forall
:
When you are done, you should observe the following behavior:
ghci> generalize [] ("a" :=> "a")
forall a . (a) -> a
ghci> generalize [("x", Mono "a")] ("a" :=> "a")
(a) -> a
We've provided for you an implementation of the function instantiate :: Int -> Poly -> (Int, Type)
, which is the opposite of generalize
: instantiate n s
instantiates a poly-type s
by replacing its bound type variables with fresh type variables.
Here, n
is the index of the first unused fresh type variable.
ghci> instantiate 0 (Forall "a" (Forall "b" (Mono ("a" :=> "b"))))
(2,(a0) -> a1)
Finally, modify your implementation of infer
to support polymorphism, using generalize
and instantiate
.
When you are done, your implementation should be able to
type-check the following example, in which the same function (the identity function, id
) is used in two different places, on arguments of different types:
ghci> typeOfString "let id = \\x -> x in let y = id 5 in id (\\z -> z + y)"
(Int) -> Int
Hint
You'll need to use instantiate
in just one place: when you look up the type of a variable in a type environment (which will happen in the EVar
case of infer
). You'll also need to use generalize
in just one place: when inferring the type of a let
-bound variable, you'll want to generalize it (so that it can be later instantiated in different ways). Without generalization, the above example that applies the function id
to arguments of different types will fail to type-check.
Instead of implementing separate type inference for binary operators, conditionals, and lists, we decided to represent them as built-in functions and reuse the type inference for variables and function application.
Fill in the types of built-in functions inside preludeTypes
(we have pre-filled the one for +
).
Remember that you are not allowed to use a0, a1, ...
as bound type variables,
since those are reserved for use as free type variables by the algorithm.
Modify your implementation of infer
to support recursive function definitions. Once you are done, all the provided tests should pass.
Hint
Recall how in Assignment 4, to support recursion, when evaluating a let
-bound expression, we had to extend the environment to include the variable that the expression itself was bound to. For example, to evaluate an expression like let f = \n -> if n == 0 then 1 else n * f (n - 1) in f 5
, the subexpression if n == 0 then 1 else n * f (n - 1)
needed to be evaluated in an extended environment that contained a binding for f
.
For type inference, you'll need to do something similar. In the ELet x e1 e2
case of infer
, when inferring the type of e1
, you'll need to make sure to do so in an extended type environment that has a binding for x
. (You're probably already extending the type environment with a binding for x
and using that extended type environment to infer the type of e2
, but to support recursive functions, you'll need to also use the extended type environment to infer the type of e1
, because x
may occur in e1
.)
As with the previous assignments, you can use the stack ghci
command to start the GHCi interpreter with your code loaded. We very strongly recommend using GHCi to develop and debug your code.
In GHCi, we recommend using the following provided functions to develop and debug your code as you work on the assignment:
The typeOfExpr
function takes an expression, represented as a Nano AST of type Expr
, and infers the type of the expression, passing the preludeTypes
type environment.
ghci> typeOfExpr (EBin Plus (EInt 1) (EInt 2))
Int
Since it would be annoying to always have to write out ASTs for Nano programs, there's also a typeOfString
helper function that takes a string, parses it into a Nano expression, and then infers the type of the expression.
For example, this will work once you correctly implement the EBool
case of infer
:
ghci> typeOfString "True"
Bool
Most Nano programs will exercise more parts of infer
. For example, adding two numbers requires the EApp
case of infer
to be implemented (since 1 + 2
is an EBin
expression, and the EBin
case, which is implemented for you, leverages the EApp
case).
Once you've implemented the EApp
case, this will work:
>>> typeOfString "1 + 2"
Int
Once you've implemented EApp
and ELet
, this will work:
>>> typeOfString "let x = 1 in x + 2"
Int
Note that backslashes (\
) are escape characters in Haskell strings, so you need to escape them with another backslash to write lambda expressions. For example, the lambda expression \x -> x + 1
should be written as "\\x -> x + 1"
.
Once you've implemented the EApp
and ELam
cases of infer
, this should work:
ghci> typeOfString "\\x -> x + 1"
(Int) -> Int
You can write your own Nano programs, save them in files, and call typeOfFile
to type-check them.
The typeOfFile
function takes a file path, reads the file's contents as a string, parses the string into a Nano AST, and type-checks the AST using infer
, passing the preludeTypes
type environment. This function is useful for testing your code with files.
>>> typeOfFile "tests/input/3atest1.hs"
Bool