Home

Functional programming using Caml Light

image

Contents

1. Reg Constant m PC Branch codei code2 code Stack r s gt state Reg lt r state Stack lt Address code s state PC lt if m 0 then code2 else code1 Reg r PC Push code Stack s gt state Stack lt r s state PC lt code Reg r1 PC Swap code Stack r2 s gt state Reg lt r2 state Stack 1 5 state PC lt code Reg r PC Clos codei code Stack s gt state Reg lt Closure Address codei r state PC lt code Reg _ PC Stack Address code s gt state Stack lt s state PC lt code Reg v PC Apply code Stack Closure Address codel1 Environment e s gt state Reg lt Environment state Stack lt Address code s state PC lt codel 118 CHAPTER 16 COMPILING ASL TO AN ABSTRACT MACHINE CODE Reg v PC Stack gt raise CAM_End v Reg _ PC Plus Minus Times Divide Equal code Stack _ _ gt raise CAMbug IllTyped Reg Environment e PC Nth n code Stack _ gt state Reg lt try nth ne with Failure _ gt raise CAMbug IllTyped state PC lt code _ gt raise CAMbug Wrong configuration step state gt unit fun We may notice that the empty code sequence denotes a possibly final return instruction We could argue that pattern matchi
2. 1 gt x hd a list gt a fun find succeed hd 0 1 21 3 41 int list 1 2 8 2 p 58 let rec map succeed f function 1 gt CO h t gt try f h map_succeed f t with gt map succeed f t 130 CHAPTER 17 ANSWERS TO EXERCISES map succeed a gt b gt a list gt b list fun map_succeed hd 1 2 31 4 5 6 int list 1 2 4 9 1 p 63 The first function copy that we define assumes that its arguments are respectively the input and the output channel They are assumed to be already opened let copy inch outch inch and outch are supposed to be opened channels try actual copying while true do output_char outch input_char inch done with End_of_file gt Normal termination raise End_of_file sys Sys error msg gt Abnormal termination prerr_endline msg raise Failure cp _ gt Unknow exception maybe interruption prerr_endline Unknown error while copying raise Failure cp copy in channel gt out channel gt unit fun The next function opens channels connected to its filename arguments and calls copy on these channels The advantage of dividing the code into two functions is that copy performs the actual work and can be reused in different applications while the role of cp is more administrative in the
3. let rec asl_typing_expr gamma let rec type rec function Const _ gt Number Var n gt let sigma try nth n gamma with Failure _ gt raise TypingBug Unbound in gen_instance sigma Cond e1 e2 e3 gt unify Number type_rec e1 let t2 type_rec e2 and t3 type_rec e3 in unify t2 t3 t3 App Abs x e2 as f 1 gt LET case let t1 type rec 1 in let sigma generalise type gamma t1 in asl typing expr sigma gamma e2 1 2 gt let u TypeVar new_vartype in unify type rec ei Arrow type rec e2 u u Abs x e gt let u TypeVar new_vartype in dk dt db db dt HH dk 110 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS let s Forall u in Arrow u asl_typing_expr s gamma e in type_rec asl typing expr asl type scheme list gt asl gt asl type fun 15 3 7 Typing trapping type clashes and printing ASL types Now we define some auxiliary functions in order to build a good looking type synthesizer First of all a printing routine for ASL type schemes is defined using a function tvar_name which computes a decent name for type variables let tvar_name n Computes a name a for type variables given an integer n representing the position of the type variable in the list of generic type variables let rec name of n let q r n 26 n mod 26 in let s make st
4. x1 11 x2 12 gt x1 x2 combine 11 12 _ gt raise Failure combine lists of different length combine a list b list gt a b list fun combine 1 2 3 int string list 1 a 2 b 3 c combine 1 2 3 a b Uncaught exception Failure combine lists of different length 5 2 p 36 let rec sublists function gt 0 x 1 gt let sl sublists 1 in sl map fun 1 gt x 1 s1 sublists a list gt a list list fun sublists a list list sublists 1 2 3 int list list 3 21 2 3 1 1 3 1 2 2 311 sublists a string list list a 6 1 p 46 type a b btree Leaf of b Btree of a b node and a b node Op a 127 Soni a b btree Son2 a b btree Type btree defined Type node defined let rec nodes_and_leaves function Leaf x gt x Btree Op x Soni si Son2 s2 gt let nodesi leavesi nodes and leaves s and nodes2 leaves2 nodes_and_leaves s2 in x nodesi nodes2 leavesi leaves2 nodes and leaves a b btree gt a list b list fun 4 nodes and leaves Btree Op Soni Leaf 1 Son2 Leaf 2 string list int list 1 2 6 2 p 46 let rec map btree f g function L
5. Py C may be read in two ways e If P and Ph then C e In order to prove C it is sufficient to prove P and Ph An inference rule may have no premise such a rule will be called an aziom A complete proof will be represented by a proof tree of the following form DI us yis gee where the leaves of the tree PF are instances of axioms In the premisses and the conclusions appear judgements having the form Thre o 102 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS Such a judgement should be read as under the typing environment I the expression e has type scheme Typing environments are sets of typing hypotheses of the form x where x is an identifier name and o is a type scheme typing environments give types to the variables occurring free i e unbound in the expression When typing A calculus terms the typing environment is managed as a stack because identifiers possess local scopes We represent that fact in the presentation of the type system by removing the typing hypothesis concerning an identifier name z if such a typing hypothesis exists before adding a new typing hypothesis concerning We write T x for the set ot typing hypotheses obtained from by removing the typing hypothesis concerning 2 if it exists Any numeric constant is of type Number NUM T Const n Number We obtain type schemes for variables from the typing environment
6. TAUT RULE ho It is possible to instantiate type schemes The GenInstance relation represents generic instanti ation Idee GenInstance c DLF Fe o It is possible to generalize type schemes with respect to variables that do not occur free in the set of hypotheses INST MIN ag FV T TF e Va GEN Typing a conditional Number Tbh e9 7T Tbr e3 7 if e then ez else fi T IF Typing an application Tre TOT Treg 7 APP e1 eo Typing an abstraction CPt Utes hese ABS 6 vr The special rule below is the one that introduces polymorphism this corresponds to the ML let construct Prete l I z U z o Fke r Db enr This type system has been proven to be semantically sound i e the semantic value of a well typed expression an expression admitting a type cannot be an error value due to a type error This is usually expressed as LET Well typed programs cannot go wrong This fact implies that a clever compiler may produce code without any dynamic type test for a well typed expression 15 2 THE ALGORITHM 103 Example Let us check using the set of rules above that the following is true 1 f Axx inf f 06 8 In order to do so we will use the equivalence between the let construct and an application of an immediate abstraction i e an expression having the following shape Av M N The LET
7. b lt 5b gt gt seq a b char stream gt char list fun Here operationally once an a has been recognized the first matching rule is chosen Any further mismatch either from seq a or from the last b will raise a Parse error exception and the whole parsing will fail On the other hand if the first character is not an a seq a will raise Parse failure and the second rule lt b gt gt will be tried 10 3 PARAMETERIZED PARSERS 69 This behavior is typical of predictive parsers Predictive parsing is recursive descent parsing with one look ahead token In other words a predictive parser is a set of possibly mutually recursive procedures which are selected according to the shape of at most the first token 10 2 2 Sequential binding in stream patterns Bindings in stream patterns occur sequentially in contrast with bindings in regular patterns which can be thought as occurring in parallel Stream matching is guaranteed to be performed from left to right For example computing the sum of the elements of an integer stream could be defined as let rec stream_sum n function lt 0 stream sum n p gt gt p amp lt stream sum n x p gt gt p lt gt gt n stream_sum int gt int stream gt int lt fun gt stream_sum 0 lt 70 1 2 3 4 gt int 10 The stream_sum function uses its first
8. gt 1 92 CHAPTER 12 ASL A SMALL LANGUAGE in function lt IDENT id DOT expr id rho e gt gt Abs id e lt atom rho ei rest 1 e2 gt gt 2 and atom rho function lt IDENT id gt gt try binding_depth id rho with Unbound s gt print_string Unbound ASL identifier print_string s print_newline raise Failure ASL parsing INT n gt gt Const n lt IF expr rho ei THEN expr rho e2 ELSE expr rho FI gt gt Cond e1 e2 e3 lt LPAR expr rho RPAR gt gt e top token stream gt top_asl lt fun gt expression token stream gt asl lt fun gt expr string list gt token stream gt asl lt fun gt atom string list gt token stream gt asl lt fun gt The complete parser that we will use reads a string converts it into a stream and produces the token stream that is parsed let parse_top s top lexer stream_of_string s parse_top string gt top_asl lt fun gt Let us try our grammar we do not augment the global environment at each declaration this will be performed after the semantic treatment of ASL programs We need to write double inside strings since is the string escape character parse_top let f be top asl Decl f Abs x Var 1 Sparse top let x be 1 x x 2 top asl Decl
9. 8 f is a type A type scheme is a type where some variables are distinguished as being generic We can represent type schemes by where is a type Example Vo a Number 8 and Number 8 are type schemes We will use c as metavariables representing type schemes We may also write type schemes as V r In this case represent a possibly empty set of generic type variables When the set of generic variables is empty we write or simply We will write FV for the set of unknowns occurring in the type scheme Unknowns are also called free variables they are not bound by a V quantifier We also write BV bound type variables of for the set of type variables occurring in which are not free i e the set of variables universally quantified Bound type variables are also said to be generic Example If c denotes the type scheme Va a Number 8 0 then we have By 8 BV a A substitution instance of a type scheme is the type scheme where S is a substitution of types for free type variables appearing in c When applying a substitution to a type scheme a renaming of some bound type variables may become necessary in order to avoid the capture of a free type variable by a quantifier Example e If c denotes V3 3 a a and a is VB B y
10. ASL Type of it is Number 40320 unit run parse top fib 9 ASL Type of it is Number 34 unit It is of course possible and desirable to introduce recursion by using a specific syntactic construct special instructions and a dedicated case to the compiling function See 25 for efficient compilation of recursion data structures etc Exercise 16 1 Interesting exercises for which we won t give solutions consist in enriching according to your taste the ASL language Also building a standalone ASL interpreter is a good exercise in modular programming Chapter 17 Answers to exercises We give in this chapter one possible solution for each exercise contained in this document Exercises are referred to by their number and the page where they have been proposed for example 2 1 p 15 refers to the first exercise in chapter 2 this exercise is located on page 15 3 1 p 19 The following anonymous functions have the required types function f gt f 2 1 int gt int gt int fun function m gt function n gt n m 1 int gt int gt int fun function f gt function m gt f m 1 2 int gt int gt int gt int fun 3 2 p 19 We must first rename y to z obtaining function x gt function z gt x z and finally function y gt function z gt ytz Without renaming we would have obtained function y gt
11. En For instance when applied to the curried addition function to the base element 0 and to a list of numbers it computes the sum of all elements in the list The expression it list prefix O 1 2 3 4 5 evaluates to 0 1 2 3 4 5 i e to 15 let rec it_list f a function gt a 1 gt it list f f a x 1 it list a gt b gt a gt a gt b list gt a fun let sigma it list prefix 0 Sigma int list gt int lt fun gt sigma 1 2 3 4 5 int 15 it_list prefix 1 1 2 3 4 5 int 120 The it list function is one of the many ways to iterate over a list For other list iteration functions see 9 Exercises Exercise 5 1 Define the combine function which when given a pair of lists returns a list of pairs such that combine x1 xn ly1 yn 1 1 xn yn The function generates a run time error if the argument lists do not have the same length Exercise 5 2 Define a function which when given a list returns the list of all its sublists Chapter 6 User defined types The user is allowed to define his her own data types With this facility there is no need to encode the data structures that must be manipulated by a program into lists as in Lisp or into arrays as in Fortran Furthermore early detection of type errors is enforced since user defined data types reflect precisely the needs of the algor
12. Funval fun Funval fun 13 2 Semantic functions The semantic function is the interpreter itself There is one for expressions and one for declarations The one for expressions computes the value of an ASL expression from an environment rho The environment will contain the values of globally defined ASL values or of temporary ASL values It is organized as a list and the numbers representing variable occurrences will be used as indices into the environment let rec nth n function gt raise Failure nth x l gt if n 1 then x else nth n 1 1 nth int gt a list gt fun let rec semant rho let rec sem function Const n gt Constval n Var n gt nth n rho Cond e1 e2 e3 gt match sem ei with Constval 0 gt sem Constval n gt sem e2 _ gt raise Illtyped Abs _ e gt Funval fun x gt semant x rho e dk dk 13 3 EXAMPLES 95 App ei e2 gt match sem el with Funval f gt f sem e2 _ gt raise Illtyped in sem 5 semant semval list gt asl gt semval fun The main function must be able to treat an ASL declaration evaluate it and update the global environments global_env and global_sem let semantics function Decl s e gt let result semant global_sem e in global_env s global_env global_sem result global_sem print_string ASL Value of print_string s print st
13. b fun It is even possible to catch all exceptions do something special close or remove opened files for example and raise again that exception to propagate it upwards let show exceptions f arg try f arg with x gt print string Exception raised n raise x show exceptions a gt b gt a gt b fun 58 CHAPTER 8 ESCAPING FROM COMPUTATIONS EXCEPTIONS In the example above we print a message to the standard output channel the terminal before raising again the trapped exception catch_all function x gt raise Failure foo 1 0 int 0 catch_all show_exceptions function x gt raise Failure foo 1 0 Exception raised int 0 8 4 Polymorphism and exceptions Exceptions must not be polymorphic for a reason similar to the one for references although it is a bit harder to give an example exception Exc of a list Toplevel input gt exception Exc of a list s AA The type variable a is unbound One reason is that the excn type is not a parameterized type but one deeper reason is that if the exception Exc is declared to be polymorphic then a function may raise Exc 1 2 There might be no mention of that fact in the type inferred for the function Then another function may trap that exception obtaining the value 1 2 whose real type is int list But the only type known by the typechecker is list the try form should refer to the Exc data c
14. get older person gt unit fun Notice that in the two previous expressions we did not specify all fields of the record p Other examples would be let move p new city p City lt new city and change job p j p Job lt j move person gt string gt unit fun change job person gt string gt unit fun change_job jean Teacher move jean Cannes 4get older jean jean person Name Jean Age 25 Job Teacher City Cannes We used the character between the different changes we imposed to jean This is the sequencing of evaluations it permits to evaluate successively several expressions discarding the result of each except the last one This construct becomes useful in the presence of side effects such as physical modifications and input output since we want to explicitly specify the order in which they are performed 7 2 The ref type The ref type is the predefined type of mutable indirection cells It is present in the Caml system for reasons of compatibility with earlier versions of Caml The ref type could be defined as follows we don t use the ref name in the following definition because we want to preserve the original ref type 7 3 ARRAYS 51 type a reference mutable Ref a Type reference defined Example of building a value of type ref let r ref 1 2 r int ref ref 3 The ref identifier is syntactically presented as a su
15. rplaca p 2 unit int bool lisp cons Car 2 Cdr true 7 2 p 54 let stamp_counter ref 0 stamp counter int ref ref O let stamp stamp_counter 1 stamp_counter stamp_counter stamp unit gt int fun stamp int 1 stamp int 2 7 3 p 54 let exchange t i j let v t i in vect assign t i t j vect assign t j v exchange a vect gt int gt int gt unit fun let quick sort t let rec quick lo hi if lo lt hi then begin 129 let i ref lo and j ref hi and p t hi in while i lt j do while i lt hi amp t i lt p do incr i done while j gt lo amp p lt t j do decr j done if i lt j then exchange t i j done exchange t hi i quick lo i 1 quick i 1 hi end else in quick 0 vect_length t 1 8 quick sort float vect gt unit fun let 2 0 1 5 4 0 0 0 10 0 1 0 a float vect 2 0 1 5 4 0 0 0 10 0 1 01 quick_sort a unit float vect 10 0 1 0 1 5 2 0 4 0 10 01 8 1 p 58 let rec find succeed f function gt raise Failure find succeed x 1 gt try f x x with gt find succeed f 1 find succeed a gt b gt a list gt a fun let hd function gt raise Failure empty
16. 30 An introduction to programming in lazy functional languages can be found in 5 135 136 CHAPTER 18 CONCLUSIONS AND FURTHER READING Bibliography 10 11 12 13 Abelson G J Sussman Structure and Interpretation of Computer Programs The MIT Press 1985 A Aho R Sethi and J Ullman Compilers Principles Techniques and Tools Addison Wesley 1986 J Backus Can programming be liberated from the Von Neumann style A functional style and its algebra of programs In Communications of the ACM volume 21 pages 133 140 1978 H P Barendregt The Lambda Calculus Its Syntax and Semantics volume 103 of Studies in Logic and the Foundations of Mathematics North Holland 1984 R Bird and P Wadler Introduction to Functional Programming Series in Computer Science Prentice Hall International 1986 A Church The Calculi of Lambda Conversion Princeton University Press 1941 D Cl ment J Despeyroux T Despeyroux and G Kahn A simple applicative language Mini ML In Proceedings of the ACM International Conference on Lisp and Functional Pro gramming pages 13 27 1986 G Cousineau P L Curien and M Mauny The categorical abstract machine In Func tional Programming Languages and Computer Architecture number 201 in Lecture Notes in Computer Science pages 50 64 Springer Verlag 1985 G Cousineau and G Huet The CAML primer Technical Report 122 INRIA 1990 N De Bruij
17. corresponding to the Z fixpoint combinator semantics parse top let fix be x y A x f y x x ASL Value of fix is lt fun gt unit We are now able to define the ASL factorial function semantics parse_top let fact be fix f n if n 0 then 1 else n f n 1 fi ASL Value of fact is fun unit semantics parse top fact 8 ASL Value of it is 40320 unit and the ASL Fibonacci function semantics parse top let fib be fix if n 1 then 1 else if n 2 then 1 else f n 1 n 2 fi 1 ASL Value of fib is fun unit semantics parse top fib 9 ASL Value of it is 34 unit 14 2 Recursion as a primitive construct Of course in a more realistic prototype we would extend the concrete and abstract syntaxes of ASL in order to support recursion as a primitive construct We do not do it here because we want to keep ASL simple This is an interesting non trivial exercise Chapter 15 Static typing polymorphism and type synthesis We now want to perform static typechecking of ASL programs that is to complete typechecking before evaluation making run time type tests unnecessary during evaluation of ASL programs Furthermore we want to have polymorphism i e allow the identity function for example to be applicable to any kind of data Type synthesis
18. in channel lt abstr gt std out out channel lt abstr gt std err out channel lt abstr gt They are the standard IO channels std in is usually connected to the keyboard and printing onto std out and std err usually appears on the screen 9 1 Printable types It is not possible to print and read every value Functions for example are typically not readable unless a suitable string representation is designed and reading such a representation is followed by an interpretation computing the desired function We call printable type a type for which there are input output primitives implemented in Caml Light The main printable types are e characters type char e strings type string e integers type int e floating point numbers type float 59 60 CHAPTER 9 BASIC INPUT OUTPUT We know all these types from the previous chapters Strings and characters support a notation for escaping to ASCII codes or to denote special characters such as newline char A 4t N065 char A char 6 n char n string with na newline inside string string with na newline inside The character is used as an escape and is useful for non printable or special characters Of course character constants can be used as constant patterns function a gt 0 gt 15 char gt int fun On ty
19. la Lisp for their application We will write 1 2 3 instead of 1 2 3 The if then eg else construct will be written if ei then else fi and will return the then part when e is different from 0 0 acts thus as falsity in ASL conditionals They have been proposed by N G De Bruijn in 10 in order to facilitate the mechanical treatment of A calculus terms 12 2 PARSING ASL PROGRAMS 89 12 2 1 Lexical analysis The concrete aspect of ASL programs will be either declarations of the form let identifier be expression or expression which will be understood as let it be expression The tokens produced by the lexical analyzer will represent the keywords let be if and else the binder the dot parentheses integers identifiers arithmetic operations and terminating semicolons We reuse here most of the code that we developed in chapter 10 or in the answers to its exercises Skipping blank spaces let rec spaces function lt t l n spaces _ gt gt O lt gt gt spaces char stream gt unit fun The type of tokens is given by type token LET BE LAMBDA DOT LPAR RPAR IF THEN ELSE FI SEMIC INT of int IDENT of string Type token defined Integers let int_of_digit function 0 9 as c gt int of char c int of char 0 4 gt raise Failure not a digit int
20. since pattern matching may fail 6 2 6 Degenerate cases when sums meet products What is the status of a sum type with a single case such as type counter1 Counter of int Type counter1 defined 6 3 SUMMARY 45 Of course the type counter1 is isomorphic to int The injection function x gt Counter xisa total function from int to counteri It is thus a bijection Another way to define a type isomorphic to int would be type counter2 Counter int Type counter2 defined The types counter1 and counter2 are isomorphic to int They are at the same time sum and product types Their pattern matching does not perform any run time test The possibility of defining arbitrary complex data types permits the easy manipulation of ab stract syntax trees in Caml such as the arith_expr type above These abstract syntax trees are supposed to represent programs of a language e g a language of arithmetic expressions These kind of languages which are defined in Caml are called object languages and Caml is said to be their metalanguage 6 3 Summary e New types can be introduced in Caml e Types may be parameterized by type variables The syntax of type parameters is lt params gt lt tvar gt C lt tvar gt lt tvar gt Types can be recursive Product types Mathematical product of several types The construct is type lt params gt lt tname gt lt Field gt lt type gt where the
21. 2 1 Give two versions of factorial recursive and tail recursive Exercise 4 4 Define the fibonacci function that when given a number n returns the nth Fi bonacci number i e the nth term un of the sequence defined by ui 1 1 Un 2 Un 1 Un 32 CHAPTER 4 BASIC TYPES Exercise 4 5 What are the types of the following expressions e uncurry compose e compose curry uncurry e compose uncurry curry Chapter 5 Lists Lists represent an important data structure mainly because of their success in the Lisp language Lists in ML are homogeneous a list cannot contain elements of different types This may be annoying to new ML users yet lists are not as fundamental as in Lisp since ML provides a facility for introducing new types allowing the user to define more precisely the data structures needed by the program cf chapter 6 5 1 Building lists Lists are empty or non empty sequences of elements They are built with two value constructors e the empty list read nil e the non empty list constructor read cons It takes an element e and a list J and builds a new list whose first element head is e and rest tail is l The special syntax ei1 n builds the list whose elements are It is equivalent to e2 n We may build lists of numbers 481 2 15 int list 1 2 3 4 5 int list 3 4 5 let x 2 in 1 2 x 1 x 2 int l
22. CHAPTER 7 MUTABLE DATA STRUCTURES 7 5 Polymorphism and mutable data structures There are some restrictions concerning polymorphism and mutable data structures One cannot enclose polymorphic objects inside mutable data structures let r ref a list ref ref The reason is that once the type of r a list ref has been computed it cannot be changed But the value of r can be changed we could write r 7 1 2 now r would be a reference on a list of numbers while its type would still be a list ref allowing us to write r true r making r a reference on true 1 2 which is an illegal Caml object Thus the Caml typechecker imposes that modifiable data structures appearing at toplevel must possess monomorphic types i e not polymorphic Exercises Exercise 7 1 Give a mutable data type defining the Lisp type of lists and define the four functions car cdr rplaca and rplacd Exercise 7 2 Define a stamp function of type unit gt int that returns a fresh integer each time it is called That is the first call returns 1 the second call returns 2 and so on Exercise 7 3 Define a quick sort function om arrays of floating point numbers following the quicksort algorithm 13 Information about the quicksort algorithm can be found in 33 for example Chapter 8 Escaping from computations exceptions In some situations it is necessary to abort computations If we are trying to compute th
23. Ens 103 15 3 The ASL type synthesizer 105 16 Compiling ASL to an abstract machine code 115 16 1 The Abstract Machine 2 sns 115 16 2 Compiling ASL programs into CAM code 118 16 3 Execution of CAM code 120 17 Answers to exercises 123 18 Conclusions and further reading 135 CONTENTS Chapter 1 Introduction This document is a tutorial introduction to functional programming and more precisely to the usage of Caml Light It has been used to teach Caml Light in different universities and is intended for beginners It contains numerous examples and exercises and absolute beginners should read it while sitting in front of a Caml Light toplevel loop testing examples and variations by themselves After generalities about functional programming some features specific to Caml Light are described ML type synthesis and a simple execution model are presented in a complete example of prototyping a subset of ML Part I chapters 2 6 may be skipped by users familiar with ML Users with experience in functional programming but unfamiliar with the ML dialects may skip the very first chapters and start at chapter 6 learning the Caml Light syntax from the examples Part I starts with some intuition about functions and types and gives an overview of ML and other functional languages chapter 2 Chapter 3 outlines the interaction with the Caml Light topl
24. Equal Environment let global CAM env ref init CAM env global CAM env object list ref ref Closure Address Clos Push Nth 2 Swap Nth 1 Plus Environment Closure Address Clos Push Nth 2 Swap Nth 1 Minus Environment Closure Address Clos Push Nth 2 Swap Nth 1 Times Environment Closure Address Clos Push Nth 2 Swap Nth 1 Divide Environment Closure Address Clos Push Nth 2 Swap Nth 1 Equal Environment As an example here is the code for some ASL expressions code of expression lexer stream of string 1 instruction list Quote 1 120 CHAPTER 16 COMPILING ASL TO AN ABSTRACT MACHINE CODE code of expression lexer stream of string 1 2 instruction list Push Push Nth 6 Swap Quote 1 Apply Swap Quote 2 Apply code_of expression lexer stream of string x x 0 instruction list Push Clos Nth 1 Swap Push Clos Nth 1 Swap Quote 0 Apply Apply code_of expression lexer stream of string 1 if O then 2 else 3 fi instruction list Push Push Nth 6 Swap Quote 1 Apply Swap Push Quote 0 Branch Quote 2 Quote 3 Apply 16 3 Execution of CAM code The main function for executing compiled ASL manages the global environment until execution has succeeded 1 run Decl s e TYPING reset vartypesO let tau try asl typing expr global
25. The first one until is an example of a recursive function it calls itself in its body and is defined using a let rec construct rec shows that the definition is recursive It takes three arguments a predicate p on floats a function change from floats to floats and a float x If p x is false then until is called with last argument change x otherwise x is the result of the whole call We will study recursive functions more closely later T he two other auxiliary functions satisfied and improve take a float as argument and deliver respectively a boolean value and a float The function satisfied asks wether the image of its argument by f is smaller than epsilon or not thus testing wether y is close enough to a root of f The function improve computes the next approximation using the formula given below The three local functions are defined using a cascade of let constructs The whole function is let newton f epsilon let rec until p change x if p x then x else until p change change x in let satisfied y abs f y lt epsilon in let improve y y f y deriv f y epsilon in until satisfied improve newton float gt float gt float gt float gt float fun Some examples of equation solving let square_root x epsilon newton function y gt y y x epsilon x 24 CHAPTER 4 BASIC TYPES and cubic root x epsilon newton function y gt y y y x epsilon x squ
26. and parsers Chapter 13 presents an untyped call by value semantics of ASL programs through the definition of an ASL interpreter The encoding of recursion in untyped ASL is presented in chapter 14 showing the The Caml Strong version of these notes is available as an INRIA technical report 26 6 CHAPTER 1 INTRODUCTION expressive power of the language The type synthesis of functional programs is demonstrated in chapter 15 using destructive unification on first order terms representing types as a central tool Chapter 16 introduces the Categorical Abstract Machine a simple execution model for call by value functional programs Although the Caml Light execution model is different from the one presented here an intuition about the simple compilation of functional languages can be found in this chapter Warning The programs and remarks especially contained in parts B and C might not be valid in Caml Light versions different from 0 7 Part I Functional programming Chapter 2 Functional languages Programming languages are said to be functional when the basic way of structuring programs is the notion of function and their essential control structure is function application For example the Lisp language 22 and more precisely its modern successor Scheme 31 1 has been called functional because it possesses these two properties However we want the programming notion of function to be as close as possible to the us
27. defined as exception Unbound of string Exception Unbound defined We also need a function which will compute the binding depths of variables That function simply looks for the position of the first occurrence of a variable name in a list It will raise Unbound if there is no such occurrence let binding depth s rho let rec bind n function gt raise Unbound s t 1 gt if s t then Var n else bind n 1 1 in bind 1 rho 55 binding depth string gt string list gt asl fun We also need a global environment containing names of already bound identifiers The global environment contains predefined names for the equality and arithmetic functions We represent the global environment as a reference since each ASL declaration will augment it with a new name let init_env M gus Ws meu sar s onse init env string list ams nn let global env ref init env global env string list ref ref m We now give a parsing function for ASL programs Blanks at the beginning of the string are skipped let rec top function lt LET IDENT id BE expression e SEMIC gt gt Decl id e expression e SEMIC gt gt Decl it e and expression function lt expr global env e gt gt and expr rho let rec rest 1 function lt atom rho 2 rest App ei e2 e gt gt e lt gt
28. function y gt yty which does not denote the same function 123 124 CHAPTER 17 ANSWERS TO EXERCISES 3 3 p 19 We write successively the reduction steps for each expressions and then we use Caml in order to check the result e let x 1 2 in function y gt y x x function y gt 1 2 1 2 function y gt y 1 2 3 3 1 2 3 3 6 Caml says let 1 2 in function y gt x int 6 e let 1 2 in function x gt x x x function x gt x x 142 3 3 6 Caml says let x 142 in function x gt x x x int 6 e let f1 function f2 gt function x gt f2 x in let g function x gt x 1 in f1 g 2 let g function x gt x 1 in function f2 gt function x gt f2 x g 2 function f2 gt function x gt f2 x function x gt x 1 2 function x gt function x gt x 1 x 2 function x gt x 1 2 2 1 3 Caml says let f1 function f2 gt function x gt f2 x in let g function x gt x 1 f1 g 2 int 3 4 1 p 31 To compute the surface area of a rectangle and the volume of a sphere 125 let surface rect len wid len wid surface rect int gt int gt int fun let pi 4 0 atan 1 0 pi float 3 14159265359 let volume sphere r 4 0 3 0 pi power 3 volume sphere float gt float fun 4 2 p 31 In a call by value la
29. gt gt lt gt gt n integer int gt char stream gt int lt fun gt The lexical analyzer let rec lexer s match s with 133 lt spaces _ gt gt lt LPAR lexer s gt lt spaces gt gt lt RPAR lexer gt lt spaces _ gt gt lt PLUS lexer s gt lt 2 4 spaces gt gt lt MINUS lexer gt lt spaces _ gt gt lt TIMES lexer s gt lt spaces _ gt gt lt DIV lexer s gt lt 0 9 as c integer int of digit c n spaces _ gt gt lt n lexer s gt lexer char stream gt token stream fun The parser has the same shape as the grammar let rec expr function lt INT n gt gt n lt PLUS expr expr e2 gt gt 1 2 lt gt MINUS expr expr 2 gt gt ei e2 lt expr 1 expr e2 gt gt ei e2 lt DIV expr expr 2 gt gt e1 e2 expr token stream gt int lt fun gt expr lexer stream_of_string 1 Into expr lexer stream of string 1 2 4 int 9 10 2 p 77 The only new function that we need is a function taking as argument a character stream and returning the first identifier of that stream It could be written as 4 let ident buf make string 8 ident buf string 5 let r
30. gt y gt y 7 then o is a substitution instance of because o S c where S a lt y gt y i e S substitutes the type y y for the variable a e If c denotes V8 8 a a and o is 0 8 B 8 B then o is a substitution instance of because o S c where S a lt 8 8 In this case the renaming of 8 into 6 was necessary we did not want the variable introduced by S to be captured by the universal quantification V 1 metavariable should not be confused with a variable or type variable 15 1 THE TYPE SYSTEM 101 The type scheme o 8 is said to be a generic instance of if there exists a substitution S such that e the domain of S is included in o Qn e no 3 occurs free in In other words a generic instance of a type scheme is obtained by giving more precise values to some generic variables and possibly quantifying some of the new type variables introduced Example Ifo V8 B a then o Vy y y a a is a generic instance of We changed 8 into y y and we universally quantified on the newly introduced type variable y We express this type system by means of inference rules An inference rule is written as a fraction e the numerator is called the premisses e the denominator is called the conclusion An inference rule
31. int gt int gt int fun We can then define the successor function by let successor addition 1 successor int gt int fun Now we may use our successor function successor successor 1 int 3 Exercises Exercise 3 1 Give examples of functions with the following types 1 int gt int gt int 2 int gt int gt int 3 int gt int gt int gt int Exercise 3 2 We have seen that the names of formal parameters are meaningless It is then possible to rename x by y in function x gt x x What should we do in order to rename x in y in function x gt function y gt xty Hint rename y by z first Question why Exercise 3 3 Evaluate the following expressions rewrite them until no longer possible 20 CHAPTER 3 BASIC CONCEPTS let x 1 2 in function y gt y x x let x 1 2 in function x gt x x x let 21 function f2 gt function x gt f2 x in let function gt 1 in f1 g 2 Chapter 4 Basic types We examine in this chapter the Caml basic types 4 1 Numbers Caml Light provides two numeric types integers type int and floating point numbers type float Integers are limited to the range 2 9 2 1 Integer arithmetic is taken modulo 2 that is an integer operation that overflows does not raise an error but the result simply wraps around Predefined operations functions on integers include addition subtract
32. intermediate results and is delivered at the end of parsing the initial sequence w After checking for the presence of C it is used to parse the second sequence w 10 4 Further reading A summary of the constructs over streams and of primitives over streams is given in 21 An alternative to parsing with streams and stream matching are the camllex and camlyacc programs A detailed presentation of streams and stream matching following predictive parsing semantics can be found in 24 where alternative semantics are given with some possible implementations Exercises Exercise 10 1 Define a parser for the language of prefix arithmetic expressions generated by the grammar 10 4 FURTHER READING TT INT Expr Expr Expr Expr Expr Expr Expr Expr Expr Use the lexical analyzer for arithmetic expressions given above The result of the parser must be the integer resulting from the evaluation of the arithmetic expression i e its type must be token gt int Exercise 10 2 Enrich the type token above with a constructor IDENT of string for identifiers and enrich the lexical analyzer for it to recognize identifiers built from alphabetic letters upper or lowercase Length of identifiers may be limited 78 CHAPTER 10 STREAMS AND PARSERS Chapter 11 Standalone programs and separate compilation So far we have used Caml Light in an interactive way lt is also possible to program in Caml Light in a ba
33. of type A a unique member of type B If x denotes an element of A then we will write f x for the application of f to x Parentheses are often useless they are used only for grouping subexpressions so we could also write f x as well as f x or simply f x The value of f x is the unique element of B associated with x by the rule of correspondence for f The notation f x is the one normally employed in mathematics to denote functional appli cation However we shall be careful not to confuse a function with its application We say the function f with formal parameter x meaning that f has been defined by f e or in Caml that the body of f is something like function x gt Functions are values as other values In particular functions may be passed as arguments to other functions and or returned as result For example we could write function f gt function x gt f x 1 1 int gt int gt int gt int fun That function takes as parameter a function let us call it f and returns another function which when given an argument let us call it x will return the predecessor of the value of the application f x 1 The type of that function should be read as int gt int gt int gt int 3 5 Definitions It is important to give names to values We have already seen some named values we called them formal parameters In the expression function x gt x 1 the name x is a formal
34. on types with data constructors For example 30 CHAPTER 4 BASIC TYPES let is zero function 0 gt true gt false is zero int gt bool fun let is yes function oui gt true si gt true ya gt true yes gt true _ gt false is_yes string gt bool lt fun gt 4 6 Functions The type constructor gt is predefined and cannot be defined in ML s type system We shall explore in this section some further aspects of functions and functional types 4 6 1 Functional composition Functional composition is easily definable in Caml It is of course a polymorphic function let compose f g function x gt f g x compose a gt gt c gt gt c gt b fun The type of compose contains no more constraints than the ones appearing in the definition in a sense it is the most general type compatible with these constraints These constraints are e the codomain of g and the domain of f must be the same e x must belong to the domain of g e compose f g x will belong to the codomain of f Let s see compose at work let succ x x 1 succ int gt int fun compose succ list length _a list gt int fun compose succ list length 1 2 3 int 4 4 6 2 Currying We can define two versions of the addition function 4 6 FUNCTIONS 3l let plus function x y gt plus in
35. parameter Its name is irrelevant changing it into another one does not change the meaning of the expression We could have written that function function y gt 1 If now we apply this function to say 1 2 we will evaluate the expression y 1 where y is the value of 1 2 Naming y the value of 1 2 in y 1 is written as let y 1 2 in 1 int 4 This expression is a legal Caml phrase and the let construct is indeed widely used in Caml programs The let construct introduces local bindings of values to identifiers They are local because the scope of y is restricted to the expression y 1 The identifier y kept its previous binding if any 18 CHAPTER 3 BASIC CONCEPTS after the evaluation of the let in construct We can check that y has not been globally defined by trying to evaluate it Toplevel input 2y s The value identifier y is unbound Local bindings using let also introduce sharing of possibly time consuming evaluations When evaluating let in gets evaluated only once All occurrences of x in access the value of which has been computed once For example the computation of let big sum of prime factors 35461243 in big 2 big 4 big 1 will be less expensive than sum of prime factors 35461243 2 sum of prime factors 35461243 4 sum of prime factors 35461243 1 The let construct also has a global form for to
36. parsers let rec wu function p pl gt function lt p x wu pl 1 gt gt x 1 amp gt function lt gt gt wu a stream gt string list gt a stream gt string lt fun gt The wu function builds from a list of parsers pi for i 1 n a single parser 76 CHAPTER 10 STREAMS AND PARSERS function lt 21 n gt gt m1 n which is the sequencing of parsers p The main parser w is let w function lt wd 1 C wu 1 r gt gt r w token stream gt string lt fun gt w 7A B C A B gt string abb w lt gt t string In the previous parser we used an intermediate list of parsers in order to build the second parser We can redefine wd without using such a list let w let rec wd wr function lt A wd function lt wr r gt gt p gt gt p lt B wd function lt wr r B gt gt r b p gt gt lt gt gt wr in function lt wd function lt gt gt p C p str gt gt str w token stream gt string lt fun gt w 7A B C A B gt string abb w lt gt f string Here wd is made local to w and takes as parameter wr for word recognizer whose initial value is the parser with an empty stream pattern This parameter accumulates
37. set of inference rules represents an algorithm because there is exactly one conclusion for each syntactic ASL construct giving priority to the LET rule over the regular application rule This set of rules may be read as a Prolog program This algorithm has been proven to be e syntactically sound when the algorithm succeeds on an expression e and returns a type then e T e complete if an expression e possesses a type then the algorithm will find a type T such that 7 is an instance of 7 The returned type 7 is thus the most general type of e Example We compute the type that we simply checked in our last example What is drawn below is the result of the type synthesis in fact we run our algorithm with type variables representing unknowns modifying the previous applications of the INST rule when necessary i e when encountering an equality constraint This is valid since it can be proved that the correction of the whole deduction tree is preserved by substitution of types for type variables In a real implementation of the algorithm the data structures representing types will be submitted to a unification mechanism INST NST x a F x a T E 8 8 8 8 r f 6 8 ABS APP Qr T f Vaa a Fff 6 86 LET Ot let f x x inf 8 8 Once again this expression is not typable without the use of the LET rule an error would occur because of the type equality co
38. shorten t match t with TypeVar Index _ Value Unknown gt t TypeVar Index _ Value TypeVar Index _ Value Unknown as tv gt tv TypeVar Index _ Value TypeVar tvi as tv2 tv2 Value lt tvi Value shorten t TypeVar Index _ Value t gt t Unknown gt raise TypingBug shorten t gt t shorten asl_type gt asl_type lt fun gt An ASL type error will be represented by the following exception exception TypeClash of asl type asl type Exception TypeClash defined We will need unification on ASL types with occur check The following function implements occur check let occurs Index n Value _ let rec occrec function TypeVar Index m Value _ gt n m Number gt false Arrow t1 t2 occrec t1 or occrec t2 Unknown gt raise TypingBug occurs in occrec 5 occurs vartype gt asl type gt bool fun dk dt 15 8 THE ASL TYPE SYNTHESIZER 107 The unification function implements destructive unification Instead of returning the most general unifier it returns the unificand of two types their most general common instance The two arguments are physically modified in order to represent the same type The unification function will detect type clashes let rec unify taui tau2 match shorten taui shorten tau2 with type variable n and type variable m TypeVar Index n Value
39. spaces gt gt lt lexer gt lt spaces _ gt gt lt DIV lexer s gt lt 0 9 as c integer int of digit c n spaces _ gt gt lt INT n lexer s gt lexer char stream gt token stream fun We assume there is no leading space in the input Now let us examine the language that we want to recognize We shall have integers infix arithmetic operations and parenthesized expressions The BNF form of the grammar is Expr Expr Expr Expr Expr Expr Expr Expr Expr Expr INT 10 3 PARAMETERIZED PARSERS 71 The values computed by the parser will be abstract syntax trees by contrast with concrete syntaz which is the input string or stream Such trees belong to the following type type atree Int of int Plus of atree atree Minus of atree atree Mult of atree atree Div of atree atree Type atree defined The Expr grammar is ambiguous To make it unambiguous we will adopt the usual precedences for arithmetic operators and assume that all operators associate to the left Now to use stream matching for parsing we must take into account the fact that matching rules are chosen according to the behavior of the first component of each matching rule This is predictive parsing and using well known techniques it is easy to rewrite the grammar above in such a way that writing the corresponding predictive parser bec
40. stream of characters and returning a stream of tokens Stream concatenation does not copy substreams they are simply put in the same stream Since as we will see later stream matching has a destructive effect on streams streams are physically eaten by stream matching parsing lt t t gt will in fact parse t only once the first occurrence of t will be consumed and the second occurrence will be empty before its parsing will be performed Interfacing streams with an input channel can be done with the function stream_of_channel in channel gt char stream fun returning a stream of characters which are read from the channel argument The end of stream will coincide with the end of the file associated to the channel In the same way one can build the character stream associated to a character string using stream_of_string string gt char stream lt fun gt let s stream of string abc S char stream lt abstr gt 10 1 2 Streams are lazily evaluated Stream expressions are submitted to lazy evaluation i e they are effectively build only when required This is useful in that it allows for the easy manipulation of interactive streams like the stream built from the standard input If this was not the case i e if streams were immediately completely computed a program evaluating stream_of_channel std_in would read everything up to an end of file on standard input before giving cont
41. typing env e with TypeClash t1 t2 gt let vars vars_of_type t1 vars of type t2 in print string ASL Type clash between print type scheme Forall vars t1 print string and print type scheme Forall vars t2 raise Failure ASL typing Unbound s gt raise TypingBug Unbound s in let sigma generalise type global typing env tau in PRINTING TYPE INFORMATION print string ASL Type of print string s print string is print type scheme sigma print newline COMPILING let code code_of e in let state Reg Environment global_CAM_env PC code Stack in EXECUTING let result try while true do step state done state Reg with CAM End v gt v in UPDATING ENVIRONMENTS global env s global env global typing env sigma global typing env n 16 3 EXECUTION OF CAM CODE global CAM env result global CAM env PRINTING RESULT match result with Constant n print int n Closure _ _ gt print string lt funval gt _ gt raise CAMbug Wrong state configuration print newline run top asl gt unit fun dk dk db Now let us run some examples Reinitializing environments global env init env 4global typing env init typing env global CAM env init CAM env 033 unit Q run parse_top 1 ASL Type of
42. x App App Var 1 Const 1 App Abs x Var 1 Const 2 Unbound identifiers and undefined operators are correctly detected parse_top let y be g 3 Unbound ASL identifier g Uncaught exception Failure ASL parsing parse_top f if 0 then else fi 2 3 Unbound ASL identifier f Uncaught exception Failure ASL parsing parse top x Illegal character Uncaught exception Failure ASL parsing Chapter 13 Untyped semantics of ASL programs In this section we give a semantic treatment of ASL programs We will use dynamic typechecking i e we will test the type correctness of programs during their interpretation 13 1 Semantic values We need a type for ASL semantic values representing results of computations A semantic value will be either an integer or a Caml functional value from ASL values to ASL values type semval Constval of int Funval of semval gt semval Type semval defined We now define two exceptions The first one will be used when we encounter an ill typed program and will represent run time type errors The other one is helpful for debugging it will be raised when our interpreter semantic function goes into an illegal situation The following two exceptions will be raised in case of run time ASL type error and in case of bug of our semantic treatment exception Illtyped Exception Illtyped defined exception SemantBug of string Exception SemantBug def
43. 1 gt gt x 1 lt gt gt star a stream gt b gt a stream gt b list fun The star parser iterates zero or more times its argument p and returns the list of results We still have to be careful in using these general parsers because of the predictive nature of parsing For example star p will never successfully terminate if p has a rule for the empty stream pattern this rule will make the second rule of star useless 10 3 4 Example parsing a non context free language As an example of parsing with parameterized parsers we shall build a parser for the language wCw w A B which is known to be non context free First let us define a type for this alphabet type token B C Type token defined Given an input of the form wCw the idea for a parser recognizing it is e first to recognize the sequence w with a parser wd for word definition returning information in order to build a parser recognizing only w e then to recognize C e and to use the parser built at the first step to recognize the sequence The definition of wd is as follows let rec wd function lt A wd 1 gt gt function lt A gt gt 1 lt wd 1 gt gt function lt B gt gt b 1 lt gt gt wd token stream gt token stream gt string list lt fun gt The wu function for word usage builds a parser sequencing a list of
44. 23 It is also possible to access the value of a single field with the dot operator jean Age int 23 jean Job string Student Labels always refer to the most recent type definition when two record types possess some common labels then these labels always refer to the most recently defined type When using modules see section 11 2 this problem arises for types defined in the same module For types defined in different modules the full name of labels i e with the name of the modules prepended disambiguates such situations 6 2 SUM TYPES 39 6 1 3 Parameterized product types It is important to be able to define parameterized types in order to define generic data structures The list type is parameterized and this is the reason why we may build lists of any kind of values If we want to define the cartesian product as a Caml type we need type parameters because we want to be able to build cartesian product of any pair of types type a b pair Fst a Snd b Type pair defined let first x x Fst and second x x Snd first Ca b pair gt a fun second a b pair gt b fun let p Snd true Fst 1 2 p nt bool pair Fst 3 Snd true first p int 3 Warning the pair type is similar to the Caml cartesian product but it is a different type fst p Toplevel input gt fst p gt P This expression has type int b
45. Functional programming using Caml Light Michel Mauny January 1995 Contents 1 Introduction 5 I Functional programming 7 2 Functional languages 9 2 1 History of functional languages 10 2 2 Whe uu utt ei ux S Redon Mee bx EIER Ba ot UNUS 10 2 3 Lhe Miranda family Eus m BRE HEURE d eae uds ds 11 3 Basic concepts 13 8 1 oplevello p 42x wd be ie RO SUM REIR ee ede 13 3 2 Evaluation from expressions to 14 9 9 LV POS e drm Gi Gop eae biel AR ears Eier qe be ed ie ol roten co ESTEE HENRI 15 39 4 JFuncblOnS 5 2 ue dels SNC bu pedo de REOR UE tcs Sunde ee Y pda i ie d 17 2 5 Definitions v ede UR em hk i aa a ck det ed us 17 3 0 Partial applications c ucsske BR a ee a RU RAUS 19 4 Basic types 21 N mbersc Ges amp quee ES OR uat en a Bodh d a ee RN Weg ee Ba 21 4 2 Boolean v lues ARRA LOS RU e d 24 4 9 Strings and charaebers ono Mer oue Dew ee S 26 AAS DIuples s acte mex SUO Cae x EIE ER e xU ed d e cde ee 27 4 5 Patterns and pattern matching 28 4 6 E nchions sso ALD Sake ie tb Oe eA Ru dese ie eo oS 30 5 Lists 33 Ball os Building li
46. PTER 11 STANDALONE PROGRAMS AND SEPARATE COMPILATION The input char function reads the next character from an input channel here std in the channel connected to standard input It raises exception End of file when reaching the end of the file The exit function aborts the process Its argument is the exit status of the process Calling exit is absolutely necessary to ensure proper flushing of the output channels Assume this program is in file count ml To compile it simply run the camlc command from the command interpreter camlc o count count ml The compiler produces an executable file count You can now run count with the help of the camlrun command camlrun count lt count ml This should display something like 306 characters 13 lines Under Unix the count file can actually be executed directly just like any other Unix command as in count lt count ml This also works under MS DOS provided the executable file is given extension exe That is if we compile count ml as follows camlc o count exe count ml we can run count exe directly as in count exe lt count ml See the reference manual for more information on camlc 11 2 Programs in several files It is possible to split one program into several source files separately compiled This way local changes do not imply a full recompilation of the program Let us illustrate that on the previous example We split it in two modules one that implements in
47. Princeton Univer sity and David MacQueen Bell Laboratories Caml Light is a smaller more portable implementation of the core Caml language developed by Xavier Leroy since 1990 2 3 The Miranda family All languages in this family use lazy evaluation i e the argument of a function is evaluated if and when the function needs its value arguments are passed unevaluated to functions They also use Milner s type system Languages belonging to the Miranda family find their origin in the SASL language 34 1976 developed by D Turner SASL and its successors KRC 35 1981 Miranda 36 1985 and Haskell 15 1990 use sets of mutually recursive equations as programs These equations are written in a script collection of declarations and the user may evaluate expressions using values defined in this script LML Lazy ML has been developed in G teborg Sweeden its syntax is close to ML s syntax and it uses a fast execution model the G machine 16 Laboratoire d Informatique de l Ecole Normale Sup rieure 45 Rue d Ulm 75005 Paris 12 CHAPTER 2 FUNCTIONAL LANGUAGES Chapter 3 Basic concepts We examine in this chapter some fundamental concepts which we will use and study in the following chapters Some of them are specific to the interface with the Caml language toplevel global definitions while others are applicable to any functional language 3 1 Toplevel loop The first contact with Caml is through its interactiv
48. Type of it is Number unit typing parse_top let fib be fix f n if n 1 then 1 else if n 2 then 1 else f n 1 n 2 fi ie Pea oe ASL Type of fib is Number gt Number unit typing parse top fib 9 ASL Type of it is Number unit Q 114 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS Chapter 16 Compiling ASL to an abstract machine code In order to fully take advantage of the static typing of ASL programs we have to e either write a new interpreter without type tests difficult because we used pattern matching in order to realize type tests e or design an untyped machine and produce code for it We choose here the second solution it will permit us to give some intuition about the compiling process of functional languages and to describe a typical execution model for strict functional languages The machine that we will use is a simplified version the Categorical Abstract Machine CAM for short We will call CAM our abstract machine despite its differences with the original CAM For more informations on the CAM see 8 25 16 1 The Abstract Machine The execution model is a stack machine i e a machine using a stack In this section we define in Caml the states of the CAM and its instructions A state is composed of e a register holding values and environments e a program counter represented here as a list of instructions
49. Unknown as tvi as t1 TypeVar Index m Value Unknown as tv2 as t2 gt if n lt gt m then tv Value lt t2 type 1 and type variable ti TypeVar Index _ Value Unknown as tv as t2 gt if not occurs tv t1 then tv Value lt t1 else raise TypeClash t1 t2 type variable and type t2 TypeVar Index _ Value Unknown as tv as t1 t2 if not occurs tv t2 then tv Value t2 else raise TypeClash t1 t2 Number Number gt Arrow ti1 t2 Arrow t 1 t 2 as t gt unify t1 t 1 unify t2 t 2 t1 t2 gt raise TypeClash t1 t2 unify asl type asl type gt unit fun dk dt dt db dt OF 15 3 3 Representation of typing environments We use asl type scheme list as typing environments and we will use the encoding of variables as indices into the environment The initial environment is a list of types Number Number gt Number which are the types of the ASL primitive functions let init typing env map function s gt Forall Arrow Number Arrow Number Number init_env The global typing environment is initialized to the initial typing environment and will be updated with the type of each ASL declaration after they are type checked let global typing env ref init typing env 15 3 4 From types to type schemes generalization In order to implement generalization we will need some functions collecting types va
50. We start by defining some very simple functions on numbers let square x Xj square float gt float lt fun gt square 2 0 float 0 square 2 0 3 0 float 0 444444444444 let sum_of_squares x y square x square y sum_of_squares float float gt float lt fun gt let half_pi 3 14159 2 0 in sum_of_squares cos half_pi sin half_pi float 1 0 4 1 NUMBERS 23 We now develop a classical example the computation of the root of a function by Newton s method Newton s method can be stated as follows if y is an approximation to a root of a function f then y fy 9 f is a better approximation where f y is the derivative of f evaluated at y For example with f x a a we have f x 2a and therefore f y y a Vytj y y f y 2y 2 We can define a function deriv for approximating the derivative of a function at a given point by let deriv f x dx f x dx f x dx deriv float gt float gt float gt float gt float fun Provided dx is sufficiently small this gives a reasonable estimate of the derivative of f at x The following function returns the absolute value of its floating point number argument It uses the if then else conditional construct let abs x if x gt 0 0 then x else x abs float gt float fun The main function given below uses three local functions
51. a contiguous fragment of memory while accessing list elements takes linear time 7 3 2 Modifying array elements Modification of an array element is done with the construct 1 lt where has the same type as the elements of the array The expression eg computes the index at which the modification will occur As for accessing a function for modifying array elements is also provided vect_assign vect gt int gt a gt unit fun For example 0 a 0 1 unit int vect 9 20 301 vect_assign a 0 vect_item a 0 1 unit int vect l8 20 301 7 4 Loops while and for Imperative programming i e using side effects such as physical modification of data structures traditionally makes use of sequences and explicit loops Sequencing evaluation in Caml Light is done by using the semicolon Evaluating expression e1 discarding the value returned and then evaluating e2 is written 7 4 LOOPS WHILE AND FOR 53 1 2 If e1 and eg perform side effects this construct ensures that they will be performed in the specified order from left to right In order to emphasize sequential side effects instead of using parentheses around sequences one can use begin and end as in let x ref 1 in begin x Ix 1 x Ix end unit The keywords begin and end are equiva
52. ake_set function eliminates duplicates in its list argument let rec make_set function EI gt O x l gt if mem x 1 then make set 1 else x make set 1 make set a list gt a list fun let generalise type gamma tau let genvars make_set subtract vars_of_type tau unknowns of type env gamma in Forall genvars tau generalise type asl type scheme list asl type gt asl type scheme fun 15 8 THE ASL TYPE SYNTHESIZER 109 15 3 5 From type schemes to types generic instantiation The following function returns a generic instance of its type scheme argument A generic instance is obtained by replacing all generic type variables by new unknowns let gen instance Forall gv tau We associate new unknown to each generic variable let unknowns map function n gt n TypeVar new vartypeO gv in let rec ginstance function TypeVar Index n Value Unknown as t gt try assoc n unknowns with Not_found gt t TypeVar Index _ Value t gt ginstance t Number gt Number Arrow t1 t2 gt Arrow ginstance t1 ginstance t2 Unknown gt raise TypingBug gen_instance in ginstance tau gen instance asl type scheme gt asl type fun 15 3 6 The ASL type synthesizer The type synthesizer is the asl_typing_expr function Each of its match cases corresponds to an inference rule given above
53. anguage which uses Milner s type system but where programs are submitted to lazy evaluation Currently the two main families of functional languages are the ML and the Miranda families 2 2 The ML family ML languages are based on a sugared version of A calculus Their evaluation regime is call by value i e the argument is evaluated before being passed to a function and they use Milner s type system The LCF proof system appeared in 1972 at Stanford Stanford LCF It has been further developed at Cambridge Cambridge LCF where the ML language was added to it From 1981 to 1986 a version of ML and its compiler was developed in a collaboration between INRIA and Cambridge by G Cousineau G Huet and L Paulson lie with a more user friendly syntax 2 8 THE MIRANDA FAMILY 11 In 1981 L Cardelli implemented a version of ML whose compiler generated native machine code In 1984 a committee of researchers from the universities of Edinburgh and Cambridge Bell Laboratories and INRIA headed by R Milner worked on a new extended language called Standard ML 28 This core language was completed by a module facility designed by D MacQueen 23 Since 1984 the Caml language has been under design in a collaboration between INRIA and LIENS Its first release appeared in 1987 The main implementors of Caml were Asc nder Su rez Pierre Weis and Michel Mauny In 1989 appeared Standard ML of New Jersey developed by Andrew Appel
54. are root float gt float gt float fun cubic root float gt float gt float fun square root 2 0 1e 5 float 1 41421569553 cubic root 8 0 1e 5 float 2 00000000131 42 0 newton cos 1e 5 1 5 float 3 14159265359 4 2 Boolean values The presence of the conditional construct implies the presence of boolean values The type bool is composed of two values true and false true bool true false bool false The functions with results of type bool are often called predicates Many predicates are predefined in Caml Here are some of them prefix lt a gt a gt bool fun 1 lt 2 bool true prefix lt float gt float gt bool fun 3 14159 lt 2 718 bool false There exist also lt gt gt and similarly lt gt gt 4 2 1 Equality Equality has a special status in functional languages because of functional values It is easy to test equality of values of base types integers floating point numbers booleans 1 2 bool false false 1 gt 2 bool true 422 BOOLEAN VALUES 25 But it is impossible in the general case to decide the equality of functional values Hence equality stops on a run time error when it encounters functional values fun x gt x fun x gt x Uncaught exception Invalid_argument equal functional value Since equali
55. argument as an accumulator holding the sum computed so far The call stream_sum n x uses x which was bound in the stream pattern component occurring at the left of the call Warning streams patterns are legal only in the function and match constructs The let and other forms are restricted to usual patterns Furthermore a stream pattern cannot appear inside another pattern 10 3 Parameterized parsers Since a parser is a function like any other function it can be parameterized or be used as a parameter Parameters used only in the right hand side of stream matching rules simulate inherited attributes of attribute grammars Parameters used as parsers in stream patterns allow for the implementation of higher order parsers We will use the next example to motivate the introduction of parameterized parsers 10 3 1 Example a parser for arithmetic expressions Before building a parser for arithmetic expressions we need a lexical analyzer able to recognize arithmetic operations and integer constants Let us first define a type for tokens type token PLUS MINUS TIMES DIV LPAR RPAR INT of int Type token defined Skipping blank spaces is performed by the spaces function defined as let rec spaces function lt Nt Nn spaces _ gt gt 70 CHAPTER 10 STREAMS AND PARSERS SI lt gt gt spaces char stream gt unit fun The conversion of a digit character into its integer va
56. atches any boolean value in fact only false since true has been caught by the first match case When the second case is chosen the identifier x gets bound to the argument of negate and could be used in the right hand part of the match case Since in our example we do not use the value of the argument in the right hand part of the second match case another equivalent definition of negate would be let negate function true gt false _ gt true negate bool gt bool lt fun gt Where _ acts as a formal paramenter matches any value but does not introduce any binding it should be read as anything else As an example of pattern matching consider the following function definition truth value table of implication let imply function true true gt true true false gt false false true gt true false false gt true imply bool bool gt bool lt fun gt Here is another way of defining imply by using variables let imply function true x gt x false x gt true imply bool bool gt bool lt fun gt Simpler and simpler again let imply function true x gt x false _ gt true imply bool bool gt bool lt fun gt let imply function true false gt false _ gt true imply bool bool gt bool lt fun gt Pattern matching is allowed on any type of value non trivial pattern matching is only possible
57. cl s e reset vartypesO let tau TYPING try asl_typing_expr global_typing_env e with TypeClash t1 t2 gt A typing error let vars vars_of_type t1 vars_of_type t2 in print_string ASL Type clash between print_type_scheme Forall vars t1 print_string and print_type_scheme Forall vars t2 print newline raise Failure ASL typing in let sigma generalise_type global_typing_env tau in UPDATING ENVIRONMENTS global_env s global_env global_typing_env sigma global_typing_env reset_vartypes PRINTING RESULTING TYPE print_string ASL Type of print_string s print string is print type scheme sigma print newlineO typing top asl gt unit fun 15 3 8 Typing ASL programs We reinitialize the parsing environment global env init env unit Now let us run some examples through the ASL type checker typing parse_top let x be 1 ASL Type of x is Number 112 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS unit typing parse top 2 x x 3 ASL Type of it is Number unit Q typing parse top if O 1 then 1 else O fi ASL Type of it is Number unit Q typing parse_top let id be x x ASL Type of id is a gt a unit Q typing parse_top id 1 id id 2 ASL Type of it is Number unit
58. d i e possess a non recursive case or base case in order to work well with call by value 6 2 4 Parameterized sum types Sum types may also be parameterized Here is the definition of a type isomorphic to the list type type a sequence Empty Sequence of sequence Type sequence defined A more sophisticated example is the type of generic binary trees type a b btree Leaf of b Btree of a b node and a b node Op a Soni a b btree Son2 a b btree Type btree defined Type node defined A binary tree is either a leaf holding values of type b or a node composed of an operator of type a and two sons both of them being binary trees Binary trees can also be used to represent abstract trees for arithmetic expressions with only binary operators and only one kind of leaves The abstract syntax tree t of 1 2 could be defined as let t Btree Op Soni Leaf 1 Son2 Leaf 2 t string int btree Btree 0 Soni Leaf 1 Son2 Leaf 2 Finally it is time to notice that pattern matching is not restricted to function bodies i e constructs such as function Pj gt Merc P4 En 44 CHAPTER 6 USER DEFINED TYPES but there is also a construct dedicated to pattern matching of actual values match E with Pj gt F4 Du es P gt En which matches the value of the expression E against each of
59. e We can see from the previous examples that the stream pattern lt x gt matches an initial segment of the stream Such a pattern must be read as the stream whose first element matches x Furthermore once stream matching has succeeded the stream argument has been physically modified and does not contain any longer the part that has been recognized by the next function If we come back to the infinite stream of integers we can see that the calls to next provoke the evaluation of the necessary part of the stream next ints next ints next ints Toplevel input gt next ints next ints next ints Warning this expression has type int but is used with type unit Toplevel input gt next ints next ints next ints ees Warning this expression has type int but is used with type unit 012 int 2 68 CHAPTER 10 STREAMS AND PARSERS Thus successive calls to next remove the first elements of the stream until it becomes empty Then next fails when applied to the empty stream since in the definition of next there is no stream pattern that matches an initial segment of the empty stream It is of course possible to specify several stream patterns as in let next function lt x gt gt x lt gt gt raise Failure empty next a stream gt a fun Cases are tried in turn from top to bottom Stream pattern components are not restricted to quoted patterns intended to ma
60. e conditional must be a special construct because our language will be submitted to call by value thus the conditional cannot be a function ASL programs are built up from numbers variables functional expressions A abstractions applications and conditionals An ASL program consists of a global declaration of an identifier getting bound to the value of an expression The primitive functions that are available are equality between numbers and arithmetic binary operations The concrete syntax of ASL expressions can be described ambiguously as Expr INT IDENT if Expr then Expr else Expr fi Expr IDENT Expr and the syntax of declarations is given as Decl let IDENT be Expr Expr yt Arithmetic binary operations will be written in prefix position and will belong to the class IDENT The symbol will play the role of the Caml keyword function We start by defining the abstract syntax of ASL expressions and of ASL toplevel phrases Then we define a parser in order to produce abstract syntax trees from the concrete syntax of ASL programs 12 1 ASL abstract syntax trees We encode variable names by numbers These numbers represent the binding depth of variables For instance the function of x returning x the ASL identity function will be represented as Abs x Var 1 87 88 CHAPTER 12 ASL A SMALL LANGUAGE And the ASL application function which would be written in Caml func
61. e aspect When running Caml on a computer we enter a toplevel loop where Caml waits for input from the user and then gives a response to what has been entered The beginning of a Caml Light session looks like this assuming is the shell prompt on the host machine camllight gt Caml Light version 0 6 On the PC version the command is called caml The character is Caml s prompt When this character appears at the beginning of a line in an actual Caml Light session the toplevel loop is waiting for a new toplevel phrase to be entered Throughout this document the phrases starting by the character represent legal input to Caml Since this document has been pre processed by Caml Light and these lines have been effectively given as input to a Caml Light process the reader may reproduce by him herself the session contained in each chapter each chapter of the first two parts contains a different session the third part is a single session Lines starting with the gt character are Caml Light error messages Lines starting by another character are either Caml responses or possibly illegal input The input is printed in typewriter font like this and output is written using slanted typewriter font like that Important Of course when developing non trivial programs it is preferable to edit programs in files and then to include the files in the toplevel instead of entering the programs directly Caml Light impl
62. e considerations on mult and restmult we can complete the parser obtaining 10 3 PARAMETERIZED PARSERS 73 let rec expr let rec restexpr 1 function lt PLUS mult 2 restexpr Plus e1 e2 e gt gt e lt MINUS mult e2 restexpr Minus 1 2 e gt gt e lt gt gt 1 in function lt mult e1 restexpr e1 e2 gt gt e2 and mult let rec restmult 1 function lt TIMES atom e2 restmult Mult e1 e2 e gt gt e lt DIV atom 2 restmult Div e1 e2 e gt gt lt gt gt et din function lt atom ei restmult e1 e2 gt gt e2 and atom function lt INT n gt gt Int n lt LPAR expr RPAR gt gt e expr token stream gt atree lt fun gt mult token stream gt atree lt fun gt atom token stream gt atree lt fun gt And we can now try our parser expr lexer stream of string 14243 4 567 atree Minus Plus Plus Int 1 Int 2 Mult Int 3 Int 4 Int 567 10 3 2 Parameters simulating inherited attributes In the previous example the parsers restexpr and restmult take an abstract syntax tree e1 as ar gument and pass it down to the result through recursive calls such as restexpr Plus e1 e2 If we see such parsers as non terminals RestExpr from the grammar above this parameter acts as an inherited attribute of the non termi
63. e historical points e 1930 Alonzo Church developed the A calculus 6 as an attempt to provide a basis for math ematics The A calculus is a formal theory for functionality The three basic constructs of the A calculus are variable names e g 2 y application MN if M and M are terms functional abstraction Terms of the A calculus represent functions The pure A calculus has been proved inconsis tent as a logical theory Some type systems have been added to it in order to remedy this inconsistency e 1958 Mac Carthy invented Lisp 22 whose programs have some similarities with terms of the A calculus Lisp dialects have been recently evolving in order to be closer to modern functional languages Scheme but they still do not possess a type system e 1965 P Landin proposed the ISWIM 18 language for If You See What I Mean which is the precursor of languages of the ML family e 1978 J Backus introduced FP a language of combinators 3 and a framework in which it is possible to reason about programs The main particularity of FP programs is that they have no variable names e 1978 R Milner proposes a language called ML 11 intended to be the metalanguage of the LCF proof assistant i e the language used to program the search of proofs This language is inspired by ISWIM close to A calculus and possesses an original type system e 1985 D Turner proposed the Miranda 36 programming l
64. e integer division of an integer n by 0 then we must escape from that embarrassing situation without returning any result Another example of the usage of such an escape mechanism appears when we want to define the head function on lists let head function x L gt x gt raise Failure head empty list Toplevel input gt Beek P x gt 3 Warning the variable L starts with an upper case letter in this pattern head a list gt fun We cannot give a regular value to the expression head without losing the polymorphism of head We thus choose to escape we raise an exception 8 1 Exceptions An exception is a Caml value of the built in type exn very similar to a sum type An exception e has a name Failure in our example e and holds zero or one value head empty list of type string in the example Some exceptions are predefined like Failure New exceptions can be defined with the following construct exception exception name gt of lt type gt Example 55 56 CHAPTER 8 ESCAPING FROM COMPUTATIONS EXCEPTIONS exception Found of int Exception Found defined The exception Found has been declared and it carries integer values When we apply it to an integer we get an exception value of type exn Found 5 exn Found 5 8 2 Raising an exception Raising an exception is done by applying the primitive function raise to a value of type exn raise e
65. e object defined The type state is a product type with mutable components type state mutable Reg object mutable PC instruction list mutable Stack object list Type state defined Now we give the operational semantics of CAM instructions The effect of an instruction is to change the state configuration This is what we describe now with the step function Code executions will be arbitrary iterations of this function exception CAMbug of string Exception CAMbug defined exception CAM End of object Exception CAM End defined let step state match state with Reg _ PC Quote n code Stack s gt state Reg lt Constant n state PC lt code 16 1 THE ABSTRACT MACHINE 117 Reg Constant m PC Plus code Stack Constant n s gt state Reg lt Constant ntm state Stack lt s State PC code Reg Constant m PC Minus code Stack Constant n s gt state Reg lt Constant n m state Stack lt s state PC lt code Reg Constant m PC Times code Stack Constant n s gt state Reg lt Constant n m state Stack lt s state PC lt code Reg Constant m PC Divide code Stack Constant n s gt state Reg lt Constant n m state Stack lt s state PC lt code Reg Constant m PC Equal code Stack Constant n s gt state Reg lt Constant if n m then 1 else 0 state Stack lt s state PC lt code
66. e representing the functional value the closure is composed of the code of the function and the current environment Here is the compiling function let rec code of function Const n gt Quote n Var n gt Nth n 16 2 COMPILING ASL PROGRAMS INTO CAM CODE 119 4 Cond e test e t e f gt Push code_of e test Branch code_of e_t code_of e_f App ei e2 gt Push code of 1 Swap code_of e2 Apply Abs _ e gt Clos code_of e code_of asl gt instruction list lt fun gt A global environment is needed in order to maintain already defined values Any CAM execution will start in a state whose register part contains this global environment let init_CAM_env let basic_instruction function gt Plus Minus Times gt Divide gt Equal s gt raise CAMbug Unknown primitive in map function s gt Closure Address Clos Push Nth 2 Swap Nth 1 basic_instruction s Environment init env init CAM env object list Closure Address Clos Push Nth 2 Swap Nth 1 Plus Environment Closure Address Clos Push Nth 2 Swap Nth 1 Minus Environment Closure Address Clos Push Nth 2 Swap Nth 1 Times Environment Closure Address Clos Push Nth 2 Swap Nth 1 Divide Environment Closure Address Clos Push Nth 2 Swap Nth 1
67. e string type Types are divided into two classes e Basic types int bool string 16 CHAPTER 3 BASIC CONCEPTS e Compound types such as functional types For example the type of functions from integers to integers is denoted by int gt int The type of functions from boolean values to character strings is written bool gt string The type of pairs whose first component is an integer and whose second component is a boolean value is written int bool In fact any combination of the types above and even more is possible This could be written as int bool string BasicType Type gt Type Type Type Type Once a Caml toplevel phrase has been entered in the computer the Caml process starts analyzing it First of all it performs syntax analysis which consists in checking whether the phrase belongs to the language or not For example here is a syntax error a parenthesis is missing function x gt x 1 2 3 Toplevel input gt function x gt x 1 2 3 s ES Syntax error The carets underline the location where the error was detected The second analysis performed is type analysis the system attempts to assign a type to each subexpression of the phrase and to synthesize the type of the whole phrase If type analysis fails i e if the system is unable to assign a sensible type to the phrase then a type error is reported and Caml waits for another input rejectin
68. e1 restmult gt gt and atom function lt INT n gt gt Int n lt LPAR expr e gt gt e We used question marks where parameters bindings and results still have to appear Let us consider the expr function clearly as soon as e1 is recognized we must be ready to build the leftmost subtree of the result This leftmost subtree is either restricted to e1 itself in case restexpr does not encounter any operator or it is the tree representing the addition or subtraction of e1 and the expression immediately following the additive operator Therefore restexpr must be called with e1 as an intermediate result and accumulate subtrees built from its intermediate result the tree constructor corresponding to the operator and the last expression encountered The body of expr becomes let rec expr let rec restexpr e1 function lt PLUS mult e2 restexpr Plus e1 e2 e gt gt e lt MINUS mult e2 restexpr Minus e1 e2 gt gt e lt gt gt 1 in function lt mult e1 restexpr e1 e2 gt gt e2 Now expr recognizes a product e1 by mult and applies restexpr e1 to the rest of the stream According to the additive operator encountered if any this function will apply mult which will return some e2 Then the process continues with Plus e1 e2 as intermediate result In the end a correctly balanced tree will be produced using the last rule of restexpr With the sam
69. eaf x gt Leaf f x Btree Op op Soni s1 Son2 s2 Btree Op g op Soni map btree f g s1 Son2 map_btree f g s2 map_btree a gt b gt c gt gt Cc a btree gt d b btree lt fun gt 6 3 p 46 We need to give a functional interpretation to btree data constructors We use f resp g to denote the function associated to the Leaf resp Btree data constructor obtaining the following Caml definition let rec btree_it f g function Leaf x gt f x Btree Op op Soni s1 Son2 s2 gt g op btree it f g s1 btree it f g s2 btree it a gt b gt c gt b gt b gt gt Cc btree gt b fun btree_it function x gt x function gt prefix _ gt raise Failure Unknown op Btree Op Soni Leaf 1 Son2 Leaf 2 int 3 128 CHAPTER 17 ANSWERS TO EXERCISES 7 1 p 54 type a b lisp cons mutable Car a mutable Cdr b Type lisp_cons defined let car p p Car and cdr p p Cdr and rplaca p v p Car lt v and rplacd p v p Cdr lt v car a b lisp_cons gt a lt fun gt a b lisp_cons gt b lt fun gt rplaca a b lisp cons gt a gt unit fun rplacd a b lisp cons gt b gt unit fun let p Car 1 Cdr true p nt bool lisp_cons Car 1 Cdr true
70. ec ident len function lt af Z A Zf as if len gt 8 then ident len else begin set_nth_char ident_buf len c ident succ len end s gt gt s lt gt gt sub string ident buf 0 len ident int gt char stream gt string lt fun gt dk dt db dto The lexical analyzer will first try to recognize an alphabetic character c then put c at position 0 of ident buf and call ident 1 rest of the character stream Alphabetic characters encountered will be stored in the string buffer ident buf up to the 8th Further alphabetic characters will be skipped Finally a substring of the buffer will be returned as result 134 CHAPTER 17 let s stream of string toto 1 S char stream lt abstr gt ident 0 s string toto Let us see what remains in the stream match s with lt c gt gt c char let s stream of string LongIdentifier S char stream lt abstr gt ident 0 s string LongIden match s with lt c gt gt c char ANSWERS TO EXERCISES The definitions of the new token type and of the lexical analyzer is trivial and we shall omit them A slightly more complex lexical analyzer recognizing identifiers lowercase only is given in section 12 2 1 in this part 11 1 p 83 main ml let chars counter new 0 let lines counter new 0 let count file filename let in chan o
71. eir ASCII code as in char of int 65 char A Other operations are available sub_string int_of_char etc See the Caml Light reference manual 21 for complete information 4 4 Tuples 4 4 1 Constructing tuples It is possible to combine values into tuples pairs triples The value constructor for tuples is the character the comma We will often use parentheses around tuples in order to improve readability but they are not strictly necessary 1 2 int int 1 2 1 2 3 4 int int int int 1 2 3 4 let p 1 2 1 lt 2 p int bool 3 true The infix identifier is the type constructor for tuples For instance f1 t9 corresponds to the mathematical cartesian product of types t and t We can build tuples from any values the tuple value constructor is generic 4 4 2 Extracting pair components Projection functions are used to extract components of tuples For pairs we have fst a b gt a lt fun gt snd a b gt b fun They have polymorphic types of course since they may be applied to any kind of pair They are predefined in the Caml system but could be defined by the user in fact the cartesian product itself could be defined see section 6 1 dedicated to user defined product types let first x y x second x y y first gt a fun second b g
72. ementations also possess a batch compiler usable to compile files and produce standalone appli cations this will be discussed in chapter 11 13 14 CHAPTER 3 BASIC CONCEPTS Furthermore when debugging it is very useful to trace some functions to see with what arguments they are called and what result they return See the reference manual 21 for a description of these facilities 3 2 Evaluation from expressions to values Let us enter an arithmetic expression and see what is Caml s response 14 2 int 3 The expression 1 2 has been entered followed by which represents the end of the current toplevel phrase When encountering Caml enters the type checking more precisely type synthesis phase and prints the inferred type for the expression Then it compiles code for the expression executes it and finally prints the result In the previous example the result of evaluation is printed as 3 and the type is int the type of integers The process of evaluation of a Caml expression can be seen as transforming this expression until no further transformation is allowed These transformations must preserve semantics For example if the expression 1 2 has the mathematical object 3 as semantics then the result 3 must have the same semantics The different phases of the Caml evaluation process are e parsing checking the syntactic legality of input e type synthesis e com
73. erson Name string Age int Job string City string Type person defined The type person is the product of string int string and string The field names provide type information and also documentation it is much easier to understand data structures such as jean above than arbitrary tuples We have labels i e Name to refer to components of the products The order of appearance of the products components is not relevant labels are sufficient to uniquely identify the components The Caml system finds a canonical order on labels to represent and print record values The order is always the order which appeared in the definition of the type We may now define the individual jean as let jean Job Student City Paris Name Jean Age 23 jean person Name Jean Age 23 Job Student City Paris This example illustrates the fact that order of labels is not relevant 6 1 2 Extracting products components The canonical way of extracting product components is pattern matching Pattern matching pro vides a way to mention the shape of values and to give local names to components of values In the following example we name n the numerical value contained in the field Age of the argument and we choose to forget values contained in other fields using the _ character let age_of function Age n Name _ Job _ City _ gt n age of person gt int lt fun gt age_of jean int
74. evel loop and its basic objects Basic types and some of their associated primitives are presented in chapter 4 Lists chapter 5 and user defined types chapter 6 are structured data allowing for the representation of complex objects and their easy creation and destructuration While concepts presented in part I are common under one form or another to many functional languages part B chapters 7 11 is dedicated to features specific to Caml Light mutable data structures chapter 7 exception handling chapter 8 input output chapter 9 and streams and parsers chapter 10 show a more imperative side of the language Standalone programs and separate compilation chapter 11 allow for modular programming and the creation of standalone applications Concise examples of Caml Light features are to be found in this part Part C chapters 12 16 is meant for already experienced Caml Light users willing to know more about how the Caml Light compiler synthesizes the types of expression and how compilation and evaluation proceeds Some knowledge about first order unification is assumed The presentation is rather informal and is sometimes terse specially in the chapter about type synthesis We prototype a small and simple functional language called ASL we give the complete prototype implementation from the ASL parser to the symbolic execution of code Lexing and parsing of ASL programs are presented in chapter 12 providing realistic usages of streams
75. g the current phrase Here is a type error cannot add a number to a boolean function x gt 1 2 true Toplevel input gt function x gt x 1 2 true ded This expression has type bool but is used with type int The rejection of ill typed phrases is called strong typing Compilers for weakly typed languages C for example would instead issue a warning message and continue their work at the risk of getting a Bus error or Illegal instruction message at run time Once a sensible type has been deduced for the expression then the evaluation process compi lation loading and execution may begin Strong typing enforces writing clear and well structured programs Moreover it adds security and increases the speed of program development since most typos and many conceptual errors are Actually lexical analysis takes place before syntax analysis and lexical errors may occur detecting for instance badly formed character constants 3 4 FUNCTIONS 17 trapped and signaled by the type analysis Finally well typed programs do not need dynamic type tests the addition function does not need to test whether or not its arguments are numbers hence static type analysis allows for more efficient machine code 3 4 Functions The most important kind of values in functional programming are functional values Mathemati cally a function f of type A B is a rule of correspondence which associates with each element
76. grams and separate compilation 11 1 Standalone programs 2l 11 2 Programs in several files 11 3 Abstraetion ua amp A mr omnc moe Liens III A complete example 12 ASL A Small Language 12 ASL abstract syntax trees 12 2 Parsing ASL programs les 13 Untyped semantics of ASL programs I34 Semantic values x do ox e RUE X Bex n 13 2 Semantic 13 3 Exaimples fy aes ab es me dy vod ees 14 Encoding recursion 14 1 Fixpoint combinators es 14 2 Recursion as a primitive construct CONTENTS AT 49 LOSS d ae kk ud te f 49 RES EDO page Mate 50 51 Dx Hoh aes rues Pai cede 52 Shades e A a ud 54 55 TTA DONI 55 pe Mi oe BO iis egestas d 56 RE NOME 57 sub y ceto axe toe a Ms 58 59 dau ES NT es 59 61 Eu AO EE Suc ep 62 62 65 veuve tuus ond 65 67 hy de 69 Scab Wat Se i ster oa ots ae 76 79 Sah Raat ae te 79 EE 80 fono ADM DLP 82 85 87 sea eae hy E 87 DE 88 93 ev d de coped Dr eoim 93 E 94 T s 95 97 97 CONTENTS 3 15 Static typing polymorphism and type synthesis 99 15 1 Lhestype System fas dk y dece modo xo Rheno Rex deos HIE ek d ET CREER d 99 1522 Fhevalgorithim a te o dan B Xem de Pee tan hte Pte Site
77. head function x gt x gt raise Failure head head a list gt a fun let tail function _ 1 gt 1 gt raise Failure tail tail a list gt a list fun The expression raise Failure head will produce a run time error when evaluated In the definition of head above we have chosen to forbid taking the head of an empty list We could have chosen tail to evaluate to but we cannot return a value for head without changing the type of the head function We say that the types list and bool are sum types because they are defined with several alternatives e list is either or of e a boolean value is either true or false Lists and booleans are typical examples of sum types Sum types are the only types whose values need run time tests in order to be matched by a non variable pattern i e a pattern that is not a single variable The cartesian product is a product type only one alternative Product types do not involve run time tests during pattern matching because the type of their values suffices to indicate statically what their structure is 5 3 FUNCTIONS OVER LISTS 35 5 3 Functions over lists We will see in this section the definition of some useful functions over lists These functions are of general interest but are not exhaustive Some of them are predefined in the Caml Light system See also 9 or 37 for other examples of functions over lis
78. ht system release 0 6 Documentation and user s manual Technical report INRIA 1993 Included in the Caml Light 0 6 distribution J MacCarthy Lisp 1 5 Programmer s Manual MIT Press Cambridge Mass 1962 D MacQueen Modules for Standard ML In Proceedings of the ACM Conference on Lisp and Functional Programming 1984 M Mauny and D de Rauglaudre Parsers in ML In Proceedings of the ACM Conference on Lisp and Functional Programming 1992 M Mauny and A Su rez Implementing functional languages in the categorical abstract machine In Proceedings of the ACM International Conference on Lisp and Functional Pro gramming pages 266 278 1986 M Mauny Functional programming using CAML Technical Report 129 INRIA 1991 R Milner A theory of type polymorphism in programming J Comput Syst Sci 17 348 375 1978 R Milner A proposal for Standard ML In Proceedings of the ACM Conference on Lisp and Functional Programming 1987 S L Peyton Jones The Implementation of Functional Programming Languages Series in Computer Science Prentice Hall International 1987 S L Peyton Jones Mise en ceuvre des langages fonctionnels de programmation Manuels Informatiques Masson Masson 1990 J Rees and W Clinger The revised report on the algorithmic language Scheme In SIGPLAN Notices volume 21 1987 BIBLIOGRAPHY 139 32 J A Robinson Computational logic the unification computation In Machine Intelligence v
79. ined We must give a semantic value to our basic functions equality and arithmetic operations The next function transforms a Caml function into an ASL value let init semantics caml_fun Funval function Constval n gt Funval function Constval m gt Constval caml fun n m _ gt raise Illtyped _ gt raise Illtyped init semantics int gt int gt int gt semval fun 93 94 CHAPTER 13 UNTYPED SEMANTICS OF ASL PROGRAMS Now associate a Caml Light function to each ASL predefined function let caml function function gt prefix prefix prefix gt prefix gt fun n m gt if n m then 1 else 0 s gt raise SemantBug Unknown primitive caml function string gt int gt int gt int fun dk dk In the same way as for parsing we needed a global environment from which the binding depth of identifiers was computed we need a semantic environment from which the interpreter will fetch the value represented by identifiers The global semantic environment will be a reference on the list of predefined ASL values let init sem map fun x gt init semantics caml function x init_env init_sem semval list Funval fun Funval fun Funval fun Funval fun Funval fun let global sem ref init sem global sem semval list ref ref Funval fun Funval fun Funval fun
80. ion and unary minus multiplication division mod modulo Some examples of expressions 3 4 2 int 14 3 4 2 int 18 3 7 23 int 6 There are precedence rules that make bind tighter than and so on In doubt write extra parentheses So far we have not seen the type of these arithmetic operations They all expect two integer arguments so they are functions taking one integer and returning a function from integers to integers The functional value of such infix identifiers may be obtained by taking their prefix version prefix int gt int gt int fun 21 22 CHAPTER 4 BASIC TYPES prefix int gt int gt int fun prefix mod int gt int gt int fun As shown by their types the infix operators do not work on floating point values A separate set of floating point arithmetic operations is provided addition subtraction and unary minus multiplication division sqrt square root exp log exponentiation and logarithm sin cos usual trigonometric operations We have two conversion functions to go back and forth between integers and floating point numbers int of float and float of int 1 2 3 Toplevel input gt 1 2 3 5 This expression has type float but is used with type int float of int 1 2 3 float 3 3 Let us now give some examples of numerical functions
81. ist 1 2 3 4 Lists of functions let adds let add x y in add 1 add 2 add 3 adds int gt int list fun fun lt fun gt 33 34 CHAPTER 5 LISTS and indeed lists of anything desired We may wonder what are the types of the list data constructors The empty list is a list of anything since it has no element it has thus the type list of anything The list constructor takes an element and a list containing elements with the same type and returns a new list Here again there is no type constraint on the elements considered 1 list function head gt function tail gt head tail a gt a list gt a list fun We see here that the list type is a recursive type The constructor receives two arguments the second argument is itself a list 5 2 Extracting elements from lists pattern matching We know how to build lists we now show how to test emptiness of lists although the equality predicate could be used for that and extract elements from lists e g the first one We have used pattern matching on pairs numbers or boolean values The syntax of patterns also includes list patterns We will see that any data constructor can actually be used in a pattern For lists at least two cases have to be considered empty non empty let is null function gt true false is null a list gt bool fun let
82. it is Number 1 unit Q run parse_top 1 2 ASL Type of it is Number 3 unit run parse top x x 1 3 ASL Type of it is Number 4 unit Q We may now introduce the Z fixpoint combinator as a predefined function fix begin global env fix init env global typing env Forall 1 Arrow Arrow TypeVar Index 1 Value Unknown TypeVar Index 1 Value Unknown TypeVar Index 1 Value Unknown init typing env global CAM env match code of expression lexer stream of string COW xe f A Z Ge x 2 2 x z with Clos C gt Closure Address C Environment _ gt raise CAMbug Wrong code for fix init CAM env 121 122 CHAPTER 16 COMPILING ASL TO AN ABSTRACT MACHINE CODE end Toplevel input gt with Clos C gt Closure Address C Environment 5 Warning the variable C starts with an upper case letter in this pattern unit run parse top let fact be fix f n if n 0 then 1 else n f n 1 nisi Beta bre ASL Type of fact is Number gt Number lt funval gt unit run parse top let fib be fix f n if n 1 then 1 else if n 2 then 1 else f n 1 n 2 fi Type of fib is Number gt Number lt funval gt unit run parse top fact 8
83. ithms Types are either e product types e or sum types We have already seen examples of both kinds of types the bool and list types are sum types they contain values with different shapes and are defined and matched using several alternatives The cartesian product is an example of a product type we know statically the shape of values belonging to cartesian products In this chapter we will see how to define and use new types in Caml 6 1 Product types Product types are finite labeled products of types They are a generalization of cartesian product Elements of product types are called records 6 1 1 Defining product types An example suppose we want to define a data structure containing information about individuals We could define let jean Jean 23 Student Paris jean string int string string Jean 23 Student Paris and use pattern matching to extract any particular information about the person jean The prob lem with such usage of cartesian product is that a function name_of returning the name field of a value representing an individual would have the same type as the general first projection for 4 tuples and indeed would be the same function This type is not precise enough since it allows for the application of the function to any 4 uple and not only to values such as jean Instead of using cartesian product we define a person data type 37 38 CHAPTER 6 USER DEFINED TYPES type p
84. ix ffact 3 Using our intuition about the evaluation rules and the definition of a fixpoint combinator we obtain ffact fix ffact 3 Replacing ffact by its definition we obtain E fact n if O then 1 else n fact n 1 fi fix ffact We can now pass the two arguments to the first abstraction instantiating fact and n respectively to fix ffact and 3 E if 3 0 then 1 else 3 fix ffact 3 1 fi We can now reduce the conditional into its else branch E fix ffact 3 1 Continuing this way we eventually compute 3 2 1 1 6 This is the expected behavior of the factorial function Given an appropriate fixpoint combinator fix we could define the factorial function as fix ffact where ffact is the expression above Unfortunately when using call by value any application of a fixpoint combinator F such that F M evaluates to M F M leads to non termination of the evaluation because evaluation of F M leads to evaluating M F M and thus F M again 97 98 CHAPTER 14 ENCODING RECURSION We will use the Z fixpoint combinator defined by 2 Af Qu f y Az f x y The fixpoint combinator Z has the particularity of being usable under call by value evaluation regime in order to check that fact it is necessary to know the evaluation rules of A calculus Since the name z looks more like an ordinary parameter name we will call fix the ASL expression
85. le files counter zo main zo The final linking phases takes the two zo files and produces the executable main Object files must be linked in the right order for each global identifier the module defining it must come before the modules that use it Prefixing all external identifiers by the name of their defining module is sometimes tedious Therefore the Caml Light compiler provides a mechanism to omit the module__ part in external identifiers The system maintains a list of default modules called the currently opened modules and whenever it encounters an identifier without the module__ part it searches through the opened modules to find one that defines this identifier Searched modules always include the module being compiled searched first and some library modules of general use In addition two directives are provided to add and to remove modules from the list of opened modules e open module to add module in front of the list e close module to remove module from the list For instance we can rewrite the main ml file above as 82 CHAPTER 11 STANDALONE PROGRAMS AND SEPARATE COMPILATION open counter let chars new 0 let lines new 0 try while true do let c input_char std_in in incr chars if c n then incr lines done with End_of_file gt print int read chars print string characters print int read lines print string lines Wn exit 0 5 After the open co
86. lent to opening and closing parentheses The program above could be written as let x ref 1 in x Ix 1 x Ix x unit QO Explicit loops are not strictly necessary per se a recursive function could perform the same work However the usage of an explicit loop locally emphasizes a more imperative style Two loops are provided e while while do e2 done evaluates which must return a boolean expression if e1 return true then e which is usually a sequence is evaluated then e is evaluated again and so on until returns false e for two variants increasing and decreasing for v e to e do done for v e downto e do done where v is an identifier Expressions and e are the bounds of the loop they must evaluate to integers In the case of the increasing loop the expressions and ez are evaluated producing values n4 and if n is strictly greater than n then nothing is done Otherwise is evaluated n4 1 times with the variable v bound successively to n1 ni 1 The behavior of the decreasing loop is similar if n4 is strictly smaller than n5 then nothing is done Otherwise is evaluated n n3 1 times with v bound to successive values decreasing from n to ng Both loops return the value of type unit for i 0 to vect length a 1 do a i i done unit int vect l0 1 21 54
87. lt type gt s may contain type variables appearing in lt params gt e Sum types Mathematical disjoint union of several types The construct is type lt params gt lt tname gt lt Injection gt of lt type gt where the lt type gt s may contain type variables appearing in lt params gt 46 CHAPTER 6 USER DEFINED TYPES Exercises Exercise 6 1 Define a function taking as argument a binary tree and returning a pair of lists the first one contains all operators of the tree the second one contains all its leaves Exercise 6 2 Define a function map btree analogous to the map function on lists The function map btree should take as arguments two functions f and g and a binary tree It should return a new binary tree whose leaves are the result of applying to the leaves of the tree argument and whose operators are the results of applying the g function to the operators of the argument Exercise 6 3 We can associate to the list type definition an canonical iterator in the following way We give a functional interpretation to the data constructors of the list type We change the list constructors and respectively into a constant a and an operator used as a prefix identifier and abstract with respect to these two operators obtaining the list iterator satisfying list it a list it a z1 zn zn a Its Caml definition would be let rec list it f a f
88. lue is done by let int of digit function 4 0 9 as c gt int of char c int of char 0 gt raise Failure not a digit int of digit char gt int fun The as keyword allows for naming a pattern in this case the variable c will be bound to the actual digit matched by 0 9 Pattern built with as are called alias patterns For the recognition of integers we already feel the need for a parameterized parser Integer recognition is done by the integer analyzer defined below It is parameterized by a numeric value representing the value of the first digits of the number let rec integer n function 0 9 as c integer 10 n int of digit c r gt gt lt gt gt n integer int gt char stream gt int lt fun gt integer 0 stream of string 12345 int 12345 We are now ready to write the lexical analyzer taking a stream of characters and returning a stream of tokens Returning a token stream which will be explored by the parser is a simple reasonably efficient and intuitive way of composing a lexical analyzer and a parser let rec lexer s match s with lt lt spaces _ gt gt lt LPAR lexer s gt lt spaces _ gt gt lt RPAR lexer s gt lt spaces _ gt gt lt PLUS lexer s gt lt 2 4 spaces gt gt lt MINUS lexer gt lt
89. lushing is done with the flush function flush out channel gt unit fun string Hello flush std out Hello unit Q The print_newline function prints a newline character and flushes the standard output print_newline unit gt unit fun Flushing is required when writing standalone applications in which the application may terminate without all printing being done Standalone applications should terminate by a call to the exit function from the io module which flushes all pending output on std out and std err Printing on the standard error channel std err is done with the following functions prerr_char char gt unit lt fun gt prerr_string string gt unit lt fun gt prerr_int int gt unit fun prerr_float float gt unit fun The following function prints its string argument followed by a newline character to std_err and then flushes std_err prerr_endline string gt unit lt fun gt 62 CHAPTER 9 BASIC INPUT OUTPUT 9 3 Input These input primitives flush the standard output and read from the standard input dread line unit gt string fun read_int unit gt int fun read_float unit gt float fun Because of their names and types these functions do not need further explanation 9 4 Channels on files When programs have to read from or print
90. m data constructor The definition of r should be read as let be a reference to the value of 142 The value of r is nothing but a memory location whose contents can be overwritten We consult a reference i e read its memory location with the symbol r 13 int 4 We modify values of type ref with the infix function r rt 1 5 unit int ref ref 4 Some primitives are attached to the ref type for example incr int ref gt unit fun decr int ref gt unit fun which increments resp decrements references on integers 7 3 Arrays Arrays are modifiable data structures They belong to the parameterized type vect Array expressions are bracketed by and and elements are separated by semicolons 1 10 20 30 a int vect 10 20 301 The length of an array is returned by with the function vect_length vect_length a int 3 52 CHAPTER 7 MUTABLE DATA STRUCTURES 7 3 1 Accessing array elements Accesses to array elements can be done using the following syntax 0 int 10 or more generally e1 2 where evaluates to an array and e to an integer Alternatively the function vect item is provided vect_item a vect gt int gt a fun The first element of an array is at index 0 Arrays are useful because accessing an element is done in constant time an array is
91. may be seen as a game When learning a game we must e learn the rules what is allowed and what is forbidden e learn a winning strategy In type synthesis the rules of the game are called a type system and the winning strategy is the typechecking algorithm In the following sections we give the ASL type system the algorithm and an implementation of that algorithm Most of this presentation is borrowed from 7 15 1 The type system We study in this section a type system for the ASL language Then we present an algorithm performing the type synthesis of ASL programs and its Caml Light implementation Because of subtle aspects of the notation used inference rules and since some important mathematical notions such as unification of first order terms are not presented here this chapter may seem obscure at first reading The type system we will consider for ASL has been first given by Milner 27 for a subset of the ML language in fact a superset of A calculus A type is either e the type Number e or a type variable a 8 e Or T2 where 7 and 7 are types 99 100 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS In a type a type variable is an unknown i e a type that we are computing We will use 7 7 Ti as metavariables representing types This notation is important we shall use other greek letters to denote other notions in the following sections Example o Number
92. mult token stream gt atree lt fun gt atom token stream gt atree lt fun gt Now we take advantage of these similarities in order to define a general parser for left associative operators Its name is left assoc and is parameterized by a parser for operators and a parser for expressions let rec left assoc op term let rec rest 1 function lt op f term 2 rest f 1 2 e gt gt lt gt gt 1 in function lt term ei rest e1 2 gt gt e2 left_assoc a stream gt b b gt b gt a stream gt b gt a stream gt b fun Now we can redefine expr as let rec expr str left_assoc addop mult str and mult str left_assoc multop atom str and atom function lt INT n gt gt Int n lt LPAR expr RPAR gt gt e expr token stream gt atree lt fun gt mult token stream gt atree lt fun gt atom token stream gt atree lt fun gt And we can now try our definitive parser 10 3 PARAMETERIZED PARSERS 75 lexer stream of string 1 2 3 4 567 atree Minus Plus Plus Int 1 Int 2 Mult Int 3 Int 4 Int 567 Parameterized parsers are useful for defining general parsers such as left assoc that can be used with different instances Another example of a useful general parser is the star parser defined as let rec star p function lt p x star p
93. n Lambda calculus notation with nameless dummies a tool for automatic formula manipulation Indag Math 1962 M Gordon R Milner L Morris M Newey and C Wadworth A metalanguage for interactive proofs in LCF In Proceedings of ACM Symposium on Principles of Programming Languages pages 119 130 1978 J R Hindley and J P Seldin Introduction to Combinators and A calculus London Mathe matical Society Student Texts Cambridge University Press 1986 C A R Hoare Quicksort Computer Journal 5 1 1962 137 138 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 BIBLIOGRAPHY W A Howard The formulae as type notion of construction pages 479 490 Academic Press 1980 P Hudak and P Wadler Report on the programming language Haskell Technical Report YALEU DCS RR 777 Yale University 1990 T Johnsson Efficient compilation of lazy evaluation In Proceedings of ACM Conference on Compiler Construction pages 58 69 1984 J L Krivine Lambda calcul types et mod les Etudes et recherches en informatique Masson 1990 P Landin The next 700 programming languages In Communications of the ACM volume 9 pages 157 164 1966 X Leroy and P Weis Manuel de r f rence du langage Caml Inter ditions 1993 X Leroy The ZINC experiment an economical implementation of the ML language Technical Report 117 INRIA 1990 X Leroy The Caml Lig
94. n is raised during this evaluation then the raised exception is matched against the match cases If a case matches then control is passed to it If no case matches then the exception is propagated outside of the try construct looking for the enclosing try Example let find index p L let rec find n function gt raise Failure not found x L gt if p x then raise Found n else find n41 L in try find 1 L with Found n gt n Toplevel input gt let find index p L s 2 Warning the variable L starts with an upper case letter in this pattern Toplevel input gt x L gt if p x then raise Found n 5 Warning the variable L starts with an upper case letter in this pattern find index a gt bool gt a list gt int fun itfind index function n gt n mod 2 0 1 3 5 7 9 10 int 6 find_index function n gt n mod 2 0 1 3 5 7 9 Uncaught exception Failure not found The lt match cases gt part of the try construct is a regular pattern matching on values of type exn It is thus possible to trap any exception by using the _ symbol As an example the following function traps any exception raised during the application of its two arguments Warning the _ will also trap interrupts from the keyboard such as control C let catch_all f arg default try f arg with gt default catch all a gt b gt a gt b gt
95. nal Synthesized attributes are simulated by the right hand sides of stream matching rules 10 3 3 Higher order parsers In the definition of expr we may notice that the parsers expr and mult on the one hand and restexpr and restmult on the other hand have exactly the same structure To emphasize this similarity if we define parsers for additive resp multiplicative operators by let addop function lt PLUS gt gt function x y gt Plus x y lt MINUS gt gt function x y gt Minus x y and multop function lt gt gt function x y gt Mult x y lt DIV gt gt function x y gt Div x y 74 CHAPTER 10 STREAMS AND PARSERS addop token stream gt atree atree gt atree lt fun gt multop token stream gt atree atree gt atree lt fun gt we can rewrite the expr parser as let rec expr let rec restexpr 1 function lt addop f mult e2 restexpr f 1 2 e gt gt e lt gt gt et in function lt mult e1 restexpr e1 e2 gt gt e2 and mult let rec restmult e1 function lt multop f atom e2 restmult f e1 e2 e gt gt e lt gt gt et in function lt atom e1 restmult e1 e2 gt gt e2 and atom function lt DINT n gt gt Int n lt LPAR expr RPAR gt gt e expr token stream gt atree lt fun gt
96. ng in the Camlstep function implements kind of dynamic typechecking In fact in a concrete low level implementation of the machine expansion of the CAM instructions in assembly code for example these tests would not appear They are useless since we trust the typechecker and the compiler So any execution error in a real implementa tion comes from a bug in one of the above processes and would lead to memory errors or illegal instructions usually detected by the computer s operating system 16 2 Compiling ASL programs into CAM code We give in this section a compiling function taking the abstract syntax tree of an ASL expression and producing CAM code The compilation scheme is very simple the code of a constant is Quote a variable is compiled as an access to the appropriate component of the current environment Nth the code of a conditional expression will save the current environment Push evaluate the condition part and according to the boolean value obtained select the appropriate code to execute Branch the code of an application will also save the environment on the stack Push execute the function part of the application then exchange the functional value and the saved environment Swap evaluate the argument and finally apply the functional value which is at the top of the stack to the argument held in the register with the Apply instruction the code of an abstraction simply consists in building a closur
97. nguage without conditional construct and without sum types all programs involving a recursive definition never terminate 4 3 p 31 1 rec factorial n if n 1 then 1 else n factorial n 1 factorial int gt int fun factorial 5 int 120 let tail recursive factorial n let rec fact n m if n 1 then m else fact n 1 n m in fact n 1 tail recursive factorial int gt int fun tail recursive factorial 5 int 120 4 4 p 31 let rec fibonacci n if n 1 then 1 else if n 2 then 1 else fibonacci n 1 fibonacci n 2 fibonacci int gt int fun fibonacci 20 int 6765 4 5 p 32 let compose f g function x gt f g x compose a gt gt c gt gt c gt b fun let curry f function x gt function y gt f x y curry b gt c gt gt gt c lt fun gt 126 CHAPTER 17 ANSWERS TO EXERCISES let uncurry f function x y gt f x y uncurry a gt gt c gt b gt c fun uncurry compose _a gt _ _c gt _ gt _ gt _b fun compose curry uncurry gt _b gt _ gt _ gt _b gt _ fun compose uncurry curry gt _ gt _ gt lt fun gt 5 1 p 36 let rec combine function 0 0 gt O
98. not complete because the end of the file has been reached End_of_file exn End of file The exception sys Sys error Sys error from the module sys is raised when some manip ulation of files is forbidden by the operating system open in abracadabra Uncaught exception sys Sys error abracadabra No such file or directory The functions that we have seen in this chapter are sufficient for our needs Many more exist useful mainly when working with files and are described in 21 Exercises Exercise 9 1 Define a function copy file taking two filenames of type string as arguments and copying the contents of the first file on the second ome Error messages must be printed on std err Exercise 9 2 Define a function wc taking a filename as argument and printing on the standard output the number of characters and lines appearing in the file Error messages must be printed on std err Note it is good practice to develop a program in defining small functions A single function doing the whole work is usually harder to debug and to read With small functions one can trace them and see the arguments they are called on and the result they produce 64 CHAPTER 9 BASIC INPUT OUTPUT Chapter 10 Streams and parsers In the next part of these course notes we will implement a small functional language Parsing valid programs of this language requires writing a lexical analyzer and a parser for the language For the pu
99. nstraints between all occurrences of a variable bound by a A In an effective implementation a unification error would occur 15 8 THE ASL TYPE SYNTHESIZER 105 We may notice from the example above that the algorithm is syntaz directed since for a given expression a type deduction for that expression uses exactly one rule per sub expression the deduction possesses the same structure as the expression We can thus reconstruct the ASL expression from its type deduction tree From the deduction tree above if we write upper rules as being arguments of the ones below and if we annotate the applications of the INST and ABS rules by the name of the subject variable we obtain LET ABS INST APP INST INST This is an illustration of the types as propositions and programs as proofs paradigm also known as the Curry Howard isomorphism cf 14 5 In this example we can see the type of the considered expression as a proposition and the expression itself as the proof and indeed we recognize the expression as the deduction tree 15 3 The ASL type synthesizer We now implement the set of inference rules given above We need e a Caml representation of ASL types and type schemes e management of type environments e a unification procedure e a typing algorithm 15 3 1 Representation of ASL types and type schemes We first need to define a Caml type for our ASL type data structure type asl t
100. of digit char gt int fun let rec integer n function lt gt 0 9 as c integer 10 n int of digit c r gt gt gt gt INT n integer int gt char stream gt token lt fun gt We restrict ASL identifiers to be composed of lowercase letters the eight first being significative An explanation about the ident function can be found in the chapter dedicated to the answers to exercises chapter 17 The function given here is slightly different and tests its result in order to see wether it is a keyword let be or not let ident buf make string 8 ident buf string let rec ident len function CHAPTER 12 ASL A SMALL LANGUAGE e e lt af z as if len gt 8 then ident len else begin set nth char ident buf len c ident succ len end s gt gt s lt gt gt match sub string ident buf O len with let gt LET be gt BE if gt IF then gt THEN else gt ELSE fi gt FI s gt IDENT s ident int gt char stream gt token lt fun gt dk dt dt dt db dt dt OF A reasonable lexical analyzer would use a hash table to recognize keywords faster Primitive operations are recognized by the following function which also detects illegal operators and ends of input let oper function lt gt as c gt gt IDENT make string 1 c lt c gt gt pre
101. olume 6 of American Elsevier B Meltzer and D Mitchie Eds 1971 33 R Sedgewick Algorithms Addison Wesley 1988 34 D Turner SASL language manual Technical report St Andrews University 1976 35 D Turner Recursion equations as a programming language pages 1 28 Cambridge University Press 1982 36 D Turner Miranda a non strict functional language with polymorphic types In Func tional Programming Languages and Computer Architecture number 201 in Lecture Notes in Computer Science pages 1 16 Springer Verlag 1985 37 P Weis M V Aponte A Laville M Mauny and A Su rez The CAML reference manual Technical Report 121 INRIA 1990 38 P Weis and X Leroy Le langage Caml Inter ditions 1993
102. omes trivial These techniques are described in 2 and consist in adding a non terminal for each precedence level and removing left recursion We obtain Mult Mult Expr Mult Expr Expr Mult Atom Atom Mult Atom Mult Atom INT Expr While removing left recursion we forgot about left associativity of operators This is not a problem as long as we build correct abstract trees Since stream matching bases its choices on the first component of stream patterns we cannot see the grammar above as a parser We need a further transformation factoring common prefixes of grammar rules left factor We obtain Expr Mult RestExpr RestExpr Mult RestExpr Mult RestExpr nothing Mult Atom RestMult RestMult Atom RestMult Atom RestMult nothing 72 CHAPTER 10 STREAMS AND PARSERS Atom INT Expr Now we can see this grammar as a parser note that the order of rules becomes important and empty productions must appear last The shape of the parser is let rec expr let rec restexpr function lt PLUS mult restexpr gt gt lt MINUS mult restexpr gt gt lt gt gt in function lt mult e1 restexpr gt gt and mult let rec restmult function lt TIMES atom restmult gt gt lt DIV atom restmult gt gt lt gt gt in function lt atom
103. onstructor which is known to be polymorphic It may then be possible to build an ill typed Caml value true 1 2 since the typechecker does not possess any further type information than a list The problem is thus the absence of static connection from exceptions that are raised and the occurrences where they are trapped Another example would be the one of a function raising Exc with an integer or a boolean value depending on some condition Then in that case when trying to trap these exceptions we cannot decide wether they will hold integers or boolean values Exercises Exercise 8 1 Define the function ind succeed which given a function f and a list L returns the first element of L on which the application of succeeds Exercise 8 2 Define the function map succeed which given a function f and a list L returns the list of the results of successful applications of to elements of L Chapter 9 Basic input output We describe in this chapter the Caml Light input output model and some of its primitive operations More complete information about IO can be found in the Caml Light manual 21 Caml Light has an imperative input output model an IO operation should be considered as a side effect and is thus dependent on the order of evaluation IOs are performed onto channels with types in channel and out channel These types are abstract i e their representation is not accessible Three channels are predefined std_in
104. ool pair but is used with type a b Actually any two type definitions produce different types If we enter again the previous definition type a b pair Fst a Snd b Type pair defined we cannot any longer extract the Fst component of x p Fst Toplevel input gt p Fst xn This expression has type int bool pair but is used with type a b pair since the label Fst refers to the latter type pair that we defined while p s type is the former pair 6 2 Sum types A sum type is the finite labeled disjoint union of several types A sum type definition defines a type as being the union of some other types 40 CHAPTER 6 USER DEFINED TYPES 6 2 1 Defining sum types Example we want to have a type called identification whose values can be e cither strings name of an individual e or integers encoding of social security number as a pair of integers We then need a type containing both int int and character strings We also want to preserve static type checking we thus want to be able to distinguish pairs from character strings even if they are injected in order to form a single type Here is what we would do type identification Name of string SS of int int Type identification defined The type identification is the labeled disjoint union of string and int int The labels Name and SS are injections They respectively inject int int and string into a single ty
105. pe identification Let us use these injections in order to build new values let idi Name Jean idi identification Name Jean 1 142 SS 1670728 280305 id2 identification SS 1670728 280305 Values idi and id2 belong to the same type They may for example be put into lists as in id1 id2 identification list Name Jean SS 1670728 280305 Injections may possess one argument as in the example above or none The latter case corresponds to enumerated types as in Pascal An example of enumerated type is type suit Heart Diamond Club Spade Type suit defined Club suit Club The type suit contains only 4 distinct elements Let us continue this example by defining a type for cards In Caml Light there is no implicit order on values of sum types 6 2 SUM TYPES 41 type card Ace of suit King of suit Queen of suit Jack of suit Plain of suit int Type card defined The type card is the disjoint union of e suit under the injection Ace e suit under the injection King e suit under the injection Queen e suit under the injection Jack e the product of int and suit under the injection Plain Let us build a list of cards let figures of c Ace c King c Queen c Jack c and small cards of c map function n gt Plain c n 7 8 9 10 figures of suit gt card list lt fun gt small cards of
106. pen in filename in try while true do let c input char in chan in counter incr chars if c n then counter incr lines done with End of file gt close in in chan for 1 to vect length sys command line 1 do count file sys command line i done print int counter read chars print string characters print int counter read lines print string lines Mn exit 0 Chapter 18 Conclusions and further reading We have not been exhaustive in the description of the Caml Light features We only introduced general concepts in functional programming and we have insisted on the features used in the prototyping of ASL a tiny model of Caml Light typing and semantics The reference manual 21 provides an exhaustive description of the Caml Light language its libraries commands and extensions Those who read French are referred to 38 a progressive but thorough introduction to pro gramming in Caml Light with many interesting examples and to 19 the French edition of the Caml Light reference manual Description about Caml Strong and useful information about programming in Caml can be found in 9 and 37 An introduction to lambda calculus and type systems can be found in 17 12 and 4 The description of the implementation of call by value functional programming languages can be found in 20 The implementation of lazy functional languages is described in 29 translated in French as
107. pes such as char that have a finite number of constant elements it may be useful to use or patterns gathering constants in the same matching rule let is_vowel function faf fet Egs u gt true 4 _ gt false is_vowel char gt bool lt fun gt The first rule is chosen if the argument matches one of the cases Since there is a total ordering on characters the syntax of character patterns is enriched with a notation let is_lower_case_letter function af z gt true _ gt false is_lower_case_letter char gt bool lt fun gt Of course or patterns and this notation can be mixed as in let is_letter function af z A Z true 4 gt false is letter char gt bool fun In the next sections we give the most commonly used IO primitives on these printable types A complete listing of predefined IO operations is given in 21 9 2 OUTPUT 61 9 2 Output Printing on standard output is performed by the following functions print_char char gt unit fun print_string string gt unit lt fun gt print_int int gt unit fun print_float float gt unit fun Printing is buffered i e the effect of a call to a printing function may not be seen immediately flushing explicitly the output buffer is sometimes required unless a printing function flushes it implicitly F
108. piling e loading e executing e printing the result of execution Let us consider another example the application of the successor function to 2 3 The expression function x gt x 1 should be read as the function that given x returns x 1 The juxtapo sition of this expression to 2 3 is function application function x gt x 1 2 3 int 6 There are several ways to reduce that value before obtaining the result 6 Here are two of them the expression being reduced at each step is underlined function x gt x 1 2 3 function x gt x 1 2 3 l l function x gt x 1 5 243 1 l l 5 1 5 1 l l 6 6 3 3 TYPES 15 The transformations used by Caml during evaluation cannot be described in this chapter since they imply knowledge about compilation of Caml programs and machine representation of Caml values However since the basic control structure of Caml is function application it is quite easy to give an idea of the transformations involved in the Caml evaluation process by using the kind of rewriting we used in the last example The evaluation of the well typed application e e2 where 1 and eg denote arbitrary expressions consists in the following steps e Evaluate e obtaining its value v e Evaluate e until it becomes a functional value Because of the well typing hypothesis must represent a function from some type t to some tg and the type of v is tj We will w
109. plevel declarations as in let successor function x gt x 1 successor int gt int fun let square function x gt x x Square int gt int fun When a value has been declared at toplevel it is of course available during the rest of the session square successor 3 int 16 square int gt int fun When declaring a functional value there are some alternative syntaxes available For example we could have declared the square function by let square Square int gt int fun or closer to the mathematical notation by let square x x x Square int gt int fun All these definitions are equivalent 3 6 PARTIAL APPLICATIONS 19 3 6 Partial applications A partial application is the application of a function to some but not all of its arguments Consider for example the function f defined by let f x function y gt 2 xty f int gt int gt int fun Now the expression 3 denotes the function which when given an argument y returns the value of 2 3 y The application f x is called a partial application since f waits for two successive arguments and is applied to only one The value of f x is still a function We may thus define an addition function by let addition x function y gt addition int gt int gt int fun This can also be written let addition x y addition
110. r types must be given by hand The interface is contained in a file whose name is the module name with extension mli Here is for instance an interface for the counter module that abstracts the type counter counter mli type counter an abstract type value new int gt counter and incr counter gt unit and read counter gt int 11 3 ABSTRACTION 83 Interfaces must be compiled separately However once the interface for module A has been compiled any module B that uses A can be immediately compiled even if the implementation of A is not yet compiled or even not yet written Consider camlc camlc camlc camlc counter mli main ml counter ml main counter zo main zo The implementation main ml could be compiled before counter ml The only requirement for compiling main ml is the existence of counter zi the compiled interface of the counter module Exercises Exercise 11 1 Complete the count command it should be able to operate on several files given on the command line Hint sys command line is an array of strings containing the command line arguments to the process 84 CHAPTER 11 STANDALONE PROGRAMS AND SEPARATE COMPILATION Part III A complete example Chapter 12 ASL A Small Language We present in this chapter a simple language ASL A Small Language This language is basically the A calculus the purely functional kernel of Caml enriched with a conditional construct Th
111. riables occur ring in ASL types The following function computes the list of type variables of its argument 108 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS let vars of type tau let rec vars vs function Number gt vs TypeVar Index n Value Unknown if mem n vs then vs else n vs TypeVar Index _ Value t gt vars vs t Arrow t1 t2 gt vars vars vs t1 t2 Unknown gt raise TypingBug vars_of_type in vars tau vars of type asl type gt int list fun The unknowns of type bv t application returns the list of variables occurring in t that do not appear in bv The subtract function returns the difference of two lists let unknowns of type bv t subtract vars_of_type t bv unknowns_of_type int list asl_type gt int list lt fun gt We need to compute the list of unknowns of a type environment for the generalization process un knowns belonging to that list cannot become generic The set of unknowns of a type environment is the union of the unknowns of each type The flat function flattens a list of lists let flat it list prefix 0 flat a list list gt a list fun let unknowns of type env env flat map function Forall gv t gt unknowns_of_type gv t env unknowns_of_type_env asl_type_scheme list gt int list lt fun gt The generalization of a type is relative to a typing environment The m
112. ring is match result with Constval n gt print int n Funval f gt print string lt fun gt print newline semantics top asl gt unit fun 13 3 Examples semantics parse top let f be x x 1 ASL Value of f is lt fun gt unit semantics parse top let i be x x ASL Value of i is fun unit Q semantics parse top let x be i f 2 ASL Value of x is 3 unit semantics parse top let y be if x then x x else 2 fi 0 ASL Value of y is O unit 96 CHAPTER 13 UNTYPED SEMANTICS OF ASL PROGRAMS Chapter 14 Encoding recursion 14 1 Fixpoint combinators We have seen that we do not have recursion in ASL However it is possible to encode recursion by defining a fixpoint combinator A fixpoint combinator is a function F such that F M is equivalent to M F M modulo the evaluation rules for any expression M A consequence of the equivalence given above is that fixpoint combinators can encode recursion Let us note M N if expressions M and N are equivalent modulo the evaluation rules Then consider ffact to be the functional obtained from the body of the factorial function by abstracting i e using as a parameter the fact identifier and fix an arbitrary fixpoint combinator We have ffact is Mn if n 0 then 1 else n fact n 1 fi Now let us consider the expression E f
113. ring 1 char of int 964r in if q 0 then s else name of q s in name of n dk dk 5 tvar name int gt string fun Then a printing function for type schemes let print type scheme Forall gv t Prints a type scheme Fails when it encounters an unknown let names let rec names_of function aD gt 1 1 gt tvar name n names_of 1 Lv in names of 1 gv in let tvar names combine rev gv names in let rec print rec function TypeVar Index n Value Unknown gt let name try assoc n tvar_names with Not_found gt raise TypingBug Non generic variable in print_string name TypeVar Index _ Value t gt print rec t Number gt print string Number Arrow ti t2 gt print_string print_rec t1 print string gt print rec t2 print string dk dt dk dt dt dt db 15 3 THE ASL TYPE SYNTHESIZER 111 Unknown gt raise TypingBug print_type_scheme in print_rec t Toplevel input 0 vi Lv gt tvar name n names of 1 Lv PA Warning the variable Lv starts with an upper case letter in this pattern print type scheme asl type scheme gt unit fun Now the main function which resets the type variables counter calls the type synthesizer traps ASL type clashes and prints the resulting types At the end the global environments are updated let typing De
114. rite function x gt e for the result of the evaluation of It denotes the mathematical object usually written as f thot x e where of course e may depend on zx N B We do not evaluate e before we know the value of z e Evaluate e where v has been substituted for all occurrences of x We then get the final value of the original expression The result is of type t In the previous example we evaluate e 2 3 to 5 e function x gt 1 to itself it is already a function body e x 1 where 5 is substituted for x i e evaluate 5 1 getting 6 as result It should be noticed that Caml uses call by value arguments are evaluated before being passed to functions The relative evaluation order of the functional expression and of the argument expression is implementation dependent and should not be relied upon The Caml Light implementation evaluates arguments before functions right to left evaluation as shown above the original Caml implementation evaluates functions before arguments left to right evaluation 3 9 Types Types and their checking synthesis are crucial to functional programming they provide an indica tion about the correctness of programs The universe of Caml values is partitioned into types A type represents a collection of values For example int is the type of integer numbers and float is the type of floating point numbers Truth values belong to the bool type Character strings belong to th
115. rol to the rest of the program Furthermore lazy evaluation of streams allows for the manipulation of infinite streams As an example we can build the infinite stream of integers using side effects to show precisely when computations occur let rec ints from n lt print int n print char flush std out n ints from 1 gt ints from int gt int stream fun let ints ints from 0 ints int stream lt abstr gt We notice that no printing occurred and that the program terminates this shows that none of the elements have been evaluated and that the infinite stream has not been built We will see in the next section that these side effects will occur on demand i e when tests will be needed by a matching function on streams 10 2 STREAM MATCHING AND PARSERS 67 10 2 Stream matching and parsers The syntax for building streams can be used for pattern matching over them However stream matching is more complex than the usual pattern matching 10 2 1 Stream matching is destructive Let us start with a simple example 1 next function lt x gt gt x next a stream gt a fun The next function returns the first element of its stream argument and fails if the stream is empty let s lt 0 1 2 gt S int stream lt abstr gt next int 0 next int 1 next 5 int 2 next Uncaught exception Parse_failur
116. rpose of writing easily such programs Caml Light provides a special data structure streams Their main usage is to be interfaced to input channels or strings and to be matched against stream patterns 10 1 Streams Streams belong to an abstract data type their actual representation remains hidden from the user However it is still possible to build streams either by hand or by using some predefined functions 10 1 1 The stream type The type stream is a parameterized type One can build streams of integers of characters or of any other type Streams receive a special syntax looking like the one for lists The empty stream is written lt gt _a stream lt abstr gt 66 9 99 A non empty stream possesses elements that are written preceded by the quote character lt 70 1 2 gt int stream lt abstr gt 66 9 99 Elements that are not preceded by are substreams that are expanded in the enclosing stream lt 0 1 25 3 3 gt int stream lt abstr gt let s lt abc gt in lt s def gt string stream lt abstr gt Thus stream concatenation can be defined as 65 66 CHAPTER 10 STREAMS AND PARSERS let stream concat s t lt s t gt Stream concat a stream gt a stream gt a stream fun Building streams in this way can be useful while testing a parsing function or defining a lexical analyzer taking as argument a
117. rr string Illegal character prerr_endline char_for_read c raise Failure ASL parsing lt gt gt prerr_endline Unexpected end of input raise Failure ASL parsing oper char stream gt token lt fun gt The lexical analyzer has the same structure as the one given in chapter 10 except that leading blanks are skipped let rec lexer str spaces str match str with lt spaces _ gt gt lt LPAR lexer str gt 1 lt spaces _ gt gt lt RPAR lexer str gt 4 lt spaces _ gt gt lt LAMBDA lexer str gt lt spaces _ gt gt lt DOT lexer str gt lt spaces _ gt gt lt SEMIC lexer str gt lt 2 0 9 as integer int_of_digit c tok spaces _ gt gt lt tok lexer str gt a z as set nth char ident buf 0 c ident 1 tok spaces _ gt gt lt lexer str gt lt oper tok spaces _ gt gt lt tok lexer str gt 12 2 PARSING ASL PROGRAMS 91 lexer char stream gt token stream fun The lexical analyzer returns a stream of tokens that the parser will receive as argument 12 2 2 Parsing The final output of our parser will be abstract syntax trees of type asl or top asl This implies that we will detect unbound identifiers at parse time In this case we will raise the Unbound exception
118. rule will be crucial without it we could not check the judgement above TAUT TAUT TAUT x a Fx a TF l f Va o a ABS INST INST Mr Axx a a T f 0 8 8 B r f 0 8 GEN APP H Ax x Vaca gt a 6 0 LET Ot let f Axx in f 8 B This type system does not tell us how to find the best type for an expression But what is the best type for an expression It must be such that any other possible type for that expression is more specific in other words the best type is the most general 15 2 The algorithm How do we find the most general type for an expression of our language The problem with the set of rules above is that we could instantiate and generalize types at any time introducing type schemes while the most important rules application and abstraction used only types Let us write a new set of inference rules that we will read as an algorithm close to a Prolog program Any numeric constant is of type Number NUM T Const n Number The types of identifiers are obtained by taking generic instances of type schemes appearing in the typing environment These generic instances will be types and not type schemes this restriction appears in the rule below where the type 7 is expected to be a generic instance of the type scheme As it is presented belonging to a deduction system the following rule will have to an
119. sense that it does nothing but opening and closing channels and printing possible error messages let cp f1 f2 try open_out f2 with sys Sys error msg gt Opening channels f1 first then 2 let inch try open in f1 with sys Sys error msg gt prerr endline f1 msg raise Failure cp _ gt prerr endline Unknown exception while opening f1 raise Failure cp in let outch 131 close in inch prerr endline f2 msg raise Failure cp gt close in inch prerr endline Unknown exception while opening f2 raise Failure cp in Copying and then closing try copy inch outch with End of file gt close in inch close out outch close out flushes exc gt close in inch close out outch raise exc dk dt dt db 5 cp string gt string gt unit fun Let us try cp cp etc passwd tmp foo unit Q cp tmp foo foo foo foo Permission denied Uncaught exception Failure cp The last example failed because a regular user is not allowed to write at the root of the file system 9 2 p 63 As in the previous exercise the function count performs the actual counting It works on an input channel and returns a pair of integers let count inch chars lines with End of file gt chars lines let chars ref 0 and lines ref O in try while true do le
120. string Type person defined We can build values of type person in the very same way as before let jean Name Jean Age 23 Job Student City Paris jean person Name Jean Age 23 Job Student City Paris But now the value jean may be physically modified in the fields specified to be mutable in the definition and only in these fields We can modify the field Field of an expression expri in order to assign it the value of lt expr2 gt by using the following construct lt expri gt Field lt lt expr2 gt For example if we want jean to become one year older we would write jean Age lt jean Age 1 unit QO 49 50 CHAPTER 7 MUTABLE DATA STRUCTURES Now the value jean has been modified into jean person Name Jean Age 24 Job Student City Paris We may try to change the Name of jean but we won t succeed the typecheker will not allow us to do that jean Name lt Paul Toplevel input gt jean Name lt Paul The label Name is not mutable It is of course possible to use such constructs in functions as in let get older Age n _ as p p Age lt n 1 get_older person gt unit lt fun gt In that example we named n the current Age of the argument but we also named p the argument This is an alias pattern it saves us the bother of writing let get_older p match p with Age n gt p Age lt n 1
121. sts a Ki donee ter tts eur wi Md lg BE aed ee OS 4 33 5 2 Extracting elements from lists pattern matching 34 5 3 Functions over lists oaa ee 35 6 User defined types 37 6 1 Product types e kem aL uu X Se Bl ee a RES WS 37 02 S UT OY DOS s sak Cats zt eet Ea ane de ak erar een Sena Se go ala 39 673 Summary o Pde a Be we ai ele doe la Bar eee ben s 45 Caml Light specifics 7 Mutable data structures 7 1 User defined mutable data structures 52 Save Uo a ek ee ed fads Soldi MIN uh wig uae 7 4 Loops while and for 7 5 Polymorphism and mutable data structures 8 Escaping from computations exceptions Sl EXCEPTIONS aus node m tu Dun eT eve ds 8 2 Raising an exception es 8 3 Trapping exceptions 8 4 Polymorphism and 8 9 Basic input output 9 1 Printable types esai ar a Pee RR RUE EROS A re 9 2 Output dece Y x a verti oad OR ge x 9 8 eb aec eee ee E s 9 4 Channels on files llle 10 Streams and parsers 10 l Str anis 4 le awe eed Bee S EGER 10 2 Stream matching and parsers 10 8 Parameterized parsers 10 4 Further 11 Standalone pro
122. suit gt card list fun figures of Heart card list Ace Heart King Heart Queen Heart Jack Heart small_cards_of Spade card list Plain Spade 7 Plain Spade 8 Plain Spade 9 Plain Spade 10 6 2 2 Extracting sum components Of course pattern matching is used to extract sum components In case of sum types pattern matching uses dynamic tests for this extraction The next example defines a function color re turning the name of the color of the suit argument let color function Diamond gt red Heart gt red 2 gt black s color suit gt string lt fun gt Let us count the values of cards assume we are playing belote let count trump function Ace _ gt 11 King _ gt 4 42 CHAPTER 6 USER DEFINED TYPES Queen _ gt 3 Jack c gt if c trump then 20 else 2 Plain c 10 gt 10 Plain c 9 gt if c trump then 14 else 0 gt 03 count suit gt card gt int fun 6 2 3 Recursive types Some types possess a naturally recursive structure lists for example It is also the case for tree like structures since trees have subtrees that are trees themselves Let us define a type for abstract syntax trees of a simple arithmetic language An arithmetic expression will be either a numeric constant or a variable or the addition multiplication difference or division of two expressions type arith_e
123. t b fun first p int 3 second p bool true 28 CHAPTER 4 BASIC TYPES We say that first is a generic value because it works uniformly on several kinds of arguments provided they are pairs We often confuse between generic and polymorphic saying that such value is polymorphic instead of generic 4 5 Patterns and pattern matching Patterns and pattern matching play an important role in ML languages They appear in all real programs and are strongly related to types predefined or user defined A pattern indicates the shape of a value Patterns are values with holes A single variable formal parameter is a pattern with no shape specified it matches any value When a value is matched against a pattern this is called pattern matching the pattern acts as a filter We already used patterns and pattern matching in all the functions we wrote the function body function x gt does trivial pattern matching When applied to a value the formal parameter x gets bound to that value The function body function x y gt x y does also pattern matching when applied to a value a pair because of well typing hypotheses the x and y get bound respectively to the first and the second component of that pair All these pattern matching examples were trivial ones they did not involve any test e formal parameters such as x do not impose any particular shape to the value they are supposed to ma
124. t int gt int lt fun gt let add function x gt function y gt add int gt int gt int fun These two functions differ only in their way of taking their arguments The first one will take an argument belonging to a cartesian product the second one will take a number and return a function The add function is said to be the curried version of plus in honor of the logician Haskell Curry The currying transformation may be written in Caml under the form of a higher order function let curry f function x gt function y gt f x y curry b gt c gt gt b gt c lt fun gt Its inverse function may be defined by let uncurry f function x y gt fx y uncurry a gt gt c gt b gt c fun We may check the types of curry and uncurry on particular examples uncurry prefix int int gt int fun uncurry prefix string string gt string fun curry plus int gt int gt int fun Exercises Exercise 4 1 Define functions that compute the surface area and the volume of well known geo metric objects rectangles circles spheres Exercise 4 2 What would happen in a language submitted to call by value with recursion if there was no conditional construct if then else Exercise 4 3 Define the factorial functiom such that factorial n n n 1
125. t c input_char inch in chars chars 1 if c n then lines lines 1 else done i i i 35 count in channel gt int int fun The function wc opens a channel on its filename argument calls count and prints the result let wc f let inch 132 CHAPTER 17 ANSWERS TO EXERCISES try open in f with sys Sys error msg gt prerr endline f msg raise Failure wc _ gt prerr endline Unknown exception while opening f raise Failure wc in let chars lines count inch in print int chars print string characters print int lines print string lines Wn dk dt dt db db db 5 WC String gt unit lt fun gt Counting etc passwd gives wc etc passwd 16624 characters 203 lines unit 10 1 p 76 Let us recall the definitions of the type token and of the lexical analyzer type token PLUS MINUS TIMES DIV LPAR RPAR INT of int Type token defined Spaces let rec spaces function lt t n spaces _ gt gt 1 lt gt gt spaces char stream gt unit fun Integers let int_of_digit function 0 9 as c gt int of char c int of char 0 gt raise Failure not a digit int of digit char gt int fun let rec integer n function k gt 0 9 as c integer 10 n int of digit c r
126. tch e pair patterns such as x y always match for typing reasons cartesian product is a product type However some types do not guarantee such a uniform shape to their values Consider the bool type an element of type bool is either true or false If we impose to a value of type bool to have the shape of true then pattern matching may fail Consider the following function let f function true gt false Toplevel input gt let f function true gt false DS an AAA AAA AAA Warning this matching is not exhaustive f bool gt bool fun The compiler warns us that the pattern matching may fail we did not consider the false case Thus if we apply f to a value that evaluates to true pattern matching will succeed an equality test is performed during execution 4f 1 lt 2 bool false But if f s argument evaluates to false a run time error is reported 4f 1 gt 2 Uncaught exception Match failure 1346 1368 Here is a correct definition using pattern matching 4 5 PATTERNS AND PATTERN MATCHING 29 let negate function true gt false false gt true negate bool gt bool fun The pattern matching has now two cases separated by the character Cases are examined in turn from top to bottom An equivalent definition of negate would be let negate function true gt false x gt true negate bool gt bool lt fun gt The second case now m
127. tch oriented way writing source code in a file having it compiled into an executable program and executing the program outside of the Caml Light environment Interactive use is great for learning the language and quickly testing new functions Batch use is more convenient to develop larger programs that should be usable without knowledge of Caml Light Note for Macintosh users batch compilation is not available in the standalone Caml Light application It requires the MPW environment see the Caml Light manual 11 1 Standalone programs Standalone programs are composed of a sequence of phrases contained in one or several text files Phrases are the same as at toplevel expressions value declarations type declarations exception declarations and directives When executing the stand alone program produced by the compiler all phrases are executed in order The values of expressions and declared global variables are not printed however A stand alone program has to perform input and output explicitly Here is a sample program that prints the number of characters and the number of lines of its standard input like the wc Unix utility let chars ref 0 let lines ref 0 try while true do let c input char std in in chars chars 1 if c n then lines lines 1 else done with End of file gt print int chars print string characters print int lines print string lines Mn exit O 79 80 CHA
128. tch stream elements but can be also function calls corresponding to non terminals in the grammar termi nology Functions appearing as stream pattern components are intended to match substreams of the stream argument they are called on the actual stream argument and they are followed by a pattern which should match the result of this call For example if we define a parser recognizing a non empty sequence of characters a let seq a let rec seq function lt gt seq 1 gt gt 1 lt gt gt in function lt a seq 1 gt gt a 1 seq_a char stream gt char list fun we used the recursively defined function seq inside the stream pattern of the first rule This definition should be read as e if the stream is not empty and if its first element matches a apply seq to the rest of the stream let 1 be the result of this call and return a 1 e otherwise fail raise Parse_failure and seq should be read in the same way except that since it recognizes possibly empty sequences of a it never fails Less operationally we can read it as a non empty sequence of a starts with an a and is followed by a possibly empty sequence of a Another example is the recognition of a non empty sequence of a followed by a b or a b alone let seq_a_b function lt l b gt gt 1
129. teger counters another that performs the actual counting Here is the first one counter ml counter ml type counter mutable val int let new init val init let incr c c val lt c val 1 let read c c val Here is the source for the main program in file main ml 11 22 PROGRAMS IN SEVERAL FILES 81 main ml let chars counter new 0 let lines counter new 0 try while true do let c input char std in in counter incr chars if c n then counter incr lines else done with End of file gt print int counter read chars print string characters print int counter read lines print string lines Mn exit 0 35 Notice that references to identifiers defined in module counter ml are prefixed with the name of the module counter and by __ the long dash symbol two underscore characters If we had simply entered new 0 for instance the compiler would have assumed new is an identifier declared in the current module and issued an undefined identifier error Compiling this program requires two compilation steps plus one final linking step camlc c counter ml camlc c main ml camlc o main counter zo main zo Running the program is done as before camlrun main lt counter ml The c option to camlc means compile only that is the compiler should not attempt to produce a stand alone executable program from the given file but simply an object code fi
130. ten or and amp true or false bool true 1 lt 2 amp 2 gt 1 bool true The operators amp and or are not functions They should not be seen as regular functions since they evaluate their second argument only if it is needed For example the or operator evaluates its second operand only if the first one evaluates to false The behavior of these operators may be described as follows or e2 is equivalent to if then true else e2 amp e2 is equivalent to if then eg else false Negation is predefined not true bool false The not identifier receives a special treatment from the parser the application not f x is recognized as not f x while f g x isidenticalto g x This special treatment explains why the functional value associated to not can be obtained only using the prefix keyword prefix not bool gt bool fun 4 3 Strings and characters String constants type string are written between characters double quotes Single character constants type char are written between characters backquotes The most used string oper ation is string concatenation denoted by the character Hello World string Hello World prefix string gt string gt string fun Characters are ASCII characters 6a char a 065 char A 4 4 TUPLES 27 and can be built from th
131. the patterns P selecting the first one that matches and giving control to the corresponding expression For example we can match the tree t previously defined by match t with Btree Op x Soni Son2 _ gt x Leaf 1 gt No operator 2 string e 6 2 5 Data constructors and functions One may ask What is the difference between a sum data constructor and a function At first sight they look very similar We assimilate constant data constructors such as Heart to constants Similarly in Caml Light sum data constructors with arguments also possess a functional type Ace suit gt card fun However a data constructor possesses particular properties that a general function does not possess and it is interesting to understand these differences From the mathematical point of view a sum data constructor is known to be an injection while a Caml function is a general function without further information A mathematical injection f B admits an inverse function f from its image f A C B to A From the examples above if we consider the King constructor then let king c King king suit gt card fun king is the general function associated to the King constructor and function King c gt Toplevel input gt function King c gt c Warning this matching is not exhaustive card gt suit fun is the inverse function for king It is a partial function
132. ticipate the effect of the equality constraints between types in the other rules multiple occurrences of a type metavariable when choosing the instance GenInstance c PUf e o F Var ziT INST When we read this set of inference rules as an algorithm the INST rule will be implemented by 1 taking as 7 the most general generic instance of o that is a type the rule requires 7 to be a type and not a type scheme 104 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS 2 making more specific by unification 32 when encountering equality constraints Typing a conditional requires only the test part to be of type Number and both alternatives to be of the same type 7 This is an example of equality constraint between the types of two expressions T F Number Th e9 T Th e3 7 if ej then eg else fi T COND Typing an application produces also equality constraints that are to be solved by unification LCFre rr Th erT e1 Typing an abstraction pushes a typing hypothesis for the abstracted identifier unification will make it more precise during the typing of the abstraction body T T x U fe Vt o e v Ar ABS Typing a let construct involves a generalization step we generalize as much as possible 01 04 FV r FV T P r z U i z Voi os 7 et LET This
133. tion f gt function x gt f x would be represented as Abs f Abs x App Var 2 Var 1 and should be viewed as the tree Abs ngu Abs nt App Var 2 Var 1 Var n should be read as an occurrence of the variable bound by the nth abstraction node encoun tered when going toward the root of the abstract syntax tree In our example when going from Var 2 to the root the 2nd abstraction node we encounter introduces the f variable The numbers encoding variables in abstract syntax trees of functional expressions are called De Bruijn numbers The characters that we attach to abstraction nodes simply serve as docu mentation they will not be used by any of the semantic analyses that we will perform on the trees The type of ASL abstract syntax trees is defined by type asl Const of int Var of int Cond of asl asl asl App of asl asl Abs of string asl and top asl Decl of string asl Type asl defined Type top_asl defined 12 2 Parsing ASL programs Now we come to the problem of defining a concrete syntax for ASL programs and declarations The choice of the concrete aspect of the programs is simply a matter of taste The one we choose here is close to the syntax of A calculus except that we will use the backslash character because there is on our keyboards We will use the curried versions of equality and arithmetic functions We will also use a prefiz notation
134. to files it is necessary to open and close channels on these files 9 4 1 Opening and closing channels Opening and closing is performed with the following functions open_in string gt in_channel lt fun gt open_out string gt out channel fun close_in in channel gt unit fun close_out out channel gt unit fun The open in function checks the existence of its filename argument and returns a new input channel on that file open out creates a new file or truncates it to zero length if it exists and returns an output channel on that file Both functions fail if permissions are not sufficient for reading or writing Warning e Closing functions close their channel argument Since their behavior is unspecified on already closed channels anything can happen in this case e Closing one of the standard IO channels std in std out std err have unpredictable effects 9 4 CHANNELS ON FILES 63 9 4 2 Reading or writing from to specified channels Some of the functions on standard input output have corresponding functions working on channels output_char out channel gt char gt unit fun output_string out_channel gt string gt unit lt fun gt input_char in channel gt char fun input_line in channel gt string fun 9 4 3 Failures The exception End_of_file is raised when an input operation can
135. ts Computation of the length of a list let rec length function gt 0 _ 1 gt 1 length 1 length a list gt int lt fun gt length true 1 int 2 Concatenating two lists let rec append function 12 gt 12 x 11 12 gt x append 11 12 append a list a list gt a list fun The append function is already defined in Caml and bound to the infix identifier 1 2 e 3 4 int list 1 2 3 4 Reversing a list let rec rev function gt x 1 gt rev 1 x rev list gt a list fun rev 1 2 3 int list 3 2 1 The map function applies a function on all the elements of a list and return the list of the function results It demonstrates full functionality it takes a function as argument list processing and polymorphism note the sharing of type variables between the arguments of map in its type let rec map f function gt x 1 gt f x map f 1 map a gt b gt a list gt b list fun map function x gt x 1 1 2 3 4 5 int list 2 3 4 5 6 map length 1 2 3 4 5 6 1 int list 3 2 1 O 36 CHAPTER 5 LISTS The following function is a list iterator It takes a function f a base element a and a list 11 Zn It computes it li et f a 1 f a x1 2
136. ty may be used on values of any type what is its type Equality takes two arguments of the same type whatever type it is and returns a boolean value The type of equality is a polymorphic type ie may take several possible forms Indeed when testing equality of two numbers then its type is int gt int gt bool and when testing equality of strings its type is string gt string gt bool prefix a gt a gt bool fun The type of equality uses type variables written a b etc Any type can be substituted to type variables in order to produces specific instances of types For example substituting int for a above produces int gt int gt bool which is the type of the equality function used on integer values 1 2 bool false 1 2 2 1 bool false 1 1 2 Toplevel input gt 1 1 2 5 AA This expression has type int int but is used with type int 4 2 2 Conditional Conditional expressions are of the form if eig then else e2 where eqs is an expression of type bool and e2 are expressions possessing the same type As an example the logical negation could be written let negate a if a then false else true negate bool gt bool fun negate 1 2 bool true 26 CHAPTER 4 BASIC TYPES 4 2 3 Logical operators The classical logical operators are available in Caml Disjunction and conjunction are respectively writ
137. typing parse top let f be x x x ASL Type of f is 7a gt a unit Q typing parse top x x 1 ASL Type clash between Number and a gt a Uncaught exception Failure ASL typing 15 3 9 Typing and recursion The Z fixpoint combinator does not have a type in Milner s type system typing parse_top let fix be f x f z x x z 2 x x z ASL Type clash between a and a gt b Uncaught exception Failure ASL typing This is because we try to apply x to itself and the type of x is not polymorphic In fact no fixpoint combinator is typable in ASL This is why we need a special primitive or syntactic construct in order to express recursivity If we want to assign types to recursive programs we have to predefine the Z fixpoint combinator Its type scheme should be Va a a because we take fixpoints of functions 4global env fix init env 4global typing env Forall 1 Arrow Arrow TypeVar Index 1 Value Unknown TypeVar Index 1 Value Unknown TypeVar Index 1 Value Unknown init typing env unit dk dt dt 15 3 THE ASL TYPE SYNTHESIZER 113 We can now define our favorite functions as typing parse top let fact be fix f n if n 0 then 1 else n f n 1 fi ASL Type of fact is Number gt Number unit typing parse top fact 8 ASL
138. ual mathematical notion of function In mathematics functions are first class objects they can be arbitrarily manipulated For example they can be composed and the composition function is itself a function In mathematics one would present the successor function in the following way successor N N n n 1 The functional composition could be presented as o A C A C B fig z 9 2 where A gt B denotes the space of functions from A to B We remark here the importance of 1 the notion of type a mathematical function always possesses a domain and a codomain They will correspond to the programming notion of type 2 lexical binding when we wrote the mathematical definition of successor we have assumed that the addition function had been previously defined mapping a pair of natural numbers to a natural number the meaning of the successor function is defined using the meaning of the addition whatever denotes in the future this successor function will remain the same 3 the notion of functional abstraction allowing to express the behavior of f o g as x f 9 x i e the function which when given some 2 returns f g x ML dialects cf below respect these notions But they also allow non functional programming styles and in this sense they are functional but not purely functional 9 10 CHAPTER 2 FUNCTIONAL LANGUAGES 2 1 History of functional languages Som
139. unction gt a x 1 gt f x list it f a 1 list it a gt b gt b gt b gt a list gt fun As an example the application of it_list to the functional composition and to its neutral element the identity function computes the composition of lists of functions try it Define using the same method a canonical iterator over binary trees Part II Caml Light specifics 47 Chapter 7 Mutable data structures The definition of a sum or product type may be annotated to allow physical destructive update on data structures of that type This is the main feature of the imperative programming style Writing values into memory locations is the fundamental mechanism of imperative languages such as Pascal The Lisp language while mostly functional also provides the dangerous functions rplaca and rplacd to physically modify lists Mutable structures are required to implement many efficient algorithms They are also very convenient to represent the current state of a state machine 7 1 User defined mutable data structures Assume we want to define a type person as in the previous chapter Then it seems natural to allow a person to change his her age job and the city that person lives in but not his her name We can do this by annotating some labels in the type definition of person by the mutable keyword type person Name string mutable Age int mutable Job string mutable City
140. unter directive the identifier new automatically resolves to counters new If two modules say mod1 and mod2 define both a global value f then f in a client module client resolves to modi f if modi is opened but not mod2 or if modi has been opened more recently than mod2 Otherwise it resolves to mod2__f For instance the two system modules int and float both define the infix identifier Both modules int and float are opened by default but int comes first Hence x y is understood as the integer addition since is resolved to int But after the directive open float module float comes first and the identifier is resolved to float 11 3 Abstraction Some globals defined in a module are not intended to be used outside of this module Then it is good programming style not to export them outside of the module so that the compiler can check they are not used in another module Also one may wish to export a data type abstractly that is without publicizing the structure of the type This ensures that other modules cannot build or inspect objects of that type without going through one of the functions on that type exported in the defining module This helps in writing clean well structured programs The way to do that in Caml Light is to write an explicit interface or output signature specifying those identifiers that are visible from the outside All other identifiers will remain local to the module For global values thei
141. whose first element is the current instruction being executed e and a stack represented as a list of code addresses instruction lists values and environments The first Caml type that we need is the type for CAM instructions We will study later the effect of each instruction type instruction Quote of int Integer constants 115 116 CHAPTER 16 COMPILING ASL TO AN ABSTRACT MACHINE CODE Plus Minus Arithmetic operations Divide Equal Times 4 Nth of int Variable accesses Branch of instruction list instruction list Conditional execution Push Pushes onto the stack Swap Exch top of stack and register Clos of instruction list Builds a closure with the current environment Apply Function application 5 Type instruction defined We need a new type for semantic values since instruction lists have now replaced abstract syntax trees The semantic values are merged in a type object The type object behaves as data in a computer memory we need higher level information such as type information in order to interpret them Furthermore some data do not correspond to anything for example an environment composed of environments represents neither an ASL value nor an intermediate data in a legal computation process type object Constant of int Closure of object object Address of instruction list Environment of object list Typ
142. xn gt a fun raise Found 5 Uncaught exception Found 5 Here is a more realistic example let find_index p let rec find n function gt raise Failure not found x L gt if p x then raise Found n else find n 1 L ain find 1 Toplevel input gt x L gt if p x then raise Found n ss Warning the variable L starts with an upper case letter in this pattern find index a gt bool gt a list gt b fun The find_index function always fails It raises e Found n if there is an element x of the list such that p x in this case n is the index of x in the list e the Failure exception if no such x has been found Raising exceptions is more than an error mechanism it is a programmable control structure In the find_index example there was no error when raising the Found exception we only wanted to quickly escape from the computation since we found what we were looking for This is why it must be possible to trap exceptions we want to trap possible errors but we also want to get our result in the case of the find_index function 8 3 TRAPPING EXCEPTIONS 57 8 3 Trapping exceptions Trapping exceptions is achieved by the following construct try expression with match cases This construct evaluates expression If no exception is raised during the evaluation then the result of the try construct is the result of expression If an exceptio
143. xpr Const of int Var of string Plus of args Mult of args Minus of args Div of args and args Argl arith_expr Arg2 arith_expr Type arith_expr defined Type args defined dk dt The two types arith expr and args are simultaneously defined and arith_expr is recursive since its definition refers to args which itself refers to arith expr As an example one could represent the abstract syntax tree of the arithmetic expression x 3 y as the Caml value Plus Argi Var x Arg2 Mult Argi Const 3 Arg2 Var y arith_expr Plus Argi Var x Arg2 Mult Arg1 Const 3 Arg2 Var y The recursive definition of types may lead to types such that it is hard or impossible to build values of these types For example type stupid Next stupid Type stupid defined Elements of this type are infinite data structures Essentially the only way to construct one is let rec stupid value Next stupid_value stupid_value stupid Next Syntax trees are said to be abstract because they are pieces of abstract syntax contrasting with concrete syntax which is the string form of programs analyzing parsing concrete syntax usually produces abstract syntax 6 2 SUM TYPES 43 Next Next Next Next Next Next Next Next Next Next Next Next Next Next Next Recursive type definitions should be well founde
144. ype Unknown Number TypeVar of vartype Arrow of asl_type asl_type and vartype Index int mutable Value asl_type and asl_type_scheme Forall of int list asl_type Type asl_type defined Type vartype defined Type asl_type_scheme defined The Unknown ASL type is not really a type it is the initial value of fresh ASL type variables We will consider as abnormal a situation where Unknown appears in place of a regular ASL type In such situations we will raise the following exception exception TypingBug of string Exception TypingBug defined 106 CHAPTER 15 STATIC TYPING POLYMORPHISM AND TYPE SYNTHESIS Type variables are allocated by the new vartype function and their global counter a local refer ence is reset by reset vartypes let new vartype reset vartypes Generating and resetting unknowns let counter ref 0 in function gt counter counter 1 Index counter Value Unknown function gt counter 0 new vartype unit gt vartype fun reset vartypes unit gt unit fun 15 3 2 Destructive unification of ASL types We will need to shorten type variables since they are indirections to ASL types we need to follow these indirections in order to obtain the type that they represent For the sake of efficiency we take advantage of this operation to replace multiple indirections by single indirections shortening let rec

Download Pdf Manuals

image

Related Search

Related Contents

Samsung 172S Instrukcja obsługi    GB OPERATING INSTRUCTIONS - Stanmech Technologies Inc.  LOREX Technology L174P-81 User's Manual  Betriebsanleitung - LBV 310 -  取扱説明書 棚  MANUAL DE INSTALACIÓN DEL KIT ACA  Napoleon Fireplaces GDS 50-N User's Manual  ControlNet-to-Foundation Fieldbus, H1 Linking Device  

Copyright © All rights reserved.
Failed to retrieve file