
Appendix A  First steps in OCaml

Let us first check the “Hello world” program. Use the editor to create a file hello.ml containing the following single line:

     
print_string "Hello world!\n";;

Then, compile and execute the program as follows:

     
ocamlc -o hello hello.ml
./hello
     
Hello world!

Alternatively, the same program could have been typed interactively, using the interpreter ocaml as a big desk calculator, as shown in the following session:

     
ocaml
     
Objective Caml version 3.00
#
     
print_string "hello world!\n";;
     
hello world!
- : unit = ()

To end an interactive session, type ^D (Control-D) or call the exit function of type int -> unit:

     
exit 0;;

Note that the exit function also terminates execution in a compiled program. Its integer argument is the return code of the program (or of the interpreter).

Exercise 33 ((*) Unix commands true and false)   Write the Unix commands true and false that do nothing but return the codes 0 and 1, respectively.

The interpreter can also be used in batch mode, for running scripts. The name of the file containing the code to be interpreted is passed as an argument on the command line of the interpreter:

     
ocaml hello.ml
     
Hello world!

Note the difference between the previous command and the following one:

     
ocaml < hello.ml
     
Objective Caml version 3.00
# Hello world!
- : unit = ()
#

The latter is a “batch” interactive session where the input commands are taken from the file hello.ml, while the former is a script execution, where the commands of the file are evaluated in script mode, which turns off interactive messages.

Phrases

of the core language are summarized in the table below:

– value definition                             let x = e
– [mutually recursive] function definition[s]  let [ rec ] f1 x1 ... = e1 ...
                                                 [ and fn xn ... = en ]
– type definition[s]                           type q1 = t1 ... [ and qn = tn ]
– expression                                   e

Phrases (optionally) end with “;;”.
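
As an illustration of the table above, here is a small sketch with one phrase of each kind. The names x, even, odd, tree, forest, and t are made up for this example and do not come from the text.

```ocaml
(* value definition *)
let x = 21;;

(* mutually recursive function definitions *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1);;

(* mutually recursive type definitions *)
type tree = Leaf | Node of forest
and forest = tree list;;

(* another value definition, using the types just declared *)
let t = Node [Leaf; Leaf];;

(* expression *)
print_int (2 * x); print_newline ();;
```

The same phrases can be typed one at a time in the interpreter or saved in a file and compiled with ocamlc.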

     
(* That is a comment
   (* and this is a comment inside a comment *)
   continuing on several lines *)

Note that an opening comment paren “(*” will absorb everything as part of the comment until a well-balanced closing comment paren “*)” is found. Thus, if you inadvertently type the opening comment paren, you may think that the interpreter is broken because it swallows all your input without ever sending any output but the prompt.

Use ^C (Control-C) to interrupt the evaluation of the current phrase and return to the toplevel if you ever fall into this trap!

Typing ^C can also be used to stop a never-ending computation. For instance, try the infinite loop

     
while true do () done;;

and observe that there is no answer. Then type ^C. The input is taken into account immediately (with no trailing carriage return) and produces the following message:

     
^C
     
Interrupted.

Expressions

are

– local definition       let x = e1 in e2
                         (+ mutually recursive local function definitions)
– anonymous function     fun x1 ... xn -> e
– function call          f x1 ... xn
– variable               x   (M.x if x is defined in M)
– constructed value      (e1, e2)
  including constants    1, 'c', "aa"
– case analysis          match e with p1 -> e1 … | pn -> en
– handling exceptions    try e with p1 -> e1 … | pn -> en
– raising exceptions     raise e
– for loop               for i = e0 [down]to ef do e done
– while loop             while e0 do e done
– conditional            if e1 then e2 else e3
– sequence               e; e'
– parenthesis            (e) or begin e end
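
The following sketch exercises several of the forms listed above. The exception Negative and the functions sum_to, safe_sum, factorial, and count_digits are hypothetical names chosen for illustration. The example also relies on references (ref, !, :=), which this appendix does not introduce.

```ocaml
exception Negative of int

(* conditional, raising an exception, case analysis *)
let rec sum_to n =
  if n < 0 then raise (Negative n)
  else
    match n with
    | 0 -> 0
    | k -> k + sum_to (k - 1)

(* handling exceptions *)
let safe_sum n = try sum_to n with Negative _ -> 0

(* local definition, for loop, sequence: the loop body has type unit *)
let factorial n =
  let acc = ref 1 in
  for i = 1 to n do acc := !acc * i done;
  !acc

(* while loop whose body is a parenthesized sequence *)
let count_digits n =
  let k = ref (abs n) and d = ref 1 in
  while !k >= 10 do
    begin k := !k / 10; d := !d + 1 end
  done;
  !d
```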

Note that there is no notion of instruction or procedure, since all expressions must return a value. The unit value () of type unit conveys no information: it is the unique value of its type.

The expression e in for and while loops and in sequences must be of type unit (otherwise, a warning message is printed).

Therefore, useless results must explicitly be thrown away. This can be achieved either by using the ignore primitive or an anonymous binding.

     
ignore;;
     
- : 'a -> unit = <fun>
     
ignore 1; 2;;
     
- : int = 2
     
let _ = 1 in 2;;
     
- : int = 2

(The anonymous variable _ used in the last phrase could be replaced by any regular variable that does not appear in the body of the let.)

Basic types, constants, and primitives

are described in the following table.

Type     Constants                  Operations
unit     ()                         no operation!
bool     true   false               &&   ||   not
char     'a'   '\n'   '\097'        Char.code   Char.chr
int      1   2   3                  +   -   *   /   max_int
float    1.0   2.   3.14   6e23     +.   -.   *.   /.   cos
string   "a\tb\010c\n"              ^   s.[i]   s.[i] <- c

Polymorphic types and operations
arrays   [| 0; 1; 2; 3 |]           t.(i)   t.(i) <- v
pairs    (1, 2)                     fst   snd
tuples   (1, 2, 3, 4)               Use pattern matching!

Infixes become prefixes when put between parentheses.

For instance, ( + ) x1 x2 is equivalent to x1 + x2. Here, it is good practice to leave a space between the operator and the parentheses, so as not to fall into the usual trap: the expression “(*)” would not mean the product used as a prefix, but an unbalanced comment starting with the character “)” and waiting for its closing comment paren “*)”.
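
A short sketch of prefix operators in use. The names three, product, double, and greeting are hypothetical, and List.fold_left comes from the standard library module List, which this appendix does not describe.

```ocaml
(* an infix operator used as an ordinary function *)
let three = ( + ) 1 2

(* passing an operator as an argument; note the spaces around the star *)
let product = List.fold_left ( * ) 1 [1; 2; 3; 4]

(* partial application of a prefix operator *)
let double = ( * ) 2

(* string concatenation used as a prefix *)
let greeting = ( ^ ) "Hello " "world!"
```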

Array

operations are polymorphic, but arrays are homogeneous:

     
[| 0; 1; 3 |];;
     
- : int array = [|0; 1; 3|]
     
[| true; false |];;
     
- : bool array = [|true; false|]

Array indices vary from 0 to n−1 where n is the array size.

Array projections are polymorphic: they operate on any kind of array:

     
fun x -> x.(0);;
     
- : 'a array -> 'a = <fun>
     
fun t k x -> t.(k) <- x;;
     
- : 'a array -> int -> 'a -> unit = <fun>

Arrays must always be initialized:

     
Array.create;;
     
- : int -> 'a -> 'a array = <fun>

The type of the initial element becomes the type of the array.
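
A minimal sketch of array creation, update, and access. It uses Array.make, on the assumption of a recent OCaml release where that is the usual name for a function of the same type as Array.create shown above. The names t, n, last, and out_of_bounds are made up for the example.

```ocaml
(* an int array of size 3, every cell initialized to 0 *)
let t = Array.make 3 0

(* in-place update of the cell at index 1 *)
let () = t.(1) <- 5

(* valid indices range from 0 to n - 1 *)
let n = Array.length t
let last = t.(n - 1)

(* reading outside that range raises Invalid_argument *)
let out_of_bounds =
  try ignore t.(n); false with Invalid_argument _ -> true
```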

Tuples

are heterogeneous; however, their arity is fixed by their type: a pair (1, 2) of type int * int and a triple (1, 2, 3) of type int * int * int are incompatible.

The projections are polymorphic but are defined only for a fixed arity. For instance, fun (x, y, z) -> y returns the second component of any triple. There is no particular syntax for projections, and pattern matching must be used. The only exceptions are fst and snd for pairs, which are defined in the standard library.
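
As a sketch of projection by pattern matching, the hypothetical function second below works on any triple, whatever the types of its components:

```ocaml
(* projection on triples, written with a pattern *)
let second (_, y, _) = y

(* a let binding can also destructure a tuple directly *)
let (a, b, c) = (1, "two", 3.0)

let s = second (a, b, c)
```

Applying second to a pair would be rejected by the type checker, since the arity is part of the type.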

Records

In OCaml, records are analogous to variants and must be declared before being used. See for example the type regular used for cards (Exercise 2.1, page ??). Mutable fields of records must be declared as such at the definition of the record type they belong to.

     
type 'a annotation = { name : string; mutable info : 'a};;
     
type 'a annotation = { name : string; mutable info : 'a; }
     
fun x -> x.info;;
     
- : 'a annotation -> 'a = <fun>
     
let p = { name = "John"; info = 23 };;
     
val p : int annotation = {name="John"; info=23}
     
p.info <- p.info + 1;;
     
- : unit = ()

Command line

Arguments passed on the command line are stored in the string array Sys.argv, the first argument being the name of the command.
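
A minimal sketch of reading Sys.argv. The names program_name and nb_args are hypothetical, and Printf.printf comes from the standard library module Printf.

```ocaml
(* index 0 holds the name of the command itself *)
let program_name = Sys.argv.(0)

(* the remaining cells are the actual arguments *)
let nb_args = Array.length Sys.argv - 1

let () =
  Printf.printf "%s was called with %d argument(s)\n" program_name nb_args
```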

Exercise 34 ((*) Unix command echo)   Implement the Unix echo function.

The standard library Arg provides an interface to extract arguments from the command line.
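
The sketch below shows one possible use of Arg. The options -v and -n, the references verbose, count, and files, and the function parse are all invented for the example. To keep it self-contained, it calls Arg.parse_argv on an explicit array rather than Arg.parse, which would read Sys.argv.

```ocaml
let verbose = ref false
let count = ref 1
let files = ref []

(* each entry is (option name, action, help text) *)
let specs = [
  "-v", Arg.Set verbose, " print more messages";
  "-n", Arg.Set_int count, "<int> number of repetitions";
]

(* arguments that are not options are collected in files *)
let parse argv =
  Arg.parse_argv ~current:(ref 0) argv specs
    (fun f -> files := f :: !files)
    "usage: prog [-v] [-n <int>] file..."

(* a simulated command line; cell 0 stands for the command name *)
let () = parse [| "prog"; "-v"; "-n"; "3"; "a.txt"; "b.txt" |]
```

In a real command, Arg.parse with the same specification list, anonymous-argument function, and usage message would process the actual command line instead.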

Input-output

A summary of primitives for manipulating channels and writing on them is given in the two tables below. See the core and standard libraries for an exhaustive list.

Predefined channels
     
stdin : in_channel
stdout : out_channel
stderr : out_channel

Creating channels
     
open_out : string -> out_channel
open_in : string -> in_channel
close_out : out_channel -> unit

Reading on stdin
     
read_line : unit -> string
read_int : unit -> int

Writing on stdout
     
print_string : string -> unit
print_int : int -> unit
print_newline : unit -> unit
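
As a sketch of how these primitives fit together, the following program writes one line to a file and reads it back. Besides the functions listed above, it uses output_string, input_line, and close_in from the core library, as well as Filename.temp_file and Sys.remove. The names path and first_line are hypothetical.

```ocaml
(* a fresh temporary file, so that no existing file is overwritten *)
let path = Filename.temp_file "hello" ".txt"

(* write one line, then close the channel to flush it *)
let () =
  let oc = open_out path in
  output_string oc "Hello world!\n";
  close_out oc

(* read the line back; input_line drops the trailing newline *)
let first_line =
  let ic = open_in path in
  let line = input_line ic in
  close_in ic;
  line

let () =
  print_string first_line;
  print_newline ();
  Sys.remove path
```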

Exercise 35 ((**) Unix cat and grep commands)   Implement the Unix cat command that takes a list of file names on the command line and prints the contents of all files in order of appearance; if there is no file on the command line, it prints stdin.
The Unix grep command is quite similar to cat but only lists the lines matching some regular expression. Implement the grep command by a small change to the cat program, using the standard library Str.

Exercise 36 ((**) Unix wc command)   Implement the Unix wc command that takes a list of file names on the command line and, for each file, counts characters, words, and lines; additionally, but only if there is more than one file, it prints a global summary for the union of all files.
