Judging process
This page explains the procedure we used to determine the winners
of the programming contest.
Errare Humanum Est
We were not quite as merciless as we had announced. We applied
relaxed criteria on these three points:
- Deadline: we accepted entries up to 10 minutes after the
deadline. Some people's clocks are off by that much...
- Interfacing problems: a few programs would take their input from a
file, or would not expect the time limit as a command-line argument.
We fixed them, except for the binaries/links to cat, which are
worthless anyway.
- Left-over debugging code: a couple of programs had debugging
output enabled (and sent to stdout). We fixed them too.
We didn't fix any other kind of bug, however trivial. The line has
to be drawn somewhere.
Elimination round
We tried to eliminate as many programs as possible using the
correctness and time-limit criteria. Because the number of entries was
overwhelming, we were eager to get rid of most of them.
The elimination round consisted of running the programs on the
following input files (the file size and time limit are given in
parentheses); a sketch of the kind of test harness this implies
follows the list.
- 000-example.txt (14k, 3
minutes): an HTML page translated into SML/NG.
This file was made available to the competitors shortly after the
beginning of the contest.
- 001-null.txt (0 bytes, 3 minutes):
an empty file.
- 002-almost-empty.txt
(9 bytes, 3 minutes): nothing but an open-and-closed PL tag.
- 003-third-step.txt (5k, 3
minutes): some of the validator input, some random stuff, and a list
of nasty hand-written cases.
- 004-random.txt (13k, 3 minutes):
random stuff produced by three different programs.
- 005-icfp2000.txt (65k, 10
minutes): the task description of last year's contest, HTML version
translated to SML/NG.
- 006-exhaustive.txt (1.2M, 30
minutes): a not-quite-exhaustive catalog of tag combinations.
- 100-hand.txt (4k, 15 minutes): a
bunch of hand-written test cases.
- 101-validate-big.txt (134k,
15 minutes): a large extract from the log file of the on-line equivalence validator.
- 102-validate-small.txt
(97k, 15 minutes): same as validate-big, but using the short versions
of the equivalent documents.
- 103-the-random-returns.txt
(11k, 15 minutes): another randomly generated example.
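
For concreteness, here is a minimal sketch of a harness for this
step. It assumes the interface described above (the time limit passed
as a command-line argument, the document read from standard input);
the run_entry helper and the example paths are our own invention for
illustration, not the judges' actual scripts.

import subprocess

def run_entry(entry, input_path, limit_seconds):
    # Assumed invocation convention, inferred from the text above:
    # the entry receives the time limit as its command-line argument
    # and reads the SML/NG document on stdin.
    with open(input_path, "rb") as doc:
        try:
            result = subprocess.run(
                [entry, str(limit_seconds)],
                stdin=doc,
                capture_output=True,
                timeout=limit_seconds,  # enforce the time limit
            )
        except subprocess.TimeoutExpired:
            return None  # over the time limit: eliminated
    if result.returncode != 0:
        return None  # crashed: eliminated
    return result.stdout  # output still has to pass the correctness check

# Example: run_entry("./entries/runme", "000-example.txt", 3 * 60)
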
The elimination round reduced the competition to 57 entries,
eliminating exactly 2/3 of them.
The full table of results is available.
Performance round
The goal of this round was to pick a list of 10 finalists. We
used the following algorithm:
- Use the results of the "interesting" input files: 005,
006, 100, 101, 102, 103.
- Each input file yields a ranking of the entries (first, second,
etc.). Note that the entries eliminated by the first round do not
interfere with these ranks.
- The score of an entry is simply the sum of its ranks. For
example, an entry that was first on every input file would have a
score of 6. The best programs are the ones with the smallest scores
(see the sketch after this list).
- The finalists are the 10 programs with the smallest scores.
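
As an illustration only, here is a small Python sketch of that
sum-of-ranks scoring; the rankings dictionary layout is an
assumption, not the judges' actual data format.

def sum_of_ranks(rankings):
    # rankings: input file name -> list of entries ordered best
    # first (assumed layout). Eliminated entries are simply absent,
    # so they do not interfere with the ranks, as noted above.
    scores = {}
    for ordering in rankings.values():
        for rank, entry in enumerate(ordering, start=1):
            scores[entry] = scores.get(entry, 0) + rank
    # Smallest score wins; the first 10 items are the finalists.
    return sorted(scores.items(), key=lambda item: item[1])
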
The finalists are:
- LALR(5000000)
- runme
- Wheeler's Wacky Search
- Dylan Hackers
- Beam Search
- trender
- mltong
- (\x.x)
- ElegantBlob
- RemarkToo
The full table of results is available.
Final round
Now that we were down to 10 programs, we could try larger time
limits, in order to give sophisticated programs a good opportunity.
We used the following input files:
- 200-hevea.txt (2.6M, 60 minutes): the
source code of HeVeA (a TeX-to-HTML
translator that includes an HTML optimiser), pretty-printed in HTML
and translated to SML/NG.
- 201-suite.txt (28k, 60 minutes): the
test suite of HeVeA, translated to SML/NG via HTML.
- 202-hand-again.txt (4k, 60
minutes): same as 100-hand.txt, but with a larger time limit.
- 203-fractal-big.txt (925k, 60
minutes): a large randomly-generated document.
We ranked the programs with the same algorithm as for the
performance round, using the following list of files: 005, 006, 101,
102, 103, 200, 201, 202, 203. 100-hand.txt was not used for this
round because it is the same file as 202-hand-again.txt (and it gives
the same ranks). The final scores are:
- (16) LALR(5000000)
- (27) Dylan Hackers
- (38) runme
- (39) ElegantBlob
- (51) RemarkToo
- (53) Beam Search
- (58) mltong
- (59) trender
- (62) (\x.x)
- (73) Wheeler's Wacky Search
The full table of results is available.
Final round for the lightning division
For lack of a better idea for the judges' special prize, we simply
repeated the performance round and final round (with 6 finalists),
restricted to lightning programs, with the following results:
- (16) Erlightning
- (20) slow
- (30) Wheeler's Wacky Search [not the same as above]
- (36) NCNH
- (43) mltong-lightning
- (44) b8_short
The full table of results is available.