Vincent Goffin (vjg@research.att.com) wrote:
>Why would a long running program generate more interesting
>points than one that just started from a random seed?
Well, we are all computing billions of arbitrary points by iterating
from one to the next and eventually there will be a collision when two
people find the same one. There are far too many points to collect
them all so only distinguished points are collected. The collision
will happen at an arbitrary point but will be detected (shortly
afterwards) at a distinguished point.
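In miniature, the mechanism looks like this. A toy sketch, not the actual ECC code: a deterministic walk over the integers mod a small prime, where the map, the modulus and the "distinguished" test are all illustrative assumptions.

```python
# Toy sketch of distinguished-point collection (not the actual ECC code).
# The map, modulus and distinguishing test below are illustrative only.

P = 1_000_003        # toy search-space size; the real one is near 2^108
DIST_MASK = 0xF      # call a value distinguished if its low 4 bits are 0
MAX_STEPS = 100_000  # safety cap so the toy walk always terminates

def step(x):
    # deterministic iteration: equal inputs give equal outputs, so two
    # walks that ever land on the same point stay merged from then on
    return (x * x + 1) % P

def walk_to_distinguished(x):
    for _ in range(MAX_STEPS):
        if x & DIST_MASK == 0:
            return x
        x = step(x)
    return None  # trapped in a cycle containing no distinguished point

# Two walks one step apart have already "collided", so they must report
# the same distinguished point shortly afterwards.
d1 = walk_to_distinguished(12345)
d2 = walk_to_distinguished(step(12345))
```

In the real computation the iteration and the distinguishing test act on curve points rather than integers, and only the distinguished points are collected; a repeat among them reveals the collision.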
When you compute some points, those points can't possibly participate
in a collision until you find a distinguished point downstream of
them. Naturally, the value of a distinguished point depends on how
many arbitrary points are upstream of it. The more there are, the
more likely it is to participate in a collision.
When you compute some arbitrary points, they stay latent in the saved
state until you find a distinguished point downstream of them. If you
start two chains of iterations, one from a random point and one from a
point taken from a saved state, then they are equally likely to find
a distinguished point. But the latter chain will find a more valuable
distinguished point since a bunch of latent points are known to be
upstream of it, whereas for the random starting point it is highly
likely that very few points are upstream of it at all.
This whole effect is amplified by the fact that 32 chains of
iterations are running in parallel. Suppose you start afresh and do
320 million iterations and then find a first distinguished point.
That point has only 10 million points upstream of it. The other 310
million iterations are all still latent in the saved state!
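In figures, using the numbers from that example:

```python
# Worked arithmetic for the example above: 32 parallel chains and
# 320 million iterations before the first distinguished point appears.
PARAL = 32
total = 320_000_000

per_chain = total // PARAL     # iterations done by each chain: 10 million
contributing = per_chain       # only the finding chain's work is banked
latent = total - contributing  # 310 million iterations still latent
```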
Doing the parallel chains speeds up the program a lot so it is
worthwhile, but it means the saved state is quite valuable. If the
number of parallel chains is PARAL, then the value of the saved state
rises from nil to about PARAL/2 distinguished points.
A trade-off is required between the gain in speed and the risk of
losing latent work. One way to lose it is by accidentally deleting a
long-running saved state. But the main way is this: when the solution
is found, we will all have some latent work in our saved states which
didn't contribute to the solution! Choosing PARAL to be something
like 20 or 30 seems to be a good trade-off in practice.
>The code also converts the points it generates to a special basis
>before counting the 0's and 1's. That looks like an expensive step.
>If it is expensive, why is it needed?
Yes, it is expensive.
In fact the set of g = 324518553658426701487448656461467
elliptic-curve points is partitioned into equivalence classes of 218
points each (and one containing the point at infinity). The
pseudo-random iteration is chosen so that when applied to two
equivalent points, it gives two new points which are also equivalent.
Furthermore, the distinguishing characteristic is chosen so that when
a point is distinguished, all equivalent points are distinguished too.
(These equivalence classes are called "clusters" in the FAQ).
This scheme allows the search space to be reduced by a factor of 218,
which reduces the number of iterations by a factor of sqrt(218) ~= 15.
But it also imposes draconian conditions on the pseudo-random
iteration and the distinguishing characteristic which slow down the
iterations.
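The basis conversion mentioned in the question is one way to meet these conditions: in a normal basis of GF(2^m), squaring, i.e. the Frobenius, is just a cyclic rotation of the m coordinate bits, so any rotation-invariant property of those bits, such as the number of 1s, is shared by a whole equivalence class. A sketch under that assumption (the threshold and sample value are made up):

```python
# Sketch: in a normal basis for GF(2^m) the Frobenius (squaring) acts as
# a cyclic rotation of the m coordinate bits, so the Hamming weight is
# constant on each equivalence class.  The threshold and sample value
# are illustrative assumptions.
M = 109                  # field degree used in this challenge
MASK = (1 << M) - 1

def frobenius_normal_basis(v):
    # squaring = rotate the m-bit coordinate vector left by one place
    return ((v << 1) | (v >> (M - 1))) & MASK

def is_distinguished(v, threshold=30):
    # the number of 1 bits is unchanged by rotation, so all points in a
    # class agree on whether they are distinguished
    return bin(v).count("1") <= threshold

v = 0x1234567890ABCDEF & MASK
for _ in range(M):
    w = frobenius_normal_basis(v)
    assert is_distinguished(w) == is_distinguished(v)
    v = w
```

Taking the weight of the x-coordinate also survives the -1 equivalence, since in characteristic 2 the negative of (x, y) is (x, x+y), which leaves x alone.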
For many elliptic curves one can design an algorithm using equivalence
classes: points are considered equivalent if you can get from one to
the other by multiplying by a root of unity of small order. But this
is usually not useful as the iterations get slowed down way too much.
For this challenge Certicom chose a curve with a special form: all the
coefficients in the equation of the curve are 0 or 1. In such a case
one can use the Frobenius, a fancy name for the function that squares
the x and y coordinates of a point. The Frobenius applied to a point
on the curve gives another point on the curve, because in
characteristic 2 it is true that (a+b)^2 = a^2 + b^2.
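That identity is easy to check in a small binary field. Here GF(2^8), with reduction polynomial x^8 + x^4 + x^3 + x + 1, stands in for the challenge's GF(2^109); an illustrative sketch only.

```python
# Demonstration of (a+b)^2 = a^2 + b^2 in characteristic 2, using the
# small field GF(2^8) in place of the challenge's GF(2^109).
# Reduction polynomial: x^8 + x^4 + x^3 + x + 1 (0x11B).

def gf_mul(a, b, mod=0x11B, deg=8):
    # carry-less "Russian peasant" multiplication with reduction
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= mod
    return r

def gf_sq(a):
    return gf_mul(a, a)

# addition in GF(2^m) is XOR, so the cross term 2ab vanishes
a, b = 0x57, 0x83
assert gf_sq(a ^ b) == gf_sq(a) ^ gf_sq(b)
```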
Applying the Frobenius is very fast. And it is the same as
multiplying by (1-sqrt(-7))/2 which is 138423345589698157369693034392981
modulo the group order g. Crucially, this is a 109-th root of unity!
Another root of unity that can always be used is -1. Together these
two give equivalence classes of 2*109 = 218 points. With a
well-designed algorithm, the time per iteration increases by only 50%
or so compared to not using equivalence classes. Since the number of
iterations is 15 times smaller, this leads to a 10-fold gain overall!
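The arithmetic behind those factors, and a sanity check that the quoted multiplier really has order 109 modulo g, can be written out directly using only the numbers above:

```python
import math

# Classes of 218 points cut the expected number of iterations by
# sqrt(218), while each iteration gets roughly 1.5x slower.
class_size = 2 * 109                          # -1 times 109 Frobenius powers
iteration_reduction = math.sqrt(class_size)   # ~14.8
slowdown = 1.5
net_gain = iteration_reduction / slowdown     # ~9.8, i.e. about 10-fold

# The group order g and the Frobenius multiplier quoted in the text.
g = 324518553658426701487448656461467
lam = 138423345589698157369693034392981

# Applying the Frobenius 109 times is the identity on GF(2^109),
# so lam must be a 109-th root of unity modulo g.
assert pow(lam, 109, g) == 1
```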
Historical note:
When Certicom designed the challenge they didn't know about this
method and their estimates of difficulty were way off. The first
working version I know of was described in email from a guy called Ari
Singer at Pitney Bowes in February '98, just using -1. I generalised
it to use any roots of unity and implemented it. It was first used to
solve ECC2K-95.
Bye,
Rob.
This archive was generated by hypermail 2b29 : Wed Mar 01 2000 - 19:48:07 MET