[personal profile] ciphergoth
EDIT: have discovered why this isn't a good idea after all! It turns out that Pollard rho is much more optimized than I'd realized, and those optimizations mean that this isn't useful. In particular I was pointed to this on the cryptography mailing list. Basically it turns out they don't start from scratch every time they generate ag + by; they just update the previous point with a single point addition chosen by a hash function. So basically t_a == t_p and there's no advantage to the below.
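
To make concrete what that means (my reading of the optimization, sketched in a toy multiplicative group with made-up parameters, not code from the linked post): the walk applies one precomputed step chosen by hashing the current element, updating a and b as it goes, so every fresh candidate ag + by already costs a single group operation.

    import random

    # Toy "r-adding" walk in the multiplicative group mod a small prime;
    # all parameters are made up for illustration.
    p = 10007                  # small prime
    q = p - 1                  # exponents live mod the group order
    g = 5                      # happens to generate Z_p^* for this p
    secret = random.randrange(1, q)
    y = pow(g, secret, p)      # the y whose discrete log we'd be after

    R = 16                     # number of precomputed random steps
    steps = [(random.randrange(q), random.randrange(q)) for _ in range(R)]
    step_elts = [pow(g, c, p) * pow(y, d, p) % p for (c, d) in steps]

    def walk(pt, a, b, iters):
        # Each iteration costs ONE multiplication, chosen by "hashing" pt.
        for _ in range(iters):
            i = pt % R
            pt = pt * step_elts[i] % p
            a = (a + steps[i][0]) % q
            b = (b + steps[i][1]) % q
        return pt, a, b

    pt, a, b = walk(g, 1, 0, 1000)       # start at g = g^1 * y^0
    assert pt == pow(g, a, p) * pow(y, b, p) % p
    print("1000 candidates at one group operation each")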

Either this isn't original, or it doesn't work, or I've found a way to solve the discrete logarithm problem on elliptic curves a little bit faster. This idea woke me up this morning so I'm keen to know which :-)

The standard algorithm is "Pollard rho". Suppose you're looking for an x such that xg = y, where g, y are in some group G. You generate lots of random a, b such that 0 <= a, b < |G|, and look for a collision in ag + by: in other words, a pair such that ag + by = a'g + b'y. Then you have a + bx = a' + b'x (mod |G|), and so you solve for x in (a - a') = (b' - b)x (mod |G|). By the standard birthday argument, you need to generate around sqrt(2*ln(2)*|G|) values of ag + by for a 50% probability of success; if generating each such point takes time t_p, then the whole thing takes time roughly sqrt(2*ln(2)*|G|) t_p.
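
To make this concrete, here's a toy sketch in Python of the naive birthday version over a made-up small curve (all parameters invented for illustration; a real attack targets curves of 100+ bits):

    import random
    from math import gcd

    # Toy parameters, invented for illustration: the curve
    # y^2 = x^3 + A*x + B over F_P. Nothing here is a real standard.
    P = 10007
    A, B = 2, 3

    def ec_add(p1, p2):
        # Affine point addition; None is the point at infinity.
        if p1 is None: return p2
        if p2 is None: return p1
        (x1, y1), (x2, y2) = p1, p2
        if x1 == x2 and (y1 + y2) % P == 0:
            return None                  # p + (-p) = infinity
        if p1 == p2:
            lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
        else:
            lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
        x3 = (lam * lam - x1 - x2) % P
        return (x3, (lam * (x1 - x3) - y1) % P)

    def ec_mul(k, pt):
        # Double-and-add scalar multiplication.
        acc = None
        while k:
            if k & 1:
                acc = ec_add(acc, pt)
            pt = ec_add(pt, pt)
            k >>= 1
        return acc

    def find_point():
        # First point on the curve; sqrt via (P+1)/4 works as P % 4 == 3.
        for x in range(P):
            rhs = (x * x * x + A * x + B) % P
            y = pow(rhs, (P + 1) // 4, P)
            if y * y % P == rhs:
                return (x, y)

    def order(pt):
        # Brute-force order of pt; fine at toy scale.
        m, acc = 1, pt
        while acc is not None:
            acc = ec_add(acc, pt)
            m += 1
        return m

    g = find_point()
    n = order(g)
    secret = random.randrange(1, n)      # the x we pretend not to know
    y_pt = ec_mul(secret, g)

    # Naive birthday search: fresh random (a, b) every time, store all.
    seen = {}
    while True:
        a, b = random.randrange(n), random.randrange(n)
        pt = ec_add(ec_mul(a, g), ec_mul(b, y_pt))
        if pt in seen and seen[pt] != (a, b):
            a2, b2 = seen[pt]
            if gcd(b2 - b, n) == 1:      # need b' - b invertible mod n
                x = (a - a2) * pow(b2 - b, -1, n) % n
                assert ec_mul(x, g) == y_pt
                print("recovered x =", x)
                break
        seen[pt] = (a, b)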

The idea behind my speedup is to reduce the size of the space you're looking for a collision in. Suppose that given a point p, you can find the point p + g in time t_a. Find a way to efficiently distinguish a subset D of points on the curve, where |D| should be around |G|(t_a / t_p); this can be as simple as hashing the point and checking that the k least significant bits are zero, for some suitable k. Now, after generating ag + by, find the least non-negative integer c such that (a + c)g + by is in D; you do this just by starting with ag + by and adding g repeatedly until you get a point in D. It's this value we look for collisions in. When we find a collision, we have (a + c)g + by = (a' + c')g + b'y, so we solve (a + c - a' - c') = (b' - b)x (mod |G|) to find x.
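
Continuing the toy sketch above (same made-up curve, reusing ec_add, ec_mul, gcd, g, n, y_pt and secret), here's the distinguished-point version; D is the set of points whose x-coordinate has its k low bits zero:

    # Distinguished points, reusing the toy curve and helpers above.
    # D = points whose x-coordinate has its K low bits zero, so |D| is
    # about |G| / 2^K; the post tunes K so that |D| ~ |G| * (t_a / t_p).
    K = 3

    def in_D(pt):
        return pt is not None and (pt[0] & ((1 << K) - 1)) == 0

    seen = {}
    while True:
        a, b = random.randrange(n), random.randrange(n)
        pt = ec_add(ec_mul(a, g), ec_mul(b, y_pt))
        # Walk forward until we land in D (this loop could only fail to
        # terminate if no point of the walk is in D; see comments below).
        while not in_D(pt):
            pt = ec_add(pt, g)
            a = (a + 1) % n              # folds the post's c into a
        if pt in seen and seen[pt] != (a, b):
            a2, b2 = seen[pt]
            if gcd(b2 - b, n) == 1:
                x = (a - a2) * pow(b2 - b, -1, n) % n
                assert ec_mul(x, g) == y_pt
                print("recovered x =", x)
                break
        seen[pt] = (a, b)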

Generating each candidate is now slower; instead of taking time t_p, it takes on average t_p + (|G|/|D|)t_a. But we now only need to generate sqrt(2*ln(2)*|D|) points for a 50% success probability, so the whole thing takes time sqrt(2*ln(2)*|D|)(t_p + (|G|/|D|)t_a). Since |D| ≅ |G|(t_a / t_p), this is ≅ 2 sqrt(2*ln(2)*|D|)t_p, and so the speedup is roughly sqrt(t_p / t_a)/2. So how much faster this is depends on how many point additions it takes to compute ag + by.
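
As a sanity check on that algebra (a sketch assuming sympy is available), substituting |D| = |G|(t_a / t_p) into the total cost and dividing it into the baseline cost does give sqrt(t_p / t_a)/2:

    import sympy as sp

    G, D, tp, ta = sp.symbols('G D t_p t_a', positive=True)

    total = sp.sqrt(2 * sp.log(2) * D) * (tp + (G / D) * ta)  # new cost
    baseline = sp.sqrt(2 * sp.log(2) * G) * tp                # plain rho

    speedup = sp.simplify((baseline / total).subs(D, G * ta / tp))
    print(speedup)   # an expression equivalent to sqrt(t_p/t_a)/2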

The current ECDL world record was set on a 112-bit curve. With the most naive possible algorithm for calculating ag + by, we'd see a t_p/t_a of around 672, meaning that this technique would speed things up by a factor of 13 or so. However, doing lots of precomputation on g and y would allow a much faster table-driven implementation to get t_p/t_a down to as little as 13; but even then, this technique would nearly halve execution time, and it gets better as the curve gets larger.
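
Plugging the two quoted ratios into sqrt(t_p/t_a)/2 reproduces those numbers:

    from math import sqrt

    for ratio in (672, 13):          # the two t_p/t_a values quoted above
        print(ratio, "->", round(sqrt(ratio) / 2, 2))
    # 672 -> 12.96  (the "factor of 13 or so")
    # 13  -> 1.8    (nearly halving execution time)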

This trick seems much too simple not to have been thought of before, but if it has been, I don't know why the world record attack didn't use it. Does this make sense? If so, does it work? If so, is it original? Thanks!

Date: 2011-05-29 09:11 am (UTC)
From: [personal profile] josy
What happens if there is no c such that (a+c)g + by is in D? It might be possible to avoid this by careful choice of D; I couldn't say off the top of my head.

Aside from that, I see no obvious problem. Sometimes obvious-seeming things actually haven't been thought of before.

Date: 2011-05-29 10:20 pm (UTC)
From: [personal profile] josy
I see what you mean: as long as such situations are very rare, they don't actually matter that much.

Date: 2011-05-29 02:20 pm (UTC)
From: (Anonymous)
Exciting. This is totally out of my area of expertise but it's a really great idea and an elegant one (it took me 30 minutes with Wikipedia to even halfway understand it). I'm totally talking out of my area of expertise, so naturally this means I may well be talking inadvertently out of something less savoury.

One issue I have is with your (|G|/|D|) factor. This is your mean number of additional point additions (each costing t_a) needed to get from a random point in G to a point in D. I'm guessing your assumption here is that your hash function is good enough that the points of D are spread through G in such a way that hitting D is effectively a Poisson process with rate (|D|/|G|) per step, and therefore by the waiting-time argument the mean number of steps (if picking the first point counts as one step) is (|G|/|D|).

[Aside: in fact the mean number of additions is one less than this, because you may "start" in D, so the mean number of additions is (|G|/|D| - 1); that gives an adjusted optimum of |G|/|D| = (t_p - t_a)/t_a, i.e. |D| = |G| t_a / (t_p - t_a).]

However, if the hash function "clumps" things for whatever reason then the mean number of steps may be much larger. So in the worst case, if a really bad hash function puts all the members of D adjacent to one another, the mean number of addition steps is roughly (|G| - |D|)/2 -- obviously much larger.
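
To illustrate (a quick made-up simulation, nothing rigorous): mean number of +g steps to reach D on a cycle of size N, with D either scattered uniformly or packed into one run:

    import random

    # Mean number of +g steps to reach D on a cycle of size N, comparing
    # a scattered D with a worst-case clumped D. Sizes are made up.
    N, D_SIZE, TRIALS = 10_000, 100, 2_000

    def mean_steps(dset):
        total = 0
        for _ in range(TRIALS):
            pos = random.randrange(N)
            while pos not in dset:
                pos = (pos + 1) % N
                total += 1
        return total / TRIALS

    scattered = set(random.sample(range(N), D_SIZE))
    clumped = set(range(D_SIZE))                 # all of D adjacent
    print("scattered:", mean_steps(scattered))   # ~ N/|D| = 100
    print("clumped:  ", mean_steps(clumped))     # ~ N/2 = 5000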

I have no idea about this though. The assumption of a hash function which picks members of D distributed no worse than Poisson within G is not such a bad one, it seems to me.

Best of luck with this -- it seems an extremely promising idea, indeed an exciting one.

Date: 2011-05-29 02:22 pm (UTC)
From: (Anonymous)
Oh -- that last comment was from Steer by the way.

Your point about "pseudorandom" hashing may answer my query properly (I don't know).
