ciphergoth: (Default)
Paul Crowley ([personal profile] ciphergoth) wrote2005-11-01 03:53 pm
Entry tags:

Whitby goodness

I had a lovely Whitby - a little more muted than usual, partly because of colds at our end and partly because of colds very sadly keeping [livejournal.com profile] ergotia and [livejournal.com profile] lilithmagna from joining us, but very nice all the same. My Friday night outfit was particularly well received, and will no doubt be reprised.

Now I'm obsessing with trying to get a paper ready by November 25 for FSE 2006. I have to keep a little quiet about what I'm doing until January 27th, since submissions are anonymous, but, it's cool.

Update with a geek puzzle: here's a simple game. I lay out a set of tiles that say Y or N; say I lay out

NYYNYNN

You lay the same number of tiles underneath:

NYYNYNN
NNYYNNN

The rules are: (1) if a tile is the same as the one above, you have to lay the same tile to the right. So you're not allowed to lay

NYYNYNN
NYYYNNN
-^- not allowed: because the first column is N N, you must lay N as your second tile.

(2) you get a point every time you lay a tile which is the same as the one above.

NYYNYNN
NNYYNNN
1010011 - four points

The least you can score is always zero: it's always a legal move to lay tiles which differ in every place from the ones above. What's the best you can score for NYYNYN, and how many ways can you do it? Can anyone give an algorithm which determines the highest you can score, or which generates all of the possibilities which give you a given score?

Yay!

[identity profile] webcowgirl.livejournal.com 2005-11-01 04:36 pm (UTC)(link)
Nice to have you back. The pictures [livejournal.com profile] emarkienna had of you were stunning - you're so gorgeous in the shiny with your hair down! I was sitting at work with my mouth hanging open. Wish I'd been there.

Did [livejournal.com profile] ergotia and [livejournal.com profile] lilithmagna not make it there at all, or just not go out?

Re: Yay!

[identity profile] ciphergoth.livejournal.com 2005-11-01 04:50 pm (UTC)(link)
They didn't make it at all - we travelled on Thursday hoping they'd follow the next day, and got a text early the next day saying that it wasn't going to happen.

I take it that it's you that has started adding useful tags, btw? Thanks!

Re: Yay!

[identity profile] webcowgirl.livejournal.com 2005-11-01 06:02 pm (UTC)(link)
Useful tags? You mean, in my journal? Like Shiny and Theater and Schedule and such? Or something else?

Re: Yay!

[identity profile] ciphergoth.livejournal.com 2005-11-01 07:43 pm (UTC)(link)
This entry has "geek" and "whitby" tags, and it wasn't me who put them there!

Re: Yay!

[identity profile] webcowgirl.livejournal.com 2005-11-01 07:46 pm (UTC)(link)
Hmm, I didn't add them ...

[identity profile] drdoug.livejournal.com 2005-11-01 05:21 pm (UTC)(link)
Is the number of tiles fixed - i.e. is it always seven?

The least you can score is always zero: it's always a legal move to lay tiles which differ in every place from the ones above

Err ... isn't it always one? You can always score one by laying tiles which differ in every place but the right-hand one.


It looks like the problem ought to be related to adding but I can't make the connection.

If the number of tiles we play with is always seven (or even just small), my first choice for an algorithm to find the highest score would be simple brute force, sad to say! With only a little bit-twiddling for each instance it's not going to take long, and I'm usually more confident in the completeness of brute force algorithms than more subtle ones.

[identity profile] ciphergoth.livejournal.com 2005-11-01 05:35 pm (UTC)(link)
No, you can have any number of tiles, so brute force isn't enough. I've implemented a brute force solution to experiment with it, but I need something faster.

You can always score zero with the strategy I describe; you can always score 1 with the strategy you describe. Actually I think you can always score at least ceil(n/2): choose the letter which appears most often above, and lay that down in every place below. If the tiles are alternating then that's the best you can do.

There is a connection with adding, but it's pretty longwinded.

[identity profile] drdoug.livejournal.com 2005-11-01 05:46 pm (UTC)(link)
No, you can have any number of tiles, so brute force isn't enough.

Shame.

You can always score zero with the strategy I describe; you can always score 1 with the strategy you describe.

Agreed.

Actually I think you can always score at least ceil(n/2): choose the letter which appears most often above, and lay that down in every place below. If the tiles are alternating then that's the best you can do.

Reckon so. That's a nice strategy - it does well in lots of circumstances. Presumably you want a solution that'll outperform that when possible? I'm not finding any immediately to hand.

[identity profile] ciphergoth.livejournal.com 2005-11-01 10:18 pm (UTC)(link)
It's definitely possible to do better: consider

NNNNNNNNNNYYYYYYYYYY
NNNNNNNNNNnYYYYYYYYY

(using a lowercase "n" to mark a choice where you don't score a point)

I think a greedy strategy (if you're forced to put a letter, put that, else put what's above) does pretty well - it may even be optimal. But I haven't proven it yet, and even if it is, it only finds one best solution, not all of them.

[identity profile] emarkienna.livejournal.com 2005-11-03 12:12 pm (UTC)(link)
The greedy strategy is what I was thinking too.

My reasoning was to break the sequence up into blocks of "one or more Ys followed by an N", and "one or more Ns followed by a Y" (any remaining Ys or Ns treat as a special case), and then show that in the first case an optimal sequence for that block is all Ys, and similarly therefore for the second case an optimal sequence is all Ns. Since you've ended with a different letter, you are free to choose the first letter for the next block, so it should be possible to treat the blocks independently. Though I may have missed something or made some assumptions.

The only time an alternative solution is possible if the block is either NY or YN - we can try YY or NN, but that constrains us to keep the same letter for the next choice ... So it can be seen that a block of NYY can have YYY as a solution instead of NNY (and similarly YNN can have NNN instead of YYN). I think that [livejournal.com profile] shevek's solution is the greedy solution except with these two substitutions made.

I'm not sure how to prove that these are the only possible substitutions. If they are, then it ought to be possible to find all possible solutions by spotting sequences of NYY and YNN and writing all possible combinations with each of these sequences having either of the two solutions?

So for NYYNYNN, the best would be NNYYYYN (score 4). There is 1 sequence of NYY and 1 of YNN given 3 other solutions: YYYYYYN, NNYYNNN and YYYYNNN.

And you were looking very beautiful at Whitby!

[identity profile] sparklegoth.livejournal.com 2005-11-01 07:21 pm (UTC)(link)
Bloody hell! That is some outfit!

[identity profile] shevek.livejournal.com 2005-11-01 07:27 pm (UTC)(link)
Trivial thought. It's bounded below by len(sequence) - count(shifts) - 2. There are local optimisations for both ends and for local alternations, but I don't think you ever need to consider a subsequence of length < 3 or 4. But I am rushing through on my way somewhere and will look later and prove it. Now am thinking that local optimisations give len(sequence) - (count(shifts) / 2) - 2 but have not proven that yet.

[identity profile] shevek.livejournal.com 2005-11-02 12:43 am (UTC)(link)
I attach what I believe to be the implementation of the optimal algorithm. The proof is local to windows of size 3. First prove the windowing, then generate and fold the patterns.

I have also fixed the code (hence the repost) so it deals correctly with the degenerate cases (short sequences, end of sequence) correctly! All lines after the blank line deal with degenerate cases, and can be handled as special cases in the proof.

%{
#include <stdio.h>
#define OUT(x) do { fputs(#x, stdout); \
        if (#x[0] == yytext[0]) { score++; BEGIN(x); } \
        else BEGIN(INITIAL); } while(0)
static int  score = 0;
%}
%option noyywrap
%option noinput
%option nounput
%option noreject
%option debug
%option noyy_top_state
%x N Y
%%
<Y>N/NN { OUT(Y); }
<Y>N/NY { OUT(Y); }
<Y>N/Y  { OUT(Y); }
<N>Y/YY { OUT(N); }
<N>Y/YN { OUT(N); }
<N>Y/N  { OUT(N); }
<Y>Y    { OUT(Y); }
<N>N    { OUT(N); }
Y/Y     { OUT(Y); }
Y/NY    { OUT(Y); }
Y/NN    { OUT(N); }
N/YN    { OUT(N); }
N/YY    { OUT(Y); }
N/N     { OUT(N); }

<Y>N  { OUT(Y); }
<N>Y  { OUT(N); }
Y/N     { OUT(Y); }
N/Y     { OUT(N); }
Y       { OUT(Y); }
N       { OUT(N); }
%%
int main(void) {
    char    line[BUFSIZ];
    fgets(line, BUFSIZ, stdin);
    yy_scan_string(line);
    yylex();
    printf("Score = %d\n", score);
    return 0;
}

[identity profile] ciphergoth.livejournal.com 2005-11-02 01:04 am (UTC)(link)
I'm not the only one up late thinking about this, then? Have some Python code. I represent these vectors of Y and N as integers (ie bit vectors), so findbest(5, 5) means find what scores are possible for the challenge YNYNN and all suffixes. Given that, it's an easy bit of dynamic programming to generate all the sequences that have a given score. I've tested it by comparison to brute force up to 11-bit sequences and it seems to be correct.

# findbest(l, challenge)[i][b] & (1 << s) != 0
# means: there is a solution to challenge >> (l-i) of length i
# starting with b, which scores s
def findbest(l, challenge):
    if l == 0:
        return [(1L, 1L)]
    allbits = findbest(l-1, challenge >> 1)
    down = allbits[-1]
    bit = challenge & 1
    res = [None, None]
    res[bit] = down[bit] << 1
    res[bit ^ 1] = down[0] | down[1]
    allbits.append(tuple(res))
    return allbits

[identity profile] shevek.livejournal.com 2005-11-02 01:16 am (UTC)(link)
To find a solution of score n < L, take my optimal solution, and flip the first (L - n) matching letters. To find them all ... I'm not sure that's interesting, it's just a matter for recursive programming. The objective for me was to find the optimal solution in linear time, which is why I expressed it as a type 3 grammar.

[identity profile] biog.livejournal.com 2005-11-04 04:34 pm (UTC)(link)
Super-nice outfit *drool*. Wish I could make it to Whitby more often, circumstances etc. Still, you look stunning :)

[identity profile] mskala.livejournal.com 2005-11-06 05:53 pm (UTC)(link)
At a glance this looks like a job for dynamic programming, but I think it can actually be done in linear time if you do this:

Compute, for the first k tiles, the optimal solution that DOES and the optimal solution that DOES NOT have a match in the last position.

From those two solutions it's easy to compute a similar pair of solutions for the first k+1 tiles.

Repeat until you've done all the tiles, then keep whichever of the two solutions you have, is better.

The big advantage: you can prove the above correct without needing to prove that a (potentially long and complicated) list of pattern-matching templates is correct and complete.

[identity profile] mskala.livejournal.com 2005-11-06 05:56 pm (UTC)(link)
And if you want all of them - keep lists of all optimal solutions at each stage, instead of just one. That may kill worst-case efficiency, though.

[identity profile] ciphergoth.livejournal.com 2005-11-06 06:22 pm (UTC)(link)
Check out my bit of Python above - it's easy to do in linear time with dynamic programming. You collect answers to the question "What's the best and worst score I can achive against the i-tile suffix of this puzzle, using a solution starting with x?". So you end up storing four small integers per tile; I actually used a bitmask representing possible scores. Then it's easy to iterate through all the answers that score k, because it's easy to answer the question "Given this prefix (which scores y), can I add tile x to it and achieve my target score?".

You can easily generalize this to any game played on an FSM where transitions score points and you know your opponent's moves in advance.

[identity profile] hythloday.livejournal.com 2005-12-06 03:34 am (UTC)(link)
any game played on an FSM where transitions score points and you know your opponent's moves in advance

Nim, in other words?

[identity profile] ciphergoth.livejournal.com 2005-12-06 09:27 am (UTC)(link)
I can't quite work that out:

First, Nim's not really an FSM: there are infinitely many possible Nim boards.
Second, you don't score points for the transitions - the game is decided at the end.
Third, you don't know your opponent's moves in advance!

But maybe there's a clever way of thinking about Nim that turns it into a problem fitting the above constraints?

&ldquo;Apart from that, Mrs. Licoln, how was the play?&rdquo;

[identity profile] hythloday.livejournal.com 2005-12-06 06:19 pm (UTC)(link)
First up - infinitely many Nim boards. There are, it's true, infinitely many ways of laying out a board for a game of Nim, but once you've made that board, there are a finite number of ways of playing it. This struck me as similar to your game, in that you can lay out an infinite number of tiles, but once you do, there are only a finite number of ways I can lay out my tiles.

Second - ah, you formally score points at the end, it's true, but you can regard a game of Nim for a player as a set of transitions in and out of Nim numbers - the player gains a point when he gets to a Nim number and loses a point when he strays off it - then whoever has "more points" (although there's only one) at the end of the game wins.

I misunderstood what you meant by "know your opponents moves in advance" - I though you meant in the sense of chess vs. roshambo rather than in the sense of your game (did you give it a name?) vs. chess. I suppose in an optimally played game of Nim, one *does* know one's opponent's moves, in effect, because they'll either be a move to a Nim number or a move that will let one get to a Nim number, and the Nim number that one will get to is known, but that's clutching at straws rather.

I'm not certain, though, that there isn't some kind of relationship between the two - if it becomes more apparent to me I'll let you know, if you like.

Re: &ldquo;Apart from that, Mrs. Licoln, how was the play?&rdquo;

[identity profile] ciphergoth.livejournal.com 2005-12-07 04:18 pm (UTC)(link)
Any problem solvable in X time can be transformed in X time into any other problem, so of course you can transform Nim into an instance of the problem my program solves. However, the relationship this establishes is trival and uninteresting, because all the real "work" in solving the problem is done in the transformation step, not in the "solving" step.

(for X substitute polynomial, linear, subquadratic, or whatever)

[identity profile] grlleastlikely.livejournal.com 2005-11-24 03:57 am (UTC)(link)
i love the outfit. very becoming.