The algorithms I usually work with are pretty simple. They are mostly variations on depth-first or breadth-first search, and the difficulty comes from the complex data structures. Most of the details of how to do things have been worked out. So, it is quite enjoyable to implement a slightly more involved algorithm. Yesterday and today I have been playing around with a Python implementation of the Kameda-Weiner algorithm. The purpose of the algorithm is to minimize a nondeterministic finite automaton. The algorithm dates from 1970, so it is roughly as old as I am! What makes it tricky is that (1) the original description is slightly obfuscated, and (2) it is not straightforward, and involves at least two NP-complete subproblems.
For those not in the know, NP-complete means that, based on our current knowledge, the algorithm is expected to take a long time for certain inputs. Well, that is not really the whole story. But how to explain this to non-computer scientists? These problems used to be called “intractable”, but that is not very accurate. Let’s put it this way: in theory some problems are supposed to be easy to solve, and some are hard. Programs that solve some of the hard problems may, however, be faster then some other programs that solve some of the easy problems in practical cases. So the situation is a little complicated.
In theory, there is no difference between theory and practice. But, in practice, there is.
Jan van de SnepscheutLike the ski resort full of girls hunting for husbands and husbands hunting for girls the situation is not as symmetrical as it might seem.
Alan Mackay
In any case, Kameda-Weiner is tricky because, although it is NP-complete hard, and although the situation is hopeless in the long run, I still want my implementation, my program, to work well for small inputs. So I have to think of different kinds of tricks to cut corners here and there. But I also have to make sure that the corners I cut are not going to produce incorrect results. At at high-level there are some mathematical tricks, at the middle-level there are implementation tricks, at the low-level there are considerations of how Python’s data structures work, and at the sub-terranean level there are considerations of how the CPU will eventually execute the code.
Technically I am not supposed to think of the lower levels. This has to do with Knuth’s First Law of Optimization: Don’t! But I really want the program to run as fast as its little wheels can go. Knuth’s Second Law of Optimization (for experts only) is: Not yet! So, after the program is finished, I am supposed to get an execution profile that identifies the slow parts of the program and to then decide where to cut the corners. Unfortunately this program is so small (it is really just one big algorithm with loops inside loops inside loops) that I have to think about optimization right up front.
So there is this fine balance between keeping things fast, keeping things correct, and still keeping the code legible, at a lot of different levels. I have implemented the algorithm before (in C), so things are a little easier this time round. A little.
I (and some other people) would really like to have a good implementation of the algorithm, but it is nowhere to be found on the web. And I can appreciate why other people might be reluctant their implementations. At the end of the day it is a bit of a mess. Perhaps I’ll publish my version… But it is such a good exercise for students that I might not. We’ll see.
You and I know a professor, who does not emphasize Knuth’s laws of optimization enough. He suggested a grading based on how optimized students’ programs were, when the subject was the isomorphism problem.
I do not know this K-W algorithm and I cannot log on to the appropriate places to get the original paper unless I go to the office, which will not happen in the next few days. You know where I could find a description of the algorithm for free?