From JBRYAN@pstcc.cc.tn.us Wed May 1 18:33:37 1996 Return-Path: Received: from pstcc6.pstcc.cc.tn.us by life.ai.mit.edu (4.1/AI-4.10) for /com/archive/cube-lovers id AA17358; Wed, 1 May 96 18:33:37 EDT Received: from PSTCC6.PSTCC.CC.TN.US by PSTCC6.PSTCC.CC.TN.US (PMDF V5.0-4 #11457) id <01I4799SSU8G001E0M@PSTCC6.PSTCC.CC.TN.US> for cube-lovers@ai.mit.edu; Wed, 01 May 1996 18:33:30 -0500 (EST) Date: Wed, 01 May 1996 18:33:30 -0500 (EST) From: Jerry Bryan Subject: Shamir and M-Conjugacy Don't Mix To: Cube-Lovers Message-Id: Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Content-Transfer-Encoding: 7BIT I have reluctantly concluded that encoding the nodes of a breadth first search tree as resentative elements of M-conjugacy classes cannot be combined with Shamir's method. The short version of the reason is that Shamir works only for post-multiplying, and we must both pre-multiply and post-multiply in order to calculate M-conjugates. It is possible, I think, to use a number of what might be called "clever hacks" involving M-conjugacy to reduce rather considerably the memory requirements of a program implementing Shamir's method, but the basic method cannot store just the representative elements. I will describe the clever hacks if and when I get a working program. How "clever" the clever hacks are may lie in the eye of the beholder. In all cases, they exchange reduced storage requirements for increased running time. It is always a question as to whether such trade-offs are advantageous or not. The longer explanation follows. I will be starting with some real basics. Almost any reasonable introductory or advanced math book will talk about functions. There are two basic views of what a function is. In one view, a function is a non-empty set X, a non-empty set Y, and some general rule or correspondence such that for every x in X, there is exactly one y in Y. In the other view, you start out with the set of all ordered pairs (x,y) with x in X and y in Y. This set is usually called X x Y (X cross Y). A function is then a non-empty subset of X x Y such that every x appears exactly one time as the left hand element of the ordered pair, so that again for every x in X there is exactly one y in Y. The second definition is probably more accurate, but it loses (on purpose, perhaps) the intuitive feel that there is some sort of "general" rule relating X and Y. Indeed, for a finite X and Y, there may be no shorter way to specify a particular function than simply to list the set of ordered pairs of which it is comprised. A function where X and Y are the same set is said to be a function "on X". A function may be one-to-one or onto or both. There are many (equivalent) definitions, but my favorites are that a function is onto if there is at least one x for every y, and a function is one-to-one if there is at most one x for every y. Hence, a function is both one-to-one and onto if there is exactly one x for every y. This condition is necessary if we wish to be able to run the function backwards, that is, if we wish to have an inverse function. Finally, a function that is on a set and which is one-to-one and onto is a permutation. We model the Rubik's cube as a set of permutations. Suppose F and G are functions. In algebra and calculus, we define the composition of two functions something like the following: FoG(x)=F(G(x)). Proper treatment of this definition would require some care in handling the domain and range of the respective functions. But we will dispose of this issue by simply stipulating that F and G are permutations on the same set. The algebra/calculus notation for function composition yields a right-to-left evaluation of the functions. This is especially visible if we compose more than two functions, e.g., FoGoH(x)=F(G(H(x))). In group theory, function composition is more typically written left-to-right such as HGF for the example at hand, with debates about where the argument goes. I prefer in front -- (x)HGF. In Cube Theory, we almost *always* write operators left-to-write, following group theory rather than algebra/calculus. This whole left-to-right vs. right-to-left issue is critical for for proper implementation of Shamir. It is especially critical to get it right because essentially all programming languages follow the algebra/calculus model, whereas Cube Theory follows group theory. Hence, everything is always backwards to some extent in a program. I'm an *old* programmer, so my first higher level programming language (after assembler) was FORTRAN. FORTRAN lets you have statements such as Y=X(I) or Y=SQRT(X). FORTRAN has rather obtuse semantics and is hard to compile. I can remember at the time I learned FORTRAN being puzzled by how the compiler could figure out whether parentheses meant function arguments or whether parentheses meant subscripts. More "modern" languages (say, those less than 20 years old) tend to use square brackets for subscripts, making the compiler's job a bit easier. But the vagaries of FORTRAN serve us well at this point. Suppose we want to define a permutation (which is after all, just a function) on 1..3. We define F as what old FORTRAN programmers called an array with three elements (more often called a vector these days). Then, we can assign values to the array elements, such as F(1)=2, F(2)=3, and F(3)=1. Finally, we can write statements such as Y=F(X), which look and act like functions, although FORTRAN thinks of them as arrays. (Well, F, X, and Y would have to be defined as INTEGERS, which is not very FORTRANish, but so be it). What about function composition, say G(F(X))? It works, but be careful what you mean. A very short snippit of code might look something like the following: X=1 Y=F(X) Z=G(F(X)) PRINT Y, Z Function composition works as advertized even though these things are really arrays. But the composition is calculated only for one particular value of X, namely X=1. If we want to calculate the full=blown composition H=GoF (group theory, H=FG), the code snippit would be H(1)=G(F(1)) H(2)=G(F(2)) H(3)=G(F(3)) As you can see, this programming way of implementing a permutation as an array is really the second way in which math books define a function, namely as a set of ordered pairs. For example, the function F from above is simply F={ (1,2), (2,3), 3,1) }. But the X values were never stored explicitly. Rather, they were the array indices. We would say that the F array stores the Y values as a vector. In this case, we would say that F=[2,3,1]. Notice that it is somewhat arbitrary whether X is represented by the indices and Y is represented by the vector, or vice versa. The way I have shown it seems more natural, but the vice versa is certainly tenable. Notice also that if we think of the vice versa where X is represented by the vector and Y is represented by the indices, then we have the inverse function F'. Hence, the same vector can represent both F and F'. As a practical matter, I really prefer to have indices represent X and to store the inverse as a separate vector. Let me switch to a more modern look and feel, using square brackets. The code to calculate an inverse would then look something like. F_inverse[F[1]] := 1; F_inverse[F[2]] := 2; F_inverse{F[3]] := 3; You would really do this with a loop, so it would be something like For i := 1 to 3 F_inverse[F[i]] := i; If you translate this loop back into group theory, it more or less states the identity FF'=I (the looping just makes sure that we touch all our X values -- the index i is our X value, and the order of F and F' is backwards between our program and group theory). The key component of Shamir's method involves multiplying a permutation t by each permutation s in a set S, where the set S is in lexicographic order. I want to show what happens with both pre-multiplying and post-multiplying. In order to deal with representative elements of M-conjugacy classes, we need both to pre-multiply by m' and to post-multiply by m, so the issue of pre-multiplying vs. post-multiplying is critical. I will use permutations on 1..4 in vector notation for my examples. t S tS [3,1,4,2] [1,2,3,4] = [3,1,4,2] [3,1,4,2] [1,2,4,3] = [4,1,3,2] [3,1,4,2] [1,3,4,2] = [4,1,2,3] [3,1,4,2] [2,1,3,4] = [3,2,4,1] [3,1,4,2] [3,1,2,4] = [2,3,4,1] [3,1,4,2] [4,3 2,1] = [2,4,1,3] S t St [1,2,3,4] [3,1,4,2] = [3,1,4,2] [1,2,4,3] [3,1,4,2] = [3,1,2,4] [1,3,4,2] [3,1,4,2] = [3,4,2,1] [2,1,3,4] [3,1,4,2] = [1,3,4,2] [3,1,2,4] [3,1,4,2] = [4,3,1,2] [4,3,2,1] [3,1,4,2] = [2,4,1,3] Let's take the case of St first. This is classic Shamir. S is in lexicographic order. Going from S to St, every 1 has been replaced by a 3, every 2 has been replaced by a 1, every 3 has been replaced by a 4, and every 4 has been replaced by a 2. We can get St into lexicographic order by sorting S in the order 2 first, 4 second, 1 third, and 3 fourth. The vector [2,4,1,3] which controls this alternative sorting order is simply t'. Hence, we don't really have to sort St if S is made into a tree. Rather, we traverse the S tree using t' as a template to define an alternative order of traversal, and St automagically pops out in lexicographic order. (By the way, there is a minor error in one of my previous posts. Allen Bawden used right-to-left notation in his original Shamir article. I copied what he wrote thinking he was using left-to-right notation. To properly "copy" what he wrote and also put it in left-to-write notation, I needed to reverse everything, but I failed to do so. Hopefully, everything will be consistent and correct in this article.) The tS case is much trickier. Think of S as a matrix. To get to tS, what you do is permute the columns. With the particular t we are using, column 3 becomes column 1, column 1 becomes column 2, column 4 becomes column 3, and column 2 becomes column 4. I really can't think of any Shamir-like tree traversal that would put tS into lexicographic order. To see the nature of the problem very clearly, look at the original S and think of sorting the rows using column 3 as the major sort. We can't really move the rows around of course, because we only have one S and we have to sort it differently for each t. Just look down column 3 and think about sorting without actually moving anything. Remember that in case of ties, you would then have to look at column 1, then column 4, etc. It's a mess, and I don't think you can do it without adding a data structure much larger than what we already have. And the original point of combining Shamir with M-conjugacy was to save memory. = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = Robert G. Bryan (Jerry Bryan) jbryan@pstcc.cc.tn.us Pellissippi State (423) 539-7127 10915 Hardin Valley Road (423) 694-6435 (fax) P.O. Box 22990 Knoxville, TN 37933-0990