Steven C. Morrell : Mathematical Models for Step Sizes

morrell@math.utah.edu 10 Apr 1996 05:29:54 -0400 Storm King Ind. Inc. Abstract: Corewar is a programming game played on a board simulating computer memory. This board is called the core, and consists of N cells, each containing an assembly language command. Programs run simultaneously, with the objective of killing other programs in core and thus becoming the sole owner of the core. Addition, subtraction, and multiplication give the core a Z/NZ ring structure. N is called the coresize. Popular values of Nare 8000, 8192, and 55440. In this article, we examine ways to optimize simple bombing or scanning step sizes. We give formulas to measure 'optimization' of a step for a given coresize, and prove that imp-pairs have the same optima score. Contents: (0) Terminology (1) Euclid's Algorithm and Continued Fractions (2) Some Models for Bombing Core (incomplete) (3) A Better Formula for the Length of Intervals (4) The Optima Step Formula (5) The Find-X Formula (omitted in current draft) (6) Locally Fibonacci Step-Sizes (omitted in current draft) (7) Calculations (omitted in current draft)

(0) Terminology

We denote subscripts and superscripts in a TeX-ish way. So "X sub 1" will be written X_1, and X^2 = X*X. If we need to perform math on subscripts, we will use curly braces to denote groupings. We also use the following symbols: Z+ : The set of all positive integers. (A,B) : The greatest common factor of positive integers A,B. [x] : The integral part of a non-negative real number x. A|B : The integer A divides the integer B evenly (that is, without remainder.)

(1) Euclid's Algorithm and Continued Fractions

In this section, we build the mathematical framework for the results of this article. We explore the relationship betwwen Euclid's Algorithm and continued fractions, and prove a classic result of Euler, who seems to have anticipated Corewar in 1764. Recall that Euclid's Algorithm computes (A,B), with integers A>B>0, as follows: Let X_0 = A and X_1 = B. Inductively define X_{i+1} = X_{i-1} - X_i*Y_i, (1) with Y_i=[X_{i-1}/X_i]. (2) Clearly, 0 <= X_{i+1} < X_i. If X_{i+1}=0, our induction must stop (because Y_{i+1} is undefined). In this case, define k to be the largest i for which X_i is non-zero. Since X_{i+1}0. Hence, X_k and (A,B) have the same factors, so X_k=(A,B). Definition: Given A_1,A_2,...,A_n in Z+, we define the continued fraction generated by the A_i's as = A_1+1/(A_2+1/...(A_{n-1}_n)...) (3) We define the length of to be n. Clearly, is always rational. We define the numerator of to be p, where p/q is the reduced fraction with p/q= and denote it by p=[A_1,...,A_n]. We allow "empty" numerators, defining []=1. Propositon 1: Given A,B in Z+ with A>B, we have A/B = where the Y_i's and k are defined by Euclid's algorithm using (1) and (2). Proof: We the notations of Euclid's algorithm above and induction on k. When k=1, we have X_2=0 (by definition of k), so we know by (1) with i=1 that 0 = X_0-Y_1*X_1. Hence, = Y_1 = X_0/X_1. Now, suppose k>1. Note that X_1>X_2>0, so we can use Euclid's algorthm on A'= X_1, B'= X_2 to find each Y'_i, and it is easy to see that Y'_i = Y_{i+1}. Hence, for induction we may assume that X_1 / X_2 = . We then have = Y_1+1/ = Y_1+1/(X_1/X_2) = (Y_1 * X_1 + X_2)/X_1 = X_0/X_1 = A/B. [box] Corollary 2: If we are given the Y_i's and m=(A,B), we can recover A and B. Proof: is rational, hence can be uniquely written as a reduced fraction a/b. Existence: Since this fraction is reduced, we have (a,b)=1. Define A= ma, B= mb. Then A/B=a/b= and (A,B)=m*(a,b)=1. Uniqueness: If A/B=A'/B'= and (A,B)=(A',B')=m then A/m,B/m are in Z+ and (A/m,B/m)=1, so (A/m)/(B/m) is the reduced fraction a/b. Hence A/m=a, B/m=b. Similary, A'/m=a, B'/m=b. Thus A=A' and B=B'. [box] Corollary 3: Given A,B in Z+ with A>B and (A,B)=1, we have A = [Y_1,...,Y_k], B = [Y_2,...,Y_k] where the Y_i's and k are defined by Euclid's algorithm using (1) and (2). Proof: From Proposition 1, we know A/B=. This fraction is reduced when (A,B)=1, so A = [Y_1,...,Y_k] by definition. If k=1, then B=1=[]. Now if k>1 and C/D is the reduced fraction , then we have A/B= Y_1 + 1/(C/D) = (Y_1*C + D)/C. The fractions on both sides of this equation are reduced, hence B=C. But C = [Y_2,...,Y_k] by definition. [box] Corollary 4: For any continued fraction with n>=2, the following identity holds: [A_1,...,A_n] = A_1*[A_2,...,A_n] + [A_3,...,A_n]. (4) Proof: If A/B= and C/D= are both reduced fractions, then A/B= (A_1*C + D)/C, as in the proof of Corollary 3. Multiplying both sides by B=C, we obtain: A = A_1*C + D. From Corollary 3, we have A = [A_1,...,A_n], C = [A_2,...,A_n], and D = [A_3,...,A_n]. [box] Examples: 43/10 = <4,3,3>. Also, 43/10 = <4,3,2,1>, even though these are not the Y_i defined in (2). 43/30 = <1,2,3,4>. If (A,B)=1 and A/B = <1,1,...,1> (k times), then A=F_{k+1} and B=F_k, where F_i is the i'th term in the Fibonacci sequence {1,1,2,3,5,8,13,...}. We now come to the main result of this section, proved by Euler in 1764. Theorem 5: For any continued fraction , the following relations hold: [A_1,...,A_n]=[A_n,...,A_1] (5) [A_2,...,A_n] * [A_1,...,A_{n-1}] = [A_1,...,A_n] * [A_2,...,A_{n-1}] - (-1)^n (n>1) (6) Proof: We use induction on n. When n=0 or n=1, (5) is tautological. And for n=2, we have, by (4), [A_1,A_2]= A_1 * [A_2]+ [] = A_1 * A_2 + 1. Hence, [A_2] * [A_1] = A_2 * A_1 = (A_1 * A_2 + 1)*1 -1 = [A_1,A_2]*[]-(-1)^2, so (6) holds. For n>=2, we use (4) and induction on (5) repeatedly to get: [A_1,...,A_2] = A_1 * [A_2,...,A_n] + [A_3,...,A_n] = A_1 * [A_n,...,A_2] + [A_n,...,A_3] = A_1 * (A_n * [A_{n-1},...,A_2] + [A_{n-2},...,A_2]) + A_n * [A_{n-1},...,A_3] + [A_{n-2},...,A_3] = A_1 * A_n * [A_2,...,A_{n-1}] + A_1 * [A_2,...,A_{n-2}] + A_n * [A_3,...,A_{n-1}] + [A_3,...,A_{n-2}]. If we replace (A_1,...,A_n) with (A_n,...,A_1), the first and fourth terms of the right-hand expression stay the same, and the second and third terms get swapped. Thus, [A_1,...,A_n] = [A_n,...,A_1] since they both equal the same expression. This proves (5) for all n>0. Finally, for n>=3, we can assume by induction that (6) holds for . In other words, [A_2,...,A_{n-1}]*[A_3,...,A_n}] = [A_3,...,A_{n-1}]*[A_2,...,A_n] -(-1)^{n-1}. Thus, [A_1,...,A_n]*[A_2,...,A_{n-1}] = (A_1*[A_2,...,A_n]+[A_3,...,A_n]) *[A_2,...,A_{n-1}] = A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}] +[A_3,...,A_n]*[A_2,...,A_{n-1}] = A_1*[A_2,...,A_n]*[A_2,...,A_{n-1}] +[A_3,...,A_{n-1}]*[A_2,...,A_n]+(-1)^n = [A_2,...,A_n]*(A_1*[A_2,...,A_{n-1}]+[A_3,...,A_{n-1}])+(-1)^n = [A_2,...,A_n]*[A_1,...,A_{n-1}]+(-1)^n. This is (6) for , so by induction, (6) holds for all n>1. [box] Corollary 6: Given positive integers A,B, with A/B = , define positive integers A',B' by A'/B' = (7) where (A',B')=(A,B) (8) Then the following relationships hold: A' = A; (9) B'*B = -(-1)^k*(A,B)^2 (mod A). (10) Proof: Let m=(A,B). Then (A/m)/(B/m) is the reduced form of A/B, whence A = m*[Y_1,...,Y_k], B = m*[Y_2,...,Y_k]. Similarly, A' = m*[Y_k,...,Y_1], B' = m*[Y_{k-1},...,Y_1]. Then (9) follows from (5), and (6) can be written as: (B/m)*(B'/m)=(A/m)*[Y_2,...,Y_{k-1}]-(-1)^k. We multiply both sides by m^2 to get B*B'=A*m*[Y_2,...,Y_{k-1}]-(-1)^k*m^2. Then (10) follows from passing to congruence (mod A). [box]

(2) Some Models for Bombing Core

Let N be the coresize and let KZ/NZ given by f(n)=n+N. Rather than mess with cosets, we will refer to elements of core by representatives in Z. For a0, let U_j be the interval containing 0 in the partition after j bombs. Theorem 7: For all j with 0=0 is in the partition. In the first case, we have ||[a,b)||=||[d+g*K,e+G*K)||=e-d=||[d,e)||, so we are done. Otherwise, since d is in the bombing run, we know that d=l*K for some l with 1(3) A Better Formula for the Length of Intervals Here is some terminology for the next result. Let X_0=N, X_1=K, and define X_i and Y_i using Euclid's algorithm (that is, equations (1) and (2)). Further, define Z_i inductively by Z_0=0, Z_1=1, and Z_{i+1}=Z_i*Y_i+Z_{i-1}. (11) By proposition 1, we have X_0/X_1=(X_0/m)/(X_1/m)=. Applying Corollary 3 to X_0/m,X_1/m, we get X_0= m*[Y_1,...,Y_k] X_1= m*[Y_2,...,Y_k] and inductively using Corollary 4 with equation (1), we get X_i= m*[Y_{i+1},...,Y_k] (0<=i<=k). (12) Inductively using Corollary 4 with equation (11), we also get Z_i= [Y_{i-1},...,Y_1] (1<=i<=k+1). (13) Now we can describe the bombing sequence as a series of bombing runs by looking at the intervals U_i containing 0. Definition: The i'th bombing run for i=1 and X_{i-1}-p*X_i>= 0, whch means p<=Y_i. We have by hypothesis that the first bomb in the i'th bombing run is dropped at X_{i-1}-X_i and shrinks U_j. So, what is the gap in time between a bombs being dropped at X_{i-1}-(p-1)*X_i and X_{i-1}-p*X_i? Since all bombs are evenly spaced, this gap will be the same for all p, and with p=1, we know that the Z_{i-1}'th bomb dropped at X_{i-1} and the Z_{i-1}+Z_i'th bomb drops at X_{i-1}-X_i. Thus the value of p increases every Z_i bombs. We know that the value of p starts at one with the Z_i+Z_{i-1}'th bomb, so we can verify that the p given in the statement of the theorem is correct. Next, we must verify that the bombs we just dropped were the i'th bombing run. We started with bomb Z_i+Z_{i-1}, and dropped Z_i bombs for each p between 1 and Y_i (except for the last bombing run, where p only ranged between 1 and Y_i-1). Thus, the last bomb in this bombing was the Z_i+Z_{i -1}+Y_i*Z_i-1=Z_i+Z_{i+1}-1'th bomb (or Z_{k+1}-1=N/m-1 for the last run). From the last two paragraphs, with p=Y_i+1, we see that the first bomb of the next bombing run is dropped at X_{i-1}-(p+1)*X_i= X_{i+1}-X_i. And with p=Y_i, we have the Z_{i+1}'th bomb is the Z_i*Y_i+Z_{i-1}'th bomb, which drops at X_{i-1}-X_i*Y_i= X_{i+1}. This finishes the induction step. [box]

(4) The Optima Step Formula

In Corewar, the step size of a bombing or scanning pattern is the distance between successive bombing or scanning locations. Chosing a good step size is critical to the success of a warrior, so we need a way of measuring the effectiveness of a step size. What makes one step size better than another? Here is one train of thought: Larger enemies can usually be found faster than small ones. If your step takes the same time to find small enemies as large ones, it probably needs to be optimized against larger enemies. On the other hand, if your step size is too coarse, small programs can slip through the cracks. For instance, assuming 8000 instructions in core, a step size of 4000 is bad. The first two bombs are spaced well, but the pattern only hits two core locations. An ideal pattern will divide the core into successively smaller chunks, until it has checked all locations in core (except where your code resides). We can measure the success of this method by adding the lengths of the largest untouched segments during different stages of bombing. This sum is the optima function of a coresize and step. A lower function value between step-sizes with the same greatest common factor indicates a more optimal pattern. Definition: The optima function is defined to be: O(N,K)= sum_j( ||U_j|| ). Corollary 8 assures that this is the sum we want. Proposition 10: O(N,K) = O(N,N-K). Proof: In coresize N, we can consider a step-size of N-K as a step-size of K bombing in the other direction. If the U_j are the intervals containing 0 for step -size K, as above, then, by symmetry, the intervals U'_j containing 0 for step-size N-K will just be the U_j reversed; that is, if for some j, U_j=[ -a,b), then U'_j=[-b,a). Note that ||U_j|| = a+b = ||U'_j||. We sum over all j to get the result. [box] Theorem 11: O(N,K)= sum_{i=1}^k(X_{i-1}*Z{i+1}-X_i*Z_i-X_i*Z_i*(Y_i*(Y_i-1)/2) Proof: We write O(N,K)= sum_{i=1}^k O_i(N,K), where each O_i is the sum of the ||U_j|| from the i'th bombing run. In the i'th bombing run, we have (by theorem 9) Z_i intervals of length X_{i-1}-(p-1)*X_i, for p ranging from 1 to Y_i. Adding these up, we think that O_i(N,K)= sum_{p=1}^{Y_i} Z_i*(X_{i-1}-(p-1)*X_i) = sum_{p=1}^{Y_i}(Z_i*X_{i-1}) - sum_{p=1}^{Y_i}(Z_i*X_i*(p-1)) = Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) Well, not quite. We forgot that the last bombing run only uses the values of p from 1 to Y_k-1. Thus we must add the correction term -Z_k*(X_{k-1}-(Y_k-1)*X_k)= -Z_k*(X_{k-1}-Y_k*X_k+X_k) = -Z_k*(X_{k+1}+X_k) = -Z_k*X_k since X_{k+1}=0. We thus have O(N,K)= (15) sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) ) -Z_k*X_k. We also have the following telescoping sum: sum_{i=1}^k( X_i*Z_i - X_{i-1}*Z_{i-1}) = X_k*Z_k - X_0*Z_0 = X_k*Z_k since Z_0=0. Substituting this into (15), we get O(N,K)= (16) sum_{i=1}^k( Y_i*Z_i*X_{i-1} - Z_i*X_i*(Y_i*(Y_i-1)/2) - X_i*Z_i + X_{i-1}*Z_{i-1}). and the first and fourth terms of the sum combine thusly: Y_i*Z_i*X_{i-1} + X_{i-1}*Z_{i-1} = X_{i-1}*(Y_i*Z_i+Z_{i-1}) = X_{i-1}*Z{i+1}. to yield the formula we wished to prove. [box] Corollary 12: If {K,K''} is an imp-pair for coresize N, then O(N,K)=O(N,K''). Proof: Recall that {K,K''} is an imp-pair if and only if K*K'' = 1 (mod N). For any given K,N, we know that K'' exists if and only if (K,N)=1, and we know that K'' is unique. Construct K' and N' as in corollary 6. Then we have N=[Y_1,...,Y_k], K=[Y_2,...,Y_k], N'=[Y_k,...,Y_1], K'=[Y_{k-1},...,Y_1]. If we define X'_i,Y'_i, and Z'_i by X'_0=N',X'_1=K' and the recurrence relations in section 3, we get the follwing for all applicable i: Y'_i=Y_{k+1-i} X'_i=Z_{k+1-i} Z'_i=X_{k+1-i}. If we apply theorem 11 and reverse the order of summation, we get O(N',K')= sum_{i=1}^k(X'_{i-1}*Z'{i+1} -X'_i*Z'_i -X'_i*Z'_i*(Y'_i*(Y'_i-1)/2) = sum_{i=1}^k(X'_{k+1-(i-1)}*Z'_{k+1-(i+1)} -X'_{k+1-i}*Z'_{k+1-i} -X'_{k+1-i}*Z'_{k+1-i}*(Y'_{k+1-i}*(Y'_{k+1-i}-1)/2) = sum_{i=1}^k(Z_{i+1}*X_{i-1} -Z_i*X_i -Z_i*X_i*(Y_i*(Y_i-1)/2) = O(N,K). Now, by corollary 6, we have N'=N and K*K'= +/- 1 (mod N), so by uniqueness of K'', we have either K''=K' (mod N) or K''=-K' (mod N). In the first case, O(N,K'')=O(N',K'), and in the second case, O(N,K'')= O(N,N-K')=O(N',K') by proposition 10. [box]