Unknown ($D11D—D17B)

This one was hard to work out, so I'll reveal straight out what's happening: TODO: diagram, 8 bytes wide, 32 bytes high from 0x2129.

  • column 1 & 2 - from $2129 - set to all 0s
  • column 3 - from $212B - fill with 8 rows each of $5B, $68, $70, and $78.
  • column 4 - from $212C - random numbers between 0 and 255, where no two numbers are closer than 3 to each other.
  • column 5 & 6 - from $212D - fill with 8 rows each of $DCC6, $DCCC, $DCD2 and $DCD8. These appear to be ROM addresses
  • column 7 & 8 - from $212F - fill with 8 rows each of $3C78, $30A0, $28C0 and $22E2.

Set column 1 & 2

D11D: 8E 21 29     LDX #$2129   X = $2129
D120: CC 00 00     LDD #$0000   A = 0, B = 0
D123: ED 02        STD +$02,X   Set (X+2)->value = 0
D125: 30 08        LEAX +$08,X  X = X + 8
D127: 8C 22 29     CMPX #$2229  Check if X has reached $2229
D12A: 26 F7        BNE $D123    Loop to $D123 if not

Starting from $212B and until $2229, we loop over 8 byte chunks at a time, setting 2 of those bytes to 0 each time. TODO: grid showing all 256 bytes and highlighting which of those have been changed.

In simplified C, this would look like this:

for (int i = 0x2129; i < 0x2229; i += 8)
{
    memory[i] = 0;
    memory[i+1] = 0;
}

Set column 4

This is by far the most complicated logic to follow. We want to populate this column with a set of random numbers, but we want some degree of even distribution amongst these numbers. Specifically, no two numbers are to be within 3 of each other. For example, if one of our random numbers if 6, then we cannot for any of the other values choose the values 3, 4, 5, 6, 7, 8 or 9.

D12C: 8E 21 2C     LDX #$212C   insert_address = $212C
D12F: BD DB 1B     JSR $DB1B    candidate_random = random_number()
D132: 34 12        PSHS ,X,A    Push insert_address and candidate_random to the stack

We set our pointer X to the first row of column 4 ($212C), and generate a random number. By pushing these immediately on to the stack we indicate that we're about to re-use A and X for other purposes, but that we need these values later. Therefore we know that these values are variables, and we accordingly name them insert_address and candidate_random.

D134: 8E 21 2C     LDX #$212C   X = $212C
D137: A6 84        LDA ,X       A = X->val

We immediately reset the value of X to $212C. It's true that we already set it to $212C only three lines ago, but we'll be looping back later when it's a different value. We then load the value from that memory location into the accumulator.

D139: A0 E4        SUBA ,S      A = A - candidate_random

We now compare the candidate_random number - which is stored at the current stack position - with the current row of column 4.

D13B: 2A 01        BPL $D13E    Skip next line if the result is >= 0
D13D: 40           NEGA         A = -A
D13E: 81 03        CMPA #$03    Check A - 3
D140: 23 14        BLS $D156    If A <= 3 branch to on_failed_candidate()

Let's look ahead to $D13E first. Here we compare the result of our subtraction to 3. If the result is less than 3, then the candidate_random is too close to an existing value in the column, and we branch to on_candidate_failed(), which we will study shortly.

However, that comparison alone really only checks that the candidate_random is at least 3 less than the value in memory; in other words it checks this:

candidate_random <= X + 3

TODO: is this actually checking X - 3?

but we need to check if the candidate_random value is both 3 higher or 3 lower i.e.

X - 3 <= candidate_random <= X + 3

That's what $D13B and $D13D are for. By changing any negative value to a positive value we can then compare to a positive 3 and catch both cases.

BGE vs BPL

Why don't we use BGE here instead? They look pretty similar; both compare whether one number is greater than another. But as it turns out, BPL is great for comparing unsigned numbers while BGE is better for signed numbers. The key difference is that BGE considers not only the negative flag but also the overflow flag, which is set when the calculation causes an overflow or underflow of the byte boundaries at +128 or -127. The following table shows some calculations where this does or doesn't make a difference.

calculation mathematical result CPU result overflow negative BPL BGE
47 - 115 -68 -68 0 1 false false
47 - 21 26 26 0 0 true true
-90 - 40 -130 126 1 0 true false
126 - (-127) 253 -3 1 1 false true

When you study the table you can see why BGE is recommended for normal arithmetic; it correctly indicates whether the actual mathematical result of the calculation was positive or negative. But BPL ignores any overflow or underflow, and only looks at the final result on the CPU; as we'll see shortly, that is preferred in this case.

The sign reversal that occurs for negatives if A is negative - in effective the same as A = abs(A) - is clever because it makes the subsequent CMPA and BLS act like a check on whether A is within the range -3 to +3. In other words, in practice the branch occurs when -3 <= A <= 3.

The decision to compare A to the proposed byte using BPL rather than BGE makes it clear that we are comparing how close, in unsigned terms, the two bytes are. If A is $7E and the proposed byte is $81, this operation would not branch:

D139: A0 E4        SUBA ,S      A = $7E - $81
D13B: 2A 01        BGE $D13E    result = $FE (254 unsigned, -2 signed)
                                overflow = 1, negative = 1
                                does branch
D13D: 40           NEGA         (skipped!)
D13E: 81 03        CMPA #$03    Check A - 3, does not branch

But $7E and $81 are actually very close in quality. On a number line they are in sequence: 7E...7F...80...81. Using BPL will reflect this:

D139: A0 E4        SUBA ,S      A = $7E - $81
D13B: 2A 01        BPL $D13E    result = $FE (-2 signed)
                                overflow = 1, negative = 1
                                does not branch
D13D: 40           NEGA         A = 2
D13E: 81 03        CMPA #$03    Check A - 3, does branch

If they are 3 or less apart, we consider the proposed byte a failure and restart the loop with a new proposed byte:

D156: BD DB 1B     JSR $DB1B    Call random_number()
D159: A7 E4        STA ,S       Replace old random byte on stack with new random byte
D15B: 20 D7        BRA $D134    Go back to before start of loop

Note that while the "normal" loop goes back to $D137, this actually jumps back to one instruction before that, which includes the initialization of X to $212C.

D142: 30 08        LEAX +$08,X  X = X + 8
D144: 8C 22 2C     CMPX #$222C  Check if X has reached $222C
D147: 26 EE        BNE $D137    Branch to $D137 if not

The above code is executed when the proposed byte is not within close proximity to the tested byte. We then increment X to move on to the next byte to test, until we have tested all 32 bytes in the range.

D149: 35 12        PULS ,A,X    Pop random byte 1 and X off the stack
D14B: A7 84        STA ,X       X->val = random byte
D14D: 30 08        LEAX +$08,X  X = X + 8
D14F: 8C 22 2C     CMPX #$222C  
D152: 26 DB        BNE $D12F    Loop to $D12F until X = $222C
D154: 20 07        BRA $D15D    Skip past reset_loop() function

We reach this point once we have determined, by testing against all 32 bytes in the column, that the proposed byte is valid. The byte then gets stored at the next position in the column and we loop the whole thing again until all 32 places have been filled.

Set column 3

D15D: 8E 21 2B     LDX #$212B   X = $212B
D160: 86 5D        LDA #$5D     A = $5D
D162: 8D 0E        BSR $D172    copy_8x1bytechunks_8byteoffsets_fromA()
D164: 86 68        LDA #$68     A = $68
D166: 8D 0A        BSR $D172    copy_8x1bytechunks_8byteoffsets_fromA()
D168: 86 70        LDA #$70     A = $70
D16A: 8D 06        BSR $D172    copy_8x1bytechunks_8byteoffsets_fromA()
D16C: 86 78        LDA #$78     A = $78
D16E: 8D 02        BSR $D172    copy_8x1bytechunks_8byteoffsets_fromA()
D170: 20 0A        BRA $D17C

The above code populates the data in column 3, from $212B down to $2223. The helper function, is shown below.

# copy_8x1bytechunks_8byteoffsets_fromA()
D172: C6 08        LDB #$08
D174: A7 84        STA ,X
D176: 30 08        LEAX +$08,X
D178: 5A           DECB
D179: 26 F9        BNE $D174
D17B: 39           RTS

The helper function performs a simple loop 8 times, setting one byte each time, somewhat like this:

for (int i = 0x212B; i < 0x2223; i += 8)
{
    memory[i] = A
}

Set column 5 & 6

D17C: 8E 21 2D     LDX #$212D   X = $212D
D17F: 10 8E DC C6  LDY #$DCC6   Y = $DCC6
D183: 8D 14        BSR $D199    copy_8x2bytechunks_8byteoffsets_fromY()
D185: 10 8E DC CC  LDY #$DCCC   Y = $DCCC
D189: 8D 0E        BSR $D199    copy_8x2bytechunks_8byteoffsets_fromY()
D18B: 10 8E DC D2  LDY #$DCD2   Y = $DCD2   
D18F: 8D 08        BSR $D199    copy_8x2bytechunks_8byteoffsets_fromY()
D191: 10 8E DC D8  LDY #$DCD8   Y = $DCD8
D195: 8D 02        BSR $D199    copy_8x2bytechunks_8byteoffsets_fromY()
D197: 20 0B        BRA $D1A4

TODO: Are we storing actual addresses here rather than actual values? - remains to be seen how we read them.

The above code populates the data in column 5 and 6, from $212D down to $2225. The helper function, is shown below.

# copy_8x2bytechunks_8byteoffsets()
D199: C6 08        LDB #$08     B = 8
D19B: 10 AF 84     STY ,X       X->val = Y
D19E: 30 08        LEAX +$08,X  X = X + 8
D1A0: 5A           DECB         B = B - 1
D1A1: 26 F8        BNE $D19B    Loop until B = 0
D1A3: 39           RTS          Return

Set column 7 & 8

D1A4: 8E 21 2F     LDX #$212F   X = $212F
D1A7: 86 3C        LDA #$3C     A = $3C
D1A9: C6 78        LDB #$78     B = $78
D1AB: 8D 14        BSR $D1C1    copy_8x2bytechunks_8byteoffsets_fromD()
D1AD: 86 30        LDA #$30     A = $30
D1AF: C6 A0        LDB #$A0     B = $A0
D1B1: 8D 0E        BSR $D1C1    copy_8x2bytechunks_8byteoffsets_fromD()
D1B3: 86 28        LDA #$28     A = $28
D1B5: C6 C0        LDB #$C0     B = $A0
D1B7: 8D 08        BSR $D1C1    copy_8x2bytechunks_8byteoffsets_fromD()
D1B9: 86 22        LDA #$22     A = $22
D1BB: C6 E2        LDB #$E2     B = $E2
D1BD: 8D 02        BSR $D1C1    copy_8x2bytechunks_8byteoffsets_fromD()
D1BF: 20 0D        BRA $D1CE

TODO: Are we storing actual addresses here rather than actual values? - remains to be seen how we read them.

# copy_8x2bytechunks_8byteoffsets_fromD()
D1C1: 10 8E 00 08  LDY #$0008   Y = 8
D1C5: ED 84        STD ,X       X->val = D
D1C7: 30 08        LEAX +$08,X  X = X + 8
D1C9: 31 3F        LEAY -$01,Y  Y = Y - 1
D1CB: 26 F8        BNE $D1C5    Loop until Y = 0
D1CD: 39           RTS          Return

We have to use Y as a looping register instead of B because we're copying A and B into memory.

D1CE: 39           RTS          Finally back to the main program!
Written on June 15, 2015