Unknown ($D11D—D17B)
This one was hard to work out, so I'll reveal straight out what's happening: TODO: diagram, 8 bytes wide, 32 bytes high from 0x2129.
- column 1 & 2 - from
$2129
- set to all 0s - column 3 - from
$212B
- fill with 8 rows each of$5B
,$68
,$70
, and$78
. - column 4 - from
$212C
- random numbers between 0 and 255, where no two numbers are closer than 3 to each other. - column 5 & 6 - from
$212D
- fill with 8 rows each of$DCC6
,$DCCC
,$DCD2
and$DCD8
. These appear to be ROM addresses - column 7 & 8 - from
$212F
- fill with 8 rows each of$3C78
,$30A0
,$28C0
and$22E2
.
Set column 1 & 2
D11D: 8E 21 29 LDX #$2129 X = $2129
D120: CC 00 00 LDD #$0000 A = 0, B = 0
D123: ED 02 STD +$02,X Set (X+2)->value = 0
D125: 30 08 LEAX +$08,X X = X + 8
D127: 8C 22 29 CMPX #$2229 Check if X has reached $2229
D12A: 26 F7 BNE $D123 Loop to $D123 if not
Starting from $212B
and until $2229
, we loop over 8 byte chunks at a time, setting 2 of those bytes to 0 each time.
TODO: grid showing all 256 bytes and highlighting which of those have been changed.
In simplified C, this would look like this:
for (int i = 0x2129; i < 0x2229; i += 8)
{
memory[i] = 0;
memory[i+1] = 0;
}
Set column 4
This is by far the most complicated logic to follow. We want to populate this column with a set of random numbers, but we want some degree of even distribution amongst these numbers. Specifically, no two numbers are to be within 3 of each other. For example, if one of our random numbers if 6
, then we cannot for any of the other values choose the values 3
, 4
, 5
, 6
, 7
, 8
or 9
.
D12C: 8E 21 2C LDX #$212C insert_address = $212C
D12F: BD DB 1B JSR $DB1B candidate_random = random_number()
D132: 34 12 PSHS ,X,A Push insert_address and candidate_random to the stack
We set our pointer X to the first row of column 4 ($212C
), and generate a random number. By pushing these immediately on to the stack we indicate that we're about to re-use A and X for other purposes, but that we need these values later. Therefore we know that these values are variables, and we accordingly name them insert_address
and candidate_random
.
D134: 8E 21 2C LDX #$212C X = $212C
D137: A6 84 LDA ,X A = X->val
We immediately reset the value of X to $212C
. It's true that we already set it to $212C
only three lines ago, but we'll be looping back later when it's a different value. We then load the value from that memory location into the accumulator.
D139: A0 E4 SUBA ,S A = A - candidate_random
We now compare the candidate_random
number - which is stored at the current stack position - with the current row of column 4.
D13B: 2A 01 BPL $D13E Skip next line if the result is >= 0
D13D: 40 NEGA A = -A
D13E: 81 03 CMPA #$03 Check A - 3
D140: 23 14 BLS $D156 If A <= 3 branch to on_failed_candidate()
Let's look ahead to $D13E
first. Here we compare the result of our subtraction to 3. If the result is less than 3, then the candidate_random
is too close to an existing value in the column, and we branch to on_candidate_failed()
, which we will study shortly.
However, that comparison alone really only checks that the candidate_random
is at least 3 less than the value in memory; in other words it checks this:
candidate_random <= X + 3
TODO: is this actually checking X - 3?
but we need to check if the candidate_random
value is both 3 higher or 3 lower i.e.
X - 3 <= candidate_random <= X + 3
That's what $D13B
and $D13D
are for. By changing any negative value to a positive value we can then compare to a positive 3 and catch both cases.
BGE vs BPL
Why don't we use BGE
here instead? They look pretty similar; both compare whether one number is greater than another. But as it turns out, BPL
is great for comparing unsigned numbers while BGE
is better for signed numbers. The key difference is that BGE
considers not only the negative flag but also the overflow flag, which is set when the calculation causes an overflow or underflow of the byte boundaries at +128
or -127
. The following table shows some calculations where this does or doesn't make a difference.
calculation | mathematical result | CPU result | overflow | negative | BPL | BGE |
---|---|---|---|---|---|---|
47 - 115 |
-68 | -68 | 0 | 1 | false | false |
47 - 21 |
26 | 26 | 0 | 0 | true | true |
-90 - 40 |
-130 | 126 | 1 | 0 | true | false |
126 - (-127) |
253 | -3 | 1 | 1 | false | true |
When you study the table you can see why BGE
is recommended for normal arithmetic; it correctly indicates whether the actual mathematical result of the calculation was positive or negative. But BPL
ignores any overflow or underflow, and only looks at the final result on the CPU; as we'll see shortly, that is preferred in this case.
The sign reversal that occurs for negatives if A is negative - in effective the same as A = abs(A)
- is clever because it makes the subsequent CMPA
and BLS
act like a check on whether A
is within the range -3 to +3. In other words, in practice the branch occurs when -3 <= A <= 3
.
The decision to compare A to the proposed byte using BPL
rather than BGE
makes it clear that we are comparing how close, in unsigned terms, the two bytes are. If A is $7E
and the proposed byte is $81
, this operation would not branch:
D139: A0 E4 SUBA ,S A = $7E - $81
D13B: 2A 01 BGE $D13E result = $FE (254 unsigned, -2 signed)
overflow = 1, negative = 1
does branch
D13D: 40 NEGA (skipped!)
D13E: 81 03 CMPA #$03 Check A - 3, does not branch
But $7E
and $81
are actually very close in quality. On a number line they are in sequence: 7E
...7F
...80
...81
. Using BPL
will reflect this:
D139: A0 E4 SUBA ,S A = $7E - $81
D13B: 2A 01 BPL $D13E result = $FE (-2 signed)
overflow = 1, negative = 1
does not branch
D13D: 40 NEGA A = 2
D13E: 81 03 CMPA #$03 Check A - 3, does branch
If they are 3 or less apart, we consider the proposed byte a failure and restart the loop with a new proposed byte:
D156: BD DB 1B JSR $DB1B Call random_number()
D159: A7 E4 STA ,S Replace old random byte on stack with new random byte
D15B: 20 D7 BRA $D134 Go back to before start of loop
Note that while the "normal" loop goes back to $D137
, this actually jumps back to one instruction before that, which includes the initialization of X
to $212C
.
D142: 30 08 LEAX +$08,X X = X + 8
D144: 8C 22 2C CMPX #$222C Check if X has reached $222C
D147: 26 EE BNE $D137 Branch to $D137 if not
The above code is executed when the proposed byte is not within close proximity to the tested byte. We then increment X to move on to the next byte to test, until we have tested all 32 bytes in the range.
D149: 35 12 PULS ,A,X Pop random byte 1 and X off the stack
D14B: A7 84 STA ,X X->val = random byte
D14D: 30 08 LEAX +$08,X X = X + 8
D14F: 8C 22 2C CMPX #$222C
D152: 26 DB BNE $D12F Loop to $D12F until X = $222C
D154: 20 07 BRA $D15D Skip past reset_loop() function
We reach this point once we have determined, by testing against all 32 bytes in the column, that the proposed byte is valid. The byte then gets stored at the next position in the column and we loop the whole thing again until all 32 places have been filled.
Set column 3
D15D: 8E 21 2B LDX #$212B X = $212B
D160: 86 5D LDA #$5D A = $5D
D162: 8D 0E BSR $D172 copy_8x1bytechunks_8byteoffsets_fromA()
D164: 86 68 LDA #$68 A = $68
D166: 8D 0A BSR $D172 copy_8x1bytechunks_8byteoffsets_fromA()
D168: 86 70 LDA #$70 A = $70
D16A: 8D 06 BSR $D172 copy_8x1bytechunks_8byteoffsets_fromA()
D16C: 86 78 LDA #$78 A = $78
D16E: 8D 02 BSR $D172 copy_8x1bytechunks_8byteoffsets_fromA()
D170: 20 0A BRA $D17C
The above code populates the data in column 3, from $212B
down to $2223
. The helper function, is shown below.
# copy_8x1bytechunks_8byteoffsets_fromA()
D172: C6 08 LDB #$08
D174: A7 84 STA ,X
D176: 30 08 LEAX +$08,X
D178: 5A DECB
D179: 26 F9 BNE $D174
D17B: 39 RTS
The helper function performs a simple loop 8 times, setting one byte each time, somewhat like this:
for (int i = 0x212B; i < 0x2223; i += 8)
{
memory[i] = A
}
Set column 5 & 6
D17C: 8E 21 2D LDX #$212D X = $212D
D17F: 10 8E DC C6 LDY #$DCC6 Y = $DCC6
D183: 8D 14 BSR $D199 copy_8x2bytechunks_8byteoffsets_fromY()
D185: 10 8E DC CC LDY #$DCCC Y = $DCCC
D189: 8D 0E BSR $D199 copy_8x2bytechunks_8byteoffsets_fromY()
D18B: 10 8E DC D2 LDY #$DCD2 Y = $DCD2
D18F: 8D 08 BSR $D199 copy_8x2bytechunks_8byteoffsets_fromY()
D191: 10 8E DC D8 LDY #$DCD8 Y = $DCD8
D195: 8D 02 BSR $D199 copy_8x2bytechunks_8byteoffsets_fromY()
D197: 20 0B BRA $D1A4
TODO: Are we storing actual addresses here rather than actual values? - remains to be seen how we read them.
The above code populates the data in column 5 and 6, from $212D
down to $2225
. The helper function, is shown below.
# copy_8x2bytechunks_8byteoffsets()
D199: C6 08 LDB #$08 B = 8
D19B: 10 AF 84 STY ,X X->val = Y
D19E: 30 08 LEAX +$08,X X = X + 8
D1A0: 5A DECB B = B - 1
D1A1: 26 F8 BNE $D19B Loop until B = 0
D1A3: 39 RTS Return
Set column 7 & 8
D1A4: 8E 21 2F LDX #$212F X = $212F
D1A7: 86 3C LDA #$3C A = $3C
D1A9: C6 78 LDB #$78 B = $78
D1AB: 8D 14 BSR $D1C1 copy_8x2bytechunks_8byteoffsets_fromD()
D1AD: 86 30 LDA #$30 A = $30
D1AF: C6 A0 LDB #$A0 B = $A0
D1B1: 8D 0E BSR $D1C1 copy_8x2bytechunks_8byteoffsets_fromD()
D1B3: 86 28 LDA #$28 A = $28
D1B5: C6 C0 LDB #$C0 B = $A0
D1B7: 8D 08 BSR $D1C1 copy_8x2bytechunks_8byteoffsets_fromD()
D1B9: 86 22 LDA #$22 A = $22
D1BB: C6 E2 LDB #$E2 B = $E2
D1BD: 8D 02 BSR $D1C1 copy_8x2bytechunks_8byteoffsets_fromD()
D1BF: 20 0D BRA $D1CE
TODO: Are we storing actual addresses here rather than actual values? - remains to be seen how we read them.
# copy_8x2bytechunks_8byteoffsets_fromD()
D1C1: 10 8E 00 08 LDY #$0008 Y = 8
D1C5: ED 84 STD ,X X->val = D
D1C7: 30 08 LEAX +$08,X X = X + 8
D1C9: 31 3F LEAY -$01,Y Y = Y - 1
D1CB: 26 F8 BNE $D1C5 Loop until Y = 0
D1CD: 39 RTS Return
We have to use Y as a looping register instead of B because we're copying A and B into memory.
D1CE: 39 RTS Finally back to the main program!