As a teenager in the '80s, I had a friend who had Milton Bradley's Dark Tower game. We loved playing it, and I recently picked up a copy on eBay. While playing it again, I started wondering about its internal workings, and after a bit of research I stumbled across Sean Riddle's TMS1400 page.
Seeing the work he did there and the ROM dumps that were available, I immediately set out in search of a disassembler for the TMS1400. After considerable digging, I discovered @paulscottrobson's Simon2Simon project, where he had written a TMS1000 disassembler and assembler. I took those (yeah, I should have done some git magic and created a branch, but this is the first time I've ever used GitHub) and heavily modified both to support the TMS1400 and some of the things I wanted to accomplish. This did require taking out some features he had put in, and I'll apologize in advance for my lack of Python skill. In spite of my ineptitude at Python, I was able to take his code and build a disassembler and its sister assembler that round-trip the ROM: the assembled binary matches the original input. I can't emphasize enough that he did the vast majority of the heavy lifting.
Once I had that, it was just a matter of starting to dig into the assembly code along with the TMS1400 documentation, the various bits of information on Sean's page and the Dark Tower instructions themselves. This took me reasonably far until I started trying to figure out the physical interactions. It was then that I discovered that MAME had a built-in debugger and already supported the Dark Tower game. Visibility into the internal workings of the processor and having a single-step debugger basically made the entire process easy.
The various (important) files that make up this repository are:
- `darktower.asm` - The file that was originally generated by the disassembler (`bin/dasm.py`); this has been heavily commented by me
- `darktower.lst` - The listing file generated by the assembler (`bin/tasm.py`)
- `darktower.sym` - Symbol file generated by the assembler and taken as an input
- `darktower.bin` - Binary dump from Sean Riddle's site
- `Dark Tower Info.txt` - Various notes I've made while digging through the code
- `Dark Tower RAM usage.xlsx` - Excel spreadsheet representing the TMS1400's RAM and how Dark Tower uses it
- `Dark Tower Documentation/Dark Tower controller Pinout.txt` - Sean's original file with some additional information I've discovered
- `bin/dasm.py` - My version of @paulscottrobson's TMS1100 disassembler heavily modified to disassemble TMS1400 binary data
- `bin/tasm.py` - My version of @paulscottrobson's TMS1100 assembler heavily modified to assemble TMS1400 code (specifically, dasm's output)
The TMS1400 has eight files of 16 4-bit words that are addressed by instructions using the X and Y registers as pointers. While MAME simply treats the RAM as sequentially-addressed bytes of data, this didn't seem appropriate to me. In all of my comments as well as the Excel spreadsheet, I refer to RAM addresses in the format of X/Y, where X is the file identifier and Y is the word in that file.
For example, in the AD2B10 routine, I have the following comment:
;********************************************************************************
; AD2B10
;
; Perform Base 10 arithmetic on values in scratchpad RAM 4/1-2 and 5/1-2
;
; Add the two-digit base 10 number encoded in 4/1 and 4/2 to the two-digit
; base 10 number encoded in 5/1 and 5/2 giving a three-digit base-10 number
; encoded in 5/0-2.
;
; For example: 0 1 2 (x means doesn't matter--overwritten)
; 4 x 1 9
; 5 x 2 2
;
; Returns: 0 1 2
; 4 0 1 9
; 5 0 4 1
;********************************************************************************
The reference to RAM `4/1-2` refers to the words referenced when the X register is loaded with a 4 and the Y register is loaded with a 1 and subsequently a 2.
The same applies in the comment about `5/1`, where X=5 and Y=1. When viewing RAM in MAME, this would refer to address 0x51.
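The relationship between the X/Y notation and MAME's flat view can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of `dasm.py` or `tasm.py`, and it assumes MAME's address is simply X in the high nibble and Y in the low nibble, which is consistent with 5/1 appearing at 0x51.

```python
def xy_to_mame(x: int, y: int) -> int:
    """Convert a file/word (X/Y) RAM reference to a flat MAME-style address."""
    if not 0 <= x <= 7:
        raise ValueError("X (file) must be 0-7")
    if not 0 <= y <= 15:
        raise ValueError("Y (word) must be 0-15")
    return (x << 4) | y


def mame_to_xy(address: int) -> str:
    """Format a flat address in the X/Y notation used in the comments."""
    return f"{address >> 4}/{address & 0x0F}"


print(hex(xy_to_mame(5, 1)))   # 0x51
print(mame_to_xy(0x42))        # 4/2
```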
In the examples, the numbers in the first row and first column represent the Y and X values respectively forming a grid. I suppose representing it this way would make more sense:
    0 1 2
  -------
4 | x 1 9
5 | x 2 2
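For readers who think better in code than in grids, here is a rough Python model of what the AD2B10 header comment describes. It models only the arithmetic, not the actual TMS1400 instruction sequence; the `ram` list-of-lists layout and the function name are assumptions made for illustration.

```python
def ad2b10(ram):
    """Add the base-10 number in 4/1-2 to the one in 5/1-2; result in 5/0-2."""
    ram[4][0] = 0          # the header comment shows 4/0 and 5/0 being overwritten
    ram[5][0] = 0
    carry = 0
    for y in (2, 1, 0):    # ones digit first, then tens, then hundreds
        total = ram[4][y] + ram[5][y] + carry
        carry, ram[5][y] = divmod(total, 10)
    return ram


# Eight files (X) of sixteen 4-bit words (Y), indexed as ram[X][Y].
ram = [[0] * 16 for _ in range(8)]
ram[4][1], ram[4][2] = 1, 9    # 19
ram[5][1], ram[5][2] = 2, 2    # 22
ad2b10(ram)
print(ram[5][0:3])             # [0, 4, 1]
```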
I didn't put a ton of effort into the assembler itself. It basically assumes any characters in the first eight positions are labels. In the initial pass of the disassembler, labels are generated with either an "L" or an "S" as their first character (long or short respectively) concatenated with the address they represent. Note the L/S usage is determined by the first time the label is generated. It's possible that a long branch used once could be targeted by short branches later. No change to the label would be made.
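The two conventions above can be sketched as follows. This is not the code in `dasm.py` or `tasm.py`; the function names are hypothetical, and the three-hex-digit label format is inferred from labels such as S03E and L040 in the listing.

```python
def label_for(address, is_long, labels):
    """Return the label for an address, creating it on first use.

    The L/S prefix is fixed by whichever kind of branch reaches the
    address first; later references never rename the label.
    """
    if address not in labels:
        prefix = "L" if is_long else "S"
        labels[address] = f"{prefix}{address:03X}"
    return labels[address]


def split_source_line(line):
    """Treat anything in the first eight positions as the label field."""
    return line[:8].strip(), line[8:].strip()


labels = {}
print(label_for(0x03E, False, labels))       # S03E
print(label_for(0x03E, True, labels))        # still S03E
print(split_source_line("S000    ldx 0"))    # ('S000', 'ldx 0')
```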
As I was writing this document, I stumbled across the original TI assembler file format on page 4-4 of the Programmer Reference Manual. I wish I had seen this earlier, as I would have followed their formatting standard; at this point, however, it would be difficult to run the code back through the disassembler and reintegrate the comments. Hindsight's 20/20.
Here are the first few lines from the `lst` file in my initial commit:
1
2 ; *** Chapter 0 page 0
3
0:0:00 000:28 4 S000 ldx 0
0:0:01 001:21 5 tma
0:0:03 003:71 6 a9aac
0:0:07 007:BE 7 br S03E
0:0:0F 00F:61 8 S00F tcmiy 8
0:0:1F 01F:40 9 tcy 0
0:0:3F 03F:9B 10 br S01B
0:0:3E 03E:60 11 S03E tcmiy 0
The first three colon-delimited values are there to correspond to MAME's representation of addresses in its debugger. Basically, they are in the format Chapter:Page:Offset. Note that the TMS1xxx processors do not use a sequential address scheme; this is why addresses run in the goofy order they do.
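The goofy order comes from the program counter stepping like a 6-bit shift register with feedback rather than incrementing. The sketch below is my own model of that behavior, not code from the repository: the feedback rule (XNOR of the top two bits, with special cases at 0x1F and 0x3F) is an assumption that I have only checked against the offsets shown in the listings in this document.

```python
def next_pc(pc):
    """Advance a 6-bit TMS1xxx-style program counter by one step."""
    bit5 = (pc >> 5) & 1
    bit4 = (pc >> 4) & 1
    feedback = 1 if bit5 == bit4 else 0   # XNOR of the top two bits
    if pc == 0x1F:
        feedback = 1                      # special case so 0x3F is reachable
    elif pc == 0x3F:
        feedback = 0                      # special case so 0x3F is not a dead end
    return ((pc << 1) | feedback) & 0x3F


def page_order(start=0x00, count=64):
    """List the offsets within a page in the order they are executed."""
    pc, order = start, []
    for _ in range(count):
        order.append(pc)
        pc = next_pc(pc)
    return order


print(" ".join(f"{pc:02X}" for pc in page_order(count=8)))
# 00 01 03 07 0F 1F 3F 3E
```

The first eight offsets match the start of Chapter 0 page 0 in the listing above.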
The next pair of colon-delimited values is simply the absolute address followed by the opcode. Following that are the line number and the original assembly source.
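If you want to process the listing with a script, the fields described above can be pulled apart with a regular expression. This is a hedged sketch rather than anything shipped in the repository: the names are hypothetical, the fields are assumed to be hexadecimal, and the chapter multiplier of 0x400 is an assumption (only chapter 0 appears in the excerpts here).

```python
import re

LINE_RE = re.compile(
    r"^(?P<chapter>[0-9A-Fa-f]+):(?P<page>[0-9A-Fa-f]+):(?P<offset>[0-9A-Fa-f]{2})\s+"
    r"(?P<address>[0-9A-Fa-f]{3}):(?P<opcode>[0-9A-Fa-f]{2})\s+"
    r"(?P<line>\d+)\s*(?P<source>.*)$"
)


def absolute_address(chapter, page, offset):
    """Combine Chapter:Page:Offset into an absolute ROM address."""
    return chapter * 0x400 + page * 0x40 + offset


def parse_listing_line(text):
    """Return the fields of a code line, or None for comment/blank lines."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    return {
        "chapter": int(m["chapter"], 16),
        "page": int(m["page"], 16),
        "offset": int(m["offset"], 16),
        "address": int(m["address"], 16),
        "opcode": int(m["opcode"], 16),
        "line": int(m["line"]),
        "source": m["source"],
    }


row = parse_listing_line("0:0:07 007:BE 7 br S03E")
print(row["line"], hex(row["address"]), hex(row["opcode"]), row["source"])
```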
Following the listing is the cross-reference section:
Cross-reference
---------------
LABEL VALUE - (DEF) REF(b = BR, c=CALL)
S000 0x0000 - (4) 33(b) 41(b)
S00A 0x000a - (67) 61(b) 63(b)
S00F 0x000f - (8) 45(b)
S011 0x0011 - (57) 51(b)
S018 0x0018 - (34) 32(b)
S01B 0x001b - (48) 10(b)
S01C 0x001c - (42) 35(b) 40(b)
S026 0x0026 - (62) 56(b)
S033 0x0033 - (25) 22(b)
S03D 0x003d - (17) 66(b) 72(b) 806(b)
S03E 0x003e - (11) 7(b)
L040 0x0040 - (77) 26(c) 1369(c)
As the headings indicate, the first column is the name of the label and the second column is its value (absolute address). The values after the hyphen provide information about the label. The first value, in parentheses, is the line number where the label is defined. The following values are the lines where the label is referenced, each followed by either a "b" if it was used in a branch (`br`) statement or a "c" if it was used in a call (`call`).
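A cross-reference row in that format is easy to pick apart in Python. The sketch below is illustrative only; `parse_xref` is a hypothetical helper, not something `tasm.py` provides.

```python
import re

XREF_RE = re.compile(r"^(\S+)\s+(0x[0-9a-fA-F]+)\s+-\s+\((\d+)\)\s*(.*)$")
REF_RE = re.compile(r"(\d+)\(([bc])\)")


def parse_xref(line):
    """Parse one cross-reference row; return None for headings and rules."""
    m = XREF_RE.match(line.strip())
    if m is None:
        return None
    label, value, defined, refs = m.groups()
    return {
        "label": label,
        "value": int(value, 16),          # absolute address
        "defined": int(defined),          # line number of the definition
        "refs": [(int(n), kind) for n, kind in REF_RE.findall(refs)],
    }


print(parse_xref("S03D 0x003d - (17) 66(b) 72(b) 806(b)"))
```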
One of the major challenges with disassembling TMS1xxx code is due to its use of the chapter buffer and page buffer registers on branches and calls. Judging from the way the Dark Tower code was written and some of the examples I saw in the documentation, the TI TMS1xxx assembler supported both short and long branches/calls. In the case of a long branch/call, the appropriate `ldp` or `ldp`/`tpc`/`ldp` combo was inserted into the code and that information was lost. From a disassembly perspective, an assumption has to be made as to the contents of the page buffer and chapter buffer registers on a given `br` or `call` to decide whether it's long or short.
Now, the assumption really becomes an educated guess based on evidence in the source code. Generally speaking, a long `br` or `call` (the TI assembler used `bl` according to page 2-6 and `calll` according to the sample code on page 14-8 of the TMS1000 Programming Reference Manual) will be immediately preceded by the appropriate page/chapter buffer register loads thanks to the assembler processing the `bl`/`calll`. Additionally, upon returning from a `call`, the page and chapter buffer registers are reset to the current page. Unfortunately, that's not the case with a `br`. Because the page and chapter buffer registers directly impact the target absolute address, it's necessary to manage this situation in the disassembler, so I basically worked under the assumption that the disassembler's internal page and chapter buffer registers should be reset after any `call` or `br`. This worked in 99% of the cases I ran into.
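The working assumption above can be sketched as a small Python function. This is a simplified model and not the code in `dasm.py`: it tracks only the page buffer (chapter 0 is assumed throughout), the tuple layout is hypothetical, and the `ldp 1` / `call` pair in the example is made up for illustration.

```python
def resolve_targets(instructions, current_page):
    """Annotate each br/call with the target the heuristic assigns to it.

    `instructions` is a list of (mnemonic, operand) tuples in execution
    order for one page; the operand is the ldp value or the 6-bit offset.
    """
    page_buffer = current_page
    resolved = []
    for mnemonic, operand in instructions:
        if mnemonic == "ldp":
            page_buffer = operand
        elif mnemonic in ("br", "call"):
            kind = "long" if page_buffer != current_page else "short"
            resolved.append((mnemonic, kind, page_buffer * 0x40 + operand))
            page_buffer = current_page   # the heuristic: reset after every br/call
    return resolved


example = [
    ("ldx", 0),
    ("br", 0x3E),      # no ldp in front: short branch within page 0
    ("ldp", 1),        # hypothetical long call into page 1
    ("call", 0x00),
    ("br", 0x1B),      # page buffer was reset, so this is short again
]
for mnemonic, kind, target in resolve_targets(example, current_page=0):
    print(f"{mnemonic:5}{kind:6}{target:03X}")
```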
Arguably, the `bl`/`calll` could have been integrated back into the sources when their use was implied; this functionality was even already in Paul's disassembler. At the time, though, I wanted the generated disassembly to correspond line-for-line with the object code. To this end, I stripped that code out of the disassembler to ensure I had good output.
I discovered that the above rule was broken in a few instances at the end of some pages, presumably because ROM space is at a premium. My guess is that the programmer(s) needed the extra bytes, so they went in and massaged the code to recover the words that would otherwise be consumed by redundant `ldp` instructions. Here's an example from the listing at the end of Chapter 0, page 6:
0:6:22 1A2:4C 595 S1A2 tcy 3 ; Switch drum to Gold Key/Silver Key/Brass Key
0:6:04 184:11 596 ldp 8
0:6:09 189:D5 597 call ROTDRUM
0:6:13 193:19 598 ldp 9
0:6:26 1A6:CF 599 call LTMIDCL ; Light Silver Key and clear display
600
0:6:0C 18C:29 601 S18C ldx 4
0:6:19 199:47 602 tcy 14
0:6:32 1B2:1E 603 ldp 7
0:6:25 1A5:39 604 tbit1 2 ; Brass Key found during encounter?
0:6:0A 18A:9F 605 br L1DF
606 ;
607 ; Display Brass Key
608 ;
0:6:15 195:2A 609 ldx 2
0:6:2A 1AA:4C 610 tcy 3
0:6:14 194:3A 611 tbit1 1 ; Check inventory for Brass Key
0:6:28 1A8:80 612 br L1C0
0:6:10 190:9F 613 br L1DF
0:6:20 1A0:00 614 mnea
Note the calls on lines 597 and 599 are both preceded by `ldp` instructions as expected. Where I did run into issues, however, was the `br` statements at lines 612 and 613. Note there's an `ldp 7` on line 603. The disassembler assumed that it was paired with line 605 and reset the page buffer register, making the `br` statements at lines 612 and 613 short branches, which was incorrect. This was only revealed as I stepped through the program, working to document it. Making matters worse, this does not impact the machine code put out by the assembler, as a branch is always assembled the same way; it's the contents of the page and chapter buffer registers that control whether it's a long jump or not.
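To make that last point concrete, here is a small sketch that computes where the same branch opcodes land under the two possible page buffer values. It assumes the low six bits of a `br` opcode hold the offset within the page (which agrees with the opcodes and labels in the listing) and that the chapter contributes a multiple of 0x400; the function name is hypothetical.

```python
def branch_target(opcode, page_buffer, chapter=0):
    """Absolute address a br/call opcode reaches for a given page buffer."""
    offset = opcode & 0x3F              # low six bits: offset within the page
    return chapter * 0x400 + page_buffer * 0x40 + offset


# Opcodes of the br instructions on lines 612 and 613 of the listing.
for line, opcode in ((612, 0x80), (613, 0x9F)):
    short = branch_target(opcode, page_buffer=6)   # page buffer reset to page 6
    long_ = branch_target(opcode, page_buffer=7)   # ldp 7 from line 603 still in effect
    print(f"line {line}: opcode {opcode:02X} -> {short:03X} if short, {long_:03X} if long")
```

The long targets, 1C0 and 1DF, are the ones that match the labels L1C0 and L1DF, even though the opcode bytes are identical in both cases.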
Here's what I think happened. This is likely the code that was generated by the assembler when the long branch pseudo-op (`bl`) was used:
S18C ldx 4
tcy 14
tbit1 2 ; Brass Key found during encounter?
ldp 7
br L1DF
;
; Display Brass Key
;
ldx 2
tcy 3
tbit1 1 ; Check inventory for Brass Key
ldp 7
br L1C0
ldp 7
br L1DF
Note there were two extra `ldp 7` instructions, wasting two extra words of ROM space. It appears the programmer went back and patched the code slightly to take out the extra, unnecessary instructions. This may also explain the spurious `mnea` at the end of the page. As soon as I discovered this situation, I went back and reviewed the code at the end of each page and discovered about a half-dozen more instances that have been manually repaired.
Both the disassembler and assembler are purpose-built for use in this project. If there is interest, the following are improvements I'm thinking about:
- Support parameters for input and output files on both the assembler and disassembler
- Support `bl` and `calll` pseudo-ops in the assembler
- Support generating `bl` and `calll` pseudo-ops in the disassembler
- Change the formatting of the output of the disassembler and the input of the assembler to match the Programmer Guide more closely
- Update both programs to support not only the TMS1100/TMS1400 instruction set but also the TMS1000/TMS1200, as was originally the case