Upgraded GDA syntax #2

Open
wants to merge 17 commits into
base: main


ilonachan
Contributor

@ilonachan ilonachan commented Aug 30, 2024

So as I mentioned, my intention was to create a human-readable, script-like format in which each command in a GDS script is listed with a human-readable name (and possibly named parameters), and which can be decompiled from and recompiled to GDS. Turns out that's exactly what GDA was. I'm not sure why the code implies this format is old or legacy stuff or something, but it seems very useful, so this PR beefs it up a bit. If there's a defined GDA standard out there I couldn't find it, so my modifications probably wouldn't follow it.

Here's what I did:

  • Added a command to decompile a GDS file into a GDA script, and renamed the other gds commands and changed their help strings (specifically, JSON isn't considered the main format anymore)
  • Rather than just a hex number or two specific command names, any recognized command name (listed in data/commands.json) can be used, along with cmd_{decimal number} (TODO: maybe prefer the latter in decompilations, or let the user choose). Unrecognized command names in user-written GDA scripts can't be mapped to numerical command codes, and thus still throw an error. It's possible to map different names to the same code using an alias property, for example to update a name in later versions, or if a code is just used for multiple purposes (support for this in decompilation is TODO; it could e.g. show possible alternative command meanings in a comment above the line)
  • Can read and write strings with Python escape sequences
  • Should allow more whitespace (TODO: comments at the end of lines)
  • Allows specifying the other "supported" unknown GDS parameter types using !5(0xABCD) (the parenthesized payload is only present where the type supports one). This syntax was chosen instead of a keyword so that parameter names can later be documented and inserted as named parameters for the user's information.
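
To make the name lookup and the `!5(0xABCD)` syntax from the list above concrete, here is a minimal Python sketch. It is not the code from this PR: the `COMMANDS` table, the shape of the `alias` property (a list of extra names), and the helpers `build_lookup`, `resolve_command` and `parse_typed_param` are all assumptions made for illustration, and the command IDs are made up.

```python
import re

# Hypothetical stand-in for data/commands.json; names, IDs and the
# list-shaped "alias" property are assumptions, not the real file.
COMMANDS = {
    "show_text": {"id": 0x12, "alias": ["say"]},
    "wait": {"id": 0x31},
}


def build_lookup(commands):
    """Flatten primary names and aliases into one name -> code table."""
    lookup = {}
    for name, entry in commands.items():
        for known_name in [name, *entry.get("alias", [])]:
            lookup[known_name] = entry["id"]
    return lookup


LOOKUP = build_lookup(COMMANDS)
CMD_NUMBER = re.compile(r"cmd_(\d+)")
TYPED_PARAM = re.compile(r"!(\d+)(?:\(([^()]*)\))?")


def resolve_command(name):
    """Map a GDA command name to its numeric code, or raise ValueError."""
    match = CMD_NUMBER.fullmatch(name)
    if match:
        # cmd_{decimal number} always works, even for undocumented codes.
        return int(match.group(1))
    if name in LOOKUP:
        return LOOKUP[name]
    raise ValueError(f"unrecognized command: {name!r}")


def parse_typed_param(token):
    """Split '!5(0xABCD)' into (type_id, payload); payload is None if absent."""
    match = TYPED_PARAM.fullmatch(token)
    if match is None:
        raise ValueError(f"not a typed parameter: {token!r}")
    type_id = int(match.group(1))
    payload_text = match.group(2)
    # int(text, 0) accepts decimal as well as 0x-prefixed hex payloads.
    payload = int(payload_text, 0) if payload_text is not None else None
    return type_id, payload
```

With this table, `resolve_command("say")` and `resolve_command("show_text")` give the same code, while an unknown name raises an error, which matches the behaviour described for user-written scripts.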

In my opinion the ultimate goals of this component should be, in this order:

  1. Any GDS file that's decompiled and recompiled should always result in the same binary code.
  2. All command IDs (or as many as we can figure out) should be documented, including how many parameters of what types they expect, and ideally their meaning.
  3. Possibly support a workflow where existing GDA files can get updated with new and better command names, otherwise preserving their structure.
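
Goal 1 can be phrased as a property that is easy to check in a test. The sketch below uses a toy binary layout (a u16 command ID, a u8 parameter count, then u32 parameters, all little-endian) that is purely an assumption for illustration and is not the real GDS format; `decompile`, `compile_script` and `roundtrips` are hypothetical names.

```python
import struct


def decompile(blob):
    """Toy decoder: returns a list of (command_id, [params]) tuples."""
    script, pos = [], 0
    while pos < len(blob):
        command_id, count = struct.unpack_from("<HB", blob, pos)
        pos += 3
        params = [struct.unpack_from("<I", blob, pos + 4 * i)[0] for i in range(count)]
        pos += 4 * count
        script.append((command_id, params))
    return script


def compile_script(script):
    """Toy encoder: the exact inverse of decompile() for this layout."""
    blob = bytearray()
    for command_id, params in script:
        blob += struct.pack("<HB", command_id, len(params))
        for param in params:
            blob += struct.pack("<I", param)
    return bytes(blob)


def roundtrips(blob):
    """True if decompiling and recompiling reproduces the same bytes."""
    return compile_script(decompile(blob)) == blob


sample = compile_script([(1, [2, 3]), (0x12, [])])
print(roundtrips(sample))
```

In a real test suite the same check would presumably be run over every script extracted from the game, so that any lossy step in the decompiler shows up as a byte mismatch.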

To this end I've also extended the format used by commands.json to include command & params documentation (potentially usable by automated tools, or a GUI!) and added a few commands that I could figure out at a glance. This could definitely be further improved:

  • Maybe using YAML could reduce clutter from braces and obligatory quote marks
  • Many commands are only relevant in certain contexts, and it could be helpful for readability to group these together
  • Formally denote that a command modifies a previous command's invocation?
  • What I added was just some minigames and specific puzzle engine commands. A lot more is needed for a full understanding of all puzzles, and especially room and event scripts.

@ilonachan ilonachan force-pushed the gds branch 3 times, most recently from c047f93 to ccd6749 Compare September 1, 2024 20:28
Owner

@patataofcourse patataofcourse left a comment


just a couple comment things I noticed that should be done

formats/gds.py (outdated, resolved)
formats/gds.py (outdated, resolved)
Contributor Author

@ilonachan ilonachan left a comment


aight done! Sorry it took so long, I have a job now (^o^)/

@ilonachan
Contributor Author

Okay, so I've worked a bit on this. I'm joking, I did A LOT. Basically, through my decompilation efforts (which I sadly don't have a good non-DMCA way to share yet) I've learned a lot about GDS, including that it has control flow instructions and an IEEE-754 32-bit float datatype, plus a bit of other stuff I'll need to look into more. (The NDS processor doesn't even support floats in hardware! All the operations are implemented in software. And for what? For most of the floats being integers anyway. I'll take "things that solve my problem easier" for $100: "what are fixed-point floats?")
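
For readers unfamiliar with the float datatype mentioned above, this small sketch decodes a 4-byte IEEE-754 value and checks whether it is really just a whole number. The little-endian byte order is an assumption (it matches the NDS CPUs, but the PR does not state how GDS stores the bytes), and `decode_f32` and `is_whole` are hypothetical helper names.

```python
import struct


def decode_f32(raw):
    """Decode 4 bytes as a little-endian IEEE-754 single-precision float."""
    return struct.unpack("<f", raw)[0]


def is_whole(raw):
    """True if the encoded float carries no fractional part."""
    return decode_f32(raw).is_integer()


print(decode_f32(bytes.fromhex("0000803f")))  # bit pattern 0x3F800000
print(is_whole(bytes.fromhex("0000c03f")))    # bit pattern 0x3FC00000
```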

For now, based on these discoveries, which I've tried to collect in two writeups, I've designed and implemented GDA as a DSL for decompiling GDS scripts. The decompiler even has a few bells and whistles: for example, it can add comments with helpful context information to the resulting script. The only place I've implemented this for now is textbox scripts, which display the English text corresponding to their event-text ID, but adding this to more commands is just a matter of declaring it in the command YAML files.

Oh yeah, speaking of. The old commands.json would have simply been too cluttered, so I made my own system where the commands are listed in YAML files structured in a data directory. The details of this are definitely open to change, for example the versioning of command names. And I need to do a pass of checking how many bytes of each integer/string are actually read in the game, and copy that over too. But at least I know for certain that every command is in there in some form, and every command that's ever actually used has the correct parameter count and types listed.

Obviously this is a REALLY big one, and I wouldn't be surprised if the review takes longer than the writing. I'm 100% open for questions, mostly asynchronously though (bc job and stuff, and maybe timezones idk)

ilonachan and others added 13 commits October 3, 2024 21:01
… can detect input file type automatically. Added yaml export type, and fixed a few bugs.
TODO: improve the parser to handle jumps and address labels
TODO: create the GDA parsing and writing logic
… in the original game scripts, it perfectly decompiles all scripts and reassembles them into the same binaries.
… scripts in a nice format, with indentations too!
…unded to the lowest number of decimal points that still produces identical byte data.
… GDS files into identical binaries (except patches)
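
One of the commit messages above mentions floats being rounded to the lowest number of decimal places that still produces identical byte data. A minimal sketch of that idea, assuming little-endian float32 storage and a hypothetical helper name `shortest_f32_literal` (the PR's actual implementation may differ), could look like this:

```python
import struct


def shortest_f32_literal(raw):
    """Return the shortest fixed-notation text that re-encodes to `raw`.

    Tries 1, 2, 3, ... decimal places and stops at the first candidate
    whose float32 encoding is byte-identical to the input.
    """
    value = struct.unpack("<f", raw)[0]
    # Starting at one decimal place keeps the literal visibly a float
    # ("1.0" rather than "1"); that choice is an assumption of this sketch.
    for digits in range(1, 18):
        text = f"{value:.{digits}f}"
        if struct.pack("<f", float(text)) == raw:
            return text
    # Very small magnitudes are not captured by fixed notation with this
    # many places, so fall back to Python's own repr of the value.
    return repr(value)


print(shortest_f32_literal(struct.pack("<f", 0.1)))
```

The point of the byte comparison is that the printed literal never has to look like the exact float32 value (0.10000000149011612 for the example above); it only has to compile back to the same four bytes, which keeps decompiled scripts readable without breaking the round trip.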
2 participants