Restunts repository - Git mirror

Started by dreadnaut, March 19, 2021, 12:00:36 AM

dreadnaut


dstien

Quote from: Daniel3D on October 27, 2021, 12:37:40 AM
ASMORIG_OBJFILES = $(ASM_OBJDIR)\segments.obj $(ASMORIG_OBJDIR)\seg000.obj
should be ASMORIG of course.
Not that it matters, but this was intentional. segments.asm was handwritten and there was only supposed to be one copy shared by asm and asmorig.

Quote from: Daniel3D on November 07, 2023, 06:59:33 AM
Quote from: llm on November 07, 2023, 06:38:29 AM
dstien isn't the initial creator/svn maintainer of restunts - clvn is - so he never controlled the source
Yeah, I know. But dstien is traceable to an extent outside this community. I know nothing about clvn. So impossible to tell for me if he maintains a copy..
Sadly his batman.no domain has lapsed, but since anders-e.com is still up I'm assuming he's well, just busy with middle-aged life. clvn was a bit of a celebrity in the Norwegian demo scene in the mid-90's. I was surprised to hear him namedropped in a Norwegian podcast about Amiga music a few years back.

Quote from: llm on October 31, 2021, 06:13:42 PM
So dreadnaut, just replace

Quote: A clone of dstien's restunts SVN repository.

with

Quote: A clone of clvn's restunts SVN repository.

dstien got no public SVN repo
Indeed, clvn started and ran this project. He figured out all the bit-trickery that allowed us to rebuild working executables injected with our own code. He did the majority of the analysis and porting to C. I mostly ported low-hanging fruit and cheered on the sideline.

I appreciate the efforts to make the code public, and I have thought about doing it myself from time to time. But I also wanted to do things a bit differently: modernise the toolchain, clean up cruft, ensure ported code is both portable and correct, and get rid of the blobs, particularly the database from the program whose vendor has a history of refusing to sell to people outside the infosec clique. Seeing the spectacular work of @HerrNove on SuperSight inspired me to have another crack at it. I decided to start with a clean slate to avoid said blobs and avoid offending anyone when deleting their code. Whether the new repo should be placed under the 4d-stunts org or be merged with the mainline repo is not for me to decide; it's still an experiment.

The first order of business was to convert the ID* Pr* database to Ghidra. After discovering that Ghidra provides an XML exporter script for ID*, I got my hopes up. Checking the output of the script confirmed that everything we needed was included, so we could just use Ghidra's XML import and this project would be done in an evening or two.

It turned out it would take over a month before I had a working restunts.exe built from Ghidra. First, the export script ignored symbols it deemed automatically generated based on their prefix, so a name like "arg_cheeseburger" would be dropped. It's just a Python script, so it was a quick fix. Next it turned out that our "arg_cheeseburger" wouldn't be imported anyway, as Ghidra's importer ignores function stack frames entirely. I think the reasoning is that Ghidra prefers to trust its own analysis to build the stack frame layout, which it does with its powerful decompiler that can infer arguments and their sizes.

Our problem with that, besides losing argument and local variable names, is that Ghidra's support for working with segmented memory appears to be an afterthought, while ID* was first released in 1991 when 16-bit was still dominant, and its segmented memory handling remains a first-class citizen. There are many related issues filed for Ghidra about this, and many remain unactioned for years. There seems to be a quite natural distance between the NSA's and the retro community's priorities. I found quite a few pull requests for issues I was having that had been rejected. The maintainers seem understanding, but they don't accept duct tape patches, instead wanting to redesign their cores to properly support the features they initially hadn't accounted for. And it appears there's just never time granted for the NSA to accommodate nostalgic gamers.
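To illustrate the prefix problem mentioned above, here is a minimal Python sketch. It is not the actual exporter code: the prefix list, the function names and the "prefix followed by hex digits" rule are all assumptions made for the example. It only shows why a plain prefix test drops a hand-written name like "arg_cheeseburger", and one possible stricter check that keeps it.

```python
# Hypothetical prefixes an exporter might treat as auto-generated names.
AUTO_PREFIXES = ("arg_", "var_", "sub_", "loc_")
HEX_DIGITS = "0123456789abcdefABCDEF"


def is_auto_generated(name, prefixes=AUTO_PREFIXES):
    """Naive check: anything that starts with a known prefix is dropped."""
    return name.startswith(prefixes)


def is_default_name(name, prefixes=AUTO_PREFIXES):
    """Stricter check (an assumption): prefix must be followed by hex digits only."""
    for prefix in prefixes:
        if name.startswith(prefix):
            suffix = name[len(prefix):]
            return suffix != "" and all(c in HEX_DIGITS for c in suffix)
    return False


def exported_names(names, is_auto):
    """Keep only the names that the given predicate does not flag."""
    return [n for n in names if not is_auto(n)]


names = ["arg_4", "arg_cheeseburger", "var_2", "load_track"]
naive = exported_names(names, is_auto_generated)
strict = exported_names(names, is_default_name)
```

With the naive predicate only `load_track` survives; with the stricter one `arg_cheeseburger` is kept as well. A name such as `arg_cafe` would still be dropped by this rule, so it is a heuristic at best.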

At this point I had long since given up on documenting all the Ghidra issues I found, and just did a LOT of manual fixups. The upside to this was that I got a solid refresher on Stunts' code. The next step was to port the IDC script to Ghidra. I chose the Jython approach to interface with Ghidra's Java API. It's archaic Python 2.7 and very, very slow, but it works right out of the box, no compilation needed, and it's thus very hackable. Initially I thought reproducing TASM output like the original script would be the easiest solution. But nothing is easy. While ID*'s native disassembly syntax is more or less MASM/TASM compatible, Ghidra goes its own way. Things that are implicit in Ghidra may have to be explicit in TASM, and vice versa. When I found myself writing my own x86 disassembler in Ghidra Script in order to format memory operands with proper TASM syntax, I realised we might just as well target a contemporary assembler. I first pivoted towards NASM; it's widely (pun intended) available and has the best documentation of any assembler by far. But its syntax is even further out there. I eventually settled on the Watcom Assembler, whose name unfortunately now means something very different. There are several modern WASM forks, but I wanted to use the whole Watcom toolchain anyway, so I settled on the original to avoid further dependency complexity.

I wanted to make patching of the generated assembly code dynamic. This is done by using the tags <REPLACE>, <INSERT>, <NOP> and <DELETE> in comments. See the README for more details. This also proved to be a convenient crutch when my Ghidra knowledge fell short.
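As a rough illustration of the comment-tag idea, here is a small Python sketch. It is not the repo's script, and the tag semantics are guesses for the sake of the example: it assumes a tag sits in a trailing `;` comment, that <DELETE> drops the line, <NOP> swaps it for a single `nop`, <REPLACE> substitutes the text after the tag, and <INSERT> emits that text before the original line. The README remains the authoritative description.

```python
import re

# Hypothetical tag syntax: "; <TAG> optional text" in a trailing comment.
TAG_RE = re.compile(r";\s*<(REPLACE|INSERT|NOP|DELETE)>\s*(.*)$")


def patch_line(line):
    """Return the list of output lines for one generated assembly line."""
    match = TAG_RE.search(line)
    if match is None:
        return [line]  # untagged lines pass through unchanged
    tag, text = match.group(1), match.group(2).strip()
    original = line[:match.start()].rstrip()
    indent = original[:len(original) - len(original.lstrip())]
    if tag == "DELETE":
        return []
    if tag == "NOP":
        return [indent + "nop"]
    if tag == "REPLACE":
        return [indent + text]
    # INSERT: emit the extra line, then keep the original instruction.
    return [indent + text, original]


def patch_listing(lines):
    """Apply patch_line to every line and flatten the result."""
    out = []
    for line in lines:
        out.extend(patch_line(line))
    return out


listing = [
    "    mov ax, 1 ; <REPLACE> mov ax, 2",
    "    int 3 ; <DELETE>",
    "    call foo ; <NOP>",
    "    ret ; <INSERT> pop bp",
    "    push bp ; plain comment",
]
patched = patch_listing(listing)
```

A real implementation would have to care about instruction sizes (a single `nop` does not cover a multi-byte instruction), which this sketch ignores.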


Another change is that only one set of asm files is created, instead using a build-time toggle to determine whether to wire up ported or keep original functions.

Something we wasted a lot of time on back in the day was trying to use unofficial ID* database synchronisation tools so that we could work on it simultaneously. Ghidra comes with its own server, which I have set up on re.stunts.no. Anonymous read-only connections are accepted. For those who want to contribute, send me a PM with a username and I'll add you.

I've gone back to the trusted old wlink. This sacrifices support for Turbo Debugger, but Watcom Debugger also seems very capable:

wlink can also produce DWARF and CodeView debug data, so it may open up for even more debuggers?

For the C code I wanted to carefully pull in one function at a time from the original restunts code, making sure everything is portable and correct. So far I've only added the good old sin_fast() and cos_fast(), and made unit tests for these. The GitHub repo has a CI task for building and testing, and test output can be inspected there. I haven't yet made up my mind how we'd go about testing functions with side effects. I don't know how realistic it is, but I want to see how far we can take it by only using Stunts' clib in seg010 and not linking in any Watcom libraries as long as we rely on linking with the original code.
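For readers unfamiliar with this kind of function, here is a Python sketch of a table-driven fixed-point sine with a small accuracy check, in the spirit of the unit tests mentioned above. It is not the restunts C code: the 1024-step circle, the 0x4000 scale for 1.0 and the quarter-wave table layout are assumptions chosen for the example, and the real sin_fast() may differ.

```python
import math

STEPS = 1024          # assumed angle units per full circle
QUARTER = STEPS // 4  # table steps per quarter wave
SCALE = 0x4000        # assumed fixed-point value representing 1.0

# Quarter-wave table with QUARTER + 1 entries, so index QUARTER is valid.
SIN_TABLE = [int(round(math.sin(i * math.pi / 2 / QUARTER) * SCALE))
             for i in range(QUARTER + 1)]


def sin_fast(angle):
    """Fixed-point sine using quarter-wave symmetry."""
    angle &= STEPS - 1
    quadrant, index = divmod(angle, QUARTER)
    if quadrant == 0:
        return SIN_TABLE[index]
    if quadrant == 1:
        return SIN_TABLE[QUARTER - index]
    if quadrant == 2:
        return -SIN_TABLE[index]
    return -SIN_TABLE[QUARTER - index]


def cos_fast(angle):
    """Cosine is sine shifted by a quarter turn."""
    return sin_fast(angle + QUARTER)


def max_error():
    """Largest deviation from math.sin over a full circle, in fixed-point units."""
    worst = 0.0
    for a in range(STEPS):
        exact = math.sin(a * 2 * math.pi / STEPS) * SCALE
        worst = max(worst, abs(sin_fast(a) - exact))
    return worst
```

Functions like these are easy to test because they are pure: the same input always gives the same output and no global state is touched, which is exactly what the functions with side effects lack.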

I've only tested building on Linux. Adding Windows support again will probably take some work, and I don't know if it's worth it as it probably works out of the box in WSL. There's also no proper dependency tracking for incremental building yet. Compilation is so fast that I've just been doing rm -drf build && make so far, which I hear is in vogue with the handmade crowd these days anyway.

Quote from: llm on October 28, 2021, 09:25:30 AM
it would have been very easy if it was originaly developed for 32bit systems - like many other DOS games - then we could only replace the hardware access stuff and it would then run on windows/linux and we could have used todays development tools - but its just way too old :(
There are two 32-bit builds of Stunts: the FM Towns and FM Towns Marty versions are built for i386, using the Phar Lap 386|DOS-Extender. While FM Towns isn't DOS compatible, and these versions aren't entirely bug-compatible with the DOS versions, there's a lot of shared code that is far more pleasant to explore without segmented memory. This is where the 32-bit anecdote was supposed to end, but I wanted to add these executables to the Ghidra server repo, and since Ghidra doesn't have a loader for Phar Lap payloads, I had a look at the files to see how much effort it would take:

That's some peculiar structured information after the DATA section... I can't believe it took 32 years for us to find out that WE HAD DEBUG SYMBOLS FOR STUNTS ALL ALONG! It's only public symbols, no types or stack variables, but still. It's over 2000 symbols. Since the porters appear to have kept many of the original DOS functions as stubs, we even have these names.

Here's a function in the FM Towns Marty port and DOS side-by-side:

This is a striking example showing how spot-on some of the original analysis is, how unknown data now has clear names, and how the FM Towns code deviates from the DOS code when dealing with IO. When the FM Towns code has generated labels (_DAT_addr) it's usually because it is a struct member offset, but we only know the name of the root value.
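The point about generated labels usually being struct member offsets can be sketched in Python: given a sorted list of public symbols, an unnamed address can be described as the nearest preceding symbol plus an offset. The symbol names and addresses below are made up for illustration, and since public symbols carry no sizes, the result is only a guess at which root value an address belongs to.

```python
import bisect


def describe_address(addr, symbols):
    """Describe addr as 'name' or 'name+0xN' using the nearest preceding symbol.

    symbols is a list of (address, name) tuples sorted by address.
    """
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        # No symbol at or before this address; fall back to a generated label.
        return "DAT_%08x" % addr
    base, name = symbols[i]
    offset = addr - base
    return name if offset == 0 else "%s+0x%x" % (name, offset)


# Hypothetical public symbols, not taken from the actual executables.
symbols = [(0x1000, "gameconfig"), (0x1020, "trackdata")]
```

For example, `describe_address(0x1004, symbols)` yields `gameconfig+0x4`, which hints that the address may be a member four bytes into whatever structure `gameconfig` names.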

I added both FM Towns executables to the Ghidra repo with all debug symbols loaded, no further analysis. I discovered this just now, so I haven't explored the symbols in depth yet. I'm not sure if we should adopt all the original names, as some are quitehardtoread compared to our snake_case_notation.

More details on how to connect to the Ghidra server, use the Ghidra script and build are in the repo readme: https://github.com/dstien/restunts2
The problems I've already encountered myself are covered in the troubleshooting section.

Daniel3D

 :o I'm going to need some time to wrap my head around this... Christmas is early this year it seems  ::)
Edison once said,
"I have not failed 10,000 times,
I've successfully found 10,000 ways that will not work."
---------
Currently running over 20 separate instances of Stunts
---------
Check out the STUNTS resources on my Mega (globe icon)

Matei

As I wrote before:

https://forum.stunts.hu/index.php?topic=4257.msg92963#msg92963

Quote: if, by any chance, the Chinese government will want it done, I can assure you that it will be done.

Besides, everything that can be obtained was already. Maybe some more motorbikes would work...

Duplode

This is fantastic work @dstien , thank you so much! Can't wait to try it out! (Yes, I've been repeating this since the beginning of the year about e.g. @HerrNove 's improvements, but hopefully I'll stop getting sidetracked anytime soon! 😅) No longer needing the IDA-centric workflows and having the 32-bit symbols as a reference should be a huge boon to analysis.