Messages - llm

#31
Quote from: Daniel3D on October 14, 2022, 12:42:16 PMAnd it does not matter where Func0 is. If the new code is before the location that Func0 is looking for it fails.

yes, 100% correct - though it's not called "fails" but "undefined behavior"
it's not clear what happens when the value gets read from the wrong offset - nearly everything is possible, like a random problem generator: it could be that there is always a 0 there and the correct code always wanted 0, or there is an ever-changing value that most of the time is in a range the function can work with, producing no visual or audio glitches, maybe just some strange physics behavior when driving in a particular way
#32
Quote from: Daniel3D on October 14, 2022, 12:16:40 PM
Quote from: llm on October 14, 2022, 09:10:47 AMwhat happens if offsets are not symbolic?

0x3440 func0
0x3440  mov ax,0x3456
0x3442  call XYZ
0x3448
0x3450 func1
0x3451   some code <-- the above non-symbolic offset will get wrong if you add/remove code here
0x3452 added code
0x3453 added code
0x3454 added code
0x3455
0x3456: Something entirely different (not: dw some_value 234)
0x3457
0x3458
0x3459: dw some_value 234
Like this?
Then func0 fails. I know. That is why getting rid of them is important.

yes, 100% correct - but "fails" isn't defined here - it could be that the algorithm still works because it's just not that sensitive to the value, or there is an identical or nearly identical value at the target offset
think of values like 0, 255, -1 or similar - they are very common, so it "could" still work
#33
again, for your daily training :)

what happens if offsets are not symbolic?

0x3440 func0
0x3440  mov ax,0x3456
0x3442  call XYZ
0x3448
0x3450 func1
0x3451   some code <-- the above non-symbolic offset will become wrong if you add/remove code here
0x3452
0x3453
0x3454
0x3455
0x3456: some_value dw 234

func0
  mov ax,offset some_value
  call XYZ

func1
  some code <-- the above symbolic offset will not become wrong if you add/remove code here

some_value dw 234
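
To make the difference concrete, here is a small Python sketch (not a real assembler; the byte sizes of func0 and func1 are invented for illustration). It lays out func0, func1 and some_value, then adds 3 bytes to func1. The hard-coded 0x3456 keeps pointing at the old address, while a lookup by label follows the data.

```python
# Toy layout model: each item is (label, size_in_bytes). Sizes are invented.
BASE = 0x3440

def layout(items, base=BASE):
    """Return {label: address} by adding up the sizes of everything before it."""
    addresses = {}
    addr = base
    for label, size in items:
        addresses[label] = addr
        addr += size
    return addresses

original = [("func0", 16), ("func1", 6), ("some_value", 2)]
# Same program, but 3 bytes of code were added to func1.
modified = [("func0", 16), ("func1", 9), ("some_value", 2)]

HARD_CODED = 0x3456  # what "mov ax,0x3456" keeps using, no matter what moves

before = layout(original)
after = layout(modified)

print(hex(before["some_value"]))          # address of the data before the edit
print(hex(after["some_value"]))           # address of the data after the edit
print(HARD_CODED == after["some_value"])  # is the numeric offset still right?
```

A symbolic "offset some_value" behaves like the dictionary lookup: the assembler recomputes it on every build.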
#34
nice findings - i will have a look and check which of these are real offsets - but i think at least 50% are very likely offsets
#35
Quote from: Daniel3D on October 11, 2022, 01:03:15 PMThis should be one to.
Lucky find on my phone..

no, that's the DOS API (int 21h, function ah=4Ch = exit program, with return code al=0FFh, i.e. -1)
http://www.osfree.org/doku/en:docs:dos:api:int21:4c

could be written as

mov ah,4Ch
mov al,0FFh ; -1
int 21h

or

mov ax,4CFFh
int 21h

you always need to analyse the surrounding context a little - everything in assembler is more or less global and typeless (pointer, value, ... everything is possible)

the C port of that is

exit(-1);
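
A quick way to check such a combined value yourself: a small Python sketch (the helper names are made up) that splits the 16-bit AX value into AH and AL and shows AL as a signed byte.

```python
def split_ax(ax):
    """Split a 16-bit AX value into its high byte (AH) and low byte (AL)."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def as_signed8(value):
    """Interpret an 8-bit value as a signed (two's complement) number."""
    return value - 0x100 if value & 0x80 else value

ah, al = split_ax(0x4CFF)
print(hex(ah))         # DOS function number
print(hex(al))         # return code as raw byte
print(as_signed8(al))  # the same byte read as signed
```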
#36
Quote from: Daniel3D on October 11, 2022, 12:38:31 PMI'm not starting with ida.

would be the easiest - but IDA is commercial and costs ~400$ in the home edition

i would love to go back to IDA Freeware 5 (the only free version that still supports DOS)
official download available on the ScummVM homepage: https://www.scummvm.org/news/20180331/

but upgrading the IDA database (idb) is a one-way ticket - and i'm currently working with 6.8

but you should install the freeware - it gives you a good idea of how it all works, even if it is not the latest IDA - most reversing projects use this freeware version (or Ghidra - which is sometimes problematic with segment/offset support)
#37
just to give you a feeling what the code does in one of your examples:

seg000:053A _ask_dos:                              ; CODE XREF: stuntsmain+43D↑j
seg000:053A  sub    ax, ax
seg000:053C  push    ax  ; show_dialog param 9
seg000:053D  push    ax  ; show_dialog param 8
seg000:053E  push    dialogarg2  ; show_dialog param 7
seg000:0542  mov    ax, 0FFFFh
seg000:0545  push    ax  ; show_dialog param 6
seg000:0546  push    ax  ; show_dialog param 5

seg000:0547      mov    ax, offset aDos ; "dos"
seg000:054A      push    ax ; locate_text_res param 3
seg000:054B      push    word ptr mainresptr+2 ; locate_text_res param 2
seg000:054F      push    word ptr mainresptr ; locate_text_res param 1
seg000:0553      call    locate_text_res
seg000:0558      add    sp, 6 ; 6 bytes removed from the stack (due to the previous 3 pushes of 2 bytes each)

seg000:055B  push    dx  ; show_dialog param 4
seg000:055C  push    ax  ; show_dialog param 3
seg000:055D  mov    ax, 1
seg000:0560  push    ax  ; show_dialog param 2
seg000:0561  mov    ax, 2
seg000:0564  push    ax  ; show_dialog param 1
seg000:0565  call    show_dialog
seg000:056A  add    sp, 12h ; 12h = 18 bytes removed from the stack (due to the previous 9 pushes)

this is the C-port of that asm-code

  locate_text_res(mainresptr.offset, mainresptr.segment, "dos"); // sets dx and ax (could be a ptr)
  show_dialog(2, 1, ax, dx, -1, -1, dialogarg2, 0, 0);
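
The relation between the C call and the pushes can be sketched in Python. This is only an illustration and assumes what the listing above shows: every argument is one 16-bit word, arguments are pushed last-to-first, and the caller cleans up with "add sp, N". The helper names are made up.

```python
WORD_SIZE = 2  # bytes per push in 16-bit code

def pushes_for_call(args):
    """Arguments as written in C, returned in the order they get pushed (last first)."""
    return list(reversed(args))

def cleanup_bytes(args):
    """Bytes the caller removes afterwards with 'add sp, N'."""
    return len(args) * WORD_SIZE

locate_args = ["mainresptr.offset", "mainresptr.segment", '"dos"']
dialog_args = ["2", "1", "ax", "dx", "-1", "-1", "dialogarg2", "0", "0"]

print(pushes_for_call(locate_args))       # push order for locate_text_res
print(cleanup_bytes(locate_args))         # value used in 'add sp, 6'
print(hex(cleanup_bytes(dialog_args)))    # value used in 'add sp, 12h'
```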
#38
Quote from: Daniel3D on October 11, 2022, 10:51:24 AMThis is the last for now. Enough to test if I am finding them correctly .. And to see if it is useful..
(I will make more compact logs of others I find when useful to continue. I did it this way, so you can easily see if I make obvious mistakes)
seg000 Line 1795:     mov     ax, 0FFFEh 
    call    shape3d_load_all
    mov     ax, 0C8h ; 'È'
    push    ax
    mov     ax, 140h
    push    ax
    mov     ax, 28h ; '('
    push    ax
    push    ax
    call    set_projection
    add     sp, 8
    mov     ax, 0FFFEh
    push    ax
    call    init_game_state
    add     sp, 2
    call    sprite_copy_wnd_to_1
    push    skybox_grd_color
    call    sprite_clear_1_color
if this is one there are 3 other hits on "ax, 0FFFEh"


0FFFEh does not look like a valid offset - it's just too big, and 0FFFEh as a signed value is -2 - so it's maybe some sort
of parameter, or really the value 65534

you need to understand hex/dec, signed/unsigned and type sizes very well to get a "feeling" for what that number could be - combined with knowledge about the called functions
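
If you don't want to do the conversion in your head, a few lines of Python show both readings of a 16-bit value (the helper name is made up; the sample values are taken from the listings in these posts).

```python
def as_signed16(value):
    """Interpret a 16-bit value as a signed (two's complement) number."""
    return value - 0x10000 if value & 0x8000 else value

for raw in (0xFFFF, 0xFFFE, 0xAC74, 0x0140):
    print(f"{raw:04X}h  unsigned={raw}  signed={as_signed16(raw)}")
```

Values that come out as small negative numbers (-1, -2) are more plausible as parameters than as offsets.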
#39
Quote from: Daniel3D on October 11, 2022, 10:47:13 AMseg000 Line 1060:     mov     ax, 0AC74h 
    mov     ax, offset aGsta; "gsta"
    push    ax
    push    [bp+var_38]
    push    [bp+var_3A]
    call    locate_shape_alt
    add     sp, 6
    push    dx
    push    ax
    mov     ax, 0AC74h
    push    ax
    call    copy_string
    add     sp, 6
    push    word_407D6
    push    word_407D4
    mov     ax, 4Ch ; 'L'
if this is one there are 13 other hits on "ax, 0AC74h"


0AC74h is very likely an offset into the data segment, to some string or something similar
you need to analyse copy_string - in IDA you would annotate the parameters of copy_string so IDA can infer more from that
#40
Quote from: Daniel3D on October 11, 2022, 10:42:57 AMFirst line of interest..  8)
seg000 Line  607:     mov     ax, 0FFFFh 
_ask_dos:
    sub     ax, ax
    push    ax
    push    ax
    push    dialogarg2
    mov     ax, 0FFFFh
    push    ax
    push    ax
    mov     ax, offset aDos ; "dos"
    push    ax
if this is one there are 85 other hits on "ax, 0FFFFh"

that is very likely just -1; in assembler everything is shown as unsigned, but that does not
mean that a value IS unsigned, and -1 is not very likely an offset :)

see online-conversion:
https://cryptii.com/pipes/integer-converter
https://imgur.com/BiCqyoI
#41
Quote from: Daniel3D on October 11, 2022, 08:15:07 AMI only intend to catalogue them.

this regex finds most of the magic numbers that could be offsets - and only global offsets are relevant

(\,|\-|\+)\s*((0[a-fA-F0-9]*|[1-9][a-fA-F0-9]*)h|[0-9])

i'm using that with Notepad++ (but other editors with regex support should also work),
searching all asmorig asm files

removing all "add or sub sp,VALUE" lines and the defines reduces the list to ~13,000 hits, but most of the findings
are value assignments or something like that

as usual - a huge mess of assembler code :(
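
The same search can be scripted. A minimal Python sketch: the first regex is the one from this post, unchanged; the second one, which drops the "add/sub sp" stack adjustments, is my own assumption of how to filter them, and the sample lines are taken from the listings in this thread.

```python
import re

# The regex from the post, unchanged.
MAGIC = re.compile(r"(\,|\-|\+)\s*((0[a-fA-F0-9]*|[1-9][a-fA-F0-9]*)h|[0-9])")
# Assumed filter for the "add or sub sp,VALUE" stack adjustments.
STACK_ADJUST = re.compile(r"^\s*(add|sub)\s+sp\s*,", re.IGNORECASE)

def candidate_lines(lines):
    """Keep lines with a numeric operand that are not plain stack cleanups."""
    return [line for line in lines
            if MAGIC.search(line) and not STACK_ADJUST.search(line)]

sample = [
    "    mov     ax, 0AC74h",
    "    push    ax",
    "    add     sp, 6",
    "    mov     ax, offset aGsta",
    "    mov     ax, 4Ch ; 'L'",
]

for line in candidate_lines(sample):
    print(line.strip())
```

Every remaining hit still has to be checked by hand; the regex only narrows the list.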
#42
Quote from: Daniel3D on October 10, 2022, 07:22:24 PMThere will be false positives.

if you change a non-symbolic offset to a symbolic one and compare the exe before/after, no bit should have changed - the guess could then still be wrong, but it cannot harm the gameplay because the exe is unchanged; doing such changes without comparing before/after is like playing roulette to win bugs without any need
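
The before/after check is easy to automate. A minimal Python sketch (the function names are made up, and the file paths you pass in are whatever your build produces):

```python
def first_difference(a, b):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One file is a prefix of the other: they differ where the shorter ends.
        return min(len(a), len(b))
    return None

def compare_files(path_before, path_after):
    """Read both executables and report where they first differ (None = identical)."""
    with open(path_before, "rb") as f:
        before = f.read()
    with open(path_after, "rb") as f:
        after = f.read()
    return first_difference(before, after)
```

If compare_files returns None after a symbolic-offset change, the build is byte-identical; any number it returns is the file offset to start looking at.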
#43
for example

mov    di, 55CAh
written as

mov    di, offset word_40D3A
should produce the very same executable (binary identical)

from IDA-Editor:
dseg:55C8                 db 0FFh
dseg:55C9                 db    0
dseg:55CA word_40D3A      dw 0                    ; DATA XREF: end_hiscore+638↑w
dseg:55CA                                         ; end_hiscore+656↑r start+6A↑o
dseg:55CC word_40D3C      dw 0                    ; DATA XREF: end_hiscore+63E↑w
dseg:55CC                                         ; end_hiscore+6C1↑r
dseg:55CE word_40D3E      dw 0                    ; DATA XREF: end_hiscore+644↑w

more or less easy in IDA Pro - but first you need to know that this really is an offset value
and which segment the offset targets - in this case you can see it by looking at the code above:
it seems to be dseg - so it's an offset to a variable in the data segment (some copy/init operation is done)

IDA always shows the binary information (offsets, opcodes) in parallel to the disassembly: https://imgur.com/fsUvtVI
that's the primary reason for using a professional tool for reverse engineering, and also the reason for using an IDA script
to produce the asm code - any finding can result in multiple changes across the asm files - for example, you find a common type
and start using it in IDA - IDA will use that information to extend other parts of the disassembly, resolving more and more
that is not easy with dead-end assembler code - and it is a huge part of the reversing process

IDA Pro is not an assembler editor (you can't change anything in the assembler code), it is just a tool to help with reverse engineering - so
cross references, graphs, deep analysis etc.; you can add types and structs and annotate the found functions, giving IDA more info
on how to disassemble stuff it didn't understand by itself
#44
Quote from: Daniel3D on October 10, 2022, 01:43:11 PMSo to make a symbolic offset out of it you must first find the correct byte offset and locate it in the assembly code?

that's why people use IDA or Ghidra for reversing - they keep the assembler source view and the binary code in sync - so you can more easily see what an offset could target

Quote from: Daniel3D on October 10, 2022, 01:43:11 PMLooking at your example i guess it is not very difficult for you. But i understand why they are not all done.

it can be difficult because sometimes offsets are calculated over several lines of assembler code,
which could just as well be some sort of 3d point calculation - it's not always easy to tell the difference

Quote from: Daniel3D on October 10, 2022, 01:43:11 PMIf i find more (i now have an idea of what they look like) and they are not commented as such I'll make a note of it.

great
#45
Quote from: Daniel3D on October 10, 2022, 12:55:19 PMAssuming that the hex value is a number that corresponds to the line it seems to be a little off.
But i guess that the start of the file should be excluded from the line count as that is compiler info.

the hex value does NOT correspond to a line number, NEVER - the hex value is a byte offset from the image start (behind the exe header); it depends on the size of the code that sits before it, and every asm instruction has a different size in binary

for example: this is an assembler routine (not from stunts) - on the left is the binary offset, then the binary code and the corresponding asm source

IDA-Offset  | Binary code | Assembler source
            |             |
seg000:BDF4 |             |  sub_1BDF4  proc near
seg000:BDF4 |             |
seg000:BDF4 | 06          |            push    es
seg000:BDF5 | 1E          |            push    ds
seg000:BDF6 | 56          |            push    si
seg000:BDF7 | 57          |            push    di
seg000:BDF8 | 8D 36 76 BC |            lea     si, ds:0BC76h
seg000:BDFC | B9 06 00    |            mov     cx, 6
seg000:BDFF |             |
seg000:BDFF |             |  loc_1BDFF:
seg000:BDFF | 83 C6 04    |            add     si, 4
seg000:BE02 | 2E 8B 04    |            mov     ax, cs:[si]
seg000:BE05 | 2E 0B 44 02 |            or      ax, cs:[si+2]
seg000:BE09 | 74 02       |            jz      short loc_1BE0D
seg000:BE0B | E2 F2       |            loop    loc_1BDFF
seg000:BE0D |             |
seg000:BE0D |             |  loc_1BE0D:
seg000:BE0D | 2E 89 1C    |            mov     cs:[si], bx
seg000:BE10 | 2E 89 7C 02 |            mov     cs:[si+2], di
seg000:BE14 | 5F          |            pop     di
seg000:BE15 | 5E          |            pop     si
seg000:BE16 | 1F          |            pop     ds
seg000:BE17 | 07          |            pop     es
seg000:BE18 | C3          |            retn
seg000:BE18 |             |  sub_1BDF4  endp
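
You can reproduce the offset column yourself: each offset is just the start address plus the byte sizes of everything before it. A small Python sketch using the byte column from the listing above (the function name is made up):

```python
START = 0xBDF4  # offset of the first instruction in the listing

# The "Binary code" column, one entry per instruction.
ENCODED = [
    "06", "1E", "56", "57",
    "8D 36 76 BC", "B9 06 00",
    "83 C6 04", "2E 8B 04", "2E 0B 44 02",
    "74 02", "E2 F2",
    "2E 89 1C", "2E 89 7C 02",
    "5F", "5E", "1F", "07", "C3",
]

def instruction_offsets(encoded, start):
    """Return the offset of every instruction by summing the preceding sizes."""
    offsets = []
    addr = start
    for hex_bytes in encoded:
        offsets.append(addr)
        addr += len(bytes.fromhex(hex_bytes))
    return offsets

for offset, hex_bytes in zip(instruction_offsets(ENCODED, START), ENCODED):
    print(f"{offset:04X}  {hex_bytes}")
```

Add or remove a single byte anywhere and every offset after that point shifts, which is exactly why the hex values have nothing to do with line numbers.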

Shellstorm disassembly of the same Binary code without symbolic offsets

0x0000000000000000:  06            push es
0x0000000000000001:  1E            push ds
0x0000000000000002:  56            push si
0x0000000000000003:  57            push di
0x0000000000000004:  8D 36 76 BC    lea  si, [0xbc76]
0x0000000000000008:  B9 06 00      mov  cx, 6
0x000000000000000b:  83 C6 04      add  si, 4
0x000000000000000e:  2E 8B 04      mov  ax, word ptr cs:[si]
0x0000000000000011:  2E 0B 44 02    or  ax, word ptr cs:[si + 2]
0x0000000000000015:  74 02          je  0x19 <-- jmp offset
0x0000000000000017:  E2 F2          loop 0xb <-- jmp offset
0x0000000000000019:  2E 89 1C      mov  word ptr cs:[si], bx
0x000000000000001c:  2E 89 7C 02    mov  word ptr cs:[si + 2], di
0x0000000000000020:  5F            pop  di
0x0000000000000021:  5E            pop  si
0x0000000000000022:  1F            pop  ds
0x0000000000000023:  07            pop  es
0x0000000000000024:  C3            ret 
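
The two lines marked "jmp offset" do not store the target address at all, only a signed 8-bit distance counted from the next instruction. A small Python sketch of that calculation (the function name is made up):

```python
def short_jump_target(instr_offset, displacement_byte, instr_length=2):
    """Target of a short jump/loop: address of the next instruction
    plus the displacement byte read as a signed 8-bit value."""
    if displacement_byte & 0x80:
        disp = displacement_byte - 0x100
    else:
        disp = displacement_byte
    return instr_offset + instr_length + disp

print(hex(short_jump_target(0x15, 0x02)))  # 74 02 -> je target
print(hex(short_jump_target(0x17, 0xF2)))  # E2 F2 -> loop target
```

Because these are relative, they keep working when the whole routine moves, but they break like any other number if code is inserted between the jump and its target.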

so "lea si, ds:0BC76h" is encoded in the exe as {8D 36 76 BC}
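
Reading those four bytes back by hand: 8D is the LEA opcode, 36 is the ModRM byte (register si, direct 16-bit address), and 76 BC is the offset stored low byte first. A minimal Python sketch that only handles this one form of the instruction (the function name is made up):

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode_lea_direct(code):
    """Decode 'lea reg16, [disp16]' (opcode 8D, ModRM mod=00 rm=110) only."""
    opcode, modrm, lo, hi = code
    if opcode != 0x8D:
        raise ValueError("not a LEA instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if (mod, rm) != (0, 6):
        raise ValueError("not the direct-address form")
    return REG16[reg], lo | (hi << 8)  # little endian: low byte comes first

reg, offset = decode_lea_direct(bytes.fromhex("8D 36 76 BC"))
print(reg, f"{offset:04X}h")
```

The offset is stored as a plain number in the exe, so nothing in the binary tells you whether it was written symbolically in the source.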