Basilisk II Core Emulation Analysis

Ricky Zhang edited this page Aug 19, 2017 · 43 revisions

"Emulation core" refers to the logic that translates M68k guest CPU instructions into instructions for a non-M68k host CPU (Intel x86, AMD64, ARM and PPC). Besides CPU core emulation, a separate page describes the emulation of 68k Macintosh peripheral hardware such as the timer, Ethernet and audio. Most of the following analysis is based on a study of an AMD64 Linux host, but it should apply to other host architectures and operating systems.

History

The facts described below are purely based on:
If it is on the Internet it must be true.

Basilisk II CPU emulation was first started by Christian Bauer. The initial source code has its roots in another emulation project, the Amiga (M68k) emulator UAE. A diff between the early commit 2bebaceabc7646d in the macemu git repo and the UAE v0.8.10 source code shows that the files build68k.c, cpuopti.c, gencpu.c and table68k are nearly identical to those in UAE. For further reading on UAE, see the UAE People section in the WinUAE documentation.

Based on the commit history, Gwénolé Beauchesne is the key contributor to Basilisk II CPU emulation. He added JIT translation (a.k.a. dynamic binary translation) to speed up emulation. TODO -- Add more after reading the JIT code.

Source Code

For emulation on non-M68k host CPUs, the source code is in the src/uae_cpu folder.

TODO -- Add overview of Glue/Adapter, UAE CPU, FPU and JIT.

Addressing

Background

There are two different perspectives in terms of memory addressing in emulation.

The first one is from the host OS point of view. An emulation program such as Basilisk II runs as an application in ring 3 (user space). Most modern host CPUs, such as Intel x86, AMD64, PPC and ARM, contain an MMU, which may provide segmentation and paging. Most modern host OSes, such as Linux, Mac OS X and Windows, use virtual addressing instead of direct physical addressing of memory.

For 32-bit CPUs, the CPU can in theory address up to 2^32 bytes (4GB) of virtual memory. A 64-bit CPU can address a vastly larger virtual space. However, this doesn’t mean that applications can use an arbitrary virtual address; what is available usually depends on the CPU architecture and the host OS implementation. For example, 32-bit Linux by default sets aside the lower 3GB for user space and the upper 1GB for kernel space [1].

The second perspective is from the guest Macintosh OS point of view. In theory, the guest OS doesn’t know if it is running under a physical M68k CPU or an emulated CPU provided by BII. Therefore, BII needs to provide memory address mapping between the guest OS and BII's user space memory in the host OS when executing translated instructions.

According to the Wikipedia page on M68k series CPUs [2], only the 68030 and later M68k series CPUs have a built-in paged MMU. In addition, Apple added virtual memory features to System 7. TODO -- Investigate whether BII emulates the PMMU. Try enabling virtual memory in the memory manager under the control panel.

In terms of the address mapping provided by Basilisk II, there are three different types: direct addressing, real addressing and virtual addressing. By default, the configure script determines the appropriate addressing strategy for you. If you know better than the automatic detection, you can override it by passing the --enable-addressing option to the ./configure script. It accepts the values direct, real and banks (note that banks refers to virtual addressing). The selected addressing mode is also shown in the summary printed after running ./configure:

...
Assembly optimizations ................. : x86-64
Addressing mode ........................ : direct
Bad memory access recovery type ........ : siginfo
...

Addressing Strategy Selection

It is interesting to analyze the configure.ac script to figure out how the optimal addressing strategy is automatically determined. The selection logic depends on several tests of the host OS and platform.

  1. Check whether the OS supports VOSF, a.k.a. Video on Segmentation Fault (for the technical details of VOSF, see [3]). VOSF support in turn depends on whether the OS supports a segmentation fault signal handler. This is determined by several compilation tests against the sources in the src/CrossPlatform folder, each controlled by macros. If any of the tests below passes under Linux/Mac OS X, the variable CAN_VOSF is set to ‘yes’.
/*Test 1: Check if OS supports extended signal handlers via asm*/
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#define HAVE_ASM_UCONTEXT 1
#define HAVE_SIGINFO_T 1
#define CONFIGURE_TEST_SIGSEGV_RECOVERY
#include "../CrossPlatform/vm_alloc.cpp"
#include "../CrossPlatform/sigsegv.cpp"


/*Test 2: Check if OS supports extended signal handlers*/
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#define HAVE_SIGINFO_T 1
#define CONFIGURE_TEST_SIGSEGV_RECOVERY
#include "../CrossPlatform/vm_alloc.cpp"
#include "../CrossPlatform/sigsegv.cpp"


/*Test 3: Check if OS supports the hack*/
#define HAVE_SIGCONTEXT_SUBTERFUGE 1
#define CONFIGURE_TEST_SIGSEGV_RECOVERY
#include "../CrossPlatform/vm_alloc.cpp"
#include "../CrossPlatform/sigsegv.cpp"
  2. Check whether the OS allows page zero mapping in user space or supports a related page zero hack. Because of NULL pointer dereference security concerns, modern Linux kernels disable page zero mapping by default, but Mac OS X still supports it.

In the end, the configure script uses the result of the tests:

  • If the host is a native M68k CPU, it uses real addressing.
  • If page zero mapping is supported and the host is able to do VOSF, it uses real addressing.
  • If page zero mapping is not supported, but it is able to do VOSF, it uses direct addressing.
  • Otherwise, it uses memory banks a.k.a virtual addressing.
if [[ "x$WANT_NATIVE_M68K" = "xyes" ]]; then
  ADDRESSING_MODE="real"
else
  ADDRESSING_MODE=""
  AC_MSG_CHECKING([for the addressing mode to use])
  for am in $ADDRESSING_TEST_ORDER; do
    case $am in
    real)
      dnl Requires ability to mmap() Low Memory globals
      if [[ "x$ac_cv_can_map_lm$ac_cv_pagezero_hack" = "xnono" ]]; then
        continue
      fi   
      dnl Requires VOSF screen updates
      if [[ "x$CAN_VOSF" = "xno" ]]; then
        continue
      fi   
      dnl Real addressing will probably work.
      ADDRESSING_MODE="real"
      WANT_VOSF=yes dnl we can use VOSF and we need it actually
      DEFINES="$DEFINES -DREAL_ADDRESSING"
      if [[ "x$ac_cv_pagezero_hack" = "xyes" ]]; then
        BLESS=Darwin/lowmem
        LDFLAGS="$LDFLAGS -pagezero_size 0x2000"
      fi   
      break
      ;;   
    direct)
      dnl Requires VOSF screen updates
      if [[ "x$CAN_VOSF" = "xyes" ]]; then
        ADDRESSING_MODE="direct"
        WANT_VOSF=yes dnl we can use VOSF and we need it actually
        DEFINES="$DEFINES -DDIRECT_ADDRESSING"
        break
      fi   
      ;;   
    banks)
      dnl Default addressing mode
      ADDRESSING_MODE="memory banks"
      break
      ;;   
    *)   
      AC_MSG_ERROR([Internal configure.in script error for $am addressing mode])
    esac
  done
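
The selection loop above can be restated as a small C++ function, which may make the decision order easier to follow. This is only a sketch: HostCapabilities and select_addressing_mode are hypothetical names introduced here, and the side effects of the script (DEFINES, LDFLAGS, WANT_VOSF) are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical inputs mirroring the configure variables discussed above.
struct HostCapabilities {
    bool native_m68k;     // WANT_NATIVE_M68K
    bool can_map_lowmem;  // ac_cv_can_map_lm
    bool pagezero_hack;   // ac_cv_pagezero_hack
    bool can_vosf;        // CAN_VOSF
};

// Walks the candidate modes in order and returns the first one whose
// requirements are met; returns an empty string if none matches.
std::string select_addressing_mode(const HostCapabilities &host,
                                   const std::vector<std::string> &test_order)
{
    if (host.native_m68k)
        return "real";
    for (const std::string &mode : test_order) {
        if (mode == "real") {
            // Needs the low memory globals mapped (directly or through the
            // page zero hack) and VOSF screen updates.
            if ((host.can_map_lowmem || host.pagezero_hack) && host.can_vosf)
                return "real";
        } else if (mode == "direct") {
            // Needs VOSF screen updates only.
            if (host.can_vosf)
                return "direct";
        } else {
            // Mirrors the `*)` branch: any mode other than "banks" is a bug.
            assert(mode == "banks" && "unknown addressing mode");
            return "memory banks";
        }
    }
    return "";
}
```

Assuming a test order of real, direct, banks, a host that can do VOSF but cannot map page zero ends up with "direct", which matches the configure summary shown earlier for the AMD64 Linux host.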

Common Data Structure and Utilities

Common data structures defined in src/uae_cpu/cpu_emulation.h

// RAM and ROM pointers
uint32 RAMBaseMac = 0;		// RAM base (Mac address space) gb-- initializer is important
uint8 *RAMBaseHost;			// RAM base (host address space)
uint32 RAMSize;				// Size of RAM
uint32 ROMBaseMac;			// ROM base (Mac address space)
uint8 *ROMBaseHost;			// ROM base (host address space)
uint32 ROMSize;				// Size of ROM

Common utility functions defined in src/uae_cpu/cpu_emulation.h

static inline uint32 ReadMacInt32(uint32 addr) {return get_long(addr);}
static inline uint32 ReadMacInt16(uint32 addr) {return get_word(addr);}
static inline uint32 ReadMacInt8(uint32 addr) {return get_byte(addr);}
static inline void WriteMacInt32(uint32 addr, uint32 l) {put_long(addr, l);}
static inline void WriteMacInt16(uint32 addr, uint32 w) {put_word(addr, w);}
static inline void WriteMacInt8(uint32 addr, uint32 b) {put_byte(addr, b);}
static inline uint8 *Mac2HostAddr(uint32 addr) {return get_real_address(addr);}
static inline uint32 Host2MacAddr(uint8 *addr) {return get_virtual_address(addr);}

static inline void *Mac_memset(uint32 addr, int c, size_t n) {return memset(Mac2HostAddr(addr), c, n);}
static inline void *Mac2Host_memcpy(void *dest, uint32 src, size_t n) {return memcpy(dest, Mac2HostAddr(src), n);}
static inline void *Host2Mac_memcpy(uint32 dest, const void *src, size_t n) {return memcpy(Mac2HostAddr(dest), src, n);}
static inline void *Mac2Mac_memcpy(uint32 dest, uint32 src, size_t n) {return memcpy(Mac2HostAddr(dest), Mac2HostAddr(src), n);}
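
The M68k is a big-endian CPU, so accessors like these have to produce big-endian values no matter what the host byte order is. The sketch below shows one way that could look. It does not reproduce how get_long and put_long are implemented; GuestMemory, guest_to_host, read_guest_u32 and write_guest_u32 are hypothetical names, and the byte-by-byte approach is an assumption chosen for portability.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for guest RAM; not the actual Basilisk II buffer.
struct GuestMemory {
    std::uint8_t *base;  // host pointer to guest address 0 (cf. RAMBaseHost)
    std::size_t size;    // number of guest bytes available (cf. RAMSize)
};

// Translate a guest address to a host pointer, in the spirit of Mac2HostAddr.
inline std::uint8_t *guest_to_host(const GuestMemory &mem, std::uint32_t addr)
{
    assert(addr < mem.size);
    return mem.base + addr;
}

// Read a 32-bit big-endian value, in the spirit of ReadMacInt32.
inline std::uint32_t read_guest_u32(const GuestMemory &mem, std::uint32_t addr)
{
    assert(mem.size >= 4 && addr <= mem.size - 4);
    const std::uint8_t *p = mem.base + addr;
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Store a 32-bit value in big-endian order, in the spirit of WriteMacInt32.
inline void write_guest_u32(const GuestMemory &mem, std::uint32_t addr,
                            std::uint32_t value)
{
    assert(mem.size >= 4 && addr <= mem.size - 4);
    std::uint8_t *p = mem.base + addr;
    p[0] = std::uint8_t(value >> 24);
    p[1] = std::uint8_t(value >> 16);
    p[2] = std::uint8_t(value >> 8);
    p[3] = std::uint8_t(value);
}
```

Assembling the value from individual bytes gives the same result on little-endian and big-endian hosts, at the cost of a few extra operations compared with a single load followed by a byte swap.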

Direct Addressing

Host memory pre-allocation function in direct addressing

This allocates memory in host OS for guest OS's RAM and ROM.

TODO -- Change the following into sequence diagram later.

vm_acquire_mac(RAMSize + 0x100000)  -- why is the ROM size only 1MB (0x100000)?
=>
vm_acquire(size, VM_MAP_DEFAULT | VM_MAP_32BIT) -- from the CrossPlatform VM allocator wrapper.
=>
vm_acquire calls the OS-specific user space memory allocation function and sets the read/write flags.

For example, on Linux the vm_acquire function calls mmap [4] to map the /dev/zero file at the starting address 0x10000000 (NOTE: the host virtual address starts at 256MB) with the requested size. This creates a chunk of zero-initialized memory in the host OS. vm_acquire then calls mprotect to give the allocated virtual memory read/write permission.
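
The mmap and mprotect sequence can be sketched as follows. This is not the actual vm_acquire code: acquire_guest_memory and release_guest_memory are hypothetical helpers, and the sketch uses an anonymous mapping (MAP_ANONYMOUS) instead of opening /dev/zero, which also yields zero-filled pages on Linux.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper, not the actual vm_acquire: reserve `size` bytes of
// zero-filled memory, asking the kernel to place it near `hint`.
void *acquire_guest_memory(void *hint, std::size_t size)
{
    // No MAP_FIXED here, so the kernel may choose another address if the
    // hinted range is unavailable.
    void *p = mmap(hint, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return nullptr;
    // Grant read/write access as a separate step, as described above.
    if (mprotect(p, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, size);
        return nullptr;
    }
    return p;
}

// Return the region to the OS.
void release_guest_memory(void *p, std::size_t size)
{
    int rc = munmap(p, size);
    assert(rc == 0);
    (void)rc;  // silence the unused-variable warning when NDEBUG is defined
}
```

A caller would pass the 0x10000000 hint mentioned above, for example acquire_guest_memory(reinterpret_cast<void *>(0x10000000), size), and should check the returned pointer rather than assume the hint was honored.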

Mapping in direct addressing

The mapping of direct addressing between host and guest is simple and efficient.

Host: RAM starts at the address returned by the vm_acquire_mac function. Let’s denote it as RAMBaseHost. ROM starts at RAMBaseHost + RAMSize.

Guest: RAM starts at 0. ROM starts at RAMSize.

As you can easily tell, the mapping between host and guest applies a fixed offset, RAMBaseHost. In my tests, I found that 64-bit Linux and Linux on ARMv7 use direct addressing. You can run the pmap command on the BasiliskII process to check the code analysis against actual memory usage at runtime. In my BII preferences, 1GB of RAM is specified. According to pmap, a read/write region of about 1GB (1048640K) is allocated starting at virtual address 0x10000000.

[Ricky@gtx Unix]$ cat ~/.basilisk_ii_prefs | grep ramsize
ramsize 1073741824
[Ricky@gtx Unix]$ ps -A | grep BasiliskII
27151 pts/1    00:00:01 BasiliskII
[Ricky@gtx Unix]$ pmap 27151
27151:   ./BasiliskII
0000000010000000 1048640K rw---   [ anon ]
0000000050010000   1896K r----   [ anon ]
0000000078000000   2160K rwx-- BasiliskII
000000007821c000    544K rwx--   [ anon ]
000000007a1ef000   2920K rw---   [ anon ]
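
The fixed-offset mapping of direct addressing can be sketched as below. DirectMapping, mac_to_host, host_to_mac and rom_base_mac are hypothetical names introduced for illustration; the sketch only models the layout described in this section (RAM at guest address 0, ROM immediately after RAM) and is not the actual Basilisk II code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the direct addressing layout described above.
struct DirectMapping {
    std::uint8_t *ram_base_host;  // address returned by the allocator (RAMBaseHost)
    std::uint32_t ram_size;       // RAMSize
    std::uint32_t rom_size;       // size of the ROM region placed right after RAM
};

// Guest -> host: add the fixed offset RAMBaseHost.
inline std::uint8_t *mac_to_host(const DirectMapping &m, std::uint32_t mac_addr)
{
    assert(std::uint64_t(mac_addr) < std::uint64_t(m.ram_size) + m.rom_size);
    return m.ram_base_host + mac_addr;
}

// Host -> guest: subtract the same offset.
inline std::uint32_t host_to_mac(const DirectMapping &m, const std::uint8_t *host_addr)
{
    assert(host_addr >= m.ram_base_host);
    return std::uint32_t(host_addr - m.ram_base_host);
}

// Guest address at which ROM starts in this layout.
inline std::uint32_t rom_base_mac(const DirectMapping &m)
{
    return m.ram_size;
}
```

Because the translation is a single addition or subtraction, it costs almost nothing at run time, which is the efficiency this section refers to.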

One side note: the text segment of the BasiliskII user space program is mapped above 0x70000000. This is done by a linker script [5] defined in the src/Unix/ldscripts folder; see the -T option in the link command.

...
g++ -o BasiliskII -Wl,-T,ldscripts/linux-x86_64.ld 	obj/main.o obj/prefs.o obj/prefs_items.o obj/sys_unix.o obj/rom_patches.o …
...

Real Addressing

Host memory pre-allocation function in real addressing

Real addressing requires the host CPU to be able to access host memory from 0x0000 to 0x2000. As mentioned above, modern Linux disables access to page zero by default for security reasons.

This allocates memory in host OS for guest OS's RAM and ROM in contiguous location:

TODO -- Change the following into sequence diagram later.

vm_acquire_mac_fixed(0, RAMSize + 0x100000)
=>
vm_acquire_fixed(addr, size, VM_MAP_DEFAULT | VM_MAP_32BIT)
=>
vm_acquire_fixed calls the OS-specific user space memory allocation function and sets the read/write flags.

Mapping in real addressing

Compared to direct addressing, real addressing is even more straightforward, with zero overhead in terms of address translation. An address in the guest CPU maps to the same address in the host CPU, so no address mapping is needed.

Native M68k CPU can use real addressing. But just for fun, let’s do an experiment to trick Basilisk II into real addressing mode under modern Linux host.

First, we explicitly set vm.mmap_min_addr to 0 in Linux so that we can map page zero in user space.

[root@gtx vm]# echo 0 > /proc/sys/vm/mmap_min_addr

Second, run configure:

./configure --enable-sdl-video --enable-sdl-audio --disable-jit-compiler --with-x --with-gtk --enable-addressing=real

However, configure reports that the addressing mode is memory banks instead. We already know that Linux can use VOSF, so the test that sets the $ac_cv_can_map_lm variable must have failed. I extracted the test program from configure.ac:

  /* confdefs.h */
 #define PACKAGE_NAME "Basilisk II"
 #define PACKAGE_TARNAME "BasiliskII"
 #define PACKAGE_VERSION "1.0"
 #define PACKAGE_STRING "Basilisk II 1.0"
 #define PACKAGE_BUGREPORT "[email protected]"
 #define PACKAGE_URL ""
 #define STDC_HEADERS 1
 #define HAVE_SYS_TYPES_H 1
 #define HAVE_SYS_STAT_H 1
 #define HAVE_STDLIB_H 1
 #define HAVE_STRING_H 1
 #define HAVE_MEMORY_H 1
 #define HAVE_STRINGS_H 1
 #define HAVE_INTTYPES_H 1
 #define HAVE_STDINT_H 1
 #define HAVE_UNISTD_H 1
 #define __EXTENSIONS__ 1
 #define _ALL_SOURCE 1
 #define _GNU_SOURCE 1
 #define _POSIX_PTHREAD_SEMANTICS 1
 #define _TANDEM_SOURCE 1
 #define PACKAGE "Basilisk II"
 #define VERSION "1.0"
 #define HAVE_LIBRT 1
 #define HAVE_LIBRT 1
 #define HAVE_LIBM 1
 #define HAVE_PTHREADS 1
 #define HAVE_PTHREAD_COND_INIT 1
 #define HAVE_PTHREAD_CANCEL 1
 #define HAVE_PTHREAD_TESTCANCEL 1
 #define HAVE_PTHREAD_MUTEXATTR_SETPROTOCOL 1
 #define HAVE_PTHREAD_MUTEXATTR_SETTYPE 1
 #define HAVE_PTHREAD_MUTEXATTR_SETPSHARED 1
 #define HAVE_SEM_INIT 1
 #define ENABLE_GTK 1
 #define STDC_HEADERS 1
 #define HAVE_STDLIB_H 1
 #define HAVE_STDINT_H 1
 #define HAVE_UNISTD_H 1
 #define HAVE_FCNTL_H 1
 #define HAVE_SYS_TYPES_H 1
 #define HAVE_SYS_TIME_H 1
 #define HAVE_SYS_MMAN_H 1
 #define HAVE_READLINE_READLINE_H 1
 #define HAVE_READLINE_HISTORY_H 1
 #define HAVE_SYS_SOCKET_H 1
 #define HAVE_SYS_IOCTL_H 1
 #define HAVE_SYS_BITYPES_H 1
 #define HAVE_SYS_WAIT_H 1
 #define HAVE_SYS_POLL_H 1
 #define HAVE_SYS_SELECT_H 1
 #define HAVE_ARPA_INET_H 1
 #define HAVE_LINUX_IF_H 1
 #define HAVE_LINUX_IF_TUN_H 1
 #define HAVE_NET_IF_H 1
 #define SIZEOF_SHORT 2
 #define SIZEOF_INT 4
 #define SIZEOF_LONG 8
 #define SIZEOF_LONG_LONG 8
 #define SIZEOF_FLOAT 4
 #define SIZEOF_DOUBLE 8
 #define SIZEOF_LONG_DOUBLE 16
 #define SIZEOF_VOID_P 8
 #define HAVE_LOFF_T 1
 #define HAVE_CADDR_T 1
 #define RETSIGTYPE void
 #define TIME_WITH_SYS_TIME 1
 #define HAVE_STRDUP 1
 #define HAVE_STRERROR 1
 #define HAVE_CFMAKERAW 1
 #define HAVE_CLOCK_GETTIME 1
 #define HAVE_TIMER_CREATE 1
 #define HAVE_SIGACTION 1
 #define HAVE_SIGNAL 1
 #define HAVE_MMAP 1
 #define HAVE_MPROTECT 1
 #define HAVE_MUNMAP 1
 #define HAVE_POLL 1
 #define HAVE_INET_ATON 1
 #define HAVE_STRINGS_H 1
 #define HAVE_SYS_STAT_H 1
 #define HAVE_PTY_H 1
 #define HAVE_VHANGUP 1
 #define HAVE_SLIRP 1
 #define USE_SDL 1
 #define USE_SDL_VIDEO 1
 #define USE_SDL_AUDIO 1
 #define ENABLE_TUNTAP 1
 #define HAVE_MMAP_VM 1
 #define HAVE_MMAP_ANON 1
 #define HAVE_MMAP_ANONYMOUS 1
 #define HAVE_MMAP_VM 1
 /* end confdefs.h.  */

#include "/home/Ricky/repo/github/macemu/BasiliskII/src/CrossPlatform/vm_alloc.cpp"
#include <stdio.h>
    int main(void) { /* returns 0 if we could map the lowmem globals */
      volatile char * lm = 0;
      if (vm_init() < 0) exit(1);
      if (vm_acquire_fixed(0, 0x2000) < 0) exit(1);
      lm[0] = 'z';
      if (vm_release((char *)lm, 0x2000) < 0) exit(1);
      vm_exit(); exit(0);
    }

Compile and run the test as follows. I got an illegal instruction error instead of a segfault.

g++ -o conftest -g -O2 -I/usr/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT    conftest.cpp -lm -lrt -lrt  -lSDL -lpthread
./conftest
Illegal instruction (core dumped)

Here is the more interesting finding. I ran gdb and found that it failed at lm[0] = ‘z’. After disassembling the binary, I found that a GCC optimization is at fault.

.text:00000000004006C0                 call    _Z16vm_acquire_fixedPvmi ; vm_acquire_fixed(void *,ulong,int)
.text:00000000004006C5                 test    eax, eax
.text:00000000004006C7                 js      short loc_4006D3
.text:00000000004006C9                 mov     byte ptr ds:0, 0
.text:00000000004006D1                 ud2

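GCC reasoned that lm is a NULL pointer, and dereferencing a NULL pointer is undefined behavior, so it did not generate the intended store of ‘z’. Instead it emitted a ud2 instruction, which raises the illegal instruction error.

If you remove the -O2 optimization flag from the compilation, the test runs successfully.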

g++ -o conftest -g -I/usr/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT    conftest.cpp -lm -lrt -lrt  -lSDL -lpthread
./conftest

Here is the assembly from the unoptimized binary:

.text:0000000000400A8F loc_400A8F:                             ; CODE XREF: main+1Aj
.text:0000000000400A8F                 mov     edx, 2
.text:0000000000400A94                 mov     esi, 2000h
.text:0000000000400A99                 mov     edi, 0
.text:0000000000400A9E                 call    _Z16vm_acquire_fixedPvmi ; vm_acquire_fixed(void *,ulong,int)
.text:0000000000400AA3                 shr     eax, 1Fh
.text:0000000000400AA6                 test    al, al
.text:0000000000400AA8                 jz      short loc_400AB4
.text:0000000000400AAA                 mov     edi, 1          ; status
.text:0000000000400AAF                 call    _exit
.text:0000000000400AB4 ; ---------------------------------------------------------------------------
.text:0000000000400AB4
.text:0000000000400AB4 loc_400AB4:                             ; CODE XREF: main+3Fj
.text:0000000000400AB4                 mov     rax, [rbp+var_8]
.text:0000000000400AB8                 mov     byte ptr [rax], 7Ah ; Ricky comment - 7Ah is ‘z’ ASCII
.text:0000000000400ABB                 mov     rax, [rbp+var_8]

Therefore, to work around this, you need to pass -O0 in the compile flags:

CFLAGS=-O0 CPPFLAGS=-O0 ./configure --enable-sdl-video --enable-sdl-audio --disable-jit-compiler --with-x --with-gtk --enable-addressing=real
…
Assembly optimizations ................. : x86-64
Addressing mode ........................ : real
Bad memory access recovery type ........ : siginfo


...

Build and run BasiliskII in real addressing mode:

[Ricky@gtx zero_mem]$ ps -A | grep BasiliskII
30607 pts/2    00:00:00 BasiliskII
[Ricky@gtx zero_mem]$ pmap 30607
30607:   ./BasiliskII
0000000000000000 1048576K rw---   [ anon ]
0000000040425000     64K rw---   [ anon ]
0000000040435000   1896K r----   [ anon ]

Finally, with the proper workaround, Basilisk II can map and access page zero.

Virtual Addressing

TODO

Static Analysis

TODO

Dynamic Analysis

TODO

Bibliography

  1. Virtual Memory and Linux
  2. Motorola_68000_series#Feature_map
  3. Explanation on VOSF from Basilisk II devel mailing list
  4. Linux mmap function
  5. Linker Script Guide