VAX qe interface has problems on macOS, when compiled with -O1 or -O2, works when DEBUG=1 is set #387

Open
NCommander opened this issue Jun 9, 2024 · 5 comments

Comments

@NCommander

  • Context

Using the network interface on OpenBSD or NetBSD mostly works on a stock Open SIMH build, but it has problems under heavy load or with certain small packets, such as ICMP ping.

PING 4.2.2.2 (4.2.2.2): 56 data bytes
64 bytes from 4.2.2.2: icmp_seq=0 ttl=255 time=769335973.549 ms
wrong data byte #8 should be 0x8 but was 0xc0
        38 1 f6 52 4 2 2 2 c0 a8 0 65 0 0 57 54 b8 74 0 0 66 63 b7 e5 0 b e6 df 8 9 a b 
        c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 
64 bytes from 4.2.2.2: icmp_seq=0 ttl=255 time=769337983.549 ms (DUP!)
wrong data byte #8 should be 0x8 but was 0xc0
        38 1 f6 51 4 2 2 2 c0 a8 0 65 0 0 30 42 b8 74 0 1 66 63 b7 e6 0 c d ef 8 9 a b 
        c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 
--- 4.2.2.2 ping statistics ---
2 packets transmitted, 1 packets received, 1 duplicates, 50.0% packet loss
round-trip min/avg/max/std-dev = 999999.999/769336978.549/769337983.549/1005.000 ms
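
As a rough aid for reading the dump above, here is a minimal sketch that measures how far the expected ping fill pattern has moved. It assumes the dump starts at data byte 0 (consistent with byte #8 being reported as 0xc0) and that ping fills data byte i with the value i from byte 8 onward, as the "should be 0x8" message suggests. The names `parse_dump` and `find_pattern_shift` are hypothetical helpers, not part of ping or SIMH.

```python
# Hex bytes copied verbatim from the first dump in the ping output above.
DUMP = """
38 1 f6 52 4 2 2 2 c0 a8 0 65 0 0 57 54 b8 74 0 0 66 63 b7 e5 0 b e6 df 8 9 a b
c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b
"""


def parse_dump(text):
    """Turn whitespace-separated hex tokens into a list of integers."""
    return [int(tok, 16) for tok in text.split()]


def find_pattern_shift(data, first_index=8):
    """Return how many bytes the 'byte i holds value i' run is displaced.

    Assumption: the fill pattern should begin at data[first_index].
    Returns None if the pattern is not found anywhere in the buffer.
    """
    for shift in range(len(data) - first_index):
        tail = data[first_index + shift:]
        if all(b == first_index + k for k, b in enumerate(tail)):
            return shift
    return None


if __name__ == "__main__":
    data = parse_dump(DUMP)
    print("bytes in dump:", len(data))
    print("pattern displaced by:", find_pattern_shift(data), "bytes")
```

On this dump the sketch reports a displacement of 20 bytes: the 8 9 a b ... run is intact but starts at offset 28 instead of offset 8. What the 20 bytes in front of it represent is not established here.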

# uname -a
OpenBSD vax55.my.domain 5.5 kbuild#7 vax

This problem was discovered while running distcc from OpenBSD in an effort to rebuild the OS a bit faster, but it shows up in other network-related ways. I'm using the NAT attachment on macOS. I get similar network weirdness in NetBSD, although I haven't had a network deadlock on NetBSD like I have had in multiple versions of OpenBSD.

Compiling Open SIMH with DEBUG=1 causes the network stack to behave.

I tried running SIMH on NetBSD/arm64 as an additional data point, but it kept hanging trying to load the KA655 self-test.

  • the output of "sim> SHOW VERSION" while running the simulator which is having the issue

sim> show version
MicroVAX 3900 simulator Open SIMH V4.1-0 Current
    Simulator Framework Capabilities:
        64b data
        64b addresses
        Threaded Ethernet Packet transports:PCAP:NAT:UDP
        Idle/Throttling support is available
        Virtual Hard Disk (VHD) support
        RAW disk and CD/DVD ROM support
        Asynchronous I/O support (Lock free asynchronous event queue)
        Asynchronous Clock support
        FrontPanel API Version 12
    Host Platform:
        Compiler: GCC Apple LLVM 14.0.0 (clang-1400.0.29.202)
        Simulator Compiled as C (Release Build) on Nov 26 2023 at 20:18:44
        Build Tool: simh-makefile
        Memory Access: Little Endian
        Memory Pointer Size: 64 bits
        Large File (>2GB) support
        SDL Video support: No Video Support
        No RegEx support for EXPECT commands
        OS clock resolution: 1ms
        Time taken by msleep(1): 1ms
        Ethernet packet info: libpcap version 1.10.1
        OS: Darwin infinityway.local 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:19:22 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T8112 x86_64
        Processor Name: Apple M2
        tar tool: bsdtar 3.5.3 - libarchive 3.5.3 zlib/1.2.12 liblzma/5.4.3 bz2lib/1.0.8
        curl tool: curl 8.4.0 (x86_64-apple-darwin23.0) libcurl/8.4.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.58.0
        git commit id: 2396fd03
        git commit time: 2023-11-15T20:36:10-0500
sim> 
  • how you built the simulator or that you're using prebuilt binaries

Pull current git master, run make DEBUG=1, load up my OpenBSD image, and the network problems go away. Building with a standard make or with make OPTIMIZE=-O1 still leaves the network stack broken.

  • the simulator configuration file (or commands) which were used when the problem occurred.

ncommander@infinityway ~/SIMH/OpenBSD/5.5
 % cat openbsd-boot-vax 
#set cpu autoboot
set cpu simhalt
set cpu 64m 
attach nvr openbsd-vax-nvram.img
set rq0 ra92
attach rq0 openbsd-vax-hdd-root.img
set rq1 cdrom
attach rq1 install55.iso
set rq2 ra92
attach rq2 opensbd-vax-src-build.img
set xq type=delqa
attach xq nat:
boot cpu
  • the expected behavior and the actual behavior

I should be able to have an optimized SIMH and a working network at the same time. Currently the network only behaves in a DEBUG=1 build.

@pkoning2
Member

pkoning2 commented Sep 4, 2024

I'm confused. The title speaks of Mac OS, but the description keeps talking about OpenBSD.

Could you explain more precisely what OS you're building on, with what tools, and what network option (pcap, vde, etc.). What simulator are you building, and what are you running on that simulator, and what is the test that demonstrates the issue?

@NCommander
Author

I'm building on Mac OS, with the MicroVAX 3900 simulator, running OpenBSD 5.8/vax in the simulator.

The problem is that when SIMH is built with -O2, the qe driver breaks, with the OpenBSD messages shown above. I'm using distcc in OpenBSD/vax to handle compiling the base system. The errors generally start after a few minutes of network activity, and then networking stops working in the simulator entirely.

The problem largely seems to happen when a large number of TCP/IP connections are opening and closing within a short period, but I'm not sure if that's what's actually triggering it.

Building SIMH with -O0 causes the network within OpenBSD/vax to work correctly, and I can use distcc with multiple simulators to great effect. From the behavior and the messages I'm seeing, this feels like an alignment issue, but it only happens when the system is under load.

@pkoning2
Member

pkoning2 commented Sep 4, 2024

Thanks. One possible answer is that it's an OpenBSD issue related to timing of the emulated device. The emulator runs much faster than the real hardware, so if there are timing bugs they can appear in simulation even if the bug is impossible to reach on the original hardware.

@hbent

hbent commented Sep 6, 2024

I agree that unless this can be replicated on real hardware, it isn't a SIMH issue. I haven't seen any similar issues with Ultrix, 4.2BSD, NetBSD, etc. and I've used all of them in a fairly heavily loaded way for extended periods of time.

@NCommander
Author

As mentioned above, I've had similar problems with NetBSD; the same fix, compiling with DEBUG=1/-O0, resolved it there. So I can reproduce this with different OSes in SIMH.

I'll test more when I get a chance, but from memory, I did have stable network on SIMH with the same OS image on different host platforms; I only had broken networking on macOS as a host platform.
