A sub–cycle-accurate Nintendo Entertainment System emulator.
Shawn (L. Spiro) Wilcoxen
A “sub–cycle-accurate” Nintendo Entertainment System emulator whose goal is to provide as authentic an experience as possible. It should look, sound, and feel like real hardware, with convincing visuals, clean and accurate audio, and real-time input response, with no visual or audible delays. BeesNES also represents the under-served regions with support for a wide range of console variants, currently including NTSC, PAL, Dendy, PAL-M, and PAL-N.
RF Cables:
Composite:
HDMI:
HDMI Mod:
PAL (Composite):
Dendy (Composite):
PAL-M (Brazil) (Composite):
PAL-N (Argentina) (Composite):
The top is a hardware reference recording of a test ROM. The bottom is the BeesNES audio output for the same test ROM. Zooming in on the highest frequencies reveals that BeesNES audio is crisp and clean, with high-frequency aliasing completely eliminated. All three images showcase the accuracy of the audio. Listen to MDFourier Test Audio
YouTube Video: Castlevania Demo Play (Low Noise)
YouTube Video: 1943: The Battle of Midway (RF Cables)
YouTube Video: Probotector (PAL, Composite Cables)
YouTube Video: Battletoads Opening (Extreme Noise)
YouTube Video: Double Dragon (Composite Cables)
YouTube Video: Akira Opening (Extreme Noise)
YouTube Video: Battletoads (RF Cables)
NTSC-CRT library: https://github.com/LMP88959/NTSC-CRT
PAL-CRT library: https://github.com/LMP88959/PAL-CRT
Persune palgen: https://github.com/Gumball2415/palgen-persune
We are aiming for “Sub-Cycle Accuracy”: https://emulation.gametechwiki.com/index.php/Emulation_accuracy#Subcycle_accuracy
This means that multi-byte writes are correctly partitioned across cycles and partial data updates are possible, allowing the more esoteric features of the system to be accurately emulated. We should therefore be able to support interrupt hijacking and any other cases that rely heavily on the cycle timing of the system.
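To illustrate what "partitioned across cycles" means, here is a minimal sketch of a two-byte stack push that advances one cycle per call, so the half-written state is observable between cycles. The `MiniCpu` type, its members, and `TickPushPc()` are hypothetical names made up for this example and are not taken from the BeesNES source.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical miniature CPU used only for illustration.
struct MiniCpu {
    std::array<uint8_t, 0x10000> ram{};  // Flat 64 KiB address space (simplified).
    uint16_t pc = 0;                     // Program counter to be pushed.
    uint8_t  sp = 0xFD;                  // Stack pointer (stack lives at 0x0100-0x01FF).
    int      step = 0;                   // Which cycle of the push we are on.

    // Advances the "push PC" operation by exactly one cycle.
    // Returns true once both bytes have been written.
    bool TickPushPc() {
        assert(step < 2);                // The operation is only two cycles long.
        if (step == 0) {
            ram[0x0100 + sp] = static_cast<uint8_t>(pc >> 8);    // Cycle 1: high byte only.
            --sp;
            step = 1;
            return false;
        }
        ram[0x0100 + sp] = static_cast<uint8_t>(pc & 0xFF);      // Cycle 2: low byte.
        --sp;
        step = 2;
        return true;
    }
};
```

After the first call only the high byte is on the stack. Anything else that runs between the two ticks (another chip, or an interrupt being recognized) sees that partial update, which is the kind of window interrupt hijacking depends on.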
Additional options/features to facilitate accurate emulation:
- Start-Up: Start from a known state or from a random state, which affects the random seed in some games.
- Hardware bugs will be emulated in both their buggy and fixed states; the OAMADDR bugs (writing fewer than 8 bytes on the 2C02G) are examples of this.
- Unofficial opcodes used by games will be optionally supported.
- The bus will be open and will correctly maintain the last floating read.
- Etc.
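As a rough illustration of the open-bus item above, the sketch below keeps the last value driven onto the data bus and returns it when an unmapped address is read. The `OpenBus` type and its address map are simplified assumptions for this example (everything at or above 0x2000 is treated as unmapped), and the code is deliberately written with plain `if` checks for readability.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bus model; the address ranges are simplified for illustration.
struct OpenBus {
    uint8_t ram[0x800] = {};
    uint8_t lastValue = 0;   // Value most recently driven onto the data bus.

    uint8_t Read(uint16_t addr) {
        if (addr < 0x2000) {
            // Internal RAM, mirrored every 0x800 bytes; RAM drives the bus.
            lastValue = ram[addr & 0x7FF];
        }
        // Otherwise nothing drives the bus in this sketch, so the read
        // returns whatever value was left floating on it.
        return lastValue;
    }

    void Write(uint16_t addr, uint8_t value) {
        lastValue = value;   // Writes also drive the bus.
        if (addr < 0x2000) {
            ram[addr & 0x7FF] = value;
        }
    }
};

// Usage example: an unmapped read echoes the last value seen on the bus.
inline void OpenBusDemo() {
    OpenBus bus;
    bus.Write(0x0010, 0x5A);
    assert(bus.Read(0x0810) == 0x5A);   // Mirror of 0x0010.
    assert(bus.Read(0x5000) == 0x5A);   // Unmapped: last floating value.
}
```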
If behavior differs from the actual hardware result, it is considered a bug. Hacks are to be avoided as much as possible.
The CPU should be completely sub–cycle-accurate, as every individual cycle is documented there. The same should apply to the PPU (questions surround PAL differences at the cycle level) and the APU.
Timing is not based on audio or monitor refresh rates, as is done in many emulators. We use a real clock (with at least microsecond accuracy) and match real timings to real time units, which can optionally be sped up or slowed down. The NTSC version’s CPU will need to pump out ~29,780.506887 cycles per frame at 60.098814 FPS, while the PAL version’s will need to pump out ~33,247.485977 cycles per frame at 50.006979 FPS. This means there is no noticeable visual delay (rendered frames are presented essentially immediately, rather than waiting for a monitor refresh, doing a frame’s worth of work, and then providing the visible frame after a delay) and that input is polled with exactly the same timing as in a real console, eliminating all input lag. It should both look and feel like a real console, with responsive controls that feel identical to those on real machines.
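One way to picture clock-driven pacing is to convert elapsed real time directly into the number of CPU cycles that should have run by now. The sketch below does that using the per-frame figures quoted above. `RegionTiming`, `CyclesDue()`, and the `speed` parameter are hypothetical names for this example, not the actual BeesNES interface.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Per-region figures as quoted in the text above.
struct RegionTiming {
    double cyclesPerFrame;
    double framesPerSecond;
};

constexpr RegionTiming kNtsc{ 29780.506887, 60.098814 };
constexpr RegionTiming kPal { 33247.485977, 50.006979 };

// Total CPU cycles that should have been executed after `elapsed` of real
// time. `speed` scales emulated time (2.0 = double speed, 0.5 = half speed).
inline uint64_t CyclesDue(const RegionTiming& timing,
                          std::chrono::microseconds elapsed,
                          double speed = 1.0) {
    assert(speed > 0.0);
    const double seconds = std::chrono::duration<double>(elapsed).count();
    const double cyclesPerSecond = timing.cyclesPerFrame * timing.framesPerSecond;
    return static_cast<uint64_t>(seconds * cyclesPerSecond * speed);
}
```

A main loop built on this idea would read a monotonic clock such as `std::chrono::steady_clock`, call `CyclesDue()`, and tick the CPU until its executed-cycle counter catches up, so the emulation rate follows real time rather than the monitor or audio device.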
There were initially some concerns that being sub–cycle-accurate would mean extra overhead: other emulators may skip useless redundant opcode fetches, but not here, where each fetch is accompanied by an entire CPU tick and all the work that goes into updating the CPU state. For this reason, most systems were implemented in an entirely branchless fashion. There are no “if”/“else” statements, “%” operations, “&” operations, or “>=”/“<” checks when accessing memory; address mirroring, address mapping to registers, and so on are all handled entirely without branching, and most CPU, PPU, and APU cycles are branchless as well. This more than made up for the cycle-accuracy overhead.
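As a sketch of how memory access can avoid conditional checks, the example below precomputes one accessor per address at setup time, so a read is just a table lookup followed by an indirect call. Table-driven dispatch is an assumption about one possible technique; the text does not say how BeesNES implements it, and `TableBus`, `Slot`, `ReadRam`, and `ReadOpen` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct TableBus;
using ReadFn = uint8_t (*)(TableBus& bus, uint16_t param);

// One entry per CPU address: which accessor to call and with what parameter.
struct Slot {
    ReadFn   fn;
    uint16_t param;
};

struct TableBus {
    std::array<uint8_t, 0x800> ram{};
    uint8_t lastValue = 0;          // Last value seen on the bus.
    std::vector<Slot> slots;        // 0x10000 entries, filled once in the constructor.

    TableBus();

    // The access path: no comparisons, masks, or modulo operations.
    uint8_t Read(uint16_t addr) {
        assert(slots.size() == 0x10000);
        const Slot& s = slots[addr];
        return lastValue = s.fn(*this, s.param);
    }
};

static uint8_t ReadRam(TableBus& bus, uint16_t index) { return bus.ram[index]; }
static uint8_t ReadOpen(TableBus& bus, uint16_t) { return bus.lastValue; }

TableBus::TableBus() : slots(0x10000) {
    // All range checks and mirroring masks are resolved here, once, at setup.
    for (uint32_t a = 0; a < 0x10000; ++a) {
        slots[a] = (a < 0x2000)
            ? Slot{ ReadRam, static_cast<uint16_t>(a & 0x7FF) }  // Mirrored RAM.
            : Slot{ ReadOpen, 0 };                               // Unmapped in this sketch.
    }
}
```

The mirroring of RAM at 0x0800, 0x1000, and 0x1800 costs nothing at read time because each mirrored address already points at the right RAM index. The trade-off is a larger table in exchange for a uniform, check-free access path.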
My custom filters and custom image-resizing routines are AVX/SSE-enhanced, and AVX/SSE is also used to do the heavy work in audio processing while remaining blazingly fast. On my laptop, the authentic CRT filters with 100% clean audio can run at 90 FPS, while the L. Spiro filters can run at 120-144 FPS. Even though max settings can still cleanly maintain 60.098… FPS, both audio and video can be reduced in quality to run even faster. Low-power machines should have no problems running BeesNES, and this is all still running in software. GPU support will eventually make everything even faster.
Other features will include:
- A debugger.
- A disassembler.
- An assembler.
- 1-877-Tools-4-TAS.
  - Stepping and keylogging.
  - Movie-making.
BeesNES does not use any third-party libraries other than OpenAL. Simply install the OpenAL SDK and BeesNES should build without a problem.
Microsoft Visual Studio Community 2022 (64-bit) - Current Version 17.4.4