diff --git a/ReverseEngineering/GhidraGuideResources/ctor_arr.png b/ReverseEngineering/GhidraGuideResources/ctor_arr.png new file mode 100644 index 0000000..ff56de6 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/ctor_arr.png differ diff --git a/ReverseEngineering/GhidraGuideResources/ctor_initializer_window.png b/ReverseEngineering/GhidraGuideResources/ctor_initializer_window.png new file mode 100644 index 0000000..e69e942 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/ctor_initializer_window.png differ diff --git a/ReverseEngineering/GhidraGuideResources/final_sram_layout.png b/ReverseEngineering/GhidraGuideResources/final_sram_layout.png new file mode 100644 index 0000000..8774573 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/final_sram_layout.png differ diff --git a/ReverseEngineering/GhidraGuideResources/ghidra_main_window.png b/ReverseEngineering/GhidraGuideResources/ghidra_main_window.png new file mode 100644 index 0000000..53016f2 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/ghidra_main_window.png differ diff --git a/ReverseEngineering/GhidraGuideResources/import_window.png b/ReverseEngineering/GhidraGuideResources/import_window.png new file mode 100644 index 0000000..843357f Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/import_window.png differ diff --git a/ReverseEngineering/GhidraGuideResources/memory_layout_icon.png b/ReverseEngineering/GhidraGuideResources/memory_layout_icon.png new file mode 100644 index 0000000..b100731 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/memory_layout_icon.png differ diff --git a/ReverseEngineering/GhidraGuideResources/reset_unnamed.png b/ReverseEngineering/GhidraGuideResources/reset_unnamed.png new file mode 100644 index 0000000..6bcac8c Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/reset_unnamed.png differ diff --git 
a/ReverseEngineering/GhidraGuideResources/script_dirs.png b/ReverseEngineering/GhidraGuideResources/script_dirs.png new file mode 100644 index 0000000..88936f6 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/script_dirs.png differ diff --git a/ReverseEngineering/GhidraGuideResources/script_manager.png b/ReverseEngineering/GhidraGuideResources/script_manager.png new file mode 100644 index 0000000..6d72bbc Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/script_manager.png differ diff --git a/ReverseEngineering/GhidraGuideResources/split_block.png b/ReverseEngineering/GhidraGuideResources/split_block.png new file mode 100644 index 0000000..7b13353 Binary files /dev/null and b/ReverseEngineering/GhidraGuideResources/split_block.png differ diff --git a/ReverseEngineering/LPC11U37F_Software.md b/ReverseEngineering/LPC11U37F_Software.md index a6991b7..8e9cd4f 100644 --- a/ReverseEngineering/LPC11U37F_Software.md +++ b/ReverseEngineering/LPC11U37F_Software.md @@ -1,14 +1,32 @@ # LPC11U37F/501 Software The purpose of this document is to track information regarding the software that - runs on the LPC11U37F main/master processor of the controller. The LPC11U37F - is a 32-bit processor ARMv6-M architecture with 16-bit Thumb ISA and includes + runs on the LPC11U37F main/master processor of the controller. The LPC11U37F + is a 32-bit processor ARMv6-M architecture with 16-bit Thumb ISA and includes Thumb-2 technology. +## Firmware primer + +The LPC11U37F binary is split into two parts: The first 0x2000 bytes contain +the bootloader, while the rest contains the firmware. You can find the version +of each component in Steam's controller "Support" screen, next to "bootloader +revision" and "firmware revision". + +The bootloader's job is simple: If the firmware at 0x2000 looks correct (has the +right magic at offset 0x30) and GPREG1 of the CPU isn't set to a magic value, +the bootloader will simply jump to the firmware's entrypoint. 
However, if the +firmware doesn't have the right magic or GPREG1 has a magic value, the +bootloader will enter programming mode, exposing a HID device that the Steam +desktop software can communicate with to flash a new firmware. + +The firmware's job is basically everything else, from playing the starting +jingle to getting button input and turning them into a HID stream to send over +USB/to the nRF chip. + # Reverse Engineering Artifacts -This section details the results of the reverse engineering effort. These are +This section details the results of the reverse engineering effort. These are the final references that contain data which is the basis for other Subprojects. Now that the artifacts below pertain to the vcf_wired_controller_d0g_57bf5c10.bin @@ -20,7 +38,7 @@ This file contains data on how the simulation has shown the controller to behave in different scenarios. The idea is to capture and identify as many actions as possible (with a focus on finding recurring code and memory usage) so that controller behavior can be understood to the extent and completely custom - firmware can be created. + firmware can be created. ## vcf_wired_controller_d0g_57bf5c10.h @@ -29,11 +47,15 @@ This file tracks unique functions called in vcf_wired_controller_d0g_57bf5c10.c ## vcf_wired_controller_d0g_57bf5c10.mem -The file tracks memory usage and attempts to identify how different section of - memory are used by the firmware. +The file tracks memory usage and attempts to identify how different section of + memory are used by the firmware. +## vcf_wired_controller_d0g_57bf5c10.bootloader.gzf -# Resources +This file is a complete reverse engineering of the bootloader section of the +binary, using [Ghidra]. 
+ +# Resources * [Datasheet](http://www.nxp.com/documents/data_sheet/LPC11U3X.pdf?fasp=1&WT_TYPE=Data%20Sheets&WT_VENDOR=FREESCALE&WT_FILE_FORMAT=pdf&WT_ASSET=Documentation&fileExt=.pdf) @@ -44,12 +66,12 @@ The file tracks memory usage and attempts to identify how different section of # Disassembling the Firmware -This section details the approaches attempted and ultimately used to reverse +This section details the approaches attempted and ultimately used to reverse engineer the Steam Controller firmware running on the LPC11U37. ## [pinkySim](https://github.com/greggersaurus/pinkySim) -This is the primary method used for simulating the firmware and deconstructing +This is the primary method used for simulating the firmware and deconstructing its behavior. * Note: Need to build pinkySim specifically with command "make pinkySim" as building unit tests fails... @@ -68,13 +90,13 @@ This is the primary method used for simulating the firmware and deconstructing ### Simulation Steps -#### Launch Simulator +#### Launch Simulator -The following command launches the emulator with the proper memory map for the +The following command launches the emulator with the proper memory map for the LPC11U37F501, has the emulator halt on the first instruction to execute after reset and instructs pinkySim to log all instructions executed to a file named exeLog{timestmap}.csv. exeLog{timestmap}.c is also created, which makes a simple - attempt at a C-like decomposition of the simulated actions. + attempt at a C-like decomposition of the simulated actions. 
* ./pinkySim --breakOnStart --logExe LPC11U37 --flash 0 131072 --ram 268435456 8192 --ram 536805376 16384 --ram 536870912 2048 --ram 536887296 2048 --ram 1073741824 16384 --ram 1073758208 16384 --ram 1073774592 16384 --ram 1073790976 16384 --ram 1073807360 16384 --ram 1073823744 16384 --ram 1073840128 16384 --ram 1073856512 16384 --ram 1073971200 16384 --ram 1073987584 16384 --ram 1074003968 16384 --ram 1074020352 16384 --ram 1074036736 16384 --ram 1074053120 16384 --ram 1074102272 16384 --ram 1074118656 16384 --ram 1074135040 16384 --ram 1074266112 16384 --ram 1342177280 16484 --ram 3758096384 1048576 firmware.bin * Note: --ram 536805376 16384 --flash, but since we need to fill this ROM with the boot ROM code via gdb, this needs to be writable @@ -99,64 +121,71 @@ This section details different paths from which we may want to simulate the firm #### Initialization By default, simulation starts at instruction specified via Vector Table RESET entry point. - + Simulation is allowed to run with minimal intervention (i.e. loops waiting for PLLs to lock or other hardware reactions are simulated as needed, pauses are - made to adjust values "read" from EEPROM). + made to adjust values "read" from EEPROM). Unknown paths are identified to be revisited lated with further stimuli to simulation runs. These are being marked by TODO: UNKOWN PATHS. Attempts are made being made to identify SRAM0 memory usage. -#### Exceptions +#### Exceptions -This section details attempts to simulate certain exception paths. This is being +This section details attempts to simulate certain exception paths. This is being pursued as Reset/Init path eventually called WFI instruction. This implies interrupts occurring is a necessary part of system boot (i.e. either successful connection or shutdown due to connection timeout). 
-Simulating an exception can be achieved by setting the PC register to the +Simulating an exception can be achieved by setting the PC register to the instruction specified in the Vector Table. However, keep in mind the nuances - of how IRQ actually work and that simply setting the PC will allow you to - simulate the interrupt accurately, but may be desctructive in terms of picking - back up where the main thread code was interrupted. + of how IRQ actually work and that simply setting the PC will allow you to + simulate the interrupt accurately, but may be desctructive in terms of picking + back up where the main thread code was interrupted. According to 24.3.3.6.1 xPSR, PC, LR, R12, R3, R2, R1 and R0 are saved upon interrupt and restored upon exit. Save and restore these values if you want to pick up the main thread after simulating an interrupt. Also, according 24.3.3.6.1 LR is set to EXC_RETURN upon interrupt entry. Thus a - bx to LR (assuming no further stack pushes (without matching pops) to change - LR), will indicate an interrupt exit. In short, set LR to a know instruction + bx to LR (assuming no further stack pushes (without matching pops) to change + LR), will indicate an interrupt exit. In short, set LR to a know instruction (maybe a WFI instruction or something that will cause an emulator break) before - changing the PC to the interrupt handler so that you know when the interrupt + changing the PC to the interrupt handler so that you know when the interrupt handler is exiting. ## [FirmwareParser.py](./FirmwareParser.py) This is only being used to display the Vector Table. -The original idea was to create disassembler that can recreate assembly file, +The original idea was to create disassembler that can recreate assembly file, distinguishing data versus instructions by evaluating code and all possible -branches. +branches. -In the end this was a larger undertaking than expected. 
It would make more -sense to leverage pinkySim's ability to decode instructions and their +sense to leverage pinkySim's ability to decode instructions and their behavior to do this, as opposed to starting from scratch. +## [Ghidra] + +Ghidra allows doing static analysis of binaries. We have a tutorial showing how +to load LPC11uxx binaries and start reverse engineering them at +[ghidra_guide.md](ghidra_guide.md). Furthermore, you can load the existing gzf +present in this repository in Ghidra to see already-labeled functions. + ## [Reverse Engineering for Beginners](https://github.com/dennis714/RE-for-beginners) -Have not made much use of this yet. +Have not made much use of this yet. Free book geared towards beginners on how to reverse engineer code. ## [Radare](http://www.radare.org/r/) This may be worth learning and using as a supplementary tool (now that time spent - with pinkySim has give me better understanding of assembly flow). + with pinkySim has give me better understanding of assembly flow). -Somewhat different approach than pinkySim in that we are looking at assembly and trying +Somewhat different approach than pinkySim in that we are looking at assembly and trying to assess actions that could be taken and potential purpose, as opposed to focusing on actions taken during simulation and why they were taken (and maybe should not have been). @@ -169,7 +198,7 @@ Somewhat different approach than pinkySim in that we are looking at assembly and * What about #if 0 section with armv6 options? * How to deal with firmware stripped binary format (i.e. no code/data sections information)? * Build up scripting tooling to handle this? (i.e. don't decode instructions we "know" we "cannot reach") - * Scripting to identify code/data semi-automatically? + * Scripting to identify code/data semi-automatically? * Can use flow visualization to identify different sections of code to focus on? 
* Allows us to see paths we are not taking (and where they might go), rather than just knowing a branch was not taken. @@ -204,7 +233,7 @@ This proved to not be particularly useful for raw stripped binary. This proved to not be particularly useful for raw stripped binary. * ./arm-none-eabi-objdump -b binary -D vcf_wired_controller_d0g_57bf5c10.bin -m arm attempts to disassemble binary file -* With have firmware binary format (i.e. no code/data sections information) this only helps so much. +* With have firmware binary format (i.e. no code/data sections information) this only helps so much. * Simulation with pinkySim can be utilized to possibly rebuild this section information and then revisit this tool? ## [ARMu](http://pel.hu/armu/) @@ -219,3 +248,4 @@ Have not need to look into this much yet. * Worth looking into? +[Ghidra]: https://github.com/NationalSecurityAgency/ghidra \ No newline at end of file diff --git a/ReverseEngineering/README.md b/ReverseEngineering/README.md index f7badd5..a4f981f 100644 --- a/ReverseEngineering/README.md +++ b/ReverseEngineering/README.md @@ -1,16 +1,16 @@ # Reverse Engineering -Welcome to Reverse Engineering Subproject portion the Open Steam Controller effort. - +Welcome to Reverse Engineering Subproject portion the Open Steam Controller effort. + The work in this directory is the result of trying to understand the hardware and its capabilities based on the available resources. In this case the resources available were the circuit board itself and the raw binary firmware for the - controller's main processor (the LPC11U37F). + controller's main processor (the LPC11U37F). There is a lot of really neat and useful information captured here (i.e. 
the fact that there is a section of EEPROM where Jingle Data can be stored to change the official firmware's default behavior, how the interface to the - Trackpads works) and the result of these efforts are the basis of many of the + Trackpads works) and the result of these efforts are the basis of many of the Subprojects in the Open Steam Controller effort. If anything is unclear, or you think I am not drawing proper attention to feature I have unearthed, please be sure to let me know. @@ -19,7 +19,7 @@ See the sections below for further details on the data that has been captured in regards to the hardware and software, but please note that this is my first attempt at reverse engineering. I may have not gone about this in the most efficient or understandable manner, but I have done my best to make sure the results - are captured concisely so that others may benefit from it. + are captured concisely so that others may benefit from it. # Understanding the Hardware @@ -28,26 +28,29 @@ This section captures details on the controller hardware (i.e. what pins are connected to what peripherals or pins on other chips). This data was sometimes obtained simply by using digital multimeter to ohm out connections. Other times reverse engineering the firmware, or running tests with custom firmware were - required to fully understand how the hardware was designed. + required to fully understand how the hardware was designed. -See [Luna_maiboard_V000456-00_rev3.md](./Luna_maiboard_V000456-00_rev3.md) - for information regarding the Steam Controller hardware pertaining to +See [Luna_maiboard_V000456-00_rev3.md](./Luna_maiboard_V000456-00_rev3.md) + for information regarding the Steam Controller hardware pertaining to Luna_mainboard V000456-00 rev3. # Understanding the Software -This section captures details on the software running on the controller +## PinkySim and Ghidra + +This section captures details on the software running on the controller processors. 
This data was primarily obtained by using a modified version of [pinkySim](https://github.com/greggersaurus/pinkySim), which allowed for - simulating the main processor (LPC11U37F) and logging relevant actions. + simulating the main processor (LPC11U37F) and logging relevant actions. Verification of different behaviors often required running custom firmware - to ensure the proper paths were being simulated. + to ensure the proper paths were being simulated. A separate effort doing static + analysis using [Ghidra](https://github.com/NationalSecurityAgency/ghidra) is + also described. See [LPC11U37F_Software.md](./LPC11U37F_Software.md) for information regarding the software running on the LPC11U37 main/master processor. - # TODO See [TODO](./TODO.md) for details. diff --git a/ReverseEngineering/ghidra_guide.md b/ReverseEngineering/ghidra_guide.md new file mode 100644 index 0000000..203973f --- /dev/null +++ b/ReverseEngineering/ghidra_guide.md @@ -0,0 +1,165 @@ +# Ghidra binary loading tutorial + +This tutorial will show you how to load a new LPC11 Steam Controller firmware +into Ghidra for reverse engineering purposes. Note that this is mostly +unnecessary if opening a pre-loaded GZF, as those will already be properly +loaded. + +## Loading the binary + +First, you should split the binary into two components. Every Steam Controller +firmware is split in two parts: the bootloader spans the first 0x2000 bytes, +whilst the firmware takes the rest of the file. It is strongly recommended to +only load one or the other into Ghidra - loading both at the same time will make +Ghidra struggle with "shared" global variables between the two programs. + +For the purpose of following this tutorial, I highly recommend using the +bootloader, as it's what I used, and loading a firmware requires some complex +memory layout mangling to trick Ghidra into being happy. 
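
The split described above can be done with any tool that copies byte ranges. Here is a minimal Python sketch, assuming the full image is a flat file whose first 0x2000 bytes are the bootloader. The function names and the output file names are illustrative and not part of this repository.

```python
from pathlib import Path

# The guide states that the bootloader spans the first 0x2000 bytes.
BOOTLOADER_SIZE = 0x2000


def split_image(image: bytes) -> tuple[bytes, bytes]:
    """Return the (bootloader, firmware) parts of a full flash image."""
    if len(image) <= BOOTLOADER_SIZE:
        raise ValueError("image is too small to contain a firmware part")
    return image[:BOOTLOADER_SIZE], image[BOOTLOADER_SIZE:]


def split_file(path: str) -> tuple[Path, Path]:
    """Write <name>.bootloader.bin and <name>.firmware.bin next to `path`.

    The output names are hypothetical; pick whatever suits your project.
    """
    src = Path(path)
    bootloader, firmware = split_image(src.read_bytes())
    boot_path = src.with_name(src.stem + ".bootloader.bin")
    fw_path = src.with_name(src.stem + ".firmware.bin")
    boot_path.write_bytes(bootloader)
    fw_path.write_bytes(firmware)
    return boot_path, fw_path
```

Calling `split_file("vcf_wired_controller_d0g_57bf5c10.bin")` would then produce two files that can be imported into Ghidra separately.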
+ +To load a binary into Ghidra, you first want to create a new project (which may +contain many different files for reverse engineering). You will then want to +import your firmware binary using File -> Import File. Ghidra will open an +import window. + +![import_window](GhidraGuideResources/import_window.png) + +You will want to set the "Language" to "ARM:LE:32:Cortex:default". This tells +Ghidra to treat the file as an ARM Little Endian binary that targets Cortex-M +devices, and force it to create a vector table at the start of the firmware. + +If loading a firmware and not a bootloader, go in the options and make sure to +set the "Base Address" to 0x2000. + +Once imported, the file will appear in the project. Double click it to open it +in the CodeBrowser, the main Ghidra window. Ghidra will ask to auto-analyse the +project, say yes. Ghidra will spend a little while trying to find code. You can +see its progress in the bottom right corner. Once it is done (shouldn't take +more than a minute), you will start seeing functions showing up in the +"Functions" tab on the left. We can now start decompiling some code. We can +double click on the address under "Reset" to have the asm view travel there. + +![ghidra_main_window](GhidraGuideResources/ghidra_main_window.png) + +# Setting the Memory Layout + +To help Ghidra give us good decompilation output, we need to make it aware of +the memory layout of our binary and the device it runs on. Right now, Ghidra +simply assumes that there is a single region of memory spanning the binary in +RWX mode, and nothing else is mapped. + +The first thing we'll want to do is set the binary to be in RX. We know that the +binary is not directly writable - flash memory tends to need rather complex MMIO +sequences to be written to. To do so, click on the Memory Map button ![that +looks like a PCB](GhidraGuideResources/memory_layout_icon.png) and untick the +"w" on the row named "RAM". While we're at it, we can rename that row to "ROM". 
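
To cross-check the "Reset" entry that Ghidra shows, the start of the vector table can also be decoded by hand. On Cortex-M parts the first 32-bit word of the table is the initial stack pointer and the second is the reset handler address, with bit 0 set to mark Thumb code. The sketch below assumes the image passed in starts with its vector table; the function name is hypothetical.

```python
import struct


def read_vector_table_head(image: bytes) -> dict:
    """Decode the first two little-endian words of a Cortex-M vector table."""
    initial_sp, reset = struct.unpack_from("<II", image, 0)
    return {
        "initial_sp": initial_sp,
        # Bit 0 only flags Thumb mode; the handler's code starts at the
        # address with that bit cleared.
        "reset_handler": reset & ~1,
        "thumb_bit": bool(reset & 1),
    }
```

The `reset_handler` value should match the address Ghidra jumps to when you double click the entry under "Reset".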
+ +If working with a firmware, you'll want to create a new section named +"VECTOR_TABLE" at address 0, of size 0xc0, and with the "File Bytes" option +checked. This will duplicate the first 0xc0 bytes of our firmware image (which +contains the vector table) at address 0. + +Next, we can look at the LPC11UXX user guide to find where the RAM regions are. +It tells us that there are three different RAM regions: + +- SRAM0, from 0x10000000, 0x2000 bytes +- SRAM1, from 0x20000000, 0x800 bytes +- USB_RAM, from 0x20004000, 0x800 bytes + +Click on the green plus button to add each one as a new memory region. They should be RW, +and left uninitialized for now. + +Once all three areas are mapped, press the save button in the memory layout and +exit it. The next step is to add the MMIO regions. We'll use a fixed version of +the [Ghidra SVD Loader] in order to load SVD definitions for our CPU into +Ghidra. `SVD` is an XML file format standardized by ARM to define the MMIO +regions of a CPU, among others. Ghidra SVD Loader will use those files to create +the memory regions and structures necessary to guide the decompiler towards +producing good output for code interacting with those. + +Install the Ghidra SVD Loader by cloning the git repo, going to Ghidra's Script +Manager (![script_manager](GhidraGuideResources/script_manager.png)) and adding +its directory to Ghidra's Script Directories (![script_dirs](GhidraGuideResources/script_dirs.png)). Next, we'll download the SVD definitions for our CPU. We'll +want the [LPC11Uxx_v7.svd] file and the [cm0.svd] file. Once downloaded, look +for SVD-Loader in the script manager, and click the "Run Script" button. It will +ask for a file. We'll first give it the LPC11Uxx_v7.svd file. Once it's done, +run it again, this time providing the cm0.svd file. + +With all this done, we now have a fully imported binary! 
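
Since SVD files are plain XML, it can be useful to look at what the loader will map before running it. The sketch below lists the peripherals in an SVD document together with their base addresses, using only the Python standard library. It is an illustration of the file format, not a description of how the Ghidra SVD Loader works internally, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_peripherals(svd_text: str) -> list[tuple[str, int]]:
    """Return (name, base address) pairs from an SVD document, sorted by address."""
    root = ET.fromstring(svd_text)
    result = []
    for periph in root.iter("peripheral"):
        name = periph.findtext("name")
        base = periph.findtext("baseAddress")
        if name is None or base is None:
            continue  # skip malformed entries rather than guessing
        # int(..., 0) accepts both "0x..." hex and plain decimal values.
        result.append((name.strip(), int(base.strip(), 0)))
    return sorted(result, key=lambda item: item[1])
```

Running this on the contents of LPC11Uxx_v7.svd gives a quick overview of the MMIO blocks that should appear in the Memory Map once the script has finished.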
+ +[Ghidra SVD Loader]: https://github.com/roblabla/SVD-Loader-Ghidra/ +[LPC11Uxx_v7.svd]: https://github.com/posborne/cmsis-svd/blob/master/data/NXP/LPC11Uxx_v7.svd +[cm0.svd]: https://github.com/AdaCore/svd2ada/blob/master/CMSIS-SVD/Cortex_M/cm0.svd + + +# Reverse engineering the CRT + +The first thing to reverse engineer will be the CRT init functions - those are +the first couple of functions in the binary, responsible for clearing the BSS, +copying the .data RW segment into RAM, and doing some fancy things like +decompression. + +The Reset function should be calling either one or two functions: The first one, +only present in the bootloader, does some minimal hardware setup, and isn't very +interesting. The second one is the crt0 init function. You should have a +decompilation output looking roughly like this: ![reset_unnamed](GhidraGuideResources/reset_unnamed.png) + +We can see Ghidra is calling function pointers stored in a global. This is the +`crt_init`. We'll create a structure in Ghidra to represent those functions. In +the "Data Type Manager" view (bottom left of the CodeBrowser window), right +click on your project name (here vcf_wired_controller_d0g_57bf5c10) and choose +New -> Structure. You will be presented with a structure editor window. First, +fill out the structure name with "ctor_initializer". Then, fill in the table +to create four fields, such that the editor looks like this: + +![ctor_initializer_window](GhidraGuideResources/ctor_initializer_window.png) + +Save the structure and close the Structure Editor. We can now set the type of +our initializer. Right click the global (`UNK_0000183c`) and choose "Retype +Global". We'll set its type to `ctor_initializer[1]`. If Ghidra complains that +there is not enough space, double click on the global, clear any already typed +globals there (select them and press the `C` key), then try again. You can also +rename the global by placing your cursor on top of it and pressing the `L` key. 
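
The same table can be dumped outside of Ghidra as a sanity check. The sketch below assumes each `ctor_initializer` entry is four little-endian 32-bit words. Only the `start_flash` name comes from this guide; the other field names and the field order are guesses made for illustration, so compare them against the structure editor screenshot before relying on them.

```python
import struct
from typing import NamedTuple


class CtorInitializer(NamedTuple):
    # Assumed layout: four 32-bit words. Field names other than
    # `start_flash`, and the order of all fields, are hypothetical.
    start_flash: int
    start_ram: int
    size: int
    init_fn: int


ENTRY_SIZE = 16  # 4 fields * 4 bytes, under the assumption above


def read_initializers(image: bytes, offset: int, count: int) -> list[CtorInitializer]:
    """Decode `count` consecutive initializer entries starting at `offset`."""
    return [
        CtorInitializer(*struct.unpack_from("<4I", image, offset + i * ENTRY_SIZE))
        for i in range(count)
    ]
```

For the bootloader, which is loaded at address 0, the file offset is the same as the address of the global, so something like `read_initializers(bootloader, 0x183c, 1)` would decode the first entry.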
+ +Looking at the while loop, we see that the code will repeatedly call the +initializer on every member of the array. We can easily figure out that there +are two elements to the initializer array, so we can set its type to +`ctor_initializer[2]`. If we double click our CTOR_ARR and expand all of its +fields, we should now see something like this: + +![ctor_arr](GhidraGuideResources/ctor_arr.png) + +Now that we have our initializers, the next step is to figure out where they are +and what they do. The first step is to click on both initializer labels and turn +them into functions, and give their parameters names. It'll become pretty obvious +pretty fast that the first initializer is copying data from flash into RAM, while +the second initializer is setting the RAM to 0 without reading the flash. From +this, we can deduce that the first part is the .data, while the second is our +.bss. + +Let's encode this knowledge into the memory layout of our binary! Reopen the +memory layout ![PCB icon](GhidraGuideResources/memory_layout_icon.png) and split +the SRAM0 block using the Split Block icon (![split block](GhidraGuideResources/split_block.png)), such that you now have four blocks: SRAM0.start, SRAM0.data, +SRAM0.bss and SRAM0.end. Then, delete the SRAM0.data - we'll recreate it such +that it is initialized with the data. Create a new block, named SRAM0.data and +spanning the same byte range. But this time, tick the "File Bytes" checkmark, +and fill "File Offset" with the value in `start_flash` of your .data segment +(0x185c in the example). + +The final SRAM layout should look like this: ![sram_layout](GhidraGuideResources/final_sram_layout.png) + +Note that the firmware binary has a more complicated initializer that decompresses +the data. To get those initialized in Ghidra, the easiest way is to use Ghidra's +emulation features - a separate tutorial to do this would be in order however. + +# Tips and Tricks + +The firmware is written using the open source LPCOpen framework. 
Almost all code +that interacts with MMIO registers is actually a function from this framework, +so it is generally a good idea to look for an equivalent function there +when naming the functions or trying to figure out the arguments. + +When finding a function that doesn't ever return (for instance, a function doing +an infinite loop), make sure you edit the function and set it as noreturn! This +will prevent Ghidra from "merging" together unrelated functions that come after +a call to a noreturn function! diff --git a/ReverseEngineering/vcf_wired_controller_d0g_57bf5c10.bin.program1.gzf b/ReverseEngineering/vcf_wired_controller_d0g_57bf5c10.bin.program1.gzf new file mode 100644 index 0000000..76540ab Binary files /dev/null and b/ReverseEngineering/vcf_wired_controller_d0g_57bf5c10.bin.program1.gzf differ