Hmm, yeah I don't know. This reads like a lot of fluff or immediately unimportan...

boricj · on July 25, 2024

Applied reverse-engineering is all about bending the rules of engineering. Because of this, I think it can be learned through experience, but I doubt it can be taught through theory (or at least not in an effective manner). At its core, it's about spotting metapatterns to gain an understanding of a program and applying leverage to affect it. That's more art than science, no matter how much tooling you throw at it.

Honestly, I think the most effective way to learn about how to reverse-engineer something is to learn engineering at the same layer first and then start tinkering. If you want to binary patch a program, learn assembly. If you want to inject a .dll, learn how to write and use dynamic libraries. If you want to MITM a REST API, learn how to call a REST API. Because once you know the rules well, you can start breaking them and see exactly how much you can get away with.

I wrote a series of articles on reverse-engineering on my blog, about studying and modifying a program that outputs an ASCII table, mostly because I needed a way to introduce delinking as a technique. I would not say it's good, but it starts with how to build the case study and then it handholds the reader through the meat and potatoes.

PennRobotics · on July 26, 2024

This. There's a lot to be said about understanding registers and assembly and different languages and how a USB packet is constructed, but efficiency in reverse engineering comes down to effective pattern recognition.

A binary is likely to have a reasonable amount of often-called code for memory operations (memset, memcpy, strcat, strlen, sscanf, log) and a lot of library code (Flexcomm_Init, Clock_AttachClk, SPI1_Handler, NVIC_EnableIRQ) and then probably fairly little actual application code. For Ghidra users, being able to ignore the boilerplate (mem and BSP code) and quickly find and analyze the application code saves a TON of time.

(Conversely, if I know a binary is written using FreeRTOS, finding the task creation function would be my first step, as this reveals nearly all of the application code.)

There are techniques to help (setting a flash memory region as non-write so string references are recognized and disassembled correctly, loading a chip SVD so all the library code is more obvious) but those come with experience or a good hands-on tutorial, and they still won't tell you everything about the application code.

In my own breakdown of one Cortex-M binary (bare metal, no objects known) the only reason I was able to get the firmware in the first place was by noticing and decoding a base64 string in an unpacked Electron app used for USB communication with the device. This ended up holding plaintext credentials for their update server which had two channels: one for encrypted production binaries and the other for unencrypted development binaries.

In this specific case, it helped to know what base64 looks like, but that's like how knowing different methods of slicing onions might help you figure out a recipe by tasting a cooked meal. Very often such background knowledge is irrelevant. Once in a while it will be the only realistic way forward.
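To make "knowing what base64 looks like" concrete, here is a minimal sketch of scanning a blob for base64-looking runs and keeping the ones that decode to printable text. The 24-character minimum, the function name `find_base64_candidates`, and the sample data are all assumptions made for illustration; this is not the process used in the breakdown described above.

```python
import base64
import binascii
import re

# Runs of 24+ base64 alphabet characters, optionally padded with '='.
# The minimum length is an arbitrary threshold chosen to cut down noise.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")

def find_base64_candidates(blob: bytes):
    """Yield (offset, decoded_bytes) for runs that decode to printable ASCII."""
    for match in B64_RUN.finditer(blob):
        run = match.group()
        if len(run) % 4 != 0:
            continue  # padded base64 is always a multiple of 4 characters
        try:
            decoded = base64.b64decode(run, validate=True)
        except (binascii.Error, ValueError):
            continue
        # Keep only results that look like text, e.g. credentials or URLs.
        if decoded and all(32 <= b < 127 for b in decoded):
            yield match.start(), decoded

# Hypothetical stand-in for a file pulled out of an unpacked app bundle.
secret = base64.b64encode(b"user=demo;password=not-a-real-one")
sample = b'\x00\x01const cfg = "' + secret + b'";\x00'
hits = list(find_base64_candidates(sample))
```

A filter like this will miss unpadded or URL-safe variants and will still produce false positives on long identifiers, so it is a way to narrow down where to look, not a substitute for recognizing the pattern by eye.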

palata · on July 25, 2024

> I wrote a series of articles on reverse-engineering on my blog, about studying and modifying a program that outputs an ASCII table,

Would you mind sharing the links? I would be interested!

boricj · on July 25, 2024

You can find the table of contents for the series there: https://boricj.net/reverse-engineering/2023/05/01/introducti...

I expect that you'll be mostly interested in parts 2 through 6. Part 1 explains how a toolchain works in general (so mostly CS 101 stuff as the OP put it). Parts 7 to 10 demonstrate the delinking technique by easing into it, a technique which is as powerful as it is esoteric, but probably not what you're looking for in a beginner's guide.

darby_nine · on July 25, 2024

> I'm curious if there's any reading out there that covers this stuff from the meat and potatoes

In my experience using radare2 to peek at the code is pretty much the meat and potatoes of reverse engineering binaries and far from "CS 101 stuff". You certainly don't need to modify a binary to MITM an API or inspect/alter packets or inject code via dynamic loading; nor is it the most convenient or clean or easy to maintain way to do so.

Secondly, this is a shockingly dismissive attitude for such a large resource. It took me a few minutes to just read through the table of contents.

andrewmcwatters · on July 25, 2024

Just because it's large doesn't mean it's relevant: using radare2, IDA Pro, or some other tool doesn't mean you're going to be able to do anything besides look at a binary.

I mean, you said you read the table of contents, yeah? Doing the same thing across different CPU architectures isn't doing something at length, it's just doing the same thing over and over again in rhymes.

In practice, yeah, people in the wild are absolutely modifying binaries, injecting, stubbing .dlls and redirecting calls, or creating proxy servers that alter payloads, for sure.

Learning how to compile a program isn't exactly reverse engineering worthy content to write about.

acureau · on July 25, 2024

I disagree, learning how to compile a program is a prime example of something you'd want in a book about reverse engineering "for everyone". A book which focuses only on specific methods of changing software behavior would be useful only to those who know how to understand said software. In fact the term "reverse engineering" itself does not imply modification at all.

darby_nine · on July 25, 2024

> Just because it's large doesn't mean it's relevant: using radare2, IDA Pro, or some other tool doesn't mean you're going to be able to do anything besides look at a binary.

Looking at a binary is like 99% of the work, though. Or at least looking at some secondary form of it (e.g. assembly, decompilation, etc). Tools are absolutely critical to the work.

> people in the wild are absolutely modifying binaries, injecting, stubbing .dlls and redirecting calls, or creating proxy servers that alter payloads, for sure

I would call modifying a binary "cracking" it but it's been a few decades since I was involved in that scene. I also think that the topic is large enough to warrant multiple focuses—to me, at least, writing a MITM server is much more trivial than extracting a private key from a binary (or a running process) that makes that MITM server functionally useful.

> Learning how to compile a program isn't exactly reverse engineering worthy content to write about.

That's a disingenuous characterization of most of the content here. Coding at the instruction level requires a different way of reading and writing code than you're otherwise exposed to. Most programmers aren't used to handling bits directly, and certainly not to the extent that it rewards you at the instruction level for learning and knowing. With the tools here you can, in fact, sit down and inspect the license verification function of a piece of software (although I'm not sure how much that's true or beneficial these days with code-signing etc).

EDIT: Or you could do what I did and work with `as`, `otool`, and a hex editor, and learn extremely slowly & painfully why custom-built reverse engineering tools are so valuable to learn.

There's always more to learn, of course, but that's no reason to belittle what you've already learned and other people still have yet to learn.

andrewmcwatters · on July 25, 2024

Yeah, I'm sure what I'm saying probably comes off as belittling, but that's not my intent. It's just more productive to understand who the audience is. The author writes "free PDF" content with Guy Fawkes mask header images in the README.mds.

If you're going to target script kiddies, at least show them how to Hello, World! from a DLL_PROCESS_ATTACH, and then teach them sigscanning.
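For readers unfamiliar with the term, sigscanning (signature scanning) usually means searching a module's bytes for a pattern with wildcards, so that a function can still be located after addresses shift between builds. A minimal sketch follows, assuming the common `"E8 ?? ?? ?? ?? 85 C0"` text notation; the helper names and the sample bytes are made up, and a real scanner would read the bytes from a loaded module rather than a literal.

```python
from typing import Iterator, List, Optional

def parse_signature(sig: str) -> List[Optional[int]]:
    """Turn 'E8 ?? ?? ?? ?? 85 C0' into byte values, with None as a wildcard."""
    return [None if tok in ("?", "??") else int(tok, 16) for tok in sig.split()]

def sigscan(data: bytes, sig: str) -> Iterator[int]:
    """Yield every offset in data where the signature matches."""
    pattern = parse_signature(sig)
    if not pattern:
        return
    last_start = len(data) - len(pattern)
    for offset in range(last_start + 1):
        if all(p is None or data[offset + i] == p for i, p in enumerate(pattern)):
            yield offset

# Made-up bytes: two calls with different 4-byte relative targets,
# each followed by "test eax, eax" (85 C0).
image = bytes.fromhex("90 90 E8 10 20 30 40 85 C0 74 05 E8 AA BB CC DD 85 C0")
matches = list(sigscan(image, "E8 ?? ?? ?? ?? 85 C0"))
```

The wildcards cover the bytes that change between builds (here the call target), while the fixed bytes anchor the match. The linear scan is deliberately naive; it is enough to show the idea.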

jonpalmisc · on July 25, 2024

Resources exist, but are only so helpful IMO.

One can't necessarily build an airplane after watching a documentary on it.

Even if there was some "bible" on it, reverse engineering is one of those things that you have to put the reps in for to get good at it and actually develop understanding.

The "bible" is tackling reverse-engineering related projects independently over the course of months/years and picking up knowledge along the way.

Starting with something like cracking software (and making increasingly-advanced cracks) is always my advice for beginners.

xkcd-sucks · on July 25, 2024

> places where you see reverse engineering used, usually to modify existing software.

funnily enough I have a team reverse engineering binary data formats, which is often more easily accomplished by other means + only dropping down to the disassembly/decompilation where absolutely necessary. and which as far as I am aware never involves binary patching
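One way to picture format work without a disassembler is to write down a layout hypothesis and check it against sample files. The sketch below assumes an entirely hypothetical container (a `DEMO` magic, a 16-bit record count, then type/length/payload records, all little-endian); none of it comes from a real format or from the team mentioned above.

```python
import struct

HEADER = "<4sH"   # hypothetical: 4-byte magic, u16 record count
RECORD = "<BH"    # hypothetical: u8 record type, u16 payload length

def parse_records(blob: bytes):
    """Parse the guessed layout; return (records, bytes_consumed)."""
    magic, count = struct.unpack_from(HEADER, blob, 0)
    if magic != b"DEMO":
        raise ValueError("unexpected magic")
    offset = struct.calcsize(HEADER)
    records = []
    for _ in range(count):
        rec_type, length = struct.unpack_from(RECORD, blob, offset)
        offset += struct.calcsize(RECORD)
        payload = blob[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated record")
        records.append((rec_type, payload))
        offset += length
    return records, offset

# Hand-built sample standing in for a captured file.
sample = (
    b"DEMO" + struct.pack("<H", 2)
    + struct.pack(RECORD, 1, 5) + b"hello"
    + struct.pack(RECORD, 2, 2) + b"\x2a\x00"
)
records, consumed = parse_records(sample)
```

If the guess is right, `consumed` equals the file size across many samples; leftover or missing bytes are the signal that the hypothesis needs revising.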

but yeah about the article it seems like if you know this much about assembly / chips etc. to be able to read it, then general problem solving ability should be able to cover most of the article's content