Accidentally Reverse Engineering Things

I started back at uni this week. In studio (my favourite class by far) we’re starting to get into the nitty-gritty engine stuff that I love so much. Stuff like graphics, and AI. In fact, in a few weeks’ time we will be setting our own AI bots against each other’s bots in a virtual ring to fight to the death. Our facilitator handed out said arenas so we could write our bots and actually test them.

The Problem

There was no source; the program was a Windows executable compiled in VS2013; and the bots had to come in the form of a DLL. Usually my solution to this sort of dilemma would be using a cross-compiler in Linux, but that wouldn’t suffice in this situation for a few reasons.

  1. I needed to run the program to test my bot. Wine might have been a possibility except for the fact that –
  2. gcc based compilers (to my knowledge) can’t generate actual dlls. This also meant that mingw wasn’t a solution.

Fortunately I DO have Windows dual-booted, but my strong opinions about not taking 10 minutes to start an IDE and my great fondness for C++11 features (of which, despite what I’ve heard, some are STILL missing from VS2013) meant that Visual Studio was never an option for me. I figured, if I couldn’t write AI in my favourite OS, the next best thing would be to write it in my current favourite language.

The Solution

That’s right. I decided to write D bindings to the arena program’s bot interface. Not exactly the most straightforward and easy approach, but it’s the one I’d be happiest using. The reason I chose to use D, other than the fact that it’s a breeze to use, is because of its limited capacity to interface directly with C++, which includes being able to pass some objects between them and call their methods as if they were native classes. This is important because the arena requires that the bots are classes. For the time being I’m using an intermediate C++ binding layer because some of the parameter types and structures aren’t directly supported in D. For example, I have to map std::vectors to dynamic arrays and get at std::string’s internals. I’m compiling the intermediate binding layer with the Digital Mars C++ compiler, which is NOT very good. It’s missing a lot of C++11 features and stalls whenever I use nullptr. But it came in a zip, ready to go, so it’s staying for now.

The Discoveries

Now this is the first time I’ve ever written or compiled a DLL, or had to interface with anything that I didn’t write. One of the first problems to arise was the fact that the arena program passed data about the world to the bots through either references or const references to structs (which for the most part were easy to duplicate in D). D doesn’t support references. So the solution was taking addresses in the intermediate stage and passing them to D instead. Next, some structs had std::vectors and std::strings. For these, I created new structs in which the (important) std::vectors were transformed into pointer-length pairs, and the (important) std::strings were just transformed into const char*s. Then I hit my first problem.
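As a rough illustration of that flattening step, here is a minimal C++ sketch. EnemyInfo, FlatEnemyInfo, flatten, BotCallback and forwardToBot are all made-up names (the arena's real structs are not shown in this post), and the sketch assumes a C++11 compiler and that the layer agrees with the arena about how std::string and std::vector are laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one of the arena's structs.
struct EnemyInfo {
    std::string        name;
    std::vector<float> distances;
    float              health;
};

// C-friendly mirror that the D side can declare field for field.
struct FlatEnemyInfo {
    const char*  name;          // borrowed from EnemyInfo::name
    const float* distances;     // borrowed from EnemyInfo::distances
    std::size_t  distanceCount; // length half of the pointer-length pair
    float        health;
};

// The flattened view only borrows memory, so it is valid only while
// `src` is alive and unmodified.
inline FlatEnemyInfo flatten(const EnemyInfo& src) {
    FlatEnemyInfo out;
    out.name          = src.name.c_str();
    out.distances     = src.distances.data();
    out.distanceCount = src.distances.size();
    out.health        = src.health;
    return out;
}

// Hypothetical shape of the function the D side would expose.
using BotCallback = void (*)(const FlatEnemyInfo*);

// The arena hands over a const reference; the bot receives an address.
inline void forwardToBot(const EnemyInfo& info, BotCallback bot) {
    FlatEnemyInfo flat = flatten(info);
    bot(&flat);
}
```

The flattened struct never owns anything, which keeps the D side simple but means the bot must copy whatever it wants to keep past the end of the call.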

Every time I tried printing strings, either in the intermediate layer or in the D part, everything would crash. So I stopped trying to print them for the time being and tried doing other things. Then I discovered that everything that followed a string was corrupted. I spent a good few hours trying different things but nothing seemed to work. But then I had an epiphany and after a quick google search discovered the cause of these two strange problems. See, the arena program was passing structs that contained std::strings, std::vectors, floats, some other structs, pointers, so on and so forth, to my intermediate C++ layer. For the most part that was okay. What wasn’t okay was the fact that those structs were using the Visual Studio 2013 standard template library, and I was using some other STL provided by Digital Mars. My epiphany was that the only part of the c++ STL that’s standardized is the interface. It’s completely up to vendors (to my knowledge) how they implement the classes therein. My google search came up with this, which confirmed all my suspicions. What was happening was the difference between what the arena provided and what my binding layer expected in terms of data layout was large enough that I was getting bits of string in my floats (literally).
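To see how a layout mismatch turns into "string in my floats", here is a small C++ sketch. VendorAString, VendorBString, MapInfoA, MapInfoB and gravity are all invented for illustration; neither string type is meant to match the real Visual Studio or Digital Mars layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Two made-up string layouts standing in for two vendors' std::string.
struct VendorAString {
    char          buffer[16];
    std::uint32_t size;
    std::uint32_t capacity;
};                               // 24 bytes

struct VendorBString {
    char* data;                  // just one pointer
};

// The "same" struct as each side of the DLL boundary would see it.
struct MapInfoA { VendorAString name; float gravity; };
struct MapInfoB { VendorBString name; float gravity; };

static_assert(offsetof(MapInfoA, gravity) == 24,
              "layout A puts the float after the 24-byte string");
static_assert(offsetof(MapInfoA, gravity) != offsetof(MapInfoB, gravity),
              "the two sides disagree about where the float lives");

// Reads `gravity` from a layout-A object at the offset layout B expects.
// memcpy is used so the sketch itself has no aliasing problems.
inline float readGravityWithWrongLayout(const MapInfoA& real) {
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(&real);
    float wrong;
    std::memcpy(&wrong, raw + offsetof(MapInfoB, gravity), sizeof wrong);
    return wrong;
}
```

With layout B's smaller string, the offset it expects for gravity lands inside layout A's character buffer, so the "float" that comes back is really four bytes of the map name.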

My initial solution to this was to replace all std::strings in the struct definitions in my binding layer with 24-byte arrays. And this fixed it. My floats were floats again. Then I started experimenting a bit and found that treating the byte array as a character array and printing it actually worked. I found this puzzling, as this meant that the string was being stored in the struct itself, where I assumed that it would have been stored behind a pointer. To confirm this, I dumped the byte array to the console to figure out what was going on. The string was indeed being flattened. Note: I also discovered how good printf is for making tables of things.
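A dump like that can be produced with a few lines of printf-style formatting. This is only a sketch of the idea: hexDump is a hypothetical helper, and the 8-bytes-per-row table is a guess at a readable format rather than the exact one used for the dumps in this post.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats `count` bytes as rows of 8: offset, hex values, printable chars.
inline std::string hexDump(const void* data, std::size_t count) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    const std::size_t perRow = 8;
    std::string out;
    char cell[16];

    for (std::size_t row = 0; row < count; row += perRow) {
        // Offset column.
        std::snprintf(cell, sizeof cell, "%04x |", static_cast<unsigned>(row));
        out += cell;

        // Hex column, padded so short final rows still line up.
        for (std::size_t i = row; i < row + perRow; ++i) {
            if (i < count) {
                std::snprintf(cell, sizeof cell, " %02x",
                              static_cast<unsigned>(bytes[i]));
                out += cell;
            } else {
                out += "   ";
            }
        }

        // Character column: unprintable bytes are shown as '.'.
        out += " | ";
        for (std::size_t i = row; i < row + perRow && i < count; ++i) {
            out += std::isprint(bytes[i]) ? static_cast<char>(bytes[i]) : '.';
        }
        out += '\n';
    }
    return out;
}
```

Printing the 24 bytes that stand in for a std::string would then look something like `std::printf("%s", hexDump(&info.name, 24).c_str());`, where `info.name` is whichever byte array is being inspected.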

[Image: std::string dump]

There in the first 5 bytes was the name of the map that the bot was in. Then at the 16th byte offset, the length of the string. Then at the 20th byte offset, what seems to be the capacity of the buffer excluding the null terminator. To see what would happen with a larger string, I renamed the map. And this was the result.

[Image: std::string dump 2]

Unlike the last dump, the string didn’t appear to be flattened. This time, the first four bytes seemed to contain a pointer instead. The 16th byte was the same as the size of the string, so that was definitely where the length is stored. And the 20th byte had doubled (if you account for the zero terminator), which is behaviour that one might expect from an STL container. This was the behaviour that I expected initially. I did one more test because why not.

[Image: std::string size 15 dump]

I made the string one byte shorter and there it was again. The string was flattened. So since I’d figured out the behaviour of std::string (enough to not make it crash when I use it), I thought I’d put that knowledge into practice: I wrote a struct that matched the layout of std::string and a function that returned a D string constructed from the right data.
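For illustration, here is a rough C++ rendering of that idea rather than the D version described above. RawString, rawData and toString are made-up names, and the layout is only what the dumps suggest (16 bytes that hold either the characters or a pointer, then the length, then the capacity), not a documented ABI. It assumes the length and capacity fields are pointer-sized, which gives the 24 bytes seen in a 32-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Mirror of the layout observed in the dumps (an observation, not a spec).
struct RawString {
    union {
        char        inlineChars[16]; // short strings live here directly
        const char* heapChars;       // longer strings live behind a pointer
    };
    std::size_t length;              // offset 16 in the 32-bit dumps
    std::size_t capacity;            // offset 20; excludes the terminator
};

static_assert(sizeof(RawString) == 16 + 2 * sizeof(std::size_t),
              "24 bytes when size_t is 4 bytes wide");

// Strings of up to 15 characters showed up flattened (capacity 15);
// anything longer showed up as a pointer with a larger capacity.
inline const char* rawData(const RawString& s) {
    return s.capacity < 16 ? s.inlineChars : s.heapChars;
}

// Copies the characters out so the result no longer depends on the arena.
inline std::string toString(const RawString& s) {
    return std::string(rawData(s), s.length);
}
```

In D the equivalent function could slice the pointer with the length instead of copying, at the cost of the result only being valid while the arena keeps the original string alive.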

Not exactly the most complicated code in the world but I’m pretty happy that I managed to figure out why my float was wrong. Maybe I’ll try vector next. Or maybe I’ll try writing a bot. Who knows.





Hello, I’m Patrick Monaghan, a games programmer in training. The purpose of this blog is to track the progress of my various projects relating to game development and various experiments I decide to try. A little bit about myself: I'm fluent in C++. I'm familiar with a few other languages also. I have a working knowledge of Blender and GIMP, thus can pump out programmer art like it’s going out of style. I’m always willing to learn new things and I always welcome feedback if it means I can improve. Also, here's a YouTube channel, and my GitHub, and my SoundCloud. Also I'm @_manpat on Twitter. Finally, here's a todo list that I'm putting here so I can't ignore it.
