Sunday, August 17, 2014

Revisiting Cinepak

While working on an as-of-yet-unnamed engine last year, I realized I needed to dither some videos. My only hope was that it wouldn't be as painful as DrMcCoy had it several years ago (and I'm pretty sure the "beauty" part was sarcastic). Looking at how the game dithers the graphics, I figured out that it relied on Video for Windows to handle dithering. VFW promptly handles it by making the codec handle it.

For this game, that codec was Cinepak. The Cinepak decoder has been in ScummVM since 2009 (I wrote it in late 2007 or so, so it's really even older). I refused to use some other dithering algorithm that would have been different and given different results. If I was going to implement this, I was going to do it to match the original 100%. That meant it was time to figure out what it does.

Basically, the algorithm is based on pre-dithered tables that are for a given hardcoded palette. For custom palettes, it finds the nearest (using a simple distance equation) color in it and maps from the Cinepak palette index to the custom one. It then uses the pre-dithered tables to generate 4x4 blocks based on the contents of the codebook which is then mapped to the custom palette.

I pushed the code for the curious.

QuickTime also does something similar (but with a different dithering algorithm in Cinepak, of course), which I'll be working on for Myst.

Here's the result, using one of the Cinepak samples from the MPlayer samples archive (in this case, the Lara Croft one):

Normal decode to 24bpp
Dither to 8bpp

The result looks pretty decent. I was mostly glad it wasn't a ridiculous amount of extra code.

Saturday, August 16, 2014

Hidden in Plain Sight

With the DVD/GOG version of Pegasus Prime, there was a slight problem before release. We had an invalid function call entering the three new chase sequences when compiled in gcc with optimizations. I was unable to figure out the exact cause at the time and I ended up writing a hack around it in final release.

Since a bad function was getting called, I had feared gcc was overwriting a return address somewhere and sending the program counter where it shouldn't be. valgrind wasn't helping and only showed the after-effects of the bad function call. It was pretty hard to pinpoint in gdb too, due to the calling function being called numerous times during execution without breaking. I had shelved the issue for some time so I could return later, perhaps with another idea of tackling it. I found my hope in the AddressSanitizer.

Armed with my shiny new PC and gcc 4.8.1, I recompiled with the address sanitizer to see what I would get. The game would now crash as soon as the sequence would start, due to the sanitizer kicking in. The information the sanitizer gave helped in really one way: I had a way to make it stop as soon as it broke from the stack buffer overflow error. Perhaps not quite the way the tool was intended to be used, but it was enough of a hint for me.

With some logging to a file, I saw that it crashed here the first time _inputHandler changed. Going with LordHoto's suggestion to check the vtable of the pointer, I noticed something funny: It was the vtable for the wrong class!

Once I saw where the _inputHandler field was populated, I quickly saw what my mistake was. Instead of relying on the compiler to upcast from the subclass to the InputHandler class, I had a manual C-style cast in there. The Neighborhood pointer (only known through forward declaration) was being cast to the InputHandler pointer. Normally this would be OK, as long as the compiler knew about the class hierarchy (in this case, with multiple inheritance and virtual functions), and then generate a static_cast. But if it didn't know that, it would have to go with a reinterpret_cast. The code was doing a reinterpret_cast and throwing away the hierarchy, and therefore causing undefined behavior. It just so happened that it called into the wrong vtable in this case.

But why did it happen only during optimization? Probably because the function was getting inlined. If the include order had Neighborhood defined in the translation unit before getting to the constructor of GameInteraction, it would have output the correct static_cast. It's likely one other place had this situation and that version ended up being the actual used function.

Definitely one of the hardest bugs I've had to track down.