I’ve used a lot of tools over the years, which means I’ve seen a lot of tools hit a plateau. That’s not always a problem; sometimes something is just “done” and won’t need any changes. Often, though, it’s a sign of what’s coming. Every now and then, something will pull back out of it and start improving again, but it’s often an early sign of long-term decline. I can’t always tell if something’s just coasting along or if it’s actually started to get worse; it’s easy to be the boiling frog. That changes for me when something that really matters to me breaks.
To me, one of GitHub’s killer power user features is its blame view. git blame on the commandline is useful but hard to read; it’s not the interface I reach for every day. GitHub’s web UI is not only convenient, but the ease by which I can click through to older versions of the blame view on a line by line basis is uniquely powerful. It’s one of those features that anchors me to a product: I stopped using offline graphical git clients because it was just that much nicer.
The other day though, I tried to use the blame view on a large file and ran into an issue I don’t remember seeing before: I just couldn’t find the line of code I was searching for. I threw various keywords from that line into the browser’s command+F search box, and nothing came up. I was stumped until a moment later, while I was idly scrolling the page while doing the search again, and it finally found the line I was looking for. I realized what must have happened.
I’d heard rumblings that GitHub’s in the middle of shipping a frontend rewrite in React, and I realized this must be it. The problem wasn’t that the line I wanted wasn’t on the page—it’s that the whole document wasn’t being rendered at once, so my browser’s builtin search bar just couldn’t find it. On a hunch, I tried disabling JavaScript entirely in the browser, and suddenly it started working again. GitHub is able to send a fully server-side rendered version of the page, which actually works like it should, but doesn’t do so unless JavaScript is completely unavailable.
I’m hardly anti-JavaScript, and I’m not anti-React either. Any tool’s perfectly fine when used in the right place. The problem: this isn’t the right place, and what is to me personally a key feature suddenly doesn’t work right all the time anymore. This isn’t the only GitHub feature that’s felt subtly worse in the past few years—the once-industry-leading status page no longer reports minor availability issues in an even vaguely timely manner; Actions runs randomly drop network connections to GitHub’s own APIs; hitting the merge button sometimes scrolls the page to the wrong position—but this is the first moment where it really hit me that GitHub’s probably not going to get better again from here.
The corporate branding, the new “AI-powered developer platform” slogan, makes it clear that what I think of as “GitHub”—the traditional website, what are to me the core features—simply isn’t Microsoft’s priority at this point in time. I know many talented people at GitHub who care, but the company’s priorities just don’t seem to value what I value about the service. This isn’t an anti-AI statement so much as a recognition that the tool I still need to use every day is past its prime. Copilot isn’t navigating the website for me, replacing my need to the website as it exists today. I’ve had tools hit this phase of decline and turn it around, but I’m not optimistic. It’s still plenty usable now, and probably will be for some years to come, but I’ll want to know what other options I have now rather than when things get worse than this.
And in the meantime, well… I still need to use GitHub everyday, but maybe it’s time to start exploring new platforms—and find a good local blame tool that works as well as the GitHub web interface used to. (Got a fave? Send it to me at misty@digipres.club / @cdrom.ca. Please!)
The short, no-clickbait version: to switch the Mac version of Puyo Puyo Fever to English, edit ~/Library/Preferences/PuyoPuyo Fever/PUYOF.BIN and set the byte at 0x266 to 0x01—or just download this pre-patched save game and place it in that directory.
I’ve been a Mac user since 2005, and one of the very first Mac games I bought was the Mac port of Sega’s Puyo Puyo Fever. I’ve always been a Sega fangirl and I’ve always loved puzzle games (even if I’m not that good at Puyo Puyo), so when they actually released a Puyo Puyo game for Mac I knew I had to get it. This was back in the days when very, very few companies released games for Mac, so there weren’t many options. Even Sega usually ignored Mac users; Puyo Puyo Fever only came out as part of a marketing gimmick that saw Sega release a new port every month for most of a year, leading them to target more niche platforms like Mac, Palm Pilot and Pocket PC.
A few of the console versions came out in English, but the Mac port was exclusive to Japan. I didn’t read any Japanese at the time, so I just muddled my way through the menus while wishing I could play it in English. I’d thought that maybe I could try to transplant English game data from the console versions, but I didn’t own any of them so I just resigned myself to playing the game in Japanese.
Recently, though, I came across some information that made me realize there might be more to it. First, I finally got to try the Japan-exclusive Dreamcast port from 2004… and discovered that it was fully bilingual, with an option in the menu to switch between Japanese or English text and voices. I might have just thought that Dreamcast players were lucky and I was still out of luck until I ran into the English Puyo Puyo fan community’s mod to enable English support in the Windows version. Their technique, which was discovered by community members Yoshi and nmn around 2009, involves modifying not the game itself but a flag in the save game—the same flag used by the Dreamcast version, which it’s still programmed to respect despite the menu option having been removed.
I wasn’t able to use the Windows save modding tool produced by Puyo Puyo fan community member NickW for a couple of reasons:
It’s hardcoded to open the save file from the Windows save location, %AppData%\SEGA\PuyoF\PUYOF.BIN, and can’t just be given a save file at some other path, and
The Windows version uses compressed save data, while the Mac version always uses uncompressed saves, and so the editor won’t try to open uncompressed saves.
I could have updated the editor to work around this but, knowing that that the save was uncompressed and I only had to change a single byte, it seemed like overkill. One byte is easy enough to edit without a specialized tool, so I just pulled out a hex editor. The Windows save editor is source-available, so I didn’t have to reverse engineer the locations of the key flags in the save file myself. I guessed that the language flag offset wouldn’t be different between the uncompressed Windows saves and the Mac saves, so after reading that it’s stored at byte 0x288, I tried changing it from 0x00 to 0x01 and started up the game.
…and it just worked! Without any changes, the entire game swapped over to English—menus, dialogue, and even the title screen logo. After 20 years, suddenly I was playing Puyo Puyo Fever for Mac in English.
According to the Windows save editor, the next byte (0x289) controls the voice language. Neither the Windows nor the Mac versions actually shipped with English voices on the disc, however, so setting this value just ends up silencing the game instead. The fan community prepared an English voice pack taken from the other versions, but I didn’t bother trying it on Mac since proper timing data for the English voices is missing.
At this point I figured I’d discovered everything I was going to find until I noticed something at the start of the save data in the hex editor:
I’d only been paying attention to data later in the file, so I’d overlooked the very beginning until now. But now that I looked at it, it was a very regular pattern. It looks suspiciously like an image; uncompressed bitmaps are usually recognizable to the naked eye in a hex editor, and I wondered if that could be what this was. So I dug out the Dreamcast version again, and lo and behold:
It’s the Dreamcast version’s save icon, displayed in the Dreamcast save menu and on the portable VMU save device. The Mac version doesn’t have any reason to need this, and has nowhere to display it, but it’s there anyway. Looking at the start of the header made me realize the default save file name from the Dreamcast port is there too—the very first bytes read 「システムファイル」, or “System File”. Grabbing an original Dreamcast save file, I was able to confirm that the Mac save is completely identical to the Dreamcast version, except for rendering multi-byte fields in big-endian format1. I guess by 2004 there was no reason to spend time rewriting the save data format just to save a few hundred bytes, so all the Dreamcast-specific features come along for the ride on Mac and Windows.
Now, you might, ask, why would I spend so much time on a Mac port that doesn’t even run on modern computers? (Though I’d be happy to fix that - Sega, email me!) Part of it is just that I love digging into older games like this to find out what makes them tick; it’s as much a hobby as actually playing them. The other part, of course, is that I’ll actually play it. As you might be able to guess from the PowerPC Mac package manager I maintain, I still keep my old Macs around and every now and then I break out the PowerMac G4 for a few rounds of Puyo Puyo Fever. The next time I do, I’ll be able to play it in English.
The byte order, or endianness, of multi-byte data types is different between different kinds of CPUs. The PowerPC processors used by that era of Macs use the big endian format.↩
Awhile back, I had the chance to dump a game for MAME. I told myself that if the chance ever came up again, I’d contribute again. Luckily, it turns out I didn’t have to wait too long—but the story didn’t end like I expected it to.
When I bought my PGM arcade motherboard, the #1 game I wanted to own was a one-on-one fighting game called Martial Masters. It’s a deeply underrated, gorgeous game—and judging from the price it goes for these days, I’m not the only one after it. It took quite a bit of hunting until I found a copy within my price range but my usual PGM game dealer in China finally tracked down a copy to sell me a few months ago. I was excited to finally play it on the original hardware, but also to see if I had another chance to contribute a game to MAME.
When it arrived, even before I had the chance to check the version number, I was surprised to see it was a Taiwanese region game. All of IGS’s games have simplified Chinese region variants for sale in China; it’s unusual to see a traditional Chinese version from Taiwan show up over there. It could just be a sign that the game was so popular they brought over extra cartridges from Taiwan when there weren’t enough for local arcades. Once I booted the game and made note of its version numbers, I checked MAME and saw that there was a matching game in its database: martmasttw, or a special Taiwanese version of revision 1.02. That also surprised me—IGS typically didn’t produce entirely separate builds for different regions. Instead, each of their games contains the data for every language and region in its ROMs, and the region code in its copy protection chip determines what region it boots up as.
The other thing I noticed about MAME’s martmasttw was a comment in the source code noting that it might be a bad dump—that is, an invalid read that produced corrupted data. This isn’t that uncommon when dumping these sorts of games. Whether it’s due to dying chips or hardware issues with the reading process, sometimes a read just goes wrong and it gets missed. Once I booted it up in MAME, I confirmed it looked like a bad dump. It instantly crashes with an illegal instruction error, a clear sign of corrupted program code. Now that I owned the game, I had a chance to dump the correct ROMs and fix MAME’s database.
As soon as I opened the cartridge, I noticed something interesting: these weren’t the chips I was expected. Like with The Gladiator, I only needed to remove and dump two socketed chips, but these were a completely different model. Other PGM games using the same hardware typically use 27C322 (4MB) and 27C160 (2MB) chips, which were common EPROMs in their time period. Here, though, I saw something much more exotic: an OKI 27C3202 soldered into a custom adapter. The game board itself is essentially the same one that’s in The Gladiator, so it was clear that the adapter was presenting them as 4MB 27C322 chips.
I haven’t been able to figure out why it was designed this way. It can’t have been cheap to design and manufacture these custom adapters, and other PGM games that were made both before and after this one all use the more common chips without any adapters. I’ve only seen a single other game built this way. Was there a 27C322 shortage at the time this specific game was being made? Were they experimenting with new designs and ended up abandoning this approach? It’s hard to tell.
I only have an EPROM reader adapter for chips in the 27C322 family, so I hoped it would would be able to handle reading them just fine. On my first attempt, it rejected it; as far as I can tell, it was trying to perform “smart” verification of the chip, which failed since the underlying chip underneath IGS’s adapter isn’t actually the chip it’s trying to query. I ultimately tricked it by inserting a real 27C322 first and reading that before swapping over to the chip I actually wanted to read. Once the reader’s recognized at least one chip, it seems happy to stick in 27C322 mode persistently.
My first read seemed fine, and the dumped data did have a different hash from what MAME recognized. Success! …or so I thought, until I tried actually booting the game, where it crashed again. I went back to the EPROM reader to make sure the chip was seated correctly before doing a new test read. From the physical design of the adapters, I knew that getting it seated might be a challenge.
The reader uses a ZIF socket which usually makes it easy to insert and remove chips. This time, though, there was an interesting complication. Because of how it’s constructed, the socket has a “lip” at the end past the final set of pins. With a normal 27C322, that’s not a problem; the chip ends right at the final set of pins, so nothing hangs over the end of the chip. This adapter has a very different shape from a real 27C322 chip, however—there’s a dangling “head” that contains the actual chip, as seen in the photo above showing the underside of the adapter. On the real board it hangs harmlessly over the end of the socket, but on a ZIF socket it ends up actually making contact with the end of the socket and keeps the pins from being able to sit as deeply as it would normally sit. I haven’t spoken to the person who originally dumped this revision, but I suspect that this is the issue behind the bad dump.
I ended up holding the apdater with one hand to stabilize it and keep all of the pins as even as I could while I locked the ZIF socket’s lever a second time; this time, it seemed as though I’d been able to get it sitting as even as possible. I then performed several more reads and, before trying to boot it again, compared them against each other. This time, I saw that these new reads were different from the first attempt—and that they were byte-for-byte identical to each other.
Once I had what seemed like good dump of both chips, I booted them up in MAME to see if it would work. Unlike MAME’s ROMs, it booted right away without issues and worked perfectly. After I played a few rounds without a single crash or unexpected behaviour, I was satisfied that my new dumps were fine. As I was getting ready to submit a pull request to MAME to update the hashes in its database, however, I happened to grep the source for them and noticed something funny—they were already there. In another version of Martial Masters.
I mentioned earlier that I was surprised that MAME had labelled the Taiwanese 1.02 version of Martial Masters as a separate revision from the Chinese 1.02. Well, as it turns out, once the ROMs are dumped correctly it’s not a separate revision. The ROMs are actually byte-for-byte identical; it’s only the bad dump that had made MAME consider martmasttw a separate revision this whole time.
This is the point where I’d intended to open a pull request to MAME just updating a few hashes for the correct dump, but with everything I’d learned the final pull request deleted martmasttw entirely. I had set out to fix a revision of the game in MAME, and make one more verison of it playable. Instead, I’d proven it didn’t exist in the first place. This wasn’t where I expected to end up, but it does teach an important lesson: corrupted data can go unnoticed for years if it’s not double and triple checked.
And, more than that, it’s a reminder that databases are an eternal work in progress. MAME’s list of ROMs is also as close as there is to a global catalogue of arcade games and their revisions, but it’s still fallible. Databases grow and, sometimes, they shrink; proving a work doesn’t exist can be just as important as uncovering new works.
Every now and then, when working on ScummVM’s Director engine, I run across a disc that charms me so much I just have to get it working right away. That happened when I ran into Classical Cats, a digital art gallery focused on the work of Japanese artist and classical musician Mitsuhiro Amada. I wrote about the disc’s contents in more detail at my CD-ROM blog, but needless to say I was charmed—I wanted to share this with more people.
I first found out about Classical Cats when fellow ScummVM developer einstein95 pointed me at it because its music wasn’t working. Like a lot of early Director discs, Classical Cats mostly just worked on the first try. At this point in ScummVM’s development, I’m often more surprised if a disc made in Director 3 or 4 fails to boot right away. The one thing that didn’t work was the music.
Classical Cats uses CD audio for its music, and I’d already written code to support this in early releases of Alice: An Interactive Museum for Mac. I’d optimistically hoped that Classical Cats might be as easy, but it turned out to present some extra technical complexity. Regardless, for a disc called “Classical” Cats, I knew that getting music working would be important. I could tell that I wasn’t having the full experience.
While many CD-ROMs streamed their music from files on the disc, some discs used CD audio tracks for music instead. (If you’re already familiar with CD audio and mixed-mode CDs, you can skip to the next paragraph.) CD audio is the same format used in audio CDs; these tracks aren’t files in a directory and don’t have names, but are simply numbered tracks like you’d see in a CD player. Data on a CD is actually contained within a track on the disc, just like audio; data tracks are just skipped over by CD players. A mixed mode CD is one that contains a mixture of one or more data tracks and one or more audio tracks on the same disc. This was often used by games and multimedia discs as a simple and convenient way to store their audio.
Director software is written in its own programming language called Lingo; I’ve written about it a fewtimes before. In addition to writing logic in Lingo, developers are able to write modules called XObjects; these can be implemented in another language like C, but expose an interface to Lingo code. It works very similarly to C extensions in languages like Ruby or Python.
While ScummVM is able to run Lingo code directly, it doesn’t emulate the original XObjects. Instead, it contains new clean-room reimplementations embedded into ScummVM that expose the same interfaces as the originals. If a disc tries to call an unimplemented XObject, ScummVM just logs a warning and is able to continue. I’d already implemented one of Director’s builtin audio CD XObjects earlier, which was how I fixed Alice’s music earlier.
ScummVM has builtin support for playing emulated audio CDs by replacing the audio tracks with MP3 or FLAC files. For Alice, I wrote an implementation of Director’s builtin Apple Audio CD XObject. That version was straightforward and easy to implement; it has a minimal API that allows an app to request playback of a CD via track number, which maps perfectly onto ScummVM’s virtual CD backend.
I already knew Classical Cats uses a different XObject, and so I’d have to write a new implementation for it, it turns out the API was very different from Alice’s. Alice, along with many other Director games I’ve looked at, uses a fairly high-level, track-oriented API that was simple to implement. ScummVM’s builtin CD audio infrastructure is great at handling requests like “play track 5”, or “play the first 30 seconds of track 7”. What it’s not at all prepared for is requests like “play from position 12:00:42 on the disc”.
You can probably guess what Classical Cats does! Instead of working with tracks, it starts and stops playback based on absolute positions on a disc. This may sound strange, but it’s how the disc itself is set up. On a real CD, tracks themselves are just indices into where tracks start and stop on a disc, and a regular CD player looks up those indices to decide where to seek to when you ask it to play a particular track. In theory, it’s pretty similar to dropping a record player needle on a specific spot on the disc.
This might not sound too complex to manage, but there’s actually something that makes it a lot harder: translating requests to play an absolute timecode to an audio file on disc. ScummVM isn’t (usually) playing games from a real CD, but emulating a drive using the game data and FLAC or MP3 files replacing the CD audio tracks. ScummVM generally plays games using the data extracted from the CD into a folder on the hard drive, which causes a problem: the data track on a mixed mode CD is usually the first track, which means that the timing of every other track on the disc is offset by the length of the data track. We can’t guess where anything else is stored without knowing exactly how long the data track is. If we’ve extracted the data from the CD, we no longer know how big that track is, and we can’t guess at the layout of the rest of the disc.
“Knowing the disc layout” is a common problem with CD ripping and authoring, and a number of standards exist already. Single-disc data CDs can easily be represented as an ISO file, but anything more complex requires an actual table of contents. When thinking about how to solve this problem for ScummVM, I immediately thought of cuesheets—one of the most popular table of contents formats for CD ripping, and one that’s probably familiar to gamers who have used BIN/CUE rips of 32-bit era video games. Among all the formats available for documenting a disc’s table of contents, cuesheets were attractive for a few reasons: I’ve worked with it before, so I’m already familiar with it; it’s human-readable, so it’s easy to validate that it’s being used properly; and it provides a simple, high-level interface that abstracts away irrelevant details that I wouldn’t need to implement this feature. A sample cuesheet for a mixed mode CD looks something like this:
12345678910
FILE "CLSSCATS.BIN" BINARY
TRACK 01 MODE1/2352
INDEX 01 00:00:00
TRACK 02 AUDIO
PREGAP 00:02:00
INDEX 01 17:41:36
TRACK 03 AUDIO
INDEX 01 19:20:46
TRACK 04 AUDIO
INDEX 01 22:09:17
Once you understand the format, it’s straightforward to read and makes it clear exactly where every track is located on the disc.
The main blocker here was simply that ScummVM didn’t have a cuesheet parser yet, and I wasn’t eager to write one myself. Just when I was on the verge of switching to another solution, however, ScummVM project lead Eugene Sandulenko offered to write a new one integrated into ScummVM itself. As soon as that was ready, I was able to get to work.
The XObject Classical Cats uses has a fairly complicated interface that’s meant to support not just CDs, but also media like video cassettes. To keep things simple, I decided to limit myself to implementing just the API that this disc uses and ignore methods it never calls. It’s hard to make sure my implementation’s compatible if I don’t actually see parts of it in use, after all. By watching to see which method stubs are called, I could see that I mainly had to deal with a limit set of methods. Aside from being able to see which methods are called and the arguments passed to them, I was able to consult the official documentation in the Director 4.0 manual.1
Two of the most fundamental methods I began with were mSetInPoint and mSetOutPoint, whose names were pretty self-explanatory. Rather than have a single method to begin playback with start/stop positions, this library uses a cue system. Callers first call mSetInPoint to define the start playback position and mSetOutPoint to set a stop position. These positions are tracked in frames, a unit representing 1/75th of a second.
On a real drive, they can then call mPlayCue to seek to the start of the position so that the drive is ready. Given the slow seek times of early CD-ROM drives, this separation forced developers to consider that the device might not actually be able to start playback as soon as they request it and take that into account with their app’s interactive features. After starting the seek operation, the developer was meant to repeatedly call mService to retrieve a status code and find out whether the drive was still seeking, had finished seeking, or encountered an error. Since ScummVM is usually acting on an emulated drive without actual seek times, I simplified this. mSetInPoint and mSetOutPoint simply assign instance variables with the appropriate values, and mService always immediately returns the “drive ready” code.
At this point, I did what I should have done in the first place and checked the source code. As I mentioned in a previous post, early Director software includes the source code as a part of the binary, and luckily that’s true for Classical Cats. As I checked its CD-ROM helper library, I stumbled on the method that made me realize exactly where I’d gone wrong:
12345678
on mGetFirstFrame me, aTrack
put the pXObj of me into myXObj
if myXObj(mRespondsTo, "mGetFirstFrame") = 0 then
return 0
else
return myXObj(mGetFirstFrame, aTrack)
end if
end
This code might be familiar to Rubyists, since Ruby has a very similar construct. This class wraps the AppleCD SC XObject, instantiated in the instance variable myXObj, and calls methods on it. But it’s written defensively: before calling a number of methods, it calls mRespondsTo first to see if myXObj has the requested method. If it doesn’t, it just stubs it out instead of erroring. Since ScummVM implements mRespondsTo correctly, it means this code was doing what the original authors intended: seeing that my implementation of AppleCD SC didn’t have an mGetFirstFrame method, and just returning a stub value. Unfortunately for me, I was being lazy and had chosen which methods to implement based on seeing the disc try to use them—so I let myself be tricked into thinking those methods were never used.
As it turns out, they were actually key to getting the right timing data. Classical Cats was trying to ask the CD drive about timing information for tracks, and storing that to use to actually play the songs. With these methods missing, it was stuck without knowing where the songs were and how to play them.
And here I realized the great irony of what I was doing. Internally, Classical Cats thinks about its audio in terms of tracks, and asks the XObject for absolute timing data for each track. It then passes that data back into the XObject to play the songs, where ScummVM intercepts it and translates it back into track-oriented timing so its CD drive emulation knows how to play them. It’s a lot of engineering work just to take it all full circle.
At the end of the day, though, what’s important is it does work. Before I finished writing this, it was difficult to play Classical Cats on any modern computer; now, anyone with version 2.8.0 or later of ScummVM can give it a try. Now that it’s more accessible, I hope other people are able to discover it too.
Note: CD audio support for this disc is available in nightly builds of ScummVM, and will be available in a future stable release.
Schmitz, J., & Essex, J. (1994). Basic device control. In Using Lingo: Director Version 4 (pp. 300–307). Macromedia, Inc.↩
My latest blog post is over at my employer’s blog post and talks about the work I’ve done to get system dependency management integrated into cargo-dist, an open source release management tool for Rust. The new release lets users specify non-Rust dependencies in Cargo.toml using a Cargo-like syntax and also provides a detailed report on the resulting binary’s dynamic linkage. Here’s a sample of the dependency syntax:
I was testing out a new Macromedia Director CD in ScummVM, and I noticed a non-fatal error at startup:
123
WARNING: ###################### LINGO: syntax error, unexpected tSTRING: expected ')' at line 2 col 70 in ScoreScript id: 2!
WARNING: # 2: set DiskChk = FileIO(mnew,"read"¬"The Source:Put Contents on Hard Drive:Journey to the Source:YZ.DATA")!
WARNING: # ^ about here!
It may have been non-fatal, but seeing an error like that makes me uneasy anyway—I’m never sure when it’ll turn out to have ramifications down the line. This comes from the parser for Director’s custom programming language, Lingo, so I opened up the code in question1 to take a look. The whole script turned out to be only three straightforward lines. The part ScummVM complained about came right at the start of the file, and at first glance it looked pretty innocuous.
123
set DiskChk = FileIO(mnew,"read"¬
"The Source:Put Contents on Hard Drive:Journey to the Source:YZ.DATA")
IF DiskChk = -35 THEN GO TO "No CD"
The symbol at the end of that first line is a continuation marker, which you might remember from a previous blog post where I debugged a different issue with them. The continuation marker is a special kind of escape character with one specific purpose: it escapes newlines to allow statements to extend across more than one line of code, and nothing else.
At first I thought maybe the issue was with the continuation marker itself being misparsed, like in the error I documented in that older blog post; maybe it was failing to be recognized and wasn’t being replaced with whitespace? To figure that out, I started digging around in ScummVM’s Lingo preprocessor. Spoiler: it turned out not to be an issue with the continuation marker, but it pointed me in the right direction anyway.
ScummVM handles the continuation marker in two phases. In a preprocessor phase, it removes the newline after the marker in order to simplify parsing later. Afterwards, in the lexer, it replaces the marker with a space to produce a single normal line of code. The error message above contains a version of the line between those two steps: the preprocessor has combined the two lines of code into one, but the continuation marker hasn’t been replaced with a space yet.
If we do the work of the preprocessor/lexer ourselves, we get this copy of the line:
1
set DiskChk = FileIO(mnew,"read" "The Source:Put Contents on Hard Drive:Journey to the Source:YZ.DATA")
In this form, the error is a bit more obvious than when it was spread across multiple lines. The problem is with how the arguments are passed to FileIO: the first two arguments are separated by a comma, but the second and third aren’t. The newline between the second and third arguments makes it easy to miss, but as soon as we put it all together it becomes obvious.
In the last case I looked at, described in the previous blog post, this was an ambiguous parse case: the same line of code was valid if you added the missing comma or not, but it was interpreted two totally different ways. This time is different. If you add the missing comma, this is a normal, valid line of code; if you don’t, it’s invalid syntax and you get the error we’re seeing at the top.
As far as I can tell, the original Director runtime actually accepts this without throwing an error even though this isn’t documented as correct syntax. The official Director programming manual tells the user to use commas to separate arguments, but it’s tolerant enough to support when they’re forgotten like they are here2. ScummVM doesn’t get that same luxury. As I mentioned in the previous blog post, later Director versions tightened up these ambiguous parse cases, and supporting the weird case in Director 3 would significantly complicate the parser. Since this is only the second case of this issue, though, it’s not really necessary to support it either. ScummVM has builtin support for patching a specific disc’s Lingo source code, so I was able to simply fix this by patching the code to the properly-formatted version.
The disc in question still doesn’t fully work, but I’m putting some time into it. I’m planning on writing a followup on the other fixes necessary to get it running as expected. And for today’s lesson? Old software is weird. Just like new software.
Before version 4, Director software was interpreted from source code at runtime—so, conveniently, that means that you can peek at the source code to any early Director software.↩
MacroMind Director Version 3.0: Interactivity Manual. (1991). MacroMind, Inc. Page 64.↩
I recently had the chance to do something that I’ve wanted to do for years: dump an arcade game to add it to MAME.
MAME is a massive emulator that covers thousands of arcade games across the history of gaming. It’s one of those projects I’ve used and loved for decades. I’ve always wanted to give back, but it never felt like I had something to contribute—until recently.
You might remember from one of my other posts that I recently got into collecting arcade games. This year I’ve been collecting games for a Taiwanese system called the PolyGame Master (PGM), a cartridge-based system with interchangeable games sold by International Games System (IGS) between 1997 and 2005. It has a lot of great games that aren’t very well-known, in particular some incredibly well-designed beat-em-ups.
A couple months ago, I bought a copy of The Gladiator, a wuxia-themed beat em up set in the Song dynasty. My specific copy turns out to be an otherwise-undumped game revison. Many arcade games were released in many different versions, including regional variations and bugfix updates, and it can take collectors and MAME developers years to track them all down. In the case of The Gladiator, MAME has the final release, 1.07, and the first few revisions, but it’s missing most of the versions in between. When my copy came here, I booted it up and found out it was one of those versions MAME was missing: 1.04.
Luckily, I already had the hardware on hand to dump it. I own an EPROM burner that I’d bought to write chips so that I could mod games I’d bought, but EPROM burners can read chips as well. I own an adapter that supports several common chips that, luckily, can handle exactly the chips I needed for this game.
It’s easy to think of game cartridges as just being a single thing, but arcade game boards typically have a large number of chips. Why’s that? It’s partly technical; specific chips can be connected directly to particular regions of the system’s hardware, like graphics or sound, which means that even though it’s less flexible than an all-in-one ROM, it has some performance advantages too. The two chips I dumped here are program code for two different CPUs: one for the 68000 CPU in the system itself, and one for the ARM7 CPU in the game cartridge.
The other advantage is that using a large number of chips can make it easier to update a game down the line. Making an overseas release? It would be much cheaper to update just a couple of your chips instead of producing new revisions of everything on your board. Releasing a bugfix update? It’s much more quick and painless to update existing games if all your program code is on a single chip.
From looking at MAME, I could tell that every other revision of The Gladiator used a single set of chips for almost everything. Only the two program ROM chips are different between versions, which made my life a lot easier. I was also lucky that these chips were easy to get to. Depending on the kind of game, chips might be attached straight to the board, or they might be in sockets where they can be easily removed and reattached. The Gladiator has two game boards, one of which uses two socketed chips. And, thankfully, those were the only two chips I had to remove and dump.
To remove the chips, I used an EPROM removal tool—basically just a little set of pliers on a spring, with pair of needle noses that get in under the edge of the chip in the socket so you can pull it out. The two chips were both common types that my EPROM burner supports, so once I got them out they weren’t difficult to read. The most important chip, which has the game’s program code, is an EPROM chip known as the 27C160—a 2MB chip in a specific form factor. I already own a reader that supports that and the 4MB version of the same chip, which you can see in the above photo. The second chip is a 27C4096DC which has a much smaller 512KB capacity.
Why are there two program ROMs? Many games for the PGM use a fascinating and pretty intense form of copy protection. As I mentioned earlier, the PGM motherboard has a 20MHz 68000 processor, a 16-bit CPU that was very widely used in the 90s. The game cartridge, meanwhile, has a 20MHz ARM7 coprocessor. For early games, that ARM7 was there just for copy protection. Game cartridges would feature an unencrypted 68000 ROM and an encrypted ARM7 ROM; the ARM7 must successfully decrypt and execute code from the encrypted ROM for the main program code to be allowed to boot and run. By the end of the PGM’s life, they’d clearly realized it was silly to be devoting the ARM7 just to copy protection when it was faster than the CPU on the motherboard, so they put it to use for actual game logic. On games like The Gladiator, the unencrypted 68000 ROM and the main CPU are only doing very basic bootstrapping work and then hand off the rest of the work to the ARM7, which runs the game using code on the encrypted ARM7 chip.
I spent awhile fumbling around trying to get the dumped ARM7 ROM to work, but it turns out that’s because I read it as the wrong kind. Oops. My reader has a switch that switches between the 2MB and 4MB versions of the chip… and I had it set to 4MB, even though the chip helpfully told me right on the package it’s a 2MB chip. So, after I spent a half hour fumbling, I realize what I’d done and went back to redump it—and that version worked first try. Phew.
Once I dumped it, I was able to figure out that one of the two program ROMs is identical to one that’s already in MAME; only the ARM ROM is unique. That meant adding it to MAME was very easy; I could mostly copy and paste existing code defining the game cartridge, changing just one line with the new ROM and a few lines with some different metadata, and I was good to go. I submitted a pull request and, after some discussion, it was merged. For something I’ve wanted to be able to contribute to for years, feels good and, honestly pretty painless. And now, as of MAME 0.249, The Gladiator 1.04 can finally be emulated!
I’ve been spending some time lately contributing to ScummVM, an open-source reimplementation of many different game engines that makes it possible to play those games on countless modern platforms. They’ve recently added support for Macromedia Director, an engine used by a ton of 90s computer games and multimedia software that I’m really passionate about, so I wanted to get involved and help out.
One of the first games I tried out is Difficult Book Game (Muzukashii Hon wo Yomu to Nemukunaru, or Reading a Difficult Book Makes You Sleepy), a small puzzle game for the Mac by a one-person indie studio called Itachoco Systems that specialized in strange, interesting art games. Players take on the role of a woman named Miss Linli who, after falling asleep reading a complicated book, finds herself in a strange lucid dream where gnomes are crawling all over her table. Players can entice them to climb on her or scoop them up with her hands. If two gnomes walk into each other, they turn into a strange seed that, in turn, grows into other strange things if it comes into contact with another gnome. Players guide her using what feels like an early ancestor to QWOP, with separate keys controlling the joints on each of Linli’s arms. It’s bizarre, difficult to control, and compelling.
A lot of early Director games play fine in ScummVM without any special work, so I was hoping that would be true here too. Unfortunately, it didn’t turn out to be quite that simple. I ended up taking a dive into ScummVM’s implementation of Director to fix it.
Director uses its own programming language, Lingo, which is inspired by languages like Smalltalk and HyperCard. HyperCard was Apple’s hypermedia development environment, released for Macs in 1987, and was known for its simple, English-like, non-programmer friendly programming language. Smalltalk, meanwhile, is a programming language developed in the 70s and 80s known for its simple syntax and powerful object oriented features, very new at the time; it’s also influenced other modern languages such as Python and Ruby. Lingo uses a HyperCard-style English-like way of programming and Smalltalk-style object oriented features.
Early versions of Director are unusual for having the engine interpret the game logic straight from source code1—which means if you’ve got any copy of the game, you’ve got the source code too. It’s great for debugging and learning how it works, but there’s a downside too. If you’re writing a new interpreter, like ScummVM, it means you have to deal with writing a parser for arbitrary source code. As it turns out, every issue I’d have to deal with to get this game working involved the parser.
I’ll get into the details later, but first some background. To give a simplified view, ScummVM processes Lingo source in a few steps. First, it translates the text from its source encoding to Unicode; since Lingo dates to before Unicode was widely used, each script is stored in a language-specific encoding and needs to be translated in order for modern Unicode-native software to interpret it correctly. Next, there’s a preprocessing stage in which a few transformations are made in order to make the later stages simpler. The output of this stage is still text which carries the same technical meaning, it’s just text that’s easier for the next stages to process. This is followed by the two stages of the actual parser itself: the lexer, in which source code text is converted into a series of tokens, and the parser, which has a definition of the grammar for the language and interprets the tokens from the previous stage in the context of that grammar.
This all sounds complicated, but my changes ended up being pretty small. They did, however, end up getting spread across several of these layers.
1. The fun never ends!
The very first thing I got after launching the game was this parse error:
1
WARNING: ###################### LINGO: syntax error, unexpected tMETHOD: expected end of file at line 83 col 6 in MovieScript id: 0!
Taking a look at the code in question, there’s nothing that really looks too out of the ordinary:
123456
factory lady
method mNew
instance rspri,rx,ry,rhenka,rkihoncala,rflag,rhoko,rkasoku
end method
method mInit1 spri
# etc
This is the start of the definition of the game’s first factory. Lingo supports object-oriented features, something that was still pretty new when it was introduced, and allows for user-defined classes called “factories”2. Following the factory lady definition are a number of methods, defined in a block-like format: method NAME, an indented set of one or more lines of method definitions, and an end method line.
That last line, it turns out, was the problem. To my surprise, it turns out those end method blocks are totally optional even though it’s the documented syntax in the official Director manual. Not only can it have any text there instead of method, but it turns out you don’t need any form of end statement at all. If ScummVM didn’t recognize it, it seems that many games must have just skipped it.
Luckily, this was a very easy fix: I added a single line to ScummVM’s Bison-based parser and it was able to handle end statements without breaking support for methods defined without them. I hoped that was all it was going to take for Difficult Book Game to run, but I wasn’t quite so lucky.
2. Language-dependent syntax
Unlike most modern languages, Lingo doesn’t have a general-purpose escape character like \ that can be use to extend a line of code across multiple lines. Instead, it uses a special character called the “continuation marker”, ¬3, which serves that purpose and is used for nothing else in the language4. (Hope you like holding down keys to type special characters!) Here’s an example of how that looks with a couple lines of code from a real application:
12
global theObjects,retdata1,retdata2,ladytime,selif,daiido,Xzahyo,Yzahyo,StageNum, ¬
daihoko
Since Lingo was originally written for the Mac, whose default MacRoman character set supported a number of “special” characters and accents outside the normal ASCII range, they were able to get away with characters that might not be safe in other programming languages. But there’s a problem there, and not just that it was annoying to type: what happens if you’re programming in a language that doesn’t use MacRoman? This is before Unicode, so each language was using a different encoding, and there’s no guarantee that a given language would have ¬ in its character set.
Which takes me back to Difficult Book Game. I tried running it again after the fix above, only to run into a new parse error. After checking the lines of code it was talking about, I ran into something that looks almost like the code above… almost.
12
global theObjects,retdata1,retdata2,ladytime,selif,daiido,Xzahyo,Yzahyo,StageNum, ツ
daihoko
Spot the difference? In the place where the continuation marker should be, there’s something else: ツ, or the halfwidth katakana character “tsu”. As it turns out, that’s not random. In MacRoman, ¬ takes up the character position 0xC2, and ツ is at the same location in MacJapanese. That, it turns out, seems to be the answer of how the continuation marker is handled in different languages. It’s not really ¬, it’s whatever character happens to be at 0xC2 in a given text encoding.
Complicating things a bit, ScummVM handles lexing Lingo after translating the code from its source encoding to UTF-8. If it lexed the raw bytes, it would be one thing: whatever the character is at 0xC2 is the continuation marker, regardless of what character it “means”. Handling it after it’s been turned into Unicode is a lot harder. Since ScummVM already has a Lingo preprocessor, though, it could get fixed up there: just look for instances of ツ followed by a newline, and treat that as though it’s a “real” continuation marker5. A little crude, but it works, and suddenly ScummVM could parse Difficult Book Game’s code6. Or, almost…
3. What’s in a space?
Now that I could finally get in-game, I could start messing around with the controls and see how it ran. Characters were moving, controls were responding—it was looking good! At least until I pressed a certain key…
Her arms detached—that doesn’t look comfortable. In the console, ScummVM flagged an error that looked relevant:
1
Incorrect number of arguments for handler mLHizikaraHand (1, expected 3 to 3). Adding extra 2 voids!
This sounded relevant, since “hiji” means elbow. I figured it was probably the handler called when rotating her arm around her elbow, which is exactly what visually broke. I took a look at where mLHizikaraHand and the similar handlers were being called, and noticed something weird. In some places, it looks like this:
1
locaobject(mLHizikaraHand,(rhenka + 1),dotti)
And in other places, it looked slightly different:
1
locaobject(mLHizikaraHand (rhenka + 1),dotti)
Can you find the difference? It’s the character immediately after the handler name: instead of a comma, it’s followed by a space. Now that I looked at it, the ScummVM error actually sounded right. It does look like it’s calling mLHizikaraHand with a single argument (rhenka + 1). After talking it over with ScummVM dev djsrv, it sounds like this is just a Lingo parsing oddity. Lingo was designed to be a user-friendly language, and there are plenty of cases where its permissive parser accepts things that most languages would reject. This seems to be one of them.
Unfortunately, this parse case also seems to be different between Lingo versions. Fixing how it interprets it might have knock-on effects for parsing things created for later Director releases. Time to get hacky instead. The good news is that ScummVM has a mechanism for exactly this: it bundles patches for various games, making it possible to fix up weird and ambiguous syntax that its parser can’t handle yet. I added patches to change the ambiguous cases to the syntax used elsewhere, and suddenly Miss Linli’s posture is looking a lot healthier.
This whole thing ended up being much more of a journey than I expected. So much for having it just run! In the end, though, I learned quite a bit—and I was able to get a cool game to run on modern OSs. I’m continuing to work on ScummVM’s Director support and should have more to write about later.
Thanks to ScummVM developers djsrv and sev for their help working on this.
Later versions switched to using a bytecode format, similar to Java or C#. This makes processing a lot easier, since bytecode produced by Director’s own compiler is far more standardized than human-written source code.↩
Despite the name, it isn’t really implementing the factory pattern.↩
It’s a bit of a weird choice, but Lingo didn’t do it first. It showed up first in Apple’s HyperCard and AppleScript languages.↩
Tempting as it is to refactor the lexer, I had other things to do, and I really wasn’t familiar enough with its innards to take that on.↩
As it turns out, this wasn’t the only game with the same issue. Fixing this also fixed several other Japanese games, including The Seven Colors: Legend of Psy・S City and Eriko Tamura’s Oz.↩
Everyone had their weird pandemic hobby, right? Well, mine is that I bought an arcade video game. Not the whole cabinet - just a board, to hook up to a TV and a game controller. If I can’t go to the arcade, at least I can bring a favourite game home, right?
As you might imagine, an arcade board isn’t just plug and play; you can’t just plug in a USB gamepad and call it a day. Finding out how to actually use it took some research. I looked up the different methods, and found I had more or less two options: a more expensive one involving repurposed arcade hardware, and an open-source one using off-the-shelf parts. While I was figuring out if the more expensive options were worth it, I decided I’d try out the open source one, OpenJVS. I started out as just a user, but ended up getting involved in development. Over the past few months, I’ve contributed a bunch of new features and bugfixes. In the process I learned a lot about the specification… and quashed my fair share of weird bugs.
So what is JVS anyway?
Imagine this: you’re an arcade machine manufacturer, and you need to make it easy for a machine operator to swap a new board into one of their cabinets. How do you make sure they don’t have to run dozens of wires to get the controls hooked up? Video is easy; you can just use VGA, DVI, or HDMI. Power is easy too. What else is left? Well, there’s many different kinds of input so players can actually play the game; coin inputs and money bookkeeping; smart cards to communicate information to a game; in other words, all kinds of input from the player.
The solution? JAMMA Video Standard, or JVS, the standard used by most arcade games since the late 90s.1 It’s a serial protocol that makes arcade hardware close to plug and play: connect a single cable from your cabinet’s I/O board to your new arcade board, and you get all of your controls and coin slots connected at once.
If you want to use a JVS game at home, you could always put something together using an official I/O board, but - as it happens, the JVS spec is available2. There’s nothing stopping you from making your own I/O board, and it turns out a few different people have. There are a few different open source options intended to work in a few different ways. I use OpenJVS, written by Bobby Dilley, which transforms a Raspberry Pi running Linux into a JVS I/O board.
How does JVS work?
So that’s JVS, but how does it work? I won’t go into deep detail, but here’s the 10,000 meter view.
JVS defines communication between an I/O board and a video game board. It uses the RS-485 serial standard to communicate data, meaning that it’s possible to use a standard USB RS-485 device to send and receive commands.3 The I/O board handles tracking the state of all of the cabinet’s inputs, and it’s also responsible for all the coin management: it has its own count of how much money players have spent, and the game just reads that number instead of keeping track of it itself.
The game board communicates with the I/O board by sending commands, then receiving packets of information in return. The game board tells the I/O board how many bytes of data to expect, then sends one or more messages as a set of raw bytes. The I/O board reads those bytes, interprets them, then sends a set of responses for each command.
Open source JVS implementations don’t just emulate the things a player would normally touch in the arcade, either. They also simulate features like the service and test buttons, which are tucked away inside a cabinet where only arcade employees can normally use them. The service button allows access to the operator menu, where employees can change hardware and software settings, while the test button does exactly what it sounds like. Arcade players will never get to use these, but for home players it’s useful to be able to do things like change how many coins a play costs, turn on free play, or change the game language. OpenJVS supports both of these, and they seemed to work when I tested them. The test button did cause OpenJVS to log a message about an “unsupported command”, which seemed suspicious, but Bobby and I suspected this was just the board playing fast and loose with the spec so we ignored it. (This will come up again later.)
OpenJVS was already basically complete and implemented all of the common parts of the protocol, so I started out as just a user. As I ran into a few bugs, I started contributing bugfixes and then implementations of more parts of the protocol.
Let’s talk money
I mentioned coin management earlier. Obviously, in arcades, games are pay-to-play. Most game boards have a free play option so you can play as much as you want without paying, and a lot of home players will just switch their boards into the free play mode. There’s nothing stopping you from emulating a coin slot though. OpenJVS lets you map a button on your controller to inserting a coin for something closer to the authentic arcade feeling. It seems like this might not be a common usecase, but the option is always there.
Before long, I noticed a really strange bug. I’d be playing games, and then notice that somehow I had over 200 credits in the machine. I definitely wasn’t mashing the coin button that much, so I knew something had to be wrong. After some experimentation, I eventually figured out it only happened if you did a few specific things in exactly the right order:
Insert one or more coins
Enter the service menu and then the input test menu
Exit the service menu
At this point, I realized something suspicious was happening. I was always getting 200+ coins… but it was more specific than that. I was ending up with exactly 256 minus the number of coins I had when entering the service menu. For example, if I started with 3 coins, I’d always have 253 coins after leaving the service menu. That was a pretty good sign I was seeing something in particular: integer underflow.
Experienced programmers, feel free to skip this paragraph. But for those who aren’t familiar: languages like C, which OpenJVS is written in, feature something called integer overflow and underflow. Number types have a maximum size which affects how many numbers it can represent. A 16-bit (or 2-byte) integer that can store only positive numbers, for example, can only represent numbers between 0 and 65535. Picture, for a moment, what happens if you ask a program to subtract 1 from a 16-bit integer that’s already at 0, or add 1 to a number that’s already at 65535. In C, and many other languages, it will underflow or underflow: subtract 1 from 0, and you get 65535; add 1 to 65535, and you get 0.
Having figured out that I was probably seeing underflow, I took a look at the protocol. Since the JVS I/O board keeps track of the money balance, the protocol provides commands for dealing with that and it seemed like the most likely place I’d find the bug. When I dug into OpenJVS’s implementation of the “decrease number of coins” command, it wasn’t too hard to find the culprit. It was right here:
123456
/* Prevent underflow of coins */if(coin_decrement>jvsIO->state.coinCount[0])coin_decrement=jvsIO->state.coinCount[0];for(inti=0;i<jvsIO->capabilities.coins;i++)jvsIO->state.coinCount[i]-=coin_decrement;
When OpenJVS received the command to decrement the number of coins by a certain amount, it tried to prevent underflows. Unfortunately, the underflow protection itself was buggy: it checked the number of coins in the first coin slot, but then decremented the number of coins in every slot. If slot 2 has fewer coins than slot 1, then slot 2 will end up underflowing by whatever the difference is. I still wasn’t sure why it was trying to remove 256 coins, which seemed weird. I figured it must just be trying to clear the slot of all coins and trusting the I/O board to prevent underflows, and moved on.
With that bug fixed, I decided to keep working at improving coin support. While I was working on that command, I noticed that OpenJVS was ignoring a similar command. While the board usually only needs to send commands to reduce the number of coins in the balance, like when the player starts a game, it can also send a command to increase the number of coins. I’d noticed that my game board was trying to send that command, but OpenJVS had only implemented a placeholder that logged the request and then moved on without doing anything. The quickest way to figure out what was going on was just to implement the command myself. The actual command in the spec is pretty simple:
Purpose
Sample
Command code (always 35)
0x35
Coin slot index
0x01
Amount (first half)
0x00
Amount (second half)
0x01
Easy enough: you can identify that it’s the command to increase coins by the first byte, 35, and then it tells you which coin slot to act on and how many coins to add to it. But when I was replacing the old placeholder command, I noticed something funny:
The placeholder command just reported success without doing anything, then jumped ahead in the stream by the command’s length. But it jumped ahead 3 bytes, and according to the spec this command should be 4 bytes long. I tested OpenJVS with the command fully implemented, and noticed two things: pressing the test button now inserted 256 coins into the second coin slot, instead of doing nothing; and the “unsupported command” error I used to see went away. Why? When the buggy placeholder skipped ahead by three bytes, that left one byte in the buffer for OpenJVS to find and mistake for being a command. That byte, 0x01, was actually the last byte of the “insert coin” command that was being sent when the test button was activated. I hadn’t even set out to fix the “unsupported command” bug, but I fixed it anyway.
At this point I’d fixed all the bugs I set out to, but I was still seeing something that just didn’t feel right. When exiting the service menu, the game board now withdrew coins from the balance; when pressing the test button, the board now added coins. But the number of coins looked wrong: it was happening in increments of 256, instead of 1, and even for test commands that seemed unlikely. So I took another look at the spec, and realized the answer had been staring me in the face the entire time. Let’s take another look at the last part of that table from earlier:
Purpose
Sample
Amount (first half)
0x00
Amount (second half)
0x01
The number of coins to add or remove is a two-byte, or 16-bit, value. Since JVS is communicating through single bytes, any multi-byte values have to be split up into single bytes for transmission. The spec helpfully tells us how to decode that: the numbers are stored in the big-endian, or most-significant byte first, format. But taking a look at OpenJVS’s code shows that anywhere it was decoding these values, it was decoding them in little-endian format - in other words, it was reading the bytes backwards. What’s 256 in little-endian format? It’s the bytes “0” and “1”, in that order. What’s 1 in big-endian format? It’s those same two bytes in that same order. In other words, the game board hadn’t been trying to add or subtract 256 coins all this time: it was just trying to add and remove single coins.
Putting it all together, what exactly was happening when I saw coins being added and removed? It actually turns out to be pretty simple. Pressing the test button inserts a coin in the second coin slot. When the operator menu is activated, it can be used to test the coin counter. When the operator leaves the menu, the unit sends commands to remove all of the coins that were added by the test button; this should leave it with the same number of coins as it had when they started. The strange behaviour was a combination of all the bugs working together: the test button didn’t do anything because the command wasn’t implemented, and the buggy bounds checking and incomplete “remove coins” command meant it could underflow and leave the player with hundreds of coins.
I originally set out to fix some pretty simple bugs, but every new bug I uncovered revealed a new one. The experience was quite a fun one. I can’t say I ever thought I’d be writing software for arcade machines, but not only did I fix my own problems, but I had the chance to learn more about how things work in a domain I might never otherwise have gotten the chance to touch.
This was actually the second JAMMA standard, following a simpler standard that was used internationally between 1985 and the late 90s.↩
Has this ever happened to you? You’re writing a project in Ruby, JavaScript, Go, etc., and you have to build a dependency that uses a system library. So you bundle install and then, a few minutes later, your terminal spits up an ugly set of C compiler errors you don’t know how to deal with. After dealing with this enough times I decided to do something about it.
Homebrew already has a great tool in its arsenal for dealing with these problems. Homebrew needs to be able to build software reliably and robustly, after all - even if the user’s system has weird software installed on it or strange misconfigurations. The “superenv” build environment features intelligent automatic setup of build-related environment variables and PATHs based on just the requested dependencies, which filters out unrequested software and prevents a lot of common build failures that come from interfering software. It also uses shims for many common build tools to enforce just the right arguments passing through to the real tools.
So I thought to myself - we solved that problem for Homebrew builds already, right? Wouldn’t it be nice if I could just reuse that work for other things? So that’s what I did. Homebrew already provides the Brewfile dependency declaration format and the brew bundle tool to library dependencies with Homebrew, and as a result there’s already a great way to get the dependency information we’d need to produce a reliable build environment. Since brew bundle is a Homebrew plugin, it has access to Homebrew’s core code - including build environment setup. Putting these together, I wrote a feature called brew bundle exec. It takes the software you specify in your Brewfile and builds a dependency tree out of that, then sets up just the right build flags to let anything you want use them.
For example, say I want to gem install mysql2. Often, you get something like this:
1234567
$ gem install mysql2
Building native extensions. This could take a while...
# several dozen lines later...
linking shared-object mysql2/mysql2.bundle
ld: library not found for -lssl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [mysql2.bundle] Error 1
Ew, right? Let’s make that better.
By creating a Brewfile with the line brew "mysql", we can specify that we want to build against a Homebrew-installed MySQL and all of its dependencies. Just by running our command prefixed with brew bundle exec --, for example, brew bundle exec -- gem install mysql2, we can run that command in a build environment that knows exactly how to use its dependencies. Suddenly, everything works—no messing around with flags, no special options passed to gem install, and no fragile bundle config trickery.
1234567
$ brew bundle exec -- gem install mysql2
Building native extensions. This could take a while...
Successfully installed mysql2-0.5.3
Parsing documentation for mysql2-0.5.3
Installing ri documentation for mysql2-0.5.3
Done installing documentation for mysql2 after 0 seconds
1 gem installed
What exactly does brew bundle exec set? There’s a variety of flags set which are useful for a variety of different compilers and buildsystems.
CC and CXX, the compiler specification flags, point to Homebrew’s compiler shims which help ensure that the right flags are passed to the real compiler being used.
CFLAGS, CXXFLAGS, and CPPFLAGS ensure that C and C++ compilers know about the header and library lookup paths for all of the Brewfile dependencies.
PATH ensures that all of the executables installed by Brewfile dependencies will be found first, before any tools of the same name that may be installed elsewhere on your system.
PKG_CONFIG_LIBDIR and PKG_CONFIG_LIBDIR ensure that the pkg-config tool finds Brewfile dependencies.
Buildsystem-specific flags, such as CMAKE_PREFIX_PATH, ensure that buildsystems can make use of the Brewfile dependencies.
So the next time you’re bashing your head against build failures in your project, give brew bundle exec a try. It might just solve your problems for you!