The Future Is Now

Do You Speak the Lingo?

I’ve been spending some time lately contributing to ScummVM, an open-source reimplementation of many different game engines that makes it possible to play those games on countless modern platforms. They’ve recently added support for Macromedia Director, an engine used by a ton of 90s computer games and multimedia software that I’m really passionate about, so I wanted to get involved and help out.

One of the first games I tried out is Difficult Book Game (Muzukashii Hon wo Yomu to Nemukunaru, or Reading a Difficult Book Makes You Sleepy), a small puzzle game for the Mac by a one-person indie studio called Itachoco Systems that specialized in strange, interesting art games. Players take on the role of a woman named Miss Linli who, after falling asleep reading a complicated book, finds herself in a strange lucid dream where gnomes are crawling all over her table. Players can entice them to climb on her or scoop them up with her hands. If two gnomes walk into each other, they turn into a strange seed that, in turn, grows into other strange things if it comes into contact with another gnome. Players guide her using what feels like an early ancestor to QWOP, with separate keys controlling the joints on each of Linli’s arms. It’s bizarre, difficult to control, and compelling.

A lot of early Director games play fine in ScummVM without any special work, so I was hoping that would be true here too. Unfortunately, it didn’t turn out to be quite that simple. I ended up taking a dive into ScummVM’s implementation of Director to fix it.

Director uses its own programming language, Lingo, which is inspired by languages like Smalltalk and HyperCard's HyperTalk. HyperCard was Apple’s hypermedia development environment, released for Macs in 1987, and was known for its simple, English-like scripting language that was friendly to non-programmers. Smalltalk, meanwhile, is a programming language developed in the 70s and 80s known for its simple syntax and powerful object-oriented features, very new at the time; it has also influenced modern languages such as Python and Ruby. Lingo combines a HyperCard-style, English-like syntax with Smalltalk-style object-oriented features.

Early versions of Director are unusual for having the engine interpret the game logic straight from source code[1], which means that if you’ve got any copy of the game, you’ve got the source code too. It’s great for debugging and learning how a game works, but there’s a downside too. If you’re writing a new interpreter, as ScummVM is, you have to write a parser for arbitrary source code. As it turns out, every issue I had to deal with to get this game working involved the parser.

I’ll get into the details later, but first some background. As a simplified view, ScummVM processes Lingo source in a few steps. First, it translates the text from its source encoding to Unicode; since Lingo dates to before Unicode was widely used, each script is stored in a language-specific encoding and needs to be translated so that modern Unicode-native software can interpret it correctly. Next comes a preprocessing stage, in which a few transformations are made to simplify the later stages. The output of this stage is still text with the same meaning; it’s just easier for the next stages to process. The actual parsing then happens in two stages: the lexer, which converts source code text into a series of tokens, and the parser, which has a definition of the language’s grammar and interprets those tokens in the context of that grammar.
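To make those stages a little more concrete, here is a toy sketch of the first three in Python. It is not ScummVM's code (ScummVM is written in C++ and its real stages do far more): the function names, the token categories, and the choice of line-ending normalisation as the example preprocessing step are all assumptions made up for illustration, and the grammar-driven parser stage is left out entirely.

```python
import re

def decode(raw: bytes, encoding: str) -> str:
    """Stage 1: translate the script from its legacy encoding to Unicode."""
    return raw.decode(encoding)

def preprocess(text: str) -> str:
    """Stage 2: text in, text out. Here it only normalises line endings
    (an assumed example of a transformation that simplifies later stages)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")

# Hypothetical token categories: numbers, words, and single-character symbols.
TOKEN = re.compile(r"(?P<number>\d+)|(?P<word>[A-Za-z_]\w*)|(?P<symbol>\S)")

def lex(text: str):
    """Stage 3: turn the text into a flat list of (kind, text) tokens."""
    tokens = []
    for line in text.split("\n"):
        for match in TOKEN.finditer(line):
            tokens.append((match.lastgroup, match.group()))
        # Line breaks are significant in Lingo, so keep them as tokens.
        tokens.append(("newline", "\n"))
    return tokens

if __name__ == "__main__":
    raw = b"set x = 1\rput x"  # classic Mac text uses CR line endings
    for token in lex(preprocess(decode(raw, "mac_roman"))):
        print(token)
```

The point of the split is that each stage only has to worry about one thing: by the time the lexer runs, it never sees raw bytes or odd line endings.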

This all sounds complicated, but my changes ended up being pretty small. They did, however, end up getting spread across several of these layers.

1. The fun never ends!

The very first thing I got after launching the game was this parse error:

WARNING: ######################  LINGO: syntax error, unexpected tMETHOD: expected end of file at line 83 col 6 in MovieScript id: 0!

Taking a look at the code in question, there’s nothing that really looks too out of the ordinary:

factory lady
method mNew
    instance rspri,rx,ry,rhenka,rkihoncala,rflag,rhoko,rkasoku
end method
method mInit1 spri
-- etc

This is the start of the definition of the game’s first factory. Lingo supports object-oriented features, something that was still pretty new when it was introduced, and allows for user-defined classes called “factories”[2]. Following the factory lady definition are a number of methods, defined in a block-like format: a method NAME line, an indented body of one or more lines, and an end method line.

That last line, it turns out, was the problem. To my surprise, those end method lines are totally optional, even though they’re the documented syntax in the official Director manual. Not only can any text appear there in place of method, but you don’t need any form of end statement at all. Since ScummVM didn’t recognize end statements here, it seems that many games must have just skipped them.

Luckily, this was a very easy fix: I added a single line to ScummVM’s Bison-based parser and it was able to handle end statements without breaking support for methods defined without them. I hoped that was all it was going to take for Difficult Book Game to run, but I wasn’t quite so lucky.
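The real fix lives in ScummVM's Bison grammar, but the idea can be sketched as a tiny line-based parser in Python. Everything here is hypothetical: the parse_factory name, the dictionary it returns, and the assumption that "end if" and "end repeat" are the only end lines that belong to a method body rather than closing it.

```python
# Assumption for this sketch: these "end ..." lines close a nested block
# inside a method body, so they must not terminate the method itself.
NESTED_BLOCK_ENDS = {"if", "repeat"}

def parse_factory(lines):
    """Return {method_name: [body lines]} for one factory's source lines."""
    methods = {}
    current = None  # body list of the method being read, if any
    for raw in lines:
        words = raw.split()
        if not words:
            continue
        keyword = words[0].lower()
        if keyword == "factory":
            current = None
        elif keyword == "method" and len(words) > 1:
            # A new method implicitly ends the previous one.
            current = methods.setdefault(words[1], [])
        elif keyword == "end" and (
            len(words) < 2 or words[1].lower() not in NESTED_BLOCK_ENDS
        ):
            # Optional terminator: whatever follows "end" is ignored.
            current = None
        elif current is not None:
            current.append(raw.strip())
    return methods

if __name__ == "__main__":
    source = [
        "factory lady",
        "method mNew",
        "    instance rspri,rx,ry",
        "end method",
        "method mInit1 spri",
        "    set rspri = spri",
    ]
    print(parse_factory(source))
```

Because a new method line already closes the previous method, the end line carries no extra information, which is why the parse comes out the same with or without it.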

2. Language-dependent syntax

Unlike most modern languages, Lingo doesn’t have a general-purpose escape character like \ that can be used to extend a line of code across multiple lines. Instead, it uses a special character called the “continuation marker”, ¬[3], which serves that purpose and is used for nothing else in the language[4]. (Hope you like holding down keys to type special characters!) Here’s an example of how that looks with a couple lines of code from a real application:

global theObjects,retdata1,retdata2,ladytime,selif,daiido,Xzahyo,Yzahyo,StageNum, ¬

Since Lingo was originally written for the Mac, whose default MacRoman character set supported a number of “special” characters and accents outside the normal ASCII range, they were able to get away with characters that might not be safe in other programming languages. But there’s a problem there, and not just that it was annoying to type: what happens if you’re programming in a language that doesn’t use MacRoman? This is before Unicode, so each language was using a different encoding, and there’s no guarantee that a given language would have ¬ in its character set.

Which takes me back to Difficult Book Game. I tried running it again after the fix above, only to run into a new parse error. After checking the lines of code it was talking about, I ran into something that looks almost like the code above… almost.

global theObjects,retdata1,retdata2,ladytime,selif,daiido,Xzahyo,Yzahyo,StageNum, ﾂ

Spot the difference? In the place where the continuation marker should be, there’s something else: ﾂ, the halfwidth katakana character “tsu”. As it turns out, that’s not random. In MacRoman, ¬ occupies character position 0xC2, and ﾂ sits at the same position in MacJapanese. That seems to be the answer to how the continuation marker is handled in different languages: it’s not really ¬, it’s whatever character happens to be at 0xC2 in a given text encoding.
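This is easy to check for yourself. The Python sketch below decodes the single byte 0xC2 under two encodings. One caveat: Python doesn't ship a MacJapanese codec, so shift_jis stands in for it here, on the assumption that the two agree for this byte (MacJapanese is a Shift-JIS variant). The marker_for helper is just a name made up for this example.

```python
import unicodedata

MARKER_BYTE = bytes([0xC2])

def marker_for(codec: str) -> str:
    """Decode byte 0xC2 under the given codec to see which character it is."""
    return MARKER_BYTE.decode(codec)

if __name__ == "__main__":
    # "shift_jis" is a stand-in for MacJapanese, which Python does not provide.
    for codec in ("mac_roman", "shift_jis"):
        char = marker_for(codec)
        print(codec, char, f"U+{ord(char):04X}", unicodedata.name(char))
```

Same byte, two completely unrelated characters: ¬ (U+00AC) under mac_roman and ﾂ (U+FF82) under shift_jis.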

Complicating things a bit, ScummVM lexes Lingo after translating the code from its source encoding to UTF-8. If it lexed the raw bytes, it would be simple: whatever character is at 0xC2 is the continuation marker, regardless of what character it “means”. Handling it after it’s been turned into Unicode is a lot harder. Since ScummVM already has a Lingo preprocessor, though, it could get fixed up there: just look for instances of ﾂ followed by a newline, and treat that as though it’s a “real” continuation marker[5]. A little crude, but it works, and suddenly ScummVM could parse Difficult Book Game’s code[6]. Or, almost…
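Here is roughly what that preprocessing trick looks like, sketched in Python rather than ScummVM's actual C++. The function name, the STAND_INS table, and the decision to also allow trailing spaces or tabs before the line break are assumptions for the sake of the example.

```python
import re

CONTINUATION = "\u00ac"  # the "real" marker: byte 0xC2 in MacRoman

# Hypothetical table: which decoded character byte 0xC2 turns into
# for each non-MacRoman source encoding.
STAND_INS = {
    "shift_jis": "\uff82",  # halfwidth katakana tsu
}

def normalise_continuations(text: str, source_encoding: str) -> str:
    """Rewrite the stand-in character to the real continuation marker,
    but only where it sits at the very end of a line."""
    stand_in = STAND_INS.get(source_encoding)
    if stand_in is None:
        return text
    # Lookahead: optional trailing blanks, then any style of line break.
    at_line_end = re.compile(re.escape(stand_in) + r"(?=[ \t]*(?:\r\n|\r|\n))")
    return at_line_end.sub(CONTINUATION, text)

if __name__ == "__main__":
    source = "global theObjects,retdata1, \uff82\rretdata2"
    print(repr(normalise_continuations(source, "shift_jis")))
```

Only matching at the end of a line is what keeps this from being too destructive: the same character in the middle of a line, such as inside a Japanese string literal, is left alone.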

3. What’s in a space?

Now that I could finally get in-game, I could start messing around with the controls and see how it ran. Characters were moving, controls were responding—it was looking good! At least until I pressed a certain key…

Her arms detached—that doesn’t look comfortable. In the console, ScummVM flagged an error that looked relevant:

Incorrect number of arguments for handler mLHizikaraHand (1, expected 3 to 3). Adding extra 2 voids!

The handler name stood out, since “hizi” (more commonly romanized as “hiji”) means elbow. I figured it was probably the handler called when rotating her arm around her elbow, which is exactly what visually broke. I took a look at where mLHizikaraHand and the similar handlers were being called, and noticed something weird. In some places, it looked like this:

locaobject(mLHizikaraHand,(rhenka + 1),dotti)

And in other places, it looked slightly different:

locaobject(mLHizikaraHand (rhenka + 1),dotti)

Can you find the difference? It’s the character immediately after the handler name: instead of a comma, it’s followed by a space. Now that I looked at it, the ScummVM error actually sounded right. Read literally, the second form does look like it’s calling mLHizikaraHand with a single argument, (rhenka + 1). When I talked it over with ScummVM dev djsrv, the conclusion was that this is just a Lingo parsing oddity. Lingo was designed to be a user-friendly language, and there are plenty of cases where its permissive parser accepts things that most languages would reject. This seems to be one of them.

Unfortunately, this parse case also seems to differ between Lingo versions. Changing how ScummVM interprets it might have knock-on effects for parsing things created for later Director releases. Time to get hacky instead. The good news is that ScummVM has a mechanism for exactly this: it bundles patches for various games, making it possible to fix up weird and ambiguous syntax that its parser can’t handle yet. I added patches to change the ambiguous cases to the syntax used elsewhere, and suddenly Miss Linli’s posture was looking a lot healthier.
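To show the kind of rewrite those patches perform, here is a small Python sketch. It is not ScummVM's actual patch format or mechanism; the patch_missing_comma function and its regex approach are invented for illustration. It only shows the textual change: turning the space after a known handler name into the comma used elsewhere in the game's code.

```python
import re

def patch_missing_comma(source: str, handler_names) -> str:
    """Rewrite 'handler (' as 'handler,(' for the listed handler names only."""
    if not handler_names:
        return source
    names = "|".join(re.escape(name) for name in handler_names)
    # A listed name, then one or more blanks, then an opening parenthesis.
    ambiguous = re.compile(r"\b(" + names + r")[ \t]+\(")
    return ambiguous.sub(r"\1,(", source)

if __name__ == "__main__":
    line = "locaobject(mLHizikaraHand (rhenka + 1),dotti)"
    print(patch_missing_comma(line, ["mLHizikaraHand"]))
```

Restricting the rewrite to an explicit list of handler names keeps it from touching ordinary calls that legitimately have a space before their parenthesis.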

This whole thing ended up being much more of a journey than I expected. So much for having it just run! In the end, though, I learned quite a bit—and I was able to get a cool game to run on modern OSs. I’m continuing to work on ScummVM’s Director support and should have more to write about later.

Thanks to ScummVM developers djsrv and sev for their help working on this.

  1. Later versions switched to using a bytecode format, similar to Java or C#. This makes processing a lot easier, since bytecode produced by Director’s own compiler is far more standardized than human-written source code.

  2. Despite the name, it isn’t really implementing the factory pattern.

  3. The logical negation symbol.

  4. It’s a bit of a weird choice, but Lingo didn’t do it first: HyperCard’s HyperTalk language used it earlier, and Apple’s AppleScript uses it too.

  5. Tempting as it is to refactor the lexer, I had other things to do, and I really wasn’t familiar enough with its innards to take that on.

  6. As it turns out, this wasn’t the only game with the same issue. Fixing this also fixed several other Japanese games, including The Seven Colors: Legend of Psy・S City and Eriko Tamura’s Oz.