Curing The Director Pain

Technology·Calvin Correli·Oct 11, 1999· 16 minutes

The Outdated Metaphor

The notion of a score, with a single timeline and a single depth axis, the stage and the cast, all worked well in the beginning. Because the (multi) medium of computers was new, people needed a familiar metaphor they could relate to. And the level of user interaction wasn’t too high, so it made sense to think of it as a movie with perhaps a little interaction on top.

Not so anymore. As the demands for user interaction increase, the limitations of the metaphor become ever more apparent.

First of all, having only one timeline doesn’t make sense when user interaction will constantly change the flow of the movie.
Having only one z-axis, i.e. the layers of the score, doesn’t make much sense either – it is an unnecessary limitation of our “characters’” freedom to move around the “stage”.
The cast is simply dreadful. It is one big junkyard where everything is piled together – images, video clips, sound clips, color palettes, scene transition effects, and several different kinds of programming – with no means of structuring them.

So Director supports a programming language, Lingo, to help you out. You can program jumps from one frame to another, you can program responses to events like mouse clicks or key presses, you can even programmatically create new cast members and show them on the screen. But the more you let Lingo control what’s happening, the more it becomes apparent that Director is making your job a whole lot harder than it had to be.

What Do We Do About It?

If you can accept the fact that the movie metaphor is no longer serving its purpose, the obvious question is: What metaphor should we use, then? Well, to answer that, we’ll need to consider what job Director is actually doing, or what job it’s supposed to do.

At the very heart of it, Director is, or should be considered, a programming environment. A multimedia “thing” is above all a program. It’s not a movie, because a movie lives on film and there’s no interaction. A multimedia production lives in its own medium, the computer. And the content native to computers is software, or programs.

So it’s a programming environment, much like Visual Basic, Delphi or JBuilder. I’ll define a programming environment as the combination of the following pieces:

A programming language.
A library of classes and functions specific to its target domain.
An environment that facilitates development.
A compiler to translate the language into some byte code form.
A run-time interpreter.

The target domain of Borland Delphi is database applications; Director’s is cross-platform multimedia applications. This should determine the design of the components. (Note: You could do without either the compiler or the interpreter but not both.)

Why This Isn’t Harder For The Novice

The short answer: Because you can implement an environment on top of the language that will make it easy to do simple things. Another answer is: Because Director is almost as hard as anything ever gets.

Trivial programming is easy. Non-trivial programming is hard. There’s nothing you or I or Macromedia can do to change this. The hard part is not in the actual coding, it’s in the program design. You can’t make the design part easier, but you can surely make the coding (and debugging) part harder. Which is what Director does.

We can “easily” implement a development environment on top of (most any) language that will allow a user to do drag-and-drop programming of pre-determined simple things in a subset of the language. And I propose doing just that.

But as soon as you have a Turing-complete language, users will be able to do immensely complex things in that language. And eventually, they will. The day that happens, you want to be ready for them, supporting them with all the tools that make the complexity bearable.

These tools include:

reasonably clear semantics of the programming language so you can reason about your program
a reasonable debugger (no, the one in Director is not reasonable) so you can trace the path leading to a bug
version control so you can track changes to a project and revert to a working version

My philosophy is, generally speaking, that we should empower users rather than restrict them. Some things, like memory allocation, computers simply do better than sloppy human beings. But computers can’t write non-trivial programs. Instead, you want your environment to stay out of the user’s way and simply offer him tools to help with specific tasks.

What I Propose

In the following, I will outline a multimedia development environment as I would create it, if given the chance (and a few hundred great programmers). There is a great deal more work to do before a single line of code can be written but this is what I have in mind.

The Language

By simply considering the context in which it will be used, we can come up with certain ideas about how it should work. Since an object is a very useful metaphor for thinking about things on-screen (e.g. buttons, images, monsters, etc.) it should probably be object-oriented.

Since programmers will always want to do complex things, it should allow for easy structuring of code through aggregation, association and inheritance. Also, complexity hiding through encapsulation and layering is important.

Since we would want our programs to be robust and safe, we’d probably want an interpreted language. The programmer shouldn’t be allowed any explicit memory manipulation and we should have some sort of garbage collection.

Finally, since programs will always experience run-time errors, we should have some sort of clean exception handling.

Java comes to mind as a reasonable alternative fulfilling all of the above. However, I think we would benefit from constructing our own context-specific language. There is no reason, however, to reinvent the wheel, so we would clone the syntax of some well-known language (but not the dreadful Lingo, please!).

The Class Library

The class library should be targeted at high-level development of multimedia applications. Obviously, it should seamlessly handle the elements of multimedia—text, images, sound and video clips. Its primary focus should be on ease of use, performance and cross-platform development.

It should also allow us to structure the story in meaningful ways at a high level. This is the hard part. What constitutes “meaningful”? It depends on the user. We can’t possibly anticipate this. But we can try to get close.

What we should do is bring together a large group of authors of different kinds of multimedia (educational titles, games, quizzes, etc.); interview each one about how he or she thinks about it; then try to construct a base class library that is the distillation of the common ground.

This basis should provide the means to build more complex and context-specific classes on top, which will facilitate different ways of thinking about the multimedia being presented.

It should be possible for anyone to build his own package on top of the standard one and integrate seamlessly with it. Packages should also be easily distributable.

Having to make changes in the low-level structure is painful, so we need to get it right from the start. (And since no program is ever right from the start, we also need to have some way of evolving the class library, like Java has with its “@deprecated” documentation tag. But that’s really part of the language.)

The Development Environment

Designing the development environment so it supports novice and expert developers alike is also a hard task. Its design depends largely on the design of the class library, so I’ll leave most of the discussion till after we have sorted that out. Some general comments are in order, though.

It is imperative that everything that determines how the multimedia program will run is stored in the form of language source code. Thus, there can be no extra (binary) information files. The reason is that if everything is in simple text source code form, the user will have a host of other programs he can use with it, including version control, search tools, editors and even third-party development environments. These tools are very important to every software developer, and consequently will be to multimedia developers as well.

It is obvious that some files must be in binary form, including image, sound and video files. And things that don’t affect run-time, like where on the screen the source code editor is placed in the development environment, can be stored anywhere in any form for all I care.

The environment should also include some form of debugger that will let the user step through his code, inspect variables, set breakpoints, etc. As with any program, multimedia programs will have bugs, and we must help the user find and eliminate them. And it should have a reasonable editor (I simply cannot understand how Macromedia succeeded in developing an editor as slow and cumbersome as the script editor in Director. Even if I tried, I am not sure I would know how to!).

The Compiler

Even with an interpreted language, a compiler can be important. First of all, it will allow us to do compile-time checks to prevent certain types of errors at run-time (e.g. undeclared variables).
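To make the idea of a compile-time check concrete, here is a rough sketch in Python, which merely stands in for the language that doesn’t exist yet. It uses Python’s standard ast module to find names that are read but never bound. The function name undeclared_names and the sample script are made up for illustration, and the check is deliberately naive: it ignores scoping and statement order.

```python
import ast
import builtins

def undeclared_names(source):
    """Return names that are read somewhere in `source` but never bound.

    Deliberately simplistic: scoping and statement order are ignored.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))   # built-in names count as declared
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)          # assignment or loop target
            else:
                used.add(node.id)           # the name is being read
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)             # function parameter
        elif isinstance(node, ast.alias):
            bound.add((node.asname or node.name).split(".")[0])
    return sorted(used - bound)

SAMPLE_SCRIPT = """
def set_alpha_channel(level):
    pass

alpha = 0
for step in range(0, 255, 20):
    alpha = step
set_alpha_channel(alhpa)
"""
```

Calling undeclared_names(SAMPLE_SCRIPT) reports the misspelled alhpa before the script ever runs, which is exactly the class of error we would rather not discover at run-time.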

More importantly, however, it allows us to make our language even more context-specific while keeping the interpreter simple (and thus simple to port), by doing complex translations of the source code into something that is simpler to interpret.

The Run-time Interpreter

There isn’t much to say about the interpreter. If the language, class library and compiler are all well designed it should be clear and simple, allowing it to be easily ported to other platforms.

It cannot be too simple, however. The interpreter should help take care of all platform idiosyncrasies, in order to enable cross-platform development. The Java Virtual Machine would probably be a good source of inspiration for this.

How To Support Story Structure

As mentioned before, this really is where all the meat of designing such a system is: designing a language and a class library that will adequately support what users of the system want to do. I don’t have a complete answer to this, not even a sketch. I have a faint idea of how it could look and work. I will try to describe that idea here, and hopefully I or someone else will flesh it out over time. I will only use as my basis one specific type of multimedia program.

I was recently involved in a kiosk project. When talking about it, we had the notion of five distinct parts of content:

An attract-loop
A main menu with appetizers for the individual sections
The sections containing the meat of the content, all structured similarly
Special pop-up boxes that would provide even more information on particular subjects for the very interested and energetic users
A database of the museum’s archive on the subject

The attract loop is, as the name says, a loop of images showing, sounds playing or whatever, subject to some specified timing. On a mouse click, it exits, and control should be given to the main menu. The main menu, in turn, consists of a presentation of each of the content sections, one after the other. The presentation of a section, in turn, consists of displaying text, playing sound and showing images under certain time constraints. The text could fade in and out, the images could slide or fade on-screen, or whatever the author might want them to do. The text can fade, sound play and images slide, all simultaneously if the author so wishes.

So it seems we have multiple timelines at play here. The attract loop has its own, but it can quit at any time, leaving control to the main menu timeline. This, in turn, determines that the presentation of the individual sections should happen one at a time, the next not showing until the first is finished. Each section presentation, in turn, has its own timeline, determining when text should start fading, sound should start playing or an image should start sliding on-screen.
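The hand-off from the attract loop to the main menu can be tried out today with Python’s asyncio, treating each timeline as a task. This is only a sketch under my own assumptions: the names attract_loop, kiosk and demo, the image names and the timings are all invented, and a real system would draw to the screen rather than append to a list.

```python
import asyncio

async def attract_loop(shown):
    """Cycle through teaser images until the timeline is cancelled."""
    images = ["poster", "map", "portrait"]
    i = 0
    while True:
        shown.append(images[i % len(images)])
        i += 1
        await asyncio.sleep(0.01)

async def kiosk(click, shown):
    loop_task = asyncio.create_task(attract_loop(shown))
    await click.wait()          # the attract loop runs until a mouse click
    loop_task.cancel()          # its timeline ends here...
    shown.append("main menu")   # ...and control passes to the main menu

async def demo():
    click = asyncio.Event()
    shown = []
    kiosk_task = asyncio.create_task(kiosk(click, shown))
    await asyncio.sleep(0.035)  # let the loop show a few images
    click.set()                 # simulate the user's mouse click
    await kiosk_task
    return shown
```

Running asyncio.run(demo()) returns the list of things shown: some attract-loop images followed by the main menu. The attract loop never has to check for the click itself; it is simply stopped from outside.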

The Timeline

So what is a timeline? I see it as being essentially a program. It can say things like “do this, then that, then that”, i.e. sequencing; it can contain loops and conditionals; it can call other functions, which could in themselves spawn other timelines. For instance, the timeline for the main menu might look something like this:

for (int i = 0; i < menuSections.length; i++) {
  menuSections[i].execute();
}
pleaseSelect.execute();

What happens is that we loop through all the sections of the menu, displaying each one in turn. After having shown all of them, we show a special “please select an option now” screen, probably containing links to all the sections in one screen. If the user doesn’t click on any of the sections, we’re out of things to do, the timeline function exits, and returns to wherever it was called from. (Don’t let the Java-like syntax confuse you: believe me, it’s not Java!)

I imagine these calls would all be blocking, in the sense that the call doesn’t return, and thus the for loop doesn’t continue, until the menu section is done doing what it does, in its own time. Other calls could be non-blocking, i.e. we tell some other object to start doing something, but we continue our own timeline without waiting for the foreign object to finish what we started. This would be more like firing an event than calling a function.
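The difference between the two kinds of call can be sketched with Python’s asyncio, where awaiting a coroutine plays the role of a blocking call and asyncio.create_task plays the role of a non-blocking one. Again, this is an approximation under my own assumptions, not the proposed language: play, main_menu, run_demo, the section names and the durations are all hypothetical.

```python
import asyncio

async def play(name, seconds, log):
    """Hypothetical timeline: does its work for `seconds`, then finishes."""
    log.append(f"{name} start")
    await asyncio.sleep(seconds)
    log.append(f"{name} end")

async def main_menu(log):
    # Non-blocking: start the music and continue our own timeline at once.
    music = asyncio.create_task(play("music", 0.2, log))
    # Blocking: the loop does not advance until each section has finished.
    for name in ("history", "gallery"):
        await play(name, 0.01, log)
    log.append("please select")
    await music  # let the background timeline finish before we return

def run_demo():
    log = []
    asyncio.run(main_menu(log))
    return log
```

In the log returned by run_demo(), the history section ends before the gallery section starts, while the music starts early and is still playing when the “please select” step is reached.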

A More Complex Timeline

This idea seems fairly simple. But what if we need to synchronize with sound or video, or we want to wait for other things to happen?

These are all events. Some events are pre-defined, like mouse clicks or a cue point in an audio track. Others could be user-defined.

class MenuText {
  Sound theSound;

  event fadeIn;
  event fadeOut;

  timeline {
    on (fadeIn) {
      for (int i = 0; i < 255; i += 20) {
        setAlphaChannel(i);
        sleep(100ms);
      }
      while (true) {
        setAlphaChannel(200);
        sleep(100ms);
        setAlphaChannel(255);
        sleep(100ms);
      }
    }

    on (fadeOut, theSound.end) {
      for (int i = 255; i >= 0; i -= 20) {
        setAlphaChannel(i);
        sleep(100ms);
      }
      setAlphaChannel(0);  // the loop stops at 15, so finish fully faded out
    }
  }
}

I mentioned before the use of text that can fade in and out. This is how that could be described. I also added the silly feature that, when fully faded in, the text will “shake”, constantly alternating between alpha channel levels 200 and 255.

First of all, we have a reference to a sound object. We don’t care where the reference might come from for now. We use it to subscribe to a pre-defined event of that object, namely the event that occurs when the sound is done playing. We also declare two of our own events, fadeIn and fadeOut.

This timeline dictates that a MenuText object does nothing until triggered by a “fadeIn” event, which must be issued by some other object. Then it does a for loop to fade the text in. When fully faded in, it will continue flashing back and forth between alpha channel values 200 and 255 until it receives another event. This could be either an explicit fadeOut event, or the event that occurs when the specified sound stops playing. Then it fades out, and it reaches the end of the timeline.

The Special Timeline Method

Note that the “on ()” blocks are sort of like subfunctions. There is no “main timeline” in this case. Also note that the “on (fadeIn)” block enters an infinite loop.

Of course, there’s got to be some way to break that loop. But I don’t want to force the programmer to explicitly break it. Instead, when an event occurs that will push it into another (or perhaps also the same?) state, whatever is currently running should be interrupted, and control transferred to the new handler.
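One way to approximate this “interrupt whatever is running” rule is task cancellation in Python’s asyncio. The sketch below mirrors the MenuText example under my own assumptions: the fire method, the soundEnd event name, the alpha and history attributes and the millisecond timings are invented for illustration, and nothing is actually drawn.

```python
import asyncio

class MenuText:
    """Hypothetical object whose timeline is a set of event handlers."""

    def __init__(self):
        self.alpha = 0
        self.history = []      # events received, in order
        self._current = None   # the handler task currently running

    async def _fade_in(self):
        for level in range(0, 255, 20):
            self.alpha = level
            await asyncio.sleep(0.001)
        while True:            # "shake" until another event interrupts us
            self.alpha = 200
            await asyncio.sleep(0.001)
            self.alpha = 255
            await asyncio.sleep(0.001)

    async def _fade_out(self):
        for level in range(255, -1, -20):
            self.alpha = level
            await asyncio.sleep(0.001)
        self.alpha = 0         # finish fully faded out

    def fire(self, event):
        """Interrupt the running handler and hand control to the new one."""
        handlers = {
            "fadeIn": self._fade_in,
            "fadeOut": self._fade_out,
            "soundEnd": self._fade_out,   # stands in for theSound.end
        }
        if self._current is not None and not self._current.done():
            self._current.cancel()        # breaks the infinite loop for us
        self.history.append(event)
        self._current = asyncio.create_task(handlers[event]())
        return self._current

async def demo():
    text = MenuText()
    text.fire("fadeIn")
    await asyncio.sleep(0.05)    # let it fade in and shake for a while
    await text.fire("fadeOut")   # interrupts the fade-in handler, then fades out
    return text
```

After asyncio.run(demo()), the text has faded out even though the fade-in handler never leaves its loop on its own. The programmer writes no explicit break; the arrival of the next event does the interrupting.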

A Few Of The Things I Haven’t Figured Out Yet

I’m getting in way over my head here. Basically, what we have here is a multi-threaded program, each timeline being a thread, and each thread being more or less an arbitrary piece of code. That’s not trivial. What about synchronization and critical regions? What about deadlocks?

The answer is: I don’t know. As I have noted before, this is to be considered only a sketch of a faint idea. First thing to do is to brainstorm on possible language designs. Then we’ll see what implications it might have and whether it can be done at all.

How To Support Development

How do we create a development environment that lets users do drag-and-drop editing of arbitrary programs and insulates them from the complexity when they don’t want it, while at the same time allowing full Turing-machine capabilities?

That is also a hard question. The solution I suggest is that we provide a “double-layered” interface: a simple layer that will do drag-and-drop editing on a subset of the language, and another that will let the programmer edit the code manually. If the user has opted for the simple interface, it will try to make sense of the code for, e.g., a particular timeline. If it is able to recognize the code as belonging to the subset it understands, it will offer to let the user edit it in the simple form. Otherwise it will have to fall back to a plain code editor.
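Deciding whether a piece of code belongs to the simple subset could be as plain as walking its syntax tree and checking every construct against a whitelist. Here is a sketch using Python’s ast module as a stand-in for a parser of the proposed language. The whitelist SIMPLE_NODES and the function editable_in_simple_mode are my own invention, and which constructs the simple editor should understand is an open design question.

```python
import ast

# Constructs the hypothetical drag-and-drop editor is assumed to understand:
# plain sequences of method calls and simple for loops, nothing else.
SIMPLE_NODES = (
    ast.Module, ast.Expr, ast.Call, ast.Name, ast.Attribute,
    ast.Constant, ast.For, ast.Load, ast.Store,
)

def editable_in_simple_mode(source):
    """True if every construct in `source` belongs to the simple subset."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False   # unparsable code can only be shown in the text editor
    return all(isinstance(node, SIMPLE_NODES) for node in ast.walk(tree))

SIMPLE_TIMELINE = (
    "for section in menu_sections:\n"
    "    section.execute()\n"
    "please_select.execute()\n"
)

COMPLEX_TIMELINE = (
    "while True:\n"
    "    if user.clicked():\n"
    "        break\n"
)
```

With this whitelist, SIMPLE_TIMELINE would be offered in the drag-and-drop view, while COMPLEX_TIMELINE, with its while loop and conditional, would open in the code editor.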

The simple form of interface could be like the Director we know today (except, of course, for the Lingo script editor), with cast, stage, score and a single timeline. It should be possible to express all that in a subset of my language. We’d probably want to expand it a bit and make it more flexible, though.

Where To Go From Here

These are all just some thoughts that I’ve had while suffering the pain of working with Director. As I have made clear, there are still a lot of things that need to be sorted out, including giving it a name!

While I might find the time to elaborate more on some of the ideas, I doubt that I will ever have the time to develop something like this myself. Thus I hereby donate it to the public (under the GPL, http://www.fsf.org/copyleft/gpl.html) in the hope that someone somewhere will feel inspired and develop it for me and for everyone else to enjoy.
