Synthespianism and anthropomorphization of computer graphics

October 2, 2002 by Diana Walczak

The anthropomorphization of computer graphics has been a classic case of exponential growth powered by technology, art, commerce and culture. Funding for military and aerospace applications like nuclear weapons design, weather prediction and flight simulation paid for much of the initial heavy lifting required to build the foundation of the computer graphics industry during the 1960s and early 1970s.

As the sophistication of graphics software marched forward and the cost of computing slid downward, the annual SIGGRAPH film and video show became the crucible in which technologists, filmmakers and artists were introduced to one another: the baton of computer graphics design was passed from those who wrote the programs required to create imagery to those who perceived and exploited the incredible communicative potential of this fledgling medium.

Along the way, the natural predilection to create computer graphics in the image of ourselves has led to a striking body of creative endeavor, ever growing in realism and resolution, which now knocks at the door of imperceptibility; that is, the rendering of human performances indistinguishable from flesh and blood performances. Some of the steps in that evolution are traced here from a personal perspective, and some speculations on future developments are presented.

Initial attempts at simulating human body motion took several forms at Digital Effects, a company I co-founded in 1978 along with college associates. Hierarchical skeletons were created and keyframe-animated to move in a bipedal fashion, but without any IK (inverse kinematics) solutions available, the results were stilted and difficult to edit. Rotoscoping live-action footage of a subject festooned with witness points at the joints and digitizing the positions of the points on film allowed us to imbue our characters with more lifelike motion, but the process yielded primarily 2-D information that could not be used from all angles.

At that time, 3-D modeling software had been designed for architectural applications and was not capable of modeling the human form in a satisfactory manner. Early attempts at creating and linking facial expressions using software solutions in computer animation were very disappointing and unconvincing at even the most basic levels of lip synchronization.

3-D rotoscoping at Robert Abel and Associates yielded the first motion-captured animation in the commercial project, Sexy Robot, under the technical direction of Frank Vitz, who used two 35mm cameras to triangulate the 3-D positions of witness points on a live subject.
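
As a rough illustration of two-camera triangulation (not the code used on Sexy Robot, which the text does not describe), the sketch below assumes each camera contributes a known center and a ray direction toward the witness point. The function name and the midpoint method are illustrative assumptions; the 3-D position is estimated as the midpoint of the shortest segment joining the two rays.

```python
def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b):
    """Estimate a 3-D point from two camera rays (hypothetical helper).

    Each ray is given by a camera center and a direction toward the
    witness point. Returns the midpoint of the shortest segment that
    joins the two rays.
    """
    def sub(u, v):
        return [u[i] - v[i] for i in range(3)]

    def dot(u, v):
        return sum(u[i] * v[i] for i in range(3))

    w = sub(origin_a, origin_b)
    a = dot(dir_a, dir_a)
    b = dot(dir_a, dir_b)
    c = dot(dir_b, dir_b)
    d = dot(dir_a, w)
    e = dot(dir_b, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")

    # Parameters of the closest points along each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    point_a = [origin_a[i] + s * dir_a[i] for i in range(3)]
    point_b = [origin_b[i] + t * dir_b[i] for i in range(3)]
    return [(point_a[i] + point_b[i]) / 2.0 for i in range(3)]


# Two cameras one unit either side of the origin, both sighting a
# marker five units in front of them.
marker = triangulate_midpoint([-1, 0, 0], [1, 0, 5], [1, 0, 0], [-1, 0, 5])
```

With real footage the two rays rarely intersect exactly, which is why the midpoint of the closest approach is used rather than a true intersection.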

During the period of 1985-1986, a good deal of seminal research and development in character animation was conducted at Abel as well as Digital Productions and Omnibus Computer Graphics, all three of which joined together and imploded under the weight of their largesse.

First Synthespian

While at Digital-Omnibus-Abel (to become known as DOA) I met Diana Walczak, a recent college graduate who was searching for ways to combine science and art. We formed a partnership based on our mutual interest in developing computer-generated characters and came up with sculpture-based solutions to the problem of modeling the human form and creating facial animation.

Diana sculpted a human form in clay and metal armature that was cast in hydrocal, from which individual body parts could be created and digitized using a magnetic digitizing device called the 3Space Digitizer by Polhemus. The body parts were lined with thin tape to define the optimum topology in polygons and digitized by hand using a magnetic sensor.

These body parts were then assembled digitally into a skeletal hierarchy to form Nestor Sextone, our first Synthespian. Nestor’s joints were formed by interpenetrating solids (software that would allow for flexors at the joints did not yet exist), which gave us seams similar to those of a plastic action figure. For facial animation, a neutral face was cast in hydrocal, which allowed Diana to make multiple clay copies that could be sculpted into various phonemes of speech and facial expressions.

Larry Weinberg, a programmer from Digital Effects and Omnibus who later would write Poser, contributed software that would allow us to link the various digitized facial expressions together by re-ordering the polygons. With multiple faces re-ordered into exactly the same polygonal topology, we could interpolate from one to another, enabling us to create scripts that could simulate lip synchronization with our soundtracks. Using keyframe animation and Larry’s facial animation software, Sextone made his screen debut for SIGGRAPH 1988 in a 30-second film in which he campaigns for the presidency of the Synthetic Actors Guild.
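
The interpolation step can be sketched in a few lines. This is a minimal illustration, assuming each digitized face is a list of (x, y, z) vertices stored in the same order; the function name and data layout are hypothetical and are not taken from Larry Weinberg's software.

```python
def blend_faces(face_a, face_b, weight):
    """Linearly interpolate between two faces with identical topology.

    face_a, face_b: lists of (x, y, z) vertex tuples in the same order.
    weight: 0.0 returns face_a, 1.0 returns face_b.
    """
    if len(face_a) != len(face_b):
        raise ValueError("faces must share the same vertex count and order")
    return [
        tuple(a + weight * (b - a) for a, b in zip(vertex_a, vertex_b))
        for vertex_a, vertex_b in zip(face_a, face_b)
    ]


# Two-vertex stand-ins for a neutral face and an open-mouth sculpture.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
open_mouth = [(0.0, -1.0, 0.0), (1.0, -2.0, 0.0)]
halfway = blend_faces(neutral, open_mouth, 0.5)
```

The matching vertex order is what makes this work: vertex n of one sculpture must describe the same spot on the face as vertex n of every other, which is why the polygons had to be re-ordered first.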

Intrigued by the potential of motion capture to link natural human motion to our synthetic characters, we created Don’t Touch Me, a music video piece that premiered at SIGGRAPH 1990 in which singer/songwriter Perla Batalla was optically motion captured (by Motion Analysis in Santa Rosa, CA) to drive a singing synthespian called Dozo. By this time, we had suitable flexing software for simple joints like elbows and knees, but the multi-axis requirements for the shoulder joint meant a solution was still several months of development away.

Facial animation was again created by linking digitized sculptures of various facial expressions and this technique yielded superior results to any of the software solutions of the time that sought to model the musculature of the face. Software solutions would require many years of development before they would overcome the quality and believability of the sculpture-based technique, which allowed for the preservation of facial volume and the illusion of the preservation of facial muscle integrity during motion.

This is the result of the fact that Diana’s keyframe sculptures all had appropriate muscle definition and maintained that definition while interpolating from one to the next.

Our first stereoscopic synthespians were created for In Search of the Obelisk, a theme park trilogy for the Luxor Hotel in Las Vegas, designed by Doug Trumbull. Using optical motion capture of live dancers, we created the illusion of glass synthespians dancing on a hovering beach that floated over the audience.

Since we used ray tracing to refract the background through the bodies of the dancers, the stereoscopic image perceived by the audience was accurately rendered with slightly different refractions from the point of view of the left eye as compared to the right, yielding a very realistic illusion reminiscent of the optical properties of a glass object when viewed stereoscopically.

For the feature film Judge Dredd, digital stunt doubles were created to solve a technical problem: many of the shots in the climactic chase sequence required Sylvester Stallone and Rob Schneider to appear to ride on a flying motorcycle that weaves around other flying vehicles and skyscrapers. The close-ups were shot on a green screen stage with a gimbaled prop of the motorcycle and composited into mocon (motion control) footage of the huge model of the city. Other shots required the motorcycle to fly toward camera from a long distance and maneuver in a complex flight path as it whizzed past camera.

These shots could not be photographed due to limitations in the length of the green-screen camera rig as well as the reluctance of the producers to allow a large, heavy, motion-controlled camera rig to careen within a few feet of their lead actors. We used magnetic motion capture (Ascension Technologies’ Flock of Birds) to obtain the body dynamics of the motorcycle riders during the various changes in attitude of the motorcycle.

While the previsualization played back on video, the subject (in this case Diana Walczak) was fitted with magnetic trackers and wobbled around on a gimbaled motorcycle mockup in sync with the previz playback. The way her body moved in response to the motion of the bike was captured and applied to photoreal synthetic versions of Stallone and Schneider. For the faces, we used CyberScans for the first time and the results were satisfactory since the camera never lingered on the faces at close range.

Organic shape-shifting

A later project for the feature film X-Men involved the character Mystique, played by Rebecca Romijn-Stamos. Mystique is a shape-shifter who transforms from her scaly blue form into other characters and back again using a combination of live action photography and CG animation. Director Bryan Singer was looking for a transformation that would stand apart from the typical morphing that had risen like a plague through the visual arts, becoming a constant technique used in advertising and in films to change one object into another using simultaneous 2D shape transformation with dissolving texture vertices.

We designed a technique that would allow for a dimensional transformation that would begin at various locations and spread across and around the limbs in an organic, infectious fashion accented by 3-D scales bursting through the surface and settling down like a shaking dog’s coat to form the scales on her body. In most cases we used CyberScan data of the outgoing actor matched to the 3-D position of the actor in the shot as a matting element to transform into an all-CG Mystique.

This technique required eighteen stages of production to create the multilayered complex transformation and very careful matching of CG skin, clothing and hair to the live action footage. Although the mandate was to make the CG Mystique appear photoreal, her blue, scaly body was very different from that of a normal person, yielding considerable visual leeway.

For the Revolution Films production of the Jet Li film The One, Jet Li battles his identical doppelganger from another dimension. For many shots, a simple split screen or a Patty Duke-style over-the-shoulder shot would suffice, but for high-speed kung fu battle sequences in which punches and kicks had to land and be felt by the audience, digital face replacement was the technique of choice.

The separation of a facial performance from a physical performance had been accomplished before in Jurassic Park in the shot where the velociraptor leaps up from below to attack a child character.

The adult stunt double’s face was replaced with that of the child actor, but that was simply a composite of photographic elements. In The One, the complex high-speed motion of the subjects during the fight sequences, coupled with the requirement that the two subjects sometimes appear to move at different camera frame rates, required us to develop a fully CG face-replacement solution.

The stunt double, a kung fu expert with a very similar body type to Jet Li, was outfitted with a plastic mask that was milled from a CyberScan of Jet Li’s face. The mask was equipped with retro-reflective witness points and the camera was outfitted with a fluorescent, circular light around the lens to ensure that the markers would show up on film.

Jet Li plays a police officer pursued by his evil alter ego from a parallel universe who seeks to kill him and become The One. Advanced face replacement techniques allow Li to fight his twin. Both faces are visible and fully expressive in close-ups.

The fight sequences were choreographed so that the force of the impacts would impart proper reaction in the two participants. Using the known positions of the facemask markers, we could determine the precise orientation of the stunt double’s face on each frame, allowing us to track a CG face over top of his mask.
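
One simple way to recover an orientation from tracked markers is to build an orthonormal frame from three of them. The sketch below is an illustrative assumption rather than the production tracker: it supposes the 3-D positions of three non-collinear markers are already known for a frame, and the function name is hypothetical.

```python
import math


def frame_from_markers(p0, p1, p2):
    """Build orthonormal axes (x, y, z) from three non-collinear markers.

    x points from p0 toward p1, z is normal to the plane containing the
    three markers, and y completes the right-handed frame.
    """
    def sub(u, v):
        return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

    def cross(u, v):
        return (
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0],
        )

    def normalize(u):
        length = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
        if length < 1e-12:
            raise ValueError("markers are coincident or collinear")
        return (u[0] / length, u[1] / length, u[2] / length)

    x_axis = normalize(sub(p1, p0))
    z_axis = normalize(cross(x_axis, sub(p2, p0)))
    y_axis = cross(z_axis, x_axis)
    return x_axis, y_axis, z_axis


# Markers lying in the XY plane give the identity orientation.
axes = frame_from_markers((0, 0, 0), (2, 0, 0), (0, 3, 0))
```

Evaluating this on every frame gives a rotation that a CG face can follow; with more than three markers, a least-squares fit would be less sensitive to noise in any single marker.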

Using CyberScan data of Jet’s face, along with high-resolution photographs, we created and rigged a detailed 3-D Jet Li face with blendshapes that would allow us to simulate different facial expressions during the fight. The CG face was then animated to give us the appropriate expression for each sequence, matted into the shot covering up the mask and blended into the stunt double’s natural color around the face. Because it was not possible to photograph Li’s face in the proper dynamic orientation with the proper expression for a given moment of a fight, a CG face was the only solution.

The resulting technology, which allows us to separate the physical performance from the facial performance, has far-reaching implications for the future of filmmaking. First of all, stunt sequences that normally would be staged in such a way that the face of the stunt double is never facing camera can now be staged according to the needs of the director, and the actor’s face can be inserted accurately and believably. More broadly, the facial performance of an actor who is incapable of the physical aspects of a performance can be composited into the footage of a stunt double to multiply the range of an actor’s possible roles. Recent projects making use of our technology include inserting an actor’s face onto stunt doubles who are surfing and riding motorcycles.

Animation trumps mo-cap

More interesting from our standpoint is the creation of wholly CG characters and their application to entertainment projects. Universal Studios came to us with the mandate to create the best theme park attraction in the world based on the Spider-Man characters, and we spent three years in production on The Amazing Adventures of Spider-Man, a multimedia, stereoscopic, moving motion-simulator attraction that was to become the flagship of their new theme park, Islands of Adventure in Orlando, Florida.

The Amazing Adventures of Spider-Man was created for Universal's billion-dollar Islands of Adventure theme park in Orlando. It's the first ride in history to combine stereoscopic 3D film projected onto giant screens with the latest in motion-based vehicle technology. This virtual-reality adventure immerses riders in a comic-book battle between Spider-Man and members of the sinister syndicate as riders move through a 1.5-acre set environment.

Working with our head software designer, Frank Vitz, we developed software that would compensate for the viewing position of the moving audience, who would sit in six-degrees-of-freedom motion simulators traveling on a track past 13 large reflective screens. The imagery was projected in stereoscopic eight-perf 70mm film.
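
The article does not describe how that compensation worked. One standard way to render for a viewer who is off the screen's center line is an off-axis (asymmetric) perspective frustum, sketched below under stated assumptions: the screen is a rectangle centered at the origin in the z = 0 plane, the eye sits at a positive z distance in front of it, and the function name is hypothetical.

```python
def off_axis_frustum(eye, screen_width, screen_height, near):
    """Frustum bounds at the near plane for an eye that is off-center.

    eye: (x, y, z) position relative to the screen center, with z > 0
         measured perpendicular to the screen.
    Returns (left, right, bottom, top) as used by a glFrustum-style
    projection.
    """
    eye_x, eye_y, eye_z = eye
    if eye_z <= 0:
        raise ValueError("eye must be in front of the screen (z > 0)")

    # Similar triangles: scale screen-edge offsets back to the near plane.
    scale = near / eye_z
    left = (-screen_width / 2.0 - eye_x) * scale
    right = (screen_width / 2.0 - eye_x) * scale
    bottom = (-screen_height / 2.0 - eye_y) * scale
    top = (screen_height / 2.0 - eye_y) * scale
    return left, right, bottom, top


# A centered eye gives a symmetric frustum; moving the eye to the right
# skews it so the image still lines up with the physical screen.
centered = off_axis_frustum((0.0, 0.0, 10.0), 4.0, 2.0, 1.0)
shifted = off_axis_frustum((1.0, 0.0, 10.0), 4.0, 2.0, 1.0)
```

For stereoscopic imagery the same calculation would be run twice per frame, once for each eye position.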

A great deal of attention was paid to matching the physical sets in the ride to the imagery projected onto the screens so that the lines were blurred between the real world and the virtual, projected world. In fact, many of the sets adjacent to the screens were dressed with CG textures that originated from our virtual sets and were scanned onto eight-foot-wide canvas murals so that imagery and sight lines would match up and blend the two worlds into one.

From a design standpoint, our goal was to take the audience into a comic book world that combines the hard key-lighting and saturated-color style of comic art with enough textural detail to feel like a real place. It was a balancing act between stylization and realism that resulted in a unique and exciting environment in which to stage the epic struggle between Spider-Man and a gaggle of super-villains led by Dr. Octopus, one that swirls around the audience whom Spider-Man must protect.

We tested and abandoned motion capture for the project based on the fact that the superhuman performances of the Marvel characters could be better realized by talented animators using keyframe techniques rather than by animators trying to extend the physical range of motion-captured athletes.

Our first totally original synthespian project was made possible by Busch Entertainment, who gave us virtual “carte blanche” to design a ride from the ground up for a new area at Busch Gardens in Williamsburg, VA. With only one word of direction, “Ireland,” from the client, we wrote a story called Corkscrew Hill that would exploit the physical parameters available: two 60-person Reflectone motion bases in two identical warehouse spaces.

The Corkscrew Hill computer-animated stereoscopic epic ride experience takes audiences on an adventure to Old Ireland, populated with humans and mythical creatures. In the pre-show, the audience shrinks to fit in a magic box. Then they enter a motion base and are strapped into their seats for the main show: one continuous-point-of-view shot from the box as characters carry it on a wild adventure on Corkscrew Hill. SensAble's FreeForm System was used to sculpt character heads. Pieces of character models were joined with Paraform software. Maya was used for modeling, animation, and rendering. Large-format digital projection was engineered by Electrosonic.

We specified very large reflective screens and an open cockpit design for the attraction and, working with the audio-visual engineers at Electrosonic in Burbank, CA, we came up with a digital projection system that would give us film resolution on a large screen despite the fact that digital projectors were not then up to the task.

By rotating four Barco DLP projectors 90 degrees and edge-blending down the middle (using two projectors for each image), we could get stereoscopic image pairs onto the 30 x 40 foot screens at 2048 horizontal by 1280 vertical resolution. Since the brain fuses the left and right images into a single mental image, any image artifacts from the projection were lost in the mental blending process, resulting in excellent stereoscopic imagery. We choreographed a camera move that takes us on an adventure through ancient Ireland, encountering Irish townspeople, a magic flying horse, banshees, a troll, a witch and a griffin.

This eight-minute attraction allowed us to create a completely synthetic world and populate it with mythical creatures and characters with a visual style akin to that of a storybook. Again, we opted for keyframed character animation instead of motion capture, which often seems pedestrian when applied to CG characters. When keyframing, an animator enters into and becomes the character, breathing original life into it that cannot be obtained through motion capture, which is in effect the three-dimensional “xeroxing” of a physical performance.

In the same way that a caricature of a person looks more like the subject than would a tracing of a photograph, or a good sculpture of a person looks more like them than a life cast, a stylized CG character created by a talented keyframe animator looks more believable and lifelike than one created with motion capture and CyberScans.

The limits of photorealism

Looking to the future, one must examine one’s goals in creating CG life forms. There are those who hold up photorealism as the ultimate goal: to create a synthespian indistinguishable from a live actor. This idea has intrigued, taunted and tormented programmers for 30 years, going back to the films Westworld in 1973 and Looker in 1981. The broad base of development required to accomplish this feat has been gaining momentum at an exponential rate as more applications, competition and funding enter the arena.

There exists a trade-off between what level of realism is possible versus how much computing time can be spent on each frame. We supplied very efficient body databases to Ray Kurzweil’s Ramona project, which was presented at the Technology Entertainment Design (TED) conference in February 2001. This real-time performance took advantage of recent developments in hardware rendering that allowed a fairly sophisticated human figure to be rendered and displayed at 30 frames per second. Through the use of real-time motion capture and voice synthesis, Ray was able to inhabit his female alter ego, Ramona.

The performance was designed within the limitations of the technology, in that the “camera” did not venture too close to Ramona’s face, where the “efficiency” of our data would become a liability in terms of image quality. As the camera approaches the subject, the resolution requirements skyrocket, and to render a photorealistic close-up on film requires orders of magnitude more calculation than can be supported by real-time rendering engines.

As Ray points out, computing speeds are increasing at exponential rates, but current technology still gets slammed to the mat when it is applied to creating a synthespian who appears real in every detail. The problem is that we spend so much of our time studying the nuances of facial expression in our colleagues, friends and family that we have become quite expert at spotting flaws. There are many subtle details in a real face, including how the complex muscle system perturbs the skin surface, how light scatters inside the skin, and how surface pores, blemishes and other minute details look and react to light.

A spectacular amount of money was poured into solving these problems in the all-CG feature film Final Fantasy: the Spirits Within and the results did not pay off at the box office. Many projects have been proposed that would use CG characters to bring deceased actors to the screen for a posthumous encore, but the technology is not yet ready for this task, and many of us cringe at the prospect of this sort of application. The recent release of S1m0ne reiterates the basic problem these sorts of projects face: we can get about 90 percent of the way to photorealism in CG actors, but the last ten percent is extremely expensive and time-consuming in comparison to photographing real actors.

In Final Fantasy, the hair, skin, cloth dynamics and lighting are all in that 95% range that just doesn’t make it to photoreality, except in stills. In motion, the illusion lacks the subtlety of micro-motion and micro-detail of live action photography, and the results are unsettling and distracting from the storyline.

In S1m0ne, the producers opted to use a live actor who was digitally altered to be just slightly idealized through image processing. Coupled with a few shots of a CyberScanned 3-D model being revealed like an orange peel wipe-on, the processed footage told the story adequately and carried the story point of a believable CG human cost-effectively.

To have used a CG character throughout would have been many times more costly and it is unlikely that the audience would believe that the CG character could be mistaken for a real actor. Albeit a valiant attempt, the film presumes a mythical world where Hollywood producers and the general public have no knowledge of the history and progress of visual effects, computer animation and digital compositing.

Animator as actor

The exciting areas to be explored are those where the animator becomes analogous to the actor. With animators using a robust set of tools and techniques, performances of the quality and richness currently created by the finest actors will be made possible in a medium free of the constraints of live-action photography. These will be characters, roles, and plots that exploit our intimate familiarity with the human form and its subtleties, but don’t attempt to recreate photorealistic renderings of it.

When painters developed the skills to recreate realistic images, a golden era of realism followed. But when photography came along and replaced the role of the painter as visual documentarian, painters responded with expressionism and abstraction, modes of image-making only possible in the era of post-realism.

In the same way, after the CG industry is able to reproduce reality in its most intricate detail, the next step will be to build upon that foundation a new and exciting future of non-realistic style. But rather than being limited to the confines of a painted canvas or a physical sculpture, the realm of imagination becomes the only outer limit.

Beyond the capability to achieve photorealism, there is a much more compelling goal of creating entertainment that takes place beyond what and where we can photograph. The writer and director are now the creative overlords, equipped with unlimited theatrical possibilities in terms of locations, characters, storylines and visual style. The entire world of science fiction and fantasy-based literature can be shot “on location” without limitation. New stories heretofore inconceivable will be created and brought to the world of the visual arts and entertainment.

In this work, the emphasis will be on the concept. The writer and director stand at the door of a new space that has thus far been explored by precious few, and marvel at the possibilities.

In Search of the Obelisk Computer Animation by Kleiser-Walczak. Produced by The Trumbull Company for Circus Circus Enterprises, Inc.
