How a Tracking Algorithm Helped Insert Tom Hanks into the Historical Newsreel Scenes of ‘Forrest Gump’

A look back at the early days of tracking and matching at ILM for the film.

Robert Zemeckis’ Forrest Gump (1994) might be the most famous “invisible effects” film in history, thanks to Industrial Light & Magic’s efforts to create a series of scenes in which Gump (Tom Hanks) touches on key moments in American history. Among other things, Gump meets famous presidents and musicians – scenes often achieved by seamlessly inserting Hanks into old newsreel footage.

ILM pioneered new ways of creating visual effects on the film, particularly in the scanning of footage in and out of the computer, as well as through the use of CG animation and digital compositing. A particular challenge the studio had to deal with on Forrest Gump was the often jittery news footage that Hanks would be inserted into – partly because it tended to be old 16mm footage, and also because the camera was constantly moving or subject to regular zooms in and out.

That’s where JP Lewis stepped in. He had recently joined ILM after time at the famed NYIT Computer Graphics Lab (CGL), and one of his early tasks at the visual effects studio was to implement an efficient and reliable tracking algorithm in ILM’s existing toolsets. Lewis later worked at other studios such as ESC, Wētā FX and Disney, and at companies such as Google and NVIDIA.

Here, on the 30th anniversary of Forrest Gump, befores & afters asked Lewis specifically about his work on the tracking algorithm. The film, of course, won the Academy Award for Best Visual Effects (awarded to Ken Ralston, George Murphy, Stephen Rosenbaum and Allen Hall).

b&a: How did you end up working at ILM in 1993? What did you study or work on right before this?

JP Lewis: I had started my career at NYIT CGL, an early graphics lab that had a lot of pioneers (Jim Blinn, Ed Catmull, Alvy Ray Smith, Jim Clark, Fred Parke), although most had left before I joined. Lance Williams and Pat Hanrahan were still there, but left shortly after they realized I had joined.

b&a: Tell me about the tracking algorithm you worked on for Forrest Gump? What needed to be “solved” at the time and how had the VFX studios done it up until then?

JP Lewis: ILM was developing iComp, an in-house compositing program, and as part of it a cross-correlation template matching algorithm was added (I think implemented by Jeff Yost or Brian Knep). The problem was that it took many hours just to track a single point in a single shot. This was on expensive (and slow – about 50 MHz) SGI machines, and ILM didn’t have many of them, so a more efficient solution was needed.

Cross-correlation can be implemented in the Fourier domain, with a growing – and sometimes dramatic – relative speedup when tracking larger regions. However, Fourier-domain convolution does not provide the normalized form of cross-correlation needed to make tracking useful. I realized that a precomputed table of running sums (previously used in graphics in Frank Crow’s summed-area table work) could be used to efficiently add the necessary normalization to the Fourier-domain approach.
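The idea can be sketched in a few lines of NumPy. This is not ILM’s code – just a minimal illustration of the description above: the numerator of normalized cross-correlation is an FFT correlation of the image against the zero-mean template, while the windowed image mean and variance in the denominator come from precomputed running-sum (integral) tables rather than per-window loops.

```python
import numpy as np

def integral_table(x):
    # running-sum ("summed-area") table, zero-padded so any
    # rectangle sum is four lookups
    s = np.cumsum(np.cumsum(x, axis=0), axis=1)
    return np.pad(s, ((1, 0), (1, 0)))

def window_sums(table, th, tw):
    # sum of every (th x tw) window, one entry per valid placement
    return (table[th:, tw:] - table[:-th, tw:]
            - table[th:, :-tw] + table[:-th, :-tw])

def fast_ncc(image, template):
    th, tw = template.shape
    n = th * tw
    tz = template - template.mean()       # zero-mean template
    tnorm = np.sqrt((tz ** 2).sum())
    # numerator: cross-correlation via FFT (convolve with the
    # flipped zero-mean template, then keep the valid region)
    F = np.fft.rfft2(image)
    T = np.fft.rfft2(tz[::-1, ::-1], s=image.shape)
    num = np.fft.irfft2(F * T, s=image.shape)[th - 1:, tw - 1:]
    # denominator: windowed image variance from running-sum tables
    s1 = window_sums(integral_table(image), th, tw)
    s2 = window_sums(integral_table(image ** 2), th, tw)
    var = np.maximum(s2 - s1 ** 2 / n, 0.0)
    denom = np.sqrt(var) * tnorm
    out = np.zeros_like(num)
    np.divide(num, denom, out=out, where=denom > 1e-12)
    return out  # values in [-1, 1]; 1.0 at a perfect match

# demo: locate a patch cut out of a synthetic "frame"
rng = np.random.default_rng(0)
frame = rng.random((40, 50))
patch = frame[10:18, 20:30].copy()
surface = fast_ncc(frame, patch)
peak = np.unravel_index(np.argmax(surface), surface.shape)
```

Because the sum tables are built once per frame and the FFT cost does not depend on template size, the per-window work no longer grows with the size of the tracked region – which is exactly what matters when large regions must be tracked.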

Original footage.
Tom Hanks blue screen.
JFK rotoscoped.
New oval desk top.
Final comp.

b&a: What tools were available and used for this job at the time, and how did you implement the algorithm into the workflow at ILM?

JP Lewis: The algorithm was originally implemented in Repo, a matchmove program written by John Horn (lead author) and myself. We also added a limited ability to track rotation, scaling, and corner-based perspective homography from four tracked points.
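Repo’s source isn’t public, but the four-point step Lewis describes is the classic direct linear transform (DLT): four tracked corners give eight equations, enough for the eight degrees of freedom of a 3×3 perspective homography. A minimal NumPy sketch of that standard technique (my own illustration, not Repo’s code):

```python
import numpy as np

def homography_from_points(src, dst):
    # src, dst: (4, 2) arrays of corresponding corner positions.
    # Each correspondence (x, y) -> (u, v) gives two rows of A h = 0.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # the null vector of A (last right-singular vector) is h
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]                 # fix the overall scale

def apply_homography(H, pts):
    # map (N, 2) points through H in homogeneous coordinates
    q = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return q[:, :2] / q[:, 2:3]
```

With the four corner positions updated by the tracker on each frame, the recovered H is what lets a flat insert follow perspective changes in the plate.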

b&a: Can you remember specific shots on Gump that your algorithm was used for? What made those shots particularly difficult to track or deal with?

JP Lewis: It was used on a number of shots in Forrest Gump. Shots where Tom Hanks was inserted into news scenes with historical figures (Martin Luther King, Kennedy, Nixon, etc.) were challenging because they had blur caused by reporters moving their hand-held cameras to find better views. The same motion blur had to be applied to Tom Hanks to make him sit in the scene convincingly. The motion blur made it necessary to track relatively large regions, and handling rotation and scale was sometimes needed.


Final comp.

b&a: How was your research packaged and used for subsequent projects at ILM, or in any software tools, after Gump?

JP Lewis: After leaving ILM, I reimplemented it for Commotion, the roto tool developed by Scott Squires, where it was actually fast enough to track in real time in some cases. At ESC, the algorithm was the 2D tracking component of the Labrador matchmove tool (developed by Dan Piponi and Doug Moore), used on The Matrix sequels. I heard that ILM was still using it in the early 2010s as the basis for some higher-level tracking techniques.

While at ILM we also published the algorithm. The paper and an expanded technical report have received approximately 3,000 citations, including in such far-flung areas as medical imaging and astronomy. It has been implemented independently in software such as MATLAB and one of the Python image processing libraries. The work also introduced the running-sum table trick to computer vision, where it later became known as “integral images”.

We had a later version that combined the correlation surfaces at multiple scales, thus disambiguating spurious matches. It worked remarkably well in the one case it was used, but I never had time to follow it up.

algorithm paper link: http://scribblethink.org/Work/nvisionInterface/nip.pdf

commotion: https://www.toolfarm.com/news/toolfarm-throwback-remember-commotion/

b&a: Did you keep up your interest in tracking in particular after this work? What would you say about the development of tools in this area that you’ve seen since the early 90s?

JP Lewis: There has been some very promising neural tracking work in the last year or two, and other work that can do “semantic” correspondences (e.g. Tang et al., ‘Emergent Correspondence from Image Diffusion’) could have some interesting and unusual uses in tracking, I think.

Perhaps the main current limitation of neural techniques is simply resolution – in computer vision research, 512×512 is sometimes seen as “high resolution” – so a practical solution at this point might use neural tracking as an initialization or constraint combined with a more accurate (but less robust) traditional method. I think the ultimate approach to tracking will be not to treat it as a problem separate from understanding the scene as a whole. A system that understands 3D and even object categories can provide better high-level disambiguation for tracking, and, vice versa, tracking can inform that understanding. Advances in AI suggest that this will likely be achievable.

