Google AI introduces Frame Interpolation for Large Motion (FILM): A new neural network architecture for creating high-quality slow-motion videos from nearly duplicate photos


A growing body of research focuses on frame interpolation: synthesizing intermediate images between a pair of input frames. This temporal upsampling can increase a video's frame rate or create slow-motion footage.

A new application has recently emerged. Because digital photography makes it so easy to capture several images in a matter of seconds, people often snap multiple shots in quick succession to find the best one. Interpolating between these “near-duplicates” reveals scene motion (and some camera motion) that often conveys a more appealing sense of the moment than any of the original photos. However, traditional interpolation approaches struggle with such still photos: the time gap between near-duplicates can be a second or more, with correspondingly large scene motion.

Recent approaches have shown promising results on the challenging problem of frame interpolation between consecutive video frames, which typically exhibit small motion. Interpolation across the large scene motion typical of near-duplicates, however, has received little attention. Prior work that attempted to solve the large-motion problem by training on a dataset of extreme movements delivered disappointing performance on small-motion benchmarks.

A recent study by Google and the University of Washington proposes Frame Interpolation for Large Motion (FILM), an algorithm for interpolating frames separated by large motion, with a focus on near-duplicate photos. FILM is a straightforward, unified, single-stage model that can be trained solely on regular frames, with no need for optical-flow or depth networks or their scarce pre-training data. It pairs a multi-scale, shared-weight feature extractor (a “feature pyramid” that distributes importance across scales) with a “scale-agnostic” bidirectional motion estimator that learns from frames with ordinary motion yet still generalizes well to frames with large motion.

Based on the assumption that large motion at finer scales should look like small motion at coarser scales, the method increases the number of pixels (since finer scales mean higher resolution) available to supervise large motion.
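The scale-agnostic intuition can be illustrated with a simple image pyramid: applying the same (shared-weight) extractor at every level means a large displacement at the finest scale appears as a small displacement a few levels up. The sketch below is a minimal NumPy illustration of that idea, not FILM's actual implementation; the function names are illustrative.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (one pyramid level)."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    """Build an image pyramid. In a shared-weight design, the same feature
    extractor runs at every level, so a displacement of d pixels at the
    finest level looks like d / 2**k pixels at level k."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

# A 32-pixel motion at the finest level shrinks to a 4-pixel motion
# three levels up -- the regime a small-motion estimator already handles.
finest_displacement = 32
coarse_displacement = finest_displacement / 2 ** 3
```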

The researchers found that even when state-of-the-art algorithms score well on benchmarks, their interpolated frames often look blurry, especially in the large disoccluded regions created by large camera movements. To address this problem, they optimize their model with a Gram matrix loss, which matches the autocorrelation of high-level VGG features, yielding notable improvements in image sharpness and realism.

A significant limitation of modern interpolation techniques is their training complexity: they rely on additional optical-flow, depth, or other prior networks that must be pre-trained on scarce data. That scarcity is particularly problematic for large motion. This study therefore also contributes a unified frame-interpolation architecture that can be trained using only regular frame triplets, greatly simplifying the training procedure.
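Training on frame triplets means the middle frame of each (frame0, frame1, frame2) triplet serves as free ground truth for the interpolation of its neighbors. The sketch below illustrates only that supervision pattern, with a trivial averaging stand-in where the real model would predict the middle frame; all names are illustrative.

```python
import numpy as np

def interpolate_midframe(frame0, frame2):
    """Stand-in interpolator: a real model predicts the middle frame from
    the two outer frames; a plain average is used here purely to
    illustrate the shape of the supervision."""
    return 0.5 * (frame0 + frame2)

def triplet_loss(frame0, frame1_gt, frame2):
    """Supervise the predicted middle frame against the held-out ground
    truth middle frame (L1 reconstruction loss)."""
    pred = interpolate_midframe(frame0, frame2)
    return np.abs(pred - frame1_gt).mean()
```

No flow or depth labels appear anywhere in this loop, which is the point: any video can be cut into triplets and used as training data directly.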

Extensive experimental results demonstrate that FILM delivers high-quality, temporally smooth video and outperforms competing approaches for both large and small movements.

This article is a research summary written by Marktechpost staff based on the research paper 'FILM: Frame Interpolation for Large Motion'. All credit for this research goes to the researchers on this project. Check out the paper, project page, GitHub link, and reference article.



Asif Razzaq is an AI journalist and co-founder of Marktechpost, LLC. He is a visionary, entrepreneur and engineer who strives to harness the power of artificial intelligence for good.

Asif’s latest project is the development of an artificial intelligence media platform (Marktechpost) that will revolutionize how people find relevant news related to artificial intelligence, data science and machine learning.

Asif was featured by Onalytica in “Who’s Who in AI? (Influential Voices & Brands)” as one of the “Influential Journalists in AI” (https://onalytica.com/wp-content/uploads/2021/09/Whos-Who-In-AI.pdf). His interview was also published by Onalytica (https://onalytica.com/blog/posts/interview-with-asif-razzaq/).

