Artificial intelligence (AI) is proving very adept at certain tasks, such as inventing faces that don’t really exist or winning poker games.
Once people know what a cat is, they can easily imagine a cat of a different color, a cat in a different pose, or a cat in a different environment. This is much more difficult for AI networks, even though they can recognize a cat when they see one (given enough training).
In an attempt to unlock the imagination of AI, researchers have developed a new method that artificial intelligence systems can use to figure out what an object should look like, even if they have never seen one quite like it.
“We were inspired by human visual generalization capabilities to try to simulate human imagination in machines,” says Yunhao Ge, a computer scientist at the University of Southern California (USC).
“People can separate the knowledge they have learned by attributes – for example shape, pose, position, color – and then recombine them to imagine a new object. Our paper attempts to simulate this process using neural networks.”
The key is extrapolation – being able to take a large bank of training data (like pictures of a car) and then go beyond what has been seen to what has not. This is difficult for AI because it is usually trained to recognize specific patterns rather than broader attributes.
What the team has developed here is known as controllable disentangled representation learning. It uses an approach similar to the one behind deepfakes: disentangling different parts of a sample (for example, separating facial movement from facial identity in a deepfake video).
This means that when an AI sees a red car and a blue bike, it can “imagine” a red bike – even if it has never seen one. The researchers have put this together in a framework they call Group Supervised Learning.
One of the main innovations in this technique is to process samples in groups rather than individually and to establish semantic connections between them. The AI is then able to identify similarities and differences in the samples it sees and use that knowledge to produce something completely new.
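To make the recombination idea concrete, here is a toy sketch in Python. It is not the researchers' code: the `Latent` class, its three attribute slots, and the `swap_attribute` helper are hypothetical names invented for this illustration, and the attributes are plain strings, whereas a real system would learn numeric codes with a neural encoder and turn them back into images with a decoder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Latent:
    """Hypothetical disentangled code with one slot per attribute.

    Assumption: a real model would store learned vectors here, not strings.
    """
    shape: str
    color: str
    pose: str


def swap_attribute(a: Latent, b: Latent, attribute: str) -> tuple[Latent, Latent]:
    """Return copies of a and b with the named attribute exchanged."""
    new_a = replace(a, **{attribute: getattr(b, attribute)})
    new_b = replace(b, **{attribute: getattr(a, attribute)})
    return new_a, new_b


# Two samples the system has actually "seen".
red_car = Latent(shape="car", color="red", pose="side")
blue_bike = Latent(shape="bike", color="blue", pose="front")

# Exchanging only the color slot yields two combinations never seen before.
blue_car, red_bike = swap_attribute(red_car, blue_bike, "color")
print(red_bike)
```

Because each attribute lives in its own slot, changing the color leaves shape and pose untouched, which is the property that disentanglement is meant to provide.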
“This new disentanglement approach, for the first time, truly unleashes a new sense of imagination in AI systems and brings them closer to the human understanding of the world,” says USC computer scientist Laurent Itti.
These ideas aren’t entirely new, but here the researchers have refined the concepts and made the approach more flexible and compatible with additional data types. They also made the framework open source so that other scientists could use it more easily.
In the future, the system developed here could protect against AI bias by removing more sensitive attributes from the equation – for example, helping to create neural networks that are not racist or sexist.
The same approach could also be applied in medicine and self-driving cars, say the researchers, where AI could “imagine” new drugs or visualize new road scenarios that it has not been specifically trained on.
“Deep learning has already demonstrated unmatched performance and promise in many areas, but all too often it has done so through superficial imitation and without a deeper understanding of the individual attributes that make each object unique,” says Itti.
The research was presented at the International Conference on Learning Representations 2021.