Google’s reveal of Gemini, an AI model built to close the gap between the search giant and OpenAI, made a great first impression. Strong benchmarks, a flashy video demo, and immediate availability (albeit for a cut-back version) signaled confidence.
But the positivity soured as AI engineers and enthusiasts picked through the details and found flaws. Gemini is an impressive entry that may eventually erode GPT-4’s dominance, but Google’s slippery messaging has left it playing defense.
“There’s more questions than there are answers,” says Emma Matthies, lead AI engineer at a major North American retailer, who was speaking for herself, not her employer. “I did find there to be a discontinuity between the way [Google’s Gemini video demo] was shown and details that are actually in Google’s tech blog.”
Google’s troubled demo
Google’s Gemini demo drew criticism as AI developers noticed inconsistencies. [Google]
The demo in question, titled “Hands-on with Gemini,” launched on YouTube alongside Gemini’s reveal. It’s fast-paced, friendly, fun, and packed with easy-to-understand visual examples. It also exaggerates how Gemini works.
A Google representative says the demo “shows real prompts and outputs from Gemini.” But the video’s editing leaves out some details. The exchange with Gemini occurred over text, not voice, and the visual problems the AI solved were input as images, not a live video feed. Google’s blog also describes prompts not shown in the demo. When Gemini was asked to identify a game of rock, paper, scissors based on hand gestures, it was given the hint “it’s a game.” The demo omits that hint.
And that’s just the start of Google’s problems. AI developers quickly realized Gemini’s capabilities were less revolutionary than they initially appeared.
“If you look at the capabilities of GPT-4 Vision, and you build the right interface for it, it’s similar to Gemini,” says Matthies. “I’ve done things like this as side projects, and there’s experiments on social media like this as well, such as the ‘David Attenborough is narrating my life’ video, which was extremely funny.”
GPT-4 Vision can interpret images in ways similar to Google’s Gemini demo. [Replicate]
On 11 December, just five days after Gemini’s reveal, an AI developer named Greg Sadetsky produced a rough recreation of the Gemini demo with GPT-4 Vision. He followed up with a head-to-head comparison between Gemini and GPT-4 Vision, which didn’t go Google’s way.
Google is taking flak for its benchmark data, too. Gemini Ultra, the largest of the three models in the family, reportedly beats GPT-4 in a variety of benchmarks. This is broadly true, but the quoted figures are selected to paint Gemini in the best light.
Google used different methodologies from others for measuring performance. The way a user prompts an AI…
The post “Gemini Is Google’s Best AI Model Yet, But Who Cares?” by Matthew S. Smith was published on 12/18/2023 by spectrum.ieee.org