Google has consistently promised that its Gemini AI model would be better than OpenAI’s GPT-4, the model that powers ChatGPT Plus. Now that Google Gemini has launched, we can finally put it to the test and see how Gemini compares to GPT-4.
When Google launched Bard in March 2023, there were many reasons to be excited. Finally, OpenAI’s ChatGPT monopoly would be broken, and we’d get worthy competition.
But Bard was never the AI titan people hoped for, and GPT-4 remains the dominant generative AI chatbot platform. Now, Google's Gemini is here—but is the long-awaited AI model better than ChatGPT?
Gemini is Google’s most capable generative AI model, able to understand and operate across different data formats, including text, audio, image, and video. It is Google’s attempt to create a unified AI model drawing on its strongest AI technologies. Gemini will be available in three variants: Ultra, Pro, and Nano.
On its official blog, The Keyword, Google claims Gemini Ultra outperforms the state of the art, beating the industry-leading GPT-4 on several key benchmarks.
With an unprecedented 90.0% score on the rigorous MMLU benchmark, Google says Gemini Ultra is the first model to surpass human-level performance on this multifaceted test spanning 57 subjects.
Gemini Ultra can also understand, explain, and generate high-quality code in some of the world’s most popular programming languages, including Go, JavaScript, Python, Java, and C++. On paper, these are all great results. But these are all benchmarks, and benchmarks do not always tell the whole story. So, how well does Gemini perform in real-world tasks?
Of the three variants of the Gemini AI model, you can start using Gemini Pro right now. Gemini Pro is currently available on Google’s Bard chatbot. To use Gemini Pro with Bard, head to bard.google.com and sign in with your Google account.
Google says that Gemini Ultra will roll out in January 2024, so we’ve had to settle for testing Gemini Pro against ChatGPT for now.
When any new AI model launches, it is tested against OpenAI’s GPT models, which are generally accepted as the state of the art against which other models are measured. So, using Bard and ChatGPT, we tested Gemini’s abilities in math, creative writing, code generation, and accurately processing image inputs.
Starting with the easiest math question we could think of, we asked both chatbots to solve: -1 x -1 x -1.
Bard went first. We asked the question twice, and both attempts came back with wrong answers. Bard did get it right on the third attempt, but that doesn’t count.
We then tried ChatGPT running GPT-3.5, which got it right on the first try.
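For reference, the arithmetic itself is unambiguous: multiplying an odd number of negative factors yields a negative result, which any interpreter will confirm in one line. A quick sketch in Python:

```python
# An odd count of negative factors gives a negative product:
# -1 * -1 = 1, then 1 * -1 = -1.
result = -1 * -1 * -1
print(result)  # -1
```

This is the answer both chatbots should have produced immediately.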
To test Gemini’s image interpretation abilities, we tasked it with interpreting some popular memes. It declined, saying it can’t interpret images with people in them. ChatGPT, running GPT-4V, was willing and able to do so flawlessly.
We made another attempt at image interpretation while also testing problem-solving and coding ability. We gave Bard, running Gemini Pro, a screenshot and asked it to interpret the image and write HTML and CSS code to replicate it.
Here’s the source screenshot.
Below is Gemini Pro’s attempt to interpret and replicate the screenshot using HTML and CSS.
And here’s GPT-4’s attempt at replicating the screenshot. The result is not surprising, considering GPT-4 has historically been strong at coding. We’ve previously demonstrated using GPT-4 to build a web app from scratch.
We asked Gemini Pro to create a poem about Tesla (the electric vehicle brand). It showed a marginal improvement over our previous tests. Here’s the result:
At this point, we thought comparing the results against GPT-3.5 rather than the supercharged GPT-4 would be more appropriate. So, we asked ChatGPT running GPT-3.5 to create a similar poem.
It may come down to personal taste, but Gemini Pro’s take on this seems better. We’ll let you be the judge.
Before Google launched Bard, we thought it’d be the ChatGPT competition we had been waiting for—it wasn’t. Now, Gemini is here, and so far, Gemini Pro doesn’t seem like the model to give ChatGPT the knockout punch.
Google says Gemini Ultra will be much better, and we hope it meets or exceeds the claims made in the Gemini Ultra announcement. But until we can test the best version of Google’s generative AI model, we won’t know whether it can unseat its competitors. As it stands, GPT-4 remains the undisputed AI model champion.