Google introduces Gemini 2.0 with autonomous tool connectivity

Google is embracing “agentic experiences” in the rollout of Gemini 2.0, its new flagship family of generative AI that is expected to compete with ChatGPT with OpenAI o1, GitHub Copilot and Amazon Nova.

The tech giant released the first model, Gemini 2.0 Flash, on December 11 to global developers through the Gemini API in Google AI Studio and Vertex AI. Consumers can expect Gemini 2.0 to affect Google Search and AI reviews, with limited testing starting next week. A public rollout is set for early 2025.

Through Gemini 2.0, developers have access to multimodal input and text output, while early access partners can test text-to-speech and native image generation. The Gemini app will be updated “soon” with Gemini 2.0 Flash Google said in a press release.

General availability, and additional model sizes such as the base model Gemini 2.0, are expected to follow in January.

What is Gemini 2.0?

Gemini 2.0 is a multimodal generative AI model running on Google’s Trillium hardware. It is designed to make online tasks easier and more intuitive by helping with summarizing information, performing web searches, and even more naturally interacting with tools or applications.

Google noted that Gemini 2.0 Flash is twice as fast as its predecessor, 1.5 Pro, and outperforms it in AI performance benchmarks such as MMLU-PRO and LiveCodeBench.

“If Gemini 1.0 was about organizing and understanding information, Gemini 2.0 is about making it much more useful,” Google CEO Sundar Pichai said in a statement.

What sets Gemini 2.0 apart is its agency capacity. Pichai described these capabilities as allowing the model to “understand more about the world around you, think multiple steps ahead and act on your behalf, with your oversight.”

Google further emphasized that Gemini 2.0 differentiates itself by:

  • The multimodal processing.
  • Ability to understand long books or wide sections of the web.
  • Function calling.
  • “Using Native Tools.”
  • “Complex instruction following and planning.”

Native tool usage allows the AI ​​to incorporate tools like Google Search and code execution to perform autonomous actions. In practical terms, it sometimes resembles Google’s Project Astra – an Android app currently in testing that uses the phone’s camera and Gemini’s reasoning to answer questions about the world in real time. Project Astra can analyze up to 10 minutes of video at a time.

Google also announces additional projects, prototypes

Project Mariner

Another proof of concept is Project Mariner, an experimental Chrome extension that showcases Google’s effort to enable Gemini to read browser screens. Users can ask it to summarize web pages or make a purchase.

“It is still early days, but Project Mariner shows that it is becoming technically possible to navigate within a browser, even if today it is not always accurate and slow to complete tasks, which will improve rapidly over time,” Demis Hassabis, CEO of Google DeepMind and Koray Kavukcuoglu, CTO of Google DeepMind, wrote in the press release.

SEE: Google also unveiled specialized image and video generation AI models in early December.

Deep research

Deep Research, available with a Gemini Advanced subscription, is an experimental model connected to the web. It is designed to create research plans and outlines for graduate students, scientists or entrepreneurs. The tool searches the web for the topic of your choice, presents a research plan to approve or modify, and then analyzes the existing work.

Jules developer assistant

Google also announced a new developer tool called Jules, a coding assistant powered by Gemini 2.0 Flash. Jules sits within GitHub and can write code, fix bugs, and create and execute multi-step plans. Jules is available today to a limited pool of testers. Google expects expanded availability in early 2025.

Google prepares for cyber threats

Google also noted that it is aware Project Mariner, in particular, could be a rich hunting ground for rapid injection attacks. The company said it is working to set up safeguards against phishing and fraud attempts where attackers can sneak AI instructions into emails, websites or documents.

================
BSB UNIVERSITY – AI – IT SOLUTIONS

AISKILLSOURCE.COM


Leave a Comment

Your email address will not be published. Required fields are marked *

Shopping Cart
Scroll to Top