Google DeepMind introduces Gemini 2.5 Computer Use model

08-10-2025

On 7 October 2025, Google DeepMind introduced the Gemini 2.5 Computer Use model. In this article we explain what it is and its impact.

Written by:

Jorick van Weelie


The AI that controls your screen

On 7 October 2025, Google DeepMind released the Gemini 2.5 Computer Use model, a step forward in AI systems that can interact directly with computer interfaces. Building on Gemini 2.5 Pro’s visual reasoning, this model powers AI agents capable of navigating screens, clicking buttons, typing text, and scrolling, effectively mirroring human-computer interaction without relying on traditional APIs.

What makes Gemini 2.5 Computer Use stand out?

The model operates in a loop: it receives a request, takes a screenshot, analyzes the interface, and decides on an action. Once that action has been carried out, the cycle repeats with a fresh screenshot until the task is complete. This enables it to fill out forms, manipulate dropdowns, operate behind login screens, and complete workflows that traditionally required human hands.
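The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google’s implementation or API: `FakeBrowser`, `scripted_model`, `Action`, and `run_agent` are hypothetical names invented for this example, the "screenshot" is a text summary rather than pixels, and the model call is replaced by a hard-coded rule set.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical shape)."""
    kind: str          # "type", "click", or "done"
    target: str = ""   # name of the UI element the action applies to
    text: str = ""     # text to type, if any


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: it only records state changes."""
    fields: Dict[str, str] = field(default_factory=dict)
    submitted: bool = False

    def screenshot(self) -> str:
        # A real agent would capture pixels; here it is a text summary.
        return f"fields={sorted(self.fields.items())} submitted={self.submitted}"

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_model(request: str, screenshot: str, history: List[Action]) -> Action:
    """Placeholder for the model call: picks the next action from the screen state."""
    if "name" not in screenshot:
        return Action("type", target="name", text="Ada")
    if "submitted=False" in screenshot:
        return Action("click", target="submit")
    return Action("done")


def run_agent(
    request: str,
    browser: FakeBrowser,
    model: Callable[[str, str, List[Action]], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, decide, act, and repeat until the model reports it is done."""
    history: List[Action] = []
    for _ in range(max_steps):           # hard cap so the loop always ends
        shot = browser.screenshot()      # 1. observe the interface
        action = model(request, shot, history)  # 2. decide on the next action
        if action.kind == "done":
            break
        browser.execute(action)          # 3. act, then loop with a new screenshot
        history.append(action)
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    for step in run_agent("Fill in the form and submit it", browser, scripted_model):
        print(step)
```

The step cap is a design choice worth keeping in any real version: an agent that never reaches its goal should stop instead of clicking indefinitely.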

Key capabilities

  • Web and mobile control: Automatically retrieves information, organizes content, books appointments, and transfers data between systems.
  • Enterprise uses: Deployed for UI testing and used to power Project Mariner, the Firebase Testing Agent, and AI Mode in Search. Early users report strong results for personal assistants and workflow automation.
  • Multi-step automation: Breaks complex tasks into smaller actions, adapting to unexpected changes during execution.

Performance

According to Google’s published benchmarks, the Gemini 2.5 Computer Use model outperforms competing models on web and mobile control tasks, combining low latency with high accuracy, especially in browser control. Third-party testing by Browserbase supports its reliability across diverse environments.

Competition

  • Claude 3.5 Sonnet: Strong task performance but vulnerable to prompt injection attacks and “blind goal-directedness,” pursuing actions without safety checks.
  • OpenAI’s computer-using agent: Powers the Operator service, with a 38% success rate on the OSWorld benchmark. Uses pixel-based perception and safety checks for sensitive actions.
  • Microsoft Copilot Studio: The “Use your computer” feature integrates with Windows, emphasizing security through explicit user permissions in controlled environments.

Benefits

  • Universal access: Works with any software or website a human can operate, no custom API needed.
  • Productivity gains: Automates data entry, form filling, and repetitive workflows, freeing staff for strategic work.
  • Accessibility: Helps individuals with physical limitations navigate digital systems.

Risks

  • Prompt injection: Malicious content can trigger unintended actions.
  • Privacy concerns: AI access to credentials and sensitive data poses significant risks.
  • Audit challenges: Hard to distinguish AI actions from human activity in logs.
  • Blind autonomy: Agents may pursue unsafe or impossible goals.
  • Adversarial manipulation: Vulnerable to visual or contextual tricks that mislead reasoning.

Safety measures

Google addresses risks with per-step safety checks, developer-defined refusal rules, and user confirmation for sensitive actions. The system avoids harmful activity such as bypassing CAPTCHAs or controlling medical devices. Most deployments include human-in-the-loop oversight, though research warns this isn’t foolproof. Sandboxed environments further minimize unintended consequences.
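To make the idea of per-step checks, refusal rules, and user confirmation concrete, here is a small Python sketch. It is an illustration only: the keyword lists, `check_step`, and `gate` are hypothetical and do not reflect how Google’s safety service works. Real systems would use far more than keyword matching.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"      # safe to run without asking
    CONFIRM = "confirm"  # sensitive: ask the user first
    REFUSE = "refuse"    # never run, regardless of user input


# Hypothetical developer-defined rules; keyword matching is deliberately crude.
REFUSE_KEYWORDS = ("captcha",)
CONFIRM_KEYWORDS = ("purchase", "payment", "delete")


def check_step(description: str) -> Decision:
    """Classify one proposed action before it is executed."""
    text = description.lower()
    if any(word in text for word in REFUSE_KEYWORDS):
        return Decision.REFUSE
    if any(word in text for word in CONFIRM_KEYWORDS):
        return Decision.CONFIRM
    return Decision.ALLOW


def gate(description: str, ask_user: Callable[[str], bool]) -> bool:
    """Return True only if the action may proceed."""
    decision = check_step(description)
    if decision is Decision.REFUSE:
        return False                   # refusal rules win over everything
    if decision is Decision.CONFIRM:
        return ask_user(description)   # human-in-the-loop for sensitive steps
    return True


if __name__ == "__main__":
    always_yes = lambda description: True
    for step in ("Scroll down", "Click the purchase button", "Solve the CAPTCHA"):
        print(step, "->", gate(step, always_yes))
```

The gate runs before every action in the agent loop, so a refusal rule cannot be bypassed by a user who approves everything.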

Future impact

Advanced computer use models like Gemini 2.5 could redefine human-computer interaction, accelerating workflows and enabling new automation possibilities. Yet security, privacy, and control remain critical challenges. Ongoing competition between AI leaders will push innovation, underscoring the urgency of industry-wide safety standards. For website owners, another question arises: how should they set up their websites for a future in which visitors are not necessarily humans, but bots? A follow-up question is what the future of CAPTCHAs will look like, since blocking automated visitors may soon mean missing out on purchases or leads that an AI agent would have submitted.

For more information, please visit Google’s official announcement of the Gemini 2.5 Computer Use model.