gemini-computer-use

Developer Tools

Gives your agent the ability to automate web browser tasks by controlling a browser through the Gemini Computer Use model and Playwright.

When to use

When you need an agent to navigate websites and perform UI actions
When you want to automate complex multi-step browser-based workflows
When you need to integrate safety confirmations for risky online operations

How to use

1. Provide the goal or prompt for the browser automation task
2. Specify the starting URL and any turn limits for the agent loop
3. Configure the browser channel or custom executable if required
4. Review and confirm any risky actions if safety confirmations are enabled
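
The inputs listed above can be pictured as one small settings object. The sketch below is illustrative only: `TaskConfig`, its field names, and `launch_kwargs` are hypothetical and are not the skill's actual options. The only outside names it relies on are `channel`, `executable_path`, and `headless`, which are keyword arguments accepted by Playwright's browser `launch()` call.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TaskConfig:
    """Hypothetical bundle of the inputs from the steps above (names are assumptions)."""
    goal: str                               # step 1: what the agent should accomplish
    start_url: str                          # step 2: where the browser session begins
    max_turns: int = 20                     # step 2: upper bound on agent-loop iterations
    browser_channel: Optional[str] = None   # step 3: e.g. "chrome" or "msedge"
    executable_path: Optional[str] = None   # step 3: path to a custom browser binary
    confirm_risky_actions: bool = True      # step 4: ask before risky actions

    def validate(self) -> None:
        """Reject obviously unusable settings before a browser is started."""
        if not self.goal.strip():
            raise ValueError("goal must not be empty")
        if not self.start_url.startswith(("http://", "https://")):
            raise ValueError("start_url must be an http(s) URL")
        if self.max_turns < 1:
            raise ValueError("max_turns must be at least 1")


def launch_kwargs(config: TaskConfig) -> Dict[str, Any]:
    """Build keyword arguments for a Playwright launch call from the config."""
    kwargs: Dict[str, Any] = {"headless": False}  # visible window, so progress can be watched
    if config.browser_channel:
        kwargs["channel"] = config.browser_channel
    if config.executable_path:
        kwargs["executable_path"] = config.executable_path
    return kwargs
```

With Playwright installed, the resulting dictionary could be passed along as `playwright.chromium.launch(**launch_kwargs(config))`. Keeping validation separate from launching means a bad URL or turn limit is caught before any browser window opens.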

Example conversation

You

Find the latest blog post title on example.com and summarize it.

Agent

I will launch a Playwright-controlled browser, navigate to example.com, and identify the latest blog post. I'll capture screenshots to analyze the UI, click the post link, and then extract and summarize the content for you. You can watch the progress through the agent loop as I navigate each step.
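
The agent loop mentioned in the reply can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's implementation: `Action`, `run_agent_loop`, and the four callables are hypothetical names. In a real setup the model call and the Playwright actions would sit behind `propose_action` and `execute`; here they are injected so the control flow (turn limit, confirmation of risky actions) can be read and tested without a browser.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """Hypothetical UI action proposed by the model."""
    name: str                                   # e.g. "navigate", "click", or "done"
    args: Dict[str, object] = field(default_factory=dict)
    risky: bool = False                         # assumed flag: needs user confirmation


def run_agent_loop(
    goal: str,
    max_turns: int,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes], Action],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
) -> List[str]:
    """Run screenshot -> propose -> (confirm) -> execute until done or out of turns."""
    log: List[str] = []
    for turn in range(1, max_turns + 1):
        screenshot = take_screenshot()              # current state of the page
        action = propose_action(goal, screenshot)   # model decides the next step
        if action.name == "done":
            log.append(f"turn {turn}: done")
            return log
        if action.risky and not confirm(action):
            # The user declined: skip this action and let the model try again.
            log.append(f"turn {turn}: skipped {action.name}")
            continue
        execute(action)
        log.append(f"turn {turn}: executed {action.name}")
    log.append("stopped: turn limit reached")
    return log
```

Passing a `confirm` callable that prompts the user corresponds to enabling safety confirmations; passing one that always returns `True` corresponds to disabling them.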

Related Skills

browser-use (Developer Tools)

Gives your agent the ability to automate browser interactions for web testing, form filling, data extraction, and authenticated browsing using local or cloud-based sessions.

use-dom (Developer Tools)

Gives your agent the ability to run web code in a webview on native platforms using Expo DOM components.

baoyu-danger-gemini-web (Developer Tools)

Gives your agent the ability to generate text and images via the Gemini Web API, supporting multi-turn conversations and image input.

ui-component-patterns (Developer Tools)

Gives your agent the ability to build reusable React components following modern patterns like composition and compound components.

internal-comms (Content & Writing)

Gives your agent the ability to write professional internal communications like status reports, newsletters, and leadership updates using established company formats.
