ChatGPT agent can become your personal assistant at home and work

ChatGPT can choose from a set of agentic skills to complete tasks using its own computer

Anupam Varma, Online Editor
ChatGPT will request permission before taking actions of consequence

ChatGPT will now be able to handle tasks for users using its own computer, such as checking the calendar and creating slideshows.

Beyond that, the AI will be able to navigate websites, filter results, prompt the user to log in securely when needed, run code, conduct analysis, and deliver editable slideshows and spreadsheets that summarize its findings.

How does ChatGPT do this?

“At the core of this new capability is a unified agentic system. It brings together three strengths of earlier breakthroughs: Operator’s ability to interact with websites, deep research’s skill in synthesizing information, and ChatGPT’s intelligence and conversational fluency,” the company said.

“ChatGPT carries out these tasks using its own virtual computer, fluidly shifting between reasoning and action to handle complex workflows from start to finish, all based on your instructions,” it added.

Is it safe?

According to OpenAI, the user is always in control. ChatGPT will request permission before taking actions of consequence, and the user can easily interrupt, take over the browser, or stop tasks at any point.
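For readers who want a concrete picture, the permission-and-interrupt behavior described above can be sketched in a few lines of Python. This is an illustrative toy only, not OpenAI's implementation: the names `Action`, `AgentRun`, `ask_user` and the `consequential` flag are all hypothetical and chosen just for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    name: str
    consequential: bool  # hypothetical flag, e.g. a purchase or sending an email


@dataclass
class AgentRun:
    # Hypothetical callback that asks the user to approve an action.
    ask_user: Callable[[Action], bool]
    stopped: bool = False
    log: list[str] = field(default_factory=list)

    def stop(self) -> None:
        """The user can halt the run at any point."""
        self.stopped = True

    def perform(self, action: Action) -> bool:
        """Run an action, asking for approval first if it is consequential."""
        if self.stopped:
            self.log.append(f"skipped {action.name}: run stopped")
            return False
        if action.consequential and not self.ask_user(action):
            self.log.append(f"declined {action.name}")
            return False
        self.log.append(f"did {action.name}")
        return True


# Example: the user approves everything except a payment, then stops the run.
run = AgentRun(ask_user=lambda action: action.name != "submit_payment")
run.perform(Action("search_flights", consequential=False))
run.perform(Action("submit_payment", consequential=True))
run.stop()
run.perform(Action("open_results", consequential=False))
print(run.log)
```

In this sketch, routine steps run without a prompt, consequential ones wait for the user's answer, and nothing runs after the user stops the task.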

How do you activate it?

Starting today, Pro, Plus, and Team users can activate ChatGPT’s new agentic capabilities directly through the tools dropdown from the composer by selecting ‘agent mode’ at any point in any conversation.

“While ChatGPT agent is already a powerful tool for handling complex tasks, today’s launch is just the beginning. We’ll continue to iteratively add significant improvements regularly, making it more capable and useful to more people over time,” OpenAI said.

Other achievements

Previously, Operator and deep research had unique strengths. Operator could scroll, click, and type on the web, while deep research analyzed and summarized information. However, Operator couldn’t dive deep into analysis or write detailed reports, while deep research couldn’t interact with websites to refine results or access content requiring user authentication.

“In fact, we saw that many queries users attempted with Operator were actually better suited for deep research, so we brought the best of both together,” the company said.

The new model integrates these strengths and unlocks new capabilities. It can now actively engage with websites by clicking, filtering, and gathering more precise, efficient results. It also lets users move naturally from a simple conversation to requesting actions within the same chat.

So how do you use it?

ChatGPT agent has a suite of tools: a visual browser that interacts with the web through a graphical user interface, a text-based browser for simpler reasoning-based web queries, a terminal, and direct API access.

The agent can also leverage ChatGPT connectors, which allow users to connect apps like Gmail and GitHub so ChatGPT can find information relevant to their prompts and use it in its responses.

Users can also log in on any website by taking over the browser, allowing ChatGPT to go deeper and broader in both its research and task execution. Giving ChatGPT these different avenues for accessing and interacting with web information means it can choose the most efficient path for each task. For instance, it can gather information about the user’s calendar through an API, reason efficiently over large amounts of text using the text-based browser, and interact visually with websites designed primarily for humans.
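The idea of choosing a path per task can be illustrated with a small Python sketch. It is a simplified assumption of how such routing might look, not a description of how ChatGPT actually decides: the `Tool` names mirror the four tools listed above, while `Task`, its flags, and the priority order in `choose_tool` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Tool(Enum):
    API = "api"
    TEXT_BROWSER = "text_browser"
    VISUAL_BROWSER = "visual_browser"
    TERMINAL = "terminal"


@dataclass
class Task:
    description: str
    has_api: bool = False          # hypothetical: a structured endpoint or connector exists
    needs_code: bool = False       # hypothetical: requires running code or analysis
    needs_visual_ui: bool = False  # hypothetical: site is designed primarily for humans


def choose_tool(task: Task) -> Tool:
    """Pick a tool by a simple, assumed priority order."""
    if task.has_api:
        return Tool.API
    if task.needs_code:
        return Tool.TERMINAL
    if task.needs_visual_ui:
        return Tool.VISUAL_BROWSER
    # Default: plain text is enough for reading and reasoning over pages.
    return Tool.TEXT_BROWSER


tasks = [
    Task("check the user's calendar", has_api=True),
    Task("read and summarize long articles"),
    Task("fill in a booking form", needs_visual_ui=True),
    Task("analyze a spreadsheet", needs_code=True),
]
for task in tasks:
    print(f"{task.description}: {choose_tool(task).value}")
```

The fixed priority order here is only a stand-in; the article says ChatGPT picks the path itself, and does not describe the rules it uses.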

As ChatGPT works, users can interrupt at any point to clarify their instructions, steer it towards desired outcomes, or change the task entirely. It then picks up where it left off, taking the new information into account without losing previous progress. Likewise, ChatGPT itself may proactively seek additional details from the user when needed to ensure the task remains aligned with the user’s goals.
