ChatGPT agent can become your personal assistant at home and work

ChatGPT can choose from a set of agentic skills to complete tasks using its own computer

Last updated: July 18, 2025 | 13:30

ChatGPT can now handle tasks on a user’s behalf using its own computer, such as checking a calendar and creating slideshows.

Not only that, the AI will be able to intelligently navigate websites, filter results, prompt the user to log in securely when needed, run code, conduct analysis, and even deliver editable slideshows and spreadsheets that summarize its findings.

How does ChatGPT do this?

“At the core of this new capability is a unified agentic system. It brings together three strengths of earlier breakthroughs: Operator’s ability to interact with websites, deep research’s skill in synthesizing information, and ChatGPT’s intelligence and conversational fluency,” the company said.

“ChatGPT carries out these tasks using its own virtual computer, fluidly shifting between reasoning and action to handle complex workflows from start to finish, all based on your instructions,” it added.

Is it safe?

According to OpenAI, the user is always in control. ChatGPT will request permission before taking actions of consequence, and the user can easily interrupt, take over the browser, or stop tasks at any point.
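
The permission step can be pictured as a simple gate in front of each action. The Python sketch below is only an illustration of that idea, not OpenAI’s implementation: the action names, the `CONSEQUENTIAL` set, and the `run_action` helper are all hypothetical.

```python
from typing import Callable

# Hypothetical examples of "actions of consequence"; the real criteria
# used by ChatGPT agent are not described in the announcement.
CONSEQUENTIAL = {"send_email", "make_purchase", "delete_file"}


def run_action(
    name: str,
    action: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run `action`, asking the user first if the action is consequential."""
    if name in CONSEQUENTIAL and not confirm(name):
        # The user declined, so the action is never executed.
        return f"skipped {name}: user declined"
    return action()


# Reading a page needs no approval; a purchase is blocked when the user says no.
print(run_action("read_page", lambda: "page read", lambda n: False))
print(run_action("make_purchase", lambda: "order placed", lambda n: False))
```

In this sketch the `confirm` callback stands in for the prompt the user would see, so a declined request leaves the task untouched.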

How do you activate it?

Starting today, Pro, Plus, and Team users can activate ChatGPT’s new agentic capabilities directly through the tools dropdown from the composer by selecting ‘agent mode’ at any point in any conversation.

“While ChatGPT agent is already a powerful tool for handling complex tasks, today’s launch is just the beginning. We’ll continue to iteratively add significant improvements regularly, making it more capable and useful to more people over time,” OpenAI said.

Other achievements

Previously, Operator and deep research each had distinct strengths. Operator could scroll, click, and type on the web, while deep research analyzed and summarized information. However, Operator couldn’t dive deep into analysis or write detailed reports, and deep research couldn’t interact with websites to refine results or access content requiring user authentication.

“In fact, we saw that many queries users attempted with Operator were actually better suited for deep research, so we brought the best of both together,” the company said.

The new model integrates these strengths in a single system. It can now actively engage with websites, clicking and filtering to gather more precise results more efficiently. It also lets users move naturally from a simple conversation to requesting actions within the same chat.

So how do you use it?

ChatGPT agent has a suite of tools: a visual browser that interacts with the web through a graphical user interface, a text-based browser for simpler reasoning-based web queries, a terminal, and direct API access.

The agent can also use ChatGPT connectors, which let users connect apps like Gmail and GitHub so ChatGPT can find information relevant to their prompts and use it in its responses.

Users can also log in to any website by taking over the browser, which lets ChatGPT go deeper and broader in both its research and task execution. Giving ChatGPT these different avenues for accessing and interacting with web information means it can choose the most efficient path for each task. For instance, it can gather information about the user’s calendar through an API, reason over large amounts of text using the text-based browser, and interact visually with websites designed primarily for humans.
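
To make the idea of choosing a path concrete, here is a minimal Python sketch of how a task might be routed to one of the four tools. The `Tool` names mirror the tools listed in the article, but the `choose_tool` function and its rules are assumptions made for illustration; OpenAI has not published how the agent actually decides.

```python
from enum import Enum


class Tool(Enum):
    API = "api"
    TEXT_BROWSER = "text_browser"
    VISUAL_BROWSER = "visual_browser"
    TERMINAL = "terminal"


def choose_tool(has_api: bool, needs_code: bool, text_heavy: bool) -> Tool:
    """Hypothetical heuristic: prefer the most direct tool that fits the task."""
    if needs_code:
        return Tool.TERMINAL        # running code or analysis
    if has_api:
        return Tool.API             # e.g. reading a calendar directly
    if text_heavy:
        return Tool.TEXT_BROWSER    # reasoning over large amounts of text
    return Tool.VISUAL_BROWSER      # sites designed primarily for humans


# A calendar lookup with an available API goes straight to the API.
calendar_tool = choose_tool(has_api=True, needs_code=False, text_heavy=False)
print(calendar_tool.value)
```

The ordering of the checks is a design choice in this sketch: the visual browser is the fallback because it works on any site, but it is the least direct route.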

As ChatGPT works, users can interrupt at any point to clarify their instructions, steer it towards desired outcomes, or change the task entirely. It will pick up where it left off, now with the new information, but without losing previous progress. Likewise, ChatGPT itself may proactively seek additional details from the user when needed to ensure the task remains aligned with the goals.