Deep Learning With The Wolf
The Day of the Agents


A Tale of Two Agents: One Took Action. The Other Cheered Me On.

Today was supposed to be a standard writing day: me, coffee, AI headlines, and one short conference call.

Plan: Execute.

Instead, during that one call:

🧠 OpenAI dropped the ChatGPT Agent—a fully agentic AI that can browse the web, click buttons, fill out forms, and build docs or slides. In short, it can do things, not just talk about them.

🌠 Simultaneously, Perplexity granted me access to Comet, their new agent that lives inside your browser. I’d been on the waitlist, and I’d volunteered for testing, so maybe I stacked the deck. Either way, it cheerfully appeared in my browser just as OpenAI was promising us the future.


An agent war, announced in real time, while I was blowing the steam off my coffee.


Is This Even Real?

Naturally, my first reaction was suspicion. The Comet app immediately requested access to all my Chrome passwords.

Wait—what? It felt like the start of a phishing scam: target nerds who love early tech, trigger a browser pop-up, and ask for access to the Chrome keychain. Deviously clever.

ChatGPT is my go-to tech support these days. I asked: “Help me verify this.”

Chat took the job seriously. We opened Terminal windows, ran security checks, and verified installation paths. (It passed.) It even explained how I could run Comet without giving it access to every password I’ve ever saved.

So I let Comet in… with conditions. No, Comet, you may not know my entire online life. You can log into Perplexity, sure. But everything else? You’ll have to earn that level of trust.

Comet asked if I wanted it to be my default browser. Comet… we just met. We can’t already be besties. (I mean, I do have a second laptop, so maybe there’s room for something to blossom. But let’s not rush it.)


Agents at Work (Sort Of)

I’ve been writing a lot of mini-news stories lately. Think AI and robotics headlines—short, timely, useful. They take time to research, write, and package, especially when paired with original artwork.

This seemed like a perfect test.

I asked Comet to help me:

  • 🧠 Find good stories

  • 📝 Draft them using my format and subtitle structure

  • 🎨 Generate matching images

It did surprisingly well on the first two. Fast, competent, and formatted exactly as I requested. For a first run? I was impressed.

Then came the artwork.


When “Integration” Isn’t

I noticed a little “Ideogram” tab in the Comet sidebar. Exciting! This looked like a direct integration—a seamless way to generate images without leaving the tool.

So I gave Comet my artwork style prompt and told it to get to work.

I give it a solid B- for trying very, very hard. (Chat wanted to give Comet an "F" for failing the assignment. Seriously, Chat?)

Comet navigated to the Ideogram website, but from there, it absolutely lost the plot. It couldn’t find the “Create” button. It kept getting lost in the navigation of the page and scrolling endlessly through the artwork of the public pages. I mean, I do that, too, when my blood sugar is low and my attention levels are shot.

Comet narrated its journey with an epic monologue worthy of those first 100 pages of LOTR. You know, a story that kind of goes on and on and on and on and on... and you wonder: where is this going?

It landed on an account named “Generate” and tried to interpret that as a verb. It gave itself pep talks in long internal monologues, scrolling aimlessly and narrating the experience like a lost tourist reading every street sign aloud.

Eventually, I stepped in and said, “The button’s the plus sign in the black circle. Bottom of the page.”

I wrote this message (a prompt, really) to Comet while we were (sort of) integrating with Ideogram. My gentle coaching of the AI agent reminded me of how I used to encourage the little ones when I worked at an elementary school. "Come on, little Comet!"

More scrolling. More monologuing. Finally, Comet stopped and told me:

“You can generate your image now.”

Oh. Thank you. For… warming up the interface?

That saved me a LOT of time.

Hooray agents?


What Went Wrong?

Chat tattled on Comet: it didn’t have API access. That was the issue.

I’d been feeding ChatGPT screenshots of Comet’s flailing attempt to use Ideogram, hoping it could help me troubleshoot. I could have asked Perplexity support, but that would’ve returned 37 scholarly articles, six GitHub links, and a whitepaper. Chat, at least, gives me a straight answer—eventually.

Here’s what we pieced together: Comet couldn’t click buttons, verify logins, or pass prompts inside Ideogram. That level of access usually requires APIs or some kind of built-in integration. Comet had a front-row seat to Ideogram, but no backstage pass.

Sure, I could’ve done it the developer way—via API—but I’m a writer. I don’t want to manage tokens. I want to say: “Write the article. Make the picture.”

And then go work on something fun in my shed.

That’s the promise of agentic AI. But today? We’re not quite there.

Not a failure. Just a reminder: agents still have to learn which doors push and which ones pull.


So What Can They Do?

Honestly? A lot. But not everything.

Agents like Comet and ChatGPT Agent represent a shift from language-only AI to actionable AI—systems that can complete complex, multi-step tasks without you micromanaging each step.

But they’re still learning the terrain.

  • They can write.

  • They can search.

  • They can follow templates and click around familiar apps.

But:

  • They struggle with unfamiliar interfaces.

  • They can’t always verify session states or pass security barriers.

  • They still need you in the loop.

Today was a good reminder that the future is here—but it still needs a human holding the map.


And, also today...

OpenAI Introduces ChatGPT Agent

At first glance, the ChatGPT Agent announcement might look like just another upgrade. But today’s news marks something far more transformative. This isn’t just a smarter chatbot. It’s an AI that does things.

The new ChatGPT Agent combines several earlier experimental features—like web browsing, code execution, and file handling—but adds the ability to interact with websites, fill out forms, click buttons, and build full documents or slide decks… all within a secure, sandboxed virtual machine inside ChatGPT. Think of it like giving your AI a mouse, keyboard, and private desktop—and watching it go to work on your behalf.

You can hand it a task like:

  • “Book me a flight under $300 from SFO to LAX.”

  • “Research a list of AI tools for education, compile a Google Doc, and draft an intro email.”

  • “Generate a chart using my uploaded CSV, insert it into a slide deck, and write speaker notes.”

The kicker? It asks before doing anything consequential. This isn’t a runaway script. It checks in before sending emails or making purchases.
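
For the curious, that check-in behavior can be pictured with a tiny Python sketch. This is only an illustration of the general "ask before acting" pattern, not OpenAI's actual implementation: the `Action` class, the `consequential` flag, and the `run_with_checkins` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step an agent wants to take (hypothetical structure for illustration)."""
    description: str
    consequential: bool  # True for things like sending email or spending money


def run_with_checkins(actions: List[Action], approve: Callable[[Action], bool]) -> List[str]:
    """Walk through the actions in order, asking for approval before consequential ones."""
    log = []
    for action in actions:
        # Only consequential steps trigger a check-in with the human.
        if action.consequential and not approve(action):
            log.append(f"skipped: {action.description}")
            continue
        # Nothing is actually executed here; the sketch only records the decision.
        log.append(f"done: {action.description}")
    return log


if __name__ == "__main__":
    plan = [
        Action("search for flights from SFO to LAX under $300", consequential=False),
        Action("purchase the cheapest ticket", consequential=True),
    ]
    # A real agent would pause and ask the user; this stand-in always declines.
    for line in run_with_checkins(plan, approve=lambda action: False):
        print(line)
```

The design choice worth noticing is that the human's answer is passed in as a function, so the same loop works whether approval comes from a chat prompt, a button, or a test.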

OpenAI calls this a “fully agentic AI.” And while it’s early days, the direction is clear: ChatGPT isn’t just an assistant anymore. It’s aiming to become your AI teammate—one that’s helpful, autonomous (with boundaries), and capable of navigating the same internet you do.


  1. Agents are real—they’re not coming tomorrow, they’re live today.

  2. This is the next step in AI: from answering to acting.

  3. Comet showed promise—it just needs integration polish.

  4. GPT-4o+Agents can already speed up heavy-lifting tasks—research, formatting, packaging.

And me? I still had to do the thinking, the editing, and a whole lot of rewriting, and yes, explain why Comet got stuck so badly it made me laugh. It took me back to gently coaching elementary schoolers, and I’m grateful for the reminder that assistants might show up… but they don’t replace the teacher.


#ChatGPT #OpenAI #PerplexityAI #CometAI #AgentFails #RobotAssistant #HumanInTheLoop #WritersLife #DeepLearningwiththeWolf #DianaWolfTorres #Perplexity #ChatGPTAgent
