Gemini CLI vs. Antigravity: What works, not the spec sheet
Google has decommissioned Gemini CLI, the open-source terminal tool it shipped only last year. The project had over 100,000 GitHub stars and hundreds of contributors. Its replacement is Antigravity CLI, a closed-source Go binary you invoke with the command agy. As of June 18, Gemini CLI stopped serving requests for free, AI Pro, and Ultra personal accounts. There is no grace period. If you’re on a personal plan, June 18 was the day the old tool went dark.
Google’s pitch for the swap is built on three claims. To quote group product manager Dmitry Lyalin and principal engineer Taylor Mullen in the announcement: “Built in Go, Antigravity CLI is snappier and more responsive.” It runs “multiple agents for complex tasks in the background, letting you run large-scale refactors or research several topics without locking up your terminal session.”
And while admitting “there won’t be 1:1 feature parity right out of the gate,” they promise it “keeps the most critical features of Gemini CLI: Agent Skills, Hooks, Subagents, and Extensions.”
Those are strong claims. Do they hold up? Let’s find out.
The test
I tested Gemini CLI version 0.47.0 against Antigravity CLI version 1.0.9. I migrated one workflow across both tools and ran the same four tests on each:
- A speed benchmark
- A real multi-file coding task on a repo
- A feature-migration test
- An automation test for scripted use
There’s one more thing I have to tell you about before any results, because it shaped the entire Gemini half of this test…
The 503 errors ate my free credits
The very first prompt I sent to Gemini’s default model was a trivial “what is 2+2.” It came back with an HTTP 503 saying “this model is currently experiencing high demand.” Then it hung on automatic retry for 1 minute and 22 seconds before I gave up and canceled.
To get any response at all, I had to drop down to the lighter gemini-2.5-flash-lite model. And the 503s kept coming. They hit me twice when I tried the coding task. The second attempt burned through 2 minutes and 12 seconds of retry backoff before dying with “Invalid stream: The model returned an empty response or malformed tool call.” They hit again when I tested a custom command.
Could they have happened because I ran this test on migration day? Yes, absolutely. But here’s the part that stings: every one of those failed calls was automatically retried, and every retry counted against my quota.
Before long, Gemini stopped even trying and returned a TerminalQuotaError saying “You have exhausted your daily quota on this model.” The fine print showed the free tier caps you at 20 requests per day on flash-lite.
The 503 errors, which were Google’s server problem and not mine, had eaten all my free credits on requests that never returned an answer. To keep testing at all, I had to enable billing on my API key. No, it wasn’t expensive, but I definitely feel ripped off.
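To make the arithmetic concrete, here is a minimal sketch of how retried failures can drain a 20-request daily cap without producing a single answer. The cap comes from the quota message above; the three attempts per prompt and the seven-prompt day are assumptions for illustration, not numbers from Gemini CLI.

```python
from dataclasses import dataclass


@dataclass
class QuotaResult:
    answered: int       # prompts that eventually got a response
    requests_used: int  # requests charged against the daily quota


def simulate_quota(outcomes, daily_quota=20, max_attempts=3):
    """Simulate prompts where every attempt, including a failed retry, is charged.

    outcomes: one list of booleans per prompt, one boolean per attempt
              (True = the server answered, False = it returned a 503).
    max_attempts is a hypothetical retry limit, not a documented value.
    """
    used = 0
    answered = 0
    for attempts in outcomes:
        for ok in attempts[:max_attempts]:
            if used >= daily_quota:
                return QuotaResult(answered, used)
            used += 1  # charged whether or not an answer came back
            if ok:
                answered += 1
                break
    return QuotaResult(answered, used)


# Hypothetical day: seven prompts that each fail three times in a row.
result = simulate_quota([[False, False, False]] * 7)
print(result)
```

Under those assumptions, the quota is gone partway through the seventh prompt and nothing was answered, which matches the shape of what happened to me.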
The good news here is that I had no 503s during my Antigravity test.
Test 1: Speed
Google led with “snappier” when describing the speed of the Antigravity CLI compared to the Gemini CLI. I timed a trivial prompt three times on each tool, using the shell’s time command, to investigate the claim.
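I used the shell’s time command, but the same measurement can be sketched in Python for anyone who wants to repeat it in a script. This is a rough equivalent, not the exact method I ran: it measures wall-clock time only, and the stand-in command is a placeholder so the sketch runs on any machine.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in command so the sketch runs anywhere.
# To time the real tool, swap in something like ["agy", "-p", "what is 2+2"].
durations = time_command([sys.executable, "-c", "pass"])
print(f"median: {statistics.median(durations):.3f}s")
```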
“Snappier” claim falls short
Gemini CLI came in at 3.205s, 2.861s, and 5.545s, for a median of about 3.2 seconds. Antigravity CLI came in at 3.974s, 3.490s, and 4.176s, for a median of about 3.97 seconds. On the simplest possible prompt, the new Go binary was slightly slower than the tool it replaces. Three runs per tool is a small sample, but the “snappier” claim didn’t hold up in my test.
What the new tool did win on was consistency and clean output. Antigravity’s three timed runs clustered between 3.5 and 4.2 seconds, while Gemini’s bounced from 2.9 to 5.5. And every single Gemini invocation printed startup noise (“Ripgrep is not available. Falling back to GrepTool,” plus two [STARTUP] warnings), while Antigravity printed nothing but the answer. I’ll concede the fraction of a second for an easier-to-use tool.
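For anyone who wants to check the numbers, the medians and spreads can be recomputed directly from the six timings reported above. Nothing here is new data; the variable names are just for illustration.

```python
import statistics

# The three timed runs per tool, in seconds, as reported in the text.
timings = {
    "Gemini CLI": [3.205, 2.861, 5.545],
    "Antigravity CLI": [3.974, 3.490, 4.176],
}

summary = {}
for tool, runs in timings.items():
    summary[tool] = {
        "median": statistics.median(runs),
        # Spread = slowest run minus fastest run.
        "spread": round(max(runs) - min(runs), 3),
    }
    print(tool, summary[tool])
```

The medians come out to 3.205 and 3.974 seconds. The spread is roughly 2.7 seconds for Gemini against roughly 0.7 seconds for Antigravity, which is the consistency gap in one number.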
Test 2: Coding
I pointed both at a real repo, jsonpickle-opus (a repo I had on my machine from a previous test), and asked each to add type hints to the functions in jsonpickle/pickler.py. I ran them non-interactively, the way you would in a script.
Gemini couldn’t do it. It read the file, described the change it was going to make, then stopped and returned: “I am unable to complete your request… because the necessary tool to write to files (write_file) is not available.”
A separate run showed run_shell_command was missing too. In non-interactive mode, Gemini CLI couldn’t perform the job. It can read and think, but it can’t write a file or run the commands you need for automation.
Headless file edits diverge
Antigravity walked right through it. Using its --dangerously-skip-permissions flag, it completed the task in 2 minutes and 10.88 seconds. In addition to editing the file, it ran an entire engineering workflow on its own. It listed the repo, spun up a Python 3.12 virtual environment, installed dependencies, ran the test suite with pytest, ran mypy, checked the history with git, wrote its own AST analysis script, and generated a report artifact. It concluded that every function in the file was already fully type-hinted and passed strict mypy checks, so it correctly declined to make the requested edits.
Because the file was already typed, this test proved Antigravity can act (write files, run shell commands, and verify its own work), but it didn’t let me judge the quality of an actual edit. That is a test for another day. It did prove that Antigravity can work in scripted mode while Gemini CLI, at least in my runs, can’t.
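I didn’t keep the AST analysis script Antigravity wrote, so the following is not its code. It is a minimal sketch of what a check like that could look like: walk a file’s syntax tree and list the functions that are missing a parameter or return annotation. Skipping self and cls is my own assumption about what counts as “fully type-hinted.”

```python
import ast


def untyped_functions(source):
    """Return the names of functions missing a return or parameter annotation."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        # Collect every kind of parameter: positional-only, regular, keyword-only, *args, **kwargs.
        params = args.posonlyargs + args.args + args.kwonlyargs
        if args.vararg:
            params.append(args.vararg)
        if args.kwarg:
            params.append(args.kwarg)
        # Assumption: self and cls are conventionally left unannotated, so they are ignored.
        unannotated = [
            p.arg for p in params
            if p.annotation is None and p.arg not in ("self", "cls")
        ]
        if node.returns is None or unannotated:
            missing.append(node.name)
    return missing


sample = '''
def typed(a: int, b: int) -> int:
    return a + b

def untyped(a, b):
    return a + b
'''
print(untyped_functions(sample))
```

An empty list from a check like this is the kind of evidence that would justify declining the edit, though a strict mypy run, which Antigravity also did, is the stronger test.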
Test 3: Will it bring your stuff with you?
Google promised the new CLI keeps your extensions. I wanted to test that, so I made a Gemini customization and tried to migrate it.
My first attempt was a mistake, but if it’s a mistake I made, it’s possible someone else would too. I created a custom command and ran agy plugin import gemini. It returned “No gemini extensions found.” For a second, I thought migration was broken. It wasn’t. I’d built the wrong thing. Antigravity’s importer looks for Gemini extensions, which are full folders with a manifest, not the loose single-file command I wrote. The empty result was correct. My test was wrong.
Then I built a proper extension. I made a demo-migrate folder with a gemini-extension.json manifest, a context file, and a bundled command. Gemini confirmed it with gemini extensions list, showing “demo-migrate (1.0.0),” enabled. Then I ran the import again.
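For readers who want to reproduce the setup, here is a rough sketch of that folder layout, generated into a temporary directory. Only the folder name, the manifest file name, and the name and version values come from my test. The contextFileName field, the GEMINI.md and commands/hello.toml file names, and the TOML keys are assumptions based on my understanding of the extension format, so check them against the Gemini CLI extension docs before relying on them.

```python
import json
import tempfile
from pathlib import Path


def build_extension(root):
    """Create a minimal command-bearing extension folder under root."""
    ext = Path(root) / "demo-migrate"
    (ext / "commands").mkdir(parents=True)

    # Manifest: name and version match what `gemini extensions list` reported.
    # contextFileName is an assumed field name.
    manifest = {
        "name": "demo-migrate",
        "version": "1.0.0",
        "contextFileName": "GEMINI.md",
    }
    (ext / "gemini-extension.json").write_text(json.dumps(manifest, indent=2))

    # Context file and bundled command; file names and contents are hypothetical.
    (ext / "GEMINI.md").write_text("Context for the demo-migrate extension.\n")
    (ext / "commands" / "hello.toml").write_text(
        'description = "Say hello"\n'
        'prompt = "Reply with: hello from demo-migrate extension"\n'
    )
    return ext


with tempfile.TemporaryDirectory() as tmp:
    ext = build_extension(tmp)
    created = sorted(p.relative_to(ext).as_posix() for p in ext.rglob("*") if p.is_file())
    print(created)
```

The point of the sketch is the shape: a folder with a manifest at its root is what the importer recognizes, which is exactly what my loose single-file command lacked.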
This time it worked. Antigravity reported the extension’s command was “processed (converted to skills),” skipped the categories I hadn’t included, and staged everything to its config. agy plugin list showed demo-migrate with its source marked gemini-cli. When I sent agy -p "say hello", it triggered the migrated skill and replied “hello from demo-migrate extension,” and even cited the SKILL.md file. Success!
This was a surface-level test. I migrated a simple command-bearing extension. I didn’t test skills, subagents, MCP servers, or hooks, because I didn’t have any. So “extensions migrate” is confirmed. The rest of the list is untested.
Test 4: Does it survive automation?
The question Google didn’t address is whether the new tool plays nicely with scripts and pipelines, given that it logs in via a browser rather than with an API key.
Gemini authenticates headlessly. You set an API key in an environment variable, and you’re done; no browser needed. That’s friendly to servers and CI. But as Test 2 showed, once you’re in headless mode, it can’t write or run anything, which limits the value of that friendliness.
Antigravity flips it. It made me sign in through a browser OAuth flow on first run, the kind of thing that’s fine on a laptop and a real obstacle on a headless server. But once authenticated, it behaved well in a pipeline: I piped a function into it with echo "def add(a,b): return a+b" | agy -p "review this code in one sentence" and got back a clean one-sentence review, exit code 0. Whether it offers any non-browser login path for true CI use is something I couldn’t confirm, and it’s the open question I’d most want answered before trusting it in production automation.
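In a real pipeline you would want to check that exit code programmatically instead of eyeballing it. Here is a minimal sketch of that pattern. The helper name pipe_to_cli is made up for illustration, and a small Python process stands in for the CLI so the example runs without agy installed or authenticated.

```python
import subprocess
import sys


def pipe_to_cli(cmd, stdin_text, timeout=120):
    """Send stdin_text to cmd on standard input and return (exit_code, stdout)."""
    proc = subprocess.run(
        cmd, input=stdin_text, capture_output=True, text=True, timeout=timeout
    )
    return proc.returncode, proc.stdout.strip()


# Stand-in for the CLI: a tiny Python process that echoes its stdin in upper case.
stand_in = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
code, out = pipe_to_cli(stand_in, "def add(a,b): return a+b\n")
print(code, out)
```

Replacing stand_in with ["agy", "-p", "review this code in one sentence"] would mirror the command I ran. A pipeline step can then fail the build whenever code is nonzero, assuming the machine already has an authenticated session.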
Test 5: The test I wasn’t expecting
The two tools communicate very differently. When Gemini failed, it dumped terse errors wrapped in full Node.js stack traces (_ApiError objects with JavaScript call stacks) on top of the startup noise it printed every run. Making sense of its output took some familiarity with Node.js errors. That is a liability at a time when users expect tools to speak in plain language.
Antigravity spoke like a person. Well, more like an AI narrator, but you get the point. As it worked through the coding task, it told me, in plain English, what it was doing at each step: “I will read the contents of…”, “I will run the test suite using pytest…”, and it closed with a “Summary of Work” section with clickable links to the files it touched.
I wasn’t expecting the two tools to differ this much in how they communicate, but they did. Antigravity was much clearer.
The comparison
Here’s where the two tools landed, side by side.
| Dimension | Gemini CLI (0.47.0) | Antigravity CLI (1.0.9) |
| --- | --- | --- |
| Install | One-line npm | curl script, needed manual PATH fix |
| Auth | API key, headless-friendly | Browser OAuth on first run |
| Trivial-prompt speed | ~3.2s median | ~3.97s median |
| Consistency | Jittery (2.9-5.5s) | Tight (3.5-4.2s) |
| Startup noise | Every run | None |
| 503 errors today | Many | Zero |
| Free credits | Burned by retried 503s; forced me to enable billing | n/a |
| Headless file edits | Not available | Completed the task |
| Headless shell commands | Not available | Ran pytest, mypy, git, venv |
| Migration | (source tool) | Extension migrated and ran |
| When it failed | Stack traces | Plain-language narration |
Antigravity is the better tool. I can’t confirm that it’s a 1:1 match in capabilities, but in my experience, working with Antigravity was much better. In headless mode, Antigravity can write files and run commands and Gemini can’t, which is the difference between a tool you can automate and one you can only chat with. The migration worked once I fed it the right kind of artifact. And it was the calmer, clearer, more reliable of the two on a day when its predecessor was throwing 503s and eating credits.
I know, I know. I’m judging first-pass behavior on shutdown day, against a tool actively being switched off, on a free-then-barely-paid account getting hammered by server load. But even grading on a curve, I still had a better time using Antigravity. Even if the migration isn’t perfect, I expect it to be worth it in the long run.