The Chat Agent in Your Closet
Every doc you write, every cart you fill. Watching. For you.
My web traffic this year is down. I’ve been tracking it because I’m trying to figure out how my own usage is shifting, and the pattern is clearer than I expected: housekeeping has fallen off a cliff. Tracking recipes, building a grocery list, calendaring — Hermes does all of that now. Education and entertainment have expanded into the space it left, so I’m probably online longer than I was a year ago. I’m just not opening a browser to do things on the open web anymore. I’m opening it to talk to a chat agent.
And as of this month, the chat agent is mine.
Last month, I walked you through setting up a private morning briefing — the agent doing things while you slept. This month is the other end of the same problem: what changes when you sit down at a keyboard and the chat blinking back at you is something you own.
I have a tendency to build things when I’m curious, so I happened to have an old Bitcoin mining rig from a past wave of curiosity lying around. What’s on the shelf in my closet now is an open-frame box of 80/20 aluminum extrusion, three graphics cards bolted to a metal skeleton, fans blasting, sitting next to the router like a workshop project I forgot to finish. Running a model on those GPUs all month is cheaper than the equivalent OpenRouter calls. The rig is mining tokens again, just a different kind. And they’re mine.
BYO GPU
If you don’t have a GPU lying around from a past life, go with an Intel NUC or a Mac mini. The NUC is the cheaper path — used, around $300, runs Ubuntu out of the gate, fits in a shoebox. The Mac mini costs more, and you’ll pay extra if you want your agent inside the Apple ecosystem: texting you on iMessage, pulling tracks from your Apple Music library, reaching into Photos and Contacts. The only legitimate way through Apple’s wall is a real Apple machine sitting on your network, running the open-source iMessage bridge BlueBubbles and whatever other shims you need. (I actually have a Mac mini on my network too, sitting next to my no-case monster, just so Hermes can use it for this stuff.)
Neither the NUC nor the Mac mini will run a serious model. Without a GPU, you don’t get local inference — at least not at a speed that makes the agent feel like it’s doing something. So you have to point at something else. Two paths. The easier: point at Anthropic or OpenAI directly, get the agent up tonight, accept that you’re renting the most important layer of the stack from the company whose moat depends on you renting it. The cleaner: OpenRouter pointed at an open-weight model — Qwen, Gemma, DeepSeek, take your pick. The model is open. The metal isn’t. There’s no TEE between your prompts and the operator, which means OpenRouter sees what you ask, the model provider sees what you ask. Maybe better than renting both layers from a frontier lab. Maybe worse than running it in your closet. An honest waypoint.
The call I’d make today, if you don’t have a GPU? OpenRouter, Qwen 235B, eat the caveat. The model is the easy part to swap. The harness — the agent in the loop, the memory, the wires into your life — is what becomes yours.
Now, make the agent yours. Here’s how.
Everything that makes the agent yours lives in ~/.hermes: config, .env, memory, skills, cron, the whole tree. rsync -avz it to the same path on the server. Save the command; you’ll re-run it any time you teach the laptop something you want the server to know. The container that runs Hermes is disposable. The state directory is not.
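If you'd rather save that command as a script than in shell history, here is a minimal sketch. The function name and the host raffi@closet-box are placeholders, not anything Hermes ships. It assumes rsync is installed on both machines and that the remote path is relative to your home directory on the server, which is how rsync treats a path with no leading slash.

```python
import os
import shlex


def build_sync_command(server: str, state_dir: str = "~/.hermes") -> list[str]:
    """Build the rsync argv that mirrors the local state directory to the server."""
    # Expand "~" ourselves: rsync will not do it when no shell is involved.
    local = os.path.expanduser(state_dir).rstrip("/")
    # A relative remote path resolves against the remote user's home directory,
    # so this works even when the laptop and server home paths differ.
    remote_name = os.path.basename(local)
    # Trailing slashes copy the directory's contents into the target instead of
    # nesting a second .hermes directory inside it.
    return ["rsync", "-avz", f"{local}/", f"{server}:{remote_name}/"]


# Placeholder host; substitute your own user and machine name.
print(shlex.join(build_sync_command("raffi@closet-box")))
```

To actually run it, pass the returned list to subprocess.run(..., check=True) rather than printing it.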
You need to make two edits on the server before you start. In ~/.hermes/config.yaml, change terminal.backend to local — local to the container you’re about to run, not to the host. Then swap the model endpoint: the 127.0.0.1:8000 from your laptop obviously doesn’t exist on the server. If you’re going the OpenRouter route, drop your key into ~/.hermes/.env, re-run hermes model, pick OpenRouter from the menu, then pick a model.
Next, a short docker-compose.yml. Three things matter in it: the image, a restart policy so the agent comes back after reboots, and one bind mount from ~/.hermes on the host to /opt/data inside the container.
services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes
    restart: unless-stopped
    command: gateway run
    volumes:
      - /home/raffi/.hermes:/opt/data
That bind mount is the only door between host and container. Tutorials all over the internet will tell you to mount the host’s Docker socket inside, too, so the agent can run Docker commands “just like on the Mac.” Don’t. A process in one container with the host socket mounted can inspect, start, stop, and remove every other container on the box. That isn’t a sandbox. It’s a key to the building.
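A quick way to hold yourself to that rule is to scan the compose file before every docker compose up. This is a sketch under stated assumptions: it is a plain text scan, not a YAML parser, and it only catches the socket under its conventional file name, docker.sock. The function name is made up for illustration.

```python
def find_socket_mounts(compose_text: str) -> list[str]:
    """Return every non-comment line of a compose file that mentions the Docker socket."""
    hits = []
    for line in compose_text.splitlines():
        stripped = line.strip()
        # Skip full-line YAML comments so a warning note in the file is not flagged.
        if stripped.startswith("#"):
            continue
        if "docker.sock" in stripped:
            hits.append(stripped)
    return hits


# Usage: read your own compose file and refuse to continue if anything is found.
# with open("docker-compose.yml") as f:
#     offenders = find_socket_mounts(f.read())
#     if offenders:
#         raise SystemExit(f"Docker socket mounted: {offenders}")
```

An empty list means the file never references the socket. A non-empty list shows you the exact lines to delete.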
The bind mount keeps the agent inside its own container: It can read and write only what’s in /opt/data, and nothing else on the host. That’s a real boundary, and a useful one. It is not a force field. The moment you put a Google OAuth token, an OpenRouter key, or a Whole Foods session cookie inside that directory, you’ve handed the agent the keys to systems that don’t live in your closet. The box is sandboxed; the things it reaches aren’t. Every new connection you wire up is a new attack surface, so treat it like one.
docker compose up -d, then docker compose logs -f. The logs will sit at “Fixing ownership of /opt/data to hermes” for what feels like a long time. You’re not stuck. The directory you rsynced is large. Walk away.
Mine didn’t reply on the first try. The logs reported “no user allowlists configured,” which I was sure was wrong because I had set both. Half an hour later, I figured out the bind-mount path on the host wasn’t quite the path I thought it was, and the .env that Hermes was reading inside the container was empty. The lesson generalizes: in a bind-mounted setup, the truth is what the container sees, not what’s on the host.
docker compose exec hermes cat /opt/data/.env
I fixed the mount. The bot replied. The next morning, the briefing arrived with my laptop closed.
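The cat above prints your secrets to the terminal. If you want the same check without echoing keys, a sketch like this reports only which variables are set. It assumes simple KEY=value lines and does not handle an export prefix or multi-line values. The function name is hypothetical. Running it inside the container assumes a Python interpreter is available there, which the source does not confirm.

```python
from pathlib import Path


def summarize_env(path: str) -> dict[str, bool]:
    """Map each KEY in a .env file to whether it has a non-empty value."""
    summary = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Ignore blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat KEY= and KEY="" as unset; never return the value itself.
        summary[key.strip()] = bool(value.strip().strip("'\""))
    return summary


# Usage, with the path as the container sees it:
# print(summarize_env("/opt/data/.env"))
```

An empty dictionary is the symptom I hit: the file exists where the container looks, but nothing is in it.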
I drive the agent through OpenWebUI, an open-source chat frontend: a ChatGPT-shaped tab pointed at whatever model server you tell it to, including your own. It’s an actual workspace. You can paste a draft in, share a screenshot, sit with it for half an hour while you work through an argument. OpenWebUI runs as a second service in the same docker-compose.yml. Both containers sit on their own bridge network: they can reach each other by service name, and no other container on the host is attached to it. The OpenWebUI port binds only to 127.0.0.1, which means the UI is reachable from the box itself and from nowhere else without a tunnel.
services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes
    restart: unless-stopped
    command: gateway run
    volumes:
      - /home/raffi/.hermes:/opt/data
    networks:
      - internal
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:8080"
    volumes:
      - openwebui-data:/app/backend/data
    networks:
      - internal
networks:
  internal:
    driver: bridge
volumes:
  openwebui-data:
I’d budgeted a weekend for the OpenWebUI side. It took 20 minutes.
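The 127.0.0.1 prefix on the port mapping is easy to drop by accident, and without it Docker by default publishes the port on every interface. Here is a small sketch that checks a compose short-syntax mapping string. The function name is made up, and it handles only the short "ip:host:container" form, not the long mapping syntax.

```python
import ipaddress


def is_loopback_only(port_mapping: str) -> bool:
    """True if a short-syntax port mapping publishes only on a loopback address."""
    # Drop an optional protocol suffix such as /tcp or /udp.
    spec = port_mapping.split("/")[0]
    # Split from the right so an IPv6 host address keeps its own colons.
    parts = spec.rsplit(":", 2)
    if len(parts) < 3:
        # "3000:8080" or "8080": no host IP, so the port is not loopback-restricted.
        return False
    host_ip = parts[0].strip("[]")
    try:
        return ipaddress.ip_address(host_ip).is_loopback
    except ValueError:
        # Not a literal IP address; treat it as unsafe rather than guess.
        return False


print(is_loopback_only("127.0.0.1:3000:8080"))
print(is_loopback_only("3000:8080"))
```

The first call reflects the mapping used in the compose file above. The second is the version most tutorials show.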
Three things a rented chat agent will never — and should never — do for you
My agent does three things in the closet (and more!), all of them the kind of thing a rented chat agent will never do for you, because doing them would require access you shouldn’t hand over. And none of them is about having a smarter model. They’re all harness — the agent wired into the rest of my life, with the wires ending inside my house instead of someone else’s data center.
When I chat with it, it remembers. The last time I made strawberry shortcake I told Hermes the America’s Test Kitchen version was mine, and now I can type @hermes, can you add everything I need for strawberry shortcake to my shopping cart? and the Whole Foods cart fills with what the recipe calls for. The browser tab I used to keep open for grocery delivery is closed.
(I keep meaning to rename it. I’ve been thinking Zora, after the ship’s computer on Star Trek: Discovery. Haven’t pulled the trigger.)
I’ve given the agent access to every Google Doc I’ve written. Drafts I forgot I started, meeting notes from 2024, the half-finished argument I gave up on in March — all in reach, just by asking. Last week I was three paragraphs into a piece and asked what I’d written on the same topic before; it surfaced a draft from spring I’d forgotten existed. I revived the thesis instead of re-deriving it.
It writes me a weekly self-report. Every Sunday morning, a digest of what I actually touched: every doc I edited, every long email thread, every commit. Not the calendar version of my week — the actual one. The first report landed in my inbox a month ago and I sat with it for ten minutes. I had forgotten half of it.
Almost all of this could be done in ChatGPT. The connectors are there: ChatGPT can talk to Google Drive, Gmail, your calendar. You just have to plug those accounts into your ChatGPT account and let OpenAI see them.
Absolutely not.
It isn’t that a rented chat agent can’t do these things. It’s that doing them requires giving someone else read access to your work, your inbox, your grocery list, your meeting notes — and trusting their terms of service this quarter, and next quarter, and the one after the IPO. The terms say what they say. The next major training run will use what it uses. You will not be told either way.
The agent in my closet has the same access. The difference is the access goes from one part of my house to another. Nothing leaves.
Two years ago, AI meant a tab in somebody else’s browser. Today, mine lives on a shelf next to the router. It doesn’t send a token of any of it anywhere I don’t pay the power bill for.
The rented-vs.-owned gap isn’t closing. It’s widening at the integration layer — not the raw model. The moat isn’t the weights anymore; it’s the access to your memory, your stuff, and your right to opt out of training on either. You can’t buy a pre-assembled version of this setup yet. Somebody will build it.
For me, the whole stack is in the closet. For most of you, the GPU is the last piece. Let’s get building.
After that, two directions I've been kicking around. One: running this whole setup on a cloud container instead of your own hardware, for the closet-less. Two: wiring multiple Hermes agents together and giving them different jobs — either the obvious next move or too nerdy to publish, depending on who's asking. Tell me which. Or both.



