<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Owners Not Renters]]></title><description><![CDATA[This is not AI news. Not research summaries. Not model hype cycles. Not a vendor ad. This is... honesty about tradeoffs. Tackling deeply technical questions. For builders who have to defend architecture decisions. Powered by Raffi Krikorian, CTO at Mozilla.]]></description><link>https://newsletter.ownersnotrenters.com</link><image><url>https://substackcdn.com/image/fetch/$s_!47w7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaea0053-6bfe-4ed8-8508-244c9455df03_1280x1280.png</url><title>Owners Not Renters</title><link>https://newsletter.ownersnotrenters.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 18 Apr 2026 11:44:44 GMT</lastBuildDate><atom:link href="https://newsletter.ownersnotrenters.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Raffi Krikorian]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[ownersnotrenters@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[ownersnotrenters@substack.com]]></itunes:email><itunes:name><![CDATA[Raffi Krikorian]]></itunes:name></itunes:owner><itunes:author><![CDATA[Raffi Krikorian]]></itunes:author><googleplay:owner><![CDATA[ownersnotrenters@substack.com]]></googleplay:owner><googleplay:email><![CDATA[ownersnotrenters@substack.com]]></googleplay:email><googleplay:author><![CDATA[Raffi Krikorian]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Closing the Open-Source Gap]]></title><description><![CDATA[I scored 42 subcategories of the 
open-source AI stack. The models aren't the problem.]]></description><link>https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap</link><guid isPermaLink="false">https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap</guid><dc:creator><![CDATA[Raffi Krikorian]]></dc:creator><pubDate>Mon, 13 Apr 2026 14:55:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ey1b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ey1b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ey1b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!Ey1b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!Ey1b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ey1b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ey1b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:498408,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.ownersnotrenters.com/i/193802654?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ey1b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!Ey1b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!Ey1b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!Ey1b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd50074a0-3c8b-49ff-826c-5e6ba74c6cb6_1536x1024.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p><a href="https://docs.google.com/spreadsheets/d/1ApSQ9r1sTsLcUiwBPWU2CWJdgnJVcVrTedivJJ68ooQ/edit?usp=sharing">Go ahead, open it. 
It&#8217;s a lot of red.</a></p><p>What you're looking at is the open-source AI stack &#8212; every layer of it, from hardware and chips up through model weights, datasets, developer tools, documentation, licensing, and safeguards, scored against its closed-source equivalent. The breakdown comes from <em><a href="https://arxiv.org/abs/2405.15802">Unpacking Open Source Artificial Intelligence: Towards a Framework for Openness in Foundation Models</a></em>. I pointed agents at each one, had them read real repos, compare them to closed-source equivalents, and score across 10 criteria. (If you're curious, hit me up, and maybe I'll package up the code.) If you're actually trying to piece together an open-source AI deployment &#8212; stitching inference servers to tooling to compliance &#8212; you're running into two problems at once: what's blocking you today, and what nobody's building for tomorrow. This is a map of both.</p><p><strong>The headline:</strong> Average maturity is 3.2 out of 5. Enterprise readiness averages 2.3.</p><p>Works in demos. Dies in procurement. Maybe that&#8217;s why every engineering team that would prefer to own their inference is defaulting to closed APIs: Not because the models are worse, but because nobody&#8217;s staking a production SLA on a GitHub repo with no support contract. The preference exists. The plumbing doesn&#8217;t.</p><p>Only one subcategory hits a 4 on enterprise readiness: ML frameworks &#8212; PyTorch, TensorFlow, JAX &#8212; the things that have had a decade of production hardening. Everything else? No uptime guarantees. No VPC isolation. No AD integration. Closed platforms don&#8217;t win on the model. They win on the contract.</p><p>The developer experience is the other gap. Integrating the OpenAI API involves five-ish lines of code. Deploying the open-source equivalent? Docker, reverse proxies, manual model management. Teams with a six-week ship date, of course, are going to pick the thing that works today. 
And every team that makes that call &#8212; reasonably, defensibly &#8212; adds another year of lock-in that gets harder to unwind later. (And more expensive when the pricing changes, as 135,000 OpenClaw users just learned.)</p><p>Licensing scored the lowest in the entire spreadsheet. There are no templates that address model output ownership, training data rights, or prompt injection liability. Every AI startup either adopts a closed platform&#8217;s terms wholesale or pays $50K+ for custom legal drafting. That&#8217;s not a technical gap. It&#8217;s an institutional vacuum.</p><p>Of course, this spreadsheet is almost certainly wrong in places. If you&#8217;re deep in any of these layers and you see something I missed, tell me: <a href="https://x.com/raffihack">@raffihack</a> on X, or comment below.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap/comments"><span>Leave a comment</span></a></p><h4><strong>The Bright Spots Are Real</strong></h4><p>Deployment, inference code, base weights, and training code all scored 4 on overall maturity. vLLM, llama.cpp, and Hugging Face TGI are production-grade today. Strong tools, no wrapper.</p><p>The models aren&#8217;t the problem.</p><p>Open-weight models now trail the closed frontier by roughly three months. That&#8217;s <a href="https://epoch.ai/data-insights/open-weights-vs-closed-weights-models/">Epoch AI&#8217;s Capabilities Index</a> talking &#8212; a composite score across 37 benchmarks, updated continuously, built in collaboration with Google DeepMind. On MMLU specifically, the gap collapsed from 17.5 percentage points to under 1 in about a year. 
DeepSeek-R1 matched o1-class reasoning at a reported training cost of $6M. Qwen3.5 scores 88.4 on GPQA Diamond, ahead of everything except the most expensive closed options. On a single gaming GPU, you can run open-weight models matching frontier performance from nine months ago.</p><p>Three months behind on capabilities. And closing.</p><p>Even Mythos &#8212; the model Anthropic said was too dangerous to ship &#8212; isn&#8217;t an argument against open-source models being smart enough. AISLE researchers replicated several of its headline vulnerability findings with openly available models. Alex Stamos &#8212; Stanford&#8217;s former internet security chief, ex-CISO at Facebook and Yahoo &#8212; estimates six months before open-weight models close the rest of that gap.</p><p>But three months isn&#8217;t a law of physics. It&#8217;s a snapshot. <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nathan Lambert&quot;,&quot;id&quot;:10472909,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RihO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fedcdfb-e137-4f6a-9089-a46add6c6242_500x500.jpeg&quot;,&quot;uuid&quot;:&quot;b64ad367-7e2e-4573-a8cf-d1231b344207&quot;}" data-component-name="MentionToDOM"></span> makes a fair counterpoint: as frontier labs push into domains that aren&#8217;t on the public web &#8212; proprietary environments, long-horizon agentic work, specialized RL pipelines &#8212; the gap could widen again.</p><p>Which means two things need to be true at once. We need people working on models &#8212; the work DeepSeek, Qwen, and Meta are doing matters in a world where every month of lag is a month where someone signs a closed contract instead. But we also need people investing in the rest of the stack, because even when the models are competitive, teams are still defaulting to closed platforms. 
Not because the model is worse. Because the model is the only part that&#8217;s ready.</p><p><a href="https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html">Deloitte&#8217;s 2025 survey</a> found that nearly 60% named legacy integration and risk/compliance &#8212; not model quality &#8212; as their top barriers to deploying agentic AI. The 2026 follow-up (3,235 leaders, 24 countries) put insufficient worker skills at the top of the list, with legacy data and infrastructure right behind. Nobody in procurement is blocking on benchmarks.</p><p>The capability gap is a &#8220;months&#8221; problem. The packaging gap is a &#8220;years&#8221; problem. And right now, almost all the energy is going to the part that&#8217;s already closest to parity.</p><p>Enterprises are signing multi-year contracts, building integrations, and training teams on specific APIs. Every quarter that open source doesn&#8217;t have a credible enterprise story, those defaults harden. Procurement doesn&#8217;t re-evaluate because a benchmark improved. It re-evaluates when something is obviously easier, obviously cheaper, and obviously supported.</p><p>We&#8217;ve seen this movie before. In 2002, Linux had the kernel but not the enterprise story: no certifications, no vendor support, no one willing to put it in the data center with a pager attached. Red Hat, SUSE, and Canonical didn&#8217;t build a better kernel. They built the wrapper. They made it adoptable. That&#8217;s what turned an ideology into an industry. Ask Sun how not shipping the wrapper in time worked out.</p><p>The open-source AI stack is at that same inflection point. What&#8217;s missing is the boring stuff: the SLAs, the compliance tooling, the five-line integration. That&#8217;s not a research problem. It&#8217;s an engineering and institutional problem. 
And those are the kinds of problems this community has solved before.</p><p>We&#8217;re not losing on the model. We&#8217;re losing on the paperwork.</p><p>Look at the gap map. Find a red zone you know how to fix. Maybe that&#8217;s compliance tooling. Maybe it&#8217;s a five-line SDK. Maybe it&#8217;s the ToS template that doesn&#8217;t exist yet. Maybe you&#8217;re training the next open-weight model that keeps the three-month number from slipping. All of it counts.</p><p>If you&#8217;re building in any part of this stack &#8212; models, infrastructure, tooling, licensing, documentation &#8212; I want to hear from you. Reply here, DM me, yell at @raffihack. I&#8217;ll feature what you&#8217;re working on. Let&#8217;s close this gap.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/p/closing-the-open-source-gap?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h4><strong>What I&#8217;m Reading This Week</strong></h4><p><strong>Anthropic built a model that found zero-days in every major OS and browser</strong> &#8212; a 27-year-old OpenBSD bug that survived five million automated scans &#8212; and then didn&#8217;t ship it. Instead:<a href="https://www.anthropic.com/project/glasswing"> Project Glasswing</a>. $100M in credits to a consortium (AWS, Apple, Microsoft, Cisco, Linux Foundation) to harden infrastructure before Mythos-class capabilities proliferate. The<a href="https://red.anthropic.com/2026/mythos-preview/"> technical writeup</a> is worth your time. Notable absence from the consortium: the US government, which banned Claude from federal agencies. 
I've been thinking a lot about what this means for the people who weren't on the partner list.</p><p>Full disclosure: Nous Research is a Mozilla Ventures portfolio company, but I would highlight them anyway. <strong><a href="https://github.com/nousresearch/hermes-agent">Hermes Agent</a> is their answer to the question nobody at the closed platforms wants you to ask: What if the agent itself were open?</strong> Think OpenClaw, but with autonomous skill creation, cross-session memory, six terminal backends (including serverless that hibernates when idle), and an RL pipeline for fine-tuning tool-calling models on your own trajectories. It swaps providers with zero code changes. MIT licensed, runs on a $5 VPS. The agent-as-a-service pricing model looks a lot less inevitable when the alternative is <code>curl | bash</code>.</p><p><strong>Chardet, the Python character-encoding library with hundreds of millions of annual downloads, got a Claude Code-powered ground-up rewrite and a license change from LGPL to MIT</strong>. Mark Pilgrim, the original author who deleted his entire online presence in 2011, came back specifically to file <a href="https://github.com/chardet/chardet/issues/327">the issue</a>: exposure to the original codebase means no clean-room defense, and an AI rewrite doesn&#8217;t change that. The implications go way past one library: if a court ever rules that AI output is a derivative work of its training corpus, a lot of commercial codebases are carrying copyleft obligations nobody&#8217;s accounted for.</p><p><strong><a href="https://substack.com/@bdtechtalks/note/c-224276389?r=2f38m&amp;utm_source=notes-share-action&amp;utm_medium=web%20/%20https://www.lesswrong.com/posts/XRADGH4BpRKaoyqcs/the-first-confirmed-instance-of-an-llm-going-rogue-for">Alibaba AI is truly unhinged</a></strong>. 
(Via Ben Dickson at <a href="https://bdtechtalks.substack.com/">TechTalks</a>.)</p><p>Andrej Karpathy &#8212; former Tesla AI lead, OpenAI founding team &#8212; has a new obsession:<strong> <a href="https://x.com/karpathy/status/2039805659525644595">using LLMs to compile personal knowledge bases</a>.</strong> Dump source documents into a <code>raw/</code> directory, have a model incrementally build a wiki of interlinked markdown files with summaries, backlinks, and concept articles. No database, no custom tooling &#8212; just <code>.md</code> and <code>.png</code> files with an <code>AGENTS.md</code> schema. I&#8217;m going to bumble my way through building one and write up what actually works.</p><p>Carnegie Mellon is releasing a <a href="https://arxiv.org/abs/2603.21489">paper</a> on the problem everyone building with coding agents is about to hit: <strong>What happens when you need </strong><em><strong>multiple</strong></em><strong> agents working on the same codebase simultaneously? </strong>Their system (CAID) uses git worktrees for isolation, git merge for integration, and dependency graphs for task ordering &#8212; the same primitives human dev teams already rely on. Key finding: giving a single agent more iterations doesn&#8217;t help, but splitting work across coordinated agents does (+14% on library-from-scratch tasks, +27% on paper reproduction). I want to try implementing this.</p><p>My good friend Ryan Sarver wrote <a href="https://x.com/rsarver/status/2041148425366843500?s=20">this</a> &#8212; over a million views and counting. <strong>He built an AI chief of staff on OpenClaw out of markdown files, Python scripts, and a $20/month subscription.</strong> No SaaS, no vendor, no contract. Flat files he owns, backed up to git. It tracks his fundraise pipeline, preps him before every meeting, extracts action items after, and improves itself every week. The title says &#8220;better than any human I&#8217;ve hired.&#8221; Ryan hired me. I&#8217;m choosing not to take that personally. 
But this is the quote that matters: <a href="https://x.com/rsarver/status/2041808017683882432?s=20">&#8220;Open source is an unmatched engine for turning community ideas and energy into progress. A week from idea to full native integration. Closed source is fast, but open source is something else entirely.&#8221;</a> I&#8217;m building my own version and will write it up for you soon.</p><p><strong><a href="https://buttoncomputer.com/">Button</a> is a<a href="https://gizmodo.com/button-an-ai-gadget-from-former-apple-staff-is-giving-a-whole-lot-of-nothing-2000596498"> $180 AI pin from ex-Apple Vision Pro engineers</a></strong> that BLE-tethers to your iPhone and proxies every utterance to an unspecified cloud LLM. It might run something on-device. Nobody knows, including, apparently, the press. Eight bucks a month for unspecified Pro features. No Android. I keep waiting for AI hardware to be more than Siri in a lanyard, and it keeps shipping as a $180 curl to someone else&#8217;s API. My friend Ayah Bdeir<a href="https://www.technologyreview.com/2025/06/18/1118943/ai-hardware-open/"> nailed the structural reason</a> in MIT Tech Review: when the hardware layer is closed, every device converges on the same interaction model &#8212; press, talk, wait for cloud. You can&#8217;t iterate on form factor if you can&#8217;t touch the board. She&#8217;s now CEO of Current AI, which just<a href="https://restofworld.org/2026/current-ai-bhashini-open-source-handheld-device-2/"> demoed an open-source handheld at the India AI Summit</a> &#8212; local inference, no cloud round-trip, 22 languages via Bhashini, full schematics going to GitHub. We talk about open source like it&#8217;s a software problem. 
The hardware layer is just as locked, and it&#8217;s why every AI device on the market feels exactly the same.</p><p><strong><a href="https://news.ycombinator.com/item?id=47633396">Anthropic told every tool built on top of Claude: Use our API and pay per token or lose access.</a></strong> Starting April 4, subscribers can no longer route their subscription through third-party agent harnesses like OpenClaw. The stated reason is capacity: A single OpenClaw user consumes 6-8x the resources of a human subscriber, and subscriptions weren&#8217;t built for agentic workloads. The unstated context? OpenClaw&#8217;s creator had just joined OpenAI, and Anthropic had just shipped competing features into Claude Code. <strong>If you needed a concrete example of what platform dependency risk looks like in the AI stack, this is it. </strong>135,000 active instances woke up one Saturday to find their cost structure had changed overnight. Some users are reporting 50x increases. Others are switching to local models.</p><p><strong>Google Research open-sourced <a href="https://github.com/google-research/timesfm">TimesFM</a>, a foundation model for time-series forecasting </strong>&#8212; not &#8220;what word comes next?&#8221; but &#8220;what number comes next?,&#8221; applied to server load, revenue, error rates, stock prices, energy consumption. 200M parameters, 16k context length, PyTorch or JAX, Apache 2.0. Most teams doing this work are still hand-rolling statistical models or paying for a proprietary API.</p><p>Finally, file under <strong>DO NOT GET ME STARTED</strong>&#8230;: <strong><a href="https://www.nature.com/articles/d41586-026-01105-7">The White House proposed slashing NSF&#8217;s budget by 55%</a></strong> to $4 billion while claiming it will &#8220;maintain funding&#8221; for AI and quantum research. Read the <a href="https://www.whitehouse.gov/wp-content/uploads/2026/04/budget_fy2027.pdf">fine print</a>: basic AI research at NSF would be cut 32%, basic quantum 37%. 
The &#8220;maintained&#8221; funding goes to applied research at Defense and Energy. So the plan is to defund the pipeline that produces the science and then act surprised when there&#8217;s nothing left to apply. If you&#8217;re wondering why ownership of the AI stack matters, this is the policy environment you&#8217;re building in: The public funding for foundational research is being gutted while the administration tells you everything is fine because DARPA got a raise.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Your Mac Is a Model Server]]></title><description><![CDATA[(Here's how to treat it like one.)]]></description><link>https://newsletter.ownersnotrenters.com/p/your-mac-is-a-model-server</link><guid isPermaLink="false">https://newsletter.ownersnotrenters.com/p/your-mac-is-a-model-server</guid><dc:creator><![CDATA[Raffi Krikorian]]></dc:creator><pubDate>Wed, 01 Apr 2026 12:57:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!TbT0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TbT0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!TbT0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TbT0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic" width="728" height="485.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:268501,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.ownersnotrenters.com/i/192798317?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TbT0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!TbT0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c932006-d6e0-4bd0-b704-dc2903e48a71_1536x1024.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>You know that moment on a plane where you open your laptop, pull up your coding agent, and remember that it lives on someone else&#8217;s server? United&#8217;s wifi is doing its thing &#8212; which is to say, nothing &#8212; and you&#8217;re sitting there with an M4 Max that could run a 35-billion-parameter model but can&#8217;t complete a function call. This note is about fixing that, and a few hundred other decisions like it.</p><p>There are guides for running open-weight models locally. What&#8217;s missing is which model, which agent, and why. So, that&#8217;s what this is. 
By the end of this, you&#8217;ll have a working Claude Code equivalent running (kinda) entirely on your own hardware.</p><h4><strong>What You're Building</strong></h4><p>Three pieces: a local inference server (llama.cpp) that speaks OpenAI-compatible HTTP, a model you&#8217;ve chosen consciously with a license you&#8217;ve actually read, and an agent that routes tasks to it.</p><p>The agent I&#8217;m replacing Claude Code with is OpenCode &#8212; same terminal TUI feel, same agentic loop, reads your repo, proposes changes, executes tools, commits. If you&#8217;d rather stay in your editor, VS Code + Continue gets you something close to Cursor pointed at your local server. This guide focuses on OpenCode. The rest of the setup is identical either way.</p><p>Cursor bills me every time I look at it. This runs on hardware already sitting on my lap, and the marginal cost of the next token is zero.</p><h5>Step 1: Install the Runtime</h5><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;4c25312f-7fcc-41bb-a313-1439ab54892d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">brew install llama.cpp fzf hf</code></pre></div><p>llama.cpp is the inference server. Metal GPU support is on by default for Apple Silicon. fzf is needed by OpenCode for fuzzy search. hf is the official Hugging Face CLI &#8212; the right way to download models.</p><p>Confirm it worked:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;6aeda2e4-4db2-45b7-9cae-f8c08611520f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">llama-server --version</code></pre></div><h5>Step 2: Pick and Download Your Model</h5><p>The benchmark tables are mostly useless for this decision. 
Three questions matter: Does it fit in your RAM, does it call tools reliably, and can you actually build on the license?</p><p>Start with how much RAM you have:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;e937946c-34d5-4d06-ad1b-3bf027579b37&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">system_profiler SPHardwareDataType | grep -E "Chip|Memory"</code></pre></div><p><strong>16GB</strong><br>&#8594; Model: <em>Qwen2.5-Coder-7B-Instruct</em><br>&#8594; License: Apache 2.0<br>&#8594; Size: ~5GB<br>&#8594; Fast, reliable, great starting point</p><p><strong>32&#8211;48GB</strong><br>&#8594; Model: <em>gpt-oss-20b</em><br>&#8594; License: MIT<br>&#8594; Size: ~12GB<br>&#8594; Best for tool use + agent workflows</p><p><strong>64GB+</strong><br>&#8594; Model: <em>Qwen3.5-35B-A3B</em><br>&#8594; License: Apache 2.0<br>&#8594; Size: ~22GB<br>&#8594; More powerful, MoE architecture</p><p>Download commands:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;35c44ea1-3225-4a57-9cf0-83e08590ad51&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># 16GB
hf download Qwen/Qwen2.5-Coder-7B-Instruct-GGUF \
  qwen2.5-coder-7b-instruct-q4_k_m.gguf \
  --local-dir ~/models

# 32-48GB
hf download ggml-org/gpt-oss-20b-GGUF \
  gpt-oss-20b-mxfp4.gguf \
  --local-dir ~/models

# 64GB+
hf download unsloth/Qwen3.5-35B-A3B-GGUF \
  Qwen3.5-35B-A3B-Q4_K_M.gguf \
  --local-dir ~/models</code></pre></div><p>These downloads are large (5&#8211;22GB). Start one; go get coffee.</p><p>The 16GB and 64GB+ models are Q4_K_M: 4-bit quantization, the standard quality/size tradeoff. You lose a small amount of quality, gain a large reduction in size and memory. For most coding tasks, you won&#8217;t notice. The 32-48GB model is different: mxfp4 is how gpt-oss was actually trained, so there&#8217;s nothing to lose.</p><p>All three models in this issue are Apache 2.0 or MIT &#8212; you can build on them commercially without restrictions. (That&#8217;s not true of every popular model.) Before you add anything else to your stack, read the license file in the repo, not the marketing copy on the model card.</p><p>Case in point: yesterday, <a href="https://www.theinformation.com/briefings/alibabas-new-multimodal-ai-model-open-source">Alibaba released Qwen3.5-Omni &#8212; their new multimodal model &#8212; as API-only</a>. No weights, no license file, no download link. The previous version was fully open. The researcher most associated with Alibaba&#8217;s open-source work left the company earlier this month. None of this changes what&#8217;s in your <code>~/models</code> directory right now &#8212; those weights aren&#8217;t going anywhere. But it&#8217;s the kind of signal that makes me want to be better at evaluating alternatives, not worse. Next issue: how to actually do that with Kimi and DeepSeek.</p><h5>Step 3: Start the Server</h5><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;ee03976d-3c31-437a-9d8d-975d3c66dfba&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">llama-server \
  -m ~/models/Qwen3.5-35B-A3B-Q4_K_M.gguf \
  --jinja \
  --host 127.0.0.1 \
  --port 8080 \
  --ctx-size 32768 \
  -ngl 99</code></pre></div><p>Two flags worth explaining:</p><ul><li><p><code>--jinja</code> enables tool-calling. Skip it and your agent will appear to work but silently refuse to call tools. This flag is responsible for most of the &#8220;it won&#8217;t use tools&#8221; failures I&#8217;ve seen.</p></li><li><p><code>-ngl 99</code> offloads all model layers to Metal GPU. Skip it on Apple Silicon and you&#8217;re running on CPU &#8212; ten times slower than it should be.</p></li></ul><p>Wait for this line before doing anything else:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;921ceb53-807a-4027-96bc-15d4dadf8752&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">main: server is listening on http://127.0.0.1:8080</code></pre></div><p>Sanity check:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;02e75098-6d6a-4259-a660-6fa9d8d24936&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local",
    "messages": [{"role": "user", "content": "say hi in 5 words"}],
    "max_tokens": 20
  }'</code></pre></div><p>If you get a response, your stack is working. The first time I ran this I immediately opened Activity Monitor to watch the GPU. Everything after this is just routing clients at it.</p><h5>Step 4: The Agent - OpenCode</h5><p>My friends at <a href="http://anoma.ly">anoma.ly</a> built OpenCode as the open-source Claude Code equivalent. Same terminal TUI feel, same agentic loop. The difference: You&#8217;re not locked to Anthropic&#8217;s models, pricing, or terms.</p><p>(<em><strong>Timing note</strong></em>: As I was literally about to hit send, Anthropic accidentally shipped Claude Code&#8217;s entire source &#8212; 512,000 lines of TypeScript &#8212; in a .map file on npm. Nobody hacked anything. The front door was open. Within hours, a developer named Sigrid Jin used a different coding agent &#8212; OpenAI&#8217;s Codex &#8212; to rewrite the whole harness in Python before sunrise. The repo hit 48,000 stars in a day. A coding agent cloned a coding agent in one session. It&#8217;s days old, legally radioactive, and the author ships a <code>parity_audit.py</code> that&#8217;s refreshingly honest about what&#8217;s missing. I am not recommending you use it. I am noting that the universe has a sense of humor about the rent-vs.-own conversation.)</p><p>I did <code>brew install opencode</code> first. It installed, but it didn&#8217;t work. Thirty minutes later, I found the tap.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;b0d21dbe-27bd-481c-9b98-9a354dd1351e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">brew install anomalyco/tap/opencode</code></pre></div><p>Pre-install the provider package. OpenCode downloads this at startup, but it hangs silently if npm is slow. 
Do it manually once and you&#8217;ll never hit this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;50610dd1-452c-4a5d-8a81-5721c7209c73&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">mkdir -p ~/.cache/opencode
cd ~/.cache/opencode
npm install @ai-sdk/openai-compatible</code></pre></div><p>Now, create two config files, nothing else. In <code>~/.config/opencode/opencode.json</code></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;253d88f3-89ab-4a25-b3df-186bac7a4150&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "$schema": "https://opencode.ai/config.json",
  "model": "llamacpp/qwen3.5-35b",
  "provider": {
    "llamacpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "llama.cpp (local)",
      "options": {
        "baseURL": "http://127.0.0.1:8080/v1"
      },
      "models": {
        "qwen3.5-35b": {
          "name": "Qwen3.5-35B-A3B",
          "tools": true,
          "limit": {
            "context": 32768,
            "output": 8192
          }
        }
      }
    }
  }
}</code></pre></div><p>and in <code>~/.local/share/opencode/auth.json</code></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:&quot;b110bb9d-2fe7-4e65-9ce3-6fdfba2439cd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">{
  "llamacpp": {
    "apiKey": "sk-local"
  }
}</code></pre></div><p>Launch (server must be running first in a separate terminal):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;84be1833-f209-490f-95e7-92f0e194b8da&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">mkdir ~/test-project &amp;&amp; cd ~/test-project
opencode</code></pre></div><p>OpenCode will ask if you want to configure providers. Anthropic, OpenAI, a dozen others. You can skip every single one. You don&#8217;t have a remote platform. You don&#8217;t need a key. That moment &#8212; closing out of a credentials prompt with nothing in it &#8212; is the whole point of this newsletter in one keystroke.</p><p>Want proof that it works? Type this in the input box:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;7d818d6f-fb86-460d-a7e4-f22b1592e56e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">write a python script that prints &#8216;hello world&#8217;</code></pre></div><p>It does. Obviously. That&#8217;s not proof of anything except that your wiring is correct. Let&#8217;s make it actually work for its dinner.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;d502db18-b184-40e7-bff1-2cc286f75eaf&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">create a tetris clone in typescript that can be served statically out of a dist/ directory</code></pre></div><p>OpenCode maps out a plan, scaffolds the project, writes the code. I open the browser and&#8230;</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;c0c88d5d-d2f7-4e1f-9c61-ea1f10f7eb5e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Uncaught SyntaxError: unexpected token: &#8216;as&#8217; app.js:2:20</code></pre></div><p>One error. I pulled it from the browser console, pasted it back into OpenCode, and it fixed it. Served it again. Played Tetris &#8212; on a model running on my laptop, with no API call, no subscription, no telemetry.</p><p>I should be honest about what that means. 
If I&#8217;d done this in Claude Code with Opus 4.6, it probably would have gotten it right on the first try. I know it would have in Cursor &#8212; I&#8217;ve run this exact test enough times to be bored by it. A local 35B model making one fixable mistake on a from-scratch Tetris clone is genuinely good. It is not frontier-model good. That&#8217;s the tradeoff you&#8217;re signing up for, and you should know it going in.</p><p>That&#8217;s the whole stack: local model, Metal GPU inference, OpenAI-compatible API, open-source agent, tool-calling. Nothing left my machine.</p><h4><strong>What You&#8217;re Actually Getting Into</strong></h4><p><em>The tooling is fast-moving and occasionally broken.</em> OpenCode ships near-daily updates. The standard Homebrew formula lags. The startup hang &#8212; silently trying to download a provider package with no progress indicator &#8212; is a known issue with a simple manual workaround, which is why the setup above pre-installs it. This will improve. Worth knowing what you&#8217;re getting into.</p><p><em>Tool calling is fragile.</em> The model has to be trained for it and templated correctly. <code>--jinja</code> is the most common fix. Second most common: The model you chose wasn&#8217;t trained for tool use. Qwen3.5 and gpt-oss-20b are; not all models are.</p><p><em>Context size costs RAM.</em> The KV cache is the model&#8217;s working memory for a conversation &#8212; it stores computed state for every token in context, across every layer of the model. It grows linearly with context length, and at long contexts can rival the model weights themselves in memory. Start at 32k. Increase only if you need it and have confirmed headroom.</p><p><em>It&#8217;s faster than you might expect. </em>Running Qwen3.5-35B on an M4 Max, generation keeps up with you: comfortable reading speed is around 15-20 tokens per second, and the model clears it. You are not waiting on this model. 
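</p><p>You can put a number on that instead of trusting my vibes. llama.cpp&#8217;s server reports generation speed in its responses; the exact shape of the <code>timings</code> object varies by build, so treat this parsing as a sketch against a hand-written sample:</p>

```shell
# Sketch: pull tokens/sec out of a llama-server style response.
# The JSON below is a hand-written sample, shaped like llama.cpp's
# "timings" output; check what your build actually returns.
cat > /tmp/resp.json <<'EOF'
{"choices":[{"message":{"content":"hi"}}],"timings":{"predicted_n":180,"predicted_per_second":19.6}}
EOF
grep -o '"predicted_per_second":[0-9.]*' /tmp/resp.json | cut -d: -f2
```

<p>Run the same <code>grep</code> over a real response from the Step 3 sanity check to get your machine&#8217;s actual number.</p><p>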
Where you will feel it is autocomplete &#8212; round-trip latency for inline suggestions is longer than with a hosted API, and for that use case it&#8217;s a real regression. For agent tasks where you kick something off and come back, it&#8217;s a non-issue.</p><p><em>It works on a plane.</em> No wifi, no API, no problem. I&#8217;ve done more useful work in airplane mode with a local model than I ever did frantically tethering before descent. Higher latency is a real tradeoff. Offline is a real feature. Decide which one you&#8217;re optimizing for on any given day.</p><h4><strong>The Decision Framework: What Stays Local, What Goes Cloud</strong></h4><p>Local wins on:</p><ul><li><p>Proprietary code you shouldn&#8217;t be sending to an external API</p></li><li><p>Repetitive, high-volume tasks where cost is the actual constraint</p></li><li><p>Anything where predictable latency matters more than raw speed</p></li><li><p>Anywhere without reliable internet: planes, trains, a cabin, an SCIF</p></li></ul><p>Cloud still wins on:</p><ul><li><p>Tasks that genuinely require frontier capability: complex, multi-step reasoning, novel architecture work</p></li><li><p>Very long context where you don&#8217;t have the RAM to compete locally</p></li><li><p>One-off tasks where setup time exceeds cost savings</p></li></ul><p>The useful test: could a competent senior engineer do this task well? If yes, a good local model probably can, too. If the task requires something closer to &#8220;principal engineer with five years of context on your entire codebase,&#8221; you might still want the cloud.</p><p>The goal isn&#8217;t running everything locally. The goal is making the decision on purpose.</p><p>Now tell me about yours. What model are you running? What&#8217;s your setup? 
I&#8217;m particularly curious about the Kimi and DeepSeek models &#8212; next issue, let&#8217;s talk about how to actually evaluate and choose between them.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/p/your-mac-is-a-model-server/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/p/your-mac-is-a-model-server/comments"><span>Leave a comment</span></a></p><p></p><h4><strong>What I&#8217;m Reading This Week</strong></h4><p>The new SVG engine <a href="https://meodai.github.io/heerich/">Heerich</a> isn&#8217;t open source AI, but it is open source. And gorgeous.</p><p>Everyone&#8217;s arguing about which LLM to run locally. Meanwhile, LeCun&#8217;s lab shipped a <a href="https://arxiv.org/html/2603.19312v1">world model </a>&#8212; a system that doesn&#8217;t predict the next token, it predicts what happens next in a latent representation of the physical world. That distinction matters: token prediction gives you autocomplete; state prediction gives you planning, reasoning, and systems that can actually act on things. LeWorldModel does it in 15M params, one GPU, two loss terms, no pretrained encoder, 48x faster planning than foundation-model approaches. The JEPA bet is that you don&#8217;t need to reconstruct every pixel to understand cause and effect &#8212; just the compressed state that matters. <a href="https://github.com/lucas-maes/le-wm">Code&#8217;s on GitHub</a>.</p><p>I think the surveillance world we&#8217;re creating is generally creepy. (Part of the wonder of human memory is the ability to forget, to get fuzzy.) But I love this trend of people vibe-coding tools they want for themselves, like <a href="https://github.com/silverstein/minutes">this</a> app that makes your intent and decisions searchable by your AI. 
Of course, this is also the end of every SaaS company on the planet&#8230;</p><p>After virtual-world interaction, the next thing we all have to think about is robots and physical-world interaction. Maybe I should give <a href="https://www.orcahand.com/">these</a> plans to my son to print for his robot hand&#8230;</p><p>I haven&#8217;t had a chance to play with <a href="https://github.com/danveloper/flash-moe">this</a> yet, but I&#8217;m definitely intrigued. The bare-metal inference engine runs a ~400B parameter MoE model on a laptop (10x larger than the ones I&#8217;m talking about above). If it works, frontier models don&#8217;t have to live in the cloud, trading latency for independence.</p><p>Digging into <a href="https://yoonholee.com/meta-harness/">this</a> paper on optimizing the harness w.r.t. end performance.</p><p>Here&#8217;s the scientifically backed <a href="https://news.stanford.edu/stories/2026/03/ai-advice-sycophantic-models-research">version</a> of a <a href="https://www.theatlantic.com/ideas/archive/2025/10/validation-ai-raffi-krikorian/684764/">piece</a> I wrote about sycophancy. Stanford researchers found that AI reinforces users&#8217; harmful prompts 47% more than humans. 
Turns out humans are more than okay with that.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[All About Open-Source AI]]></title><description><![CDATA[It&#8217;s the future &#8212; but only if we act now.]]></description><link>https://newsletter.ownersnotrenters.com/p/all-about-open-source-ai</link><guid isPermaLink="false">https://newsletter.ownersnotrenters.com/p/all-about-open-source-ai</guid><dc:creator><![CDATA[Raffi Krikorian]]></dc:creator><pubDate>Thu, 26 Mar 2026 15:25:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!CX_F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CX_F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CX_F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 424w, 
https://substackcdn.com/image/fetch/$s_!CX_F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 848w, https://substackcdn.com/image/fetch/$s_!CX_F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 1272w, https://substackcdn.com/image/fetch/$s_!CX_F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CX_F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:619460,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.ownersnotrenters.com/i/191862235?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!CX_F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 424w, https://substackcdn.com/image/fetch/$s_!CX_F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 848w, https://substackcdn.com/image/fetch/$s_!CX_F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 1272w, https://substackcdn.com/image/fetch/$s_!CX_F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22bb1b0f-188a-451b-b8d0-89b79f6e3419_3353x2514.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>The Owners Not Renters Blueprint</strong></p><p>Closed AI is winning because it&#8217;s easier to use, not because it&#8217;s better. That&#8217;s not a permanent condition. It&#8217;s a design problem. And design problems can be solved.</p><p>Most AI coverage doesn&#8217;t treat it as a problem at all. It&#8217;s more focused on benchmarks and funding rounds and whatever shipped this week. This newsletter is for the people downstream of those announcements: the engineers and engineering leaders who have to decide, in a sprint, whether to build on something they don&#8217;t control.</p><p>I believe that you&#8217;re not choosing between open and closed AI. You&#8217;re choosing between owning and renting. Think about it: The landlord sets the price, the context window, the terms of service, and the deprecation schedule. Most of the time you learn about changes in a routine email&#8230;until the one that isn&#8217;t routine &#8212; the one where the model you built on will be gone in 90 days, or the price just doubled. The people sending that email aren&#8217;t malicious. They&#8217;re optimizing for their business. Which is exactly why you need to be building differently.</p><p>We&#8217;ve been here before. By 2003, Internet Explorer had 95% of the browser market. Firefox launched in 2004. IE never recovered, fading slowly at first, then completely. Open standards and open source decentralized control over the core technologies of the web.</p><p>Once again, the ground is shifting. Small models run on hardware that organizations already own. 
Enterprises are migrating off closed platforms in numbers that don&#8217;t make press releases. Governments are building sovereign AI supply chains. The question isn&#8217;t whether this transition will happen. The question is whether we build the developer experience to make open easier than closed before someone else locks the next layer down &#8212; before the defaults harden and a new landlord inherits the keys.</p><p>That&#8217;s the north star: that openness wins on ease, not principle. Before the window closes.</p><p>Open isn&#8217;t always the right call. In this newsletter, I&#8217;ll tell you when it isn&#8217;t. What I won&#8217;t do is pretend the default is neutral. Every team that ships on a closed platform without examining the decision is making a choice&#8230;by not making it.</p><p>My job at Mozilla puts me close to where that&#8217;s actually being built: developer tools, data infrastructure, investments across the ecosystem. I&#8217;ve spent a career inside platforms. I know what it looks like when a technology transition is inevitable and the only question is who shapes it.</p><p>This newsletter is for the people shaping &#8212; and shipping &#8212; it. Twice a month, I&#8217;ll explore orchestration, inference, cost, security, governance &#8212; the real stuff, from original analysis to deployment patterns. I&#8217;ll evaluate model releases based on what matters in production, not what looks good in a press release. I&#8217;ll share migration stories from builders doing the work (and making real tradeoffs). And I&#8217;ll report with transparency from the seminars, symposiums, and events that I attend.</p><p>That&#8217;s the &#8220;Think&#8221; part of the newsletter. In the &#8220;Do&#8221; section, I&#8217;ll share an open source project that I&#8217;ve been playing with.</p><p>So, ready to run a Claude Code equivalent on your own hardware? Catch the next issue. 
And in the meantime, let me know what you&#8217;ve been working on!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p><strong>Before you go&#8230; here&#8217;s what I&#8217;m reading this week</strong></p><p><a href="https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/">A compromised litellm release hit PyPI this week</a> - credential theft, Kubernetes lateral movement, persistent backdoors. Your LLM proxy library just turned hostile. Time to route through <a href="https://any-llm.ai/">Any-LLM</a> instead of installing your attack surface.</p><p>Finally, skill-based adaptation converges: Working independently, three <a href="https://arxiv.org/abs/2603.17187v1">papers</a> (MetaClaw, Memento-Skills, OpenSeeker) show that externalized skill/knowledge structures do better than static fine-tuning.</p><p>Which Chinese lab&#8217;s <a href="https://x.com/Kimi_Moonshot/status/2033378587878072424">open-source drop</a> panicked the market this week? As <a href="https://x.com/TukiFromKL/status/2033432054474461297">this</a> reaction says: &#8220;The AI race isn&#8217;t US vs China anymore... It&#8217;s closed vs open. And closed is losing.&#8221;</p><p>Meanwhile, <a href="https://www.wired.com/story/nvidia-investing-26-billion-open-source-models/">Nvidia is stepping into the open-source game</a>, putting $26b behind its effort to make OpenAI and the gang very nervous. This week, it introduced <a href="https://www.nvidia.com/en-us/ai/nemoclaw/#referrer=vanity">NemoClaw</a> and OpenShell runtime, with (much) more to come.</p><p>Ready for a personal AI agent that runs on your personal device? 
Meet Stanford&#8217;s <a href="https://scalingintelligence.stanford.edu/blogs/openjarvis/">OpenJarvis</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.ownersnotrenters.com/p/all-about-open-source-ai/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.ownersnotrenters.com/p/all-about-open-source-ai/comments"><span>Leave a comment</span></a></p><p></p>]]></content:encoded></item></channel></rss>