Field notes

Short notes from running the businesses.

Things I notice on Mondays. Lessons that cost me time or money. Build logs from the systems on this site. Occasional X posts that hit something worth keeping.

The notes are here so I do not lose the lesson.

Some of these came from shipped work. Some came from mistakes. The useful ones usually point to a copy edit, a process change, or a tool I should build.

Build Log
01

From Claude to Codex: a year of dual-CLI coding, charted.

A year of running Claude Code and Codex CLI side by side, reconstructed from local session logs and 22,945 commits across 58 repos. Dominance flipped in one specific month, November 2025, and it has stayed flipped for the five months since. Charts, model adoption latencies, and 11.78 billion Claude tokens of throughput.

ai tooling build-log
Build Log
02

Aggregation level is the lever, not the model.

On the anonymized forecasting work, going from Region × Product × Country × Week to Region × Week cut MAPE from 23% to 14%. Same data, same XGBoost, different granularity. Most 'model tuning' rounds should be aggregation experiments first.
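
A minimal sketch of what an aggregation experiment measures, using made-up rows rather than the forecasting data from the note. The helper names (`mape`, `roll_up`) and the numbers are assumptions for illustration. It only shows how errors at a fine grain can cancel once actuals and forecasts are summed to a coarser grain; retraining the model at the coarser level is a separate step this sketch does not cover.

```python
from collections import defaultdict


def mape(pairs):
    """Mean absolute percentage error over (actual, forecast) pairs."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def roll_up(rows, key_fields):
    """Sum actuals and forecasts over every dimension not in key_fields."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = tuple(row[k] for k in key_fields)
        totals[key][0] += row["actual"]
        totals[key][1] += row["forecast"]
    return [tuple(v) for v in totals.values()]


# Synthetic rows, illustrative only.
rows = [
    {"region": "EU", "product": "A", "week": 1, "actual": 100, "forecast": 130},
    {"region": "EU", "product": "B", "week": 1, "actual": 100, "forecast": 70},
    {"region": "US", "product": "A", "week": 1, "actual": 200, "forecast": 220},
    {"region": "US", "product": "B", "week": 1, "actual": 50, "forecast": 40},
]

fine = mape(roll_up(rows, ["region", "product", "week"]))
coarse = mape(roll_up(rows, ["region", "week"]))
print(f"fine-grain MAPE: {fine:.1f}%  coarse-grain MAPE: {coarse:.1f}%")
```

Running the same comparison on real forecasts before touching hyperparameters is a cheap way to see whether granularity, not the model, is where the error lives.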

ml forecasting
Build Log
03

AI agents are great until they aren't watched.

Three months into running them at volume: every autonomous agent worth running needs a human checkpoint before it sends, posts, or commits. The good ones tell you exactly what they want to do and wait.
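
A minimal sketch of that checkpoint, assuming a console prompt as the approval channel. `ProposedAction`, `ask_human`, and `run_with_checkpoint` are hypothetical names, not part of any agent framework. The agent states what it wants to do, and the side effect only runs after a yes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    kind: str     # hypothetical categories: "send", "post", "commit"
    summary: str  # the agent's plain-language statement of intent


def ask_human(action: ProposedAction) -> bool:
    """Console approver: show the proposed action and wait for a yes/no."""
    answer = input(f"Agent wants to {action.kind}: {action.summary} [y/N] ")
    return answer.strip().lower() == "y"


def run_with_checkpoint(
    action: ProposedAction,
    execute: Callable[[ProposedAction], None],
    approve: Callable[[ProposedAction], bool] = ask_human,
) -> str:
    """Execute the side effect only after the approver says yes."""
    if not approve(action):
        return "skipped"
    execute(action)
    return "executed"
```

Passing the approver in as a callback keeps the gate swappable: a console prompt while testing, a chat or ticket approval once the agent runs at volume.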

agents ai
Operating Note
04

Most contractor software is built for the office, not the truck.

The owners I work with are on a job site at 7am and answering missed calls at 9pm. If your software needs a Wednesday training session to use, you've already lost them.

contractors ux
Build Log
05

Ad spend without attribution is a tax on optimism.

Every Google or Meta account I audit has at least one campaign that hasn't fired a real conversion event since 2023. Fix the tracking before you touch a single bid.

ads attribution
Operating Note
06

Enterprise AI fails because the process is broken, not because the model is.

The commonly cited stat is that 95% of enterprise AI pilots return no measurable ROI. The bottleneck is almost never model quality. Every quarter the models improve and the failure rate doesn't move. The gap is between how work is documented and how work actually flows. The pilots that succeed embed someone hands-on for four-plus weeks before any AI code is written, decompose the work into ~85% deterministic code and ~15% LLM judgment, and ship the boring part first.

ai-implementation enterprise ops
Operating Note
07

An AI-native agency targeting 65% gross margin designed it wrong.

Legacy agency gross margins sit at 60-70% because delivery labor eats 30-40% of revenue. Restructured on top of AI, delivery COGS should drop to 10-20% and gross margin should clear 80%. If you're targeting the old number after the redesign, you missed the redesign. The margin is the diagnostic. It tells you whether your AI integration actually changed how work flows, or whether you just bought more tools.

agency margin ai-native
Operating Note
08

Service businesses are winning the next five years. Stop forcing yourself to build SaaS.

The "SaaS beats agency" narrative rested on one assumption: that agencies couldn't scale without proportional labor. AI removes that ceiling. Service businesses already have distribution, trust, immediate cash flow, and customer relationships that take SaaS companies years to build. If you're running both, the agency might be the thesis and the software might be the funding mechanism. Most of tech Twitter has this inverted right now.

business-model service-business ai-era
Operating Note
09

The AI margin collapse hits agencies that sold labor, not agencies that built a moat.

If your delivery COGS is 30-40% of revenue, AI is going to compress those numbers from underneath whether you adopt it or not. The way out isn't faster tools. It's outcome-based pricing on top of vertically-accumulated data nobody else has. Agencies that priced by the hour are about to compete with anyone who can run the same model. Agencies that priced by results and own their domain dataset can't be commoditized because the commodity is execution, not insight.

agency ai-disruption pricing
Operating Note
010

Selling "more leads" to home-services contractors is selling CPU speed to someone who needs a database.

Contractors in mature niches have heard the leads pitch 500 times. They're solution-aware, and they don't believe a volume claim from anyone they haven't worked with. What lands is naming a mechanism they recognize as proprietary (seasonal patterns, specific competitor data, a niche-only signal stack) that commodity lead platforms can't replicate. Pitch differentiation, not throughput.

positioning agency-sales messaging
Operating Note
011

Ask what the contractor is actually doing, not what they want.

Most paving and garage-door contractors will tell you they want commercial work. Their scheduling, follow-up, and pricing are all built around residential volume. The diagnostic move is to measure the stated target mix against the actual mix and ask why the gap exists. Most of the time the answer isn't repositioning. It's tightening operations to support the work they already do.

contractor-ops diagnostics positioning
Operating Note
012

Market feedback isn't three themes. It's the ledger of objections you stopped listening to.

When you compress sales-call findings into three executive-summary themes, the real signal (repeated wording, edge-case friction, qualification flags) gets thrown out. Keep the small findings. Every minor pattern that shows up three or more times maps to a copy edit, a qualification question, or an automation that compounds. The themes don't ship product. The ledger does.

product-feedback gtm research
Operating Note
013

Your CRM should be separate from your job-execution system.

Contractors try to make one tool own lead capture, pipeline, and job production. Splitting it cleanly, GHL for lead CRM and the trade-specific tool (ServiceTitan, HCP, Jobber) for production, keeps you from over-promising what one platform can do and clarifies who owns each handoff. If your CRM is also dispatching trucks, neither half is doing its job well.

agency-ops crm tooling
Operating Note
014

Default settings in cold-email platforms route most of your sends to spam, and the dashboard won't tell you.

Five settings tank deliverability silently: open tracking on (pixel flags spam), send intervals under 20 minutes (machine cadence), daily limit above 15 per mailbox, weekend sends, and the default "delivery optimization" toggles both on. Most platforms keep showing flat open rates from spam-folder pixel hits, so you don't notice until reply rate craters. Reply rate is the only honest deliverability metric. Audit before you scale, not after.
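
A minimal sketch of that audit as a pre-scale check. The setting keys are hypothetical (every platform names them differently), the thresholds are the ones from the note, and flagging either "delivery optimization" toggle rather than only both is an assumption.

```python
def audit_mailbox(settings):
    """Return one finding per risky setting named in the note."""
    findings = []
    if settings["open_tracking"]:
        findings.append("open tracking is on")
    if settings["send_interval_minutes"] < 20:
        findings.append("send interval under 20 minutes")
    if settings["daily_limit"] > 15:
        findings.append("daily limit above 15 per mailbox")
    if settings["weekend_sends"]:
        findings.append("weekend sends enabled")
    # Assumption: flag the mailbox if either optimization toggle is on.
    if any(settings["delivery_optimization_toggles"]):
        findings.append("delivery optimization toggle on")
    return findings


def reply_rate(replies, sends):
    """Replies per send, the metric the note treats as the honest one."""
    return replies / sends if sends else 0.0


# Illustrative values only, not any platform's real defaults.
platform_defaults = {
    "open_tracking": True,
    "send_interval_minutes": 5,
    "daily_limit": 50,
    "weekend_sends": True,
    "delivery_optimization_toggles": [True, True],
}

for finding in audit_mailbox(platform_defaults):
    print("FIX:", finding)
```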

cold-email deliverability ops
Build Log
015

Cold-email reply classification doesn't scale unless you keep watching it.

A reply classifier ships at 95% accuracy on a heuristic stack. Three months later it's at 80% because the 5% miss class grew, the senders changed, and nobody resampled. Sustainable classification is sampling-on-a-cadence as an architecture decision, not a training step you do once and walk away from. Otherwise quality silently degrades and no one notices until pipeline drops.
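
A minimal sketch of sampling on a cadence, assuming a human relabels a small random sample each period. The function names and the 0.90 alert floor are assumptions, not values from the note.

```python
import random


def sample_for_review(predictions, k, seed=None):
    """Pick up to k classified replies at random for a human to relabel."""
    rng = random.Random(seed)
    return rng.sample(predictions, min(k, len(predictions)))


def audited_accuracy(reviewed):
    """reviewed: list of (predicted_label, human_label) pairs."""
    correct = sum(1 for predicted, human in reviewed if predicted == human)
    return correct / len(reviewed)


def needs_attention(accuracy, floor=0.90):
    """Assumed policy: alert when audited accuracy drops below the floor."""
    return accuracy < floor
```

Run it on a schedule (weekly, or every N replies) and track the audited accuracy over time; the trend is what catches the slide from 95% to 80% before the pipeline does.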

cold-email automation quality
Build Log
016

API credentials aren't secrets. They're assumptions about risk.

Encrypting a Stripe publishable key is security theater. Encrypting a key that can drain a bank account is necessary. Treating every credential the same forces low-leverage controls on harmless tokens and lets high-risk ones hide in the middle of a long list. The right unit of protection is the credential's blast radius, not its format.
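
A minimal sketch of classifying by blast radius instead of format. The tiers, the credential names, and the "encrypt anything above public" policy are all assumptions for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class BlastRadius(IntEnum):
    PUBLIC = 0       # safe to expose, e.g. a publishable key
    READ_ONLY = 1    # can leak data but not change it
    WRITE = 2        # can modify or delete data
    MOVES_MONEY = 3  # can initiate payouts or transfers


@dataclass
class Credential:
    name: str
    radius: BlastRadius


def needs_encryption(cred: Credential) -> bool:
    """Assumed policy: anything above PUBLIC gets encrypted at rest."""
    return cred.radius > BlastRadius.PUBLIC


def review_order(creds):
    """Highest blast radius first, so risky keys never hide mid-list."""
    return sorted(creds, key=lambda c: c.radius, reverse=True)


# Hypothetical inventory.
inventory = [
    Credential("stripe_publishable_key", BlastRadius.PUBLIC),
    Credential("analytics_read_token", BlastRadius.READ_ONLY),
    Credential("bank_payout_key", BlastRadius.MOVES_MONEY),
]

for cred in review_order(inventory):
    print(cred.name, cred.radius.name, "encrypt" if needs_encryption(cred) else "skip")
```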

security credentials ops
Build Log
017

Autonomous agents need three things wired before the loop runs.

A heartbeat tells your supervisor the agent is still alive. A fallback provider keeps the loop running when the primary inference fails or quota burns. A cheap default model stops the loop from auto-selecting the priciest option every iteration. Skip any of the three and you'll get one of: a hung run nobody notices, a cascade outage, or a five-figure surprise bill. None of them are optional in production.
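
A minimal sketch of the three pieces wired together, assuming providers are plain callables that take `(prompt, model)`. `Heartbeat`, `call_with_fallback`, `run_loop`, and the model name are hypothetical, not any vendor's SDK.

```python
import time

CHEAP_DEFAULT_MODEL = "small-model"  # hypothetical model name


class Heartbeat:
    """Records the last time the loop proved it was alive."""

    def __init__(self):
        self.last_beat = None

    def beat(self, now=None):
        self.last_beat = time.monotonic() if now is None else now

    def is_stale(self, timeout, now=None):
        """True if the loop never beat or has been silent past the timeout."""
        if self.last_beat is None:
            return True
        now = time.monotonic() if now is None else now
        return (now - self.last_beat) > timeout


def call_with_fallback(prompt, providers, model=CHEAP_DEFAULT_MODEL):
    """Try each provider in order; the cheap model is the default."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt, model)
        except Exception as exc:  # broad on purpose: any failure falls through
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")


def run_loop(prompts, providers, heartbeat):
    """Beat before every iteration so a supervisor can spot a hung run."""
    results = []
    for prompt in prompts:
        heartbeat.beat()
        results.append(call_with_fallback(prompt, providers))
    return results
```

A supervisor process would poll `is_stale` on its own timer and restart or page when it returns true. Escalating to a pricier model should be an explicit argument, never the default.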

agents reliability cost
Build Log
018

Edge middleware will silently bundle Node-only modules and crash in production.

Next.js middleware runs on the Edge runtime by default, which can't load Node's crypto module, pg, bcrypt, or anything that ships native bindings. Bundlers will happily pull them in via an innocent-looking auth import and not warn you. The deploy goes green, pages load, and you start logging "Cannot redefine property: __import_unsupported" until you trace the import tree. Keep middleware lean and move database guards into route handlers.

nextjs edge-runtime bundling
Build Log
019

When OAuth or environment-variable errors hit production, check the database schema first.

A surprising number of "secrets are missing" or "auth is broken" production incidents trace back to a migration that didn't run. The code expects a column, the column isn't there, the ORM throws a generic 500, and the error reaches the user as something else. Check supabase_migrations.schema_migrations (or your equivalent) before you go chasing secrets and tokens. Schema drift is the silent failure mode that masquerades as everything else.
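
A minimal sketch of that first check, assuming migration files follow the `<version>_<name>.sql` naming pattern and that you have already fetched the applied versions (for example from the `version` column of `supabase_migrations.schema_migrations`, if your setup stores them that way). The helper names and file names are hypothetical.

```python
def version_of(filename):
    """Take the version prefix from a '<version>_<name>.sql' filename."""
    return filename.split("_", 1)[0]


def missing_migrations(local_files, applied_versions):
    """Return local migration files whose version was never applied."""
    applied = set(applied_versions)
    return [f for f in sorted(local_files) if version_of(f) not in applied]


# Hypothetical example: two migrations in the repo, one applied in production.
local_files = [
    "20250101120000_create_accounts.sql",
    "20250215093000_add_oauth_columns.sql",
]
applied_versions = ["20250101120000"]

for name in missing_migrations(local_files, applied_versions):
    print("NOT APPLIED:", name)
```

An empty result clears schema drift as a suspect, and then chasing secrets and tokens is worth the time.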

database deployment debugging
Build Log
020

Bad WiFi is a topology problem, not a speed problem.

People upgrade their ISP every time the home network drags. The real fix is usually the placement of access points, band steering, and a wired backhaul between APs. Latency and jitter happen in the layout of your network, not in the size of the pipe coming in. I've gotten more uplift from a $40 ethernet run than from a $40-per-month plan upgrade.

networking home-infra diagnostics