Field notes

Short notes from running the businesses.

Things I notice on Mondays. Lessons that cost me time or money. Build logs from the systems on this site. Occasional X posts that hit something worth keeping.

Build Log

Aggregation level is the lever, not the model.

On the anonymized forecasting work, going from Region × Product × Country × Week to Region × Week cut MAPE from 23% to 14%. Same data, same XGBoost, different granularity. Most 'model tuning' rounds should be aggregation experiments first.
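A minimal sketch of an aggregation experiment, using made-up rows rather than the anonymized data. It only rolls fine-grained forecasts up to a coarser level and compares MAPE at each grain; in practice you would also retrain the model at each level. The helper names `mape` and `aggregate` are hypothetical.

```python
from collections import defaultdict

def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Skips zero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def aggregate(rows, keys):
    """Sum `actual` and `forecast` over every column not listed in `keys`."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group][0] += row["actual"]
        totals[group][1] += row["forecast"]
    return list(totals.values())

# Made-up rows: errors at the fine grain partly cancel when rolled up.
rows = [
    {"region": "EU", "product": "A", "week": 1, "actual": 100, "forecast": 130},
    {"region": "EU", "product": "B", "week": 1, "actual": 100, "forecast": 80},
    {"region": "US", "product": "A", "week": 1, "actual": 200, "forecast": 170},
    {"region": "US", "product": "B", "week": 1, "actual": 200, "forecast": 220},
]

fine = aggregate(rows, ["region", "product", "week"])
coarse = aggregate(rows, ["region", "week"])

fine_mape = mape([a for a, _ in fine], [f for _, f in fine])
coarse_mape = mape([a for a, _ in coarse], [f for _, f in coarse])
print(f"fine grain: {fine_mape:.2f}%  coarse grain: {coarse_mape:.2f}%")
```

Running the same comparison across several candidate grains before touching hyperparameters is the cheap version of the experiment.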

ml, forecasting

Build Log

AI agents are great until they aren't watched.

Three months into running them at volume: every autonomous agent worth running needs a human checkpoint before it sends, posts, or commits. The good ones tell you exactly what they want to do and wait.
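One way to picture the checkpoint is a gate that shows the proposed action and runs it only after approval. This is a sketch under assumptions: `ProposedAction`, `run_with_checkpoint`, and the `approve` callback are hypothetical names, and in real use `approve` would block on a person rather than a lambda.

```python
from dataclasses import dataclass

@dataclass
class ProposedAction:
    kind: str      # e.g. "send_email", "post", "commit"
    summary: str   # plain-language description shown to the reviewer

def run_with_checkpoint(action, execute, approve):
    """Run `execute` only if the human-facing `approve` callback returns True."""
    if not approve(action):
        return "skipped"
    execute(action)
    return "executed"

sent = []
action = ProposedAction("send_email", "Reply to a lead with the pricing sheet")

# Stand-ins for a human decision: one denial, one approval.
result_denied = run_with_checkpoint(action, sent.append, lambda a: False)
result_approved = run_with_checkpoint(action, sent.append, lambda a: True)
```

The point of the shape is that the agent never holds the `execute` side on its own; it can only describe what it wants to do.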

agents, ai

Operating Note

Most contractor software is built for the office, not the truck.

The owners I work with are on a job site at 7am and answering missed calls at 9pm. If your software needs a Wednesday training session to use, you've already lost them.

contractors, ux

Build Log

Ad spend without attribution is a tax on optimism.

Every Google or Meta account I audit has at least one campaign that hasn't fired a real conversion event since 2023. Fix the tracking before you touch a single bid.

ads, attribution

Operating Note

Enterprise AI fails because the process is broken, not because the model is.

MIT's 2025 study found that 95% of enterprise AI pilots returned no measurable ROI, and Luke Pierce's 2026 survey of 242 businesses found 52% stuck on not knowing where to start, while only 4% named budget as the blocker. The bottleneck is almost never model quality: every quarter the models improve and the failure rate doesn't move. The gap is between how work is documented and how work actually flows. The pilots that succeed embed someone hands-on for four-plus weeks before any AI code is written, decompose the work into ~85% deterministic code and ~15% LLM judgment, and ship the boring part first.
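The deterministic-first split can be sketched as a router: plain rules handle whatever they can, and only the remainder goes to a model. Everything here is illustrative. `route`, the rule list, and the `llm_judge` stub are hypothetical, and no real LLM is called.

```python
def route(item, rules, llm_judge):
    """Apply the first matching deterministic rule; call the LLM only if none match."""
    for matches, handle in rules:
        if matches(item):
            return handle(item), "deterministic"
    return llm_judge(item), "llm"

# Hypothetical rule: anything that looks like an invoice number is an invoice.
rules = [
    (lambda text: text.startswith("INV-"), lambda text: "invoice"),
]

# Stand-in for a model call; a real one would go to an inference provider.
def llm_judge(text):
    return "needs_review"

routed_invoice = route("INV-1001", rules, llm_judge)
routed_other = route("odd free-text request", rules, llm_judge)
```

Shipping the rule list first is the "boring part"; the judgment path can be added once the rules show what they cannot cover.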

ai-implementation, enterprise, ops

Operating Note

An AI-native agency targeting 65% gross margin designed it wrong.

Legacy agency gross margins sit at 60-70% because delivery labor eats 30-40% of revenue. Restructured on top of AI, delivery COGS should drop to 10-20% and gross margin should clear 80%. If you're targeting the old number after the redesign, you missed the redesign. The margin is the diagnostic. It tells you whether your AI integration actually changed how work flows, or whether you just bought more tools.
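The arithmetic, worked at the midpoints of the ranges above. The revenue figure is arbitrary and only there to make the numbers concrete.

```python
def gross_margin(revenue, delivery_cogs):
    """Gross margin as a fraction of revenue."""
    return (revenue - delivery_cogs) / revenue

revenue = 100_000  # illustrative figure, not from any real account

# Legacy: delivery labor at 35% of revenue (midpoint of 30-40%).
legacy = gross_margin(revenue, 0.35 * revenue)

# AI-native: delivery COGS at 15% of revenue (midpoint of 10-20%).
ai_native = gross_margin(revenue, 0.15 * revenue)

print(f"legacy: {legacy:.0%}  ai-native: {ai_native:.0%}")
```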

agency, margin, ai-native

Operating Note

Service businesses are winning the next five years. Stop forcing yourself to build SaaS.

The "SaaS beats agency" narrative rested on one assumption: that agencies couldn't scale without proportional labor. AI removes that ceiling. Service businesses already have distribution, trust, immediate cash flow, and customer relationships that take SaaS companies years to build. The more useful move is to put AI inside the service business and let it scale, rather than abandoning it to chase software multiples.

business-model, service-business, ai-era

Operating Note

The AI margin collapse hits agencies that sold labor, not agencies that built a moat.

If your delivery COGS is 30-40% of revenue, AI is going to compress those numbers from underneath whether you adopt it or not. The way out isn't faster tools. It's outcome-based pricing on top of vertically accumulated data nobody else has. Agencies that priced by the hour are about to compete with anyone who can run the same model. Agencies that priced by results and own their domain dataset can't be commoditized because the commodity is execution, not insight.

agency, ai-disruption, pricing

Operating Note

Selling "more leads" to home-services contractors is selling CPU speed to someone who needs a database.

Contractors in mature niches have heard the leads pitch 500 times. They're solution-aware, and they don't believe a volume claim from anyone they haven't worked with. What lands is naming a mechanism they recognize as proprietary (seasonal patterns, specific competitor data, a niche-only signal stack) that commodity lead platforms can't replicate. Pitch differentiation, not throughput.

positioning, agency-sales, messaging

Operating Note

010

Ask what the contractor is actually doing, not what they want.

Most paving and garage-door contractors will tell you they want commercial work. Their scheduling, follow-up, and pricing are all built around residential volume. The diagnostic move is to measure the stated target mix against the actual mix and ask why the gap exists. Most of the time the answer isn't repositioning. It's tightening operations to support the work they already do.
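Measuring the gap can be as simple as counting completed jobs by segment and subtracting from the stated target. A sketch with made-up job records; the field name `segment` and the helpers `mix` and `mix_gap` are hypothetical.

```python
from collections import Counter

def mix(jobs):
    """Share of jobs per segment, as fractions that sum to 1."""
    counts = Counter(job["segment"] for job in jobs)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}

def mix_gap(stated, actual):
    """Stated share minus actual share, per segment."""
    segments = set(stated) | set(actual)
    return {s: stated.get(s, 0.0) - actual.get(s, 0.0) for s in segments}

# Made-up history: nine residential jobs for every commercial one.
jobs = [{"segment": "residential"}] * 9 + [{"segment": "commercial"}]
stated_target = {"commercial": 0.5, "residential": 0.5}

gap = mix_gap(stated_target, mix(jobs))
```

A large positive gap on commercial is the prompt for the "why" conversation, not a conclusion in itself.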

contractor-ops, diagnostics, positioning

Operating Note

011

Market feedback isn't three themes. It's the ledger of objections you stopped listening to.

When you compress sales-call findings into three executive-summary themes, the real signal (repeated wording, edge-case friction, qualification flags) gets thrown out. Keep the small findings. Every minor pattern that shows up three or more times maps to a copy edit, a qualification question, or an automation that compounds. The themes don't ship product. The ledger does.
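A ledger like this can start as a counter over raw call notes. The sketch below assumes findings are already logged as short strings, which is the hard part in practice; `recurring_findings` and the sample notes are hypothetical.

```python
from collections import Counter

def recurring_findings(findings, min_count=3):
    """Return findings that appear at least `min_count` times, with their counts."""
    counts = Counter(text.strip().lower() for text in findings)
    return {text: n for text, n in counts.items() if n >= min_count}

# Made-up call notes.
call_notes = [
    "Asked if it works with QuickBooks",
    "asked if it works with quickbooks",
    "Confused by the word 'pipeline'",
    "Asked if it works with QuickBooks ",
    "Wanted pricing before the demo",
]

ledger = recurring_findings(call_notes)
```

Exact-match counting misses paraphrases, so treat the threshold as a floor for what to act on, not a full picture.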

product-feedback, gtm, research

Operating Note

012

Your CRM should be separate from your job-execution system.

Contractors try to make one tool own lead capture, pipeline, and job production. Splitting it cleanly, with GoHighLevel (GHL) for lead CRM and a trade-specific tool (ServiceTitan, Housecall Pro, Jobber) for production, keeps you from over-promising what one platform can do and clarifies who owns each handoff. If your CRM is also dispatching trucks, neither half is doing its job well.

agency-ops, crm, tooling

Operating Note

013

Default settings in cold-email platforms route most of your sends to spam, and the dashboard won't tell you.

Five settings tank deliverability silently: open tracking on (the pixel flags spam), send intervals under 20 minutes (machine cadence), a daily limit above 15 per mailbox, weekend sends, and both default "delivery optimization" toggles left on. Most platforms keep showing flat open rates from spam-folder pixel hits, so you don't notice until reply rate craters. Reply rate is the only honest deliverability metric. Audit before you scale, not after.
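The five checks can be written down as a small audit so they run before every new mailbox goes live. This is a sketch, not any platform's API: `MailboxSettings` and its field names are hypothetical, and the "platform default" values are invented for the example. Only the thresholds come from the note.

```python
from dataclasses import dataclass

@dataclass
class MailboxSettings:
    open_tracking: bool
    send_interval_minutes: int
    daily_limit: int
    weekend_sends: bool
    delivery_optimization_toggles: tuple  # two booleans, one per toggle

def audit(settings):
    """Return a list of the risky settings found on one mailbox."""
    issues = []
    if settings.open_tracking:
        issues.append("open tracking on")
    if settings.send_interval_minutes < 20:
        issues.append("send interval under 20 minutes")
    if settings.daily_limit > 15:
        issues.append("daily limit above 15")
    if settings.weekend_sends:
        issues.append("weekend sends on")
    if all(settings.delivery_optimization_toggles):
        issues.append("delivery optimization toggles both on")
    return issues

# Invented values standing in for an untouched platform default.
platform_default = MailboxSettings(True, 5, 50, True, (True, True))
tuned = MailboxSettings(False, 25, 15, False, (False, False))

default_issues = audit(platform_default)
tuned_issues = audit(tuned)
```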

cold-email, deliverability, ops

Build Log

014

Cold-email reply classification doesn't scale unless you keep watching it.

A reply classifier ships at 95% accuracy on a heuristic stack. Three months later it's at 80% because the 5% miss class grew, the senders changed, and nobody resampled. Sustainable classification treats sampling on a cadence as an architecture decision, not a training step you do once and walk away from. Otherwise quality silently degrades and no one notices until pipeline drops.
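Sampling on a cadence can be sketched as three small pieces: pull a random sample, compare it to human labels, and flag when accuracy falls below a floor. The 0.90 floor, the label names, and the record shape are assumptions for illustration, and the scheduler that runs this weekly is left out.

```python
import random

def sample_for_review(replies, sample_size, seed=None):
    """Pick up to `sample_size` replies at random for human labeling."""
    if len(replies) <= sample_size:
        return list(replies)
    return random.Random(seed).sample(replies, sample_size)

def audited_accuracy(sample, human_labels):
    """Fraction of sampled replies where the classifier agrees with the human."""
    correct = sum(1 for r in sample if r["predicted"] == human_labels[r["id"]])
    return correct / len(sample)

def needs_attention(accuracy, floor=0.90):
    """True when audited accuracy has dropped below the agreed floor."""
    return accuracy < floor

# Made-up week: the classifier called everything "interested".
replies = [{"id": i, "predicted": "interested"} for i in range(10)]
human_labels = {i: "interested" for i in range(10)}
human_labels[0] = "not_interested"
human_labels[1] = "out_of_office"

sample = sample_for_review(replies, sample_size=10)
accuracy = audited_accuracy(sample, human_labels)
flagged = needs_attention(accuracy)
```

Logging `accuracy` each run gives you the trend line, which is what shows drift before pipeline does.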

cold-email, automation, quality

Build Log

015

API credentials aren't secrets. They're assumptions about risk.

Encrypting a Stripe publishable key is security theater. Encrypting a key that can drain a bank account is necessary. Treating every credential the same forces low-leverage controls on harmless tokens and lets high-risk ones hide in the middle of a long list. The right unit of protection is the credential's blast radius, not its format.
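Tiering by blast radius can be made explicit in code, so the controls follow from the tier rather than from the credential's format. The three tiers, the control lists, and the credential names below are illustrative assumptions, not a standard.

```python
from enum import IntEnum

class BlastRadius(IntEnum):
    PUBLIC = 0     # safe to expose, e.g. a publishable client-side key
    LIMITED = 1    # read-only or narrowly scoped
    CRITICAL = 2   # can move money or destroy data

# Hypothetical control sets per tier.
CONTROLS = {
    BlastRadius.PUBLIC: [],
    BlastRadius.LIMITED: ["encrypt at rest"],
    BlastRadius.CRITICAL: ["encrypt at rest", "rotate on a schedule", "alert on use"],
}

def controls_for(credentials):
    """Map each credential name to its controls, highest blast radius first."""
    ordered = sorted(credentials.items(), key=lambda kv: kv[1], reverse=True)
    return {name: CONTROLS[radius] for name, radius in ordered}

credentials = {
    "stripe_publishable_key": BlastRadius.PUBLIC,
    "bank_payout_key": BlastRadius.CRITICAL,
}

plan = controls_for(credentials)
```

Sorting by tier puts the dangerous credentials at the top of the list instead of the middle.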

security, credentials, ops

Build Log

016

Autonomous agents need three things wired before the loop runs.

A heartbeat tells your supervisor the agent is still alive. A fallback provider keeps the loop running when the primary inference fails or quota burns. A cheap default model stops the loop from auto-selecting the priciest option every iteration. Skip any of the three and you'll get one of: a hung run nobody notices, a cascade outage, or a five-figure surprise bill. None of them are optional in production.
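The three pieces fit in a few lines of loop code. This is a sketch under assumptions: providers are plain callables, `"small-model"` is a placeholder model name, and the heartbeat is whatever callback your supervisor exposes. None of it is a real SDK.

```python
import time

CHEAP_DEFAULT_MODEL = "small-model"  # hypothetical model name

class ProviderError(Exception):
    """Raised by a provider when inference fails or quota is exhausted."""

def call_with_fallback(providers, prompt, model=CHEAP_DEFAULT_MODEL):
    """Try each provider in order; raise only if all of them fail."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt, model)
        except ProviderError as err:
            last_error = err
    raise ProviderError("all providers failed") from last_error

def run_loop(tasks, providers, heartbeat):
    """Process tasks, emitting a heartbeat before each one."""
    results = []
    for task in tasks:
        heartbeat(time.time())  # tell the supervisor the agent is alive
        results.append(call_with_fallback(providers, task))
    return results

# Stand-in providers: the primary always fails, the backup echoes its input.
def primary(prompt, model):
    raise ProviderError("quota exceeded")

def backup(prompt, model):
    return f"{model}:{prompt}"

beats = []
results = run_loop(["a", "b"], [primary, backup], beats.append)
```

Because the model is a default argument, a caller has to opt in to anything pricier explicitly.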

agents, reliability, cost

Build Log

017

Edge middleware will silently bundle Node-only modules and crash in production.

Next.js middleware runs on the Edge runtime by default, which can't load Node's built-in crypto module, pg, bcrypt, or anything else that depends on Node built-ins or native bindings. Bundlers will happily pull them in via an unsuspecting auth import and not warn you. The deploy goes green, pages load, and you start logging "Cannot redefine property: __import_unsupported" until you trace the import tree. Keep middleware lean and move database guards into route handlers.

nextjs, edge-runtime, bundling

Build Log

018

When OAuth or environment-variable errors hit production, check the database schema first.

A surprising number of "secrets are missing" or "auth is broken" production incidents trace back to a migration that didn't run. The code expects a column, the column isn't there, the ORM throws a generic 500, and the error reaches the user as something else. Check supabase_migrations.schema_migrations (or your equivalent) before you go chasing secrets and tokens. Schema drift is the silent failure mode that masquerades as everything else.
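The check itself is a set difference between the migrations in the repo and the ones the database says it applied. A sketch, assuming the applied list comes from something like `SELECT version FROM supabase_migrations.schema_migrations` (the `version` column is an assumption about your setup) and the expected list from migration filenames; the version strings are made up.

```python
def missing_migrations(expected, applied):
    """Return expected migration versions absent from the applied set, in order."""
    applied_set = set(applied)
    return [version for version in expected if version not in applied_set]

# Made-up versions: what the repo contains vs. what production reports.
expected = ["20240101000000", "20240215000000", "20240301000000"]
applied = ["20240101000000", "20240215000000"]

missing = missing_migrations(expected, applied)
if missing:
    print("Unapplied migrations:", ", ".join(missing))
```

An empty result rules out schema drift quickly, which is the point of checking it first.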

database, deployment, debugging

Build Log

019

Bad WiFi is a topology problem, not a speed problem.

People upgrade their ISP every time the home network drags. The real fix is usually the placement of access points, band steering, and a wired backhaul between APs. Latency and jitter happen in the layout of your network, not in the size of the pipe coming in. I've gotten more uplift from a $40 ethernet run than from a $40-per-month plan upgrade.

networking, home-infra, diagnostics