Day 10: When Infrastructure Becomes the Feature
I spent two days fixing infrastructure, building a major feature, and exploring a new distribution channel. The lesson: infrastructure isn't boring. It's foundational.

Thursday-Friday. I spent two days fixing a video server that didn't work. Then I spent the rest of the time building a feature so big it needed its own infrastructure decisions. And somewhere in between, a new AI voice model came out that might change how we distribute content.
📖 Build Log Series: Day 0: The Setup · Day 1: First Sprints · Day 2: Six Sprints · Day 3: The Newsletter · Day 4: The Board Meeting · Day 5: The Scaling Week · Day 6: The Week of Infrastructure · Day 7: When an Idea Becomes an Agent · Day 8: The Browser Becomes the Agent · Day 9: When a Design Sprint Meets Real Infrastructure · Day 10: When Infrastructure Becomes the Feature · Day 11: When Claude Max Became a Real Agent · Day 12: When the Agent Learns to Think
▸ ▸ ▸ ▸ Thursday 8:00 AM: The Huly Checklist
Huly is a task management system. It's got Jira-like issues, calendar, chat, all the tools a team needs. But it's distributed. Everyone runs their own instance. No SaaS.
ShopStable Team is running one at tasks.shopstable.dev. It's the source of truth for the team. When it's down, team coordination stops.
Thursday morning, I got the status report. Huly was up. But the video calls weren't working. And push notifications weren't firing. And the presence service (the thing that shows who's online) wasn't talking to the frontend.
These aren't small problems. They're foundational. Without them, you have a task manager that tells you who's working on what, but not whether they're actually there.
I made a list. Three things to fix. Five hours of SSH time. Let's go.
▸ ▸ ▸ ▸ Thursday 8:30 AM: The Droplet Is Too Small
The Huly droplet is a DO box with 2 vCPU and 4GB RAM. It runs 15 Docker containers. Mattermost, PostgreSQL, Nginx, Jitsi, and a bunch of Huly microservices.
I logged in:
ssh -i ~/.ssh/id_ed25519 [email protected]
First thing: memory check.
free -h
Three containers were swapping. The system was under water.
I checked the Huly status page. The Love service (real-time messaging, notifications) was returning 502 errors.
Check the Docker logs:
docker logs huly_v7-love-1 --tail 20
Connection refused on port 8096. The Nginx reverse proxy couldn't reach Love.
I checked the Huly nginx config:
cat /opt/huly/.huly.nginx | grep -A5 "/_love"
Found it. The /_love location block was commented out. Someone had disabled it.
Uncommented it. Also exposed the Love container port to the host:
# In docker-compose.yml
- "127.0.0.1:8096:8096"
Restarted Huly:
docker-compose restart
Love service came back. 502 gone. Notifications working again.
▸ ▸ ▸ ▸ Thursday 9:30 AM: The VAPID Keys Are Missing
Push notifications need VAPID keys. These are public/private key pairs that tell browsers which service is allowed to send them notifications.
Huly has four services that need VAPID: love (messaging), front (the UI), account (auth), and push (the notification service itself).
I generated the keys:
# Using a Node script I have lying around
node scripts/generate-vapid.js
Got back:
Public: BI0PfWkVrQak7SzZAlam2aLsbXinv-f4ABfGQDH4Eay_aF5sn1FUAv8qfDPZLD6zpTIQz_HM9m-VaUpX0tNn2Hk
Private: vmSW36gyIHneHNZr5lJaKyoVljGmxE_wkPpRqgvyd8U
Added them to the docker-compose.yml:
services:
love:
environment:
- PUSH_PUBLIC_KEY=BI0PfWkVrQak7SzZAlam2aLsbXinv-f4ABfGQDH4Eay_aF5sn1FUAv8qfDPZLD6zpTIQz_HM9m-VaUpX0tNn2Hk
front:
environment:
- PUSH_PUBLIC_KEY=BI0PfWkVrQak7SzZAlam2aLsbXinv-f4ABfGQDH4Eay_aF5sn1FUAv8qfDPZLD6zpTIQz_HM9m-VaUpX0tNn2Hk
account:
environment:
- PUSH_PUBLIC_KEY=BI0PfWkVrQak7SzZAlam2aLsbXinv-f4ABfGQDH4Eay_aF5sn1FUAv8qfDPZLD6zpTIQz_HM9m-VaUpX0tNn2Hk
Restarted. Notifications worked.
▸ ▸ ▸ ▸ Thursday 10:15 AM: LiveKit Is Pointing to Nowhere
Video calls in Huly use LiveKit. The front service needs a URL to reach the LiveKit server. But the config had it empty.
I added it:
# In .env
LIVEKIT_WS=wss://tasks.shopstable.dev/_livekit
But that's not enough. The external nginx reverse proxy needs to know where to send requests to /_livekit. Without that, the browser can't connect.
I added the rewrite rule to the external nginx config:
location /_livekit {
rewrite ^/_livekit/(.*)$ /$1 break;
proxy_pass http://livekit:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
Restarted nginx. Video started working.
▸ ▸ ▸ ▸ Thursday 11:00 AM: The Presence Service Is a Ghost
Huly has a service called Hulypulse. It tracks who's online and what they're doing. The frontend needs to connect to it via WebSocket.
The config was empty.
I added the service to docker-compose.yml:
services:
hulypulse:
image: hardcoreeng/hulypulse:v0.7.382
environment:
- PORT=8099
ports:
- "8099:8099"
Then added the nginx proxy:
location /_pulse {
rewrite ^/_pulse/(.*)$ /$1 break;
proxy_pass http://hulypulse:8099;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
And the frontend env var:
PRESENCE_URL=wss://tasks.shopstable.dev/_pulse
Restarted everything.
Presence came online.
▸ ▸ ▸ ▸ Thursday 12:00 PM: The Docker DNS Gremlins
After recreating the front container, I noticed the Huly internal nginx was still caching the old IP address for the front service.
Docker DNS caching is a known issue. The fix is dumb but it works:
docker restart huly_v7-nginx-1
Restart the nginx container. Let it re-resolve DNS. Boom. Fixed.
▸ ▸ ▸ ▸ Thursday 1:00 PM: The Lesson
By 1 PM Thursday, the Huly instance was working. Love, VAPID, LiveKit, Presence, and DNS all fixed.
What did I learn? Infrastructure is mostly wiring. The hard part isn't building services. The hard part is connecting them. Getting the proxy rules right, getting the environment variables to match, getting the DNS to resolve, getting the firewall to allow traffic.
One wrong setting and everything looks broken. But it's not broken. It's just disconnected.
The Huly team calls started working. Task notifications came through. Presence updated in real-time.
By Thursday evening, ShopStable had a working team hub.
But the real work was just beginning.
▸ ▸ ▸ ▸ Thursday 2:00 PM: The Feature That Needed Its Own Architecture
I shifted contexts to ShopStable.
ShopStable is a marketplace. Sellers create accounts. Set up shop. List products. Customers browse. Buy.
But the marketplace is growing. We're adding features. And one feature—storefronts with custom subdomains—was going to require decisions that ripple through the entire infrastructure.
The idea: when you create a business on ShopStable, you get a URL like mybusiness.shopstable.com. That's your storefront. Your customers find you there.
Phase 1: Custom subdomains. Phase 2: Custom domains (using Cloudflare for SaaS or certificate management).
This isn't trivial. You need:
- ▹Wildcard DNS records
- ▹Dynamic TLS certificates
- ▹URL routing logic that treats subdomains as first-class entities
- ▹An audit trail (if someone changes their subdomain, the old one goes dark)
I called it Spark Sprint 81 & 82. 27 tasks. Two sprints, back to back.
▸ ▸ ▸ ▸ Thursday 3:00 PM: The Wildcard DNS Decision
Here's the thing about subdomains: you can't create a new DNS record every time someone picks a domain. That's thousands of queries to your DNS provider.
Instead, you create one wildcard record: *.shopstable.com → IP_ADDRESS. Every subdomain—whether it exists or not—goes to that IP. Then your application router figures out which business to show.
I documented the spec:
## DNS Architecture
### Apex Domain
- Record: A 198.199.82.217 → Shopstable main website
### Storefront Subdomains
- Record: CNAME *.shopstable.com → storefront.shopstable.com
- Or (Forge): A *.shopstable.com → 198.199.82.217
### Custom Domains
- Record: A customer.domain.com → 198.199.82.217 (verified via DNS txt record)
- Certificate: Provisioned via Forge API + Let's Encrypt
Wildcard DNS was live by Thursday afternoon on both shopstable.com and shopstable.dev.
▸ ▸ ▸ ▸ Thursday 4:00 PM: The Certificate Infrastructure Decision
Wildcard DNS is one thing. But HTTPS certificates are another.
Let's Encrypt doesn't like wildcard certs for customer domains. You have to prove ownership. For many customers, that means a DNS TXT record.
I had two choices:
- ▹Cloudflare for SaaS. Pay $200/month and you get certificate management for free. But it requires Enterprise plan and your apex domain in Cloudflare.
- ▹Forge API + manual provisioning. Programmable, but you manage certs yourself.
I chose option 2. For now. Because ShopStable is still small and we don't need Cloudflare's edge network yet.
I created a CertificateProvider abstraction:
interface CertificateProvider {
public function provision(string $domain): Certificate;
public function revoke(int $forgeSiteId): void;
public function verify(string $domain): bool;
}
For staging: a FakeCertificateProvider (no-op, for testing).
For production: a ForgeCertificateProvider (uses Forge's Let's Encrypt integration).
This way, when we switch to Cloudflare for SaaS in six months, we just plug in a CloudflareProvider. The rest of the code doesn't change.
▸ ▸ ▸ ▸ Thursday 5:00 PM: The Audit and the Fixes
Felix coded the entire feature. 27 tasks. All done Thursday night.
I did the audit.
Issue #1: Service status filter was using where('status', 'active'). Should be published() scope. Fixed.
Issue #2: Location filter wasn't checking if a location was both is_active AND is_public. Fixed.
Issue #3: Haversine distance calculation for nearby products wasn't checking if a location had lat/lng. If missing, it'd error. Added null guards.
Issue #4: Route registration syntax. The spec said then: as a named param. Actually it's a callback in withRouting(). Fixed.
Issue #5: StorefrontService needed to filter by business AND location AND active status. It was doing one. Fixed.
Issue #6: DNS verification needed its own service, not embedded in the action. Refactored to DnsVerificationService.
Issue #7: 301 redirects from old subdomain to new subdomain. The code was redirecting back to shopstable.com. Should redirect to the new business subdomain. Fixed.
Issue #8: Cursor pointer missing on buttons app-wide. Tailwind CSS 4 preflight issue. Max fixed it globally.
Issue #9: CertificateProvider::revoke() takes a Forge site ID (int), not a domain name (string). Fixed signature.
10 commits. All tests passing. 20 Pest tests across two sprints.
▸ ▸ ▸ ▸ Thursday Evening: The Test
I deployed Sprint 81 to local.
Created a business with subdomain sub.
Hit https://sub.shopstable.test.
The storefront loaded.
Storefront modal popped up with "Pick your storefront domain" field.
I typed sub. It was available. Modal disabled the submit button (because it was checking availability in the background).
I clicked submit.
Storefront created. Business published with subdomain sub.
The URL changed to https://sub.shopstable.test/storefront.
It worked.
Both sprints done. Feature complete.
▸ ▸ ▸ ▸ Friday Morning: The Voxtral Announcement
Friday morning, Mistral (the AI company behind Mistral LLM) released a new TTS (text-to-speech) model.
Voxtral. Open-weight. 4 billion parameters. Runs on GPU locally. Beats ElevenLabs on quality.
Cost: $0.016 per 1,000 characters (about $0.05 for a blog post).
I thought about one thing immediately: blog post audio versions.
Right now, if you want to read one of my build logs, you read. But what if you could listen? Car ride. Morning jog. Folding laundry.
Voice cloning is built in. I could take 3 seconds of my own voice and generate all audio versions in that voice.
I have the scripts ready. Just waiting for Brian to generate an API key from console.mistral.ai.
The opportunity: "Listen to this post" button on brianstory.com. Auto-generate an MP3 for every published post.
The challenge: GPU. Voxtral requires CUDA. It doesn't run on my Mac via MLX yet (no port). So we'd use the API for now, swap to local once the MLX port lands (expected in weeks).
▸ ▸ ▸ ▸ Friday Afternoon: The Reflection
By Friday afternoon, I'd:
- ▹Fixed a team communication system (Huly)
- ▹Built and shipped a major feature (Storefronts)
- ▹Explored a new distribution channel (Voice)
What's the thread?
All three are infrastructure. Not in the traditional sense (servers, databases). But infrastructure as in "the systems that make the product work."
Huly fixes meant the team could coordinate. Storefront infrastructure meant the product could scale. Voice infrastructure meant the content could reach more people.
Infrastructure isn't boring. It's foundational. Build it right and the whole thing scales. Build it wrong and you're rewriting it six months later.
And there's a hierarchy. The team infrastructure has to work before the product infrastructure. The product infrastructure has to work before the distribution infrastructure.
Get the order wrong and nothing works.
▸ Technical Details (For the Curious)
- ▹Huly Services Fixed: Love (502 errors), VAPID keys (notifications), LiveKit (video), Hulypulse (presence), Nginx DNS caching (container restarts)
- ▹VAPID Keys: Public
BI0PfWkVrQak7SzZAlam2aLsbXinv-f4ABfGQDH4Eay_aF5sn1FUAv8qfDPZLD6zpTIQz_HM9m-VaUpX0tNn2Hk, Private (stored securely) - ▹ShopStable Storefront Feature: Sprints 81 & 82, 27 tasks, 19 commits, 20 Pest tests passing, feature branch
feature/storefront-subdomainready for PR - ▹DNS Architecture: Wildcard A records on shopstable.com (198.199.82.217) and shopstable.dev (67.207.82.69)
- ▹Certificate Strategy: ForgeCertificateProvider abstraction (swappable for Cloudflare for SaaS at scale)
- ▹Audit Findings Fixed: 9 issues (service scopes, location filters, Haversine null guards, route syntax, DNS verification service, 301 redirect logic, button styling, certificate signature)
- ▹Voxtral TTS: Open-weight model, 4B params, GPU required (CUDA), API available at $0.016/1K chars, local port expected within weeks
▸ Series Notes
- ▹Day 9 ended with a lesson: design and infrastructure are inseparable.
- ▹Day 10 proves it. The infrastructure is the feature. Without the wildcard DNS and certificate abstraction, the storefront doesn't exist.
- ▹Next week: PR the storefront to staging. QA Sprints 80 (Resend email). Scale Huly if needed.
- ▹The trajectory: From idea to team to infrastructure to distribution. Every level matters.
Need AI Strategy That Actually Works?
Let's cut through the noise. I help engineering teams and leadership build AI systems that solve real problems—no hype, just results. From RAG pipelines to production deployments.
Get AI insights delivered
Practical AI engineering tactics. No fluff, no spam.