Founder & sole engineer (Software Is Nothing, LLC) · 2024–present
tailharbor.eu
A European pet-adoption SaaS spanning 40 countries and 16 UI languages — full-stack web, native iOS + Android apps, and a 3,479-adapter scraper system feeding ~124K live listings, 3,620 shelters, and 764K managed photos.
What it is
A consumer-facing pet-adoption aggregator covering 40 European countries in 16 languages. Pulls listings from a long tail of rescue and shelter sites, normalises the data, scores photos, translates descriptions, and serves a unified search experience at tailharbor.eu — on the web and on native iOS and Android apps.
Stack
- Web: Next.js 15 (App Router, server components, ISR), Clerk for auth, full UI in 16 locales (EN / DE / FR / ES / IT / NL / PT / PL / RO / SV / EL / HU / CS / DA / FI / NO).
- API: Express + Prisma ORM on PostgreSQL 16 with PostGIS, behind PgBouncer. Redis caching layer. Homepage served from a `homepage_mixed_pool` materialized view refreshed every 5 min, which cut the mixed-sort homepage query from 10,610 ms (live CTE) to 89 ms (MV).
- Search: Elasticsearch with typo-tolerance disabled on breed fields so exact breed names don’t fuzzy-match.
- Scraper system: BullMQ job queue with 3,479 country-specific adapters in production. Cheerio for static HTML, Puppeteer for JS-rendered sites (with a pooled browser and `[puppeteer-pool]` telemetry that warns at p95 > 5 s). WP REST API, Wix data APIs, and ACF endpoints used where exposed. NordVPN SOCKS5 proxy for IP-blocked targets.
- Mobile: React Native via Expo EAS. iOS Build 43 live on TestFlight, Android Build 4 in Play Internal testing. One-command `ship-both.sh` parallel pipeline with auto-bump versioning, preflight checks, and `eas submit` integration; both stores reached in ~8 min wall-clock.
- Storage: Cloudflare R2 for animal photos (764,086 photos under lifecycle management: stamp-on-sight + 04:30 UTC stale cleanup + 05:00 UTC null-stored retry, all gated by an `off → observe → dry-run → enforce` mode flag).
- Payments: Stripe (freemium → PRO unlimited contacts).
- AI: Anthropic Claude for photo aesthetic scoring, breed/colour/size enrichment, and translation; Google Cloud Vision for face-detection smart cropping.
- Deploy: Docker Compose to a Hetzner box, GitHub Actions CI (with a local `ci-local.sh` + pre-push hook that mirrors the workflow so CI billing is never load-bearing).
- Compounding layer: a Recipe DB shipped 2026-05-22, holding 77 markdown recipes (68 build + 9 triage; 50 promoted to `proven` at link-count ≥ 3) backing a SQLite compiled index, queried by both the builder agent and the maintenance agent before they touch code.
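The p95 warning on the pooled browser can be illustrated with a small sketch. This is not the production telemetry. The names `LatencyWindow` and `checkPool`, the 200-sample rolling window, and the nearest-rank percentile method are all assumptions made for illustration. Only the 5 s threshold and the `[puppeteer-pool]` log tag come from the stack description.

```typescript
// Hypothetical sketch: rolling-window p95 check for page-render times.
// Only the 5 s threshold and the "[puppeteer-pool]" tag come from the write-up.
const P95_WARN_MS = 5000;

class LatencyWindow {
  private samples: number[] = [];
  private readonly maxSamples: number;

  constructor(maxSamples: number = 200) {
    this.maxSamples = maxSamples; // assumed window size
  }

  // Record one page-render duration, dropping the oldest sample when full.
  record(durationMs: number): void {
    this.samples.push(durationMs);
    if (this.samples.length > this.maxSamples) this.samples.shift();
  }

  // Nearest-rank percentile: the smallest sample such that at least
  // p percent of all samples are less than or equal to it.
  percentile(p: number): number | undefined {
    const n = this.samples.length;
    if (n === 0) return undefined;
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.max(1, Math.ceil((p * n) / 100));
    return sorted[rank - 1];
  }
}

// Returns true (and emits a warning) when the window's p95 exceeds the threshold.
function checkPool(
  window: LatencyWindow,
  warn: (msg: string) => void = console.warn
): boolean {
  const p95 = window.percentile(95);
  if (p95 !== undefined && p95 > P95_WARN_MS) {
    warn(`[puppeteer-pool] p95 ${p95} ms exceeds ${P95_WARN_MS} ms`);
    return true;
  }
  return false;
}
```

A caller would `record()` each render time as jobs complete and run `checkPool()` on a timer. A percentile is used rather than a mean so that a handful of slow sites can trip the warning without the fast majority masking them.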
Scale today (live prod counts)
- 3,479 scraper adapters in production across 40 countries.
- 124,181 available animal listings (~175K total including recently adopted), refreshed continuously.
- 3,620 shelters in the canonical registry.
- 764,086 photos in R2, lifecycle-managed.
- 2,965 long-form articles (country guides, breed pages, health-certificate walkthroughs).
- 16-locale UI surface plus 11-language adopted-filter coverage in the scraper layer.
- Native mobile apps shipping to TestFlight + Play Internal on a parallel pipeline.
Why it’s an integration story
The hard problems aren’t the user-facing app — they’re the compounding automated maintenance of a long-tail data pipeline. Three agents form a self-improving loop:
- `scraper-wave` (builder): orchestrates parallel sub-agent builds of new adapters. Pre-flight emits `--json` with matching recipe IDs so each sub-agent gets a hand-delivered briefing of relevant patterns (Wix DOM paths, WP REST quirks, Drupal/Joomla/Squarespace conventions, adopted-filter regexes per language, photo-source priority orderings).
- `scraper-triage` (maintainer): daily 08:00 UTC cron diffs broken adapters against last week, classifies the breakage via an 8-cause decision tree (FALSE_BROKEN / DEAD / WAF_NEW / URL_CHANGED / JS_NEW / PHOTO_CDN_MIGRATED / SELECTOR_DRIFT / AGGREGATOR_CHANGED), and walks per-cause recipes. Hard rules are baked in from real production incidents, e.g. “fail-closed when the adopted-leak filter has no data” (prevents 1,000+ ghost-listing leaks) and “no synthetic descriptions, ever” (prevents content that would tank rankings under Google’s Helpful Content Update, HCU).
- `recipe-distiller` (compounder): every wave merge runs a post-merge distillation agent that proposes new recipes from net-new patterns. Every triage fix can extend the triage recipe set. The next wave consumes what the previous one learned.
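The “fail-closed when the adopted-leak filter has no data” rule can be sketched as follows. This is an illustration under stated assumptions, not the actual triage code. The `Listing` shape, the per-locale `AdoptedPatterns` map, and the `filterAdopted` function are hypothetical names. The only thing taken from the rule itself is the behaviour: missing filter data rejects the batch instead of letting it through unfiltered.

```typescript
// Hypothetical sketch of a fail-closed adopted-listing filter.
type Listing = { id: string; description: string; locale: string };

// Assumed shape: per-locale regexes that mark a listing as already adopted.
type AdoptedPatterns = Record<string, RegExp[]>;

type FilterResult =
  | { ok: true; publishable: Listing[] }
  | { ok: false; reason: string };

function filterAdopted(
  listings: Listing[],
  patterns: AdoptedPatterns
): FilterResult {
  const publishable: Listing[] = [];
  for (const listing of listings) {
    const localePatterns: RegExp[] | undefined = patterns[listing.locale];
    // Fail closed: without filter data we cannot tell adopted from available,
    // so the whole batch is rejected rather than published unfiltered.
    if (localePatterns === undefined || localePatterns.length === 0) {
      return {
        ok: false,
        reason: `no adopted-filter data for locale "${listing.locale}"`,
      };
    }
    const looksAdopted = localePatterns.some((re) => re.test(listing.description));
    if (!looksAdopted) publishable.push(listing);
  }
  return { ok: true, publishable };
}
```

A fail-open version would treat "no patterns" as "nothing matched" and publish every listing, which is the failure the rule is meant to prevent. The patterns here are assumed to be non-global regexes, because `RegExp.prototype.test` is stateful on patterns with the `g` flag.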
Other automated discipline:
- A mobile release pipeline (`preflight.sh` + `ship-both.sh`) gates a 7-minute build behind a 5-second sanity check (certs, profiles, keystores, fingerprints, EAS env, JSON validity, Node / Xcode / SDK present), so failures surface up-front instead of after the build.
- A photo-lifecycle pipeline rolled out behind the same `off → observe → dry-run → enforce` mode flag, with a 30-day post-rollout review agent that quantifies drift before any destructive operation goes live.
- An MV-refresh + nightly-ANALYZE cron pair on prod that recovered the site from a 504 outage in ~200 ms after pg stats drifted under heavy scrape ingest.
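A staged mode flag of this kind can be sketched as a ladder in which each step does strictly more than the one before. The four mode names come from the description above. What each mode does here is an assumption: `observe` only counts candidates, `dry-run` records exactly what would be deleted, and `enforce` deletes. `runStaleCleanup`, `CleanupReport`, and the `deleteObject` callback are hypothetical names, not the real pipeline.

```typescript
// Hypothetical sketch of an off → observe → dry-run → enforce ladder.
type Mode = "off" | "observe" | "dry-run" | "enforce";

interface CleanupReport {
  mode: Mode;
  staleCount: number; // filled from "observe" upward
  planned: string[];  // filled from "dry-run" upward
  deleted: string[];  // filled only in "enforce"
}

function runStaleCleanup(
  mode: Mode,
  staleKeys: string[],
  deleteObject: (key: string) => void
): CleanupReport {
  const report: CleanupReport = { mode, staleCount: 0, planned: [], deleted: [] };
  if (mode === "off") return report;

  // observe: measure how many objects would be affected, nothing more.
  report.staleCount = staleKeys.length;
  if (mode === "observe") return report;

  // dry-run: record the exact keys that enforcement would remove.
  report.planned = [...staleKeys];
  if (mode === "dry-run") return report;

  // enforce: the only branch that performs the destructive operation.
  for (const key of staleKeys) {
    deleteObject(key);
    report.deleted.push(key);
  }
  return report;
}
```

Because the destructive call sits behind three early returns, the same code path can run in production at `observe` or `dry-run` long enough to compare its reports against expectations before the flag moves to `enforce`.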
These are the same patterns I apply to marketing-data integrations at Bagjump and to the AI-agent compliance pipeline at my other consulting build.