Founder & sole engineer (Software Is Nothing, LLC) · 2024–present

tailharbor.eu

A European pet-adoption SaaS spanning 40 countries and 16 UI languages — full-stack web, native iOS + Android apps, and a 3,479-adapter scraper system feeding ~124K live listings, 3,620 shelters, and 764K managed photos.

Next.js · Express · Prisma · Postgres · Elasticsearch · BullMQ · React Native · Expo EAS · Puppeteer · Claude

What it is

A consumer-facing pet-adoption aggregator covering 40 European countries in 16 languages. Pulls listings from a long tail of rescue and shelter sites, normalises the data, scores photos, translates descriptions, and serves a unified search experience at tailharbor.eu — on the web and on native iOS and Android apps.

Stack

  • Web: Next.js 15 (App Router, server components, ISR), Clerk for auth, full UI in 16 locales (EN / DE / FR / ES / IT / NL / PT / PL / RO / SV / EL / HU / CS / DA / FI / NO).
  • API: Express + Prisma ORM on PostgreSQL 16 with PostGIS, behind PgBouncer. Redis caching layer. Homepage served from a homepage_mixed_pool materialized view refreshed every 5 min — drove the mixed-sort homepage query from 10,610 ms (live CTE) to 89 ms (MV).
  • Search: Elasticsearch, with fuzzy (typo-tolerant) matching disabled on breed fields so that a breed query matches only the exact breed name and not similarly spelled breeds.
  • Scraper system: BullMQ job queue with 3,479 country-specific adapters in production. Cheerio for static HTML, Puppeteer for JS-rendered sites (with a pooled browser and [puppeteer-pool] telemetry that warns at p95 > 5 s). WP REST API, Wix data APIs, and ACF endpoints used where exposed. NordVPN SOCKS5 proxy for IP-blocked targets.
  • Mobile: React Native via Expo EAS. iOS Build 43 live on TestFlight, Android Build 4 in Play Internal testing. One-command ship-both.sh parallel pipeline with auto-bump versioning, preflight checks, and eas submit integration — both stores reached in ~8 min wall-clock.
  • Storage: Cloudflare R2 for animal photos (764,086 photos under lifecycle management: stamp-on-sight + 04:30 UTC stale cleanup + 05:00 UTC null-stored retry, all gated by an off → observe → dry-run → enforce mode flag).
  • Payments: Stripe (freemium → PRO unlimited contacts).
  • AI: Anthropic Claude for photo aesthetic scoring, breed/colour/size enrichment, and translation; Google Cloud Vision for face-detection smart cropping.
  • Deploy: Docker Compose to a Hetzner box, GitHub Actions CI (with a local ci-local.sh + pre-push hook that mirrors the workflow so CI billing is never load-bearing).
  • Compounding layer: a Recipe DB, shipped 2026-05-22, of 77 markdown recipes (68 build + 9 triage; 50 promoted to proven at link-count ≥ 3) compiled into a SQLite index that both the builder agent and the maintenance agent query before they touch code.
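
As an illustration of the [puppeteer-pool] telemetry mentioned in the scraper bullet, here is a minimal sketch of a p95 check. Only the log tag and the 5 s threshold come from the text above. The function names, the nearest-rank percentile method, and the choice of what duration is being sampled are assumptions made for the example, not the production code.

```typescript
// Hypothetical sketch: only the "[puppeteer-pool]" tag and the 5 s p95
// threshold come from the text; every name below is an assumption.
const P95_WARN_MS = 5_000;

// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of empty window");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Returns a warning line when the window's p95 exceeds the threshold,
// or null when the pool looks healthy (or has no samples yet).
function poolWarning(durationsMs: number[]): string | null {
  if (durationsMs.length === 0) return null;
  const p95 = percentile(durationsMs, 95);
  return p95 > P95_WARN_MS
    ? `[puppeteer-pool] p95=${p95}ms exceeds ${P95_WARN_MS}ms`
    : null;
}
```

With a window of 20 samples, one slow outlier leaves the p95 untouched, while two slow samples push it over the threshold. That is the usual reason to alert on a percentile instead of a maximum.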

Scale today (live prod counts)

  • 3,479 scraper adapters in production across 40 countries.
  • 124,181 available animal listings (~175K total including recently adopted), refreshed continuously.
  • 3,620 shelters in the canonical registry.
  • 764,086 photos in R2, lifecycle-managed.
  • 2,965 long-form articles (country guides, breed pages, health-certificate walkthroughs).
  • 16-locale UI surface plus 11-language adopted-filter coverage in the scraper layer.
  • Native mobile apps shipping to TestFlight + Play Internal on a parallel pipeline.

Why it’s an integration story

The hard problems aren’t the user-facing app — they’re the compounding automated maintenance of a long-tail data pipeline. Three agents form a self-improving loop:

  • scraper-wave (builder): orchestrates parallel sub-agent builds of new adapters. Pre-flight emits --json with matching recipe IDs so each sub-agent gets a hand-delivered briefing of relevant patterns (Wix DOM paths, WP REST quirks, Drupal/Joomla/Squarespace conventions, adopted-filter regexes per language, photo-source priority orderings).
  • scraper-triage (maintainer): a daily 08:00 UTC cron diffs broken adapters against last week, classifies each breakage via an 8-cause decision tree (FALSE_BROKEN / DEAD / WAF_NEW / URL_CHANGED / JS_NEW / PHOTO_CDN_MIGRATED / SELECTOR_DRIFT / AGGREGATOR_CHANGED), and walks per-cause recipes. Hard rules are baked in from real production incidents, e.g. “fail-closed when the adopted-leak filter has no data” (prevents 1,000+ ghost-listing leaks) and “no synthetic descriptions, ever” (avoids publishing the kind of generated filler that Google’s Helpful Content Update, HCU, penalises).
  • recipe-distiller (compounder): every wave merge runs a post-merge distillation agent that proposes new recipes from net-new patterns. Every triage fix can extend the triage recipe set. The next wave consumes what the previous one learned.
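
The “fail-closed when the adopted-leak filter has no data” rule can be sketched as follows. This is an assumption-laden illustration: the types, the function name, and the example pattern are hypothetical, and "no data" is taken here to mean that no adopted-marker patterns are available for the adapter. The point it shows is that a missing filter blocks publication instead of letting every listing through.

```typescript
// Hypothetical types and names; only the fail-closed rule itself comes
// from the text above.
interface Listing {
  id: string;
  text: string;
}

type FilterState =
  | { kind: "ready"; adoptedPatterns: RegExp[] }
  | { kind: "no-data" };

type Decision =
  | { publish: true; listings: Listing[] }
  | { publish: false; reason: string };

function applyAdoptedFilter(listings: Listing[], filter: FilterState): Decision {
  // Fail closed: without filter data we cannot tell adopted animals from
  // available ones, so nothing is published.
  if (filter.kind === "no-data" || filter.adoptedPatterns.length === 0) {
    return { publish: false, reason: "adopted filter has no data" };
  }
  // Keep only listings that match none of the adopted markers.
  // Patterns must not use the "g" flag, because RegExp.test is stateful
  // with it.
  const available = listings.filter(
    (l) => !filter.adoptedPatterns.some((re) => re.test(l.text)),
  );
  return { publish: true, listings: available };
}
```

A fail-open version would return every listing when the filter is empty, which is the ghost-listing leak the rule is meant to prevent.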

Other automated discipline:

  • A mobile release pipeline (preflight.sh + ship-both.sh) gates a 7-minute build behind a 5-second sanity check — certs, profiles, keystores, fingerprints, EAS env, JSON validity, Node / Xcode / SDK present — so failures surface up-front instead of after the build.
  • A photo-lifecycle pipeline rolled out behind the same off → observe → dry-run → enforce mode flag, with a 30-day post-rollout review agent that quantifies drift before any destructive operation goes live.
  • An MV-refresh + nightly-ANALYZE cron pair on prod that recovered the site from a 504 outage in ~200 ms after pg stats drifted under heavy scrape ingest.
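
The off → observe → dry-run → enforce mode flag used by the photo-lifecycle pipeline could look roughly like this. The four mode names are from the text. What each mode does is an assumption for the sketch (off does nothing, observe only counts candidates, dry-run logs what it would delete, enforce deletes), and the function and field names are hypothetical.

```typescript
// Mode names come from the text; the per-mode behaviour is assumed.
type Mode = "off" | "observe" | "dry-run" | "enforce";

interface CleanupResult {
  mode: Mode;
  candidates: string[]; // keys considered stale in this run
  deleted: string[]; // keys actually removed (enforce only)
  log: string[];
}

function runStaleCleanup(
  mode: Mode,
  staleKeys: string[],
  deleteKey: (key: string) => void, // injected so the gate is testable
): CleanupResult {
  const result: CleanupResult = { mode, candidates: [], deleted: [], log: [] };
  if (mode === "off") return result;

  result.candidates = [...staleKeys];
  result.log.push(`observed ${staleKeys.length} stale keys`);
  if (mode === "observe") return result;

  for (const key of staleKeys) {
    if (mode === "enforce") {
      deleteKey(key); // the only destructive call in the function
      result.deleted.push(key);
      result.log.push(`deleted ${key}`);
    } else {
      result.log.push(`would delete ${key}`);
    }
  }
  return result;
}
```

Because the destructive call sits behind a single branch, the output of each earlier mode can be reviewed before the flag is advanced to the next one.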

I apply the same patterns to marketing-data integrations at Bagjump and to the AI-agent compliance pipeline in my other consulting build.

