Owner & sole engineer (Software Is Nothing, LLC) · 2025–present
B2B Deal Intelligence Platform
A Python AI-agent platform for a specialist B2B operator — multi-source discovery, dedup, scoring, and human-gated outreach in the operator's voice.
What it is
An internal-use AI-agent platform for a specialist B2B sales operation. The system ingests deal flow from inbound counterparty channels alongside reference data from a wide set of public sources, deduplicates entries and scores their legitimacy against a canonical registry, surfaces match candidates between the two sides of the market, drafts personalised outreach in the operator’s voice under human-approval gates, and monitors relevant external signals on a tiered cadence.
Built for the operator’s own practice — not commercially distributed. Industry, counterparties, geography, and underlying data sources are not disclosed publicly.
Stack
- Storage: PostgreSQL 16, large relational schema (entities, listings, signals, deals, correspondence, outreach campaigns, outreach messages, system health) enforced via CHECK-constraint enums.
- API + frontend: Flask REST API behind nginx basic-auth, Next.js 14 + Tailwind + React Query review UI, both containerised.
- Tooling: 25+ Python modules across intake, extraction, dedup, entity matching, multi-rubric scoring, counterparty discovery, match engine, outreach engine, sequence manager, reply detection, transcript-intel extraction, and morning brief. Alongside these run 15+ scheduled ingestion jobs.
- AI: Multi-provider AI router with priority-ordered failover — the operator’s own Claude Code subscription wired as priority-0 via the CLI inside the container, with rotation across additional providers behind it. Two-pass extraction: a cheap triage classifier filters thousands of source rows, a full extraction classifier processes only confirmed-positive ones. Hard cost-discipline guards: permanent per-model token caps, MOCK_MODE on every direct-API callsite, an in-app banner that surfaces provider degradation in real time, and shared post-processing that strips a curated list of corp-speak filler phrases from every model output before it reaches the operator.
- Compliance: every draft passes voice-profile validation, frequency caps, signed-NDA gating on sensitive information, a business-domain-only sender guard (personal-domain emails blocked at the DB trigger, send endpoint, draft endpoint, auto-cadence, and UI — no env-var bypass), and mandatory regulatory footers.
- Microsoft Graph: Outlook for email I/O, Teams for transcript pulling, SharePoint for attachment storage.
- Public data: 15+ regulatory, market, and entity-formation sources ingested on daily / weekly / monthly cadences.
- Schedules: APScheduler in a dedicated scheduler container for in-process crons, plus a host-cron AI extraction worker that lives outside Docker because it shells out to a host-installed CLI for the priority-0 AI provider.
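The priority-ordered failover described under AI can be sketched roughly as follows. This is a minimal illustration, not the platform's actual router: the names `Provider`, `Router`, `ProviderError`, and the two stand-in provider functions are hypothetical, and real providers would wrap a CLI or HTTP call rather than a plain function.

```python
from dataclasses import dataclass
from typing import Callable, List


class ProviderError(Exception):
    """Raised by a provider that cannot serve the request (hypothetical)."""


@dataclass(frozen=True)
class Provider:
    name: str
    priority: int                  # lower number is tried first; 0 = preferred
    call: Callable[[str], str]     # takes a prompt, returns the model output


class Router:
    """Tries providers in priority order and records which ones failed."""

    def __init__(self, providers: List[Provider]) -> None:
        self.providers = sorted(providers, key=lambda p: p.priority)
        self.degraded: List[str] = []   # providers that failed on the last call

    def complete(self, prompt: str) -> str:
        self.degraded = []
        for provider in self.providers:
            try:
                return provider.call(prompt)
            except ProviderError:
                # Remember the failure so a UI banner could surface it.
                self.degraded.append(provider.name)
        raise ProviderError("all providers failed: " + ", ".join(self.degraded))


# Stand-in providers (assumptions for the example only).
def flaky_cli(prompt: str) -> str:
    raise ProviderError("subscription CLI unavailable")


def backup_api(prompt: str) -> str:
    return f"[backup] {prompt}"


router = Router([
    Provider("backup", 1, backup_api),
    Provider("cli", 0, flaky_cli),      # priority-0, tried first
])
result = router.complete("classify row 17")
```

In this sketch the priority-0 provider fails, so the call falls through to the backup, and `router.degraded` holds the name of the failed provider, which is the kind of state a degradation banner could read.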
What it demonstrates
- AI agent design under hard compliance constraints: voice profiles, signed-NDA gating, per-contact frequency caps, mandatory regulatory footers, and multi-layer personal-email blocking with no env-var bypass.
- Long-running automated workflows with cost discipline — the operator’s own AI subscription is wired in as priority-0, so day-to-day usage carries no per-token billing; the failover stack adds resilience without recurring spend.
- Multi-source data normalisation — every incoming entity gets a primary-key hash, a fuzzy-match secondary, and (where applicable) a perceptual-image hash before promotion into the canonical registry.
- Coordinator + sub-agent engineering pattern — the development loop itself is structured around a planning agent that dispatches scoped implementation agents for each unit of work, with a fragment/merge protocol for shared files that multiple agents touch in a single batch.
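The hash-then-fuzzy deduplication mentioned under multi-source data normalisation might look something like the sketch below. It is an assumption-laden illustration: the `name` and `region` fields, the `normalise`, `primary_key`, and `find_match` helpers, and the 0.9 threshold are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the platform actually uses. The perceptual-image hash is left out because it would need an imaging library.

```python
import hashlib
import re
from difflib import SequenceMatcher
from typing import Dict, Optional


def normalise(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def primary_key(name: str, region: str) -> str:
    """Stable SHA-256 hash over the normalised identifying fields."""
    payload = f"{normalise(name)}|{region.strip().lower()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_match(candidate: Dict[str, str],
               registry: Dict[str, Dict[str, str]],
               threshold: float = 0.9) -> Optional[str]:
    """Return the registry key the candidate matches, or None if it is new."""
    # Pass 1: exact match on the primary-key hash.
    key = primary_key(candidate["name"], candidate["region"])
    if key in registry:
        return key
    # Pass 2: fuzzy match on the normalised name within the same region.
    cand_name = normalise(candidate["name"])
    cand_region = candidate["region"].strip().lower()
    for existing_key, existing in registry.items():
        if existing["region"].strip().lower() != cand_region:
            continue
        ratio = SequenceMatcher(None, cand_name, normalise(existing["name"])).ratio()
        if ratio >= threshold:
            return existing_key
    return None


# Example data (invented for illustration).
registry: Dict[str, Dict[str, str]] = {}
first = {"name": "Acme Holdings, LLC", "region": "north"}
registry[primary_key(first["name"], first["region"])] = first

exact = find_match({"name": "ACME Holdings LLC", "region": "North"}, registry)
fuzzy = find_match({"name": "Acme Holding LLC", "region": "north"}, registry)
new = find_match({"name": "Zenith Partners", "region": "north"}, registry)
```

Here `exact` and `fuzzy` both resolve to the existing registry key, while `new` is `None`, meaning the entity would be eligible for promotion as a fresh record.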
Industry, counterparties, geography, asset specifics, and underlying data sources are not disclosed publicly.