LYLE UNDERWOOD // INDEPENDENT PRACTICE
AVAILABLE NOW
[00_INDEX]
INDEPENDENT SENIOR ENGINEER // PRODUCT DEV LEAD
LINE 001 // HERO

TURN YOUR
AI AMBITION
INTO REAL PRODUCT
ADVANTAGE.

I work on the production-reliability layer of AI features. Harness engineering, retry/repair loops, context engineering. Reliability instincts from 11 years on a B2B commerce platform at $21B annual GMV.

I work directly with clients through my own practice. Senior hands-on product engineering, no agency layer.

[SEC_A] // METRICS
$21B+
WHSL_GMV_HANDLED
B2B platform at Elastic
MTRC_01
$47M
ACQUISITION_2021
Emerald X (EEX)
MTRC_02
350+
AFA_BRANDS
MTRC_03
250K
RETAILER_USERS
Elastic B2B platform
MTRC_04
25%
AR_STAFFING_CUT
Simms payment portal
MTRC_05
7.5K+
BRAND_RESEARCH_RECORDS
ShowLabs discovery pipeline
MTRC_06
[01_POSITIONING]

WHAT
I DO

I build software
that creates
leverage.

For 11+ years I helped build Elastic Suite, the B2B commerce platform behind a company acquired for $47M in 2021. Wholesale ordering, ERP integrations, real-time inventory, and merchandising for 350+ AFA brands including Patagonia, The North Face, Crocs, Oakley, Burton, and Puma, serving more than 250K retailer users. I built the engineering team from the ground up along the way.

More recently I led product development at ShowLabs, rebuilding most of the platform around AI systems, harness engineering, and workflow design inside a DAM/PLM product.

Right now I'm specializing in AI and harness engineering: figuring out where LLMs, agentic workflows, and human-in-the-loop systems belong in a product, then building the harness around them. Context delivery, tool integration, quality gates, retry/repair loops. The kind of thing that holds up in production.

AI THAT SHIPS · HARNESS ENGINEERING · CONTEXT ENGINEERING · CONSUMER-GRADE UX IN B2B
POINT_01

I get things done.

POINT_02

I care about product impact more than technical theater.

POINT_03

I move quickly, but not blindly.

POINT_04

I look for the solution that makes the product stronger, not the easiest one to ship.

POINT_05

I am comfortable in ambiguity, especially in early-stage and transitional environments.

[02_WORK] // SELECTED

SELECTED
WORK

Five projects that show how I work: AI systems inside and outside the user loop, large-scale commerce, and non-obvious product design.

MULTI-AGENT // QUALITY-GATED
WORK_01

ShowLabs Copilot

I led product development on a DAM/PLM platform rebuild for apparel, footwear, and accessories. I built the custom harness infrastructure, then shipped a surface-aware multi-agent copilot on top of it, inside the platform. The model was the easy part.

MULTI-AGENT COPILOT · HARNESS ENGINEERING · SURFACE-AWARE
QUALITY-GATED PIPELINES · LONG-CONVERSATION MEMORY · TOOL-DRIVEN ACTIONS

WHERE THE MODEL ENDS

The model produces text and stops. A real product needs predictable behavior, persistent context, structured operations on real state, and surfaces that are not chat. The gap between those is the work.

  • Models are non-deterministic; products are not
  • Model context is bounded; product workflows are not
  • Models suggest in text; products need actions on real state
  • Models are chat-shaped; product surfaces are not

HARNESS STACK

STACK

Four layers around the model, each closing one of the four gaps. Together they are the harness.

L01
PRIMARY

Multi-Agent Routing

A surface-aware router that picks specialized agents based on which product surface triggered the call. Each agent has a defined task, scoped tool access, and explicit handoffs. The copilot lives inside the editor, the batch view, and the search, not next to them.
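
A minimal Python sketch of surface-aware routing with scoped tool access, not the production router. The surface keys, agent names, and the `route` and `authorize` helpers are hypothetical; only the tool names come from the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Agent:
    name: str
    task: str
    tools: FrozenSet[str]  # the only tools this agent is allowed to call


# Hypothetical registry: one specialized agent per product surface.
AGENTS: Dict[str, Agent] = {
    "editor": Agent("editor_agent", "edit product copy", frozenset({"set_title", "set_desc"})),
    "batch": Agent("batch_agent", "apply changes across a batch", frozenset({"tag_set"})),
    "search": Agent("search_agent", "refine search queries", frozenset()),
}


def route(surface: str) -> Agent:
    """Pick the agent for the surface that triggered the call."""
    try:
        return AGENTS[surface]
    except KeyError:
        raise ValueError(f"no agent registered for surface {surface!r}")


def authorize(agent: Agent, tool: str) -> bool:
    """Scoped tool access: an agent may only use tools in its own set."""
    return tool in agent.tools
```

Keeping the tool scope on the agent record means a misrouted request fails at `authorize` instead of mutating state the agent was never meant to touch.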

L02

Memory + Compaction

Long-conversation memory beyond the model's context window. Compaction summarizes older turns behind the scenes while anchored references survive: workflow_id, batch_id, brand_vocab. The user does not see compaction happen. The same pattern powers a .ai context system that keeps coding agents consistent across the codebase.
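
A minimal Python sketch of compaction with anchored references, under stated assumptions: the `Conversation` shape, the `keep_recent` cutoff, and the placeholder `summarize` function are hypothetical, and a real harness would call a model to write the summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    anchors: Dict[str, str]              # e.g. workflow_id, batch_id, brand_vocab
    summary: str = ""                    # compacted history
    turns: List[str] = field(default_factory=list)


def summarize(turns: List[str]) -> str:
    """Placeholder summarizer; stands in for a model call."""
    return f"[summary of {len(turns)} earlier turns]"


def compact(conv: Conversation, keep_recent: int = 4) -> Conversation:
    """Fold older turns into the summary; anchors and recent turns stay verbatim."""
    if len(conv.turns) <= keep_recent:
        return conv  # nothing to compact yet
    older, recent = conv.turns[:-keep_recent], conv.turns[-keep_recent:]
    parts = [p for p in (conv.summary, summarize(older)) if p]
    return Conversation(
        anchors=dict(conv.anchors),  # anchors are copied, never summarized
        summary=" ".join(parts),
        turns=recent,
    )
```

The anchors live outside the turn list, so no amount of summarizing can drop or paraphrase them.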

L03

Tool Integration

Structured operations on real product state. Every action is a function call (set_title, set_desc, tag_set) with typed arguments, validation, and an audit trail. The model commits state, not text suggestions.
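
A minimal Python sketch of validated tool dispatch with an audit trail. The tool names `set_title` and `tag_set` come from the text; their signatures, the `Product` shape, and the `dispatch` helper are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Product:
    title: str = ""
    tags: List[str] = field(default_factory=list)


def set_title(product: Product, title: str) -> None:
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")
    product.title = title.strip()


def tag_set(product: Product, tags: List[str]) -> None:
    if not all(isinstance(t, str) and t for t in tags):
        raise ValueError("tags must be non-empty strings")
    product.tags = sorted(set(tags))


TOOLS: Dict[str, Callable[..., None]] = {"set_title": set_title, "tag_set": tag_set}


def dispatch(product: Product, name: str, args: Dict[str, Any], audit: List[dict]) -> bool:
    """Run one model-requested tool call and record the outcome either way."""
    entry: Dict[str, Any] = {"tool": name, "args": args, "ok": False}
    audit.append(entry)
    if name not in TOOLS:
        entry["error"] = "unknown tool"
        return False
    try:
        TOOLS[name](product, **args)
    except (TypeError, ValueError) as exc:  # wrong argument names or failed validation
        entry["error"] = str(exc)
        return False
    entry["ok"] = True
    return True
```

Rejected calls still land in the audit list, so the trail shows what the model attempted as well as what it changed.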

L04

Quality Gates

Confidence scoring, post-processing, and approval flows on every generation. Outputs land graded and reviewable. The variance the model produces stays inside the gate; customers see only what passed.

WHAT IT DELIVERS

Four concrete outcomes the harness produces in production.

RESULT_01

Acts on the Page

next-gen agentic UX

Agents read the page they live on and act on it directly. State changes appear in the surface in front of the user, with a tool-call audit trail. The copilot operates the product, not the database behind it.

RESULT_02

Better Answers via Specialization

tight context per agent

Routing splits the work across specialized agents, each scoped to a tight contextual domain. Deep context, kept manageable. Each agent's answers improved as a result. No single agent had to juggle everything.

RESULT_03

Lossless Context Management

LCM · invisible compaction

Full historical context preserved across long conversations and multi-day workflows. Compaction happens behind the scenes; users never see it. No forced new-conversation prompts. No lost context.

RESULT_04

No Busy Work for the User

below 80% confidence · silent retry

Quality gates score every generation. Below 80% confidence, the harness retries silently; the user never sees the failed attempts. The review queue that would otherwise pile up on the end user stays inside the harness until something worth showing comes back.
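
A minimal Python sketch of the silent-retry gate. The 80% threshold is from the text; the `generate` and `score` callables, the attempt cap, and returning `None` when nothing passes are assumptions.

```python
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.80  # the 80% gate described above


def gated_generate(
    generate: Callable[[], str],
    score: Callable[[str], float],
    max_attempts: int = 3,  # assumed cap; the text does not state one
) -> Optional[str]:
    """Return the first output that clears the gate, or None if none does."""
    for _ in range(max_attempts):
        candidate = generate()
        if score(candidate) >= CONFIDENCE_THRESHOLD:
            return candidate
        # Below threshold: drop the candidate and retry without surfacing it.
    return None
```

A `None` result is the point where a caller would escalate to a human reviewer or report a failure, which keeps low-confidence drafts out of the user's queue.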

BATCH CONSISTENCY // SHARED CONTEXT
WORK_02

ShowLabs Treatment Workflows

I built a treatment workflow system on the ShowLabs platform that solves batch consistency for AI generation. A treatment is a shared creative-direction context, authored once and branched across an entire batch instead of cold-starting each generation. One AI output is easy; the batch is where the work lives.

TREATMENT WORKFLOWS · BATCH CONSISTENCY · VARIANT GENERATION
HUNDREDS OF PRODUCTS PER BATCH · SHARED CREATIVE DIRECTION · NO COLD STARTS

THE BATCH PROBLEM

One impressive AI output is easy. Hundreds of products in the same batch, all needing to look like they belong to the same product line, is where most generation flows fall apart.

  • Cold starts on every generation lose creative continuity
  • Per-product prompts drift the longer the batch runs
  • Users cannot manually re-establish context for hundreds of items

TREATMENT ARCHITECTURE

STACK

Four layers I built to solve the batch problem. Together they are the treatment workflow.

L01
PRIMARY

Treatment as Shared Context

A treatment is a structured, persistent creative-direction object. Authored once per batch, referenced by every generation in the batch.

L02

Branching, Not Forking

Each generation branches off the treatment but does not modify it. The shared context stays canonical; per-product variation happens at the leaf.
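
A minimal Python sketch of branching from a read-only treatment. The `Treatment` fields and the `branch` helper are hypothetical; the point is only that every generation holds a reference to the same immutable object and keeps its variation in its own leaf data.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Tuple


@dataclass(frozen=True)
class Treatment:
    """Shared creative direction, authored once per batch (fields are illustrative)."""
    tone: str
    palette: str
    rules: Tuple[str, ...]


@dataclass(frozen=True)
class GenerationContext:
    treatment: Treatment        # shared, canonical, read-only
    product: Mapping[str, str]  # per-product variation lives at the leaf


def branch(treatment: Treatment, product: Mapping[str, str]) -> GenerationContext:
    """Create a per-product context without copying or modifying the treatment."""
    return GenerationContext(treatment, MappingProxyType(dict(product)))
```

Because the dataclass is frozen, an attempt to edit the treatment from inside a generation raises instead of silently forking the creative direction.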

L03

Compaction Across The Batch

Long-running batches use compaction to keep the conversation context tractable as the batch progresses, without losing the treatment's anchoring direction.

L04

Authoring Surface

Treatments are authored inside the product by users with creative judgment, not by engineers in a config file. The system supports the creative role.

WHAT IT DELIVERS

Four outcomes the treatment system produces at batch scale.

RESULT_01

One Creative Direction, Hundreds of Variants

batch reads as one creative line

Variants in a batch read as one coherent creative direction, not 200 isolated outputs that drifted apart. The treatment is the anchor every variant references.

RESULT_02

Author Once, Apply to Hundreds

no per-product prompting

The author writes the treatment once. Every generation in the batch branches from it. No re-prompting, no cold starts, no per-product context juggling.

RESULT_03

Long Batches That Don't Drift

compaction keeps the anchor

Whether the batch runs for minutes or days, compaction summarizes older turns without losing the treatment's anchoring direction. The last variant matches the first.

RESULT_04

Creative Judgment Stays Human

non-engineers author the direction

The treatment is authored inside the product by users with creative judgment, not by engineers in a config file. Designers and brand authors set the direction; the system applies it across hundreds of variants without engineering involvement per batch.

AGENTIC WORK // OUTSIDE THE USER LOOP
WORK_03

ShowLabs Brand Discovery Pipeline

An autonomous AI workflow tied to Shyft's retailer-bridge strategy. When users upload product data for unrecognized brands, the system researches the public web, identifies the brand from UPCs and SKUs, and writes structured taxonomy data back into the product. The agent has a defined task and runs without a prompt; the brand-retailer network grows as a side effect.

AUTONOMOUS RESEARCH · AGENTIC WORKFLOW · GEMINI EXTRACTION
7,500+ RETAIL BRANDS · PARALLEL-WORKER POOL · TAXONOMY REPAIR LOOPS

RUNNING WITHOUT A USER

Autonomous agents fail differently than chat AI. The user is not in the loop, so the harness has to handle what chat skips: coordination, validation, repair, integration.

  • Concurrent workers race on shared targets without coordination
  • Non-deterministic model output cannot persist directly into a typed database
  • Categories drift as data scales; mapping requires repair
  • Discoveries that don't reach product features are wasted compute

DISCOVERY PIPELINE

STACK

Four stages I built to run an agent without a user. Each stage closes one of the failure modes from above; together they are the discovery pipeline.

L01
PRIMARY

Autonomous Worker Pool

A pool of parallel workers coordinated by per-target locks held in the database. Multiple jobs hit different brands simultaneously without colliding. If a worker dies mid-job, another picks up where it left off without duplicating effort.
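
A minimal Python sketch of per-target locks held in a database, using SQLite and a lease timeout so a dead worker's lock expires. The table layout, the lease length, and the function names are assumptions, and the production database is not named in the text. Resuming "where it left off" would also need job checkpoints, which this sketch omits.

```python
import sqlite3


def init(db: sqlite3.Connection) -> None:
    db.execute(
        "CREATE TABLE IF NOT EXISTS locks ("
        "target TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def try_claim(db: sqlite3.Connection, target: str, owner: str,
              now: float, lease_seconds: float = 60.0) -> bool:
    """Claim a target if it is free or its previous lease has expired."""
    with db:  # commit on success, roll back on error
        # Clear a stale lock left behind by a worker that died mid-job.
        db.execute("DELETE FROM locks WHERE target = ? AND expires_at <= ?", (target, now))
        cur = db.execute(
            "INSERT OR IGNORE INTO locks (target, owner, expires_at) VALUES (?, ?, ?)",
            (target, owner, now + lease_seconds),
        )
        return cur.rowcount == 1  # 0 means another worker already holds the target


def release(db: sqlite3.Connection, target: str, owner: str) -> None:
    """Release a lock, but only if this worker still owns it."""
    with db:
        db.execute("DELETE FROM locks WHERE target = ? AND owner = ?", (target, owner))
```

The primary key on `target` is what prevents two workers from holding the same brand; the lease is what lets a replacement worker take over after a crash.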

L02

Gemini Extraction

Gemini calls wrapped in a schema-validate-then-retry harness. Each output is parsed against a typed brand schema; on failure, the prompt is re-sent with the validation error inlined until the model produces conforming structured data. Outputs land in the database typed and clean, no cleanup pass required.
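
A minimal Python sketch of a schema-validate-then-retry loop. The `call_model` parameter is a stand-in for the model client (no real Gemini API is used here), and the schema fields and attempt cap are hypothetical.

```python
import json
from typing import Callable, Dict

# Hypothetical brand schema: field name -> required Python type.
REQUIRED_FIELDS = {"name": str, "website": str, "categories": list}


def validate(raw: str) -> Dict:
    """Parse model output and check it against the schema; raise ValueError if it fails."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}")
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, typ in REQUIRED_FIELDS.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"field {key!r} must be {typ.__name__}")
    return data


def extract(call_model: Callable[[str], str], prompt: str, max_attempts: int = 3) -> Dict:
    """Re-send the prompt with the validation error inlined until the output conforms."""
    attempt_prompt = prompt
    for _ in range(max_attempts):
        try:
            return validate(call_model(attempt_prompt))
        except ValueError as exc:
            attempt_prompt = (
                f"{prompt}\n\nPrevious output was rejected: {exc}. "
                "Return corrected JSON only."
            )
    raise RuntimeError("model never produced schema-conforming output")
```

Capping the attempts is a design choice added here; without a cap, a prompt the model cannot satisfy would loop forever.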

L03

Taxonomy Repair

Background repair loops scan the structured outputs for category drift, then merge, split, or remap categories to keep the taxonomy canonical. Brand records accumulate; the taxonomy stays coherent past 7,500 records and counting.
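
A minimal Python sketch of one repair pass covering the merge and remap cases only (splitting a category needs domain rules and is left out). The alias table, the record shape, and the `repair` helper are hypothetical.

```python
from typing import Dict, List

# Hypothetical alias table mapping drifted labels to canonical categories.
ALIASES = {"foot wear": "footwear", "shoes": "footwear", "outer wear": "outerwear"}


def canonical(label: str) -> str:
    """Normalize case and whitespace, then remap known aliases."""
    key = " ".join(label.lower().split())
    return ALIASES.get(key, key)


def repair(records: List[Dict]) -> int:
    """Remap drifted categories in place; return how many records changed."""
    changed = 0
    for record in records:
        fixed = sorted({canonical(c) for c in record["categories"]})
        if fixed != record["categories"]:
            record["categories"] = fixed
            changed += 1
    return changed
```

The pass is idempotent, so it can run repeatedly in the background without churning records that are already canonical.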

L04

Retailer-Bridge Integration

Each discovered brand record lands in the brand-retailer network model with retailer associations attached. Future uploads recognize the brand instantly; unsigned brands surface in cross-sell flows as concrete leads. Discoveries flow into product features as soon as they are written.

WHAT IT DELIVERS

Four outcomes the discovery pipeline produces in production.

RESULT_01

7,500+ Brands Discovered

structured records ready for product use

The pipeline has captured structured research for over 7,500 retail brands. Each record is schema-validated, normalized, and ready to feed product features. A usable taxonomy, not a scrape dump.

RESULT_02

Network Grows As A Side Effect

every unknown upload feeds the product

Every unknown brand upload triggers research that improves the product for the next user. The brand-retailer network grows continuously, without any user being asked to map or validate brands.

RESULT_03

Self-Healing Taxonomy

mapping repair runs continuously

Mapping repair loops catch category drift as the dataset grows past thousands of records. Taxonomy stays consistent without manual cleanup.

RESULT_04

Continuous, Unattended

no human bottleneck in discovery

The pipeline runs unattended on production traffic. Schema validation, retry, lock coordination, and taxonomy repair handle the failure modes that would otherwise need engineering attention. The agent has a job and does it.

WHOLESALE COMMERCE // AT SCALE
WORK_04

Elastic / PlumRiver B2B Ordering Platform

I built the core software platform behind a wholesale commerce business acquired for $47M. The platform handled ERP integrations, real-time inventory, client-specific workflows, and large transaction volume across 350+ AFA brands including Patagonia, The North Face, Crocs, Oakley, Burton, and Puma.

"Unlike other B2B platforms, this one is fluid. No heavy loading times, even with massive brand catalogs."

INTERNATIONAL RETAILER // ELASTIC
ORDERING · ERP / EDI · REAL-TIME INVENTORY
$21B ANNUAL GMV · 21% WHOLESALE LIFT · 250K+ RETAILERS · 17% FASTER TTM

INTEGRATION-BOUND

Integration was the gnarly part. Ingest client ERP data on one side, push orders out through EDI 850 and SAP on the other, and make the round-trip reliable enough for real money to move on top.

  • Legacy ERP environments and source-system variability
  • Order output through EDI 850 and similar standards
  • Operational reliability for live brand-retailer commerce

CORE COMMERCE STACK

STACK

Five layers I built to make wholesale ordering work at enterprise scale. Every integration is legacy, every client is different, every order moves money.

L01
PRIMARY

Order Authoring

Wholesale ordering, not ecommerce checkout. Multi-PO entry, broken-pack quantities, retailer-specific assortment workflows, and the operational logic buying teams use during a market week.

L02

ERP + EDI Integration

Heterogeneous legacy ERPs were the hardest constraint. ETL on the way in, EDI 850 and SAP order output on the way out. The platform supported 30+ currencies and 18 languages across international clients.

L03

Shared Toolbox

Modular client-specific extensions on a shared toolbox. Brand-by-brand customization without collapsing the platform into per-client forks.

L04

Real-Time Inventory

Inventory was not a static catalog field. ERP sync, regional warehouses, multiple shipments per PO, and date-based availability, all reflected back to the buyer at order time.
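
A minimal Python sketch of date-based availability for one SKU at one warehouse. The formula (on hand, plus inbound shipments arriving by the requested ship date, minus units already committed) is a common available-to-promise simplification assumed for illustration, not the platform's actual rule; all names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Shipment:
    warehouse: str
    arrives: date
    units: int


def available_to_order(on_hand: int, inbound: List[Shipment], committed: int,
                       ship_date: date, warehouse: str) -> int:
    """Units a buyer can order for a given ship date from a given warehouse."""
    arriving = sum(
        s.units for s in inbound
        if s.warehouse == warehouse and s.arrives <= ship_date
    )
    return max(0, on_hand + arriving - committed)  # never show negative stock
```

The same SKU can show different availability depending on the ship date the buyer picks, which is why a static catalog field is not enough.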

L05

Modernization in Place

Incremental Dojo-to-React migration across a live commerce codebase taking actual orders. No platform downtime, no feature freeze.

WHAT IT ENABLED

The platform held up under commercial pressure. The numbers below are from public case studies.

RESULT_01

Scale

$21B annual GMV

Public materials now put the platform's annual GMV at about $21B.

SRC // ELASTICSUITE.COM
RESULT_02

Revenue Impact

21% increase in wholesale revenue

Case-study material reports up to a 21% lift in wholesale revenue tied to the product.

RESULT_03

Launch Velocity

800+ orders / $11M in two weeks

Hestra processed more than 800 orders totaling over $11M in the first two weeks after launch.

SRC // HESTRA CASE STUDY
RESULT_04

Operational ROI

Revenue + operational impact

$1.8M weeks scaling to $4M weeks for one brand. One month faster time to market for The North Face.

NON-OBVIOUS PRODUCT // CONSUMER-GRADE UX IN B2B
WORK_05

Elastic Whiteboard · Visual Merchandising System

I built a visual merchandising and presentation system that pushed the product past order entry. Brands could build line sheets, assortments, and whiteboard-style sell-in decks inside the same platform that took the order, instead of bouncing between InDesign, Excel, and PDFs nobody trusted. The order form is the floor; merchandising is the work.

"If anyone can use Amazon, they can enter an order on Elastic."

BUYER VOICE // ELASTIC
VISUAL SELLING · CATALOG TOOLING · CANVAS UX
21% B2B SALES LIFT · $100K/SEASON PRINT SAVED · 103% AVG KPI LIFT

PAST ORDER ENTRY

B2B ordering software handles transactions and stops. The actual sales process happens in disconnected tools the platform never sees: assortment planning, line sheets, sell-in decks.

  • PDFs and decks go stale the moment a product detail changes
  • Reps re-key data from sell-sheets back into the order form
  • Visual catalogs cannot reference live inventory or pricing
  • Merchandising and ordering live in different products

MERCHANDISING STACK

STACK

Four layers that turn a B2B ordering product into a visual merchandising one. Users see the canvas; the other three layers make it work.

L01
PRIMARY

Catalog Templating

An HTML/CSS system for rebuilding each brand's print catalog digitally. Brand identity preserved per client, not flattened into one generic layout.

L02

Authoring Harness

Enough structure that template work shifted to operations and design, off the engineering critical path.

L03

PDF Pipeline

Automated print-to-PDF generation via headless Chromium. The digital workflow still produced the printable output brand teams depended on.
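
A minimal Python sketch of driving headless Chromium from a script. The `--headless` and `--print-to-pdf` flags are real Chromium command-line switches; the binary name `chromium`, the timeout, and both function names are assumptions, since the binary varies by install (for example `chromium-browser` or `google-chrome`).

```python
import subprocess
from pathlib import Path
from typing import List


def pdf_command(html_path: Path, pdf_path: Path, browser: str = "chromium") -> List[str]:
    """Build the headless print-to-PDF command for one rendered catalog page."""
    return [
        browser,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={pdf_path}",
        html_path.resolve().as_uri(),  # file:// URL of the HTML catalog
    ]


def render_pdf(html_path: Path, pdf_path: Path, browser: str = "chromium") -> None:
    """Run the browser; raises CalledProcessError if it exits non-zero."""
    subprocess.run(pdf_command(html_path, pdf_path, browser), check=True, timeout=120)
```

Splitting command construction from execution keeps the pipeline testable on machines without a browser installed.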

L04

Whiteboard Canvas

A browser-side canvas for assortments, line sheets, and merchandising decks. Every product on the canvas was live in the order. Drag, drop, edit a quantity, the order updated.
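
A minimal Python sketch of the "canvas is live in the order" idea: the canvas holds no quantities of its own and reads and writes through the order. The class shapes are hypothetical, and the real canvas ran in the browser.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Order:
    lines: Dict[str, int] = field(default_factory=dict)  # sku -> quantity

    def set_quantity(self, sku: str, qty: int) -> None:
        if qty < 0:
            raise ValueError("quantity cannot be negative")
        if qty == 0:
            self.lines.pop(sku, None)  # zero removes the line
        else:
            self.lines[sku] = qty


@dataclass
class Canvas:
    order: Order                                   # single source of truth
    tiles: List[str] = field(default_factory=list)

    def drop(self, sku: str, qty: int = 1) -> None:
        """Dropping a product on the canvas also writes an order line."""
        if sku not in self.tiles:
            self.tiles.append(sku)
        self.order.set_quantity(sku, qty)

    def quantity(self, sku: str) -> int:
        """The canvas displays whatever the order currently holds."""
        return self.order.lines.get(sku, 0)
```

With one store behind both views, there is nothing to re-key from a deck into an order form.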

WHAT IT CHANGED

What changed for brands once the merchandising layer landed.

RESULT_01

Sales Impact

21% increase in B2B sales

Up to a 21% direct increase attributed to the platform's merchandising tools and ease of use.

RESULT_02

Print Savings

$100k per season saved

Stanley-PMI saved about $100,000 per season in print costs through the digital catalog transition.

SRC // STANLEY-PMI CASE STUDY
RESULT_03

Merchandising Performance

103% average KPI increase

A footwear case study reported an average 103% lift across key KPIs tied to collection-builder and visual selling workflows.

SRC // LEADING FOOTWEAR CASE STUDY
RESULT_04

Merchandising Where The Order Lives

no more parallel tooling

Brand teams stopped maintaining parallel catalogs in InDesign and order forms in Elastic. Merchandising and ordering moved into one platform with the same product data on every surface.

[01B_FIT] // WHO THIS IS FOR

THE RIGHT
FIT.

I'm a sharper fit for some engagements than others. Rather than waste anyone's time, here's the kind of work where I move things forward, and the kind where I'd steer you elsewhere.

FIT_FOR
+01

Who this is for.

  • Software with real users, or about to have them.
  • AI features that need to ship, not demo.
  • Token cost or rate limits creeping upward.
  • Senior IC or embedded lead role.
  • Teams ready to move now.
NOT_FOR
-01

Who this isn't.

  • Demo prototypes with no path to users.
  • Pure research projects with no production goal.
  • Junior team staffing or backfill roles.
  • Teams designing for hypothetical scale they don't have.
  • Mature internal AI platform teams already shipping reliably.
[01C_SHAPE] // HOW I ENGAGE
ROLE
Senior IC or embedded lead. No agency overhead.
SHAPE
Project-shaped engagements.
CADENCE
Full- or half-time. Daily standups if useful, not required.
HOURS
Madrid-based. Flexible overlap with US Pacific and Eastern.
[05_GET_IN_TOUCH] // CONTACT

LET'S
TALK.

Currently available for senior contract or full-time work. Best fit: companies adding practical AI, LLM, or agentic workflows to software products, where the harness around the model matters as much as the model itself.

If you're working on something in that space, get in touch.

Lyle Underwood
EMAIL
lyleunderwood@gmail.com
BASED
Madrid, Spain
WORKS WITH
US companies, remote
AVAILABLE
Short-term contracts
LOOKING FOR
Practical AI / harness work