How AI Estimating Runs on 80-Page Commercial TI Plan Sets

AI estimating tools that work on a 5-sheet bath remodel break in predictable ways on an 80-sheet pharmaceutical compounding TI. The failures show up as missing scope categories, hallucinated quantities, generic line items that do not match the actual plan, or a request body cap that stops the upload from completing. The BuildCrux multi-pass pipeline is built specifically to avoid those failures on commercial scope. This page documents the architecture.

This is the pipeline as it actually runs in production today. Everything below is observable in the BuildCrux dashboard during a commercial estimating run.

Why commercial TI breaks single-pass AI

Request body cap: most LLM APIs cap upload bodies at 25-32 MB; commercial plan sets routinely exceed that. Single-pass tools either reject the upload or strip pages silently.
Context window dilution: 80 sheets of architectural detail compete for the model's attention. Important detail on the structural sheet gets missed because the cover sheet, finish schedule, and equipment cut sheets are also competing for context.
Reasoning time cap: serverless function timeouts (typically 60-300 seconds) cannot accommodate the 4-12 minute reasoning passes required for genuine commercial complexity.
No tool use: most single-pass AI estimating tools do not give the model a compute_area or lookup_unit_cost tool. Without tools, the model estimates instead of measures, and applies generic unit costs instead of calibrated ones.
No scope hierarchy: the model outputs whatever line items emerge from its attention pattern. Customer-facing commercial proposals need scope grouped by trade, with code-driven categories made explicit.

Pipeline overview

Component	Job	Model / tool	Typical commercial runtime
Files API upload	Move plan set (up to 500 MB) to AI-accessible storage	Files API	15-90 seconds depending on file size
Pass 1	Identify every sheet, tag by type	Fast pipeline	30-90 seconds
Mode detection + confirmation	Surface standard vs detailed pipeline modal	Deterministic logic + UI	~10 seconds (user input)
Pass 2	Multi-discipline quantity takeoff	Fast pipeline + compute_area tool	2-6 minutes
Pass 3 (standard)	Priced estimate, residential or simple commercial	Fast pipeline + lookup_unit_cost	1-3 minutes
Pass 3 (detailed)	Priced estimate, complex commercial	Detailed pipeline + lookup_unit_cost + streaming	4-12 minutes
Post-process	Scope-driven categories, baselines, display name	Deterministic code	<2 seconds
Total commercial 30-sheet	—	—	5-10 minutes
Total commercial 80-sheet	—	—	10-15 minutes

Pass 1: Multi-discipline sheet identification

The first pass reads every page of the plan set and tags it. For an 80-sheet pharma compounding plan set, Pass 1 completes in 38 to 75 seconds. The output is a structured inventory that drives the rest of the pipeline.

Architectural sheets: cover, sheet index, existing conditions, demolition, proposed floor plan, elevation, section, finish schedule, door/hardware schedule, wall types.
Engineering sheets: structural (foundations, slab cuts, equipment supports), electrical (load calcs, panel schedules, lighting, controls), plumbing (USP-compliant water, lab waste, gas), mechanical (HVAC, classified-space pressure, exhaust).
Fire protection sheets: sprinkler plans, head schedule, modifications, special-hazard suppression.
Specialty equipment sheets: cut sheets, equipment schedules, install detail.
Non-drawing pages: energy reports, geotech excerpts, addenda, specifications. Flagged so Pass 2 and Pass 3 skip them — a meaningful efficiency win on commercial plan sets where 5 to 15 percent of pages are non-drawing.
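
A minimal sketch of how a Pass 1 inventory could be represented and filtered. The `Sheet` type, the discipline labels, and `drawing_sheets` are hypothetical names chosen for illustration; they are not the BuildCrux schema.

```python
from dataclasses import dataclass

# Hypothetical discipline tags mirroring the groups listed above.
DISCIPLINES = {
    "architectural", "engineering", "fire_protection",
    "specialty_equipment", "non_drawing",
}

@dataclass(frozen=True)
class Sheet:
    page: int         # 1-based page number in the PDF
    discipline: str   # one of DISCIPLINES
    sheet_type: str   # e.g. "demolition", "panel schedule"

    def __post_init__(self):
        if self.discipline not in DISCIPLINES:
            raise ValueError(f"unknown discipline: {self.discipline}")

def drawing_sheets(inventory):
    """Return only the sheets that later passes should read."""
    return [s for s in inventory if s.discipline != "non_drawing"]

inventory = [
    Sheet(1, "architectural", "cover"),
    Sheet(2, "architectural", "demolition"),
    Sheet(3, "engineering", "panel schedule"),
    Sheet(4, "non_drawing", "energy report"),
]
kept = drawing_sheets(inventory)
print([s.page for s in kept])  # [1, 2, 3]
```

Tagging non-drawing pages once, in the inventory, means the later passes can skip them with a simple filter instead of re-reading every page.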

Auto commercial detection + mode confirmation

Between Pass 1 and Pass 2, the pipeline detects whether the plan set is commercial or residential based on Pass 1 output: sheet count, discipline mix (presence of structural / fire / specialty equipment sheets), scope keywords in cover and general notes. A confirmation modal surfaces:

Standard pipeline (the fast pipeline for Pass 3): 1 credit, ~3-5 minute total runtime, appropriate for residential and simple commercial.
Detailed pipeline (the detailed pipeline for Pass 3): 15 credits, ~10-15 minute total runtime, appropriate for multi-discipline commercial TI with scope-driven categories.

The contractor confirms before any credits are deducted. The recommended mode is pre-selected based on detection (commercial defaults to the detailed pipeline; residential defaults to standard). The contractor can override in either direction.
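
The detection step can be sketched as plain deterministic logic over the three signals named above. Everything specific here is an assumption for illustration: the function name `recommend_mode`, the keyword list, and the two-of-three voting rule. The 30-sheet threshold and the 1 and 15 credit figures are the ones quoted on this page.

```python
from dataclasses import dataclass

# Assumed tags and keywords; the real detector's lists are not published here.
COMMERCIAL_DISCIPLINES = {"structural", "fire_protection", "specialty_equipment"}
COMMERCIAL_KEYWORDS = ("tenant improvement", "cleanroom", "usp")

@dataclass
class Recommendation:
    is_commercial: bool
    mode: str      # "standard" or "detailed"
    credits: int

def recommend_mode(sheet_count, disciplines, cover_text, min_sheets=30):
    """Vote on three signals; two or more marks the set as commercial (assumed rule)."""
    signals = [
        sheet_count >= min_sheets,
        bool(COMMERCIAL_DISCIPLINES & set(disciplines)),
        any(k in cover_text.lower() for k in COMMERCIAL_KEYWORDS),
    ]
    if sum(signals) >= 2:
        return Recommendation(True, "detailed", 15)
    return Recommendation(False, "standard", 1)

pharma = recommend_mode(
    80, ["architectural", "structural", "fire_protection"],
    "Pharmaceutical compounding tenant improvement",
)
bath = recommend_mode(5, ["architectural"], "Bath remodel")
print(pharma.mode, bath.mode)  # detailed standard
```

The result is only a pre-selection for the modal; the contractor's choice is what determines the mode and the credits charged.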

Pass 2: Multi-discipline quantity takeoff

Pass 2 uses the sheet inventory from Pass 1 and runs takeoff. The model has access to compute_area, a deterministic measurement tool that runs on PDF coordinates the model hands it. When the model needs to know the square footage of a classified-space partition or the linear feet of sealed ductwork, it invokes compute_area and gets a measured answer instead of estimating from the rendered image.
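
As an illustration of what a deterministic area tool can look like, here is a sketch that applies the shoelace formula to polygon vertices given in PDF points (the default PDF unit, 1/72 inch) and converts to square feet using the drawing scale. The function name `polygon_area_sqft` and its parameters are hypothetical; the actual compute_area signature is not documented on this page.

```python
def polygon_area_sqft(points_pt, feet_per_paper_inch):
    """Shoelace area of a closed polygon whose vertices are in PDF points.

    feet_per_paper_inch is the drawing scale, e.g. 4.0 for 1/4" = 1'-0".
    """
    if len(points_pt) < 3:
        raise ValueError("need at least three vertices")
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to close the polygon.
    for (x1, y1), (x2, y2) in zip(points_pt, points_pt[1:] + points_pt[:1]):
        twice_area += x1 * y2 - x2 * y1
    area_pt2 = abs(twice_area) / 2.0
    feet_per_pt = feet_per_paper_inch / 72.0
    return area_pt2 * feet_per_pt ** 2

# A 360 pt x 180 pt rectangle is 5 in x 2.5 in on paper.
# At 1/4" = 1'-0" that is 20 ft x 10 ft.
room = [(0, 0), (360, 0), (360, 180), (0, 180)]
area = polygon_area_sqft(room, feet_per_paper_inch=4.0)
print(round(area, 2))  # 200.0
```

Because the result depends only on the coordinates and the scale, the same inputs always give the same quantity, which is the property that separates measuring from estimating off a rendered image.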

Pass 2 takeoff output for the $686K pharma compounding TI: 48 quantity items across 12 trade groups.

Trade group	Items	Quantity examples
Demo	3 items	2,950 sf TI to studs + dumpster + protection
Hazmat abatement	2 items	1,850 sf asbestos + 420 sf lead paint
Structural	2 items	85 lf slab cuts + 125 sf slab patch
Framing/partitions	6 items	385 lf classified partition + 95 lf non-classified + 1,840 sf sealed ceiling
Plumbing	3 items	1 USP water loop + 12 fixtures + 85 lf lab waste
HVAC	5 items	1 dedicated AHU + 24 HEPA boxes + 485 lf sealed duct
Electrical	5 items	400A-to-800A panel + 48 classified receptacles + 64 cleanroom LEDs
Fire protection	3 items	85 sprinkler heads + 32 smoke detectors + special hazard
Specialty equipment	7 items	2 LAF + 2 BSC + 4 pass-throughs + 1 sterilizer + 3 refrigeration + 6 workstations + commissioning
Roof/exterior	2 items	4 curb cuts + 8 wall penetrations
Finishes/detail	7 items	epoxy paint + standard paint + casework + doors + signage + glazing
Closeout	3 items	final clean + validation support + punch
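
A quick consistency check on the table above: the per-trade item counts should add up to the stated totals. The dictionary simply transcribes the Items column; the variable names are illustrative.

```python
# Item counts transcribed from the Pass 2 takeoff table.
takeoff_items = {
    "Demo": 3,
    "Hazmat abatement": 2,
    "Structural": 2,
    "Framing/partitions": 6,
    "Plumbing": 3,
    "HVAC": 5,
    "Electrical": 5,
    "Fire protection": 3,
    "Specialty equipment": 7,
    "Roof/exterior": 2,
    "Finishes/detail": 7,
    "Closeout": 3,
}
trade_groups = len(takeoff_items)
total_items = sum(takeoff_items.values())
print(trade_groups, total_items)  # 12 48
```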

Pass 3: Detailed pipeline priced estimate with streaming

Pass 3 is the heaviest reasoning pass. On commercial multi-discipline plan sets the model needs to reason about which scope categories apply, which unit costs are appropriate for classified vs non-classified scope, how commercial uplift varies by trade (specialty equipment uplift is different from finishes uplift), and how to group output into a customer-facing line-item structure.

BuildCrux runs commercial Pass 3 on the detailed pipeline with two infrastructure pieces that single-pass AI tools typically lack:

Streaming API: the detailed pipeline reasoning on 80-sheet plan sets can take 8 to 12 minutes. Standard serverless function timeouts (60-300 seconds) cannot accommodate this. Streaming keeps the request open during the long reasoning run and surfaces partial output as it completes.
1M context beta: the full plan set, the Pass 1 inventory, the Pass 2 takeoff, and the system prompt all need to coexist in context for Pass 3 to reason accurately. The 1M context beta accommodates this on the largest commercial plan sets.
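
To make "surfaces partial output as it completes" concrete, here is a sketch of a consumer that emits each line item the moment it has fully arrived, even when chunk boundaries fall mid-record. It assumes the model writes newline-delimited JSON, which is a hypothetical output format, and it uses a simulated list of chunks rather than a real provider API.

```python
import json

def iter_line_items(chunks):
    """Yield each complete JSON line as soon as it has fully arrived."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    # Flush a final record that was not newline-terminated.
    if buffer.strip():
        yield json.loads(buffer)

# Simulated stream: chunk boundaries do not align with record boundaries.
fake_stream = [
    '{"trade": "HVAC", "item": "HEPA bo',
    'xes", "qty": 24}\n{"trade": "Electri',
    'cal", "item": "cleanroom LEDs", "qty": 64}\n',
]
items = list(iter_line_items(fake_stream))
print([i["item"] for i in items])  # ['HEPA boxes', 'cleanroom LEDs']
```

A generator like this lets a UI render rows progressively during a multi-minute run instead of waiting for the whole response.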

Pass 3 also enforces the 3-tier line-item structure described in the estimating-guide page: universal categories (always present), scope-driven categories (present when triggered by scope), trade-detail categories (present when scope is granular enough). The structure prevents output from being a flat list of 80 line items; it produces customer-readable 40 to 60 line-item estimates grouped by trade.
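
One way to picture the 3-tier structure is as a grouping step over a flat list of items. The tier keys, the `group_line_items` helper, and the tier assigned to each sample item are assumptions for illustration only.

```python
from collections import defaultdict

TIERS = ("universal", "scope_driven", "trade_detail")

def group_line_items(items):
    """Group a flat list into {tier: {trade: [descriptions]}}."""
    grouped = {tier: defaultdict(list) for tier in TIERS}
    for item in items:
        if item["tier"] not in grouped:
            raise ValueError(f"unknown tier: {item['tier']}")
        grouped[item["tier"]][item["trade"]].append(item["description"])
    # Convert the inner defaultdicts to plain dicts for clean output.
    return {tier: dict(trades) for tier, trades in grouped.items()}

flat = [
    {"tier": "universal", "trade": "Demo", "description": "TI demo to studs"},
    {"tier": "scope_driven", "trade": "Hazmat abatement", "description": "asbestos abatement"},
    {"tier": "scope_driven", "trade": "Hazmat abatement", "description": "lead paint abatement"},
    {"tier": "trade_detail", "trade": "HVAC", "description": "HEPA boxes"},
]
estimate = group_line_items(flat)
print(estimate["scope_driven"])
```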

Post-processing: scope-driven categories + validation

Pass 3 output is not the final estimate. Deterministic post-processing applies:

Scope-driven category validation: confirm all 5 commercial categories surfaced where triggered (fire, hazmat, structural, specialty equipment, roof). If a category is triggered by Pass 1 sheet inventory but missing from Pass 3 output, flag for contractor review.
Universal baselines: confirm every estimate includes the universal categories (demolition, general conditions, final clean, dumpster). Add baselines if Pass 3 omitted them.
Scope filter application: if the user requested a sub-bid scope filter (e.g. millwork-only), strip line items outside the filter.
Display name auto-generation: pull address from cover sheet, version number from history, generate display name like "1234 Main St, Suite 100 - V1".
Cost telemetry: log actual model usage and per-line-item costs to ai_usage_log for margin tracking.
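
The first, second, and fourth steps above can be sketched as pure functions over the Pass 1 tags and the Pass 3 categories. The `SCOPE_TRIGGERS` mapping, the tag names, and `post_process` are hypothetical; the category names and the display-name format follow the text above.

```python
# Assumed mapping: Pass 1 sheet tag -> estimate category it should trigger.
SCOPE_TRIGGERS = {
    "fire_protection": "fire",
    "hazmat": "hazmat",
    "structural": "structural",
    "specialty_equipment": "specialty equipment",
    "roof": "roof",
}
UNIVERSAL = ("demolition", "general conditions", "final clean", "dumpster")

def post_process(sheet_tags, categories, address, version):
    """Return (final categories, review flags, display name)."""
    present = set(categories)
    # Triggered by the sheet inventory but missing from the estimate: flag it.
    flags = sorted(
        cat for tag, cat in SCOPE_TRIGGERS.items()
        if tag in sheet_tags and cat not in present
    )
    # Universal baselines are added rather than flagged.
    added = [c for c in UNIVERSAL if c not in present]
    display_name = f"{address} - V{version}"
    return list(categories) + added, flags, display_name

cats, flags, name = post_process(
    sheet_tags={"structural", "fire_protection"},
    categories=["demolition", "structural", "final clean"],
    address="1234 Main St, Suite 100",
    version=1,
)
print(flags)  # ['fire']
print(name)   # 1234 Main St, Suite 100 - V1
```

Note the asymmetry: a missing scope-driven category is flagged for contractor review, while a missing universal baseline is filled in automatically.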

Scope filter for sub-trade contractors

Commercial TI sub-trade contractors (electrical, mechanical, fire protection, specialty equipment) often bid off the same plan set as the GC but need only their scope in the output. The scope filter lets a sub generate a single-discipline estimate from the same plan set:

Pass 1 still tags every sheet (no behavior change).
Pass 2 still computes takeoff across all disciplines (no behavior change).
The Pass 3 system prompt receives the scope filter as a constraint: produce only line items inside the filter scope. The model can still reference cross-discipline detail (e.g. classified-space requirements that affect electrical), but output is filtered.
Post-processing applies a second filter pass to catch any leakage.
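
The second filter pass in the last step can be as simple as a deterministic partition by trade. This is a sketch under the assumption that each line item carries a `trade` field; `apply_scope_filter` is a hypothetical name.

```python
def apply_scope_filter(line_items, allowed_trades):
    """Split items into (kept, leaked) by trade, case-insensitively."""
    allowed = {t.lower() for t in allowed_trades}
    kept, leaked = [], []
    for item in line_items:
        target = kept if item["trade"].lower() in allowed else leaked
        target.append(item)
    return kept, leaked

pass3_output = [
    {"trade": "Electrical", "item": "classified receptacles"},
    {"trade": "Electrical", "item": "cleanroom LEDs"},
    {"trade": "HVAC", "item": "sealed duct"},  # leakage the prompt constraint missed
]
kept_items, leaked_items = apply_scope_filter(pass3_output, ["electrical"])
print(len(kept_items), len(leaked_items))  # 2 1
```

Returning the leaked items, rather than silently dropping them, makes it possible to log how often the prompt-level constraint alone was not enough.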

In testing on the $686K pharma plan set: full-scope output $686,646; millwork-only filter output $86,400 with clean line items; electrical-only filter output $138,820 with clean line items; HVAC-only filter output $225,975 with clean line items. No contamination between filter scopes.

Try the pipeline on your next commercial bid

14-day free trial. Upload a 30 to 80 page plan set. Watch all three passes run.

Frequently asked questions

Why use the detailed pipeline instead of the fast pipeline for commercial?

The detailed pipeline routes Pass 3 to a higher-reasoning model class. On commercial multi-discipline scope (30+ sheets, scope-driven categories, classified vs non-classified space differentiation), the detailed pipeline produces measurably better line-item structure and scope-category capture than the fast pipeline. For residential or simple commercial (5-15 sheets, single-discipline dominant), the fast pipeline is faster and equally accurate.

What happens if the AI provider API is down or the detailed pipeline is rate-limited?

The pipeline falls back: Pass 3 retries on the detailed pipeline first, then falls back to the fast pipeline with a flag in the output noting the fallback. The contractor sees a clear notice and can re-run later on the detailed pipeline if they want the higher-quality output. The pipeline does not silently degrade.
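
A minimal sketch of the retry-then-fallback behavior described in this answer. The function name `run_pass3`, the retry count, and the use of `RuntimeError` to signal an outage or rate limit are all assumptions; the two pipelines are passed in as plain callables so the example runs without any provider API.

```python
def run_pass3(run_detailed, run_fast, max_detailed_attempts=2):
    """Try the detailed pipeline a few times, then fall back and flag it."""
    last_error = None
    for _ in range(max_detailed_attempts):
        try:
            estimate = run_detailed()
            return {"pipeline": "detailed", "fallback": False, "estimate": estimate}
        except RuntimeError as err:  # assumed failure signal
            last_error = err
    # The fallback is recorded in the result so it is never silent.
    return {
        "pipeline": "fast",
        "fallback": True,
        "notice": f"Detailed pipeline unavailable ({last_error}); ran fast pipeline.",
        "estimate": run_fast(),
    }

def always_rate_limited():
    raise RuntimeError("rate limited")

result = run_pass3(always_rate_limited, lambda: "fast estimate")
print(result["pipeline"], result["fallback"])  # fast True
```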

How much does a commercial pipeline run cost in credits?

Commercial AI estimates use 15 credits (versus 1 credit for residential). Credit pricing varies by tier: the Crew tier ($149/mo) includes 200 standard credits. The worst-case detailed-pipeline run on an 80-sheet pharma TI costs approximately $4-5 in actual AI inference cost, well inside the credit margin. Overage pricing is calibrated to maintain 70%+ gross margin even at worst-case AI cost.

Does the pipeline work on hand-marked-up plan sets (markups over original PDF)?

Yes, with the caveat that AI accuracy on handwritten markups is lower than on clean digital plans. The pipeline reads markups but interprets them as approximate rather than authoritative. Best practice on commercial TI is to ask the architect for a clean revision before bidding; second-best is to flag the marked-up areas during the contractor review step.

Can I see the intermediate output from Pass 1 and Pass 2?

Yes. The BuildCrux dashboard exposes Pass 1 sheet inventory and Pass 2 quantity list as inspectable intermediate outputs. This is useful for sanity-checking the pipeline before committing to a Pass 3 detailed-pipeline run that uses 15 credits.

How does the pipeline handle plan sets larger than 500 MB?

The Files API caps at 500 MB. Plan sets larger than that are split per-page using pdf-lib in the upload step, then re-assembled in the model context. The split is rare in practice — even 100-sheet plan sets typically come in under 250 MB. The 500 MB ceiling is a long-tail safety net.
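
The decision of when to split can be sketched independently of the PDF library. This example only plans the upload groups from per-page byte sizes; the actual page extraction would happen in pdf-lib as described above. The name `plan_upload`, the choice of 1 MB = 1024 * 1024 bytes, and the error on an oversized single page are assumptions.

```python
MB = 1024 * 1024           # assumed definition of a megabyte
FILES_API_CAP = 500 * MB   # ceiling quoted above

def plan_upload(page_sizes_bytes, cap=FILES_API_CAP):
    """Return groups of page indexes: one group if the set fits, else one per page."""
    if sum(page_sizes_bytes) <= cap:
        return [list(range(len(page_sizes_bytes)))]
    for i, size in enumerate(page_sizes_bytes):
        if size > cap:
            raise ValueError(f"page {i} alone exceeds the cap")
    return [[i] for i in range(len(page_sizes_bytes))]

typical = plan_upload([200 * MB, 200 * MB])   # fits in one upload
oversize = plan_upload([300 * MB, 300 * MB])  # split per page
print(typical, oversize)  # [[0, 1]] [[0], [1]]
```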

The bottom line

AI estimating works on commercial scope because of architecture, not magic. Multi-pass pipelines with tool use, streaming infrastructure, 1M context, and scope-driven category enforcement beat single-pass prompting on 80-sheet commercial plan sets. BuildCrux is the only AI estimating tool that publishes both the architecture and the validation results. If you bid commercial TI work, this is the pipeline you want behind the takeoff.

See the $686K pharma TI this pipeline produced

Run the pipeline on your next TI bid

14-day free trial. 30-day money-back guarantee.