Case Study
AI Document Automation for Mid-Market Insurance Broker
An LLM-powered document processing pipeline that reduced policy document handling time by 78% and cut data entry errors to near-zero for a 120-staff insurance brokerage.
- Client: Harrington & Cole Insurance Brokers
- Industry: Financial Services
- Duration: 11 weeks
- Delivered: February 2026
Key Results
- Document processing time reduced by 78%
- Data entry errors reduced by 94%
- £210,000 projected annual staff cost saving
- Processing capacity: 3,000 documents/day, up from ~400 manual
- 6-week payback period on project cost
The Challenge
Harrington & Cole Insurance Brokers handle commercial and professional indemnity insurance for over 4,000 SME clients. At the centre of their renewal and new business process is a documentation-heavy workflow: insurers supply policy documents, endorsements, and schedules in inconsistent formats — PDF, Word, and occasionally scanned paper — that must be read, validated, and re-keyed into their broker management system (Acturis).
The manual process consumed approximately 1.8 full-time equivalent staff roles and created a bottleneck that slowed renewal turnaround times and introduced transcription errors that occasionally led to client disputes.
The specific problems:
- A team of three junior staff spent 60–70% of their time on manual document reading and data entry
- Error rate on transcription was estimated at 3.2% — causing downstream issues when policy terms were incorrectly recorded
- Peak renewal periods (January and June) created processing backlogs that delayed client-facing documentation by 2–5 days
- The brokerage could not scale without hiring more document handlers
The brief: automate the extraction, validation, and ingestion of insurance documents into Acturis — without replacing the human review step for high-value or complex policies.
Our Approach
We began with a two-week discovery sprint to understand the document taxonomy: 14 distinct document types across 8 insurance categories, each with varying layouts from different insurer templates.
Week 1–2: Discovery & scoping
We audited 1,200 historical documents to understand layout variation, field distribution, and extraction complexity. We identified three document tiers by complexity:
- Tier 1 (68% of volume): Standardised formats with consistent field positions — high automation confidence
- Tier 2 (24% of volume): Semi-structured with variable layout — AI extraction with human validation flag
- Tier 3 (8% of volume): Highly bespoke or handwritten — routed directly to human handler
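The tier split above translates into a simple routing rule. A minimal sketch — the document-type names, tier membership, and the 0.85 confidence threshold are illustrative assumptions, not the production configuration:

```python
# Illustrative routing for the three document tiers. Tier membership
# and the confidence threshold are assumptions for illustration.

def route_document(doc_type: str, classifier_confidence: float) -> str:
    """Return the processing route for an incoming document."""
    TIER_1 = {"policy_schedule", "renewal_notice"}      # standardised layouts
    TIER_2 = {"endorsement", "broker_correspondence"}   # variable layouts
    # Anything else (bespoke or handwritten) is treated as Tier 3.

    if classifier_confidence < 0.85:
        return "human_handler"          # uncertain classification: send to a person
    if doc_type in TIER_1:
        return "auto_extract"           # high automation confidence
    if doc_type in TIER_2:
        return "extract_with_review"    # AI extraction + human validation flag
    return "human_handler"              # Tier 3: routed directly to a handler
```

In practice the routing decision also benefits Tier 3: even documents sent straight to a handler can carry the partial extraction along, which is what produced the Tier 3 speed-up noted in the results.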
Week 3–6: Pipeline development
We built a multi-stage extraction pipeline:
- Ingestion layer: Email attachment monitoring and SharePoint folder polling that captured incoming documents automatically — no manual upload required
- Document classification: A fine-tuned classification model (built on a base vision transformer) that identified document type with 97.3% accuracy across the 14 categories
- Field extraction: GPT-4 Vision with structured output schemas for each document type — extracting policyholder name, policy number, coverage limits, excess, premium, inception and expiry dates, endorsements, and special conditions
- Validation layer: Business rule validation against known insurer formats, cross-field consistency checks, and confidence scoring. Documents below a configurable threshold were automatically flagged for human review rather than auto-submitted
- Acturis integration: A REST API integration that pushed extracted data into the correct Acturis record via their documented import API — with idempotency checks to prevent duplicate entries
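The validation layer's cross-field checks and confidence gating can be sketched as follows — the field names, rules, and the 0.9 threshold are illustrative, not the per-document-type schemas used in production:

```python
from datetime import date

# Illustrative validation pass: business rules, cross-field consistency
# checks, and a configurable confidence threshold. Field names and the
# threshold value are assumptions for illustration.

CONFIDENCE_THRESHOLD = 0.9

def validate_extraction(fields: dict, confidence: float) -> tuple[bool, list[str]]:
    """Return (auto_submit, issues). Any failed rule, or low extraction
    confidence, flags the document for human review instead of auto-submission."""
    issues = []
    if fields["inception_date"] >= fields["expiry_date"]:
        issues.append("inception date must precede expiry date")
    if fields["premium"] <= 0:
        issues.append("premium must be positive")
    if fields["excess"] < 0:
        issues.append("excess cannot be negative")
    if not fields["policy_number"].strip():
        issues.append("policy number is missing")
    if confidence < CONFIDENCE_THRESHOLD:
        issues.append(f"extraction confidence {confidence:.2f} below threshold")
    return (not issues, issues)
```

Keeping the rules as plain data-driven checks is what makes the threshold "configurable": tightening or loosening it shifts volume between auto-submission and the review queue without touching the extraction step.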
Week 7–9: Human-in-the-loop interface
For Tier 2 documents and low-confidence extractions, we built a lightweight review UI that showed the extracted fields alongside the source document — allowing staff to verify and correct values before submission with a single click. The correction data was logged for model fine-tuning.
Week 10–11: Testing, training & deployment
End-to-end testing with 500 live documents, staff training on the review interface, and a phased rollout starting with Tier 1 documents only.
The Results
The system went live processing Tier 1 documents in week 11, with Tier 2 enabled two weeks post-launch after staff were comfortable with the review workflow.
Operational outcomes at 60 days:
- Average document processing time reduced from 8.4 minutes to 1.9 minutes (78% reduction)
- Data entry error rate fell from 3.2% to 0.18% — a 94% reduction
- Daily throughput capacity increased from ~400 documents to 3,000+ without additional staff
- Renewal turnaround time reduced by an average of 1.8 days during the January peak
Business outcomes:
- The three staff previously dedicated to document handling were redeployed to client relationship management and complex case handling — higher-value work
- Projected annual staff cost saving of £210,000 based on prevented future hiring
- Project cost recovered in approximately 6 weeks at current processing volumes
The Tier 3 finding: The 8% of documents routed to human handlers were processed 40% faster because staff now had the AI's partial extraction as a starting point — reducing the time even on documents the system could not fully automate.
Technical Stack
- Document ingestion: Python, Microsoft Graph API (SharePoint/Outlook), AWS S3
- Classification model: Fine-tuned ViT on custom labelled dataset (1,200 examples)
- Extraction: GPT-4 Vision with Pydantic-validated structured outputs, prompt versioning via LangSmith
- Validation engine: Python rule engine with configurable per-document-type schemas
- Review UI: Next.js with react-pdf for document rendering
- Acturis integration: REST API with exponential backoff and dead-letter queue
- Monitoring: Datadog for pipeline health; custom extraction accuracy dashboard
- Infrastructure: AWS Lambda + SQS for async processing; RDS PostgreSQL for audit log
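The integration's retry pattern — exponential backoff with a dead-letter hand-off after the final attempt — can be sketched generically. The attempt count, base delay, and the `push`/`dead_letter` callables are assumptions for illustration; the production version targets the Acturis import API and an SQS dead-letter queue:

```python
import time

# Illustrative retry wrapper: exponential backoff, then a dead-letter
# hand-off once the final attempt fails. Attempt count, base delay,
# and the callables are assumptions for illustration.

def push_with_retry(push, payload, dead_letter, attempts=4, base_delay=0.5):
    """Call push(payload); back off exponentially on failure, and hand
    the payload to dead_letter if every attempt fails."""
    for attempt in range(attempts):
        try:
            return push(payload)
        except Exception:
            if attempt == attempts - 1:
                dead_letter(payload)    # e.g. enqueue to an SQS dead-letter queue
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Combined with the idempotency checks mentioned above, this makes retries safe: a payload that succeeded on a previous attempt but timed out in transit will not create a duplicate Acturis record when replayed.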
