Question 1

How does a bespoke pipeline actually work?

Accepted Answer

You send representative samples and tell us what data you need extracted. We design and tune an extraction pipeline around your specific documents, then run it at volume while you track progress and usage in the client portal.

Question 2

Is my data secure? What happens to my documents?

Accepted Answer

Documents are encrypted in transit and at rest, access follows the least-privilege principle, and we never use your data to train machine-learning models. Files are deleted on completion or on request, and client work is covered by a data-processing agreement. Full details are on our Data Processing & Security page.

Question 3

What formats do you accept, and what do I get back?

Accepted Answer

Inputs: PDFs (scanned or born-digital), images (JPG, PNG, TIFF, HEIC), and Office files (Word, Excel, PowerPoint). Outputs: structured JSON to your schema, CSV or Excel, searchable PDF, plain text, direct-to-database, or API and webhook delivery.
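
To illustrate the JSON and CSV outputs mentioned above, here is a minimal sketch using only the Python standard library. The field names (`document_id`, `page`, `field`, `value`) are hypothetical placeholders, not our actual schema; in practice the schema is agreed with you per project.

```python
import csv
import io
import json

# Hypothetical schema for illustration only; a real schema is defined per client.
SCHEMA_FIELDS = ["document_id", "page", "field", "value"]


def to_json(records):
    """Serialise records as JSON, keeping only the schema fields.

    ensure_ascii=False keeps non-Latin text (e.g. Arabic) readable in the output.
    """
    rows = [{key: record[key] for key in SCHEMA_FIELDS} for record in records]
    return json.dumps(rows, ensure_ascii=False, indent=2)


def to_csv(records):
    """Serialise the same records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SCHEMA_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"document_id": "doc-001", "page": 1, "field": "title", "value": "كتاب"}]
    print(to_json(sample))
    print(to_csv(sample))
```

The same records feed both functions, so one extraction run can be delivered in whichever format suits your systems.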

Question 4

What languages and document types can you handle?

Accepted Answer

We handle English and a wide range of other languages, including Arabic, Persian, and historical or mixed-script material, across books, journals, archives, forms, tables, and more. Multilingual and non-Latin scripts are a particular strength.

Question 5

How accurate is the extraction?

Accepted Answer

Automated extraction is highly accurate but not infallible. For critical fields we offer a validation and QA tier that checks output against rules and flags low-confidence results for human review before delivery.
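
The idea behind that validation tier can be sketched roughly as follows. This is an illustration under stated assumptions, not our production logic: the `Field` structure, the `REVIEW_THRESHOLD` value, and the single example rule are all hypothetical, and real rules and thresholds would be tuned to your documents.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    value: str
    confidence: float  # assumed: a 0.0-1.0 score reported by the extraction step


# Hypothetical cut-off; a real threshold would be tuned per document set.
REVIEW_THRESHOLD = 0.90


def passes_rules(field: Field) -> bool:
    """Example rule (an assumption): a 'total' must parse as a non-negative number."""
    if field.name == "total":
        try:
            return float(field.value.replace(",", "")) >= 0
        except ValueError:
            return False
    return True  # no rule defined for other fields in this sketch


def triage(fields):
    """Split fields into those accepted automatically and those sent to a human."""
    accepted, needs_review = [], []
    for field in fields:
        if passes_rules(field) and field.confidence >= REVIEW_THRESHOLD:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


if __name__ == "__main__":
    accepted, needs_review = triage([
        Field("total", "1,250.00", 0.98),  # passes the rule, high confidence
        Field("total", "12S0", 0.99),      # fails the rule despite high confidence
        Field("date", "2024-01-05", 0.60), # no rule, but low confidence
    ])
    print(len(accepted), "accepted;", len(needs_review), "flagged for review")
```

A field goes to review if it fails a rule or falls below the confidence threshold, so a confident but malformed value is still caught.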

Question 6

What volumes do you handle, and is there a minimum?

Accepted Answer

From a few thousand pages to tens of millions. Per-page pricing falls with volume, there is no hard minimum, and very large archives are quoted at bespoke rates.

Question 7

How does billing work?

Accepted Answer

Work is drawn against pre-purchased credits, charged per page as it runs. You top up and monitor your balance and usage in the client portal, so there are no surprise invoices.
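
As a rough model of how per-page charging against a credit balance behaves, consider the sketch below. The class and method names are hypothetical, the default of one credit per page is an assumption rather than a quoted rate, and refusing a charge that exceeds the balance is an assumed behaviour for illustration.

```python
class InsufficientCredits(Exception):
    """Raised when a job would cost more than the remaining balance."""


class CreditAccount:
    """Hypothetical model of a pre-purchased credit balance."""

    def __init__(self, balance: int = 0):
        self.balance = balance
        self.pages_processed = 0

    def top_up(self, credits: int) -> int:
        """Add pre-purchased credits and return the new balance."""
        if credits <= 0:
            raise ValueError("top-up must be a positive number of credits")
        self.balance += credits
        return self.balance

    def charge_pages(self, pages: int, credits_per_page: int = 1) -> int:
        """Deduct credits for processed pages and return the remaining balance.

        credits_per_page=1 is an assumption for this sketch, not a real rate.
        """
        cost = pages * credits_per_page
        if cost > self.balance:
            # Assumed behaviour: nothing is charged if the balance is too low.
            raise InsufficientCredits(f"need {cost} credits, have {self.balance}")
        self.balance -= cost
        self.pages_processed += pages
        return self.balance


if __name__ == "__main__":
    account = CreditAccount()
    account.top_up(1000)
    remaining = account.charge_pages(250)
    print("remaining credits:", remaining, "| pages processed:", account.pages_processed)
```

Because charges can never exceed the pre-purchased balance in this model, spend is capped by what you have already topped up.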

Custom-built pipelines for bulk digitisation and data extraction.

Three jobs, one pipeline.

Digitisation

Document processing

Data extraction

Anything in. Structured data out.

A pipeline built around your documents.

Understand your documents

Design a bespoke pipeline

Extract & validate

Deliver structured data

Precise by design.

Accuracy you can trust

Built around your documents

Scales to bulk volumes

Confidential & secure

Transparent volume pricing.

Questions, answered.

Tell us about your documents.