Is this safe to use with real invoice data?

Claude.ai has a privacy policy that covers how data is handled. For sensitive financial documents, review Anthropic's data usage terms before uploading production invoices. Enterprise and API customers have stronger data privacy controls. If your organisation has strict data residency requirements, use the API with appropriate configuration rather than the Claude.ai web interface.

How accurate is line item extraction with Claude?

On clean, machine-generated PDF scans, 90%+ field-level accuracy is realistic. On low-resolution scans, photographed invoices, or handwritten documents, accuracy drops and the validation prompts become more important, not less.

Can I use this for invoices in other languages?

Yes, with caveats. Claude handles major European languages well. For less common languages or mixed-language invoices, test your specific documents before building a workflow around them.

What is the difference between using Claude.ai and the Claude API?

Claude.ai is the web interface, good for manual, one-at-a-time processing. The API is for automated, programmatic workflows. The underlying model is the same.

How does Claude compare to dedicated OCR tools?

Traditional OCR extracts text. Claude extracts meaning. OCR will give you the raw characters on the page. Claude will give you structured, labelled fields. For line item extraction specifically, where structure and context matter, Claude typically outperforms raw OCR without additional parsing logic.

What if the extraction is wrong?

The validation prompts in Step 4 are designed to catch the most common errors before data moves downstream. Any invoice that comes back with a REVIEW BEFORE PROCESSING or MANUAL REVIEW REQUIRED recommendation should go to a human before processing. Do not skip the validation step.

How to Use Claude to Extract Line Items from Scanned Invoices

Why this guide exists

If your AP team is still manually keying line items from scanned invoices into your ERP, this post is for you. Not because the process is painful, though it is, but because it is unnecessary. Claude, Anthropic's AI, can read a scanned invoice, identify every line item, and return structured, usable data in seconds. No special software. No SQL. No BI tool.

This is the first post in our Claude for AP Teams series: practical guides for finance and AP professionals who want to use Claude as a working tool, not just a chatbot. We will cover what Claude can reliably extract, the exact prompts to use, how to validate the output, where Claude has limits, and, for the technically inclined, how to take this further with the API.

Why line item extraction is still a manual problem

Most AP automation tools handle header-level data reasonably well. Vendor name, invoice number, total amount, and due date are relatively standard across invoice formats.

Line items are a different problem entirely. A single invoice can have 3 line items or 300. Descriptions vary wildly. Unit prices, quantities, tax treatments, discount structures, GL codes: none of it is standardised. And when the invoice arrives as a scanned PDF or a photographed document, the formatting is whatever the vendor decided it should be.

The result: most teams either key line items manually, or skip capturing them altogether and just process the total. Both options cost money. Manual keying costs time and introduces errors. Skipping line items means you lose the granular spend data that would actually tell you something useful: vendor pricing drift, quantity discrepancies, contract compliance.

Claude changes this. Not completely, and not without caveats. But meaningfully.

What Claude can extract from a scanned invoice

Before you build any process around Claude, it helps to know exactly what it is reliable on and where it struggles.

Reliably extracted (90%+ accuracy on clean scans)

Vendor name and address
Invoice number and date
Due date and payment terms
Line item descriptions
Quantities and unit prices
Line item totals
Subtotal, tax amount, and invoice total
PO number (when present on the invoice)
Currency

Less reliable (requires validation)

Line items on low-resolution or heavily compressed scans
Handwritten annotations or corrections
Tables with merged cells or unusual column structures
Multi-page invoices where line items span pages
Non-English invoices (Claude handles many languages, but accuracy varies)

Cannot extract

Data not visible on the invoice (GL codes, cost centres, and internal approval references need to be mapped separately)
Handwritten invoices with poor legibility
Heavily damaged or partially obscured documents

The honest assessment

On clean, machine-generated scanned PDFs, which is the majority of what most AP teams process, Claude performs very well. On edge cases, you need a validation step and a clear exception path.
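One way to test the 90%+ figure against your own documents is to hand-key a small sample and score the extraction field by field. The sketch below assumes each invoice is reduced to a flat dictionary of labelled fields; the function name `field_accuracy` and the sample values are illustrative, not part of any Claude tooling.

```python
def field_accuracy(extracted: list[dict], truth: list[dict]) -> float:
    """Share of hand-labelled fields where the extracted value matches exactly."""
    matched = 0
    total = 0
    for got, expected in zip(extracted, truth):
        for field, expected_value in expected.items():
            total += 1
            if got.get(field) == expected_value:
                matched += 1
    return matched / total if total else 0.0


# Hypothetical sample: two invoices, three labelled fields each.
truth = [
    {"invoice_number": "INV-1001", "invoice_total": 1250.00, "currency": "USD"},
    {"invoice_number": "INV-1002", "invoice_total": 89.90, "currency": "EUR"},
]
extracted = [
    {"invoice_number": "INV-1001", "invoice_total": 1250.00, "currency": "USD"},
    {"invoice_number": "INV-1002", "invoice_total": 89.00, "currency": "EUR"},  # one wrong field
]

accuracy = field_accuracy(extracted, truth)
print(f"Field-level accuracy: {accuracy:.1%}")
```

Scoring clean scans and poor scans as separate samples makes it easier to see which document types need the exception path.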

Step 1: Upload the invoice to Claude

Claude.ai (the web interface) accepts image files and PDFs directly. You can upload a scanned invoice as a JPG, PNG, or PDF and Claude will read it visually. If you are on Claude.ai Pro or Team, you get access to Claude's full vision capabilities: it can read the document as an image, identify table structures, and extract fields even when the layout is non-standard.

If you are using ChatGPT today and considering a switch: Claude's document handling is comparable for clean PDFs. Where Claude tends to perform better is in following structured output instructions precisely and maintaining consistency across a batch (more on this when we get to the prompts).

To upload: open Claude.ai, start a new conversation, and use the paperclip icon to attach your scanned invoice. Then use the prompts below.

Step 2: The line item extraction prompt

This is the most important part. The quality of Claude's output depends almost entirely on how clearly you tell it what you want back. Here is the base prompt. Copy it, then paste it into the conversation after uploading your invoice.

PROMPT 1: Structured Line Item Extraction

You are an accounts payable data extraction assistant.

Extract all line items from this invoice and return a single structured JSON object. Put the line items in an array named line_items.

For each line item, return the following fields:
- line_number (integer, sequential)
- description (string, exact text from invoice)
- quantity (number, null if not present)
- unit (string, e.g. "each", "hrs", "kg", null if not present)
- unit_price (number, null if not present)
- line_total (number)
- tax_code (string, null if not present on the invoice)
- notes (string, any additional information on that line, null if none)

Also return the following invoice-level fields:
- vendor_name
- invoice_number
- invoice_date (YYYY-MM-DD format)
- due_date (YYYY-MM-DD format, null if not stated)
- payment_terms (string, null if not stated)
- subtotal (number)
- tax_amount (number, null if not separately stated)
- invoice_total (number)
- currency (3-letter ISO code)
- po_number (string, null if not present)

Return only valid JSON. No preamble, no explanation, no markdown formatting. Start your response with { and end with }.

If a field cannot be determined from the invoice, return null for that field. Do not guess.

Why this prompt works

This prompt does four things that matter:

It specifies every field explicitly. Claude is much less likely to invent fields you did not ask for, or to skip fields you did.
It enforces null rather than guessing. This is critical for validation downstream. A null value is catchable. A confident wrong value is not.
It asks for JSON only. No preamble means the output can go directly into a parser without cleanup.
It separates line-level from invoice-level data. This makes it immediately usable in most AP systems without restructuring.
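Because the prompt names every field, the reply can be checked mechanically before anyone reads it. The following is a minimal sketch, assuming the reply is pasted in as a string and that line items sit under a `line_items` key (the name Prompt 2 uses); `schema_problems` is a hypothetical helper, not part of any library.

```python
import json

INVOICE_FIELDS = [
    "vendor_name", "invoice_number", "invoice_date", "due_date",
    "payment_terms", "subtotal", "tax_amount", "invoice_total",
    "currency", "po_number",
]
LINE_FIELDS = [
    "line_number", "description", "quantity", "unit",
    "unit_price", "line_total", "tax_code", "notes",
]


def schema_problems(raw_text: str) -> list[str]:
    """Return structural problems; an empty list means the shape matches the prompt."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    # A key that is present with a null value is fine here; a missing key is not.
    problems = [f"missing invoice field: {name}" for name in INVOICE_FIELDS if name not in data]

    # Assumption: line items arrive under "line_items".
    lines = data.get("line_items")
    if not isinstance(lines, list):
        problems.append("line_items is missing or not an array")
        return problems

    for index, line in enumerate(lines, start=1):
        if not isinstance(line, dict):
            problems.append(f"line {index}: not a JSON object")
            continue
        for name in LINE_FIELDS:
            if name not in line:
                problems.append(f"line {index}: missing field {name}")
    return problems
```

A reply with preamble text fails at the first step, which is exactly the case the JSON-only instruction is meant to prevent.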

Step 3: Handling multi-page and complex invoices

If your invoice runs across multiple pages, or has a complex structure (multiple tax rates, grouped line items, project-based billing), the base prompt above may need a small addition. Add this to the end of Prompt 1 before submitting:

PROMPT 2: Multi-Page Invoice Extension

This invoice may span multiple pages. Extract ALL line items across all pages, do not stop at the first page. If line items continue on subsequent pages, include them in the same line_items array.

Verify that the sum of all line_total values matches the subtotal field. If there is a discrepancy, add a field called "extraction_warning" with a description of the discrepancy.

Why this matters

The extraction_warning field is useful. It flags arithmetic inconsistencies in the output itself, so you catch them before the data hits your ERP rather than during a three-way match failure three weeks later.

Step 4: Validation prompts

Raw extraction is only half the work. Before you trust the output, run it through a quick validation sequence. These prompts work within the same Claude conversation, so you do not need to start a new session.

PROMPT 3: Arithmetic Validation

Review the JSON you just returned. Check the following:

1. Do all line_total values equal quantity × unit_price for each line item? Flag any lines where this does not hold, or where quantity or unit_price is null so the check cannot be performed.

2. Does the sum of all line_total values equal the subtotal?

3. Does subtotal + tax_amount equal invoice_total?

Return a validation report with:
- arithmetic_check: "PASS" or "FAIL"
- lines_with_discrepancies: array of line_numbers with issues (empty array if none)
- subtotal_check: "PASS", "FAIL", or "CANNOT_VERIFY"
- total_check: "PASS", "FAIL", or "CANNOT_VERIFY"
- notes: any additional observations about the invoice structure
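The same three checks can also be run outside the conversation, as an independent cross-check on the numbers Claude reports. This is a sketch under stated assumptions: the invoice is the parsed JSON from Prompt 1 with a `line_items` array, amounts are compared with a one-cent tolerance (an assumption; adjust for your currency), and `arithmetic_report` is an illustrative name.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.01")  # assumed rounding tolerance


def _dec(value):
    """Convert a JSON number to Decimal, passing None through."""
    return None if value is None else Decimal(str(value))


def _close(a: Decimal, b: Decimal) -> bool:
    return abs(a - b) <= TOLERANCE


def arithmetic_report(invoice: dict) -> dict:
    flagged = []
    for line in invoice["line_items"]:
        qty, price, total = (_dec(line.get(k)) for k in ("quantity", "unit_price", "line_total"))
        # A null quantity or unit price means the check cannot run, so the line is flagged.
        if qty is None or price is None or total is None or not _close(qty * price, total):
            flagged.append(line["line_number"])

    subtotal = _dec(invoice.get("subtotal"))
    tax = _dec(invoice.get("tax_amount"))
    grand_total = _dec(invoice.get("invoice_total"))
    line_totals = [_dec(line.get("line_total")) for line in invoice["line_items"]]

    if subtotal is None or None in line_totals:
        subtotal_check = "CANNOT_VERIFY"
    else:
        subtotal_check = "PASS" if _close(sum(line_totals, Decimal("0")), subtotal) else "FAIL"

    if subtotal is None or tax is None or grand_total is None:
        total_check = "CANNOT_VERIFY"
    else:
        total_check = "PASS" if _close(subtotal + tax, grand_total) else "FAIL"

    return {
        "arithmetic_check": "FAIL" if flagged else "PASS",
        "lines_with_discrepancies": flagged,
        "subtotal_check": subtotal_check,
        "total_check": total_check,
    }
```

Decimal is used rather than float so that amounts such as 0.1 + 0.2 do not produce spurious discrepancies.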

Triage the output

After the arithmetic check, run a completeness pass.

PROMPT 4: Completeness Check

Review the extracted data and flag any fields that are null which would typically be present on a standard commercial invoice. Return:

- missing_fields: array of field names that are null but expected
- confidence_assessment: your overall confidence in the extraction on a scale of LOW / MEDIUM / HIGH, with one sentence explaining your rating
- recommended_action: "APPROVE FOR PROCESSING", "REVIEW BEFORE PROCESSING", or "MANUAL REVIEW REQUIRED"

What this gives you

This gives you a triage output. High confidence and no missing fields: route to processing. Low confidence or missing critical fields: flag for human review. You are not removing humans from the loop. You are making sure humans only touch the invoices that actually need them.
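If the Prompt 4 report is returned as JSON, the routing rule can be written down as a few lines of code. This is a sketch only: the queue names are hypothetical, the critical fields are borrowed from condition 2 of Prompt 5, and it assumes the confidence_assessment string begins with LOW, MEDIUM, or HIGH.

```python
CRITICAL_FIELDS = {"invoice_number", "vendor_name", "invoice_total"}


def route(report: dict) -> str:
    """Map a completeness report to a queue name (queue names are hypothetical)."""
    action = report.get("recommended_action", "")
    level = str(report.get("confidence_assessment", "")).strip().upper()
    missing = set(report.get("missing_fields") or [])

    if action == "MANUAL REVIEW REQUIRED" or level.startswith("LOW") or missing & CRITICAL_FIELDS:
        return "manual_review"
    if action == "APPROVE FOR PROCESSING" and level.startswith("HIGH") and not missing:
        return "processing"
    # Anything ambiguous or unrecognised defaults to a human look, never to processing.
    return "review"
```

The default branch is the design choice that matters: an unexpected or malformed report falls through to review rather than to approval.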

Step 5: Exception routing

Once you have validated output, you need a decision on what to do with it. This prompt is designed for the cases where something does not look right.

PROMPT 5: Exception Triage

Based on the extracted invoice data and validation results, identify any of the following exception conditions:

1. Arithmetic discrepancy: line totals don't add up
2. Missing critical fields: invoice number, vendor name, or invoice total is null
3. Unusually high unit price: flag any line where unit_price exceeds [INSERT YOUR THRESHOLD, e.g. $5,000]
4. Duplicate risk: invoice number matches a pattern that may indicate resubmission (e.g. same number with suffix like -R or -2)
5. Missing PO reference: po_number is null on an invoice where a PO would be expected

For each exception found, return:
- exception_type
- affected_field or line_number
- recommended_action (one sentence)
- urgency: LOW / MEDIUM / HIGH

Tune the threshold

Replace the threshold in condition 3 with whatever makes sense for your business. A $5,000 unit price is unremarkable for a professional services firm and a red flag for an office supplies vendor.
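Conditions 2 to 5 are simple enough to check deterministically alongside the prompt. The sketch below is one possible approach, not the method the prompt itself uses: the suffix pattern, the urgency levels, and the `seen_invoice_numbers` set (invoice numbers you have already processed) are all assumptions introduced for illustration.

```python
import re

# Assumed resubmission suffixes: "-R" or a one- or two-digit counter such as "-2".
RESUBMISSION_SUFFIX = re.compile(r"-(R|\d{1,2})$", re.IGNORECASE)


def find_exceptions(invoice: dict, unit_price_threshold: float,
                    seen_invoice_numbers: set, po_expected: bool = True) -> list[dict]:
    exceptions = []

    for field in ("invoice_number", "vendor_name", "invoice_total"):
        if invoice.get(field) is None:
            exceptions.append({"exception_type": "missing_critical_field",
                               "affected": field, "urgency": "HIGH"})

    for line in invoice.get("line_items", []):
        price = line.get("unit_price")
        if price is not None and price > unit_price_threshold:
            exceptions.append({"exception_type": "high_unit_price",
                               "affected": line["line_number"], "urgency": "MEDIUM"})

    # Only flag a suffix when the number without it has been seen before,
    # so ordinary hyphenated numbers are not treated as resubmissions.
    number = invoice.get("invoice_number") or ""
    base = RESUBMISSION_SUFFIX.sub("", number)
    if base != number and base in seen_invoice_numbers:
        exceptions.append({"exception_type": "duplicate_risk",
                           "affected": "invoice_number", "urgency": "HIGH"})

    if po_expected and invoice.get("po_number") is None:
        exceptions.append({"exception_type": "missing_po",
                           "affected": "po_number", "urgency": "LOW"})

    return exceptions
```

Passing the threshold as an argument keeps the vendor-specific tuning described above out of the code itself.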

What if you are already using ChatGPT for this?

If you have already built a version of this workflow using GPT-4o, here is the honest comparison.

Where Claude tends to perform better

Following structured output instructions precisely, especially JSON formatting
Maintaining field naming consistency across a batch of different invoice formats
The null-not-guess behaviour: Claude is less likely to fill in a field it cannot see with a plausible-sounding value

Where ChatGPT tends to perform better

Familiarity: most AP teams who have experimented with AI started with ChatGPT
The API ecosystem: GPT-4o has a slightly more mature set of third-party integrations
Vision on very low-quality images: the difference is marginal, but GPT-4o can hold up slightly better on poor scans

The practical answer

Both tools can do this job. Claude's output consistency makes it slightly easier to build a repeatable process around. If you are starting from scratch, start with Claude. If you have an existing ChatGPT workflow that is working, the prompts in this post will work with minimal adaptation.

Advanced: taking this further with the Claude API

This section is for finance ops leads or developers who want to build this into a repeatable, automated workflow. If you are running this manually in Claude.ai, you can skip it.

The Claude API lets you call the same model programmatically. That means you can build a script that processes a folder of invoices, runs the extraction prompt on each one, validates the output, and writes the results to a CSV or posts them directly to your ERP.

Here is the core API call in Python. You will need an Anthropic API key and the anthropic library installed (pip install anthropic).

extract_invoice_line_items.py

import anthropic
import base64
import json

def extract_invoice_line_items(invoice_path: str) -> dict:
    # The client reads the ANTHROPIC_API_KEY environment variable by default,
    # which keeps the key out of source control.
    client = anthropic.Anthropic()

    # Read and encode the invoice file
    with open(invoice_path, "rb") as f:
        invoice_data = base64.standard_b64encode(f.read()).decode("utf-8")

    # Determine media type (case-insensitive, so "SCAN.PDF" is handled too)
    lowered_path = invoice_path.lower()
    if lowered_path.endswith(".pdf"):
        media_type = "application/pdf"
    elif lowered_path.endswith(".png"):
        media_type = "image/png"
    else:
        media_type = "image/jpeg"

    extraction_prompt = """
    You are an accounts payable data extraction assistant.
    Extract all line items from this invoice and return them as structured JSON.

    Return:
    {
      "vendor_name": string,
      "invoice_number": string,
      "invoice_date": "YYYY-MM-DD",
      "due_date": "YYYY-MM-DD or null",
      "payment_terms": "string or null",
      "po_number": "string or null",
      "currency": "3-letter ISO code",
      "subtotal": number,
      "tax_amount": "number or null",
      "invoice_total": number,
      "line_items": [
        {
          "line_number": integer,
          "description": string,
          "quantity": "number or null",
          "unit": "string or null",
          "unit_price": "number or null",
          "line_total": number,
          "tax_code": "string or null",
          "notes": "string or null"
        }
      ]
    }

    Return only valid JSON. No preamble. No markdown. Start with { and end with }.
    If a field cannot be determined, return null. Do not guess.
    """

    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=8000,  # long invoices need room; a low cap can cut the JSON off mid-array
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "document" if media_type == "application/pdf" else "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": invoice_data,
                        },
                    },
                    {
                        "type": "text",
                        "text": extraction_prompt
                    }
                ],
            }
        ],
    )

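    # A reply cut off at the token limit cannot be valid JSON, so fail loudly.
    if message.stop_reason == "max_tokens":
        raise ValueError("Response truncated at max_tokens; route to manual review")

    # Parse the JSON response. json.loads raises json.JSONDecodeError
    # (a ValueError subclass) if Claude returned anything other than clean JSON.
    raw_text = message.content[0].text.strip()
    return json.loads(raw_text)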

# Example usage (runs only when the file is executed directly, not on import)
if __name__ == "__main__":
    result = extract_invoice_line_items("invoice_scan.pdf")
    print(json.dumps(result, indent=2))

Notes on this code

The model string: use claude-opus-4-5 for best accuracy on complex invoices. For high-volume processing where cost matters, claude-sonnet-4-5 is faster and cheaper with only a marginal accuracy trade-off on clean scans.
Error handling: wrap the json.loads() call in a try/except. If Claude returns anything other than clean JSON (which should be rare with this prompt, but does happen on very unusual invoice formats), you want to catch it and route that invoice to manual review rather than crash the script.
Batch processing: to process a folder of invoices, wrap the function in a loop over os.listdir() and write results to a CSV using the csv module. Add a short time.sleep(1) between calls to stay within API rate limits.
Cost: at current Anthropic pricing, extraction runs at roughly $0.01 to $0.03 per invoice depending on document length and which model you use. For most mid-market AP teams processing 500 to 2,000 invoices a month, the API cost is negligible against the labour cost it replaces.
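The error-handling and batch notes can be combined into one loop. The following is a rough sketch, not a production script: `process_folder` and its CSV columns are illustrative choices, the extraction function is passed in as the `extract` argument (for example, extract_invoice_line_items from the listing above), and network or API errors would need their own handling.

```python
import csv
import time
from pathlib import Path
from typing import Callable

SUPPORTED = {".pdf", ".png", ".jpg", ".jpeg"}
COLUMNS = ["file", "invoice_number", "line_number", "description",
           "quantity", "unit_price", "line_total"]


def process_folder(folder: str, out_csv: str, extract: Callable[[str], dict],
                   pause_seconds: float = 1.0) -> list[str]:
    """Run `extract` on each invoice in `folder`, write line items to `out_csv`,
    and return the file names that need manual review."""
    manual_review = []
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        for path in sorted(Path(folder).iterdir()):
            if path.suffix.lower() not in SUPPORTED:
                continue
            try:
                invoice = extract(str(path))
                # Build all rows first so a bad invoice never leaves partial rows behind.
                rows = [
                    [path.name, invoice["invoice_number"], line.get("line_number"),
                     line.get("description"), line.get("quantity"),
                     line.get("unit_price"), line.get("line_total")]
                    for line in invoice["line_items"]
                ]
            except (ValueError, KeyError, TypeError, AttributeError):
                # Unparseable or unexpectedly shaped output goes to a person, not the ERP.
                # json.JSONDecodeError is a ValueError subclass, so it is covered here.
                manual_review.append(path.name)
            else:
                writer.writerows(rows)
            time.sleep(pause_seconds)  # crude pacing between API calls
    return manual_review
```

Passing the extractor in as an argument also makes the loop easy to dry-run with a stub before spending API credits.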

When Claude is not enough

Claude is a general-purpose AI. It is excellent at reading documents and returning structured data. It is not a purpose-built AP system. Here is what it cannot do on its own:

It has no memory across invoices. Claude does not know what you paid this vendor last month. It cannot flag that this invoice is 15% higher than the last three, unless you paste that context in manually.
It cannot validate against your contracts. If your contract with Vendor X caps hourly rates at $150 and this invoice shows $175, Claude will not catch it, unless you paste the contract terms into the prompt.
It cannot post to your ERP. The output is JSON. Getting that JSON into NetSuite, Sage Intacct, or Dynamics 365 requires an additional integration step.
It cannot resolve exceptions autonomously. When something does not match, Claude flags it. A human, or a more specialised system, still needs to decide what to do.
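The first two gaps can be narrowed by keeping the history and contract terms on your side and comparing them with the extracted values. This is a minimal sketch under assumptions: past unit prices and the contract cap come from your own records, the 15% drift limit mirrors the example above, and `price_flags` is a hypothetical helper.

```python
from statistics import mean
from typing import Optional


def price_flags(line: dict, past_unit_prices: list[float],
                contract_cap: Optional[float] = None,
                drift_limit: float = 0.15) -> list[str]:
    """Compare one extracted line against history and an optional contract cap."""
    price = line.get("unit_price")
    if price is None:
        return ["unit_price is null; cannot compare"]

    flags = []
    if past_unit_prices:
        baseline = mean(past_unit_prices)
        if baseline > 0:
            drift = (price - baseline) / baseline
            if drift > drift_limit:
                flags.append(f"unit price is {drift:.0%} above the recent average")

    if contract_cap is not None and price > contract_cap:
        flags.append(f"unit price {price} exceeds contract cap {contract_cap}")
    return flags


# Hypothetical example: a $175 hourly rate against a $150 cap and three past invoices.
flags = price_flags({"unit_price": 175.0}, [150.0, 150.0, 150.0], contract_cap=150.0)
print(flags)
```

The same values could instead be pasted into the prompt as context, but a local comparison does not depend on the model reading them correctly.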

Scope boundaries, not criticisms

These are not criticisms. They are scope boundaries. Claude is a powerful first step in a workflow. For teams processing a manageable volume of invoices with some tolerance for manual follow-up on exceptions, the prompts in this post may be all you need. For teams processing hundreds or thousands of invoices a month, where contract validation, ERP posting, and autonomous exception resolution matter, that is where a purpose-built agentic Intake-to-Pay platform picks up where Claude stops.

Next in this series

Next in the series: How to Use Claude to Categorise and Clean Up Expense Data.