Setup guide
Estimated setup time: 15–20 minutes, or 10 minutes if you keep all defaults unchanged.
Step 1 — Activate the template
- Open the Phacet template gallery
- Select "Invoice Data Extraction"
- Click Create a new project from the template
Your project is now pre-configured with all six processing stages, column definitions, and agent instructions. No need to build anything from scratch.
Step 2 — Review extraction formatting rules
The template ships with formatting rules tuned for French and European invoices. Review these against your actual invoice mix before running your first batch.
- Open the core field extraction configuration (Stage 1)
- Review the default formatting rules:
- Dates: DD/MM/YYYY for invoice date, DD-MM-YY for delivery date, DD-MM for due date
- Amounts: comma as decimal separator, two decimal places, space before € symbol (e.g., 1 234,50 €)
- Addresses: [Number] [Street type] [Street name], [Postal code] [CITY]
- If your invoices use different conventions (e.g., US date format MM/DD/YYYY, period as decimal separator, $ instead of €), adjust the formatting rules in the agent instructions for the relevant fields
- If you process invoices in multiple currencies, add currency detection logic to the amount extraction instructions
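To make the default conventions concrete, here is a minimal Python sketch that renders values in the formats listed above. It is only an illustration for checking your own expectations, not how the template applies its rules: the function names and dictionary keys are hypothetical, and the space used as thousands separator is inferred from the example 1 234,50 €.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def format_amount(value: Decimal, symbol: str = "€") -> str:
    """Render an amount like '1 234,50 €' (comma decimal, two places, space before symbol)."""
    quantized = value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    # Format with English separators first ('1,234.50'), then swap them.
    text = f"{quantized:,.2f}"
    text = text.replace(",", " ").replace(".", ",")
    return f"{text} {symbol}"


def format_dates(invoice: date, delivery: date, due: date) -> dict[str, str]:
    """Apply the three default date formats; the keys are hypothetical names."""
    return {
        "invoice_date": invoice.strftime("%d/%m/%Y"),   # DD/MM/YYYY
        "delivery_date": delivery.strftime("%d-%m-%y"),  # DD-MM-YY
        "due_date": due.strftime("%d-%m"),               # DD-MM
    }


print(format_amount(Decimal("1234.5")))
print(format_dates(date(2024, 3, 5), date(2024, 3, 7), date(2024, 4, 4)))
```

If your invoices follow US conventions, the equivalent change in this sketch would be swapping the `strftime` patterns and skipping the separator replacement; in the template itself, the change is made in the agent instructions.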
Step 3 — Review payment method options
The template recognizes eight payment method categories.
- Open the payment method configuration (Stage 1)
- Review the default options: Bank transfer, Check, Credit card, Cash, Direct debit, Payment at 30 days, Payment at 60 days, Not specified
- If your suppliers use payment terms not covered (e.g., Payment at 45 days, Letter of credit), add them to the single-select list and provide detection criteria in the agent instructions
Step 4 — Review industry classification categories
The template classifies suppliers into 19 industry sectors.
- Open the industry classification configuration (Stage 4)
- Review the default categories: Agriculture, Manufacturing, Construction, Retail, Transportation, Food Service, IT, Finance, Real Estate, Professional Services, Administration, Education, Healthcare, Leisure, Other Services, Energy, Crafts, Telecommunications, Other
- If your company uses a different industry taxonomy internally, rename or reorganize categories to match, so that the output is directly usable in your reporting
Step 5 (optional) — Add company-specific extraction rules
The template handles standard invoice layouts out of the box. If you have suppliers with unusual formats, add rules.
- Open the supplier name extraction configuration (Stage 1) and any other relevant field configurations
- For suppliers whose PDF layout is non-standard, add extraction hints. For example: if a supplier puts their company name in the footer rather than the header, note this in the agent instructions
- If certain suppliers use abbreviations or trade names that differ from their legal name, add mapping rules so the agent normalizes consistently
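The mapping rules themselves are written in the agent instructions, but it can help to keep the alias list in a reviewable table first. The sketch below shows what "normalizes consistently" means in practice, assuming a hypothetical alias table and hypothetical supplier names; it is not part of the template.

```python
import re

# Hypothetical alias table: trade names or abbreviations -> legal name.
SUPPLIER_ALIASES = {
    "acme": "ACME Distribution SAS",
    "acme distrib": "ACME Distribution SAS",
    "dupont & fils": "Dupont et Fils SARL",
}


def _key(name: str) -> str:
    """Lowercase, replace punctuation other than '&' with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s&]", " ", name.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def normalize_supplier(name: str) -> str:
    """Return the mapped legal name, or the trimmed input if no alias matches."""
    return SUPPLIER_ALIASES.get(_key(name), name.strip())


print(normalize_supplier("  ACME   Distrib. "))
print(normalize_supplier("Unknown Co"))
```

Unknown names are returned unchanged rather than guessed, which keeps unmapped suppliers visible during review.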
Step 6 — Run a test batch
- Upload 10–20 test invoices into the project (PDF upload or push via automation)
- Let the agent process the full batch
- Review results against this checklist:
- ✅ Supplier names extracted correctly (including legal form: SARL, SAS, Ltd, etc.)?
- ✅ Invoice numbers preserved in their exact original format?
- ✅ Dates formatted correctly and pointing to the right date on the invoice?
- ✅ Amounts match the PDF (total incl. tax and tax amount), and the totals add up?
- ✅ Addresses standardized consistently?
- ✅ Payment methods identified correctly?
- ✅ Line items complete: all rows from the invoice table captured?
- ✅ Overall AI scores above 90% for clean, standard invoices?
- ✅ AI scoring rationale shows "Ok" for invoices you've verified manually?
- Open 5–6 results in the Review interface and check the citations:
- Does the agent point to the correct PDF section for each extracted field?
- Does the JSON export contain all fields with correct values?
- Compare the Invoice JSON output against your accounting system's expected input format
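Part of this review can be scripted once you have a JSON export. The following sketch checks that required fields are filled and that the amounts add up. The field names are hypothetical (use the keys your accounting system expects), and it assumes the export carries a total excluding tax and stores amounts as plain numbers rather than formatted strings.

```python
import json
from decimal import Decimal

# Hypothetical field names; replace with the keys your accounting system expects.
REQUIRED_FIELDS = [
    "supplier_name", "invoice_number", "invoice_date",
    "total_excl_tax", "tax_amount", "total_incl_tax",
]


def check_invoice(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, "", "Not specified"):
            problems.append(f"missing field: {field}")
    if not problems:
        net = Decimal(str(record["total_excl_tax"]))
        tax = Decimal(str(record["tax_amount"]))
        gross = Decimal(str(record["total_incl_tax"]))
        # Allow one cent of rounding drift between the three figures.
        if abs(net + tax - gross) > Decimal("0.01"):
            problems.append(f"amount mismatch: {net} + {tax} != {gross}")
    return problems


sample = json.loads("""{
    "supplier_name": "ACME Distribution SAS",
    "invoice_number": "FA-2024-0042",
    "invoice_date": "05/03/2024",
    "total_excl_tax": 1000.00,
    "tax_amount": 200.00,
    "total_incl_tax": 1200.00
}""")
print(check_invoice(sample))
```

A script like this complements the manual citation check; it does not replace it, because it cannot tell whether a value was read from the right place on the PDF.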
If extraction accuracy is low:
- Review the formatting rules (Step 2) and any company-specific rules (Step 5)
- Check PDF quality, and ask suppliers for native PDFs where possible
If AI scores are consistently low: check whether the double extraction is catching real errors or formatting differences the scoring rules should tolerate.
Step 7 — Go live
- Establish your processing routine: batch upload after receiving invoices, continuous feed via automation (see the Integration Guide), or forwarding from another Phacet template (e.g., AI Inbox)
- Set your quality threshold: decide the minimum Overall AI Score at which invoices flow through without review (e.g., 95%)
- Assign team members to the Review queue for low-scoring and incomplete extractions
- Connect the JSON output to your downstream system: ERP import, accounting tool, or automation workflow
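If your downstream automation applies the quality threshold itself, the routing decision can be as simple as the sketch below. The class and field names are hypothetical, and 95 is only the example value from the list above.

```python
from dataclasses import dataclass

AUTO_APPROVE_THRESHOLD = 95.0  # percent; example value, tune to your own risk tolerance


@dataclass
class ExtractionResult:
    invoice_id: str
    overall_ai_score: float  # hypothetical field: Overall AI Score as a percentage
    is_complete: bool        # hypothetical field: False if any required value is missing


def route(result: ExtractionResult, threshold: float = AUTO_APPROVE_THRESHOLD) -> str:
    """Return 'auto' for invoices that can flow through, 'review' otherwise."""
    if result.is_complete and result.overall_ai_score >= threshold:
        return "auto"
    return "review"


print(route(ExtractionResult("INV-001", 97.0, True)))
print(route(ExtractionResult("INV-002", 97.0, False)))
```

Incomplete extractions go to review regardless of score, matching the Review queue assignment for low-scoring and incomplete extractions.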
Troubleshooting
| Problem | Likely cause | Fix |
|---|---|---|
| Supplier name missing legal form (SARL, SAS, etc.) | The legal form appears in a different section than the main name on the invoice | Add extraction hints to Stage 1 for the affected supplier's layout. |
| Amounts show wrong decimal separator | Invoice uses a format different from the default (comma separator, € symbol) | Adjust the amount formatting rules in Stage 1 to match your invoice conventions. |
| Dates extracted in wrong format | Invoice date, delivery date, and due date each use a different output format by design | Check which date field is wrong; each has its own format rule (DD/MM/YYYY, DD-MM-YY, DD-MM). Adjust the rule for that specific field. |
| Line items incomplete / rows missing | Invoice uses a multi-page table, merged cells, or unconventional layout | Check the PDF quality. For complex table layouts, add extraction hints to Stage 2. If the table spans pages, verify the agent is reading all pages. |
| "Not specified" on payment method | No IBAN, no payment terms, no card terminal reference found on the invoice | This is correct behavior: the agent doesn't guess. If you know the payment method from context, add it manually or add a supplier-specific rule. |
| JSON export missing fields | A required field returned "Not specified" or was empty | The JSON consolidates extracted fields. If a source field is empty, the JSON field will be too. Fix the extraction first; the JSON will follow. |
| Supplier enrichment returns wrong company | Multiple companies with similar names in Pappers.fr | The agent picks the closest match by name. If it's consistently wrong for a supplier, add the exact legal name or SIREN to the enrichment search criteria. |