File

The File column type lets you upload PDF documents directly into your Phacet. Uploaded files can then be referenced by other columns to extract, analyze, or transform their content using AI or Python tools.

File columns are typically input columns: they hold the source documents that feed your data processing workflows.

Example use cases

Process contracts: upload PDF contracts and extract key clauses, dates, or parties using AI in other columns.
Analyze invoices: upload invoices and extract line items, amounts, or vendor details into Text, Number, or JSON columns.
Classify documents: upload a batch of documents and use AI to categorize them with Select columns.
Extract tabular data: upload financial reports and pull structured data into Collection columns.

Supported file format

Currently, File columns accept PDF files (.pdf) only.

⚙️ Maximum file size: 50 MB per file.

Setting up a File column

Step 1: Create the column

Add a new column to your Phacet.
Choose File as the column type.
Give it a clear, descriptive name (e.g. Contract, Invoice, Report).

Step 2: Upload files

You can upload files in three ways:

Click the "Add a file" button in the cell and select a file from your computer.
Drag and drop a file directly into the cell.
Paste a file from your clipboard (Ctrl+V / Cmd+V).

Once uploaded, the file name and extension are displayed as a tag in the cell. Click the tag to preview the document.

Batch upload

You can upload multiple files at once to quickly populate your Phacet:

Select multiple files from your file picker, or drag and drop several files onto the column.
The first file is uploaded to the current row. Each additional file creates a new row automatically.
Upload progress is displayed for each file.

⚙️ Maximum batch upload: 50 files at once.
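
Before a large upload, it can help to check files locally against the limits above. The following is a minimal sketch, not a Phacet feature: the function names check_file and plan_batches are hypothetical, and it assumes "50 MB" means 50 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple

MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" read as 50 MiB
MAX_BATCH_SIZE = 50                # files per batch upload, per the limit above


def check_file(path: Path) -> Optional[str]:
    """Return a reason the file would likely be rejected, or None if it looks fine."""
    if path.suffix.lower() != ".pdf":
        return "not a PDF"
    if path.stat().st_size > MAX_FILE_BYTES:
        return "larger than 50 MB"
    return None


def plan_batches(paths: List[Path]) -> Tuple[List[List[Path]], Dict[Path, str]]:
    """Split acceptable files into batches of at most 50 and collect the rejects."""
    accepted: List[Path] = []
    rejected: Dict[Path, str] = {}
    for path in sorted(paths):
        reason = check_file(path)
        if reason is None:
            accepted.append(path)
        else:
            rejected[path] = reason
    # Slice the accepted files into consecutive groups of MAX_BATCH_SIZE.
    batches = [
        accepted[start:start + MAX_BATCH_SIZE]
        for start in range(0, len(accepted), MAX_BATCH_SIZE)
    ]
    return batches, rejected
```

Calling plan_batches(list(Path("invoices").iterdir())) on a local folder (the folder name is only an example) returns the batches to upload one after another, plus a mapping of skipped files to the reason they were skipped.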

Built-in PDF viewer

Clicking a file cell opens a document preview directly in Phacet. PDF files are rendered with a dedicated viewer supporting multi-page navigation and smooth scrolling. You don't need to download files to review their content.

Using files as input for other columns

File columns are designed to feed data into your processing pipeline. Reference a file column from other columns using the @ symbol to extract or analyze its content.

Extract text with AI

Create a Text column with the AI tool and reference your file column:

Read @Contract and extract the effective date and the names of all signing parties.

Extract structured data with JSON

Create a JSON column with the AI tool and reference your file column:

Extract the following information from @Invoice: invoice number, vendor name, total amount, and due date.
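
To make the expected result concrete, here is a hedged sketch of what such a JSON value might look like and how it could be checked once exported. The key names and sample values are illustrative assumptions; the actual keys depend on your prompt and column configuration.

```python
import json

# Hypothetical output for one invoice; keys and values are made up for illustration.
sample = """
{
  "invoice_number": "INV-0001",
  "vendor_name": "Example Vendor",
  "total_amount": 1250.0,
  "due_date": "2025-01-31"
}
"""

# The four fields requested in the prompt above, under assumed key names.
REQUIRED_KEYS = {"invoice_number", "vendor_name", "total_amount", "due_date"}


def missing_keys(raw: str) -> set:
    """Parse a JSON object and report which expected keys are absent."""
    data = json.loads(raw)
    return REQUIRED_KEYS - set(data)
```

An empty result from missing_keys(sample) means every requested field is present. Naming the fields explicitly in the prompt, as in the example above, makes this kind of check straightforward.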

Extract multiple items with Collection

Create a Collection column and reference your file column to extract repeating items (e.g. all line items from an invoice, all clauses from a contract). Each item becomes a separate row in a dedicated table.
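
As a mental model, the expansion of repeating items into rows can be sketched in plain Python. This is a conceptual illustration only, not Phacet's implementation: the function expand_items, the field names, and the sample line items are all hypothetical.

```python
def expand_items(source_row, items):
    """Turn a list of extracted items into one row per item."""
    return [
        # Keep a link to the originating row and the item's position in the document.
        {"source_row": source_row, "item_index": index, **item}
        for index, item in enumerate(items, start=1)
    ]


# Hypothetical line items extracted from a single invoice.
line_items = [
    {"description": "Consulting", "amount": 1000.0},
    {"description": "Travel", "amount": 250.0},
]

rows = expand_items(source_row=1, items=line_items)
```

Here rows holds two entries, one per line item, each still tied to the invoice row it came from.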

Compute from extracted data with Python

Once data is extracted into Text or Number columns, use Python columns to process it further:

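def main():
    # @Total Amount reads the value of the Total Amount column
    amount = @Total Amount
    # Return an empty result when the amount is missing
    if amount is None:
        return None
    # Increase the amount by 20% and round to 2 decimal places
    return round(amount * 1.20, 2)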
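
The snippet above uses Phacet's @ reference syntax, which is not valid in a standalone Python interpreter. To try the same logic locally before pasting it into a Python column, one option is to pass the value in as a parameter. This is a sketch under that assumption; the name apply_factor and the factor parameter are illustrative, not part of Phacet.

```python
from typing import Optional


def apply_factor(amount: Optional[float], factor: float = 1.20) -> Optional[float]:
    """Multiply amount by factor (1.20 by default), rounded to 2 decimals."""
    if amount is None:
        # Mirror the column logic: a missing input yields a missing result.
        return None
    return round(amount * factor, 2)


print(apply_factor(100))   # amount present
print(apply_factor(None))  # amount missing
```

Keeping the None check first means rows without an extracted amount produce an empty result instead of raising an error.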

Citations and traceability

When AI extracts information from an uploaded file, Phacet automatically attaches citations to the generated cells. Citations link each extracted value back to its exact location in the source document, including page number and position.

Click the citation icon on a cell to highlight the relevant passage in the original document. This makes it easy to verify AI-extracted data against the source.

Deleting a file

You can delete a file from a cell using the delete button in the file tag or in the document viewer. When a file is deleted:

The cell value is cleared.
All downstream columns that reference this file (via AI or Python tools) are automatically marked as stale and need to be refreshed.
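
The effect of a deletion can be pictured as a walk over column references. The sketch below is a conceptual model only, not Phacet's implementation: the column names and the references mapping are hypothetical, and treating columns that depend on a stale column as stale too is an assumption.

```python
from collections import deque

# Hypothetical Phacet: each column maps to the columns it references with @.
references = {
    "Contract": [],
    "Effective Date": ["Contract"],
    "Parties": ["Contract"],
    "Summary": ["Effective Date", "Parties"],
    "Notes": [],
}


def stale_after_delete(file_column, references):
    """Return every column that directly or indirectly references file_column."""
    stale = set()
    queue = deque([file_column])
    while queue:
        current = queue.popleft()
        for column, refs in references.items():
            # A column becomes stale if it references something already affected.
            if current in refs and column not in stale:
                stale.add(column)
                queue.append(column)
    return stale
```

With this mapping, deleting the file in Contract affects Effective Date, Parties, and Summary, while Notes is untouched because it references none of them.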

Best practices

Use descriptive file column names: name your column after the document type (e.g. Invoice, Contract, Resume) rather than generic names like File or Document.
One file type per column: keep each file column dedicated to a single type of document for consistent AI extraction results.
Batch upload for large datasets: drag and drop multiple files (up to 50 per batch) to quickly build your dataset, then configure AI columns to process them all at once.
Verify with citations: always check AI citations on critical extractions to confirm accuracy against the source document.
Combine with other column types: use File as the starting point, then chain Text, Number, JSON, Select, and Collection columns to build complete document processing pipelines.