Lasso accepts a variety of file formats as source data for extraction. Upload files first, then reference them by ID when creating a table.

Supported formats

Format        Extensions           Notes
PDF           .pdf                 Scanned and text-based PDFs are both supported.
Spreadsheets  .xlsx, .xls, .csv    Each sheet or file is parsed for product data.
Images        .jpg, .png, .webp    OCR extracts text from images.
Documents     .docx, .doc          Word documents are parsed for content.
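
A client-side check against the table above can catch unsupported files before any upload is attempted. The following is a minimal sketch, not part of the Lasso client: the `detectFormat` helper and the `SourceFormat` type are hypothetical names, and matching extensions case-insensitively is an assumption.

```typescript
// Hypothetical helper: maps a filename to one of the format groups listed above.
type SourceFormat = "PDF" | "Spreadsheets" | "Images" | "Documents";

const FORMAT_BY_EXTENSION: Record<string, SourceFormat> = {
  ".pdf": "PDF",
  ".xlsx": "Spreadsheets",
  ".xls": "Spreadsheets",
  ".csv": "Spreadsheets",
  ".jpg": "Images",
  ".png": "Images",
  ".webp": "Images",
  ".docx": "Documents",
  ".doc": "Documents",
};

function detectFormat(filename: string): SourceFormat | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) {
    return undefined; // no extension at all
  }
  // Assumption: extensions are compared case-insensitively.
  return FORMAT_BY_EXTENSION[filename.slice(dot).toLowerCase()];
}
```

A result of `undefined` means the extension is not in the table, so the file can be rejected locally instead of being sent to the server.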

Upload flow

1. Upload the file

Send the file as multipart/form-data to POST /v1/files. Maximum size is 1 GB.

2. Get the file ID

The response includes an id field. Store this for the next step.

3. Create a table

Pass the file ID in the file_ids array when creating a table.
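
For readers calling the HTTP endpoint directly, steps 1 and 2 might look roughly like the sketch below. Only `POST /v1/files`, the multipart encoding, the 1 GB limit, and the `id` response field come from this page. The base URL, the bearer-token header, the `file` form field name, and reading 1 GB as 10^9 bytes are all assumptions made for illustration.

```typescript
// Assumptions (not confirmed by this page): base URL, bearer-token auth,
// the multipart field name "file", and 1 GB interpreted as 10^9 bytes.
const BASE_URL = "https://api.example.com";
const MAX_UPLOAD_BYTES = 1000 * 1000 * 1000;

interface UploadRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: FormData };
}

// Step 1: build the multipart/form-data request for POST /v1/files.
function buildUploadRequest(file: Blob, filename: string, apiKey: string): UploadRequest {
  if (file.size > MAX_UPLOAD_BYTES) {
    throw new Error(`${filename} exceeds the 1 GB upload limit`);
  }
  const body = new FormData();
  body.append("file", file, filename);
  // No explicit Content-Type header: fetch adds the multipart boundary itself.
  return {
    url: `${BASE_URL}/v1/files`,
    init: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` }, body },
  };
}

// Step 2: send the request and read the id field from the JSON response.
async function uploadFile(file: Blob, filename: string, apiKey: string): Promise<string> {
  const { url, init } = buildUploadRequest(file, filename, apiKey);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  const { id } = (await response.json()) as { id: string };
  return id; // Step 3: pass this value in file_ids when creating a table.
}
```

Keeping the request builder separate from the network call makes the size check and the form layout easy to test without contacting the server.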

Example

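// Assumes `client` is an initialized Lasso client and `buffer` holds the file's bytes.
const file = new File([buffer], "catalog.pdf", { type: "application/pdf" });

// Upload the file; the returned object's id identifies it in later calls.
const uploaded = await client.files.upload(file, "catalog.pdf");

// Create a table that extracts data from the uploaded file.
const table = await client.tables.create({
  schema_id: "schema_abc",
  name: "My Catalog",
  file_ids: [uploaded.id],
});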

Alternative sources

Instead of uploading files, you can provide the source data directly:
  • URLs — Pass file_urls with publicly accessible URLs. Lasso downloads and processes them server-side.
  • Text — Pass source_text with raw text content for extraction.
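
The three source options differ only in which field carries the data, which the following sketch tries to make explicit. The field names `file_ids`, `file_urls`, `source_text`, `schema_id`, and `name` come from this page. The `TableSource` type and `buildTableParams` helper are hypothetical, and restricting each call to a single source kind is a simplification of the sketch, not a documented rule.

```typescript
// Hypothetical types for illustration; only the field names are from the docs.
type TableSource =
  | { kind: "files"; fileIds: string[] }
  | { kind: "urls"; urls: string[] }
  | { kind: "text"; text: string };

interface CreateTableParams {
  schema_id: string;
  name: string;
  file_ids?: string[];
  file_urls?: string[];
  source_text?: string;
}

// Builds the parameters for a table-creation call from one source kind.
function buildTableParams(schemaId: string, name: string, source: TableSource): CreateTableParams {
  const base = { schema_id: schemaId, name };
  switch (source.kind) {
    case "files":
      return { ...base, file_ids: source.fileIds }; // previously uploaded files
    case "urls":
      return { ...base, file_urls: source.urls }; // publicly accessible URLs
    case "text":
      return { ...base, source_text: source.text }; // raw text content
  }
}
```

The result could then be passed to a call such as `client.tables.create(params)`, as in the upload example.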

Managing files

Files can be listed and deleted through the Files API. Deleting a file does not affect tables that were already created from it.