Tool Reference · v1.0.0 · simple

extract

Extracts key facts from text, files, and URLs in configurable output formats.

$ npx arrey@latest add extract
Category: Text & Documents
Arrey Range: >=1.0.0
Examples: 3

Input Schema

| Field | Type |
| --- | --- |
| content | string |
| format? | string |
| maxItems? | number |
| instruction? | string |

Output Schema

| Field | Type |
| --- | --- |
| extraction | string |
| format | string |
| chunks | number |
| model | string |
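
The two schemas above can be written down as TypeScript types, which helps when calling the tool from typed code. This is a minimal sketch: the field names and types come from the schemas, while the comments on what each output field means and the helpers `isExtractInput` and `summarize` are illustrative assumptions, not part of arrey.

```typescript
// Mirrors the Input Schema. Fields marked `?` are optional.
interface ExtractInput {
  content: string;      // raw text, a file path, or a URL (as in the examples)
  format?: string;      // e.g. "bullets", "json", "table", "custom"
  maxItems?: number;    // cap on the number of findings returned
  instruction?: string; // extra guidance for the extraction
}

// Mirrors the Output Schema.
interface ExtractOutput {
  extraction: string; // the extracted facts, rendered in the requested format
  format: string;     // the format that was applied
  chunks: number;     // assumption: how many chunks the input was split into
  model: string;      // assumption: the model that produced the extraction
}

// Hypothetical helper: runtime check that an unknown value matches ExtractInput.
function isExtractInput(value: unknown): value is ExtractInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const optional = (key: string, type: string): boolean =>
    v[key] === undefined || typeof v[key] === type;
  return (
    typeof v.content === "string" &&
    optional("format", "string") &&
    optional("maxItems", "number") &&
    optional("instruction", "string")
  );
}

// Hypothetical helper: one-line description of a result, e.g. for logging.
function summarize(out: ExtractOutput): string {
  return `${out.format} extraction from ${out.chunks} chunk(s) via ${out.model}`;
}
```

A guard like this is mainly useful at boundaries where input arrives untyped, such as a request body or a CLI wrapper.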

Examples

Extract key facts as bullet points

{
  "content": "./meeting-notes.md",
  "format": "bullets"
}

Extract from a URL as JSON

{
  "content": "https://example.com/brief",
  "format": "json"
}

Limit output to top 5 findings

{
  "content": "paste text here",
  "format": "bullets",
  "maxItems": 5,
  "instruction": "Prioritize legal and financial risk facts."
}

README Reference

arrey/tools/extract

Extracts key facts from text, files, and URLs into configurable formats. Long documents are handled automatically: the input is split into chunks, facts are extracted from each chunk, and the results are merged. This code is yours; edit it freely.
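
The chunk-then-merge flow described above can be sketched roughly as follows. This is not the code in index.ts: the chunk size, the function names (`splitIntoChunks`, `mergeFacts`, `extractLong`), and the `ChunkExtractor` callback are all assumptions made for illustration. The tool's own chunk and combine prompts live in prompt.ts; here a plain deduplicating merge stands in for the combine step.

```typescript
// Assumed chunk size in characters; the tool's real limit is not documented here.
const CHUNK_SIZE = 2000;

// Split text on blank lines and pack paragraphs into chunks of at most `size` characters.
function splitIntoChunks(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const para of text.split(/\n\s*\n/)) {
    const p = para.trim();
    if (p.length === 0) continue;
    // Flush the current chunk if adding this paragraph would exceed the limit.
    if (current.length > 0 && current.length + 2 + p.length > size) {
      chunks.push(current);
      current = "";
    }
    // A single oversized paragraph is hard-split into fixed-size slices.
    if (p.length > size) {
      for (let i = 0; i < p.length; i += size) chunks.push(p.slice(i, i + size));
      continue;
    }
    current = current.length > 0 ? `${current}\n\n${p}` : p;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Merge per-chunk facts: keep first occurrences (case-insensitive), then apply maxItems.
function mergeFacts(perChunk: string[][], maxItems?: number): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const facts of perChunk) {
    for (const fact of facts) {
      const key = fact.trim().toLowerCase();
      if (key.length === 0 || seen.has(key)) continue;
      seen.add(key);
      merged.push(fact.trim());
    }
  }
  return maxItems === undefined ? merged : merged.slice(0, maxItems);
}

// Hypothetical per-chunk extractor; in the real tool this step would call a model.
type ChunkExtractor = (chunk: string) => Promise<string[]>;

async function extractLong(
  text: string,
  extractChunk: ChunkExtractor,
  maxItems?: number
): Promise<{ facts: string[]; chunks: number }> {
  const chunks = splitIntoChunks(text);
  const perChunk = await Promise.all(chunks.map((c) => extractChunk(c)));
  return { facts: mergeFacts(perChunk, maxItems), chunks: chunks.length };
}
```

Packing on paragraph boundaries keeps related sentences in the same chunk. Applying `maxItems` only after the merge keeps a fact from a later chunk from being dropped before duplicates are removed.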


Usage

CLI

arrey extract ./notes.md
arrey extract ./notes.md --format json
arrey extract https://example.com/brief --format table
arrey extract "paste text here" --format bullets --maxItems 5

SDK

import { arrey } from 'arrey'

const result = await arrey.run('extract', {
  content: './research.md',
  format: 'json'
})

console.log(result.extraction)

Direct import (tool-local helper)

import { extract } from './arrey/tools/extract'

const result = await extract({
  prompt: './research.md',
  format: 'json',
  temp: 0.2,
  model: 'gpt-4.1-mini'
})

console.log(result.extraction)

Agent tool (Vercel AI SDK)

import { arrey } from 'arrey'
import { extract } from './arrey/tools/extract'

const tools = await arrey.toVercelAIFrom([extract])

Formats

| Format | Output |
| --- | --- |
| bullets | 5-10 concise factual bullets |
| json | JSON array of facts with confidence labels |
| table | Markdown table (`Fact …`) |
| custom | Whatever you define in prompt.ts -> formats.custom |
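
Because `extraction` is returned as a string, a caller using the json format has to parse it before working with individual facts. The sketch below shows one way to do that. The item shape `{ fact, confidence }` is an assumption inferred from "facts with confidence labels"; the real field names are set by the prompt in prompt.ts and may differ. `parseJsonExtraction` is a hypothetical helper, not an arrey export.

```typescript
// Assumed item shape; check the actual output of your prompt before relying on it.
interface ExtractedFact {
  fact: string;
  confidence: string;
}

// Parse the `extraction` string from a json-format run and validate each item.
function parseJsonExtraction(extraction: string): ExtractedFact[] {
  const parsed: unknown = JSON.parse(extraction);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of facts");
  }
  return parsed.map((item: unknown, i: number) => {
    const rec = item as Record<string, unknown> | null;
    if (
      typeof rec !== "object" ||
      rec === null ||
      typeof rec.fact !== "string" ||
      typeof rec.confidence !== "string"
    ) {
      throw new Error(`item ${i} is not a { fact, confidence } object`);
    }
    return { fact: rec.fact, confidence: rec.confidence };
  });
}
```

Validating each item up front means a prompt change that alters the output shape fails loudly at the parse step rather than later in the pipeline.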

Customization

Tune extraction behavior

Edit prompt.ts -> chunk and combine.

Change output style

Edit prompt.ts -> formats.

Reduce output length

Pass maxItems at runtime:

arrey extract ./contract.md --format bullets --maxItems 3

Tool-specific model override

Set in arrey.config.yaml:

tools:
  extract:
    provider:
      model: gpt-4o

Files

| File | Purpose | Edit frequency |
| --- | --- | --- |
| prompt.ts | Prompts and output formats | Often |
| index.ts | Execution, chunking, merge behavior | Sometimes |
| manifest.json | Metadata and I/O schema | Rarely |
| README.md | This file | Optional |