Skip to content

Agent-Tracking-Mech/agentic-spreadsheet

Repository files navigation

agentic-spreadsheet

TypeScript library and HTTP/CLI tools for AI-assisted spreadsheet work: load Excel/CSV, ask natural-language questions, generate formulas, run structured cleaning transforms, and explain existing formulas. Built on SheetJS and the Vercel AI SDK.

TypeScript License: MIT Node.js

Contents: Features · Installation · Configuration · Usage · HTTP API · CLI · Docker · Testing · Troubleshooting

Features

  • Formula generation — Describe logic in English; get an Excel-style formula string.
  • Natural-language queries — Ask questions over tabular data (sample rows are sent to the model).
  • Transforms — LLM proposes a JSON plan (filter, dedupe, fill nulls, add column placeholders); the engine applies it deterministically.
  • Formula explanation — Plain-English explanation of a formula.
  • Formats — Load/save XLSX, CSV, ODS, XLS (via SheetJS).
  • Local-first — Files stay on disk; only prompts and excerpts go to your chosen LLM provider.

Requirements

  • Node.js 22+
  • An API key for at least one provider: OpenAI (OPENAI_API_KEY) and/or Anthropic (ANTHROPIC_API_KEY), depending on the model id you pass.

Installation

cd agentic-spreadsheet
npm install
npm run build

Configuration

Copy .env.example to .env and set keys:

OPENAI_API_KEY=sk-...
# Optional:
# ANTHROPIC_API_KEY=
# AGENT_MODEL=openai/gpt-4o-mini
PORT=3000

Model strings use a provider prefix, for example:

  • openai/gpt-4o-mini — requires OPENAI_API_KEY
  • anthropic/claude-3-5-sonnet-20241022 — requires ANTHROPIC_API_KEY

If you omit the prefix (e.g. gpt-4o-mini), OpenAI is assumed and OPENAI_API_KEY is required.

In code, new SpreadsheetAgent({ model: 'gpt-4o-mini' }) and model: 'openai/gpt-4o-mini' are equivalent for OpenAI (the prefix only selects the provider).

Scripts

Command Description
npm run build Compile TypeScript to dist/
npm run dev Run the HTTP server with hot reload (tsx watch)
npm start Run compiled server (node dist/api/main.js)
npm run cli Interactive CLI
npm test Run all Vitest tests (no API key required)
npm run clean Remove the dist/ build output
npm run test:watch Vitest watch mode
npm run test:integration Live LLM query test (needs OPENAI_API_KEY)
npm run test:formula Schema unit tests only

npm publish runs prepack, which executes npm run build so dist/ is always included in the package tarball.

Usage (library)

import { SpreadsheetAgent } from './src/index.js'; // or from 'agentic-spreadsheet' when installed as a dependency
import * as path from 'node:path';

const agent = new SpreadsheetAgent({
  model: 'openai/gpt-4o-mini',
  maxRows: 10000,           // optional: trim each sheet to this many rows on load
  maxContextRows: 200,      // optional: rows sent to the LLM per request
});

const workbook = await agent.loadWorkbook(path.join('samples', 'sales-sample.csv'), 'csv');

const result = await agent.query(workbook, 'What is the total revenue?');
console.log(result.answer);
console.log(result.formula);

const formula = await agent.generateFormula(
  'Commission is 10% of sales if sales > 1000, else 5%',
  { columns: ['Sales'] }
);

const cleaned = await agent.transform(
  workbook,
  'Remove rows where Status equals Cancelled'
);
await agent.saveWorkbook(cleaned, 'output/cleaned.xlsx', 'xlsx');

Direct data access (no LLM)

const rows = agent.getRows(workbook); // first sheet; pass { sheetName: '…' } for others
const columns = agent.getColumns(workbook);
const filtered = agent.filterRows(workbook, (row) => Number(row['Revenue']) > 1000);

API surface

Method Purpose
loadWorkbook(path | Buffer, format?) Load a file or buffer. For a Buffer, pass format unless it is XLSX (the default). Paths infer format from the extension.
saveWorkbook(workbook, path, format?) Save as XLSX or CSV. CSV writes the first sheet only (same limitation as typical Excel CSV export).
query(workbook, question, { sheetName? }) NL question → { answer, formula?, rows? }.
generateFormula(description, { columns? }) Single formula string.
transform(workbook, instruction, { sheetName? }) NL → structured actions → updated workbook.
explainFormula(formula, { columns? }) Text explanation.
getRows / getColumns / filterRows Non-LLM helpers.
detectFormatFromPath(path) Infer SheetFormat from a file path extension.
Type SheetFormat 'xlsx' | 'csv' | 'ods' | 'xls'

query, transform, generateFormula, and explainFormula reject empty or whitespace-only text inputs. For query and transform, sheetName is optional (defaults to the first sheet). Unknown sheet names throw an error listing available sheet names.

The HTTP API rejects whitespace-only command values and trims formula for explain mode.

HTTP API

npm run dev
# or after build: npm start
  • GET /health — Liveness check.
  • POST /spreadsheet — JSON body:
Field Description
mode query (default), transform, formula, or explain
command Natural language command (not required for explain)
file Path to a file under the uploads directory (relative name or absolute path under that root)
formula For mode: explain only
sheetName Optional sheet name

Examples:

# Place a file in ./uploads (created on startup), e.g. uploads/data.csv

curl -s http://localhost:3000/health

curl -s -X POST http://localhost:3000/spreadsheet \
  -H "Content-Type: application/json" \
  -d '{"mode":"formula","command":"Sum column B if column A equals Yes"}'

curl -s -X POST http://localhost:3000/spreadsheet \
  -H "Content-Type: application/json" \
  -d '{"mode":"query","command":"What is the average revenue?","file":"data.csv"}'

curl -s -X POST http://localhost:3000/spreadsheet \
  -H "Content-Type: application/json" \
  -d '{"mode":"explain","formula":"=IF(A1>0,SUM(B1:B10),0)"}'

POST /spreadsheet only accepts file paths that resolve inside the configured uploads directory (traversal and the uploads root itself are rejected).

CLI

npm run cli

Prompts for a file path, then loops on questions (same SpreadsheetAgent.query behavior as the library).

Docker

npm run build
docker compose build
docker compose up

Mount ./uploads and ./output as in docker-compose.yml. Set OPENAI_API_KEY in the environment.

Testing

npm test                  # Unit + API tests (no API key)
npm run test:integration  # LLM query test (skipped without OPENAI_API_KEY)
npm run test:formula      # Schema unit tests

Project layout

agentic-spreadsheet/
├── src/
│   ├── SpreadsheetAgent.ts
│   ├── index.ts
│   ├── llm/model.ts
│   ├── engine/
│   ├── prompts/
│   ├── schemas/
│   ├── api/
│   │   ├── server.ts    # buildServer(), routes
│   │   └── main.ts      # HTTP listen
│   └── cli/
├── samples/
├── tests/
├── README.md
├── Dockerfile
├── docker-compose.yml
├── .env.example
├── .gitignore
├── .dockerignore
├── package.json
├── LICENSE
├── tsconfig.json
└── vitest.config.ts

dist/ is produced by npm run build. dist/ and .vitest-cache/ (Vitest) are gitignored.

Troubleshooting

  • OPENAI_API_KEY is required — Export the key or add it to .env before calling LLM features (query, transform, generateFormula, explainFormula).
  • Port already in use — Set PORT=3010 (or another free port) in .env.
  • Integration test skippednpm run test:integration only runs when OPENAI_API_KEY is set in the environment.
  • Wrong sheet or empty workbookquery / transform require a valid sheetName when you do not want the first sheet; empty workbooks are rejected.

Google Sheets

The library does not call Google APIs. You can fetch values with googleapis, build a CSV buffer, and call loadWorkbook(buffer, 'csv') — see the Google Sheets API documentation.

License

MIT

Contact Info

Telegram: @AuraTerminal

About

TypeScript library: AI agent for spreadsheets (queries, formulas, transforms) with SheetJS & LLMs.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors