Commit 279010a

improvement(execution): memory usage for aggregated results (#4650)
* improvement(execution): memory usage for aggregated results
* progress
* address comments
* loop/parallel results compaction
* address comment
* remove build files, harden edge cases
* remove hotpath serialization
* display change to make use of only preview
* materialize refs before sending in response block
* preserve exact large-value access through workflow materialization
* address comments
* progress
* fix notif error + sync manifest undefined exit
* fix streaming ref materialization
* fix tests
1 parent 6827be7 commit 279010a

78 files changed

Lines changed: 5583 additions & 527 deletions

apps/docs/content/docs/en/blocks/function.mdx

Lines changed: 2 additions & 2 deletions
@@ -198,7 +198,7 @@ const file = <readfile.file>;
 const base64 = await sim.files.readBase64(file);
 ```
 
-`sim.files.readBase64(file)`, `sim.files.readText(file)`, `sim.files.readBase64Chunk(file, { offset, length })`, and `sim.files.readTextChunk(file, { offset, length })` read from server-side execution storage under memory caps. `sim.values.read(ref)` can explicitly read a large execution value reference. These helpers are available only in JavaScript functions without imports. JavaScript with imports, Python, and shell do not support these lazy helpers yet.
+`sim.files.readBase64(file)`, `sim.files.readText(file)`, `sim.files.readBase64Chunk(file, { offset, length })`, and `sim.files.readTextChunk(file, { offset, length })` read from server-side execution storage under memory caps. `sim.values.read(ref)` explicitly reads a large execution value reference, and `sim.values.readArray(ref)` reads a manifest-backed large array. These helpers are available only in JavaScript functions without imports. JavaScript with imports, Python, and shell do not support these lazy helpers yet.
 
 Very large full reads can still fail by design; use chunk helpers or return a file when you need to handle more data.
 
@@ -228,7 +228,7 @@ return { name: file.name, chunk: firstMegabyteBase64 };
 
 Chunk `offset` and `length` are byte-based. For Unicode text, a chunk can split a multi-byte character at the boundary; use text chunks for approximate text processing and prefer smaller structured references when exact parsing matters.
 
-Avoid passing a full large object into a Function block when you only need one field. For example, prefer `<api.data.customerId>` over `<api.data>` when the API response is large. If a JavaScript Function without imports references a large execution value, Sim automatically reads it through `sim.values.read(...)` at runtime under memory caps.
+Avoid passing a full large object into a Function block when you only need one field. For example, prefer `<api.data.customerId>` over `<api.data>` when the API response is large. If a JavaScript Function without imports references a whole large execution value, Sim automatically rewrites it to `sim.values.read(...)` at runtime under memory caps. If the value is a manifest-backed array, Sim rewrites it to `sim.values.readArray(...)` so array variables can stay compact between blocks.
 
 For large generated data, write the result to a file or table with `outputPath`, `outputSandboxPath`, or `outputTable` instead of returning the entire payload inline.
 
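Reviewer note: the docs above state that chunk `offset` and `length` are byte-based and that a chunk can split a multi-byte character. The sketch below illustrates that effect with the standard `TextEncoder` and `TextDecoder` only. `decodeByteChunk` is a hypothetical helper written for this note; it is not `sim.files.readTextChunk` and makes no claim about how that helper decodes its bytes.

```typescript
// Hypothetical helper for illustration: slice a string by UTF-8 byte range, then decode.
const encoder = new TextEncoder()
const decoder = new TextDecoder('utf-8')

function decodeByteChunk(text: string, offset: number, length: number): string {
  const bytes = encoder.encode(text)
  return decoder.decode(bytes.subarray(offset, offset + length))
}

// 'h' is 1 byte in UTF-8 and 'é' is 2 bytes, so 'héllo' is 6 bytes long.
const sample = 'héllo'

// A 3-byte chunk ends on a character boundary and decodes cleanly.
const clean = decodeByteChunk(sample, 0, 3)

// A 2-byte chunk ends in the middle of 'é'; the decoder substitutes U+FFFD for the partial character.
const split = decodeByteChunk(sample, 0, 2)

console.log(JSON.stringify(clean), JSON.stringify(split))
```

Under these assumptions, a boundary that lands inside a character yields a replacement character rather than an error, which is why the docs suggest text chunks only for approximate processing.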
apps/docs/content/docs/en/execution/api-deployment.mdx

Lines changed: 1 addition & 1 deletion
@@ -232,7 +232,7 @@ Workflow execution responses are capped by platform request and response limits.
 }
 ```
 
-The `version` field is part of the external API contract. Treat the reference as an opaque placeholder for a value that could not be safely embedded in the response. `id`, `key`, and `executionId` are not fetch URLs; `key` points to execution-scoped server storage. Use `selectedOutputs` to request a smaller nested field, reduce the data passed between blocks, or return the data from a Response block when your workflow intentionally owns the HTTP response body. File outputs are metadata-first; request `.base64` only when you need inline file content. JavaScript Function blocks can explicitly read large files or value refs with the `sim.files` and `sim.values` helpers under memory caps.
+The `version` field is part of the external API contract. Treat the reference as an opaque placeholder for a value that could not be safely embedded in the response. `id`, `key`, and `executionId` are not fetch URLs; `key` points to execution-scoped server storage. Use `selectedOutputs` to request a smaller nested field, reduce the data passed between blocks, or return the data from a Response block when your workflow intentionally owns the HTTP response body. File outputs are metadata-first; request `.base64` only when you need inline file content. JavaScript Function blocks can explicitly read large files, value refs, and manifest-backed arrays with the `sim.files` and `sim.values` helpers under memory caps.
 
 ### Asynchronous
 
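Reviewer note: since the docs ask API consumers to treat a reference as an opaque placeholder, a client may want to detect where placeholders appear in a response so it can narrow `selectedOutputs`. The sketch below is one way to do that. The `__simLargeValueRef` marker and the field set are taken from the test fixtures in this commit; that the external response carries exactly this shape is an assumption, and `isLargeValueRefLike` and `findLargeValueRefPaths` are hypothetical client-side names, not part of any Sim SDK.

```typescript
// Assumed shape, inferred from this commit's test fixtures; not an official type.
interface LargeValueRefLike {
  __simLargeValueRef: true
  version: number
  id: string
}

function isLargeValueRefLike(value: unknown): value is LargeValueRefLike {
  if (typeof value !== 'object' || value === null) return false
  const record = value as Record<string, unknown>
  return (
    record.__simLargeValueRef === true &&
    typeof record.version === 'number' &&
    typeof record.id === 'string'
  )
}

// Walk a response body and collect the paths whose values are placeholders.
function findLargeValueRefPaths(value: unknown, path = '$'): string[] {
  if (isLargeValueRefLike(value)) return [path]
  const found: string[] = []
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      found.push(...findLargeValueRefPaths(item, `${path}[${index}]`))
    })
  } else if (typeof value === 'object' && value !== null) {
    const record = value as Record<string, unknown>
    for (const key of Object.keys(record)) {
      found.push(...findLargeValueRefPaths(record[key], `${path}.${key}`))
    }
  }
  return found
}

// Illustrative response body; the values are made up for this note.
const sampleResponse = {
  success: true,
  output: {
    customerId: 'c_1',
    rows: {
      __simLargeValueRef: true,
      version: 1,
      id: 'lv_ABCDEFGHIJKL',
      kind: 'array',
      size: 12 * 1024 * 1024,
      executionId: 'execution-1',
    },
  },
}

const paths = findLargeValueRefPaths(sampleResponse)
console.log(paths)
```

The helper only reports locations; it never tries to fetch by `id` or `key`, in line with the statement above that those fields are not fetch URLs.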
apps/sim/app/api/function/execute/route.test.ts

Lines changed: 135 additions & 1 deletion
@@ -12,9 +12,10 @@ import {
 import { NextRequest } from 'next/server'
 import { beforeEach, describe, expect, it, vi } from 'vitest'
 
-const { mockExecuteInE2B, mockExecuteInIsolatedVM } = vi.hoisted(() => ({
+const { mockExecuteInE2B, mockExecuteInIsolatedVM, mockUploadFile } = vi.hoisted(() => ({
   mockExecuteInE2B: vi.fn(),
   mockExecuteInIsolatedVM: vi.fn(),
+  mockUploadFile: vi.fn(),
 }))
 
 vi.mock('@/lib/execution/isolated-vm', () => ({
@@ -42,16 +43,26 @@ vi.mock('@/lib/uploads/contexts/workspace/workspace-file-manager', () => ({
   uploadWorkspaceFile: vi.fn(),
 }))
 
+vi.mock('@/lib/uploads', () => ({
+  StorageService: {
+    uploadFile: mockUploadFile,
+  },
+}))
+
 vi.mock('@/lib/workflows/utils', () => workflowsUtilsMock)
 
 vi.mock('@/lib/core/config/feature-flags', () => featureFlagsMock)
 
 import { validateProxyUrl } from '@/lib/core/security/input-validation'
+import { clearLargeValueCacheForTests } from '@/lib/execution/payloads/cache'
+import { isLargeArrayManifest } from '@/lib/execution/payloads/large-array-manifest-metadata'
+import { isLargeValueRef } from '@/lib/execution/payloads/large-value-ref'
 import { POST } from '@/app/api/function/execute/route'
 
 describe('Function Execute API Route', () => {
   beforeEach(() => {
     vi.clearAllMocks()
+    featureFlagsMock.isE2bEnabled = false
 
     hybridAuthMockFns.mockCheckInternalAuth.mockResolvedValue({
       success: true,
@@ -60,6 +71,8 @@ describe('Function Execute API Route', () => {
     })
 
     mockExecuteInIsolatedVM.mockResolvedValue({ result: 'test', stdout: '' })
+    mockUploadFile.mockImplementation(async ({ customKey }) => ({ key: customKey }))
+    clearLargeValueCacheForTests()
 
     mockExecuteInE2B.mockResolvedValue({
       result: 'e2b success',
@@ -201,6 +214,60 @@ describe('Function Execute API Route', () => {
       expect(data.output).toHaveProperty('executionTime')
     })
 
+    it('compacts large array result fields to manifests when execution context is durable', async () => {
+      mockExecuteInIsolatedVM.mockResolvedValueOnce({
+        result: {
+          rows: Array.from({ length: 120_000 }, (_, index) => ({
+            key: `SIM-${index}`,
+            payload: 'x'.repeat(100),
+          })),
+        },
+        stdout: '',
+      })
+
+      const req = createMockRequest('POST', {
+        code: 'return rows',
+        workflowId: 'workflow-1',
+        workspaceId: 'workspace-1',
+        executionId: 'execution-1',
+      })
+
+      const response = await POST(req)
+      const data = await response.json()
+
+      expect(response.status).toBe(200)
+      expect(data.success).toBe(true)
+      expect(isLargeArrayManifest(data.output.result.rows)).toBe(true)
+      expect(data.output.result.rows).toMatchObject({
+        __simLargeArrayManifest: true,
+        kind: 'array',
+        totalCount: 120_000,
+      })
+    })
+
+    it('keeps large string result fields as generic large value refs', async () => {
+      mockExecuteInIsolatedVM.mockResolvedValueOnce({
+        result: {
+          text: 'x'.repeat(9 * 1024 * 1024),
+        },
+        stdout: '',
+      })
+
+      const req = createMockRequest('POST', {
+        code: 'return text',
+        workflowId: 'workflow-1',
+        workspaceId: 'workspace-1',
+        executionId: 'execution-1',
+      })
+
+      const response = await POST(req)
+      const data = await response.json()
+
+      expect(response.status).toBe(200)
+      expect(data.success).toBe(true)
+      expect(isLargeValueRef(data.output.result.text)).toBe(true)
+    })
+
     it('should return computed result for multi-line code', async () => {
       mockExecuteInIsolatedVM.mockResolvedValueOnce({ result: 10, stdout: '' })
 
@@ -240,6 +307,73 @@ describe('Function Execute API Route', () => {
       expect(response.status).toBe(200)
       expect(data.success).toBe(true)
     })
+
+    it('rejects large refs in runtimes without ref-native helpers', async () => {
+      featureFlagsMock.isE2bEnabled = true
+      const req = createMockRequest('POST', {
+        code: 'echo "$__blockRef_0"',
+        language: 'shell',
+        contextVariables: {
+          __blockRef_0: {
+            __simLargeValueRef: true,
+            version: 1,
+            id: 'lv_ABCDEFGHIJKL',
+            kind: 'array',
+            size: 12 * 1024 * 1024,
+            executionId: 'execution-1',
+          },
+        },
+      })
+
+      const response = await POST(req)
+      const data = await response.json()
+
+      expect(response.status).toBe(500)
+      expect(data.success).toBe(false)
+      expect(data.error).toContain(
+        'Large execution values require the JavaScript isolated-vm runtime'
+      )
+    })
+
+    it('registers manifest array read broker for isolated-vm execution', async () => {
+      const req = createMockRequest('POST', {
+        code: 'return await sim.values.readArray(__blockRef_0)',
+        language: 'javascript',
+        contextVariables: {
+          __blockRef_0: {
+            __simLargeArrayManifest: true,
+            version: 2,
+            kind: 'array',
+            totalCount: 1,
+            chunkCount: 1,
+            byteSize: 16,
+            chunks: [
+              {
+                ref: {
+                  __simLargeValueRef: true,
+                  version: 1,
+                  id: 'lv_ABCDEFGHIJKL',
+                  kind: 'array',
+                  size: 16,
+                  executionId: 'execution-1',
+                },
+                count: 1,
+                byteSize: 16,
+              },
+            ],
+            preview: [{ id: 1 }],
+          },
+        },
+      })
+
+      const response = await POST(req)
+      const data = await response.json()
+      const [, options] = mockExecuteInIsolatedVM.mock.calls.at(-1) ?? []
+
+      expect(response.status).toBe(200)
+      expect(data.success).toBe(true)
+      expect(options?.brokers).toHaveProperty('sim.values.readArray')
+    })
   })
 
   describe('Template Variable Resolution', () => {
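
Reviewer note: the fixtures above show a manifest as a compact record (`totalCount`, `chunkCount`, `byteSize`, `chunks`, `preview`) whose chunks each point at a stored value. The sketch below is a simplified model of that idea, written to make the fixture fields easier to read. It is not the code in `@/lib/execution/payloads/large-array-manifest`: the names `buildManifestSketch` and `readManifestSketch`, the in-memory `Map` store, the synchronous reads, and the chunk ids are all assumptions made for this note.

```typescript
// Simplified, hypothetical model of a manifest-backed array; field names follow the test fixture.
interface ChunkEntry {
  ref: { id: string }
  count: number
  byteSize: number
}

interface ArrayManifestSketch {
  __simLargeArrayManifest: true
  kind: 'array'
  totalCount: number
  chunkCount: number
  byteSize: number
  chunks: ChunkEntry[]
  preview: unknown[]
}

const jsonByteLength = (value: unknown): number =>
  new TextEncoder().encode(JSON.stringify(value)).length

// Split rows into fixed-size chunks, store each chunk, and keep only metadata inline.
function buildManifestSketch(
  rows: unknown[],
  rowsPerChunk: number,
  store: Map<string, unknown[]>
): ArrayManifestSketch {
  if (rowsPerChunk < 1) throw new Error('rowsPerChunk must be at least 1')
  const chunks: ChunkEntry[] = []
  for (let start = 0; start < rows.length; start += rowsPerChunk) {
    const slice = rows.slice(start, start + rowsPerChunk)
    const id = `chunk_${chunks.length}` // made-up id scheme
    store.set(id, slice)
    chunks.push({ ref: { id }, count: slice.length, byteSize: jsonByteLength(slice) })
  }
  return {
    __simLargeArrayManifest: true,
    kind: 'array',
    totalCount: rows.length,
    chunkCount: chunks.length,
    byteSize: chunks.reduce((sum, chunk) => sum + chunk.byteSize, 0),
    chunks,
    preview: rows.slice(0, 1),
  }
}

// Rebuild the full array, refusing to do so when the manifest exceeds the byte cap.
function readManifestSketch(
  manifest: ArrayManifestSketch,
  store: Map<string, unknown[]>,
  maxBytes: number
): unknown[] {
  if (manifest.byteSize > maxBytes) {
    throw new Error(`Manifest is ${manifest.byteSize} bytes, above the ${maxBytes} byte cap`)
  }
  const out: unknown[] = []
  for (const chunk of manifest.chunks) {
    const stored = store.get(chunk.ref.id)
    if (!stored) throw new Error(`Missing chunk ${chunk.ref.id}`)
    for (const row of stored) out.push(row)
  }
  return out
}

const store = new Map<string, unknown[]>()
const rows = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }, { id: 5 }]
const manifest = buildManifestSketch(rows, 2, store)
const roundTrip = readManifestSketch(manifest, store, 1024)
console.log(manifest.chunkCount, roundTrip.length)
```

The point of the model is that the manifest itself stays small regardless of `totalCount`, which matches the first test's expectation that a 120,000-row result comes back as a manifest rather than inline rows.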

apps/sim/app/api/function/execute/route.ts

Lines changed: 62 additions & 2 deletions
@@ -14,7 +14,12 @@ import { withRouteHandler } from '@/lib/core/utils/with-route-handler'
 import { executeInE2B, executeShellInE2B } from '@/lib/execution/e2b'
 import { executeInIsolatedVM, type IsolatedVMBrokerHandler } from '@/lib/execution/isolated-vm'
 import { CodeLanguage, DEFAULT_CODE_LANGUAGE, isValidCodeLanguage } from '@/lib/execution/languages'
-import { isLargeValueRef } from '@/lib/execution/payloads/large-value-ref'
+import { recordMaterializedAccessKeys } from '@/lib/execution/payloads/access-keys'
+import {
+  isLargeArrayManifest,
+  materializeLargeArrayManifest,
+} from '@/lib/execution/payloads/large-array-manifest'
+import { containsLargeValueRef, isLargeValueRef } from '@/lib/execution/payloads/large-value-ref'
 import {
   MAX_FUNCTION_INLINE_BYTES,
   MAX_INLINE_MATERIALIZATION_BYTES,
@@ -699,6 +704,8 @@ interface FunctionRouteExecutionContext {
   workspaceId?: string
   executionId?: string
   largeValueExecutionIds?: string[]
+  largeValueKeys?: string[]
+  fileKeys?: string[]
   allowLargeValueWorkflowScope?: boolean
   userId?: string
   requestId: string
@@ -741,17 +748,26 @@ function getBrokerFileArgs(args: unknown): {
 function createFunctionRuntimeBrokers(
   context: FunctionRouteExecutionContext
 ): Record<string, IsolatedVMBrokerHandler> {
+  context.largeValueKeys ??= []
+  context.fileKeys ??= []
+  const largeValueKeys = context.largeValueKeys
+  const fileKeys = context.fileKeys
   const base = {
     requestId: context.requestId,
     workflowId: context.workflowId,
     workspaceId: context.workspaceId,
     executionId: context.executionId,
     largeValueExecutionIds: context.largeValueExecutionIds,
+    largeValueKeys,
+    fileKeys,
     allowLargeValueWorkflowScope: context.allowLargeValueWorkflowScope,
     userId: context.userId,
     logger,
   }
 
+  const recordMaterializedKeys = (value: unknown) =>
+    recordMaterializedAccessKeys({ largeValueKeys, fileKeys }, value)
+
   const readFile = async (args: unknown, encoding: 'base64' | 'text', chunked = false) => {
     const fileArgs = getBrokerFileArgs(args)
     return readUserFileContent(fileArgs.file, {
@@ -786,6 +802,24 @@ function createFunctionRuntimeBrokers(
       if (value === undefined) {
         throw unavailableLargeValueError(ref)
       }
+      recordMaterializedKeys(value)
+      return value
+    },
+    'sim.values.readArray': async (args) => {
+      const record = asRecord(args)
+      const options = asRecord(record.options)
+      const manifest = record.ref
+      if (!isLargeArrayManifest(manifest)) {
+        throw new Error('Expected a large array manifest.')
+      }
+      if (!context.executionId) {
+        throw new Error('Large array manifests require an execution context.')
+      }
+      const value = await materializeLargeArrayManifest(manifest, {
+        ...base,
+        maxBytes: clampInlineBytes(options.maxBytes, MAX_INLINE_MATERIALIZATION_BYTES),
+      })
+      recordMaterializedKeys(value)
       return value
     },
   }
@@ -810,7 +844,17 @@ async function functionJsonResponse<T>(
   context: FunctionRouteExecutionContext,
   init?: ResponseInit
 ) {
-  return NextResponse.json(await compactFunctionRouteBody(body, context), init)
+  return NextResponse.json(
+    await compactFunctionRouteBody(
+      {
+        ...body,
+        largeValueKeys: context.largeValueKeys,
+        fileKeys: context.fileKeys,
+      },
+      context
+    ),
+    init
+  )
 }
 
 async function maybeExportSandboxFileToWorkspace(args: {
@@ -955,6 +999,8 @@ export const POST = withRouteHandler(async (req: NextRequest) => {
       workflowId,
       executionId,
       largeValueExecutionIds,
+      largeValueKeys,
+      fileKeys,
       allowLargeValueWorkflowScope = false,
       workspaceId,
       isCustomTool = false,
@@ -979,6 +1025,8 @@ export const POST = withRouteHandler(async (req: NextRequest) => {
       workspaceId,
       executionId,
       largeValueExecutionIds,
+      largeValueKeys,
+      fileKeys,
       allowLargeValueWorkflowScope,
       userId: auth.userId,
       requestId,
@@ -1013,6 +1061,12 @@ export const POST = withRouteHandler(async (req: NextRequest) => {
       contextVariables = { ...codeResolution.contextVariables, ...preResolvedContextVariables }
     }
 
+    if (lang === CodeLanguage.Shell && containsLargeValueRef(contextVariables)) {
+      throw new Error(
+        'Large execution values require the JavaScript isolated-vm runtime. Select a nested field or read the value in a JavaScript function.'
+      )
+    }
+
     let jsImports = ''
     let jsRemainingCode = resolvedCode
     let hasImports = false
@@ -1124,6 +1178,12 @@ export const POST = withRouteHandler(async (req: NextRequest) => {
       !isCustomTool &&
       (lang === CodeLanguage.Python || (lang === CodeLanguage.JavaScript && hasImports))
 
+    if (useE2B && containsLargeValueRef(contextVariables)) {
+      throw new Error(
+        'Large execution values require the JavaScript isolated-vm runtime. Remove imports, select a nested field, or read the value in a JavaScript function without E2B.'
+      )
+    }
+
     if (useE2B) {
       logger.info(`[${requestId}] E2B status`, {
         enabled: isE2bEnabled,
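
Reviewer note: the two guards added in `route.ts` reject shell and E2B executions whose `contextVariables` contain a large value reference, because only the isolated-vm runtime has the `sim.values` brokers. The sketch below models that gate in isolation. The real `containsLargeValueRef` lives in `@/lib/execution/payloads/large-value-ref` and is not shown in this diff, so the recursive walk, the cycle guard, and the names `containsLargeValueRefSketch` and `assertRuntimeSupportsRefs` are assumptions made for this note.

```typescript
type RuntimeSketch = 'isolated-vm' | 'e2b' | 'shell'

// Assumed behavior: return true if any nested value carries the __simLargeValueRef marker.
function containsLargeValueRefSketch(value: unknown, seen = new Set<object>()): boolean {
  if (typeof value !== 'object' || value === null) return false
  if (seen.has(value)) return false // guard against cyclic inputs
  seen.add(value)
  const record = value as Record<string, unknown>
  if (record.__simLargeValueRef === true) return true
  return Object.keys(record).some((key) => containsLargeValueRefSketch(record[key], seen))
}

// Fail early, before any sandbox is started, when the runtime cannot resolve refs.
function assertRuntimeSupportsRefs(
  runtime: RuntimeSketch,
  contextVariables: Record<string, unknown>
): void {
  if (runtime !== 'isolated-vm' && containsLargeValueRefSketch(contextVariables)) {
    throw new Error('Large execution values require the JavaScript isolated-vm runtime.')
  }
}

// Fixture copied from the shell test case in this commit.
const refVariables = {
  __blockRef_0: {
    __simLargeValueRef: true,
    version: 1,
    id: 'lv_ABCDEFGHIJKL',
    kind: 'array',
    size: 12 * 1024 * 1024,
    executionId: 'execution-1',
  },
}

assertRuntimeSupportsRefs('isolated-vm', refVariables) // passes: brokers can read the ref
assertRuntimeSupportsRefs('shell', { name: 'plain value' }) // passes: nothing to resolve
```

With a recursive walk like this one, a manifest would also trip the guard because its `chunks` hold refs; whether the real helper behaves the same way is not visible in this diff.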
