See Research section on Datalayer AI.
This repository includes a workflow at .github/workflows/datalayer-evals.yml to compare:
- agents with MCP tools and skills, codemode disabled
- the same agent setup with codemode enabled
The workflow creates evalsets from two spec files in this repository:
- codemode-simple-1/no-codemode.evalset.json
- codemode-simple-1/codemode.evalset.json
Before running the workflow:
- Add the repository secret DATALAYER_API_KEY.
- Review or customize the two evalset spec files under codemode-simple-1/.
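After editing the spec files, a quick local check that both still parse as JSON can catch mistakes before a workflow run. The sketch below is only an illustration: the helper name `check_specs` is hypothetical, and it verifies JSON syntax only, since the evalset schema is not described here.

```python
import json
from pathlib import Path

# Spec file paths as listed in this README, relative to the repository root.
SPEC_FILES = [
    "codemode-simple-1/no-codemode.evalset.json",
    "codemode-simple-1/codemode.evalset.json",
]


def check_specs(repo_root=".", spec_files=SPEC_FILES):
    """Map each spec path to None if it parses as JSON, else to an error message.

    Hypothetical helper: it checks syntax only, not the evalset schema.
    """
    results = {}
    for rel in spec_files:
        path = Path(repo_root) / rel
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[rel] = None
        except FileNotFoundError:
            results[rel] = "file not found"
        except json.JSONDecodeError as exc:
            results[rel] = f"invalid JSON: {exc}"
    return results
```

Calling `check_specs()` from the repository root returns a dictionary in which every value should be `None` when both files are present and well formed.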
Trigger the datalayer-evals workflow manually. All inputs are optional:
- no_codemode_spec_file (overrides the default no-codemode spec file)
- codemode_spec_file (overrides the default codemode spec file)
- run_limit, ai_agents_url, account_uid
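Besides the Actions tab, a manual run can be started through GitHub's REST endpoint for workflow dispatch events. The following is a minimal sketch, not part of this repository: the helper name `build_dispatch_request`, the `main` branch, and the example input value are assumptions, and the owner, repository, and token must be supplied by the caller.

```python
import json
import urllib.request

WORKFLOW_FILE = "datalayer-evals.yml"


def build_dispatch_request(owner, repo, token, ref="main", inputs=None):
    """Build (but do not send) a workflow_dispatch request for the evals workflow.

    Hypothetical helper. `inputs` maps workflow input names (for example
    "run_limit") to string values; omitted inputs fall back to their defaults.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{WORKFLOW_FILE}/dispatches"
    )
    payload = {"ref": ref, "inputs": inputs or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` sends it. The token needs permission to run Actions workflows on the repository, and input values are sent as strings, for example `inputs={"run_limit": "5"}`.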
The workflow publishes artifacts:
- artifacts/no-codemode-report.md
- artifacts/no-codemode-report.csv
- artifacts/codemode-report.md
- artifacts/codemode-report.csv
- artifacts/comparison-summary.md
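When the artifacts are downloaded for local inspection, a short check can confirm that all five files are present. This is a sketch under the assumption that the download preserves the artifacts/ paths listed above (the actual layout may differ); `missing_artifacts` is a hypothetical helper name.

```python
from pathlib import Path

# Artifact paths as listed in this README.
EXPECTED_ARTIFACTS = [
    "artifacts/no-codemode-report.md",
    "artifacts/no-codemode-report.csv",
    "artifacts/codemode-report.md",
    "artifacts/codemode-report.csv",
    "artifacts/comparison-summary.md",
]


def missing_artifacts(root="."):
    """Return the expected artifact paths that do not exist as files under `root`."""
    return [rel for rel in EXPECTED_ARTIFACTS if not (Path(root) / rel).is_file()]
```

An empty list means every expected report was found under `root`.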
The CI log output is generated by the Datalayer Core CLI report command, and the comparison summary is added to the workflow summary.