To run the script, go to the root of this repo and use the following command:

```bash
python evaluation/scripts/gencode_json.py [options]
```

You first need to set up your API keys. To do so, create a `keys.cfg` file at the root of the repository and add keys as follows:

```
OPENAI_KEY = 'your_api_key'
ANTHROPIC_KEY = 'your_api_key'
GOOGLE_KEY = 'your_api_key'
```
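For reference, a config file in this `KEY = 'value'` shape can be read with a few lines of Python. This is a hedged sketch of one way to parse it, not the repository's actual loading code; the function name `load_keys` is hypothetical.

```python
def load_keys(path="keys.cfg"):
    """Parse a keys.cfg file of `NAME = 'value'` lines into a dict.

    Hypothetical helper: the real scripts may parse the file differently.
    """
    keys = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and anything without a NAME = value shape
            if not line or "=" not in line:
                continue
            name, _, value = line.partition("=")
            # Strip surrounding whitespace and quote characters
            keys[name.strip()] = value.strip().strip("'\"")
    return keys
```

Whichever parser the scripts actually use, the file must live at the repository root so it is found when you run the commands from there.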
For example, to create model results with gpt-4o and the default settings, run:

```bash
python evaluation/scripts/gencode_json.py --model gpt-4o
```

Available options:

- `--model` - Specifies the model name used for generating responses.
- `--output-dir` - Directory to store the generated code outputs (default: `eval_results/generated_code`).
- `--input-path` - Directory containing the JSON files describing the problems (default: `eval/data/problems_all.jsonl`).
- `--prompt-dir` - Directory where prompt files are saved (default: `eval_results/prompt`).
- `--temperature` - Controls the randomness of the generation (default: `0`).
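The option list above maps naturally onto Python's `argparse`. The following is a minimal sketch of what such a parser could look like, with defaults taken from the list; the function name and all implementation details beyond the flags and defaults are assumptions, not the script's actual code.

```python
import argparse

def build_parser():
    """Hypothetical argparse setup mirroring the documented options."""
    p = argparse.ArgumentParser(description="Generate model code outputs")
    p.add_argument("--model", required=True,
                   help="Model name used for generating responses")
    p.add_argument("--output-dir", default="eval_results/generated_code",
                   help="Directory to store the generated code outputs")
    p.add_argument("--input-path", default="eval/data/problems_all.jsonl",
                   help="JSON file describing the problems")
    p.add_argument("--prompt-dir", default="eval_results/prompt",
                   help="Directory where prompt files are saved")
    p.add_argument("--temperature", type=float, default=0.0,
                   help="Controls the randomness of the generation")
    return p
```

With this sketch, `build_parser().parse_args(["--model", "gpt-4o"])` reproduces the example invocation above, with every other option at its default.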
Download the numeric test results and save them as `./eval/data/test_data.h5`.
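To sanity-check the download, you can inspect the HDF5 file with `h5py`. This is a hedged sketch assuming `h5py` is installed; the file's internal group and dataset layout is not documented here, so the helper only lists whatever names it finds.

```python
import h5py

def list_datasets(path="eval/data/test_data.h5"):
    """List every group/dataset name in an HDF5 file.

    Illustrative helper for checking the downloaded test data;
    the actual contents of test_data.h5 are not assumed.
    """
    names = []
    with h5py.File(path, "r") as f:
        f.visit(names.append)  # visit() walks all groups and datasets
    return names
```

If the call succeeds and returns a non-empty list, the file was downloaded and saved correctly.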
To run the script, go to the root of this repo and use the following command:

```bash
python evaluation/scripts/test_generated_code.py
```

Please edit the `test_generated_code.py` source file to specify your model name, results directory, and problem set (if not `problems_all.jsonl`).