This project aims to train GraphSAGE for Retrieval-Augmented Code Generation using the context of the whole repository.
You can read more about the general idea in the report.
Authors:
- Konstantin Fedorov (k.fedorov@innopolis.university)
- Boris Zarubin (b.zarubin@innopolis.university)
Install dependencies:

```bash
uv sync
```

Clone EvoCodeBenchPlus and set up the repos according to its instructions.
Parse the repos into a graph dataset:

```bash
uv run python -m ragc.datasets.create_dataset \
    --evocodebench /path/to/EvoCodeBench/dataset/repos \
    configs/evocodebench/create_ds.yml
```

This creates cached PyG graphs under `data/torch_cache/evocodebench/`.
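To sanity-check the cache, you can load one of the graphs directly. This is a minimal sketch, assuming the cache stores pickled PyG `Data` objects as `.pt` files; the exact layout depends on what `create_dataset` writes:

```python
# Minimal sketch: inspect one cached graph. The .pt file layout and the
# assumption that files are pickled PyG Data objects may need adjusting.
from pathlib import Path

import torch

cache_dir = Path("data/torch_cache/evocodebench")
graph_file = next(cache_dir.rglob("*.pt"))  # first cached graph found
# weights_only=False is needed on newer torch to unpickle non-tensor objects
data = torch.load(graph_file, weights_only=False)
print(graph_file.name, data)
```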
Edit a config under `configs/evocodebench/gnn/<model>/greedy.yml`:

- `retrieval.model_path` — path to your trained `BEST_CHECKPOINT.pt`
- `fusion.generator` — model path or API endpoint for the LLM
- `task_path` — path to the EvoCodeBench `oracle.jsonl`
- `repos_path` — path to the EvoCodeBench cloned repos
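If you prefer not to edit the YAML by hand, the same keys can be patched programmatically. This is a minimal sketch using PyYAML; the key nesting is inferred from the list above, and all values are placeholders:

```python
# Hedged sketch: patch greedy.yml from Python. Key nesting is inferred from
# the config keys listed above; all values below are placeholders.
import yaml

cfg_path = "configs/evocodebench/gnn/<model>/greedy.yml"  # substitute <model>
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["retrieval"]["model_path"] = "checkpoints/BEST_CHECKPOINT.pt"
cfg["fusion"]["generator"] = "http://localhost:8000/v1"  # model path or API endpoint
cfg["task_path"] = "/path/to/EvoCodeBench/oracle.jsonl"
cfg["repos_path"] = "/path/to/EvoCodeBench/dataset/repos"

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```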
Code completion:
```bash
uv run python -m ragc.test.inference \
    -t completion \
    -o output/evocodebench/completions.jsonl \
    -c configs/evocodebench/gnn/<model>/greedy.yml
```

Retrieval metrics only (recall / precision):
```bash
uv run python -m ragc.test.inference \
    -t retrieval \
    -o output/evocodebench/retrieval_metrics.json \
    -c configs/evocodebench/gnn/<model>/greedy.yml
```

The completion output is a JSONL file with `namespace` and `completion` fields, compatible with the EvoCodeBench evaluation scripts.
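To take a quick look at the completions before running the EvoCodeBench evaluation, something like the following works. This is a minimal sketch, assuming only the `namespace` and `completion` fields mentioned above:

```python
# Quick look at the completion output; field names are taken from the
# description above, everything else is generic JSONL handling.
import json

with open("output/evocodebench/completions.jsonl") as f:
    for line in f:
        rec = json.loads(line)
        preview = rec["completion"][:60].replace("\n", " ")
        print(f'{rec["namespace"]}: {preview}')
```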
```bash
git clone https://github.com/peng-weihan/SWE-QA-Bench.git
cd SWE-QA-Bench
bash clone_repos.sh
cd ..
```

Parse the cloned repos into a graph dataset:
```bash
uv run python -m ragc.datasets.create_swe_qa_dataset \
    configs/swe_qa_bench/create_ds.yml \
    --repos-dir SWE-QA-Bench/SWE-QA-Bench/datasets/repos
```

This creates cached PyG graphs under `data/torch_cache/swe_qa_bench/`.
Edit `configs/swe_qa_bench/gnn/greedy.yml`:

- `retrieval.model_path` — path to your trained `BEST_CHECKPOINT.pt`
- `inference.fusion.generator.model` — LLM model name (e.g. `gpt-oss-120b`)
- `inference.fusion.generator.base_url` — API endpoint
- `questions_dir` — path to `SWE-QA-Bench/SWE-QA-Bench/datasets/questions`
- `repos` — (optional) list of repo names to evaluate; `null` for all available
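Before filling in `repos`, it can help to see which repos actually have questions. This is a minimal sketch, assuming one entry per repo inside `questions_dir`:

```python
# Hedged sketch: list repos with questions so the `repos` config key can be
# filled in. One entry per repo inside questions_dir is an assumption.
from pathlib import Path

questions_dir = Path("SWE-QA-Bench/SWE-QA-Bench/datasets/questions")
available = sorted(p.stem for p in questions_dir.iterdir())
print(f"{len(available)} repos available:", ", ".join(available))
```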
Set the API key:
```bash
export API_KEY=your-api-key
```

Run inference:

```bash
uv run python -m ragc.test.swe_qa_inference \
    -c configs/swe_qa_bench/gnn/greedy.yml \
    -o output/swe_qa_bench/gnn
```

This produces per-repo JSONL files (e.g. `flask.jsonl`, `django.jsonl`) in the output directory. Each line contains `question`, `final_answer`, and `retrieved_context`.
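For a quick per-repo summary before scoring, you can count non-empty answers. This is a minimal sketch using only the field names given above:

```python
# Hedged sketch: summarize the per-repo answer files. Field names come from
# the description above; the output path matches the -o flag used earlier.
import json
from pathlib import Path

for path in sorted(Path("output/swe_qa_bench/gnn").glob("*.jsonl")):
    with open(path) as f:
        records = [json.loads(line) for line in f]
    answered = sum(bool(r.get("final_answer")) for r in records)
    print(f"{path.stem}: {answered}/{len(records)} questions answered")
```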
Use the SWE-QA-Bench scorer to evaluate answers against the reference answers:
```bash
cd SWE-QA-Bench
# set OPENAI_API_KEY, OPENAI_BASE_URL, MODEL, METHOD in .env
python -m SWE-QA-Bench.score.main
```

Scores are written to `SWE-QA-Bench/datasets/scores/<model>/<method>/`.