Finetuning
Finetuning must be performed in an environment with a GPU. We used a Jupyter environment with CUDA and an NVIDIA A100. A GPU with 24 GB of VRAM should be sufficient; GPUs with less than 16 GB of VRAM are not.
The process relies on files in other directories of the project, so make the whole repository available in the environment.
Work in the finetuning directory.
Run finetuning
We provide a helper script, finetune.sh, that performs the finetuning for VQL, OCL, and Java, provided the input files are supplied correctly. It only works with the LLMs supported by the Python code.
./finetune.sh codellama/CodeLlama-7b-hf codellama-7b
./finetune.sh deepseek-ai/deepseek-coder-1.3b-base deepseek-coder-1.3b
./finetune.sh deepseek-ai/deepseek-coder-7b-base-v1.5 deepseek-coder-7b
./finetune.sh Qwen/Qwen3-1.7B-Base qwen3-1.7b
./finetune.sh Qwen/Qwen3-8B-Base qwen3-8b
./finetune.sh Qwen/Qwen2.5-Coder-7B qwen2.5-coder-7b
./finetune.sh Qwen/Qwen2.5-Coder-1.5B qwen2.5-coder-1.5b
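All seven commands follow the same calling pattern, ./finetune.sh with a Hugging Face model id followed by a short alias. If you want to run them back to back, a small driver loop can help; this loop is a sketch and not part of the repository, and it only prints each command (drop the echo to actually execute them):

```shell
# Model id / alias pairs, mirroring the commands listed above.
models=(
  "codellama/CodeLlama-7b-hf codellama-7b"
  "deepseek-ai/deepseek-coder-1.3b-base deepseek-coder-1.3b"
  "deepseek-ai/deepseek-coder-7b-base-v1.5 deepseek-coder-7b"
  "Qwen/Qwen3-1.7B-Base qwen3-1.7b"
  "Qwen/Qwen3-8B-Base qwen3-8b"
  "Qwen/Qwen2.5-Coder-7B qwen2.5-coder-7b"
  "Qwen/Qwen2.5-Coder-1.5B qwen2.5-coder-1.5b"
)
for pair in "${models[@]}"; do
  # Dry run: print the invocation; remove "echo" to finetune for real.
  echo ./finetune.sh $pair
done
```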
Prompt LLMs
Initialize an empty database for collecting the responses.
sqlite3 evaluation.db < ../dataset_construction/schema-eval.sql
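After initializing, you can verify that the schema was applied by listing the tables. The snippet below is a self-contained sketch of this initialize-then-verify flow; the CREATE TABLE statement is a hypothetical stand-in so the example runs on its own, whereas the real schema comes from ../dataset_construction/schema-eval.sql:

```shell
# Create a throwaway database from a stand-in schema (hypothetical table name;
# the real one is defined in schema-eval.sql).
db=demo-eval.db
rm -f "$db"
printf 'CREATE TABLE responses (id INTEGER PRIMARY KEY, model TEXT, response TEXT);\n' | sqlite3 "$db"
# ".tables" lists all tables, confirming the schema was applied.
sqlite3 "$db" ".tables"
```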
Run the following commands to prompt the LLMs and save the responses to the database. The default database is evaluation.db. To use a different location, set the --db parameter inside the promptllm.sh script to the path of the database.
./promptllm.sh codellama/CodeLlama-7b-hf codellama-7b
./promptllm.sh deepseek-ai/deepseek-coder-1.3b-base deepseek-coder-1.3b
./promptllm.sh deepseek-ai/deepseek-coder-7b-base-v1.5 deepseek-coder-7b
./promptllm.sh Qwen/Qwen2.5-Coder-1.5B qwen2.5-coder-1.5b
./promptllm.sh Qwen/Qwen3-8B-Base qwen3-8b
./promptllm.sh Qwen/Qwen2.5-Coder-7B qwen2.5-coder-7b
./promptllm.sh Qwen/Qwen3-1.7B-Base qwen3-1.7b
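Once the runs finish, a quick SQL spot check shows how many responses were collected per model. The schema and data below are hypothetical stand-ins so the example is self-contained; against the real evaluation.db, use the table and column names actually defined in ../dataset_construction/schema-eval.sql:

```shell
# Build a small demo database with a stand-in schema and sample rows.
db=demo-eval.db
rm -f "$db"
sqlite3 "$db" <<'SQL'
CREATE TABLE responses (id INTEGER PRIMARY KEY, model TEXT, response TEXT);
INSERT INTO responses (model, response) VALUES ('codellama-7b', '...');
INSERT INTO responses (model, response) VALUES ('codellama-7b', '...');
INSERT INTO responses (model, response) VALUES ('qwen3-8b', '...');
SQL
# Count collected responses per model.
sqlite3 "$db" "SELECT model, COUNT(*) FROM responses GROUP BY model ORDER BY model;"
```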