DIBench/README.md

DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale

Ensure that Docker engine is installed and running on your machine.

Important

Our testing infrastructure requires ⚙️sysbox (a Docker runtime) to be installed on your system to ensure isolation and security.
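A quick way to confirm both prerequisites before running anything is sketched below. It assumes the sysbox runtime is registered with Docker under the name `sysbox-runc` and reads the runtime list from `docker info`; the helper names `has_sysbox` and `check_environment` are illustrative and are not part of this repository.

```python
import json
import shutil
import subprocess


def has_sysbox(runtimes_json: str) -> bool:
    """Return True if the JSON runtime map from `docker info` lists sysbox-runc."""
    text = runtimes_json.strip()
    if not text:
        return False
    runtimes = json.loads(text) or {}
    # Assumption: sysbox registers itself under the runtime name "sysbox-runc".
    return "sysbox-runc" in runtimes


def check_environment() -> bool:
    """Check that the docker CLI exists, the daemon answers, and sysbox is registered."""
    if shutil.which("docker") is None:
        return False
    proc = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return False
    return has_sysbox(proc.stdout)
```

Calling `check_environment()` returns `False` when Docker is missing, the daemon is not reachable, or sysbox is not among the registered runtimes.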

# Suggested Python version: 3.10
poetry install

Dataset

regular.jsonl and large.jsonl are the dataset files for the regular set and the large set, respectively.
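Both files are in JSON Lines format, so they can be inspected with a few lines of Python. The sketch below only assumes one JSON value per line; it makes no assumption about the fields inside each record, and `load_jsonl` is an illustrative helper rather than a function from this repository.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, `len(load_jsonl("regular.jsonl"))` gives the number of instances in the regular set.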

Unzip Repository Instances

The repository archives can be downloaded from https://doi.org/10.5281/zenodo.14499906

unzip repo-large.zip
unzip repo-regular.zip
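On systems without the `unzip` tool, the same step can be done with Python's standard library. This is a minimal sketch: `extract_archive` is an illustrative helper, and the expectation that the archives unpack into `repo-regular` and `repo-large` is an assumption based on the `--repo_cache` values used in the commands in this README.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest="."):
    """Extract zip_path into dest and return the sorted top-level entry names."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        # Collect the first path component of every member to show what was unpacked.
        return sorted({Path(name).parts[0] for name in zf.namelist() if name})
```

For example, `extract_archive("repo-regular.zip")` unpacks the archive into the current directory and returns the names of the top-level entries it created.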

Run All-In-One

Prepare prompts

python -m bigbuild.make_prompts \
     --result_path prompts-regular.jsonl \
     --dataset_name_or_path regular.jsonl \
     --repo_cache repo-regular

Generate

# Results are saved in results/all-in-one/[model].
# If you are using a vLLM server, also pass: --base_url [base_url]
python -m bigbuild.buildgen \
    --prompt_path prompts-regular.jsonl \
    --target_dir results/all-in-one \
    --model [model] \
    --backend "openai"

Run Imports-Only

Prepare prompts

poetry run python -m bigbuild.make_prompts \
     --result_path [prompts-regular-imports.jsonl|prompts-large-imports.jsonl] \
     --dataset [regular.jsonl|large.jsonl] \
     --pattern

Generate

python -m bigbuild.buildgen \
    --prompt_path [prompts-regular-imports.jsonl|prompts-large-imports.jsonl] \
    --target_dir results/imports \
    --model_name [model] \
    --backend "openai"

Run File-Iterate

# If you are using a vLLM server, also pass: --base_url [base_url]
python -m bigbuild.inference.run_builder \
    --model [model] \
    --backend "openai" \
    --dataset_name_or_path [regular.jsonl|large.jsonl] \
    --repo_cache [repo-regular/|repo-large/]

Evaluation

# --result_dir is the root of the results generated by one of the three baselines.
# Evaluation of JSON-format results is a work in progress.
python -m bigbuild.eval \
    --result_dir results/[all-in-one|imports|file-iter]/[model] \
    --repo_cache [repo-regular|repo-large] \
    --dataset_name_or_path [regular|large].jsonl
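When evaluating several baselines in a row, it can help to build the evaluation command programmatically. The sketch below only assembles the argument list from the flags shown above; `eval_command` is an illustrative helper, the `results/<method>/<model>` layout follows the placeholders in this README, and the model name in the usage note is hypothetical.

```python
import sys

METHODS = ("all-in-one", "imports", "file-iter")
SPLITS = ("regular", "large")


def eval_command(method, model, split):
    """Build the argv list for one bigbuild.eval run, mirroring the flags shown above."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return [
        sys.executable, "-m", "bigbuild.eval",
        "--result_dir", f"results/{method}/{model}",
        "--repo_cache", f"repo-{split}",
        "--dataset_name_or_path", f"{split}.jsonl",
    ]
```

The list can be passed to `subprocess.run(eval_command("imports", "my-model", "regular"), check=True)`. Passing a list instead of a shell string avoids quoting problems with model names.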

Run Experiments with vLLM

Start the server

pip install vllm
vllm serve [model] --port [port] --trust-remote-code
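Once the server is up, the value to pass as `--base_url` can be derived from the port, and the endpoint can be checked before starting a long run. This sketch assumes the server runs on the local machine and exposes its OpenAI-compatible API under `/v1`; the helper names `base_url` and `list_models` are illustrative.

```python
import json
import urllib.request


def base_url(port, host="localhost"):
    """Return the OpenAI-compatible base URL for a server on the given port."""
    return f"http://{host}:{port}/v1"


def list_models(url, timeout=5.0):
    """Return the model ids reported by the server's /models endpoint."""
    with urllib.request.urlopen(f"{url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]
```

For example, `list_models(base_url(8000))` should include the model that was passed to `vllm serve`, assuming port 8000 was used. A connection error means the server is not ready yet.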
