Commit 6ccbc96

Update GDrive links for pretrained models and preprocessed data since the Stanford Drive will expire soon
1 parent 65d7ac8 commit 6ccbc96

1 file changed

Lines changed: 12 additions & 12 deletions

README.md

@@ -44,27 +44,27 @@ pip install torch-geometric==1.7.0 -f https://pytorch-geometric.com/whl/torch-1.
 ## 2. Download data
 
 ### Download and preprocess data yourself
-**Preprocessing the data yourself may take long, so if you want to directly download preprocessed data, please jump to the next subsection.**
+**Preprocessing the data yourself may take a long time, so if you want to directly download preprocessed data, please jump to the next subsection.**
 
-Download the raw ConceptNet, CommonsenseQA, OpenBookQA data by using
+Download the raw ConceptNet, CommonsenseQA, and OpenBookQA data by using
 ```
 ./download_raw_data.sh
 ```
 
-You can preprocess these raw data by running
+You can preprocess the raw data by running
 ```
 CUDA_VISIBLE_DEVICES=0 python preprocess.py -p <num_processes>
 ```
-You can specify the GPU you want to use in the beginning of the command `CUDA_VISIBLE_DEVICES=...`. The script will:
+You can specify the GPU you want to use at the beginning of the command `CUDA_VISIBLE_DEVICES=...`. The script will:
 * Setup ConceptNet (e.g., extract English relations from ConceptNet, merge the original 42 relation types into 17 types)
 * Convert the QA datasets into .jsonl files (e.g., stored in `data/csqa/statement/`)
-* Identify all mentioned concepts in the questions and answers
+* Identify all the mentioned concepts in the questions and answers
 * Extract subgraphs for each q-a pair
 
 The script to download and preprocess the [MedQA-USMLE](https://github.com/jind11/MedQA) data and the biomedical knowledge graph based on Disease Database and DrugBank is provided in `utils_biomed/`.
 
 ### Directly download preprocessed data
-For your convenience, if you don't want to preprocess the data yourself, you can download all the preprocessed data [here](https://drive.google.com/drive/folders/1T6B4nou5P3u-6jr0z6e3IkitO8fNVM6f?usp=sharing). Download them into the top-level directory of this repo and unzip them. Move the `medqa_usmle` and `ddb` folders into the `data/` directory.
+For your convenience, if you don't want to preprocess the data yourself, you can download all the preprocessed data [here](https://drive.google.com/drive/folders/16hEDRfkIaHyldyeUGqKG614fQByNhOPg?usp=sharing). Download them into the top-level directory of this repo and unzip them. Move the `medqa_usmle` and `ddb` folders into the `data/` directory.
 
 ### Resulting file structure
 
@@ -74,7 +74,7 @@ The resulting file structure should look like this:
 .
 ├── README.md
 ├── data/
-├── cpnet/ (prerocessed ConceptNet)
+├── cpnet/ (preprocessed ConceptNet)
 ├── csqa/
 ├── train_rand_split.jsonl
 ├── dev_rand_split.jsonl
@@ -93,7 +93,7 @@ To train GreaseLM on CommonsenseQA, run
 ```
 CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh csqa --data_dir data/
 ```
-You can specify up to 2 GPUs you want to use in the beginning of the command `CUDA_VISIBLE_DEVICES=...`.
+You can specify up to 2 GPUs you want to use at the beginning of the command `CUDA_VISIBLE_DEVICES=...`.
 
 Similarly, to train GreaseLM on OpenbookQA, run
 ```
@@ -106,18 +106,18 @@ CUDA_VISIBLE_DEVICES=0 ./run_greaselm__medqa_usmle.sh
 ```
 
 ## 4. Pretrained model checkpoints
-You can download a pretrained GreaseLM model on CommonsenseQA [here](https://drive.google.com/file/d/1QPwLZFA6AQ-pFfDR6TWLdBAvm3c_HOUr/view?usp=sharing), which achieves an IH-dev acc. of `79.0` and an IH-test acc. of `74.0`.
+You can download a pretrained GreaseLM model on CommonsenseQA [here](https://drive.google.com/file/d/1iu-d7Q23tUD_MYcYu9jmJintqtD9LPv_/view?usp=sharing), which achieves an IH-dev acc. of `79.0` and an IH-test acc. of `74.0`.
 
-You can also download a pretrained GreaseLM model on OpenbookQA [here](https://drive.google.com/file/d/1-QqyiQuU9xlN20vwfIaqYQ_uJMP8d7Pv/view?usp=sharing), which achieves an test acc. of `84.8`.
+You can also download a pretrained GreaseLM model on OpenbookQA [here](https://drive.google.com/file/d/1mE6hUK2CIAz6wrFucxDaXU913pwdawdv/view?usp=sharing), which achieves a test acc. of `84.8`.
 
-You can also download a pretrained GreaseLM model on MedQA-USMLE [here](https://drive.google.com/file/d/1j0QxiBiGbv0s9PhseSly6V6uiHWU5IEt/view?usp=sharing), which achieves an test acc. of `38.5`.
+You can also download a pretrained GreaseLM model on MedQA-USMLE [here](https://drive.google.com/file/d/1-P3hngsRfnflHAay6JnkNke069gYwdVb/view?usp=sharing), which achieves a test acc. of `38.5`.
 
 ## 5. Evaluating a pretrained model checkpoint
 To evaluate a pretrained GreaseLM model checkpoint on CommonsenseQA, run
 ```
 CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh csqa --data_dir data/ --load_model_path /path/to/checkpoint
 ```
-Again you can specify up to 2 GPUs you want to use in the beginning of the command `CUDA_VISIBLE_DEVICES=...`.
+Again, you can specify up to 2 GPUs you want to use in the beginning of the command `CUDA_VISIBLE_DEVICES=...`.
 
 Similarly, to evaluate a pretrained GreaseLM model checkpoint on OpenbookQA, run
 ```