# LLM_Evaluator
A simple program for evaluating large language models.
## Recommended Requirements
- Python 3.8
- torch 1.13.1+cu117
- transformers 4.33.2
- accelerate 0.26.1
- tqdm 4.66.1
- openai 0.28
## Additional Required Files
- Download the [GLM model](https://hf-mirror.com/THUDM/chatglm-6b) and place it in the `./THUDM/chatglm-6b` folder
- Download the [GLM2 model](https://hf-mirror.com/THUDM/chatglm2-6b) and place it in the `./THUDM/chatglm2-6b` folder
- Fine-tuned LoRA models can be placed in the `./lora` folder; they can be applied to ChatGLM2
- Fine-tuned P-Tuning models can be placed in the `./ptuning` folder; they can be applied to ChatGLM
- Training data follows the C-Eval format and is placed in the `./data` folder; file names correspond to the `subject_name` values used in `eval.py`
- In addition to the C-Eval datasets, the code adds a 'qa' dataset in the `./data/qa` folder, a question-answering dataset with non-multiple-choice questions (a sketch of the C-Eval layout follows this list)
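For reference, here is a minimal sketch of the C-Eval CSV layout assumed under `./data`. The column names follow the public C-Eval release; the file path and subject name below are hypothetical and must match whatever `eval.py` expects.

```python
# Minimal sketch of reading a C-Eval-style subject file (hypothetical path).
# Public C-Eval multiple-choice columns: id, question, A, B, C, D, answer
import pandas as pd

df = pd.read_csv("./data/val/computer_network_val.csv")  # hypothetical subject file
print(df[["question", "A", "B", "C", "D", "answer"]].head(1))
```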
## Run
```bash
python eval.py --model_name chatglm --cuda_device 0 --finetune ptuning1
```
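A fuller invocation might look like the following; `my_lora` is a hypothetical folder name under `./lora`, and the remaining flags are documented below.

```bash
# Hypothetical: evaluate ChatGLM2 with a LoRA checkpoint, 5-shot, with chain-of-thought
python eval.py --model_name chatglm2 --cuda_device 0 --finetune my_lora --few_shot --ntrain 5 --cot
```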
## Arguments
- `--model_name` : model name; one of `chatglm`, `chatglm2`
- `--cuda_device` : GPU index
- `--finetune` : name of the fine-tuned model, i.e. the folder name under `./lora` or `./ptuning`
- `--few_shot` : use few-shot examples in the prompt (optional)
- `--ntrain` : number of few-shot examples (optional)
- `--cot` : use chain-of-thought prompting (optional)
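For orientation, the sketch below shows how a CLI with these flags could be declared using `argparse`. It is an assumption about `eval.py`'s interface, not its actual code; defaults and help strings are illustrative.

```python
# Sketch of an argparse CLI matching the documented flags (assumed, not actual eval.py code).
import argparse

parser = argparse.ArgumentParser(description="Evaluate ChatGLM models on C-Eval-style data")
parser.add_argument("--model_name", choices=["chatglm", "chatglm2"], required=True,
                    help="base model to load from ./THUDM")
parser.add_argument("--cuda_device", type=int, default=0, help="GPU index")
parser.add_argument("--finetune", default=None,
                    help="folder name under ./lora (ChatGLM2) or ./ptuning (ChatGLM)")
parser.add_argument("--few_shot", action="store_true", help="prepend few-shot exemplars")
parser.add_argument("--ntrain", type=int, default=5, help="number of few-shot exemplars")
parser.add_argument("--cot", action="store_true", help="use chain-of-thought prompting")
args = parser.parse_args()
print(args)
```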