Evaluation indicators need to be updated 

Hi,
 Thanks for this great project! However, the evaluation procedure is incorrect, which leads to overestimated results. Specifically, the project runs the test-suite evaluation against `database`, the database set used for the original execution accuracy metric. According to the [official evaluation project](https://github.com/taoyds/test-suite-sql-eval), the test-suite evaluation should be run against the new `database_ts` instead of `database`, so the correct results will be lower. Here are my evaluation results for CodeLLama-13B-instruct-lora (same parameter config as the one you provide): 78.1 on the original `database` and 70.9 on the correct `database_ts`.

![Screenshot 2023-11-02 20 31 07](https://github.com/eosphoros-ai/DB-GPT-Hub/assets/38073482/a6cb301d-9e7c-40b7-917e-87d2912a2210)
![Screenshot 2023-11-02 20 31 24](https://github.com/eosphoros-ai/DB-GPT-Hub/assets/38073482/8735fde0-09c2-4410-9ea7-ce945bfae1c1)
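
To make the difference concrete, here is a minimal sketch of how the evaluation call could select the database directory. The flags `--gold`, `--pred`, `--db`, `--table`, and `--etype` are the ones accepted by `evaluation.py` in the official test-suite-sql-eval project. The helper name `build_eval_command`, the `data/spider` layout, and the location of `tables.json` are hypothetical and only for illustration; this is not the project's actual evaluation code.

```python
import sys
from pathlib import Path


def build_eval_command(spider_root, gold_file, pred_file,
                       etype="exec", use_test_suite=True):
    """Build the argument list for test-suite-sql-eval's evaluation.py.

    Assumption: spider_root contains both `database` (original Spider
    databases) and `database_ts` (test-suite databases), plus tables.json.
    """
    root = Path(spider_root)
    # The point of this issue: test-suite accuracy must be computed
    # against database_ts, not the original database directory.
    db_dir = root / ("database_ts" if use_test_suite else "database")
    return [
        sys.executable, "evaluation.py",
        "--gold", str(gold_file),
        "--pred", str(pred_file),
        "--db", str(db_dir),
        "--table", str(root / "tables.json"),
        "--etype", etype,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    cmd = build_eval_command("data/spider", "gold.sql", "pred.sql")
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, cwd=...)` from a checkout of the evaluation project. Running it once with `use_test_suite=False` and once with `use_test_suite=True` should show the kind of gap reported above.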


Evaluation indicators need to be updated #119
