Finish Evaluation Module #42

The [evaluation module](https://github.com/huggingface/smol-course/tree/main/4_evaluation) is not complete. It needs a finalised structure, more information, and exercises.

### Structure
Here is a basic proposal for a structure:
- what evaluation is
- the well-known benchmarks, their limitations, and some alternatives people have set up (arenas, LLM judges)
- why you should run your own evals for your own use case
- project on domain-specific evaluation
- notebook on comparing models

### Comments
- add a brief mention of human-based Elo rankings and LLM-as-a-judge
- notebook for implementing a custom eval (there is one in the evaluation guidebook; it could make sense to point to it for further analysis and background)
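
For the Elo mention, something like the following could serve as a starting point. It is only a minimal sketch, assuming the standard Elo formula with a 400-point scale and K = 32; the function names are hypothetical and are not taken from the evaluation guidebook or any arena implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings after one pairwise comparison.

    score_a is 1.0 if a human preferred model A, 0.0 if they preferred
    model B, and 0.5 for a tie. k (assumed 32 here) controls the step size.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start level; a human annotator prefers model A once.
model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
print(model_a, model_b)
```

With equal starting ratings the expected score is 0.5, so the winner gains exactly K/2 points and the loser drops by the same amount.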

- [ ] Refactor to basic structure and add TODOs
- [ ] Add all information and references from the evaluation guidebook
- [ ] Update projects
- [ ] Update notebook with exercises 
