# Mistral Fine-Tuning Lab
This guide documents a complete Mistral fine-tuning workflow built around four stages: dataset preparation, ChatML tokenization, QLoRA training, and inference testing.
## What This Guide Covers
The workflow is organized as a reproducible fine-tuning pipeline:
- Prepare the dataset from OpenAssistant Guanaco.
- Tokenize conversations in ChatML format and apply label masking.
- Fine-tune a base Mistral model with QLoRA.
- Load the adapter and test the resulting agent interactively.
Each stage has its own page so you can read the workflow in order or jump directly to the part you need.
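To make the first two stages concrete, here is a minimal stdlib-only sketch of the ChatML contract the pipeline targets. The `<|im_start|>` / `<|im_end|>` tokens and role names follow the public ChatML convention; the helper name `to_chatml` is illustrative and does not come from the repository.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML string.

    Hypothetical helper: the real repository scripts may structure this
    differently, but the rendered format is the standard ChatML layout.
    """
    parts = [
        f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>"
        for msg in messages
    ]
    return "\n".join(parts) + "\n"

# Toy conversation in the shape the prepared dataset would hold.
conversation = [
    {"role": "user", "content": "What is QLoRA?"},
    {"role": "assistant", "content": "A memory-efficient fine-tuning method."},
]
print(to_chatml(conversation))
```

Keeping this rendering identical at training and inference time is the central contract the rest of the pipeline depends on.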
## Before You Start
Read Environment Setup first if you want the repository prerequisites, expected tooling, and the local snapshot differences that matter before you run the project yourself.
## Companion Repository
This documentation is the narrative layer of the project. The executable source of truth lives in the public repository:
Use the docs to understand the pipeline, tradeoffs, and implementation details. Use the repository to run the code, inspect the full file tree, and work from the real project files instead of copying snippets out of the site.
- Mistral-Fine-Tuning-Lab repository: the runnable working copy.
- Project README: start here for setup and execution flow.
## Pipeline Map
| Step | Main file | Reads | Produces | Why it matters |
|---|---|---|---|---|
| Dataset Preparation | 1_Dataset/prepare_dataset.py | timdettmers/openassistant-guanaco | prepared_dataset_chatml | Converts the raw corpus into a consistent ChatML contract. |
| Tokenization | 2_Tokenizer/tokenizer.py | prepared_dataset_chatml | tokenized_dataset_chatml | Adds ChatML tokens and masks user-side labels. |
| Fine-Tuning | 3_FineTuning/fineTuning.py | tokenized_dataset_chatml | mistral-7b-chatml-adapter | Trains a LoRA adapter on top of a quantized base model. |
| Testing Agent | 4_Testing_agent/chat_agent.py | Base model + adapter | Interactive chat loop | Reuses the same ChatML template during inference. |
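The "masks user-side labels" step in the tokenization row can be sketched without the real tokenizer. This assumes the common Hugging Face convention that a label of `-100` is ignored by the cross-entropy loss, so only assistant tokens contribute to training; the function name and toy token IDs below are illustrative, not taken from `2_Tokenizer/tokenizer.py`.

```python
# -100 is the ignore index used by PyTorch cross-entropy and, by
# convention, by Hugging Face training code.
IGNORE_INDEX = -100

def mask_labels(input_ids, assistant_mask):
    """Copy input_ids into labels, masking every non-assistant position."""
    return [
        tok if is_assistant else IGNORE_INDEX
        for tok, is_assistant in zip(input_ids, assistant_mask)
    ]

# Toy example: positions 0-2 belong to the user turn, 3-5 to the reply.
input_ids = [101, 7592, 102, 2054, 2003, 102]
assistant_mask = [False, False, False, True, True, True]
labels = mask_labels(input_ids, assistant_mask)
# labels -> [-100, -100, -100, 2054, 2003, 102]
```

The effect is that the model is never penalized for failing to predict the user's words, only its own replies.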
## How To Read The Tutorial
- Start with Environment Setup if you are preparing a machine to run the pipeline.
- Start with Dataset Preparation if you want the pipeline in sequence.
- Jump to Tokenization if you care most about ChatML special tokens and masking.
- Use Fine-Tuning and Inference Testing together, because training and inference share the same prompt format.
- Keep the glossary pages nearby when reading the implementation details.
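Because training and inference share the same prompt format, the inference loop typically renders history with the identical ChatML template, opens an assistant turn, and trims generation at the end-of-turn token. A hedged stdlib-only sketch, with illustrative function names that are assumptions rather than the repository's actual API:

```python
END_TOKEN = "<|im_end|>"

def build_prompt(history):
    """Render chat history in ChatML and open an assistant turn."""
    rendered = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}{END_TOKEN}\n"
        for m in history
    )
    return rendered + "<|im_start|>assistant\n"

def trim_reply(generated_text):
    """Cut raw model output at the first end-of-turn token."""
    return generated_text.split(END_TOKEN, 1)[0].strip()

# Simulated raw generation that runs past the end-of-turn marker.
raw = "The adapter adds low-rank updates.<|im_end|>\n<|im_start|>user..."
print(trim_reply(raw))  # "The adapter adds low-rank updates."
```

If the template used here drifted from the one used during fine-tuning, the adapter would see prompts it was never trained on, which is why the two pages are best read together.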
## Snapshot Notes
Two project-level files referenced throughout the workflow are not present in this documentation workspace snapshot, but they do exist in the upstream public repository:
The pages below keep those dependencies explicit, but they do not invent local values for files that are outside this site repository.
- config.ini: tokenizer, dataset, and model configuration.
- environment.yml: Conda environment definition for the lab.
## Suggested Reading Order
- Environment Setup
- Dataset Preparation
- Tokenization & ChatML
- Fine-Tuning with QLoRA
- Testing & Inference
- Fine-Tuning Glossary
- Inference Glossary
## Baseline Environment Flow
The expected setup flow is:

```bash
conda env update --file environment.yml --prune
conda activate Mistral-FineTuning-Lab
huggingface-cli login
```
Treat these commands as the expected contract of the workflow, and verify them against the files in the public repository before running the pipeline end to end.
Continue with Environment Setup.