Environment Setup

Before running any of the pipeline stages, make sure the local environment matches the assumptions in the repository documentation and Python scripts.

Required Tooling

The workflow assumes the following tools are available:

  • Conda or Miniconda for environment management
  • Python with the dependencies defined by environment.yml
  • A Hugging Face account with access to the base model
  • A CUDA-capable GPU if you want to run the training and inference stages as written

The fine-tuning and inference scripts are optimized for modern NVIDIA hardware, especially when BF16 and Flash Attention 2 are available.
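The tooling list above can be verified with a small preflight check before starting any stage. This is a sketch, not part of the repository; `nvidia-smi` is used here as a stand-in probe for "a CUDA-capable GPU is present":

```python
import shutil

def missing_tools(tools):
    """Return the subset of command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Tools the workflow expects; nvidia-smi approximates the GPU check.
expected = ["conda", "huggingface-cli", "nvidia-smi"]
print("Missing:", missing_tools(expected))
```

An empty list means the basic tooling assumptions hold; anything reported missing should be installed before continuing.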

Expected Setup Flow

The documented setup sequence is:

conda env update --file environment.yml --prune
conda activate Mistral-FineTuning-Lab
huggingface-cli login

The first command creates or updates the Python environment. The second activates it. The third authenticates the machine against Hugging Face so the base model can be downloaded.
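A script can defensively confirm that the right environment is active before doing any work. This sketch relies on conda's standard `CONDA_DEFAULT_ENV` variable and the environment name from the commands above:

```python
import os

def active_conda_env(environ=os.environ):
    """Name of the currently activated conda environment, or None."""
    return environ.get("CONDA_DEFAULT_ENV")

if active_conda_env() != "Mistral-FineTuning-Lab":
    print("Warning: expected the Mistral-FineTuning-Lab environment to be active.")
```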

Configuration Files The Pipeline Expects

Two repository-level files are referenced throughout the workflow:

  • environment.yml
  • config.ini

In this documentation workspace snapshot, neither file is present locally. In the public project repository, both files are available.

That matters because:

  • environment.yml defines the Python environment needed by the scripts
  • config.ini provides dataset paths, tokenizer settings, model selection, and fine-tuning outputs
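Since config.ini is not present in this snapshot, the section and key names below are purely illustrative; the real file in the upstream repository is authoritative. The scripts read it with Python's standard `configparser`, which a sketch like this mirrors:

```python
import configparser

# Hypothetical layout -- the actual sections, keys, and values live in
# the upstream repository's config.ini.
EXAMPLE = """
[dataset]
train_path = data/train.jsonl

[model]
base_model = mistralai/Mistral-7B-v0.1

[output]
adapter_dir = outputs/lora-adapter
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE)
print(config["model"]["base_model"])
```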

Without them in this workspace, the documentation remains useful as a code walk-through, but a full end-to-end run should start from the upstream repository where those configuration files already exist.

Start From The Public Repository

If you want to reproduce the pipeline rather than just read it, use the public repository as your working copy.

That keeps the docs focused on explanation while GitHub remains the place where you fetch the runnable project files.

Hugging Face Authentication

The base model download requires authentication. The expected flow is:

  1. Create a Hugging Face account if you do not already have one.
  2. Generate an access token.
  3. Run huggingface-cli login and paste the token when prompted.

If the login step is skipped, the tokenizer, fine-tuning, or inference stages may fail during model loading.

Practical Hardware Expectations

The codebase is designed around a resource-constrained but capable single-machine setup:

  • 4-bit loading through bitsandbytes
  • BF16 computation when supported
  • LoRA adapters instead of full fine-tuning
  • Flash Attention 2 when the package is installed and the GPU supports it

In other words, this is not a distributed training workflow. It is a focused local pipeline aimed at making a 7B model trainable on prosumer hardware.
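A rough back-of-the-envelope calculation shows why the 4-bit loading matters. The parameter count and the exclusion of activations, KV cache, and optimizer state are simplifying assumptions, not measurements:

```python
PARAMS = 7e9  # ~7B parameters in the base model

def weight_gib(params, bits_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return params * bits_per_param / 8 / 2**30

fp16_gib = weight_gib(PARAMS, 16)      # roughly 13 GiB of weights
four_bit_gib = weight_gib(PARAMS, 4)   # roughly 3.3 GiB of weights
print(f"FP16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {four_bit_gib:.1f} GiB")
```

Activations, the KV cache, and the LoRA adapters' optimizer state add memory on top, but quantizing the frozen base weights is what brings the model within reach of a single prosumer GPU.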

Once the environment assumptions are clear, continue in this order:

  1. Dataset Preparation
  2. Tokenization & ChatML
  3. Fine-Tuning with QLoRA
  4. Testing & Inference