Chapter 1: Configuration Management (`RunConfig` / `OptimizeConfig`)
Welcome to the `pytorch_template` project! This tutorial will guide you through its core components, helping you understand how to run and manage your machine learning experiments effectively.
Let’s start with a fundamental question: When you run a machine learning experiment, how do you keep track of all the settings you used? What kind of model did you train? What learning rate did you use? How many training cycles (epochs) did you run? If you want to run the exact same experiment again later, or just change one small setting, how do you do that easily?
This is where Configuration Management comes in.
The Problem: Keeping Track of Experiment Details
Imagine you’re baking a cake. You need a recipe! The recipe tells you:
- What ingredients to use (flour, sugar, eggs)
- How much of each ingredient (2 cups flour, 1 cup sugar, 3 eggs)
- The instructions (mix dry ingredients, add wet ingredients, bake at 180°C for 30 minutes)
If you just mix things randomly, you might get a cake, but you probably won’t be able to make the same delicious cake again. And if you want to try making it a bit sweeter next time, you need to know how much sugar you used this time.
Machine learning experiments are similar. You have many “ingredients” and “instructions”:
- Model: What kind of neural network? How many layers? How many nodes per layer?
- Training: What learning rate? How many epochs? What batch size?
- Optimizer: Which algorithm to use for updating the model (e.g., AdamW, SGD)?
- Device: Run on CPU or GPU?
- Seeds: Random numbers for initialization (important for reproducibility).
Manually keeping track of all this is messy and error-prone. You might forget a setting, or make a typo when setting it up again.
The Solution: Recipe Cards for Experiments (YAML + `RunConfig`)
`pytorch_template` solves this using configuration files, specifically YAML files (`.yaml`). Think of a YAML file as the master recipe card for your experiment. It clearly lists all the “ingredients” and “settings.”
We then use a special Python tool called `RunConfig` to read this recipe card and set up the experiment exactly as described.
What is a YAML File?
YAML (YAML Ain’t Markup Language) is a human-friendly way to write down structured data. It uses indentation (spaces) to show structure.
Here’s a simplified example of what a configuration file might look like (`configs/run_template.yaml`):
```yaml
# configs/run_template.yaml (Simplified)
project: MyCoolProject
device: cuda:0              # Use the first GPU

# --- Model Settings ---
net: model.MLP              # Which model code to use (MLP = Multi-Layer Perceptron)
net_config:
  nodes: 64                 # Number of nodes in hidden layers
  layers: 4                 # Number of hidden layers

# --- Training Settings ---
optimizer: torch.optim.adamw.AdamW  # Optimizer algorithm
optimizer_config:
  lr: 0.001                 # Learning rate

epochs: 50                  # Number of training cycles
batch_size: 256             # How many data samples per step
seeds: [42]                 # Random seed for reproducibility
```
This file clearly defines the key settings for a single experiment run. It’s easy to read and easy to change. Want to try a different learning rate? Just change `lr: 0.001` to `lr: 0.0005`. Want to use a different model? Change the `net` value.
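If you are curious what Python actually sees when this file is read, here is a small, self-contained sketch using the PyYAML library (the same `yaml` package the project relies on). It parses a fragment of the recipe above and prints the resulting nested dictionary; the `yaml_text` string is just an inline stand-in for the file contents.

```python
# A small sketch: parse a fragment of the recipe above with PyYAML (pip install pyyaml).
# yaml_text stands in for the contents of configs/run_template.yaml.
import yaml

yaml_text = """
project: MyCoolProject
net_config:
  nodes: 64
  layers: 4
optimizer_config:
  lr: 0.001
seeds: [42]
"""

config_data = yaml.safe_load(yaml_text)  # YAML text -> nested Python dict
print(config_data["project"])                        # MyCoolProject
print(config_data["net_config"]["layers"])           # 4
print(type(config_data["optimizer_config"]["lr"]))   # <class 'float'>
print(config_data["seeds"])                          # [42]
```

Notice how the indentation in the YAML becomes nesting in the dictionary, and how numbers and lists are converted to the matching Python types automatically.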
What is `RunConfig`?
`RunConfig` is a Python class (defined in `config.py`) designed to read these YAML files. It acts like the “chef” who reads the recipe card (YAML) and knows how to use that information.
```python
# config.py (Conceptual Snippet)
from dataclasses import dataclass
import yaml

@dataclass  # Makes creating classes for storing data easier
class RunConfig:
    project: str
    device: str
    seeds: list[int]
    net: str
    optimizer: str
    epochs: int
    batch_size: int
    net_config: dict
    optimizer_config: dict
    # ... other fields like scheduler, early stopping ...

    @classmethod
    def from_yaml(cls, path: str):
        # Reads the YAML file from the given path
        with open(path, "r") as file:
            config_data = yaml.safe_load(file)
        # Creates a RunConfig object using the data
        return cls(**config_data)

    # ... methods to create model, optimizer etc. ...
```
- The `@dataclass` decorator helps automatically set up the class to store the data fields (like `project`, `device`, etc.).
- The `from_yaml` method is crucial: it takes the path to a YAML file, reads it using the `yaml` library, and creates a `RunConfig` object holding all those settings (a small sketch of this unpacking follows below).
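To make that dictionary-unpacking step concrete, here is a tiny, self-contained sketch. The `TinyConfig` class and its fields are made up for illustration and are not the full `RunConfig`.

```python
# A tiny sketch of what @dataclass gives you: a generated __init__ whose
# parameters match the field names, so a dictionary with the same keys
# can be unpacked straight into it. TinyConfig is made up for illustration.
from dataclasses import dataclass

@dataclass
class TinyConfig:
    project: str
    epochs: int
    net_config: dict

data = {"project": "MyCoolProject", "epochs": 50, "net_config": {"layers": 4}}

cfg = TinyConfig(**data)  # same trick as cls(**config_data) in from_yaml
print(cfg)         # TinyConfig(project='MyCoolProject', epochs=50, net_config={'layers': 4})
print(cfg.epochs)  # 50
```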
How to Use `RunConfig`
In your main script (which we’ll explore more in Chapter 2: Experiment Execution Orchestration (`main.py`)), you’ll typically load the configuration like this:
```python
# main.py (Conceptual Snippet)
from config import RunConfig

# 1. Specify the path to your recipe card
config_path = "configs/run_template.yaml"

# 2. Load the configuration using RunConfig
run_config = RunConfig.from_yaml(config_path)

# 3. Now you can access all settings easily
print(f"Project Name: {run_config.project}")
print(f"Device: {run_config.device}")
print(f"Model Layers: {run_config.net_config['layers']}")
print(f"Learning Rate: {run_config.optimizer_config['lr']}")

# Output:
# Project Name: MyCoolProject
# Device: cuda:0
# Model Layers: 4
# Learning Rate: 0.001

# Later, use run_config to set up the experiment:
# model = run_config.create_model()               # See Chapter 4
# optimizer = run_config.create_optimizer(model)  # See Chapter 3
# ... and so on ...
```
By loading the YAML into `RunConfig`, your Python code gets easy access to all the settings needed to set up and run the experiment consistently.
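Note that the `net` and `optimizer` entries hold dotted import paths (like `model.MLP` and `torch.optim.adamw.AdamW`). The project’s actual `create_model` and `create_optimizer` methods live in `config.py` and are covered in later chapters; as a purely hypothetical sketch of how such a string could be resolved into a class, one might write something like the following (the `load_class` helper is made up for illustration).

```python
# A hypothetical sketch (not the actual config.py implementation) of how a dotted
# path from the YAML, such as "torch.optim.adamw.AdamW", could be resolved into a
# class. The load_class helper name is made up for illustration.
import importlib

def load_class(dotted_path: str):
    """Split 'some.module.ClassName' and import the class dynamically."""
    module_path, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Possible usage with the settings loaded above:
# OptimizerCls = load_class(run_config.optimizer)  # e.g. torch.optim.adamw.AdamW
# optimizer = OptimizerCls(model.parameters(), **run_config.optimizer_config)
```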
Internal Implementation: Loading the Config
Let’s trace what happens when you call `RunConfig.from_yaml("configs/run_template.yaml")`:
```mermaid
sequenceDiagram
    participant UserCode as Your Script (e.g., main.py)
    participant RC as RunConfig Class
    participant YamlLib as PyYAML Library
    participant YamlFile as configs/run_template.yaml
    participant RCObject as RunConfig Instance

    UserCode->>+RC: Call from_yaml("configs/run_template.yaml")
    RC->>+YamlLib: open("configs/run_template.yaml", "r")
    YamlLib->>+YamlFile: Read content
    YamlFile-->>-YamlLib: Return file content
    YamlLib->>YamlLib: Parse content (yaml.safe_load)
    YamlLib-->>-RC: Return parsed data (as Python dictionary)
    RC->>+RCObject: Create new RunConfig instance (cls(**config_data))
    RCObject-->>-RC: New instance created
    RC-->>-UserCode: Return the created RunConfig instance
```
- Your script calls the `RunConfig.from_yaml` method with the file path.
- `RunConfig` uses standard Python file opening to read the `configs/run_template.yaml` file.
- It passes the file content to the PyYAML library’s `safe_load` function.
- PyYAML parses the text, understanding the structure (like `net_config` containing `nodes` and `layers`), and converts it into a Python dictionary.
- `RunConfig` receives this dictionary. The `cls(**config_data)` part unpacks the dictionary, matching keys (like `"project"` and `"device"`) to the `RunConfig` class fields, and creates a new `RunConfig` object (instance) filled with these values.
- This fully populated `RunConfig` object is returned to your script.
The code for this is quite concise:
```python
# config.py (Inside RunConfig class)
@classmethod
def from_yaml(cls, path: str):
    # Open the specified YAML file for reading ('r')
    with open(path, "r") as file:
        # Use the yaml library to safely load the structure.
        # This turns the YAML text into a Python dictionary.
        config_data = yaml.safe_load(file)

    # 'cls' refers to the RunConfig class itself.
    # '**config_data' unpacks the dictionary so that
    # key-value pairs become arguments to the constructor.
    # e.g., if config_data = {'project': 'MyCoolProject', ...}
    # this becomes RunConfig(project='MyCoolProject', ...)
    return cls(**config_data)
```
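One detail worth knowing: the constructor generated by `@dataclass` is strict, so `cls(**config_data)` raises a `TypeError` if the YAML contains a key that is not a field (for example, a typo like `epocs`). If you want a friendlier error message, here is a hedged variation on the same idea; it is not necessarily what `config.py` does, and `MiniConfig` and `load_config` are stand-ins for illustration.

```python
# A hedged variation (not necessarily what config.py does): check YAML keys against
# the dataclass fields so a typo like "epocs" fails with a readable error instead of
# a bare TypeError. MiniConfig and load_config are stand-ins for illustration.
from dataclasses import dataclass, fields
import yaml

@dataclass
class MiniConfig:
    project: str
    epochs: int

def load_config(path: str) -> MiniConfig:
    with open(path, "r") as file:
        config_data = yaml.safe_load(file)

    allowed = {f.name for f in fields(MiniConfig)}
    unknown = set(config_data) - allowed
    if unknown:
        raise ValueError(f"Unknown keys in {path}: {sorted(unknown)}")

    return MiniConfig(**config_data)
```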
What About Finding the Best Settings (`OptimizeConfig`)?
So far, `RunConfig` helps us manage the settings for one specific experiment run. But what if you don’t know the best learning rate or the optimal number of layers? You might want to try many different combinations automatically to find the best-performing setup.
This is called Hyperparameter Optimization. For this, `pytorch_template` uses a second type of configuration: `OptimizeConfig`.
Think of `OptimizeConfig` as the recipe for running many experiments to find the best `RunConfig`. It defines:
- Which settings (`RunConfig` parameters like `lr` or `layers`) you want to search over.
- What range of values to try for each setting.
- How many different combinations (trials) to run.
- Which metric to use to judge which run is “best” (e.g., lowest validation loss).
Here’s a peek at a simplified `configs/optimize_template.yaml`:
```yaml
# configs/optimize_template.yaml (Simplified)
study_name: FindBestMLP
trials: 20            # Try 20 different combinations
metric: val_loss      # Aim to minimize validation loss
direction: minimize

# Define the search space: which settings to vary
search_space:
  net_config:
    layers:
      type: int       # Layer count is an integer
      min: 3          # Try between 3 and 5 layers
      max: 5
  optimizer_config:
    lr:
      type: float     # Learning rate is a float
      min: 0.0001     # Try between 0.0001 and 0.01
      max: 0.01
      log: true       # Search learning rate on a logarithmic scale
```
`OptimizeConfig` works closely with a library called Optuna to automate this search process. We’ll dive deep into this in Chapter 5: Hyperparameter Optimization (Optuna Integration). For now, just know that `RunConfig` defines one run, and `OptimizeConfig` defines how to search for the best `RunConfig`.
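As a hedged preview of the mechanics (the real wiring lives in the project’s optimization code and is covered in Chapter 5), the `search_space` entries above roughly correspond to Optuna’s suggestion API. The objective function below is a placeholder stand-in, not the project’s training loop.

```python
# A hedged sketch of how the search_space above might map onto Optuna's API.
# The objective below is a placeholder; the project's real objective would build
# a RunConfig from the suggested values, train the model, and return val_loss.
import optuna

def objective(trial: optuna.Trial) -> float:
    layers = trial.suggest_int("net_config/layers", 3, 5)                  # type: int, min/max from the YAML
    lr = trial.suggest_float("optimizer_config/lr", 1e-4, 1e-2, log=True)  # float on a log scale
    # ... build a RunConfig with these values, train, evaluate ...
    return (layers - 4) ** 2 + lr  # placeholder stand-in for val_loss

study = optuna.create_study(study_name="FindBestMLP", direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```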
Conclusion
You’ve learned about the importance of configuration management for reproducible and organized machine learning experiments.
- YAML files act as human-readable “recipe cards” for your experiments.
- `RunConfig` is a Python class that reads these YAML files and holds all the settings for a single experiment run. It makes accessing settings in your code clean and easy.
- Using `RunConfig.from_yaml(path)` is the standard way to load your experiment setup.
- `OptimizeConfig` (which we’ll cover later) helps define how to search for the best set of configurations automatically.
By defining your experiments in configuration files, you make them easier to understand, modify, share, and reproduce – crucial aspects of any serious machine learning project.
Now that we know how to define all the settings for an experiment using `RunConfig`, let’s move on to see how the project actually uses this configuration to start and manage the experiment execution process.
Next Up: Chapter 2: Experiment Execution Orchestration (`main.py`)