fastml Tutorials
What this site is about
This site provides a structured introduction to fastml, an R package for training, evaluating, and comparing machine learning models under architecturally constrained, leakage-aware resampling.
The emphasis is not on developing new modeling techniques or maximizing predictive performance. Instead, the focus is on methodologically sound performance evaluation under clearly defined assumptions.
Why this matters
In applied machine learning, reported performance estimates are often optimistically biased, typically because of subtle data leakage or unsafe evaluation workflows.
fastml is designed to reduce this risk by treating guarded resampling as a core design principle rather than as an optional recommendation.
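The kind of leakage in question can be shown with a minimal sketch in plain R (this is not fastml's API; all names below are illustrative). Standardizing a predictor with statistics computed on the full data set lets held-out rows influence the "training" preprocessing, which is exactly what a guarded resampling workflow rules out:

```r
# Illustrative sketch only: why preprocessing outside the resampling
# loop leaks information. Hypothetical variable names, base R only.
set.seed(1)
x <- rnorm(100)
test_idx <- 81:100  # pretend these rows are held out

# Leaky: scaling statistics computed on ALL 100 observations,
# including the held-out rows.
x_leaky <- (x - mean(x)) / sd(x)

# Guarded: scaling statistics computed on the training rows only,
# then applied to the held-out rows.
mu <- mean(x[-test_idx])
s  <- sd(x[-test_idx])
x_guarded_test <- (x[test_idx] - mu) / s

# The two versions of the held-out predictor differ, because the
# leaky version used held-out rows to estimate the mean and sd.
isTRUE(all.equal(x_leaky[test_idx], x_guarded_test))
```

The discrepancy is small here, which is part of the problem: this form of leakage rarely produces obviously wrong numbers, only quietly optimistic ones.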
How to read this site
The material is organized into four sections.
Concepts
These sections introduce the ideas that motivate fastml, including:
- what data leakage is,
- why many ML pipelines are unsafe by default,
- what guarded resampling entails,
- which classes of workflow configurations fastml deliberately restricts.
The concepts establish the assumptions required to interpret the tutorials correctly.
Tutorials
The tutorials apply the conceptual framework to concrete modeling tasks.
They assume familiarity with the Concepts section and focus on demonstrating evaluation workflows rather than reintroducing theoretical material.
Advanced
Advanced sections extend the framework to more complex settings, such as:
- handling missing data,
- survival analysis,
- model interpretation and diagnostics.
These sections build on the same evaluation principles under additional modeling constraints.
Comparisons
Comparative sections discuss fastml in relation to other frameworks and common workflows, highlighting differences in design philosophy and evaluation guarantees.
Where to start
Readers should begin with the Concepts section (C1–C4).
The first hands-on example is Tutorial 01: Basic Cl