| Title: | Supervised Learning with Mandatory Splits and Seeds |
|---|---|
| Description: | Implements the split-fit-evaluate-assess workflow from Hastie, Tibshirani, and Friedman (2009, ISBN:978-0-387-84857-0) "The Elements of Statistical Learning", Chapter 7. Provides three-way data splitting with automatic stratification, mandatory seeds for reproducibility, automatic data type handling, and 10 algorithms out of the box. Uses 'Rust' backend for cross-language deterministic splitting. Designed for tabular supervised learning with minimal ceremony. Polyglot parity with the 'Python' 'mlw' package on 'PyPI'. |
| Authors: | Simon Roth [aut, cre] |
| Maintainer: | Simon Roth <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-06-02 09:01:17 UTC |
| Source: | https://github.com/epagogy/ml |
Provides the module-style interface ml$verb() as an alternative to the
standard ml_verb() function style. Both styles are equivalent and call
the same underlying implementation.
ml
A locked environment with 22 verb entries.
Note: ml$fit(...) and ml_fit(...) produce identical results.
s <- ml$split(iris, "Species", seed = 42)
model <- ml$fit(s$train, "Species", seed = 42)
ml$evaluate(model, s$valid)
Returns a data.frame showing which algorithms support classification and regression, and which require optional packages.
ml_algorithms(task = NULL)
task |
Optional filter: "classification" or "regression" |
A data.frame with columns: algorithm, classification, regression, optional_dep, installed
ml_algorithms()
ml_algorithms(task = "classification")
The final exam — separate from ml_evaluate() to force a conscious choice.
Errors if called more than once on the same model. Use s$test (not
s$valid) for the test data.
ml_assess(model, test)
model |
An |
test |
Test data.frame (use |
An object of class ml_evidence (sealed — not substitutable for ml_metrics)
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
verdict <- ml_assess(model, test = s$test)
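The once-only rule for the final assessment can be pictured with a small base-R closure. This is a sketch of the idea under stated assumptions, not the package's implementation: `make_one_shot`, `assess_once`, and the `accuracy` helper are hypothetical names introduced for illustration.

```r
# Hypothetical helper: wrap a scoring function so it can run only once.
make_one_shot <- function(score_fn) {
  used <- FALSE
  function(...) {
    if (used) stop("test set already assessed for this model")
    used <<- TRUE          # remember that the single attempt was spent
    score_fn(...)
  }
}

accuracy <- function(pred, truth) mean(pred == truth)
assess_once <- make_one_shot(accuracy)

# First call runs; second call is refused.
first <- assess_once(c("a", "b", "a"), c("a", "b", "b"))
second <- tryCatch(assess_once("a", "a"), error = function(e) "blocked")
```

The practice-exam counterpart (ml_evaluate) would simply be the unwrapped `accuracy` function, callable any number of times.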
Returns the top-ranked fitted model from ml_screen() or ml_compare(). Returns NULL if no models were stored.
ml_best(lb)
lb |
An ml_leaderboard |
An ml_model or NULL
s <- ml_split(iris, "Species", seed = 42)
lb <- ml_screen(s, "Species", seed = 42)
best <- ml_best(lb)
predict(best, s$valid)
Applies Platt scaling (logistic regression on raw probabilities) to produce better-calibrated class probability estimates. Use validation data for calibration – never training data.
ml_calibrate(model, data = NULL)
model |
An |
data |
A data.frame of calibration data (use validation set) |
Binary classification only.
An ml_calibrated_model that behaves like an ml_model but
returns calibrated probabilities
s <- ml_split(ml_dataset("cancer"), "target", seed = 42)
model <- ml_fit(s$train, "target", algorithm = "xgboost", seed = 42)
cal <- ml_calibrate(model, data = s$valid)
ml_evaluate(cal, s$valid)
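For intuition, Platt scaling can be reproduced in base R with glm(). The sketch below uses simulated validation scores (an assumption, not package data) and fits a logistic regression of the true label on the raw probability. It illustrates the technique only and does not claim to mirror how ml_calibrate() is implemented internally.

```r
set.seed(42)
n <- 200

# Simulated validation labels and raw, possibly miscalibrated, probabilities.
y <- rbinom(n, 1, 0.4)
raw <- plogis(3 * (y - 0.5) + rnorm(n, sd = 1.5))

# Platt scaling: logistic regression of the label on the raw probability.
platt <- glm(y ~ raw, family = binomial())
calibrated <- predict(platt, type = "response")

# Apply the fitted calibration map to new raw scores.
new_scores <- data.frame(raw = c(0.1, 0.9))
new_calibrated <- predict(platt, newdata = new_scores, type = "response")
```

Because the calibration map is itself a fitted model, it has to be estimated on data the base model did not train on, which is why the validation set is used.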
Fits the same model twice with the same seed and asserts predictions are
identical. Returns a list with passed, algorithm, seed,
and message.
ml_check(data, target, algorithm = "random_forest", seed)
data |
A data.frame with features and target |
target |
Target column name |
algorithm |
Algorithm to check (default "random_forest") |
seed |
Random seed |
A list with passed (logical), algorithm, seed,
message. Supports isTRUE(result$passed) for assertions.
result <- ml_check(iris, "Species", seed = 42)
result$passed
Runs before fit() to catch common data quality issues that silently degrade model performance.
ml_check_data(data, target, severity = "warn")
data |
A data.frame |
target |
Target column name |
severity |
"warn" (default) or "error". If "error", raises on any issue. |
Checks performed:
NaN in target (silently dropped by split)
Inf in features
ID columns (100%
Zero-variance features (constant columns)
High-null columns (>50%
Severe class imbalance (<5%
Duplicate rows (>10%
Feature redundancy (|r| > 0.95)
A list with warnings, errors, has_issues,
passed. Supports isTRUE(result$passed) for assertions.
report <- ml_check_data(iris, "Species")
report$passed
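Several of the checks listed above are easy to express in base R. The sketch below runs a few of them on a small made-up data frame; the column names and the `is_id` helper are hypothetical, and the exact rules the package applies may differ.

```r
# Made-up data with deliberate problems.
df <- data.frame(
  id    = 1:20,                        # unique per row: ID-like
  const = rep(1, 20),                  # zero variance
  x     = seq(0, 1, length.out = 20),
  x2    = seq(0, 2, length.out = 20),  # perfectly correlated with x
  y     = rep(c("a", "b"), each = 10)
)
features <- df[setdiff(names(df), "y")]

# ID-like: integer or character column with one distinct value per row.
is_id <- function(col) {
  (is.integer(col) || is.character(col)) && length(unique(col)) == length(col)
}
id_like <- names(features)[vapply(features, is_id, logical(1))]

# Zero variance: a single distinct value.
constant <- names(features)[vapply(features, function(col) length(unique(col)) == 1L, logical(1))]

# Redundancy: |r| > 0.95 among the remaining numeric features.
num <- features[vapply(features, is.numeric, logical(1))]
num <- num[setdiff(names(num), c(constant, id_like))]
r <- abs(cor(num))
hits <- which(r > 0.95 & upper.tri(r), arr.ind = TRUE)
redundant_pairs <- paste(rownames(r)[hits[, "row"]], colnames(r)[hits[, "col"]], sep = " ~ ")

# Share of fully duplicated rows.
dup_share <- mean(duplicated(df))
```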
Evaluates multiple fitted models on the same dataset without re-fitting. All models must share the same target column and task.
ml_compare(models, data, sort_by = "auto")
models |
A list of |
data |
A data.frame containing the target column |
sort_by |
"auto" or a metric name string |
An object of class ml_leaderboard (data.frame with formatted print)
s <- ml_split(iris, "Species", seed = 42)
m1 <- ml_fit(s$train, "Species", algorithm = "logistic", seed = 42)
m2 <- ml_fit(s$train, "Species", algorithm = "random_forest", seed = 42)
ml_compare(list(m1, m2), s$valid)
Set global configuration for the ml package. Currently supports guards
to control partition enforcement.
ml_config(guards = NULL)
guards |
Character: |
Invisibly returns the previous settings as a list.
ml_config(guards = "off")     # disable guards
ml_config(guards = "strict")  # re-enable (default)
Takes an existing ml_split_result and creates k-fold rotations within its
dev partition (train + valid). The test partition stays sealed on the original
split for ml_assess().
ml_cv(s, target, folds = 5L, seed = NULL, stratify = TRUE)
s |
An |
target |
Target column name (string) |
folds |
Number of folds (default 5) |
seed |
Random seed for fold assignment |
stratify |
Logical. Stratify folds by target for classification (default TRUE) |
Two primitives, strict separation of concerns: ml_split() creates the
three-way boundary, ml_cv() creates rotations within that boundary.
An ml_cv_result that ml_fit() accepts directly.
The original split's $test remains available via s$test for ml_assess().
s <- ml_split(iris, "Species", seed = 42)
c <- ml_cv(s, "Species", folds = 5, seed = 42)
model <- ml_fit(c, "Species", seed = 42)
model$scores_
No group appears in both train and validation within any fold. Prevents leakage from repeated measurements (patients, stores, sensors).
ml_cv_group(s, target, groups, folds = 5L, seed = NULL)
s |
An |
target |
Target column name (string) |
groups |
Column name identifying groups |
folds |
Number of folds (default 5) |
seed |
Random seed for group assignment |
An ml_cv_result with group-aware folds
df <- data.frame(pid = rep(1:20, each = 5), x = rnorm(100), y = sample(0:1, 100, TRUE))
s <- ml_split(df, "y", seed = 42)
c <- ml_cv_group(s, "y", groups = "pid", folds = 5, seed = 42)
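The group-aware guarantee comes from assigning whole groups, not individual rows, to folds. A minimal base-R sketch of that idea follows; it is an illustration only, and the package's own fold assignment (and its random number stream) may differ.

```r
set.seed(42)
df <- data.frame(pid = rep(1:20, each = 5), x = rnorm(100))
folds <- 5L

# Assign each group to a fold, then map the assignment back to rows.
group_ids <- unique(df$pid)
group_fold <- sample(rep_len(seq_len(folds), length(group_ids)))
names(group_fold) <- group_ids
row_fold <- unname(group_fold[as.character(df$pid)])

# For every fold, count groups that appear on both sides (should be zero).
leaks <- vapply(seq_len(folds), function(k) {
  length(intersect(df$pid[row_fold == k], df$pid[row_fold != k]))
}, integer(1))
```

With 20 equally sized groups and 5 folds, each validation fold holds 4 groups (20 rows) and shares no patient ID with its training rows.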
Expanding-window CV for time series. Data must already be sorted
chronologically (use ml_split_temporal() first).
ml_cv_temporal(s, target, folds = 5L, embargo = 0L, window = "expanding", window_size = NULL)
s |
An |
target |
Target column name (string) |
folds |
Number of folds (default 5) |
embargo |
Integer. Number of rows to skip between train end and valid start (gap to prevent temporal leakage from autocorrelation). Default 0. Must be >= 0. |
window |
|
window_size |
Integer. Required when |
An ml_cv_result with expanding-window folds
df <- data.frame(date = 1:100, x = rnorm(100), y = sample(0:1, 100, TRUE))
s <- ml_split_temporal(df, "y", time = "date")
c <- ml_cv_temporal(s, "y", folds = 5)
# With embargo to prevent autocorrelation leakage:
c2 <- ml_cv_temporal(s, "y", folds = 5, embargo = 5L)
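To make the expanding window and the embargo concrete, the sketch below builds fold indices by hand for 100 chronologically ordered rows. The block boundaries are an assumption chosen for illustration (folds + 1 contiguous blocks, the first of which only ever serves as training data); the package may slice the data differently.

```r
n <- 100L
folds <- 5L
embargo <- 5L

# Cut the ordered rows into folds + 1 contiguous blocks.
edges <- floor(seq(0, n, length.out = folds + 2))

splits <- lapply(seq_len(folds), function(k) {
  valid <- (edges[k + 1] + 1):edges[k + 2]   # next block in time
  train_end <- edges[k + 1] - embargo        # drop the rows just before it
  list(train = seq_len(max(train_end, 0)), valid = valid)
})

# Rows skipped between the end of training and the start of validation.
gaps <- vapply(splits, function(s) min(s$valid) - max(s$train) - 1, numeric(1))
train_sizes <- lengths(lapply(splits, `[[`, "train"))
```

Each successive fold trains on a longer history, and `gaps` equals the embargo in every fold, so no training row sits directly next to a validation row.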
Returns one of the built-in datasets. Useful for experimenting with the ml API before applying it to your own data.
ml_dataset(name)
name |
Dataset name (string) |
Available datasets: "iris", "wine", "cancer", "diabetes", "houses", "churn", "fraud"
A data.frame
churn <- ml_dataset("churn")
head(churn)
Compares a reference dataset (typically training data) to new data using per-feature statistical tests or adversarial validation.
ml_drift(
  reference,
  new,
  method = "statistical",
  threshold = 0.05,
  exclude = NULL,
  target = NULL,
  seed = NULL,
  algorithm = "random_forest"
)
reference |
A data.frame — reference dataset (typically training data) |
new |
A data.frame — new data to compare against the reference |
method |
Detection method: "statistical" (default) or "adversarial" |
threshold |
p-value threshold for statistical method (default 0.05) |
exclude |
Character vector of column names to skip (e.g., ID columns) |
target |
Target column name — automatically excluded from drift analysis |
seed |
Random seed (required for method = "adversarial") |
algorithm |
Algorithm for adversarial classifier: "random_forest" (default) or "xgboost" |
Statistical method (default): per-feature distribution tests with no labels required.
Numeric features: Kolmogorov-Smirnov two-sample test
Categorical features: Chi-squared test on value counts
Adversarial method: trains a binary classifier to distinguish reference from new data. AUC near 0.5 means similar distributions; AUC near 1.0 means very different distributions.
$train_scores: per-row probability of "looks like new data" for reference
rows. Use sort(result$train_scores, decreasing = TRUE)[1:n] to select
validation rows that mirror the new distribution.
$features: most discriminative features (temporal leakage candidates)
Pair with ml_shelf() for complete monitoring: drift() detects input
distribution shift (label-free), shelf() detects performance degradation
(requires labels).
An object of class ml_drift_result with:
$shifted: TRUE if drift detected
$features: named numeric — p-values (statistical) or importances (adversarial)
$features_shifted: character vector of drifted feature names
$severity: "none", "low", "medium", or "high"
$auc: adversarial mode only — classifier AUC
$train_scores: adversarial mode only — per-row reference probabilities
s <- ml_split(iris, "Species", seed = 42)
# Simulate drift by perturbing test data
new <- s$test
new$Sepal.Length <- new$Sepal.Length + 2
result <- ml_drift(reference = s$train, new = new, target = "Species")
result$shifted
result$features_shifted
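The statistical method can be approximated with the base-R tests named above: ks.test() for numeric columns and chisq.test() on value counts for categorical ones. The data below are simulated (an assumption), and the sketch does not reproduce the package's severity grading or any multiple-testing handling.

```r
set.seed(42)
reference <- data.frame(num = rnorm(300), cat = rep(c("a", "b"), 150))
new <- data.frame(num = rnorm(300, mean = 1), cat = rep(c("a", "b"), 150))

# One p-value per feature: KS test for numeric, chi-squared for categorical.
p_values <- vapply(names(reference), function(col) {
  if (is.numeric(reference[[col]])) {
    ks.test(reference[[col]], new[[col]])$p.value
  } else {
    levels_all <- union(reference[[col]], new[[col]])
    counts <- rbind(table(factor(reference[[col]], levels_all)),
                    table(factor(new[[col]], levels_all)))
    chisq.test(counts)$p.value
  }
}, numeric(1))

threshold <- 0.05
shifted <- names(p_values)[p_values < threshold]
```

Here only `num` is flagged, because its mean was moved by one standard deviation while the category counts are identical in both samples.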
Fits a text vectorizer on training texts and returns an embedder object that stores the vocabulary for consistent transform at prediction time.
ml_embed(texts, method = "tfidf", max_features = 100L)
texts |
A character vector of texts to embed |
method |
Embedding method. Currently only "tfidf" is supported. |
max_features |
Maximum vocabulary size (number of TF-IDF features). Default 100. |
Currently supports TF-IDF ('tm' package). SBERT and neural methods are planned for future gates.
An object of class ml_embedder with:
$vectors: data.frame of TF-IDF features (n_texts x max_features)
$method: the method used
$vocab_size: number of features generated
$transform(new_texts): apply stored vocabulary to new texts
if (requireNamespace("tm", quietly = TRUE)) {
  texts <- c("good product", "bad service", "great value", "poor quality")
  emb <- ml_embed(texts, method = "tfidf", max_features = 20)
  emb$vocab_size
  nrow(emb$vectors)
  # Transform new texts using the fitted vocabulary
  new_texts <- c("excellent quality", "terrible service")
  new_vecs <- emb$transform(new_texts)
}
Trains at increasing data sizes and reports train vs validation performance at each step. Answers: is the model still learning (more data helps), or saturated (more data unlikely to help)?
ml_enough(s, target, seed = NULL, algorithm = "auto", steps = 8L, cv = 3L)
s |
An |
target |
Target column name |
seed |
Random seed (optional in R; auto-generated if NULL) |
algorithm |
Algorithm to use (default |
steps |
Integer >= 2. Number of data-size steps to evaluate, evenly spaced from ~10% to 100% of training data. Default 8. |
cv |
Integer >= 2. Number of cross-validation folds for validation score at each step. Default 3. |
An ml_enough_result with fields:
$saturated — logical, TRUE if curve plateaus (< 1% gain in last half)
$curve — data.frame: n_samples, train_score, val_score
$metric — metric name used
$n_current — total training rows in the full dataset
$recommendation — human-readable action
s <- ml_split(iris, "Species", seed = 42)
result <- ml_enough(s, "Species", seed = 42)
result$recommendation
The practice exam — call as many times as needed. For the one-time final
grade on held-out test data, use ml_assess().
ml_evaluate(model, data)
model |
An |
data |
A data.frame containing the target column |
An object of class ml_metrics (named numeric vector with print method)
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
metrics <- ml_evaluate(model, s$valid)
metrics[["accuracy"]]
Returns a data frame of feature importances, normalized to sum to 1.0, sorted descending. Uses tree-based impurity importance for 'xgboost' and 'random_forest', absolute coefficients for 'logistic', 'linear', and 'elastic_net'. Not supported for 'svm' or 'knn'.
ml_explain(model)
model |
An |
An object of class ml_explanation (a data.frame with columns
feature and importance; custom print shows a bar chart)
s <- ml_split(iris, "Species", algorithm = "random_forest", seed = 42)
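For the linear-model case, the "absolute coefficients, normalized to sum to 1.0" rule can be sketched with lm() from base R. This is an illustration under assumptions: whether the package standardises features before reading off coefficients is not stated here, so the numbers below depend on the raw feature scales.

```r
# Fit a plain linear model on a built-in dataset.
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)

# Absolute coefficients (intercept dropped), normalised and sorted descending.
raw <- abs(coef(fit)[-1])
importance <- sort(raw / sum(raw), decreasing = TRUE)

explanation <- data.frame(
  feature = names(importance),
  importance = unname(importance)
)
```

The resulting data frame has the same two columns described above, `feature` and `importance`, with the importances summing to 1.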
Trains a model using cross-validation (if data is an ml_split_result
with folds) or holdout (if data is a data.frame). Automatically detects task type,
handles encoding, and records metadata for reproducibility.
ml_fit(
  data,
  target,
  algorithm = "auto",
  seed = NULL,
  task = "auto",
  balance = FALSE,
  engine = "auto",
  ...
)
data |
A |
target |
Target column name (string) |
algorithm |
"auto" (default), "xgboost", "random_forest", "svm", "knn", "logistic", "linear", "naive_bayes", "elastic_net" |
seed |
Random seed. NULL (default) auto-generates and stores for reproducibility. |
task |
"auto", "classification", or "regression" |
balance |
Logical. If |
engine |
Backend engine: |
... |
Additional hyperparameters passed to the engine
(e.g., |
Formula interfaces are not supported. Pass the data frame and target column name as a string. Unordered factors use one-hot encoding for linear models and ordinal encoding for tree-based models. Ordered factors always use ordinal encoding.
An object of class ml_model
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
model$algorithm
Analyzes feature-target relationships before modeling. Runs pure data introspection – no model fitting.
ml_leak(data, target)
data |
A data.frame or ml_split_result |
target |
Target column name |
Checks performed:
Feature-target correlation (Pearson |r|, numeric features)
High-cardinality ID columns
Target name in feature names
Duplicate rows between train and test (ml_split_result input only)
A list with clean (logical), n_warnings,
checks (list of check results), suspects (list of
suspect features). Class ml_leak_report.
s <- ml_split(iris, "Species", seed = 42)
report <- ml_leak(s, "Species")
report$clean
Load a model from disk
ml_load(path)
path |
Path to a |
An ml_model or ml_tuning_result
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
path <- file.path(tempdir(), "iris_model.mlr")
ml_save(model, path)
loaded <- ml_load(path)
loaded$algorithm
Sweeps thresholds from min_threshold to 0.95 in two phases (coarse
0.05 steps, then fine 0.005 steps around the coarse best) and returns a
copy of the model with a tuned threshold. Subsequent ml_predict() calls
apply this threshold to positive-class probability instead of 0.5.
ml_optimize(model, data, metric = "f1", min_threshold = "auto")
model |
An |
data |
A data.frame containing the target column used as true labels. |
metric |
Character. Optimisation objective: |
min_threshold |
Lower bound of the sweep. |
An ml_optimize_result (also an ml_model). The threshold is
baked in — every ml_predict() call uses it automatically.
Inspect with result$threshold. The original model is unchanged.
s <- ml_split(iris[iris$Species != "virginica", ], "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
opt <- ml_optimize(model, data = s$valid, metric = "f1")
opt$threshold
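The two-phase sweep described above (coarse 0.05 steps, then fine 0.005 steps around the coarse best) can be sketched in base R for the F1 metric. The probabilities are simulated, `f1_at` is a hypothetical helper, and the fixed lower bound of 0.05 stands in for whatever "auto" resolves to in the package.

```r
# F1 score of the positive class at a given probability threshold.
f1_at <- function(prob, truth, t) {
  pred <- prob >= t
  tp <- sum(pred & truth)
  fp <- sum(pred & !truth)
  fn <- sum(!pred & truth)
  if (tp == 0) return(0)
  2 * tp / (2 * tp + fp + fn)
}

set.seed(42)
truth <- rbinom(300, 1, 0.3) == 1
prob <- plogis(qlogis(0.3) + 2 * (truth - 0.5) + rnorm(300))

min_threshold <- 0.05  # assumed lower bound for this sketch

# Phase 1: coarse grid.
coarse <- seq(min_threshold, 0.95, by = 0.05)
coarse_scores <- vapply(coarse, function(t) f1_at(prob, truth, t), numeric(1))
coarse_best <- coarse[which.max(coarse_scores)]

# Phase 2: fine grid around the coarse winner.
fine <- seq(max(min_threshold, coarse_best - 0.05),
            min(0.95, coarse_best + 0.05), by = 0.005)
fine_scores <- vapply(fine, function(t) f1_at(prob, truth, t), numeric(1))
best <- fine[which.max(fine_scores)]
```

The fine grid contains the coarse winner, so the refined threshold can only match or improve on the coarse F1 for the data it was tuned on.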
Produces diagnostic plots using base R graphics. No extra packages required.
ml_plot(model, data = NULL, kind = "importance", ...)
model |
An |
data |
A data.frame for computing predictions (required for all
except |
kind |
Plot type. One of |
... |
Passed to the underlying base R plot call |
Available kinds:
"importance" — feature importance bar chart
"roc" — ROC curve (classification)
"confusion" — confusion matrix heatmap (classification)
"residual" — residuals vs fitted (regression)
"calibration" — predicted vs actual probabilities (classification)
Invisibly returns NULL (called for its side effect)
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", algorithm = "random_forest", seed = 42)
ml_plot(model, kind = "importance")
ml_plot(model, data = s$valid, kind = "confusion")
Alias for predict(model, newdata = ...). Matches Python ml.predict().
ml_predict(model, new_data)
model |
An |
new_data |
A data.frame with the same features used for training |
A vector of predicted class labels (classification) or numeric values (regression).
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
preds <- ml_predict(model, s$valid)
head(preds)
Predict class probabilities
ml_predict_proba(model, new_data)
model |
An |
new_data |
A data.frame with the same features used for training |
A data.frame with one column per class. Values are probabilities summing to 1.0 per row. Column names are the original class labels.
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", algorithm = "random_forest", seed = 42)
probs <- ml_predict_proba(model, s$valid)
head(probs)
Grammar primitive #2: DataFrame -> PreparedData.
ml_prepare(data, target, algorithm = "auto", task = "auto")
data |
A data.frame including the target column. |
target |
Name of the target column (string). |
algorithm |
Algorithm hint for encoding strategy: "auto", "random_forest", "logistic", etc. Tree-based algorithms use ordinal encoding; linear algorithms use one-hot encoding for low-cardinality categoricals. |
task |
"classification", "regression", or "auto" (detected from target). |
In the default workflow, ml_fit() calls preparation internally per fold.
Use ml_prepare() explicitly when you need manual control: inspect the
preprocessing state, apply the same encoding to external data, or chain
preparation with fitting.
An ml_prepared_data object with:
$data — transformed data.frame (all-numeric, ready for ml_fit)
$state — NormState list; use .transform(state, X) on new data
$target — target column name
$task — detected or provided task type
df <- data.frame(x1 = rnorm(50), x2 = rnorm(50), y = rnorm(50))
s <- ml_split(df, "y", seed = 42)
p <- ml_prepare(s$train, "y")
p$task  # "classification" or "regression"
p$data  # encoded feature matrix
Computes per-column statistics and emits warnings for common data quality issues: missing values, constant columns, high cardinality, imbalanced targets, and near-collinear features.
ml_profile(data, target = NULL)
data |
A data.frame (also accepts tibble or data.table) |
target |
Optional target column name (enables task detection + distribution stats) |
An object of class ml_profile_result (list with formatted print)
ml_profile(iris, "Species")
The fastest path from raw data to a trained, evaluated model. Screens logistic, random_forest, and xgboost, picks the best, fits on training data, and evaluates on validation.
ml_quick(data, target, seed)
data |
A data.frame with features and target |
target |
Target column name |
seed |
Random seed |
A list with model (ml_model), metrics (ml_metrics),
and split (ml_split_result).
result <- ml_quick(iris, "Species", seed = 42)
result$model
result$metrics
Produces a self-contained HTML report with model metadata, evaluation metrics, and feature importances. Open in any browser.
ml_report(model, data = NULL, path = "model_report.html")
model |
An |
data |
A data.frame for computing metrics (use validation data) |
path |
Output file path. Default: |
The path to the saved report (invisibly)
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", algorithm = "random_forest", seed = 42)
tmp <- tempfile(fileext = ".html")
ml_report(model, data = s$valid, path = tmp)
unlink(tmp)
Saves an ml_model or ml_tuning_result to a .mlr file using
saveRDS with a version wrapper.
ml_save(model, path)
model |
An |
path |
File path (recommended extension: |
The normalized path, invisibly.
ml_load() uses readRDS() internally, which can execute
arbitrary R code during deserialization. Never load .mlr files
from untrusted sources.
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
path <- file.path(tempdir(), "iris_model.mlr")
ml_save(model, path)
loaded <- ml_load(path)
Fits every available algorithm on the training data and ranks by validation performance. Use this to identify promising candidates before tuning.
ml_screen(
  data,
  target,
  algorithms = NULL,
  seed = NULL,
  sort_by = "auto",
  time_budget = NULL,
  keep_models = TRUE,
  ...
)
data |
An |
target |
Target column name |
algorithms |
Character vector of algorithm names, or NULL for all available |
seed |
Random seed. NULL auto-generates. |
sort_by |
"auto" (roc_auc for binary clf, f1_macro for multiclass, rmse for regression), or a metric name string |
time_budget |
Maximum seconds for entire screen. Stops between algorithms (not mid-fit) when budget exceeded. NULL (default) = no limit. |
keep_models |
If FALSE, discard fitted models after scoring to save memory. ml_best() will return NULL. Default TRUE. |
... |
Additional arguments passed to |
Multiple comparison bias: Selecting the best from N algorithms on the
same validation set produces optimistic estimates. The winning algorithm
benefits from selection bias. Use ml_validate() on held-out test data
for trustworthy comparisons.
For imbalanced data, consider sort_by = "f1" — the default roc_auc
can hide failures on minority classes.
An object of class ml_leaderboard (data.frame with formatted print)
s <- ml_split(iris, "Species", seed = 42)
lb <- ml_screen(s, "Species", seed = 42)
lb
Evaluates the model on new labeled data and compares performance to the model's original training metrics. Requires ground truth labels.
ml_shelf(model, new, target, tolerance = 0.05)
model |
An |
new |
A data.frame — new labeled dataset including the target column |
target |
Name of the target column in |
tolerance |
Allowed degradation per metric (default 0.05, i.e. 5 percentage points). Any key metric degrading beyond tolerance marks the model as stale. |
Run this when outcome labels become available (e.g., daily/weekly batch
scoring, then wait for outcomes). Pair with ml_drift() for complete
monitoring:
ml_drift(): input distribution shift (label-free, run always)
ml_shelf(): performance degradation (needs labels, run periodically)
Requires model$scores_ from a cross-validated fit. If the model was
trained on a holdout split (no CV), scores_ will be NULL and shelf()
raises a model_error.
An object of class ml_shelf_result with:
$fresh: TRUE if model performance is within tolerance
$stale: inverse of fresh
$metrics_then: original training metrics (from model$scores_)
$metrics_now: current metrics on new data
$degradation: per-metric delta (negative = worse for higher-is-better)
$recommendation: human-readable guidance
cv <- ml_split(iris, "Species", seed = 42, folds = 3)
model <- ml_fit(cv, "Species", algorithm = "logistic", seed = 42)
# Simulate a new labeled batch
new_batch <- iris[sample(nrow(iris), 30), ]
result <- ml_shelf(model, new = new_batch, target = "Species")
result$fresh
result$degradation
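The fresh/stale decision reduces to comparing two metric vectors against the tolerance. The sketch below uses made-up scores for higher-is-better metrics. How the package treats lower-is-better metrics such as rmse is not covered here.

```r
# Made-up scores: stored cross-validation metrics vs. a new labelled batch.
metrics_then <- c(accuracy = 0.95, f1_macro = 0.94)
metrics_now  <- c(accuracy = 0.93, f1_macro = 0.86)
tolerance <- 0.05

# Negative delta = worse, for higher-is-better metrics.
degradation <- metrics_now - metrics_then

# Stale as soon as any metric drops by more than the tolerance.
stale <- any(degradation < -tolerance)
fresh <- !stale
```

Accuracy fell by 2 points, which is within tolerance, but f1_macro fell by 8 points, so the model is marked stale.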
Three-way split is the default (60/20/20), following Hastie, Tibshirani, and Friedman (2009, ISBN:978-0-387-84857-0) Chapter 7. Automatically stratifies for classification.
ml_split(
  data,
  target = NULL,
  seed = NULL,
  ratio = c(0.6, 0.2, 0.2),
  folds = NULL,
  stratify = TRUE,
  task = "auto",
  time = NULL,
  groups = NULL
)
data |
A data.frame (also accepts tibble or data.table) |
target |
Target column name (enables stratification + task detection) |
seed |
Random seed. NULL (default) auto-generates and stores for reproducibility. Pass an integer for reproducible splits. |
ratio |
Numeric vector of length 3: c(train, valid, test). Must sum to 1.0. |
folds |
Integer for k-fold CV (e.g., |
stratify |
Logical. Auto-stratify for classification targets (default TRUE). |
task |
"auto", "classification", or "regression". Override task detection. |
time |
Column name for temporal/chronological split. Data is sorted by
this column, and the time column is dropped from output. Deterministic
(seed is ignored). Cannot combine with |
groups |
Column name for group-aware split. No group appears in both
train and validation/test. Cannot combine with |
An ml_split_result. Access $train, $valid, $test,
$dev (train + valid). When folds is set, also $folds (CV on dev).
s <- ml_split(iris, "Species", seed = 42)
nrow(s$train)
nrow(s$dev)
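A stratified 60/20/20 split can be sketched in base R by shuffling row indices within each class and cutting them at the cumulative ratios. This illustrates the idea only: `assign_partition` is a hypothetical helper, and because the package delegates splitting to its own backend, the rows chosen here will not match those returned by ml_split() for the same seed.

```r
set.seed(42)
ratio <- c(train = 0.6, valid = 0.2, test = 0.2)

# Within one class: shuffle the row indices, then cut at cumulative ratios.
assign_partition <- function(idx) {
  idx <- idx[sample.int(length(idx))]
  cuts <- round(cumsum(ratio) * length(idx))
  part <- cut(seq_along(idx), breaks = c(0, cuts), labels = names(ratio))
  data.frame(row = idx, part = part)
}

# Apply per class so every partition keeps the class proportions.
by_class <- split(seq_len(nrow(iris)), iris$Species)
assignment <- do.call(rbind, lapply(by_class, assign_partition))
parts <- split(iris[assignment$row, ], assignment$part)

sizes <- vapply(parts, nrow, integer(1))
```

With 50 rows per species, each class contributes 30/10/10 rows, giving partitions of 90, 30, and 30 rows with identical class balance.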
Domain specialization of ml_split() for clinical trials, repeated measures,
and any data where observations are nested within groups (patients, subjects,
hospitals). No group appears in more than one partition.
ml_split_group(
  data,
  target = NULL,
  groups,
  seed = NULL,
  ratio = c(0.6, 0.2, 0.2),
  folds = NULL,
  stratify = TRUE,
  task = "auto"
)
data |
A data.frame |
target |
Target column name (optional, enables stratification) |
groups |
Column name identifying groups |
seed |
Random seed for reproducibility |
ratio |
Numeric vector c(train, valid, test). Must sum to 1.0. |
folds |
Integer for group CV. When set, ignores ratio. |
stratify |
Logical. Stratify by target within groups (default TRUE). |
task |
"auto", "classification", or "regression" |
Also covers Leave-Source-Out CV: when groups represent data sources (hospitals, devices), this produces deployment-realistic evaluation.
An ml_split_result. When folds is set, includes $folds and $test.
df <- data.frame(pid = rep(1:10, each = 5), x = rnorm(50), y = sample(0:1, 50, TRUE))
s <- ml_split_group(df, "y", groups = "pid", seed = 42)
nrow(s$train)
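The guarantee that no group appears in more than one partition can be illustrated with a short base R sketch. `group_split` is a hypothetical helper that assigns whole group identifiers to partitions. It ignores stratification and is only an assumption about how such a split might work, not the package's code.

```r
# Hypothetical helper, not part of the package: partition by group, not by row.
group_split <- function(data, groups, ratio = c(0.6, 0.2, 0.2), seed = 42) {
  set.seed(seed)
  ids <- unique(data[[groups]])
  ids <- ids[sample.int(length(ids))]          # shuffle the group identifiers
  n_train <- round(ratio[1] * length(ids))
  n_valid <- round(ratio[2] * length(ids))
  train_ids <- ids[seq_len(n_train)]
  valid_ids <- ids[n_train + seq_len(n_valid)]
  in_train <- data[[groups]] %in% train_ids
  in_valid <- data[[groups]] %in% valid_ids
  list(
    train = data[in_train, ],
    valid = data[in_valid, ],
    test  = data[!in_train & !in_valid, ]      # all remaining groups
  )
}

set.seed(1)
df <- data.frame(pid = rep(1:10, each = 5), x = rnorm(50), y = sample(0:1, 50, TRUE))
g <- group_split(df, "pid")
# Number of patient ids shared between partitions
length(intersect(g$train$pid, g$valid$pid))
length(intersect(g$train$pid, g$test$pid))
```

Because the ratio applies to groups rather than rows, partition sizes in rows can deviate from the requested ratio when groups differ in size.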
Domain specialization of ml_split() for time series and forecasting.
Data is sorted by the time column and partitioned by position.
Deterministic: seed is ignored (chronological order is the only order).
ml_split_temporal(data, target = NULL, time, ratio = c(0.6, 0.2, 0.2), folds = NULL, task = "auto")
data |
A data.frame |
target |
Target column name (optional, enables task detection) |
time |
Column name containing timestamps or orderable values. Used for sorting, then dropped from output partitions. |
ratio |
Numeric vector c(train, valid, test). Must sum to 1.0. |
folds |
Integer for temporal CV (expanding window). When set, ignores ratio. |
task |
"auto", "classification", or "regression" |
An ml_split_result. When folds is set, includes $folds and $test.
df <- data.frame(date = 1:100, x = rnorm(100), y = sample(0:1, 100, TRUE))
s <- ml_split_temporal(df, "y", time = "date")
nrow(s$train)
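The sort-then-partition behaviour and the expanding-window folds can be sketched in base R as follows. Both `temporal_split` and `expanding_folds` are hypothetical helpers. The fold boundaries shown are one common expanding-window scheme and are an assumption, since the exact scheme used by the package is not stated here.

```r
# Hypothetical helper: sort by time, cut by position, drop the time column.
temporal_split <- function(data, time, ratio = c(0.6, 0.2, 0.2)) {
  data <- data[order(data[[time]]), , drop = FALSE]
  n <- nrow(data)
  n_train <- round(ratio[1] * n)
  n_valid <- round(ratio[2] * n)
  keep <- setdiff(names(data), time)
  test_rows <- setdiff(seq_len(n), seq_len(n_train + n_valid))
  list(
    train = data[seq_len(n_train), keep, drop = FALSE],
    valid = data[n_train + seq_len(n_valid), keep, drop = FALSE],
    test  = data[test_rows, keep, drop = FALSE]
  )
}

# Hypothetical helper: k expanding-window folds over n time-ordered rows.
# Each fold trains on everything before its validation block.
expanding_folds <- function(n, k) {
  edges <- round(seq(0, n, length.out = k + 2))
  lapply(seq_len(k), function(i) {
    list(train = seq_len(edges[i + 1]),
         valid = (edges[i + 1] + 1):edges[i + 2])
  })
}

set.seed(1)
df <- data.frame(date = sample(1:100), x = rnorm(100), y = sample(0:1, 100, TRUE))
s <- temporal_split(df, "date")
sapply(s, nrow)
folds <- expanding_folds(n = 80, k = 3)   # folds over the 80 train + valid rows
lapply(folds, function(f) range(f$train))
```

No random number generator is involved in either helper, which is why a seed has no effect on a chronological split.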
Trains a stacking ensemble with out-of-fold meta-features. Base models generate out-of-fold predictions, which are used to train a meta-learner.
ml_stack(data, target, models = NULL, meta = NULL, cv_folds = 5L, seed = NULL)
data |
A data.frame with features and target |
target |
Target column name |
models |
Character vector of base algorithm names, or NULL for defaults |
meta |
Meta-learner algorithm. Default: "logistic" (classification) or "linear" (regression) |
cv_folds |
Number of CV folds for generating out-of-fold predictions |
seed |
Random seed |
Note: This function uses global normalization (not per-fold), because the stacking CV is internal to the meta-learner training. This is the one exception to the per-fold normalization rule.
An ml_model with $is_stacked = TRUE
s <- ml_split(iris, "Species", seed = 42)
stacked <- ml_stack(s$train, "Species", seed = 42)
predict(stacked, s$valid)
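As a rough illustration of out-of-fold meta-features, the following base R sketch stacks two linear models on `mtcars`. The base models, the formulas, and the fold assignment are assumptions chosen for brevity; the package's own base learners and defaults may differ.

```r
set.seed(42)
k <- 5
# Random fold label for every row
fold_id <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

# Two illustrative base models (hypothetical choice)
base_formulas <- list(m1 = mpg ~ wt, m2 = mpg ~ hp + qsec)

# One column of out-of-fold predictions per base model
oof <- matrix(NA_real_, nrow(mtcars), length(base_formulas),
              dimnames = list(NULL, names(base_formulas)))

for (f in seq_len(k)) {
  held_out <- fold_id == f
  for (m in names(base_formulas)) {
    # Fit on the other folds, predict only the held-out rows
    fit <- lm(base_formulas[[m]], data = mtcars[!held_out, ])
    oof[held_out, m] <- predict(fit, newdata = mtcars[held_out, ])
  }
}

# Linear meta-learner trained on the out-of-fold predictions
meta_data <- data.frame(mpg = mtcars$mpg, oof)
meta <- lm(mpg ~ m1 + m2, data = meta_data)
coef(meta)
```

Each row's meta-feature comes from a model that never saw that row, so the meta-learner is not trained on predictions that leak the target.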
Tune hyperparameters via random or grid search
ml_tune(data, target, model = NULL, algorithm = NULL, n_trials = 20L, cv_folds = 3L, method = "random", seed = NULL, params = NULL)
data |
A data.frame or |
target |
Target column name |
model |
An |
algorithm |
Algorithm name (if model is NULL) |
n_trials |
Number of random search trials (default 20) |
cv_folds |
Number of CV folds per trial (default 3) |
method |
"random" (default) or "grid" |
seed |
Random seed |
params |
Named list of parameter ranges (overrides defaults). For numeric ranges, provide a 2-element numeric vector c(min, max). For discrete, provide a character/integer vector. |
An object of class ml_tuning_result
s <- ml_split(iris, "Species", seed = 42)
tuned <- ml_tune(s$train, "Species", algorithm = "xgboost", n_trials = 5, seed = 42)
tuned$best_params_
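The two forms accepted by `params` (a 2-element numeric range, or a discrete vector) suggest how a random search might draw candidates. The sketch below is a guess at that logic, not the package's sampler. `sample_params` and the parameter names are hypothetical, and treating only double vectors of length 2 as ranges is an assumption.

```r
# Hypothetical helper: draw n_trials random candidates from a params list.
sample_params <- function(params, n_trials, seed = 42) {
  set.seed(seed)
  lapply(seq_len(n_trials), function(i) {
    lapply(params, function(p) {
      if (is.double(p) && length(p) == 2) {
        runif(1, min = p[1], max = p[2])   # continuous range c(min, max)
      } else {
        p[sample.int(length(p), 1)]        # discrete choice
      }
    })
  })
}

# Parameter names below are illustrative only
space <- list(
  learning_rate = c(0.01, 0.3),            # numeric range
  max_depth     = 3:8,                     # discrete integers
  booster       = c("gbtree", "dart")      # discrete strings
)
trials <- sample_params(space, n_trials = 5)
str(trials[[1]])
```

With `method = "grid"` the candidates would instead be enumerated, for example with `expand.grid()` over discrete values.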
Three modes: (1) absolute rules, (2) regression prevention vs baseline, (3) combined. Returns a structured result with pass/fail and diagnostics.
ml_validate(model, test, rules = NULL, baseline = NULL, tolerance = 0)
model |
An |
test |
Test data.frame (use |
rules |
Named list of threshold strings, e.g.
|
baseline |
An |
tolerance |
Numeric. Allowed absolute degradation (0.02 = 2pp slack). Default 0.0. |
Tolerance is absolute (not relative): a tolerance of 0.02 means 2 percentage points of allowed degradation, applied uniformly across all metrics.
An object of class ml_validate_result
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
gate <- ml_validate(model, test = s$test, rules = list(accuracy = ">0.80"))
gate$passed
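The absolute-tolerance rule can be worked through with a small base R sketch. `passes_baseline` is a hypothetical helper, the metric values are made up for illustration, and the sketch assumes higher-is-better metrics. It is not the package's comparison logic.

```r
# Hypothetical helper: compare new metrics to a baseline with absolute slack.
# Assumes every metric is higher-is-better.
passes_baseline <- function(new, baseline, tolerance = 0) {
  shared <- intersect(names(new), names(baseline))
  degradation <- baseline[shared] - new[shared]   # positive means worse
  degradation <= tolerance
}

# Illustrative numbers only
baseline <- c(accuracy = 0.95, f1 = 0.93)
new      <- c(accuracy = 0.94, f1 = 0.90)

res <- passes_baseline(new, baseline, tolerance = 0.02)
res        # per-metric pass/fail
all(res)   # overall gate
```

Here accuracy drops by 1 percentage point and stays within the 2-point slack, while f1 drops by 3 points and fails. A relative tolerance would instead scale the allowed drop with the baseline value.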
Checks provenance chain: split parameters -> training fingerprint -> assess ceremony status. Catches accidental self-deception (load-assess loops, test-set shopping) rather than adversarial tampering.
ml_verify(model)
model |
An |
A list with status ("verified"/"unverified"/"warning"),
checks, provenance, and assess_count.
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
report <- ml_verify(model)
report$status
Predict from a fitted ml_model
## S3 method for class 'ml_model'
predict(object, newdata, proba = FALSE, ...)
object |
An ml_model object |
newdata |
A data.frame |
proba |
Logical. If TRUE, returns class probabilities (classification only) |
... |
Ignored |
A vector of predicted class labels (classification) or numeric values
(regression). If proba = TRUE, returns a data.frame with one column per
class; values are probabilities summing to 1.0 per row.
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
preds <- predict(model, newdata = s$valid)
head(preds)
Predict from best model in a tuning result
## S3 method for class 'ml_tuning_result'
predict(object, newdata, ...)
object |
An ml_tuning_result |
newdata |
A data.frame |
... |
Passed to predict.ml_model |
Predictions
Print ml_cv_result
## S3 method for class 'ml_cv_result'
print(x, ...)
x |
An ml_cv_result object |
... |
Ignored |
The object x, invisibly.
Print ml_drift_result
## S3 method for class 'ml_drift_result'
print(x, ...)
x |
An ml_drift_result object |
... |
Ignored |
The object x, invisibly.
Print ml_embedder
## S3 method for class 'ml_embedder'
print(x, ...)
x |
An ml_embedder object |
... |
Ignored |
The object x, invisibly.
Print ml_evidence
## S3 method for class 'ml_evidence'
print(x, ...)
x |
An ml_evidence object |
... |
Ignored |
The object x, invisibly.
Print ml_explanation
## S3 method for class 'ml_explanation'
print(x, ...)
x |
An ml_explanation object |
... |
Ignored |
The object x, invisibly.
Print ml_leaderboard
## S3 method for class 'ml_leaderboard'
print(x, ...)
x |
An ml_leaderboard object |
... |
Ignored |
The object x, invisibly.
Print ml_metrics
## S3 method for class 'ml_metrics'
print(x, ...)
x |
An ml_metrics object |
... |
Ignored |
The object x, invisibly.
Print an ml_model
## S3 method for class 'ml_model'
print(x, ...)
x |
An ml_model object |
... |
Ignored |
The object x, invisibly.
Print ml_profile_result
## S3 method for class 'ml_profile_result'
print(x, ...)
x |
An ml_profile_result object |
... |
Ignored |
The object x, invisibly.
Print ml_shelf_result
## S3 method for class 'ml_shelf_result'
print(x, ...)
x |
An ml_shelf_result object |
... |
Ignored |
The object x, invisibly.
Print an ml_split_result
## S3 method for class 'ml_split_result'
print(x, ...)
x |
An ml_split_result object |
... |
Ignored |
The object x, invisibly.
Print an ml_tuning_result
## S3 method for class 'ml_tuning_result'
print(x, ...)
x |
An ml_tuning_result object |
... |
Ignored |
The object x, invisibly.
Print ml_validate_result
## S3 method for class 'ml_validate_result'
print(x, ...)
x |
An ml_validate_result object |
... |
Ignored |
The object x, invisibly.