KRLSThis vignette is the five-minute version: how to fit, inspect, and plot a kernel-regularized least squares (KRLS) regression. For the methodological background and pedagogy, see the explainer on the project site.
KRLS estimates a flexible regression surface without assuming linearity, additivity, or pre-specified interactions, and it returns pointwise marginal effects — one partial derivative per observation — with closed-form analytical standard errors. That’s the unique contribution: where OLS gives you one slope per predictor, KRLS gives you a distribution of slopes across the covariate space.
Build a dataset where the marginal effect of x1
varies across the covariate space. OLS will average it down to
a single slope and miss the heterogeneity entirely.
The formula interface is new in KRLS 1.2-0:
The matrix interface — krls(X = cbind(x1, x2), y = y) —
works exactly the same way and is preserved unchanged from earlier
releases.
library(generics)
#>
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
tidy(fit)
#> term estimate std.error statistic p.value conf.low conf.high
#> 1 x1 0.5184107 0.02599911 19.939556 3.091001e-49 0.4671400 0.5696814
#> 2 x2 0.2611639 0.02853130 9.153593 6.901213e-17 0.2048997 0.3174282
#> q25 median q75 binary
#> 1 0.19724353 0.4375873 0.7898796 FALSE
#> 2 0.07743713 0.2845457 0.4579881 FALSE
glance(fit)
#> nobs n_predictors r.squared loo_mse lambda sigma eff_df approx
#> 1 200 2 0.8755594 657.6741 0.4049892 2.45606 14.25015 none
#> nystrom_m inference
#> 1 NA exacttidy() returns one row per predictor with the
average marginal effect (estimate), its
analytical standard error, a 95% confidence interval, and quartiles of
the pointwise derivatives.
That last block is the headline diagnostic. The AME of
x1 averages \(\sin'(x_1) =
\cos(x_1)\) across a symmetric interval and so is close to zero.
But the pointwise marginal effects range from roughly −1 to +1
— exactly the heterogeneity the simulation built in. The quartiles tell
that story; OLS would not.
glance() is a one-row summary of the fit:
nobs, n_predictors, r.squared,
leave-one-out MSE, the regularization parameter lambda, the
kernel bandwidth sigma, and the effective degrees of
freedom from the kernel ridge.
autoplot() shows the per-predictor distribution of
pointwise marginal effects with the AME overlaid in blue:
The base-graphics plot(fit) is the same idea, with
optional “profile” plots that hold one predictor fixed and vary
another.
augment() joins fitted values, residuals, and pointwise
marginal-effect columns back to the original data — useful when you want
to feed the results into another model or visualization:
aug <- augment(fit, data = df)
head(aug, 3)
#> y x1 x2 .fitted .resid .dy_d_x1 .dy_d_x2
#> 1 -0.68353586 -0.9379653 -0.92996717 -0.8343403 0.1508045 0.5742612 0.05434422
#> 2 0.01717233 -0.5115044 -1.12541886 -0.5805310 0.5977033 0.6435138 0.11918619
#> 3 0.79687657 0.2914135 0.06718735 0.4516485 0.3452281 0.9391226 0.35127167The columns .dy_d_x1 and .dy_d_x2 carry the
pointwise derivatives, one row per observation, ready for
dplyr / ggplot2 / data.table
workflows.
?krls, ?summary.krls,
?predict.krls, ?plot.krls.tidy(), glance(), augment(),
autoplot() are discoverable via library(broom)
/ library(ggplot2).