| Title: | Sparse Wrapper Algorithm |
|---|---|
| Description: | An algorithm that trains a meta-learning procedure that combines screening and wrapper methods to find a set of extremely low-dimensional attribute combinations. This package works on top of the 'caret' package and proceeds in a forward-step manner. More specifically, it builds and tests learners starting from very few attributes until it includes a maximal number of attributes by increasing the number of attributes at each step. Hence, for each fixed number of attributes, the algorithm tests various (randomly selected) learners and picks those with the best performance in terms of training error. Throughout, the algorithm uses the information coming from the best learners at the previous step to build and test learners in the following step. In the end, it outputs a set of strong low-dimensional learners. |
| Authors: | Samuel Orso [aut, cre], Gaetan Bakalli [aut], Cesare Miglioli [aut], Stephane Guerrier [ctb], Roberto Molinari [ctb] |
| Maintainer: | Samuel Orso <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.1.1 |
| Built: | 2025-01-10 05:18:49 UTC |
| Source: | https://github.com/smac-group/swag |
Gives predictions for the different `train` learners obtained by `swag`.
## S3 method for class 'swag' predict( object, newdata = NULL, type = c("best", "cv_performance", "attribute"), cv_performance = NULL, attribute = NULL, ... )
object |
An object of class `swag`. |
newdata |
An optional set of data to predict on. If `NULL`, the training data are used. |
type |
Type of prediction required. The default, "best", uses the best model (lowest CV error). The option "cv_performance" (which requires the `cv_performance` argument) uses all models whose CV error lies below the given level, and "attribute" (which requires the `attribute` argument) uses the models selected through the given attribute. |
cv_performance |
A level of CV error (between 0 and 1), used in combination with `type = "cv_performance"`. |
attribute |
An attribute, used in combination with `type = "attribute"`. |
... |
Not used for the moment. |
Currently, the different `train` learners are trained again to make the predictions. Returns the predictions.
Gaetan Bakalli, Samuel Orso and Cesare Miglioli
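As a hedged sketch of how the three `type` options might be used (`fit` is a hypothetical fitted `swag` object and `x_new` a hypothetical matrix of new attributes; argument names follow the signature above):

```r
## Hypothetical sketch, not run: requires a fitted swag object `fit`
## and new data `x_new` with the same attributes as the training set.

# Prediction from the single best model (lowest CV error)
pred_best <- predict(fit, newdata = x_new, type = "best")

# Predictions from all models whose CV error lies below 10%
pred_cv <- predict(fit, newdata = x_new,
                   type = "cv_performance", cv_performance = 0.1)

# Predictions from the models selected through a given attribute
pred_attr <- predict(fit, newdata = x_new,
                     type = "attribute", attribute = 1)
```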
The function returns a list containing `beta_models_df` and the `swag_summary` object. `beta_models_df` is a data frame whose columns are all of the variables selected by the `summary.swag` object and whose rows are the coefficients of each selected model, estimated with the classical `glm` procedure.
return_glm_beta_selected_models(swag_summary)
swag_summary |
A `summary.swag` object. |
Gaetan Bakalli, Samuel Orso, Cesare Miglioli and Lionel Voirol
The function returns a list containing `beta_models_df` and the `swag_summary` object. `beta_models_df` is a data frame whose columns are all of the variables selected by the `summary.swag` object and whose rows are the coefficients of each selected model, estimated with the classical `lm` procedure.
return_lm_beta_selected_models(swag_summary)
swag_summary |
A `summary.swag` object. |
Gaetan Bakalli, Samuel Orso, Cesare Miglioli and Lionel Voirol
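A minimal sketch of how these helpers might be chained (`fit` is a hypothetical fitted `swag` object; function names as in the signatures above):

```r
## Hypothetical sketch, not run: requires a fitted swag object `fit`.

sm  <- summary(fit)                        # a summary.swag object
res <- return_lm_beta_selected_models(sm)  # or return_glm_beta_selected_models(sm)

# One row per selected model, one column per selected variable
head(res$beta_models_df)
```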
Method 'summary' that returns the number and proportion of appearances of each variable in a subset of selected models. The model-selection procedure proceeds in two steps. First, we select the explored dimension in which the 'mean', 'min' or 'median' CV error is lowest. We then compute the selected quantile of the CV error within this dimension. Finally, we select, across all explored dimensions, every model with a CV error lower than the value set by this two-step procedure.
## S3 method for class 'swag' summary( object, min_dim_method = "median", min_dim_min_cv_error_quantile = 0.01, ... )
object |
A `swag` object. |
min_dim_method |
A `character` specifying how to select the dimension with the lowest CV error; one of "mean", "min" or "median". |
min_dim_min_cv_error_quantile |
The quantile of CV error within the selected dimension, used as the CV-error threshold for selected models. |
... |
additional arguments affecting the summary produced. |
Gaetan Bakalli, Samuel Orso, Cesare Miglioli and Lionel Voirol
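For illustration, a sketch of the two-step selection described above (`fit` is a hypothetical fitted `swag` object; argument names and defaults as in the signature above):

```r
## Hypothetical sketch, not run: requires a fitted swag object `fit`.

## Pick the dimension with the lowest median CV error, then keep every
## model (in any dimension) below the 1% quantile of CV error there.
sm <- summary(fit,
              min_dim_method = "median",
              min_dim_min_cv_error_quantile = 0.01)
```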
`swag` is used to train a meta-learning procedure that combines screening and wrapper methods to find a set of extremely low-dimensional attribute combinations. `swag` works on top of the caret package and proceeds in a forward-step manner.
swag( x, y, control = swagControl(), auto_control = T, caret_args_dyn = NULL, metric = NULL, ... )
x |
A `matrix` or `data.frame` of attributes. |
y |
A `vector` of responses. |
control |
See `swagControl`. |
auto_control |
A `boolean`; if `TRUE`, some control parameters are adjusted depending on `x` and `y` (see `swagControl`). |
caret_args_dyn |
If not null, a function that can modify arguments for `train` dynamically (see the details). |
metric |
A `character` specifying the summary metric used to select the optimal model (see `train`). |
... |
Arguments to be passed to `train` (see the details). |
Currently, we expect the user to replace `...` with the arguments one would use for `train`. This requires knowing how to use the `train` function. If `...` is left unspecified, the default values of `train` are used, but this might lead to unexpected results. The function `caret_args_dyn` is expected to take as a first argument a `list` with all the arguments for `train` and as a second argument the number of attributes (see examples in the vignette).
More specifically, swag
builds and tests learners starting
from very few attributes until it includes a maximal number of attributes by
increasing the number of attributes at each step. Hence, for each fixed number
of attributes, the algorithm tests various (randomly selected) learners and
picks those with the best performance in terms of training error. Throughout,
the algorithm uses the information coming from the best learners at the previous
step to build and test learners in the following step. In the end, it outputs
a set of strong low-dimensional learners. See Molinari et al. (2020) for
more details.
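The workflow described above might look as follows (a hedged sketch, not run here: the learner `method` and `trControl` values are illustrative choices forwarded to `caret::train()`, and the control values are arbitrary):

```r
## Illustrative sketch: fit swag on a binary classification task.
library(swag)
library(caret)

data(iris)
keep <- iris$Species != "setosa"
x <- as.matrix(iris[keep, 1:4])
y <- factor(iris$Species[keep])

fit <- swag(
  x, y,
  control = swagControl(pmax = 3, m = 50, alpha = 0.1, seed = 163L),
  auto_control = FALSE,
  ## the remaining arguments are passed on to caret::train()
  method = "glm",
  trControl = trainControl(method = "cv", number = 5),
  metric = "Accuracy"
)
```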
`swag` returns an object of class "swag". It is a `list` with the following components:
x |
same as x input |
y |
same as y input |
control |
the control used (see swagControl ) |
CVs |
a list containing cross-validation errors from all trained models |
VarMat |
a list containing information about which models are trained |
cv_alpha |
a vector of size pmax containing the
cross-validation error at alpha (see swagControl ) |
IDs |
a list containing information about the trained models that perform better than the corresponding cv_alpha error |
args_caret |
arguments used for train |
args_caret_dyn |
same as args_caret_dyn input |
Gaetan Bakalli, Samuel Orso and Cesare Miglioli
Molinari R, Bakalli G, Guerrier S, Miglioli C, Orso S, Scaillet O (2020). “SWAG: A Wrapper Method for Sparse Learning.” Version 1: 23 June 2020, arXiv:2006.12837, https://arxiv.org/pdf/2006.12837.pdf.
The Sparse Wrapper AlGorithm depends on some meta-parameters that are described below.
swagControl( pmax = 3, m = 100, alpha = 0.05, seed = 163L, verbose = FALSE, verbose_dim_1 = FALSE )
pmax |
An `integer` representing the maximum number of attributes per learner. |
m |
An `integer` representing the maximum number of learners explored per dimension. |
alpha |
A `double` (between 0 and 1) representing the proportion of screening, i.e. the share of best-performing learners kept at each step. |
seed |
An `integer` used as the seed for replicability. |
verbose |
A `boolean` for printing the current progress of the algorithm. |
verbose_dim_1 |
A `boolean` for printing progress while exploring the first dimension. |
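For instance, a control object with illustrative values (the numbers below are arbitrary choices, not recommendations):

```r
## Illustrative sketch: stricter screening (alpha = 0.05), more learners
## per dimension (m = 200), up to 4 attributes per learner, with progress
## messages enabled.
ctrl <- swagControl(pmax = 4, m = 200, alpha = 0.05,
                    seed = 163L, verbose = TRUE)
```

The resulting `ctrl` would then be passed to `swag()` via its `control` argument.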