lambert type transforms again.
lambert of type "hh" as option when allow_lambert_h is set to TRUE, per request from issue 24
dplyr::progress_estimated (#26)
bestLogConstant, that uses the same machinery to pick the best
value of a constant to use when logging a variable, e.g. the one that makes
the distribution look the most normal, especially useful for non-positive or
zero-inflated data. Currently experimental.
step_orderNorm() to work with parallel processing.
step_best_normalize() to work with parallel processing.
boxcox in response to issue 10; thank you to Krzysztof Dyba (kadyb) for the suggestions.
yeojohnson, thanks to Emil Hvitfeldt (EmilHvitfeldt) for his work on this problem for the recipes package here.
tidy method to work more generally, provide easy access to chosen transformations (responding to issue 9)
usethis in response to issue 7
n_logit_fit argument, with default of 10000. This should substantially decrease memory use of orderNorm while only minimally affecting the out-of-domain approximations.
step_bestNormalize to step_best_normalize, responding to 8
LambertW transformation types (thank you to Georg M. Goerg, the author of LambertW, for pointing this out).
center_scale transform as default when standardize == TRUE
T and F to TRUE and FALSE
scales and ggplot2 to visualize all transformations.
butcher and axe functionality in order to improve scalability of step_* functions
tidy functionality with bestNormalize and step_best_normalize
bestNormalize
standardize option from no_transform so x.t always matches input vector.
step_bestNormalize and step_orderNorm functions for implementation within recipes.
warn = FALSE when calling bestNormalize. If a transformation doesn't work, warnings will no longer be shown by default unless warn is set to TRUE.
plot.bestNormalize which was improperly labeling transformations
exp_x having trouble with standardize option, so added option allow_exp_x to
bestNormalize to allow a workaround, and changed it so if any infinite values
are produced during the transformation, exp_x will not work (that way, bestNormalize
will not include this in its results).
quiet is FALSE and length(x) > 2000
loo for leave-one-out cross-validation
bestNormalize function via allow_lambert_h argument.
Added feature to estimate out-of-sample normality statistics in bestNormalize instead of in-sample ones via repeated cross-validation
out_of_sample = FALSE to maintain backward-compatibility with prior versions and set allow_orderNorm = FALSE as well so that it isn't automatically selected
Improved extrapolation of the ORQ (orderNorm) method
Added plotting feature for transformation objects
Cleared up some documentation
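
The entry above about out_of_sample = FALSE and allow_orderNorm = FALSE can be illustrated with a short sketch. It assumes an installed version of bestNormalize that has these arguments (along with quiet, which also appears in the entries above). The gamma-distributed input is made-up toy data, and the variable names are only illustrative.

```r
library(bestNormalize)

set.seed(1)
# Toy data (an assumption for illustration): right-skewed, strictly positive
x <- rgamma(200, shape = 2, rate = 1)

# Select the transformation using in-sample normality statistics and keep
# orderNorm out of the candidate set, as the changelog entry describes
bn <- bestNormalize(
  x,
  out_of_sample = FALSE,
  allow_orderNorm = FALSE,
  quiet = TRUE
)

# Transformed values for the original data
x_t <- predict(bn)

# Map the transformed values back to the original scale
x_back <- predict(bn, newdata = x_t, inverse = TRUE)
```

Printing bn shows which transformation was chosen. Leaving out_of_sample at its default instead estimates the normality statistics by repeated cross-validation, which takes longer on large vectors.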