Releases: easystats/datawizard
datawizard 1.3.0
BREAKING CHANGES
- Argument values_fill in data_to_wide() is now defunct, because it did not
work as intended (#645).
- data_to_wide() no longer removes empty columns that were created after
widening data frames, to behave similarly to tidyr::pivot_wider() (#645).
CHANGES
- data_tabulate() now saves the table of proportions for crosstables as an
attribute, accessible via the new as.prop.table() method (#656).
- Due to changes in the package insight, data_tabulate() no longer prints
decimals when all values in a column are integers (#641).
- Argument values_from in data_to_wide() now supports select-helpers like
the select argument in other {datawizard} functions (#645).
- Added a display() method for data_codebook() (#646).
- display() methods now support the {tinytable} package. Use format = "tt"
to export tables as tinytable objects (#646).
- Improved performance for several functions that process grouped data frames
when the input is a grouped tibble (#651).
BUG FIXES
- Fixed an issue when demean()ing nested structures with more than 2 grouping
variables (#635).
- Fixed an issue when demean()ing crossed structures with more than 2 grouping
variables (#638).
- Fixed issue in data_to_wide() with multiple variables assigned in
values_from when IDs were not balanced (equally spread across observations)
(#644).
- Fixed issue in data_replicate() when data frame had only one column to
replicate (#654).
datawizard 1.2.0
BREAKING CHANGES
- The following deprecated arguments have been removed (#603):
  - drop_na in data_match()
  - safe, pattern, and verbose in data_rename()
CHANGES
- data_read() and data_write() now support the .parquet file format, via
the nanoparquet package (#625).
- data_tabulate() gets a display() method (#627).
- data_tabulate() gets an as.table() method to coerce the frequency or
contingency table into a (list of) table() object(s). This can be useful for
further statistical analysis, e.g. in combination with chisq.test() (#629).
- The print() method for data_tabulate() now appears in the documentation,
making the big_mark argument visible (#627).
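The as.table() coercion described above can be sketched as follows. This is a minimal illustration, assuming datawizard >= 1.2.0 is installed; the mtcars data set and the variable choice are illustrative assumptions, not part of the release notes. Because the result may be a list of tables, the sketch unwraps the first element when needed.

```r
library(datawizard)

# Cross table of cylinders by transmission type (mtcars ships with base R;
# the choice of variables is only for illustration).
x <- data_tabulate(mtcars, "cyl", by = "am")

# Coerce to base R table object(s). The result may be a list of tables,
# so take the first element if a single table was not returned.
tab <- as.table(x)
if (!is.table(tab)) tab <- tab[[1]]

# Small cell counts trigger an approximation warning in chisq.test();
# it is suppressed here to keep the sketch quiet.
res <- suppressWarnings(chisq.test(tab))
res$p.value
```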
BUG FIXES
- Fixed an issue when printing cross tables using data_tabulate(by = ...),
which was caused by the recent changes in insight::export_table().
- Fixed another issue when printing cross tables using data_tabulate(by = ...),
when more than one variable was selected for select (#630).
- Fixed typo in the documentation of data_match().
datawizard 1.1.0
BREAKING CHANGES
- data_read() now also returns Bayesian models from packages brms and
rstanarm as original model objects, and no longer coerces them into data
frames (#606).
- The output format of describe_distribution() on grouped data has changed.
Before, it printed one table per group combination. Now, it prints a single
table with group columns at the start (#610).
- The output format of describe_distribution() when confidence intervals are
requested has changed. Now, a confidence interval is calculated for each
centrality measure (#617).
- data_modify() now always uses values of a vector for a modified or newly
created variable, and no longer tries to detect whether a character value
possibly contains an expression. To allow expressions provided as strings (or
character vectors), use the helper function as_expr(). Only literal
expressions or strings wrapped in as_expr() will be evaluated as
expressions, everything else will be treated as a vector with values for new
variables (#605).
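The data_modify() rule above can be illustrated with a short sketch, assuming datawizard >= 1.1.0. The data frame d and the column names y, z and label are hypothetical and chosen only for this example.

```r
library(datawizard)

# Hypothetical toy data for illustration.
d <- data.frame(x = 1:3)

out <- data_modify(
  d,
  y = 2 * x,                             # literal expression: evaluated
  z = as_expr("x + 1"),                  # string wrapped in as_expr(): evaluated
  label = c("x + 1", "x + 1", "x + 1")   # plain character values: stored as-is
)
out
```

The label column keeps the text "x + 1" because character values are no longer inspected for possible expressions.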
CHANGES
- display() is now re-exported from package insight.
- data_read() and data_write() now rely on base-R functions for files of
type .rds, .rda or .rdata. Thus, package rio is no longer required
to be installed for these file types (#607).
- data_codebook() gives an informative warning when no column names matched
the selection pattern (#601).
- data_to_long() now errors when columns selected to reshape do not exist in
the data, to avoid nonsensical results that could be missed (#602).
- New argument by in describe_distribution() (#604).
- describe_distribution() now gives informative errors when column names
in the input data frame conflict with columns from the output table (#612).
- The methods for parameters_distribution objects are now defined in
datawizard (they were previously in parameters) (#613).
BUG FIXES
- Fixed bug in data_to_wide(), where new column names in names_from were
ignored when that column only contained one unique value.
- Fixed bug in describe_distribution() when some group combinations
didn't appear in the data (#609).
- Fixed bug in describe_distribution() when more than one value for the
centrality argument was specified (#617).
- Fixed bug in describe_distribution() where setting verbose = FALSE
didn't hide some warnings (#617).
- Fixed warning in data_summary() when a variable had the same name as
another object in the global environment (#585).
datawizard 1.0.2
BUG FIXES
- Fixed failing R CMD check on ATLAS, noLD, and OpenBLAS due to small numerical
differences (#592).
datawizard 1.0.1
BUG FIXES
- Fixed issue in data_arrange() for data frames that only had one column.
Formerly, the data frame was coerced into a vector, now the data frame class
is preserved.
- Fixed issue in R-devel (4.5.0) due to a change in how grep() handles logical
arguments with missing values (#588).
datawizard 1.0.0
BREAKING CHANGES AND DEPRECATIONS
- datawizard now requires R >= 4.0 (#515).
- Argument drop_na in data_match() is deprecated now. Please use
remove_na instead (#556).
- In data_rename() (#567):
  - argument pattern is deprecated. Use select instead.
  - argument safe is deprecated. The function now errors when select
  contains unknown column names.
  - when replacement is NULL, an error is now thrown (previously, column
  indices were used as new names).
  - if select (previously pattern) is a named vector, then all elements
  must be named, e.g. c(length = "Sepal.Length", "Sepal.Width") errors.
- Order of arguments by and probability_weights in rescale_weights() has
changed, because for method = "kish", the by argument is optional (#575).
- The rescaled weights variables in rescale_weights() have been
renamed. pweights_a and pweights_b are now named rescaled_weights_a
and rescaled_weights_b (#575).
- print() methods for data_tabulate() with multiple sub-tables (i.e. when
length of by was > 1) were revised. Now, an integrated table instead of
multiple tables is returned. Furthermore, print_html() did not work, which
has also been fixed (#577).
- demean() (and degroup()) gets an append argument that defaults to TRUE,
to append the centered variables to the original data frame, instead of
returning the de- and group-meaned variables only. Use append = FALSE
for the previous default behaviour (i.e. only returning the newly created
variables) (#579).
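The effect of the append argument in demean() can be sketched as follows, assuming datawizard >= 1.0.0. The iris data set and the chosen variable are illustrative assumptions.

```r
library(datawizard)

# New default: centered variables are appended to the original data frame.
appended <- demean(iris, select = "Sepal.Length", by = "Species")

# Previous default behaviour: only the newly created variables are returned.
only_new <- demean(iris, select = "Sepal.Length", by = "Species", append = FALSE)

ncol(appended)
names(only_new)
```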
CHANGES
- rescale_weights() gets a method argument, to choose method to rescale
weights. Options are "carle" (the default) and "kish" (#575).
- The select argument, which is available in different functions to select
variables, can now also be a character vector with quoted variable names,
including a colon to indicate a range of several variables (e.g. "cyl:gear")
(#551).
- New function row_sums(), to calculate row sums (optionally with minimum
amount of valid values), as complement to row_means() (#552).
- New function row_count(), to count specific values row-wise (#553).
- data_read() no longer shows warning about forthcoming breaking changes
in upstream packages when reading .RData files (#557).
- data_modify() now recognizes n(), for example to create an index for data
groups with 1:n() (#535).
- The replacement argument in data_rename() now supports glue-styled
tokens (#563).
- data_summary() also accepts the results of bayestestR::ci() as summary
function (#483).
- ranktransform() has a new argument zeros to determine how zeros should be
handled when sign = TRUE (#573).
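A few of the additions above can be tried with a small sketch, assuming datawizard >= 1.0.0. The data frame d and its column names are hypothetical; mtcars is used only because it ships with base R.

```r
library(datawizard)

# Hypothetical toy data with some missing values.
d <- data.frame(
  a = c(1, 2, NA),
  b = c(1, NA, NA),
  c = c(3, 1, 5)
)

# Row sums, requiring at least two valid (non-missing) values per row.
sums <- row_sums(d, min_valid = 2)

# Count how often the value 1 occurs in each row.
ones <- row_count(d, count = 1)

# Quoted range of variables in select: all columns from cyl to gear.
sel <- data_select(mtcars, "cyl:gear")
names(sel)
```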
BUG FIXES
datawizard 0.13.0
BREAKING CHANGES
- data_rename() now errors when the replacement argument contains NA values
or empty strings (#539).
- Removed deprecated functions get_columns(), data_find(), format_text() (#546).
- Removed deprecated arguments group and na.rm in multiple functions. Use by
and remove_na instead (#546).
- The default value for the argument dummy_factors in to_numeric() has
changed from TRUE to FALSE (#544).
CHANGES
- The pattern argument in data_rename() can also be a named vector. In this
case, names are used as values for the replacement argument (i.e. pattern
can be a character vector using <new name> = "<old name>").
- categorize() gains a new breaks argument, to decide whether breaks are
inclusive or exclusive (#548).
- The labels argument in categorize() gets two new options, "range" and
"observed", to use the range of categorized values as labels (i.e. factor
levels) (#548).
- Minor additions to reshape_ci() to work with forthcoming changes in the
{bayestestR} package.
datawizard 0.12.3
CHANGES
- demean() (and degroup()) now also work for nested designs, if argument
nested = TRUE and by specifies more than one variable (#533).
- Vignettes are no longer provided in the package, they are now only available
on the website. There is only one "Overview" vignette available in the package,
it contains links to the other vignettes on the website. This is because there
are CRAN errors occurring when building vignettes on macOS and we couldn't
determine the cause after multiple patch releases (#534).
datawizard 0.12.2
- Remove htmltools from Suggests in an attempt to fix an error in CRAN
checks due to failures to build a vignette (#528).
datawizard 0.12.0
BREAKING CHANGES
- The argument include_na in data_tabulate() and data_summary() has been
renamed to remove_na. Consequently, to mimic former behaviour, FALSE and
TRUE need to be switched (i.e. remove_na = TRUE is equivalent to the former
include_na = FALSE).
- Class names for objects returned by data_tabulate() have been changed to
datawizard_table and datawizard_crosstable (resp. the plural forms,
*_tables), to provide a clearer and more consistent naming scheme.
CHANGES
- data_select() can directly rename selected variables when a named vector
is provided in select, e.g. data_select(mtcars, c(new1 = "mpg", new2 = "cyl")).
- data_tabulate() gains an as.data.frame() method, to return the frequency
table as a data frame. The structure of the returned object is a nested data
frame, where the first column contains the name of the variable for which
frequencies were calculated, and the second column contains the frequency table.
- demean() (and degroup()) now also work for cross-classified designs, or
more generally, for data with multiple grouping or cluster variables (i.e.
by can now specify more than one variable).
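The renaming shortcut in data_select() mentioned above can be run as a short sketch, assuming datawizard >= 0.12.0. The call is the one quoted in the release note; the object name renamed is an assumption for illustration.

```r
library(datawizard)

# Select mpg and cyl from mtcars and rename them in one step:
# names of the vector become the new column names.
renamed <- data_select(mtcars, c(new1 = "mpg", new2 = "cyl"))
names(renamed)
```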