Title: | Handy Tools for TJU/TJUH Employees |
---|---|
Description: | Functions for admin needs of employees of Thomas Jefferson University and Thomas Jefferson University Hospital, Philadelphia, PA. |
Authors: | Tingting Zhan [aut, cre, cph] |
Maintainer: | Tingting Zhan <[email protected]> |
License: | GPL-2 |
Version: | 0.1.3 |
Built: | 2025-02-17 02:43:44 UTC |
Source: | https://github.com/cran/ThomasJeffersonUniv |
Add conditional and/or marginal probabilities to a two-way contingency table.
addProbs(A, margin = seq_len(nd), fmt = "%d (%.1f%%)")
A |
matrix of typeof integer, two-dimensional contingency table. See addmargins |
margin |
integer scalar or vector, see addmargins |
fmt |
character scalar,
C-style string format with a |
Function addProbs provides the joint, marginal (using margin = 1:2) and conditional (using margin = 1L or margin = 2L) probabilities of a two-dimensional contingency table.
Function addProbs returns an 'addProbs' object, which inherits from table and noquote.
margin.table (which is to be renamed as marginSums) is much slower than colSums.
The use of argument margin is the same as addmargins, and different from proportions!
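The difference between the two meanings of margin can be seen with base R alone. The sketch below does not call addProbs and is not its implementation; the last step is only one plausible way to combine counts and percentages using the default fmt string.

```r
# Base R only; addProbs() itself is not used here.
tab <- table(warpbreaks$wool, warpbreaks$tension)

# addmargins(): margin = 1L sums over dimension 1 and appends a 'Sum' row
with_sum_row <- addmargins(tab, margin = 1L)

# proportions(): margin = 1L conditions on dimension 1, so each row sums to 1
row_cond <- proportions(tab, margin = 1L)

# Assumed illustration: paste counts and percentages with the default fmt
cells <- sprintf("%d (%.1f%%)", as.vector(tab), 100 * as.vector(row_cond))
dim(cells) <- dim(tab)
dimnames(cells) <- dimnames(tab)
noquote(cells)
```

Function proportions requires R 4.0.1 or later; prop.table is the older name for the same operation.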
addProbs(table(warpbreaks$tension))
storage.mode(VADeaths) = 'integer'
addProbs(VADeaths)
addProbs(VADeaths, margin = 1L)
rowSums(proportions(VADeaths, margin = 1L))
addmargins(VADeaths, margin = 1L)
Number of anniversaries between two dates.
anniversary(to, from)
to |
an R object convertible to POSIXlt, end date/time |
from |
an R object convertible to POSIXlt, start date/time |
1. The year difference between from and to is calculated.
2. In either situation below, subtract one (1) year from the year difference obtained in Step 1:
   - the month of from is later than the month of to;
   - the months of from and to are the same, but the day of from is later than the day of to.
   In either situation, the anniversary of the current year has not been reached.
3. If any element from Step 2 is negative, stop.
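The steps above can be sketched in base R as follows. The helper name anniversary_sketch and its error message are made up for illustration; this is a minimal reading of the described steps, not the package's own code.

```r
# Hypothetical helper mirroring the three steps; not the package's anniversary()
anniversary_sketch <- function(to, from) {
  to <- as.POSIXlt(to)
  from <- as.POSIXlt(from)
  # Step 1: raw difference in calendar years
  yr <- to$year - from$year
  # Step 2: subtract one year where this year's anniversary is not yet reached
  not_reached <- (from$mon > to$mon) |
    (from$mon == to$mon & from$mday > to$mday)
  yr <- yr - as.integer(not_reached)
  # Step 3: a negative result means `from` is later than `to`
  if (any(yr < 0, na.rm = TRUE)) stop("`from` is later than `to`")
  as.integer(yr)
}

anniversary_sketch(to = as.Date("2024-03-14"), from = as.Date("2000-03-15"))  # 23
anniversary_sketch(to = as.Date("2024-03-15"), from = as.Date("2000-03-15"))  # 24
```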
Function anniversary returns an integer scalar or vector.
To create a difftime object with additional time units 'months' and 'years'.
asDifftime(
  tim,
  units = names(timeUnits()),
  negative_do = stop(sQuote(deparse1(substitute(tim))), " has negative value!"),
  ...
)
tim |
numeric or difftime object, similar usage as in function as.difftime |
units |
character scalar,
similar usage as in function as.difftime,
but with additional options |
negative_do |
exception handling
if input |
... |
additional parameters, currently not in use |
Function asDifftime improves on function as.difftime in the following ways:
- If input tim is a difftime object, function units_difftime<- is called and the unit of tim is updated. In function as.difftime, tim is returned directly, i.e., parameter units is ignored.
- Time units 'months' and 'years' are supported, in addition to 'secs', 'mins', 'hours', 'days' and 'weeks' supported in function as.difftime. Moreover, partial matching (via function match.arg) is allowed, while function as.difftime requires exact matching.
- The end user may choose to stop if tim has negative values. Function as.difftime does not check for negative tim.
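The base R behaviours that asDifftime is contrasted with can be checked directly. The snippet below uses only base R and does not call asDifftime; how the package defines 'months' and 'years' is not shown here.

```r
# Base R only; asDifftime() itself is not called here.
x <- as.difftime(90, units = "mins")

# 1. as.difftime() returns a difftime input as-is, so `units` is ignored
same <- as.difftime(x, units = "hours")
units(same)            # still "mins"

# The unit of an existing difftime is changed with `units<-`
y <- x
units(y) <- "hours"
as.numeric(y)          # 1.5

# 2. as.difftime() needs an exact unit name; match.arg() accepts partial names
base_units <- c("secs", "mins", "hours", "days", "weeks")
res <- tryCatch(as.difftime(1, units = "day"), error = function(e) "error")
matched <- match.arg("day", choices = base_units)   # "days"

# 3. as.difftime() accepts negative values without complaint
neg <- as.difftime(-1, units = "days")
```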
Function asDifftime returns a difftime object.
Potential name clash with function as_difftime
R markdown format of a citation and/or bibentry object.
bibentry2rmd(x = "R")
x |
character scalar,
|
Function bibentry2rmd beautifies the output from function utils:::format.bibentry (with option style = 'text') in the following ways:
- Line break '\n' is replaced by a white space;
- Fancy quotes are removed;
- doi entries are shown as URLs with labels (in R markdown grammar).
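As a rough illustration of these three clean-up steps, the sketch below applies plain gsub calls to a made-up citation string. The input text, the doi, and the exact link layout are assumptions for demonstration; the actual patterns used by bibentry2rmd may differ.

```r
# Made-up citation text, for illustration only
txt <- "Doe J (2020).\n\u201cA Made-Up Title.\u201d doi:10.1234/example"

# 1. replace line breaks by a white space
txt <- gsub("\n", " ", txt, fixed = TRUE)

# 2. remove fancy single and double quotes
txt <- gsub("[\u2018\u2019\u201c\u201d]", "", txt)

# 3. turn a doi into a labelled link in R markdown grammar
txt <- gsub("doi:(\\S+)", "[doi:\\1](https://doi.org/\\1)", txt)
txt
```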
Function bibentry2rmd returns a character scalar or vector.
bibentry2rmd('survival')
if (FALSE) { # disabled for ?devtools::check
  ap = rownames(installed.packages())
  lapply(ap, FUN = bibentry2rmd)
}
Number and percentage of positive counts in a logical vector.
checkCount(x)
x |
Function checkCount returns a character scalar.
checkCount(as.logical(infert$case))
To inspect duplicated records in a data.frame.
checkDuplicated(
  data,
  f,
  dontshow = character(length = 0L),
  file = tempfile(pattern = "checkDuplicated_", fileext = ".xlsx"),
  ...
)
data |
|
f |
formula,
criteria of duplication, e.g.,
use |
dontshow |
(optional) character scalar or vector,
variable names to be omitted in output diagnosis |
file |
character scalar, path of diagnosis file, print out of substantial duplicates |
... |
additional parameters, currently not in use |
Function checkDuplicated returns a data.frame.
(d1 = data.frame(A = c(1, 1), B = c(NA_character_, 'text')))
(d2 = data.frame(A = c(1, 2), B = c(NA_character_, 'text')))
..
date_difftime_(date_, difftime_, tz = "UTC", tol = sqrt(.Machine$double.eps))
date_ |
an R object containing Date information |
difftime_ |
a difftime object |
tz |
character scalar, time zone, see as.POSIXlt.Date and ISOdatetime |
tol |
numeric scalar, tolerance in finding second.
Default |
Function date_difftime_ returns a POSIXct object.
For now, I do not know how to force function readxl::read_excel to read a column as POSIXt. By default, such a column will be read as difftime.
See lubridate:::date.default for the handling of year and month!
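In base R, a Date and a time-of-day stored as difftime can be combined by adding the difftime to midnight of that date. The sketch below is a simplified stand-in, assuming UTC; it does not reproduce the tol handling of date_difftime_.

```r
# Base R only; a simplified stand-in for date_difftime_(), assuming UTC
d <- as.Date(c("2022-09-10", NA, "2022-12-31"))
tm <- as.difftime(c(13.5, 1, NA), units = "hours")

# midnight (UTC) of each date, then add the time-of-day
midnight <- as.POSIXct(format(d), tz = "UTC")
out <- midnight + tm
fmt <- format(out, "%Y-%m-%d %H:%M:%S", tz = "UTC")
fmt
```

A missing date or a missing time yields a missing date-time.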
(x = as.Date(c('2022-09-10', '2023-01-01', NA, '2022-12-31')))
y = as.difftime(c(47580.3, NA, 48060, 30660), units = 'secs')
units(y) = 'hours'
y
date_difftime_(x, y)
Concatenate date and time information from two objects.
date_time_(date_, time_)
date_ |
an R object containing Date information |
time_ |
an R object containing time (POSIXt) information |
Function date_time_ is useful as clinicians may put date and time in different columns.
Function date_time_ returns a POSIXct object.
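One way to do this in base R is to paste the calendar date of one object onto the clock time of the other. The sketch below assumes UTC throughout and is not the package's implementation.

```r
# Base R only; a simplified stand-in for date_time_(), assuming UTC
date_part <- as.Date("2024-05-06")
time_part <- ISOdatetime(1899, 12, 31, 15, 2, 1, tz = "UTC")  # only 15:02:01 matters

stamp <- paste(format(date_part, "%Y-%m-%d"),
               format(time_part, "%H:%M:%S", tz = "UTC"))
combined <- as.POSIXct(stamp, tz = "UTC")
format(combined, "%Y-%m-%d %H:%M:%S", tz = "UTC")  # "2024-05-06 15:02:01"
```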
(today = Sys.Date())
(y = ISOdatetime(year = c(1899, 2010), month = c(12, 3), day = c(31, 22), hour = c(15, 3), min = 2, sec = 1, tz = 'UTC'))
date_time_(today, y)
Convert between decimal, hexavigesimal in C-style, and hexavigesimal in Excel-style.
Excel2int(x)
Excel2C(x)
x |
character scalar or vector,
which consists of (except missingness)
only letters |
Decimal | 0 | 1 | 25 | 26 | 27 | 51 | 52 | 676 | 702 | 703 |
---|---|---|---|---|---|---|---|---|---|---|
Hexavigesimal; C | 0 | 1 | P | 10 | 11 | 1P | 20 | 100 | 110 | 111 |
Hexavigesimal; Excel | 0 | A | Y | Z | AA | AY | AZ | YZ | ZZ | AAA |
Function Excel2C converts from hexavigesimal in Excel-style to hexavigesimal in C-style.
Function Excel2int converts from hexavigesimal in Excel-style to decimal, using function Excel2C and strtoi.
Function Excel2int returns an integer vector.
Function Excel2C returns a character vector.
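The Excel-style column labels form a bijective base-26 system (A = 1, ..., Z = 26, no zero digit). The sketch below converts labels to integers directly, which is a different route from the Excel2C plus strtoi approach described above; the helper name excel2int_sketch is hypothetical.

```r
# Hypothetical helper; a direct bijective base-26 conversion
excel2int_sketch <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_integer_)
    # letter positions: A = 1, ..., Z = 26 (case-insensitive)
    digits <- utf8ToInt(toupper(s)) - utf8ToInt("A") + 1L
    # weight the leftmost letter by the highest power of 26
    as.integer(sum(digits * 26^rev(seq_along(digits) - 1L)))
  }, FUN.VALUE = NA_integer_, USE.NAMES = FALSE)
}

excel2int_sketch(c("A", "Z", "AA", "YZ", "ZZ", "AAA"))  # 1 26 27 676 702 703
```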
http://mathworld.wolfram.com/Hexavigesimal.html
int1 = c(NA_integer_, 1L, 25L, 26L, 27L, 51L, 52L, 676L, 702L, 703L)
Excel1 = c(NA_character_, 'A', 'Y', 'Z', 'AA', 'AY', 'AZ', 'YZ', 'ZZ', 'AAA')
C1 = c(NA_character_, '1', 'P', '10', '11', '1P', '20', '100', '110', '111')
stopifnot(identical(int1, Excel2int(Excel1)), identical(int1, strtoi(C1, base = 26L)))
int2 = c(NA_integer_, 1L, 4L, 19L, 37L, 104L, 678L)
Excel2 = c(NA_character_, 'a', 'D', 's', 'aK', 'cZ', 'Zb')
stopifnot(identical(int2, Excel2int(Excel2)))
Excel2C(Excel2)
head(swiss[Excel2int('A')])
To match the rows of one data.frame to the rows of another data.frame.
matchDF(
  x,
  table = unique.data.frame(x),
  by = names(x),
  by.x = character(),
  by.table = character(),
  view.table = character(),
  trace = FALSE,
  ...
)
x |
data.frame, the rows of which to be matched. |
table |
data.frame, the rows of which to be matched against. |
by |
|
by.x , by.table
|
|
view.table |
(optional) character scalar or vector,
variable names of |
trace |
logical scalar, to provide detailed diagnosis information, default |
... |
additional parameters, currently not in use |
Function matchDF returns an integer vector.
Unfortunately, R does not provide case-insensitive match. Only case-insensitive grep methods are available.
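A common base R idiom for matching rows is to paste the key columns into one string per row and call match on those strings; lower-casing both sides first gives a case-insensitive match. The helper match_rows_sketch is hypothetical and only illustrates the idea, not how matchDF works internally.

```r
# Hypothetical helper: row matching via pasted keys
match_rows_sketch <- function(x, table, by = names(x)) {
  # one key string per row; "\r" is an unlikely separator inside real values
  key <- function(d) do.call(paste, c(unname(as.list(d[by])), sep = "\r"))
  match(key(x), key(table))
}

x <- data.frame(a = c(1, 2, 1), b = c("u", "v", "u"))
match_rows_sketch(x, table = unique(x))  # 1 2 1

# case-insensitive matching of a single key column
ci <- match(tolower(c("Ripley", "TUKEY")), tolower(c("Tukey", "Ripley")))  # 2 1
```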
DF = swiss[sample(nrow(swiss), size = 55, replace = TRUE), ]
matchDF(DF)
..
mergeDF(
  x,
  table,
  by = character(),
  by.x = character(),
  by.table = character(),
  ...
)
x |
data.frame, on which new columns will be added.
All rows of |
table |
data.frame, columns of which will be added to |
by |
|
by.x , by.table
|
|
... |
additional parameters of matchDF |
Function mergeDF returns a data.frame.
We avoid merge.data.frame as much as possible, because it is slow and even sort = FALSE may not completely retain the original order of input x.
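An order-preserving alternative to merge.data.frame can be built on match: look up each row of x in table, then bind the looked-up columns onto x. The helper merge_sketch and its toy inputs are hypothetical; mergeDF may differ in detail.

```r
# Hypothetical helper: add columns of `table` to `x` without reordering `x`
merge_sketch <- function(x, table, by.x, by.table) {
  id <- match(x[[by.x]], table[[by.table]])   # NA where no match is found
  extra <- table[id, setdiff(names(table), by.table), drop = FALSE]
  rownames(extra) <- NULL
  cbind(x, extra)
}

books <- data.frame(name = c("Tukey", "Diggle", "Ripley"))
authors <- data.frame(surname = c("Ripley", "Tukey"), nationality = c("UK", "US"))
m <- merge_sketch(books, authors, by.x = "name", by.table = "surname")
m
```

All rows of x are kept in their original order; unmatched rows receive missing values in the added columns.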
# examples inspired by ?merge.data.frame
(authors = data.frame(
  surname = c('Tukey', 'Venables', 'Tierney', 'Ripley', 'McNeil'),
  nationality = c('US', 'Australia', 'US', 'UK', 'Australia'),
  deceased = c('yes', rep('no', 4))))
(books = data.frame(
  name = c('Tukey', 'Venables', 'Tierney', 'Ripley', 'Ripley', 'McNeil', 'R Core', 'Diggle'),
  title = c(
    'Exploratory Data Analysis', 'Modern Applied Statistics', 'LISP-STAT',
    'Spatial Statistics', 'Stochastic Simulation', 'Interactive Data Analysis',
    'An Introduction to R', 'Analysis of Longitudinal Data'),
  other.author = c(
    NA, 'Ripley', NA, NA, NA, NA, 'Venables & Smith', 'Heagerty & Liang & Scott Zeger')))
(m = mergeDF(books, authors, by.x = 'name', by.table = 'surname'))
attr(m, 'nomatch')
..
phone10(x, sep = "")
x |
|
sep |
character scalar |
Function phone10 converts all US and Canada (+1) phone numbers to 10-digit.
Function phone10 returns a character vector of nchar-10.
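The conversion can be approximated in base R by dropping every non-digit, removing a leading country code 1 from 11-digit numbers, and re-joining the three groups with sep. The helper phone10_sketch is hypothetical; how phone10 treats invalid or missing input is not covered.

```r
# Hypothetical helper approximating phone10(); no input validation
phone10_sketch <- function(x, sep = "") {
  d <- gsub("[^0-9]", "", x)                       # keep digits only
  strip <- nchar(d) == 11L & startsWith(d, "1")    # leading +1 country code
  d <- ifelse(strip, substring(d, 2L), d)
  paste(substr(d, 1L, 3L), substr(d, 4L, 6L), substr(d, 7L, 10L), sep = sep)
}

phone10_sketch(c("+1(800)275-2273", "1-888-280-4331"))  # "8002752273" "8882804331"
phone10_sketch("+1(800)275-2273", sep = "-")            # "800-275-2273"
```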
x = c(
  '+1(800)275-2273', # Apple
  '1-888-280-4331', # Amazon
  '000-000-0000'
)
phone10(x)
phone10(x, sep = '-')
..
rbinds(x, make.row.names = FALSE, ..., .id = "idx")
x |
a list of named data.frame |
make.row.names , ...
|
additional parameters of rbind.data.frame |
.id |
character value to specify the name of ID column, nomenclature follows rbindlist |
Yet to look into ggplot2:::rbind_dfs closely.
Mine is slightly slower than the fastest alternatives, but I have more checks which are useful.
Function rbinds returns a data.frame.
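For comparison, the plain base R route is do.call with rbind.data.frame, followed by an ID column built from the list names. This sketch carries none of the additional checks mentioned above, and placing the ID column first is an assumption.

```r
# Base R only; a bare-bones stand-in for rbinds() without its checks
x <- list(A = swiss[1:3, 1:2], B = swiss[5:9, 1:2])

stacked <- do.call(rbind.data.frame, c(x, list(make.row.names = FALSE)))
# ID column named 'idx', repeating each list name once per row of that element
out <- data.frame(idx = rep(names(x), times = vapply(x, nrow, 0L)), stacked)
out
```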
https://stackoverflow.com/questions/2851327/combine-a-list-of-data-frames-into-one-data-frame
x = list(A = swiss[1:3, 1:2], B = swiss[5:9, 1:2]) # list of 'data.frame'
rbinds(x)
rbinds(x, make.row.names = TRUE)
Indices of Stratified Sampling
sample.by.int(f, ...)
f |
|
... |
potential parameters of sample.int |
End user should use interaction to combine multiple factors.
Function sample.by.int returns an integer vector.
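Stratified index sampling can be sketched in base R by splitting the row indices by the factor and sampling within each group. The helper sample_by_sketch is hypothetical and assumes every stratum holds at least size elements.

```r
# Hypothetical helper: sample `size` indices within each level of `f`
sample_by_sketch <- function(f, size) {
  idx <- split(seq_along(f), f)
  # index via sample.int() to avoid the sample(<length-1 vector>) pitfall
  picked <- lapply(idx, function(i) i[sample.int(length(i), size = size)])
  unlist(picked, use.names = FALSE)
}

set.seed(1)
id <- sample_by_sketch(state.region, size = 2L)
table(state.region[id])  # two states per region
```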
dplyr::slice_sample
id1 = sample.by.int(state.region, size = 2L)
state.region[id1]
id2 = sample.by.int(f = with(npk, interaction(N, P)), size = 2L)
npk[id2, c('N', 'P')] # each combination selected 2x
..
sign2(
  e1,
  e2,
  name1 = substitute(e1),
  name2 = substitute(e2),
  na.detail = TRUE,
  ...
)
e1 , e2
|
two R objects, must be both numeric vectors, or ordered factors with the same levels |
name1 , name2
|
|
na.detail |
logical scalar,
whether to provide the missingness details of |
... |
additional parameters, currently not in use |
Function sign2 extends sign in the following ways
Function sign2 returns a character vector when na.detail = TRUE, or an ordered factor when na.detail = FALSE.
lv = letters[c(1,3,2)]
x0 = letters[1:3]
x = ordered(sample(x0, size = 100, replace = TRUE), levels = lv)
y = ordered(sample(x0, size = 50, replace = TRUE), levels = lv)
x < y # base R ok
pmax(x, y) # base R okay
pmin(x, y) # base R okay
x[c(1,3)] = NA
y[c(3,5)] = NA
table(sign(unclass(y) - unclass(x)))
table(sign2(x, y))
table(sign2(x, y, na.detail = FALSE), useNA = 'always')
Source all *.R and *.r files under a directory.
sourcePath(path, ...)
path |
character scalar, parent directory of |
... |
additional parameters of source |
Function sourcePath does not return a value.
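The same effect can be had in base R with list.files and source. The helper source_path_sketch is hypothetical; whether sourcePath also descends into sub-directories is not stated, so this sketch looks at the top level only.

```r
# Hypothetical helper: source every *.R / *.r file directly under `path`
source_path_sketch <- function(path, ...) {
  files <- list.files(path, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in files) source(f, ...)
  invisible()
}

# small self-contained demonstration in a temporary directory
demo_dir <- tempfile("sourcePath_demo_")
dir.create(demo_dir)
writeLines("a_demo <- 1", file.path(demo_dir, "one.R"))
writeLines("b_demo <- 2", file.path(demo_dir, "two.r"))
env <- new.env()
source_path_sketch(demo_dir, local = env)  # objects are created inside `env`
ls(env)
```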
split.data.frame into individual rows.
splitDF(x)
x |
Function splitDF returns a list of nrow-1 data.frames.
We use split.data.frame with argument f being attr(x, which = 'row.names', exact = TRUE) instead of seq_len(.row_names_info(x, type = 2L)), not only because the former is faster, but also because .rowNamesDF<- enforces that row.names.data.frame must be unique.
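The splitting step described here can be tried directly in base R. The sketch only demonstrates the split.data.frame call and does not reproduce any additional handling inside splitDF (for example, of empty data frames).

```r
# Base R only: one-row pieces via split.data.frame() on the row names
x <- head(mtcars)
pieces <- split.data.frame(x, f = attr(x, which = "row.names", exact = TRUE))

length(pieces)                  # one list element per row
vapply(pieces, nrow, 0L)        # every element has exactly one row
```

Because split converts f to a factor, the list follows the sorted levels of the row names, not necessarily the original row order.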
splitDF(head(mtcars)) # data.frame with rownames
splitDF(head(warpbreaks)) # data.frame without rownames
splitDF(data.frame()) # exception
..
subset_(x, subset, select, select_pattern, avoid, avoid_pattern)
x |
|
subset |
logical expression, see function subset.data.frame |
select |
character vector, columns to be selected, see function subset.data.frame |
select_pattern |
regular expression regex for multiple columns to be selected |
avoid |
|
avoid_pattern |
regular expression regex, for multiple columns to be avoided |
Function subset_ differs from subset.data.frame in that
- if both select and select_pattern are missing, only variables mentioned in subset are selected;
- it can select all variables, except those in avoid and avoid_pattern;
- it always returns a data.frame, i.e., forces drop = FALSE.
Function subset_ returns a data.frame, with additional attributes
- attr(,'vline'): integer scalar, position of a vertical line (see ?flextable::vline);
- attr(,'jhighlight'): character vector, names of columns to be highlighted with flextable::highlight.
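The column-selection rules can be imitated in base R: all.vars extracts the variables mentioned in a condition, and setdiff or grepl drops unwanted columns. This is only an illustration of the rules, with an arbitrary pattern '^Infant' chosen for the example; it is not how subset_ is implemented and it sets no attributes.

```r
# Base R only; illustrates the selection rules, not subset_() itself
cond <- quote(Girth > 9 & Height < 70)
vars <- all.vars(cond)                       # variables mentioned in the condition
rows <- eval(cond, envir = trees)
out1 <- trees[rows & !is.na(rows), vars, drop = FALSE]

# keep everything except an avoided name and an avoided pattern
keep <- setdiff(names(swiss), "Catholic")
keep <- keep[!grepl("^Infant", keep)]
out2 <- swiss[swiss$Fertility > 80, keep, drop = FALSE]
names(out2)
```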
subset_(trees, Girth > 9 & Height < 70)
subset_(swiss, Fertility > 80, avoid = 'Catholic')
subset_(warpbreaks, wool == 'K')
Create right-censored Surv object using start, stop and censoring dates.
Surv_3Date(start, stop, censor, units = "years", ...)
start , stop , censor
|
|
units |
(optional) character scalar, time units |
... |
potential parameters, currently not in use |
Function Surv_3Date returns a Surv object.
library(survival)
d1 = within(survival::udca, expr = {
  edp_yr = Surv_3Date(entry.dt, death.dt, last.dt, units = 'years')
  edp_mon = Surv_3Date(entry.dt, death.dt, last.dt, units = 'months')
})
head(d1)
noout = within(survival::udca, expr = {
  edp_bug = Surv_3Date(entry.dt, death.dt, as.Date('1991-01-01'), units = 'months')
})
subset(survival::udca, subset = entry.dt > as.Date('1991-01-01')) # check error as suggested
Print out grant and effort from Cayuse.
aggregateAwards(path = "~/Downloads", fiscal.year = year(Sys.Date()))
viewProposal(path = "~/Downloads", fiscal.year = year(Sys.Date()))
viewAward(path = "~/Downloads")
award2LaTeX(path = "~/Downloads")
path |
character scalar, directory of downloaded award |
fiscal.year |
integer scalar |
Go to https://jefferson.cayuse424.com/sp/index.cfm
My Proposals -> Submitted Proposals.
Lower-right corner of screen, 'Export to CSV'.
Downloaded file has name pattern '^proposals_.*\\.csv'
My Awards -> Awards (not 'Active Projects').
Lower-right corner of screen, 'View All', then 'Export to CSV'.
Downloaded file has name pattern '^Awards_.*\\.csv'
My Awards -> Awards. Click into each project, under 'People' tab to find my 'Sponsored Effort'
Function aggregateAwards aggregates grants over different periods (e.g., from Axx-xx-001, Axx-xx-002, Axx-xx-003 to Axx-xx).
Then we need to manually add our 'Sponsored Effort' to the returned .csv file.
..
if (FALSE) {
  aggregateAwards()
  viewAward()
  viewProposal()
  award2LaTeX()
}
..
TJU_Fiscal_Year(x)
x |
integer scalar |
Function TJU_Fiscal_Year returns a length-two Date vector, indicating the start (July 1 of the previous calendar year) and end date (June 30) of a fiscal year.
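The two dates follow directly from the description above and can be written in base R as below. The helper name fiscal_year_sketch is hypothetical.

```r
# Hypothetical helper built from the description: July 1 of the previous
# calendar year through June 30 of fiscal year `x`
fiscal_year_sketch <- function(x) {
  as.Date(c(sprintf("%d-07-01", x - 1L), sprintf("%d-06-30", x)))
}

fiscal_year_sketch(2022L)  # "2021-07-01" "2022-06-30"
```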
TJU_Fiscal_Year(2022L)
..
TJU_SchoolTerm(x)
x |
Date object |
Function TJU_SchoolTerm returns a character vector.
TJU_SchoolTerm(as.Date(c('2021-03-14', '2022-01-01', '2022-05-01')))
To summarize the number of workdays, weekends, holidays and vacations in a given time-span (e.g., a month or a quarter of a year).
TJU_Workday(x, vacations)
x |
character scalar or vector (e.g.,
|
vacations |
Function TJU_Workday summarizes the workdays, weekends, Jefferson paid holidays (New Year’s Day, Martin Luther King, Jr. Day, Memorial Day, Fourth of July, Labor Day, Thanksgiving and Christmas) and your vacation (e.g., sick, personal, etc.) days (if any), in a given time-span.
Per Jefferson policy (source needed), if a holiday is on Saturday, then the preceding Friday is considered to be a weekend day. If a holiday is on Sunday, then the following Monday is considered to be a weekend day.
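The Saturday/Sunday shift described above can be sketched in base R by looking up the weekday of each holiday and moving the date by one day where needed. The helper shifted_day_sketch is hypothetical and covers only this shift, not the full workday tabulation.

```r
# Hypothetical helper: the weekday affected when a holiday falls on a weekend
shifted_day_sketch <- function(d) {
  wd <- as.POSIXlt(d)$wday   # 0 = Sunday, ..., 6 = Saturday
  # Saturday -> preceding Friday; Sunday -> following Monday; otherwise unchanged
  d + ifelse(wd == 6L, -1L, ifelse(wd == 0L, 1L, 0L))
}

holidays <- as.Date(c("2021-07-04", "2021-12-25", "2022-07-04"))
shifted_day_sketch(holidays)  # "2021-07-05" "2021-12-24" "2022-07-04"
```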
Function TJU_Workday returns a factor.
table(TJU_Workday(c('2021-01', '2021-02')))
tryCatch(TJU_Workday(c('2019-10', '2019-12')), error = identity)
table(c(TJU_Workday('2019-10'), TJU_Workday('2019-12'))) # work-around
table(TJU_Workday('2022-12'))
table(TJU_Workday('2022 Q1', vacations = seq.Date(
  from = as.Date('2022-03-14'), to = as.Date('2022-03-18'), by = 1)))
table(TJU_Workday('2022 Q2', vacations = as.Date(c(
  '2022-05-22', '2022-05-30', '2022-06-01', '2022-07-04'))))
table(TJU_Workday(2021L))
To remove leading/trailing and duplicated (symbols that look like) white spaces.
More aggressive than function trimws.
trimws_(x)
x |
Function trimws_ is more aggressive than trimws, in that it removes
- duplicated white spaces;
- symbols that look like white space, such as \u00a0 (no-break space).
Function trimws_ returns an object of typeof character.
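A rough base R equivalent replaces the no-break space, collapses runs of white space, and then trims both ends. The helper trimws_sketch is hypothetical and handles only \u00a0 among the look-alike symbols; trimws_ may cover more.

```r
# Hypothetical helper; only \u00a0 is treated as a look-alike white space
trimws_sketch <- function(x) {
  x[] <- gsub("\u00a0", " ", x, fixed = TRUE)  # no-break space -> regular space
  x[] <- gsub("[[:space:]]+", " ", x)          # collapse duplicated white spaces
  x[] <- trimws(x)                             # drop leading/trailing white space
  x                                            # names and dim are retained
}

trimws_sketch(c(A = " a  b ", "\u00a0 ab "))
trimws_sketch(matrix(c(" a ", "b  c", "\u00a0d", "e"), nrow = 2L))
```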
gsub keeps attributes
(x = c(A = ' a b ', b = 'a . s', ' a , b ; ', '\u00a0 ab '))
base::trimws(x)
# raster::trim(x) # do not want to 'Suggests'
trimws_(x)
(xm = matrix(x, nrow = 2L))
trimws_(xm)
#library(microbenchmark)
#microbenchmark(trimws(x), trimws_(x))