Package 'ThomasJeffersonUniv' reference manual

Title:	Handy Tools for TJU/TJUH Employees
Description:	Functions for admin needs of employees of Thomas Jefferson University and Thomas Jefferson University Hospital, Philadelphia, PA.
Authors:	Tingting Zhan [aut, cre]
Maintainer:	Tingting Zhan <[email protected]>
License:	GPL-2
Version:	0.1.3
Built:	2025-03-17 05:21:00 UTC
Source:	https://github.com/tingtingzhan/thomasjeffersonuniv

Number of Anniversaries Between Two Dates

Description

Number of anniversaries between two dates.

Usage

anniversary(to, from)
anniversary(to, from)

Arguments

`to`	an R object convertible to POSIXlt, end date/time
`from`	an R object convertible to POSIXlt, start date/time

Details

Year difference between from and to dates are calculated
In either situation below, subtract one (1) year from the year difference obtained in Step 1.
- Month of from is later than month of to;
- Months of from and to are the same, but day of from is later than day of to.
In either of such situations, the anniversary of the current year has not been reached.
If any element from Step 2 is negative, stop.

Value

Function anniversary returns an integer scalar or vector.

Assign Key to bibentry

Description

Assign an attributes named 'key' to bibentry object.

Usage

bibentry_key(x) <- value
bibentry_key(x) <- value

Arguments

`x`	bibentry (and inherited citation) object
`value`	character scalar or vector

Note

'key' is the attribute of bibentry, which is used in utils:::.bibentry_get_key and utils:::toBibtex.bibentry.

factor Return for case_match

Description

Usage

case_match_factor(
  .x,
  ...,
  .default = NULL,
  .ptype = NULL,
  envir = parent.frame()
)
case_match_factor(
  .x,
  ...,
  .default = NULL,
  .ptype = NULL,
  envir = parent.frame()
)

Arguments

`.x`, `...`, `.default`, `.ptype`	see case_match
`envir`	environment to evaluate function case_match. Default `parent.frame()`

Details

If argument .default is missing, function case_match_factor converts the return of case_match into a factor. The order of levels follows the order of formulas in dynamic dots ....

Value

Function case_match_factor returns a factor.

Examples

# from ?dplyr::case_match
x = c('a', 'b', 'a', 'd', 'b', NA, 'c', 'e')
dplyr::case_match(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 3, 'd' ~ 4)
case_match_factor(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 3, 'd' ~ 4)
case_match_factor(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 1, 'd' ~ 2)

case_match_factor(x, 'a' ~ 1, 'd' ~ 4, 'b' ~ 2, 'c' ~ 3) # order matters!

y = c(1, 2, 1, 3, 1, NA, 2, 4)
case_match_factor(y, NA ~ 0, .default = y)
# from ?dplyr::case_match
x = c('a', 'b', 'a', 'd', 'b', NA, 'c', 'e')
dplyr::case_match(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 3, 'd' ~ 4)
case_match_factor(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 3, 'd' ~ 4)
case_match_factor(x, 'a' ~ 1, 'b' ~ 2, 'c' ~ 1, 'd' ~ 2)

case_match_factor(x, 'a' ~ 1, 'd' ~ 4, 'b' ~ 2, 'c' ~ 3) # order matters!

y = c(1, 2, 1, 3, 1, NA, 2, 4)
case_match_factor(y, NA ~ 0, .default = y)

Positive Counts in a logical vector

Description

Number and percentage of positive counts in a logical vector.

Usage

checkCount(x)
checkCount(x)

Arguments

`x`	logical vector

Value

Function checkCount returns a character scalar.

Examples

checkCount(as.logical(infert$case))
checkCount(as.logical(infert$case))

Inspect Duplicated Rows in a data.frame

Description

To inspect duplicated rows in a data.frame.

Usage

checkDuplicated(
  data,
  f,
  dontshow = character(length = 0L),
  file = tempfile(pattern = "checkdup_", fileext = ".xlsx"),
  rule,
  ...
)
checkDuplicated(
  data,
  f,
  dontshow = character(length = 0L),
  file = tempfile(pattern = "checkdup_", fileext = ".xlsx"),
  rule,
  ...
)

Arguments

`data`	data.frame
`f`	formula, criteria of duplication, e.g., use `~ mrn` to identify duplicated `mrn`, or use `~ mrn + visitdt` to identify duplicated `mrn:visitdt`
`dontshow`	(optional) character scalar or vector, variable names to be omitted in output diagnosis `file`
`file`	character scalar, path of diagnosis file, print out of substantial duplicates
`rule`	language, rule of dealing with duplicates
`...`	additional parameters, currently not in use

Value

Function checkDuplicated returns a data.frame.

Examples

x = swiss[c(1, 1:5), ]
x$Agriculture[2] = x$Agriculture[1] + 1
x
checkDuplicated(x, ~ Fertility)
x = swiss[c(1, 1:5), ]
x$Agriculture[2] = x$Agriculture[1] + 1
x
checkDuplicated(x, ~ Fertility)

Divide Numeric Into Intervals

Description

Divide numeric into intervals; an alternative to function cut.default.

Usage

cut_(
  x,
  breaks = quantile(x, probs = break_probs, na.rm = TRUE),
  break_probs,
  right = TRUE,
  include.lowest = TRUE,
  data.name = substitute(x),
  ...
)

cut_levels(breaks, right = TRUE, include.lowest = TRUE, data.name = "x")
cut_(
  x,
  breaks = quantile(x, probs = break_probs, na.rm = TRUE),
  break_probs,
  right = TRUE,
  include.lowest = TRUE,
  data.name = substitute(x),
  ...
)

cut_levels(breaks, right = TRUE, include.lowest = TRUE, data.name = "x")

Arguments

`x`	integer, numeric, difftime, Date, POSIXct vector or matrix.
`breaks`	vector of the same class as `x`. `-Inf` and `Inf` will be automatically added
`break_probs`	double vector from 0 to 1, probabilities to specify the quantiles to be used as `breaks`
`right`	logical scalar, default `TRUE`, see functions cut.default and .bincode.
`include.lowest`	logical scalar, default `TRUE`, see functions cut.default and .bincode.
`data.name`	character scalar, name of data. R language is also accepted. Default is the argument call of `x`
`...`	additional parameters, currently not in use

Details

Function cut_ is different from function cut.default, that

More classes of x are accepted, see Arguments
-Inf and Inf are added to breaks, so that the values outside of breaks will be correctly categorized, instead of returning an NA_integer per .bincode
More user-friendly factor levels, see helper function cut_levels

Examples

x = c(tmp <- c(10, 31, 45, 50, 52, NA, 55, 55, 57, 58.5, 60, 92), rev.default(tmp))
(xm = array(x, dim = c(6L, 4L)))
brk = c(20, 60, 80)
cut_(x, breaks = brk)
cut_(x, break_probs = c(.3, .6))
cut_(xm, breaks = brk)

cut_(rnorm(100), break_probs = c(.3, .6))

(x2 = zoo::as.Date.ts(airmiles))
length(x2)
cut_(x2, breaks = as.Date(c('1942-01-01', '1950-01-01')))

x2a = x2
attr(x2a, 'dim') = c(4L, 6L)
cut_(x2a, breaks = as.Date(c('1942-01-01', '1950-01-01')))

x3 = 0:10
cut_(x3, breaks = c(0, 3, 6), right = FALSE)
cut_(x3, breaks = c(0, 3, 6), right = TRUE)

if (FALSE) {
# ?base::.bincode much faster than ?base::findInterval
x = 2:18
v = c(5, 10, 15) # create two bins [5,10) and [10,15)
findInterval(x, v)
.bincode(x, v)
library(microbenchmark)
microbenchmark(findInterval(x, v), .bincode(x, v))
}
 
## Examples on Helper Function cut_levels()
foo = function(...) cbind(
 levels(cut.default(numeric(0), ...)),
 cut_levels(...)
)
foo(breaks = 1:4, right = TRUE, include.lowest = TRUE)
foo(breaks = 1:4, right = FALSE, include.lowest = TRUE)
foo(breaks = 1:4, right = TRUE, include.lowest = FALSE)
foo(breaks = 1:4, right = FALSE, include.lowest = FALSE)
set.seed(2259); foo(breaks = c(-Inf, sort(rnorm(1:3)), Inf))
x = c(tmp <- c(10, 31, 45, 50, 52, NA, 55, 55, 57, 58.5, 60, 92), rev.default(tmp))
(xm = array(x, dim = c(6L, 4L)))
brk = c(20, 60, 80)
cut_(x, breaks = brk)
cut_(x, break_probs = c(.3, .6))
cut_(xm, breaks = brk)

cut_(rnorm(100), break_probs = c(.3, .6))

(x2 = zoo::as.Date.ts(airmiles))
length(x2)
cut_(x2, breaks = as.Date(c('1942-01-01', '1950-01-01')))

x2a = x2
attr(x2a, 'dim') = c(4L, 6L)
cut_(x2a, breaks = as.Date(c('1942-01-01', '1950-01-01')))

x3 = 0:10
cut_(x3, breaks = c(0, 3, 6), right = FALSE)
cut_(x3, breaks = c(0, 3, 6), right = TRUE)

if (FALSE) {
# ?base::.bincode much faster than ?base::findInterval
x = 2:18
v = c(5, 10, 15) # create two bins [5,10) and [10,15)
findInterval(x, v)
.bincode(x, v)
library(microbenchmark)
microbenchmark(findInterval(x, v), .bincode(x, v))
}
 
## Examples on Helper Function cut_levels()
foo = function(...) cbind(
 levels(cut.default(numeric(0), ...)),
 cut_levels(...)
)
foo(breaks = 1:4, right = TRUE, include.lowest = TRUE)
foo(breaks = 1:4, right = FALSE, include.lowest = TRUE)
foo(breaks = 1:4, right = TRUE, include.lowest = FALSE)
foo(breaks = 1:4, right = FALSE, include.lowest = FALSE)
set.seed(2259); foo(breaks = c(-Inf, sort(rnorm(1:3)), Inf))

Concatenate a Date and a difftime Object

Description

Usage

date_difftime_(date_, difftime_, tz = "UTC", tol = sqrt(.Machine$double.eps))
date_difftime_(date_, difftime_, tz = "UTC", tol = sqrt(.Machine$double.eps))

Arguments

`date_`	an R object containing Date information
`difftime_`	a difftime object
`tz`	character scalar, time zone, see as.POSIXlt.Date and ISOdatetime
`tol`	numeric scalar, tolerance in finding second. Default `sqrt(.Machine$double.eps)` as in all.equal.numeric

Value

Function date_difftime_ returns a POSIXct object.

Note

For now, I do not know how to force function readxl::read_excel to read a column as POSIXt. By default, such column will be read as difftime.

See lubridate:::date.default for the handling of year and month!

Examples

(x = as.Date(c('2022-09-10', '2023-01-01', NA, '2022-12-31')))
y = as.difftime(c(47580.3, NA, 48060, 30660), units = 'secs')
units(y) = 'hours'
y
date_difftime_(x, y)
(x = as.Date(c('2022-09-10', '2023-01-01', NA, '2022-12-31')))
y = as.difftime(c(47580.3, NA, 48060, 30660), units = 'secs')
units(y) = 'hours'
y
date_difftime_(x, y)

Concatenate Date and Time

Description

Concatenate date and time information from two objects.

Usage

date_time_(date_, time_)
date_time_(date_, time_)

Arguments

`date_`	an R object containing Date information
`time_`	an R object containing time (POSIXt) information

Details

Function date_time_ is useful as clinicians may put date and time in different columns.

Value

Function date_time_ returns a POSIXct object.

Examples

(today = Sys.Date())
(y = ISOdatetime(year = c(1899, 2010), month = c(12, 3), day = c(31, 22), 
  hour = c(15, 3), min = 2, sec = 1, tz = 'UTC'))
date_time_(today, y)
(today = Sys.Date())
(y = ISOdatetime(year = c(1899, 2010), month = c(12, 3), day = c(31, 22), 
  hour = c(15, 3), min = 2, sec = 1, tz = 'UTC'))
date_time_(today, y)

Clopper-Pearson Exact Binomial Confidence Interval

Description

Clopper-Pearson exact binomial confidence interval.

Usage

exact_confint(
  x,
  n,
  level = 0.95,
  alternative = c("two.sided", "less", "greater"),
  ...
)
exact_confint(
  x,
  n,
  level = 0.95,
  alternative = c("two.sided", "less", "greater"),
  ...
)

Arguments

`x`	positive integer scalar or vector, counts
`n`	positive integer scalar or vector, sample sizes $n$
`level`	numeric scalar, confidence level, default .95
`alternative`	character scalar, `'two.sided'` (default), `'less'` or `'greater'`
`...`	potential parameters

Value

Function exact_confint returns an S3 'exact_confint' object, inspired by element ⁠$conf.int⁠ of an 'htest' object, i.e., the returned value of functions t.test, prop.test, etc.

An 'exact_confint' object is a double matrix with additional attributes,

attr(.,'conf.level'): double scalar, default .95, to mimic the element ⁠$conf.int⁠ of an 'htest' object
attr(.,'alternative'): character scalar
attr(.,'x'): integer scalar or vector
attr(.,'n'): integer scalar or vector

Note

Function Hmisc::binconf uses qf.

Functions binom.test and binom::binom.confint uses qbeta (equivalent but much cleaner!)

Only function binom.test provides one-sided confidence interval.

References

doi:10.1093/biomet/26.4.404

Examples

exact_confint(0:10, 10L)
exact_confint(0:10, 10L, alternative = 'less')
exact_confint(0:10, 10L, alternative = 'greater')
exact_confint(0:10, 10L)
exact_confint(0:10, 10L, alternative = 'less')
exact_confint(0:10, 10L, alternative = 'greater')

max and min of factor

Description

Usage

## S3 method for class 'factor'
max(..., na.rm = FALSE)

## S3 method for class 'factor'
min(..., na.rm = FALSE)
## S3 method for class 'factor'
max(..., na.rm = FALSE)

## S3 method for class 'factor'
min(..., na.rm = FALSE)

Arguments

`...`	one factor object
`na.rm`	logical scalar

Value

Functions max.factor and min.factor both return a factor.

Examples

(x = structure(c(NA_integer_, sample.int(3L, size = 20L, replace = TRUE)),
 levels = letters[1:3], class = 'factor'))
max(x, na.rm = FALSE)
max(x, na.rm = TRUE)
min(x, na.rm = TRUE)

(x0 = structure(rep(NA_integer_, times = 20L),
 levels = letters[1:3], class = 'factor'))
max(x0)
min(x0)
(x = structure(c(NA_integer_, sample.int(3L, size = 20L, replace = TRUE)),
 levels = letters[1:3], class = 'factor'))
max(x, na.rm = FALSE)
max(x, na.rm = TRUE)
min(x, na.rm = TRUE)

(x0 = structure(rep(NA_integer_, times = 20L),
 levels = letters[1:3], class = 'factor'))
max(x0)
min(x0)

Syntactic Sugar to Set (ordered) factor levels

Description

Syntactic sugar to set (ordered) factor levels.

Usage

factor(x, plus = 0L, ordered = FALSE) <- value

ordered(x, plus = 0L) <- value
factor(x, plus = 0L, ordered = FALSE) <- value

ordered(x, plus = 0L) <- value

Arguments

`x`	integer vector, dummy coding indices
`plus`	integer scalar, value to be added to `x`, default `0L`. If the dummy coding starts from 0 (instead of from 1), then use `plus = 1L`
`ordered`	logical scalar, should the returned factor be ordered (`TRUE`) or unordered (`FALSE`, default)
`value`	character vector, levels of output factor

Value

Syntactic sugar factor<- returns a factor.

Syntactic sugar ordered<- returns an ordered factor.

Examples

(x1 = x2 = sample.int(n = 5L, size = 20, replace = TRUE))
factor(x1) = letters[1:5]; x1
ordered(x2) = LETTERS[1:5]; x2

set.seed(141); (x10 = x20 = sample.int(n = 5L, size = 20, replace = TRUE) - 1L)
factor(x10, plus = 1L) = letters[1:5]; x10
ordered(x20, plus = 1L) = LETTERS[1:5]; x20

# some exceptions
x = 1:4
factor(x) = c('a', 'b', 'c', 'c')
x # duplicated levels dropped

x = 1:4
factor(x) = c('a', 'b', NA_character_, 'c')
x # missing level converted to missing entry

x = 1:4
ordered(x) = c('a', 'b', NA_character_, 'c')
x # correctly ordered

(x = array(sample.int(4, size = 20, replace = TRUE), dim = c(4,5)))
factor(x) = letters[1:4]
x # respects other attributes
(x1 = x2 = sample.int(n = 5L, size = 20, replace = TRUE))
factor(x1) = letters[1:5]; x1
ordered(x2) = LETTERS[1:5]; x2

set.seed(141); (x10 = x20 = sample.int(n = 5L, size = 20, replace = TRUE) - 1L)
factor(x10, plus = 1L) = letters[1:5]; x10
ordered(x20, plus = 1L) = LETTERS[1:5]; x20

# some exceptions
x = 1:4
factor(x) = c('a', 'b', 'c', 'c')
x # duplicated levels dropped

x = 1:4
factor(x) = c('a', 'b', NA_character_, 'c')
x # missing level converted to missing entry

x = 1:4
ordered(x) = c('a', 'b', NA_character_, 'c')
x # correctly ordered

(x = array(sample.int(4, size = 20, replace = TRUE), dim = c(4,5)))
factor(x) = letters[1:4]
x # respects other attributes

File Modification Time by Pattern

Description

Usage

file_mtime(
  path,
  pattern = "\\.xlsx$|\\.xls$|\\.csv$",
  file = list.files(path = path, pattern = pattern, ..., full.names = TRUE),
  ...
)
file_mtime(
  path,
  pattern = "\\.xlsx$|\\.xls$|\\.csv$",
  file = list.files(path = path, pattern = pattern, ..., full.names = TRUE),
  ...
)

Arguments

`path`	character scalar, directory on hard drive, see list.files
`pattern`	character scalar, regular expression regex
`file`	(optional) character scalar or vector
`...`	..

Examples

## Not run:  # devtools::check error
file_mtime('./R', pattern = '\\.R$')
file_mtime('./src', pattern = '\\.cpp$')

## End(Not run)

## Not run:  # devtools::check error
file_mtime('./R', pattern = '\\.R$')
file_mtime('./src', pattern = '\\.cpp$')

## End(Not run)

Force to logical vector

Description

Usage

force_bool(x, else_return = x)
force_bool(x, else_return = x)

Arguments

`x`	a factor, or a vector
`else_return`	an R object to return if cannot force into logical. Default `x`.

Details

Function force_bool tries to turn an object into logical.

Examples

force_bool(c('0', '1', '0', NA))
(tmp = factor(rep(0:1, times = 10L)))
force_bool(tmp)

force_bool(c('0', '1', '0', NA))
(tmp = factor(rep(0:1, times = 10L)))
force_bool(tmp)

Format file.size

Description

Format file.size.

Usage

format_file_size(..., units = "auto")
format_file_size(..., units = "auto")

Arguments

`...`	character scalar or vector, file paths
`units`	character scalar, see parameter `units` of function format.object_size

Details

Function format_file_size formats file.size in the same manner as function format.object_size does to object.size.

Value

Function format_file_size returns a character vector.

Note

Return of function file.size is simply numeric, thus we will not be able to define an S3 method dispatch for the S3 generic format, yet.

Examples

# format_file_size('./R/allequal.R', './R/approxdens.R')
# format_file_size('./R/allequal.R', './R/approxdens.R')

Format Clopper-Pearson Exact Binomial Confidence Interval

Description

To format a Clopper-Pearson exact binomial confidence interval.

Usage

## S3 method for class 'exact_confint'
format(x, data.name, ...)
## S3 method for class 'exact_confint'
format(x, data.name, ...)

Arguments

`x`	an exact_confint
`data.name`	(optional) character scalar
`...`	additional parameters, currently not in use

Value

Function format.exact_confint returns a character vector when argument data.name is missing; otherwise, a noquote character matrix is returned.

grepl, but recognize `NA`

Description

grepl, but recognize NA

Usage

grepl_(x, ...)
grepl_(x, ...)

Arguments

`x`	character vector
`...`	additional parameters of function grepl

Examples

grepl_(pattern = '^a', x = c('a', 'b', NA_character_))
grepl_(pattern = '^a', x = c('a', 'b', NA_character_))

gsub_yes

Description

gsub_yes

Usage

gsub_yes(pattern, replacement = "YES", x, ...)
gsub_yes(pattern, replacement = "YES", x, ...)

Arguments

`pattern`	character scalar
`replacement`	character scalar
`x`	character vector
`...`	additional parameters of function grepl

Value

Function

Examples

gsub_yes('^a', x = c('a', 'b', NA_character_))
gsub_yes('^a', x = c('a', 'b', NA_character_))

Excel-Style Hexavigesimal (A to Z)

Description

Convert between decimal, C-style hexavigesimal (0 to 9, A to P), and Excel-style hexavigesimal (A to Z).

Usage

Excel2int(x)

Excel2C(x)

int2Excel(x)
Excel2int(x)

Excel2C(x)

int2Excel(x)

Arguments

`x`	integer scalar or vector for function int2Excel. character scalar or vector for functions Excel2C and Excel2int, which consists of (except missingness) only letters `A` to `Z` and `a` to `z`.

Details

Convert between decimal, C-style hexavigesimal, and Excel-style hexavigesimal.

Decimal	0	1	25	26	27	51	52	676	702	703
Hexavigesimal; C	`0`	`1`	`P`	`10`	`11`	`⁠1P⁠`	`20`	`100`	`110`	`111`
Hexavigesimal; Excel	`0`	`A`	`Y`	`Z`	`AA`	`AY`	`AZ`	`YZ`	`ZZ`	`AAA`

Function Excel2C converts Excel-style hexavigesimal (A to Z) to C-style hexavigesimal (0 to 9, A to P).

Function Excel2int converts Excel-style hexavigesimal (A to Z) to decimal, using function Excel2C and strtoi.

Function int2Excel converts decimal to Excel-style hexavigesimal (A to Z). This function works very differently from R's solution to hexadecimal and decimal conversions. Function as.hexmode returns an object of typeof integer. Then function format.hexmode, i.e., the workhorse of function print.hexmode, relies on ⁠%x⁠ (hexadecimal) format option of function sprintf.

Value

Function Excel2int returns an integer vector.

Function Excel2C returns a character vector.

Function int2Excel returns a integer vector.

References

http://mathworld.wolfram.com/Hexavigesimal.html

Examples

# table in documentation
int1 = c(NA_integer_, 1L, 25L, 26L, 27L, 51L, 52L, 676L, 702L, 703L)
Excel1 = c(NA_character_, 'A', 'Y', 'Z', 'AA', 'AY', 'AZ', 'YZ', 'ZZ', 'AAA')
C1 = c(NA_character_, '1', 'P', '10', '11', '1P', '20', '100', '110', '111')
stopifnot(
 identical(int1, Excel2int(Excel1)), 
 identical(int1, strtoi(C1, base = 26L)),
 identical(int2Excel(int1), Excel1)
)

# another example
int2 = c(NA_integer_, 1L, 4L, 19L, 37L, 104L, 678L)
Excel2 = c(NA_character_, 'a', 'D', 's', 'aK', 'cZ', 'Zb')
Excel2C(Excel2)
stopifnot(
 identical(int2, Excel2int(Excel2)),
 identical(int2Excel(int2), toupper(Excel2))
)
# table in documentation
int1 = c(NA_integer_, 1L, 25L, 26L, 27L, 51L, 52L, 676L, 702L, 703L)
Excel1 = c(NA_character_, 'A', 'Y', 'Z', 'AA', 'AY', 'AZ', 'YZ', 'ZZ', 'AAA')
C1 = c(NA_character_, '1', 'P', '10', '11', '1P', '20', '100', '110', '111')
stopifnot(
 identical(int1, Excel2int(Excel1)), 
 identical(int1, strtoi(C1, base = 26L)),
 identical(int2Excel(int1), Excel1)
)

# another example
int2 = c(NA_integer_, 1L, 4L, 19L, 37L, 104L, 678L)
Excel2 = c(NA_character_, 'a', 'D', 's', 'aK', 'cZ', 'Zb')
Excel2C(Excel2)
stopifnot(
 identical(int2, Excel2int(Excel2)),
 identical(int2Excel(int2), toupper(Excel2))
)

Hour-Minute-Second

Description

Function read_excel might read a date-less time stamp (e.g., ⁠3:00:00 PM⁠ as ⁠1899-12-31 15:00:00 UTC⁠).

Usage

hms(x)

## S3 method for class 'POSIXlt'
hms(x)

## S3 method for class 'POSIXct'
hms(x)
hms(x)

## S3 method for class 'POSIXlt'
hms(x)

## S3 method for class 'POSIXct'
hms(x)

Arguments

`x`	a POSIXlt or POSIXct object

Value

Function hms() returns a difftime object.

Data Inspection

Description

Usage

inspect_(x, row_dup_rm = TRUE, col_na_rm = TRUE, ptn_Date, ...)
inspect_(x, row_dup_rm = TRUE, col_na_rm = TRUE, ptn_Date, ...)

Arguments

`x`	data.frame
`row_dup_rm`	logical scalar, whether to remove duplicated rows. Default `TRUE`.
`col_na_rm`	logical scalar, whether to remove all-missing columns. Default `TRUE`.
`ptn_Date`	regex, regular expression pattern of names of the columns to be converted to Dates.
`...`	additional parameters, currently not in use

Details

convert all eligible integer, numeric, factor and character columns to logical.

Value

Function inspect_ returns (invisibly) a data.frame.

Note

Be aware of potential name clash, e.g., lavaan::inspect.

Index to Factor, selected using Regular Expression

Description

Usage

key_rx(pattern, envir = parent.frame(), ...)
key_rx(pattern, envir = parent.frame(), ...)

Arguments

`pattern`, `envir`	see function select_rx
`...`	additional parameters of function id2key

Details

Function key_rx is the inverse of function splitKey.

Examples

npk |> within.data.frame(expr = {
 tmp = key_rx(pattern = 'N|P|K')
})
npk |> within.data.frame(expr = {
 tmp = key_rx(pattern = 'N|P|K')
})

Apply a Function to levels of a factor

Description

Apply a function to the levels of a factor.

Usage

levels_apply(x, FUN, ...)
levels_apply(x, FUN, ...)

Arguments

`x`	factor object
`FUN`	function
`...`	potential arguments of function `FUN`

Value

Function levels_apply returns

a factor, if FUN returns a character vector;
a vector, if otherwise.

Examples

(x1 = factor(rep(c('abE', 'fsSG'), times = c(2L, 3L))))
tolower(x1)
levels_apply(x1, FUN = tolower)

#library(microbenchmark)
#x1b = factor(rep(c('abE', 'fsSG'), times = c(1e3L, 1e4L)))
#microbenchmark(tolower(x1b), levels_apply(x1b, FUN = tolower))
(x1 = factor(rep(c('abE', 'fsSG'), times = c(2L, 3L))))
tolower(x1)
levels_apply(x1, FUN = tolower)

#library(microbenchmark)
#x1b = factor(rep(c('abE', 'fsSG'), times = c(1e3L, 1e4L)))
#microbenchmark(tolower(x1b), levels_apply(x1b, FUN = tolower))

The $n$ th Lower Edge(s) of a Matrix

Description

The $n$ th lower edge(s) of a matrix

Usage

lower_n(x, n = seq_len(.dim[1L] - 1L))
lower_n(x, n = seq_len(.dim[1L] - 1L))

Arguments

`x`	matrix
`n`	integer scalar or vector, order(s) of the lower edge(s)

Details

Function lower_n extends lower.tri, so that the logical indices of $n$ th lower-edge(s) elements are returned.

Value

Function lower_n returns a logical matrix.

Examples

# ?euro.cross
dim(euro.cross) # square matrix
lower_n(euro.cross) # I love dimnames 
lower.tri(euro.cross) # no dimnames
lower_n(euro.cross, 1:2)

dim(VADeaths) # non square matrix
lower_n(VADeaths, n = 1:2)

(x = array(1, dim = c(1, 1)))
lower_n(x) # exception handling
# ?euro.cross
dim(euro.cross) # square matrix
lower_n(euro.cross) # I love dimnames 
lower.tri(euro.cross) # no dimnames
lower_n(euro.cross, 1:2)

dim(VADeaths) # non square matrix
lower_n(VADeaths, n = 1:2)

(x = array(1, dim = c(1, 1)))
lower_n(x) # exception handling

Moving Average

Description

Usage

ma(x, order = 5L)
ma(x, order = 5L)

Arguments

`x`	numeric vector
`order`	integer scalar

Value

Function ma returns a time-series ts object from workhorse function filter.

Note

Function ma is a simplified version of function ma in package forecast.

Function ma is much faster than function rollmean in package zoo.

Function ma imports function filter from package stats, not function filter from package dplyr.

References

https://stackoverflow.com/questions/743812/calculating-moving-average

Examples

unclass(ma(1:20, order = 3L))
unclass(ma(1:2, order = 3L))
unclass(ma(1:20, order = 3L))
unclass(ma(1:2, order = 3L))

Match Rows of A data.frame to Another

Description

To match the rows of one data.frame to the rows of another data.frame.

Usage

matchDF(
  x,
  table = unique.data.frame(x),
  by = names(x),
  by.x = character(),
  by.table = character(),
  view.table = character(),
  trace_duplicate = FALSE,
  trace_nomatch = FALSE,
  inspect_fuzzy = FALSE,
  ...
)
matchDF(
  x,
  table = unique.data.frame(x),
  by = names(x),
  by.x = character(),
  by.table = character(),
  view.table = character(),
  trace_duplicate = FALSE,
  trace_nomatch = FALSE,
  inspect_fuzzy = FALSE,
  ...
)

Arguments

`x`	data.frame, the rows of which to be matched.
`table`	data.frame, the rows of which to be matched against.
`by`	character scalar or vector
`by.x`, `by.table`	character scalar or vector
`view.table`	(optional) character scalar or vector, variable names of `table` to be printed in fuzzy suggestion (if applicable)
`trace_duplicate`	logical scalar
`trace_nomatch`	logical scalar, to provide detailed diagnosis information, default `FALSE`
`inspect_fuzzy`	logical scalar
`...`	additional parameters, currently not in use

Value

Function matchDF returns a integer vector

Note

Unfortunately, R does not provide case-insensitive match. Only case-insensitive grep methods are available.

Examples

DF = swiss[sample(nrow(swiss), size = 55, replace = TRUE), ]
matchDF(DF)
DF = swiss[sample(nrow(swiss), size = 55, replace = TRUE), ]
matchDF(DF)

Cast a Molten data.frame with Multiple `value.var`

Description

Cast a molten data.frame with multiple value.var.

Usage

mdcast(
  data,
  formula,
  ...,
  value.var = setdiff(names(data), y = all.vars(formula))
)
mdcast(
  data,
  formula,
  ...,
  value.var = setdiff(names(data), y = all.vars(formula))
)

Arguments

`data`	a molten data.frame, returned object of melt.data.frame
`formula`	casting formula, see dcast
`...`	additional parameters of functions acast and dcast, which eventually get passed into function `reshape2:::cast`.
`value.var`	character vector, names of columns which store the values

Details

Function mdcast is an extension of dcast in the following aspects,

mdcast handles multiple value.var. For the $i$ -th value variable in value.var, the acast columns are named in the fashion of ⁠value[i].variable[j]⁠, $j=1,\cdots,J$ . This is follows naturally from the way data.frame handles multiple matrix input in ....
For the $i$ -th value variable in value.var, if one-and-only-one of the acast columns contains non-missing elements, this column is named as value[i], instead of ⁠value[i].variable[j]⁠. Specifically, we remove the all-NA columns from the acast columns of the $i$ -th value variable. If one-and-only-one column remains, we convert this single-column matrix into a vector. We pass this vector into data.frame with the other acast columns of the other value variables. This is a super useful feature in practice, e.g., some measurement only pertains to one of the multiple visits, therefore we do not need to specify to which visit it corresponds.

Value

Function mdcast returns a data.frame.

Note

Function mdcast uses unexported function reshape2:::cast illegally. I have asked Hadley, but he has no plan to export reshape2:::cast.

Examples

library(reshape2)
head(aqm <- melt(airquality, id = c('Month', 'Day'), na.rm = TRUE, 
  value.name = 'v1', variable.name = 'variable'))
aqm$v2 = rnorm(dim(aqm)[1L])
head(aqm)
head(dcast(aqm, Month + Day ~ variable, value.var = 'v1'))
head(aqm_d <- mdcast(aqm, Month + Day ~ variable, value.var = c('v1', 'v2')))
sapply(aqm_d, FUN = class)

(x <- data.frame(
  subj = rep(1:2, each = 3),
  event = rep(paste0('evt', 1:3), times = 2), 
  date = as.Date(c(14001:14003, 18001:18003)),
  y1 = rnorm(6), 
  y2 = c(rnorm(1), NA, NA, rnorm(1), NA, NA)))
mdcast(x, subj ~ event, value.var = c('y1', 'y2')) # very useful !!!
mdcast(x, subj ~ event) # very useful !!!

library(reshape2)
head(aqm <- melt(airquality, id = c('Month', 'Day'), na.rm = TRUE, 
  value.name = 'v1', variable.name = 'variable'))
aqm$v2 = rnorm(dim(aqm)[1L])
head(aqm)
head(dcast(aqm, Month + Day ~ variable, value.var = 'v1'))
head(aqm_d <- mdcast(aqm, Month + Day ~ variable, value.var = c('v1', 'v2')))
sapply(aqm_d, FUN = class)

(x <- data.frame(
  subj = rep(1:2, each = 3),
  event = rep(paste0('evt', 1:3), times = 2), 
  date = as.Date(c(14001:14003, 18001:18003)),
  y1 = rnorm(6), 
  y2 = c(rnorm(1), NA, NA, rnorm(1), NA, NA)))
mdcast(x, subj ~ event, value.var = c('y1', 'y2')) # very useful !!!
mdcast(x, subj ~ event) # very useful !!!

An Alternative Merge Operation

Description

Usage

mergeDF(
  x,
  table,
  by = character(),
  by.x = character(),
  by.table = character(),
  ...
)
mergeDF(
  x,
  table,
  by = character(),
  by.x = character(),
  by.table = character(),
  ...
)

Arguments

`x`	data.frame, on which new columns will be added. All rows of `x` will be retained in the returned object, in their original order.
`table`	data.frame, columns of which will be added to `x`. Not all rows of `table` will be included in the returned object
`by`	character scalar or vector
`by.x`, `by.table`	character scalar or vector
`...`	additional parameters of matchDF

Value

Function mergeDF returns a data.frame.

Note

We avoid merge.data.frame as much as possible, because it's slow and even sort = FALSE may not completely retain the original order of input x.

Examples

# examples inspired by ?merge.data.frame 
(authors = data.frame(
 surname = c('Tukey', 'Venables', 'Tierney', 'Ripley', 'McNeil'),
 nationality = c('US', 'Australia', 'US', 'UK', 'Australia'),
 deceased = c('yes', rep('no', 4))))
(books = data.frame(
 name = c('Tukey', 'Venables', 'Tierney', 'Ripley', 
  'Ripley', 'McNeil', 'R Core', 'Diggle'),
 title = c(
  'Exploratory Data Analysis',
  'Modern Applied Statistics',
  'LISP-STAT', 'Spatial Statistics', 'Stochastic Simulation',
  'Interactive Data Analysis', 'An Introduction to R',
  'Analysis of Longitudinal Data'),
 other.author = c(
  NA, 'Ripley', NA, NA, NA, NA, 'Venables & Smith',
  'Heagerty & Liang & Scott Zeger')))
(m = mergeDF(books, authors, by.x = 'name', by.table = 'surname'))
attr(m, 'nomatch')
# examples inspired by ?merge.data.frame 
(authors = data.frame(
 surname = c('Tukey', 'Venables', 'Tierney', 'Ripley', 'McNeil'),
 nationality = c('US', 'Australia', 'US', 'UK', 'Australia'),
 deceased = c('yes', rep('no', 4))))
(books = data.frame(
 name = c('Tukey', 'Venables', 'Tierney', 'Ripley', 
  'Ripley', 'McNeil', 'R Core', 'Diggle'),
 title = c(
  'Exploratory Data Analysis',
  'Modern Applied Statistics',
  'LISP-STAT', 'Spatial Statistics', 'Stochastic Simulation',
  'Interactive Data Analysis', 'An Introduction to R',
  'Analysis of Longitudinal Data'),
 other.author = c(
  NA, 'Ripley', NA, NA, NA, NA, 'Venables & Smith',
  'Heagerty & Liang & Scott Zeger')))
(m = mergeDF(books, authors, by.x = 'name', by.table = 'surname'))
attr(m, 'nomatch')

melt by Multiple Groups of Measurement Variables

Description

melt by multiple groups of measurement variables.

Usage

mmelt(data, id.vars, measure_rx, variable.name = "variable")
mmelt(data, id.vars, measure_rx, variable.name = "variable")

Arguments

`data`	a data.frame
`id.vars`	character vector, see function melt.data.frame. Default is all variables not matched by `measure_rx`.
`measure_rx`	named character vector, regexs to determine `measure.vars` of function melt.data.frame, while `names(measure_rx)` serve as `value.name` of function melt.data.frame
`variable.name`	character scalar, see function melt.data.frame

Details

Function mmelt melts by multiple groups of measurement variables.

Value

Function mmelt returns a data.frame

Examples

(iris0 = iris[c(1:3, 51:53, 101:103),])
rx = c(len = '\\.Length$', wd = '\\.Width$')
mmelt(iris0, measure_rx = rx, variable.name = 'Part')

head(iris1 <- iris0[2:5])
tryCatch(mmelt(iris1, measure_rx = rx, variable.name = 'Part'), error = identity)
iris1$Sepal.Length = NA # does not have to be NA_real_
mmelt(iris1, measure_rx = rx, variable.name = 'Part')
(iris0 = iris[c(1:3, 51:53, 101:103),])
rx = c(len = '\\.Length$', wd = '\\.Width$')
mmelt(iris0, measure_rx = rx, variable.name = 'Part')

head(iris1 <- iris0[2:5])
tryCatch(mmelt(iris1, measure_rx = rx, variable.name = 'Part'), error = identity)
iris1$Sepal.Length = NA # does not have to be NA_real_
mmelt(iris1, measure_rx = rx, variable.name = 'Part')

Elements which are not numeric

Description

Usage

not_numeric(x)
not_numeric(x)

Arguments

`x`	an R object

Details

Function not_numeric finds the elements cannot be handled by as.numeric (workhorse as.double).

Value

Function not_numeric returns a logical vector.

Examples

not_numeric(c('1.9', '1.1.3', Inf, NA))
not_numeric(c('1.9', '1.1.3', Inf, NA))

One-to-One

Description

Existence of one-to-one correspondence

Usage

one2one(x, y)
one2one(x, y)

Arguments

x, y

Operation on Objects Specified by regex

Description

Usage

select_rx(pattern, envir = parent.frame(), ...)

apply_rx(pattern, envir = parent.frame(), ..., FUN, MoreArgs = NULL)
select_rx(pattern, envir = parent.frame(), ...)

apply_rx(pattern, envir = parent.frame(), ..., FUN, MoreArgs = NULL)

Arguments

`pattern`	character scalar, regular expression
`envir`	an environment, a list, a data.frame. May also be a matrix for function select_rx.
`...`	additional parameters of grep, most importantly `invert`
`FUN`	function
`MoreArgs`	(optional) list, additional parameters of `FUN`

Value

Function select_rx selects

columns from a data.frame by pattern matching against its names, and returns a data.frame.
elements from a list by pattern matching against its names, and returns a list.
columns from a matrix by pattern matching against its colnames, and returns a matrix.
objects in an environment by pattern matching against the names of variables within, and returns a list instead of an environment.

Function apply_rx returns the updated envir, whether it is environment, list or data.frame.

Examples

head(select_rx('\\.Width$', envir = iris))
head(select_rx('\\.Length$', envir = iris, invert = TRUE))
select_rx('^Rural', VADeaths)
with(head(iris), expr = select_rx(pattern = '\\.Width$'))
apply_rx('^acc', head(attenu), FUN = stats:::format_perc, MoreArgs = list(digits = 2))
within(head(attenu), expr = {
 apply_rx('^acc', FUN = stats:::format_perc, MoreArgs = list(digits = 2))
}) # same
head(select_rx('\\.Width$', envir = iris))
head(select_rx('\\.Length$', envir = iris, invert = TRUE))
select_rx('^Rural', VADeaths)
with(head(iris), expr = select_rx(pattern = '\\.Width$'))
apply_rx('^acc', head(attenu), FUN = stats:::format_perc, MoreArgs = list(digits = 2))
within(head(attenu), expr = {
 apply_rx('^acc', FUN = stats:::format_perc, MoreArgs = list(digits = 2))
}) # same

Use packageName as citation Key

Description

Use packageName as citation key

Usage

packageKey(x, overwrite = FALSE)
packageKey(x, overwrite = FALSE)

Arguments

`x`	citation object
`overwrite`	logical scalar, whether to overwrite default key(s). Default `FALSE`.

Details

Function packageKey adds packageName as citation key.

Value

Function packageKey returns a citation object.

Note

As of June 2024: if the last call to bibentry in function citation adds the argument key = package, or provide a parameter to allow end-user to enable such choice, then we don't need the function packageKey.

Examples

if (FALSE) {
 ap = installed.packages()
 table(aa[, 'Priority'])
 which(ap[, 'Priority'] == 'recommended')
}

toBibtex(ct <- citation('survival')) # has default key(s)
toBibtex(packageKey(ct))
toBibtex(packageKey(ct)[1L])
toBibtex(packageKey(ct, overwrite = TRUE)[2L])

toBibtex(ct <- citation()) # no default key(s)
toBibtex(packageKey(ct))
if (FALSE) {
 ap = installed.packages()
 table(aa[, 'Priority'])
 which(ap[, 'Priority'] == 'recommended')
}

toBibtex(ct <- citation('survival')) # has default key(s)
toBibtex(packageKey(ct))
toBibtex(packageKey(ct)[1L])
toBibtex(packageKey(ct, overwrite = TRUE)[2L])

toBibtex(ct <- citation()) # no default key(s)
toBibtex(packageKey(ct))

10-digit US phone number

Description

Usage

phone10(x, sep = "")
phone10(x, sep = "")

Arguments

`x`	character vector
`sep`	character scalar

Details

Function phone10 converts all US and Canada (+1) phone numbers to 10-digit.

Value

Function phone10 returns a character vector of nchar-10.

Examples

x = c(
 '+1(800)275-2273', # Apple
 '1-888-280-4331', # Amazon
 '000-000-0000'
)
phone10(x)
phone10(x, sep = '-')

x = c(
 '+1(800)275-2273', # Apple
 '1-888-280-4331', # Amazon
 '000-000-0000'
)
phone10(x)
phone10(x, sep = '-')

Convert No-Date POSIXct to difftime

Description

Usage

POSIXct2difftime(
  x,
  else_return = stop("Expecting all '1899-12-31 hr:min:sec UTC'")
)
POSIXct2difftime(
  x,
  else_return = stop("Expecting all '1899-12-31 hr:min:sec UTC'")
)

Arguments

`x`	POSIXct
`else_return`	exception handling

Details

readxl will read 'hour:min:sec' as '1899-12-31 hr:min:sec UTC'

Proportions of Exceptions

Description

Usage

prop_missing(x, data.name = deparse1(substitute(x)), ...)

prop_nonmissing(x, data.name = deparse1(substitute(x)), ...)
prop_missing(x, data.name = deparse1(substitute(x)), ...)

prop_nonmissing(x, data.name = deparse1(substitute(x)), ...)

Arguments

`x`	..
`data.name`	..
`...`	all potential parameters of as.data.frame

Details

Examples

prop_missing(swiss)
prop_missing(airquality)
prop_missing(airquality$Ozone)

prop_missing(swiss)
prop_missing(airquality)
prop_missing(airquality$Ozone)

Row-Bind a list of data.frame

Description

Usage

rbinds(x, make.row.names = FALSE, ..., .id = "idx")
rbinds(x, make.row.names = FALSE, ..., .id = "idx")

Arguments

`x`	a list of named data.frame
`make.row.names`, `...`	additional parameters of rbind.data.frame
`.id`	character value to specify the name of ID column, nomenclature follows rbindlist

Details

Yet to look into ggplot2:::rbind_dfs closely.

Mine is slightly slower than the fastest alternatives, but I have more checks which are useful.

Value

Function rbinds returns a data.frame.

References

https://stackoverflow.com/questions/2851327/combine-a-list-of-data-frames-into-one-data-frame

Examples

x = list(A = swiss[1:3, 1:2], B = swiss[5:9, 1:2]) # list of 'data.frame'
rbinds(x)
rbinds(x, make.row.names = TRUE)

x = list(A = swiss[1:3, 1:2], B = swiss[5:9, 1:2]) # list of 'data.frame'
rbinds(x)
rbinds(x, make.row.names = TRUE)

Relaxed Identical

Description

Slightly relax the identical criteria, to identify near-identical objects, e.g., model estimates with mathematically identical model specifications.

Usage

relaxed_identical(x, y)
relaxed_identical(x, y)

Arguments

x, y

any R objects

Details

Function relaxed_identical relaxes function identical in the following ways.

For object with typeof being double and/or integer

Test near equality with attributes ignored, i.e., function all.equal.numeric with option check.attributes = FALSE.

For formulas

Set environment of x and y to NULL, then compare them using function identical. Note that

formula is.recursive, thus must be placed before the is.recursive branching;
formula is not closure. Therefore, using function identical with option ignore.environment = TRUE does not work!

For functions

Ignore environment of x and y, i.e., using function identical with option ignore.environment = TRUE. Note that

function is.recursive, whether it is closure or is.primitive. Therefore it must be placed before the is.recursive branching;

For all other is.recursive objects

Function relaxed_identical is called recursively, for each $ element of x and y.

For S4 objects

Function relaxed_identical is called recursively, for each @ slot (which is technically the attributes) of x and y, including the ⁠@.Data⁠ slot. Note that

S4 objects are not is.recursive.

Otherwise

Function identical is called, as the exception handling.

Value

Function relaxed_identical returns a logical scalar.

Examples

# mathematically identical model specification
m1 = lm(breaks ~ -1 + wool + wool:tension, data = warpbreaks)
m2 = lm(breaks ~ -1 + tension + tension:wool, data = warpbreaks)
foo = function(m) list(pred = predict(m), resid = residuals(m))
identical(foo(m1), foo(m2)) # FALSE
relaxed_identical(foo(m1), foo(m2)) # TRUE
# mathematically identical model specification
m1 = lm(breaks ~ -1 + wool + wool:tension, data = warpbreaks)
m2 = lm(breaks ~ -1 + tension + tension:wool, data = warpbreaks)
foo = function(m) list(pred = predict(m), resid = residuals(m))
identical(foo(m1), foo(m2)) # FALSE
relaxed_identical(foo(m1), foo(m2)) # TRUE

Row and Column Operations Recognizing Full Row/Column Missingness

Description

To perform row and column operations using na.rm = TRUE, and only to return a missing value if a full row/column of the input is missing.

Usage

rowSums_(x)

rowMeans_(x)

rowAnys_(x)
rowSums_(x)

rowMeans_(x)

rowAnys_(x)

Arguments

`x`	array of two or more dimensions, or a numeric data.frame

Details

Function rowSums_ performs ..

Function rowMeans_ performs ..

Value

Functions rowSums_, rowMeans_ return a numeric vector

Note

Potential name clash with rowMeans2

Examples

(x = matrix(c(1, 2, NA, 3, NA, NA), byrow = TRUE, ncol = 2L))
rowSums(x)
rowSums(x, na.rm = TRUE)
rowSums_(x)

(x = matrix(c(1, 2, NA, 3, NA, NA), byrow = TRUE, ncol = 2L))
rowSums(x)
rowSums(x, na.rm = TRUE)
rowSums_(x)

Indices of Stratified Sampling

Description

Indices of Stratified Sampling

Usage

sample.by.int(f, ...)
sample.by.int(f, ...)

Arguments

`f`	factor
`...`	potential parameters of sample.int

Details

End user should use interaction to combine multiple factors.

Value

Function sample.by.int returns an integer vector.

Examples

id1 = sample.by.int(state.region, size = 2L)
state.region[id1]

id2 = sample.by.int(f = with(npk, interaction(N, P)), size = 2L)
npk[id2, c('N', 'P')] # each combination selected 2x

id1 = sample.by.int(state.region, size = 2L)
state.region[id1]

id2 = sample.by.int(f = with(npk, interaction(N, P)), size = 2L)
npk[id2, c('N', 'P')] # each combination selected 2x

Exclusive-OR Elements

Description

Exclusive-OR elements in two vectors.

Usage

set_xor(
  e1,
  e2,
  name1 = deparse1(substitute(e1)),
  name2 = deparse1(substitute(e2))
)
set_xor(
  e1,
  e2,
  name1 = deparse1(substitute(e1)),
  name2 = deparse1(substitute(e2))
)

Arguments

`e1`, `e2`	two R objects of the same typeof
`name1`, `name2`	(optional) character scalars, human-friendly names of `e1` and `e2`. Default is the function call of `e1` and `e2`.

Details

Function set_xor returns the exclusive-OR elements in each of the sets, which is slow and only intended for end-user.

Value

Function set_xor returns either a list or a vector.

Examples

set_xor(1:5, 3:7)
set_xor(1:5, 1:3)
set_xor(1:5, 3:7)
set_xor(1:5, 1:3)

Sign of Difference of Two Objects

Description

Usage

sign2(
  e1,
  e2,
  name1 = substitute(e1),
  name2 = substitute(e2),
  na.detail = TRUE,
  ...
)
sign2(
  e1,
  e2,
  name1 = substitute(e1),
  name2 = substitute(e2),
  na.detail = TRUE,
  ...
)

Arguments

`e1`, `e2`	two R objects, must be both numeric vectors, or ordered factors with the same levels
`name1`, `name2`	two language objects, or character scalars
`na.detail`	logical scalar, whether to provide the missingness details of `e1` and `e2`. Default `TRUE`.
`...`	additional parameters, currently not in use

Details

Function sign2 extends sign in the following ways

two ordered factors can be compared;
(detailed) information on missingness are provided.

Value

Function sign2 returns character vector when na.detail = TRUE, or ordered factor when na.detail = FALSE.

Examples

lv = letters[c(1,3,2)]
x0 = letters[1:3]
x = ordered(sample(x0, size = 100, replace = TRUE), levels = lv)
y = ordered(sample(x0, size = 50, replace = TRUE), levels = lv)
x < y # base R ok
pmax(x, y) # base R okay
pmin(x, y) # base R okay
x[c(1,3)] = NA
y[c(3,5)] = NA
table(sign(unclass(y) - unclass(x)))
table(sign2(x, y))
table(sign2(x, y, na.detail = FALSE), useNA = 'always')

lv = letters[c(1,3,2)]
x0 = letters[1:3]
x = ordered(sample(x0, size = 100, replace = TRUE), levels = lv)
y = ordered(sample(x0, size = 50, replace = TRUE), levels = lv)
x < y # base R ok
pmax(x, y) # base R okay
pmin(x, y) # base R okay
x[c(1,3)] = NA
y[c(3,5)] = NA
table(sign(unclass(y) - unclass(x)))
table(sign2(x, y))
table(sign2(x, y, na.detail = FALSE), useNA = 'always')

Source All R Files under a Directory

Description

source all ⁠*.R⁠ and ⁠*.r⁠ files under a directory.

Usage

sourcePath(path, ...)
sourcePath(path, ...)

Arguments

`path`	character scalar, parent directory of `.R` files
`...`	additional parameters of source

Value

Function sourcePath does not have a returned value

Split character vector

Description

Split character vector, into keywords or by the order of appearance.

Usage

splitKey(x, key = xkey, data.name = substitute(x), envir = parent.frame(), ...)

splitOrd(
  x,
  nm = stop("must specify new names"),
  data.name = substitute(x),
  envir = parent.frame(),
  ...
)
splitKey(x, key = xkey, data.name = substitute(x), envir = parent.frame(), ...)

splitOrd(
  x,
  nm = stop("must specify new names"),
  data.name = substitute(x),
  envir = parent.frame(),
  ...
)

Arguments

`x`	character vector, each element being a set of keywords separated by a symbol (e.g., `','`)
`key`	(optional for function splitKey) character vector, user-specified keywords. Default to all keywords appearing in input `x`
`data.name`	(optional) character scalar or name, name of input `x`
`envir`	`NULL`, `FALSE` or an environment.
`...`	potential parameters of strsplit, most importantly `split`
`nm`	(for function splitOrd) character vector

Details

Function splitKey finds out whether each keyword appears in each element of input x.

NA_character_ or '' entries in input x are regarded as negative (i.e., none of the keywords exists), instead of as missingness (i.e., we do not know if any of the keywords exists). This practice is most intuitive to clinicians.

Value

Function splitKey returns a logical matrix if envir = NULL. Otherwise the logical vectors are assigned to envir (i.e., when used inside within.data.frame).

Function splitOrd returns a character matrix if envir = NULL. Otherwise the character vectors are assigned to envir (i.e., when used inside within.data.frame).

Examples

letters[1:4] |> splitKey(split = ';;', envir = NULL) # exception
(x1 = c('a,b,', 'c,a,b,,a', NA_character_, ''))
x1 |> splitKey(split = ',', envir = NULL)
data.frame(x = x1) |> within.data.frame(expr = splitKey(x, split = ','))
data.frame(x = x1) |> within.data.frame(expr = splitKey(x, split = ',', data.name = 'cancer'))
## Not run: 
X = rep(x1, times = 10L)
microbenchmark::microbenchmark( # speed O(n)
 splitKey(x1, split = ',', envir = NULL), 
 splitKey(X, split = ',', envir = NULL))
## End(Not run)
(x2 = c('T2;N0;M0;B1', '; ;M1; ', NA_character_, ''))
nm = c('T', 'N', 'M', 'B')
splitOrd(x2, split = ';', nm = nm, envir = NULL)
data.frame(x = x2) |> within.data.frame(expr = splitOrd(x, split = ';', nm = nm, data.name = 'st'))
letters[1:4] |> splitKey(split = ';;', envir = NULL) # exception
(x1 = c('a,b,', 'c,a,b,,a', NA_character_, ''))
x1 |> splitKey(split = ',', envir = NULL)
data.frame(x = x1) |> within.data.frame(expr = splitKey(x, split = ','))
data.frame(x = x1) |> within.data.frame(expr = splitKey(x, split = ',', data.name = 'cancer'))
## Not run: 
X = rep(x1, times = 10L)
microbenchmark::microbenchmark( # speed O(n)
 splitKey(x1, split = ',', envir = NULL), 
 splitKey(X, split = ',', envir = NULL))
## End(Not run)
(x2 = c('T2;N0;M0;B1', '; ;M1; ', NA_character_, ''))
nm = c('T', 'N', 'M', 'B')
splitOrd(x2, split = ';', nm = nm, envir = NULL)
data.frame(x = x2) |> within.data.frame(expr = splitOrd(x, split = ';', nm = nm, data.name = 'st'))

Inspect a Subset of data.frame

Description

Usage

subset_(x, subset, select, select_pattern, avoid, avoid_pattern)
subset_(x, subset, select, select_pattern, avoid, avoid_pattern)

Arguments

`x`	a data.frame
`subset`	logical expression, see function subset.data.frame
`select`	character vector, columns to be selected, see function subset.data.frame
`select_pattern`	regular expression regex for multiple columns to be selected
`avoid`	character vector, columns to be avoided
`avoid_pattern`	regular expression regex, for multiple columns to be avoided

Details

Function subset_ is different from subset.data.frame, such that

if both select and select_pattern are missing, only variables mentioned in subset are selected;
be able to select all variables, except those in avoid and avoid_pattern;
always returns data.frame, i.e., forces drop = FALSE

Value

Function subset_ returns a data.frame, with additional attributes

attr(,'vline'): integer scalar, position of a vertical line (see ?flextable::vline)
⁠attr(,'jhighlight)'⁠: character vector, names of columns to be flextable::highlighted.

Examples

subset_(trees, Girth > 9 & Height < 70)
subset_(swiss, Fertility > 80, avoid = 'Catholic')
subset_(warpbreaks, wool == 'K')

subset_(trees, Girth > 9 & Height < 70)
subset_(swiss, Fertility > 80, avoid = 'Catholic')
subset_(warpbreaks, wool == 'K')

Award & Effort from Cayuse

Description

Print out grant and effort from Cayuse.

Usage

aggregateAwards(path = "~/Downloads", fiscal.year = year(Sys.Date()))

viewProposal(path = "~/Downloads", fiscal.year = year(Sys.Date()))
aggregateAwards(path = "~/Downloads", fiscal.year = year(Sys.Date()))

viewProposal(path = "~/Downloads", fiscal.year = year(Sys.Date()))

Arguments

`path`	character scalar, directory of downloaded award `.csv` file. Default is the download directory `'~/Downloads'`
`fiscal.year`	integer scalar

Details

go to ⁠https://jefferson.cayuse424.com/sp/index.cfm⁠ in Chrome (Safari has bugs)
My Proposals -> Submitted Proposals. Lower-right corner of screen, 'Export to CSV'. Downloaded file has name pattern '^proposals_.*\\.csv'
My Awards -> Awards (not 'Active Projects'). Lower-right corner of screen, 'Export to CSV'. Downloaded file has name pattern '^Awards_.*\\.csv'
My Awards -> Awards. Click into each project, under 'People' tab to find my 'Sponsored Effort'

Function aggregateAwards aggregates grant over different period (e.g. from Axx-xx-001, Axx-xx-002, Axx-xx-003 to Axx-xx). Then we need to manually added in our 'Sponsored Effort' in the returned .csv file.

Value

Examples

if (FALSE) {
aggregateAwards()
viewProposal()
}

if (FALSE) {
aggregateAwards()
viewProposal()
}

TJU Fiscal Year

Description

Usage

TJU_Fiscal_Year(x)
TJU_Fiscal_Year(x)

Arguments

`x`	integer scalar

Value

Function TJU_Fiscal_Year returns a length-two Date vector, indicating the start (July 1 of the previous calendar year) and end date (June 30) of a fiscal year.

Examples

TJU_Fiscal_Year(2022L)

TJU_Fiscal_Year(2022L)

TJU School Term

Description

Usage

TJU_SchoolTerm(x)
TJU_SchoolTerm(x)

Arguments

`x`	a Date object

Value

TJU_SchoolTerm returns a character vector

Examples

TJU_SchoolTerm(as.Date(c('2021-03-14', '2022-01-01', '2022-05-01')))

TJU_SchoolTerm(as.Date(c('2021-03-14', '2022-01-01', '2022-05-01')))

Thomas Jefferson University Workdays

Description

To summarize the number of workdays, weekends, holidays and vacations in a given time-span (e.g., a month or a quarter of a year).

Usage

TJU_Workday(x, vacations)
TJU_Workday(x, vacations)

Arguments

`x`	character scalar or vector (e.g., `'2021-01'` for January 2021, `'2021 Q1'` for 2021 Q1 (January to March)), or integer scalar or vector (e.g., `2021L` for year 2021); The time-span to be summarized. Objects of classes yearqtr and yearmon are also accepted.
`vacations`	Date vector, vacation days

Details

Function TJU_Workday summarizes the workdays, weekends, Jefferson paid holidays (New Year’s Day, Martin Luther King, Jr. Day, Memorial Day, Fourth of July, Labor Day, Thanksgiving and Christmas) and your vacation (e.g., sick, personal, etc.) days (if any), in a given time-span.

Per Jefferson policy (source needed), if a holiday is on Saturday, then the preceding Friday is considered to be a weekend day. If a holiday is on Sunday, then the following Monday is considered to be a weekend day.

Value

Function TJU_Workday returns a factor.

Examples

table(TJU_Workday(c('2021-01', '2021-02')))

tryCatch(TJU_Workday(c('2019-10', '2019-12')), error = identity)
table(c(TJU_Workday('2019-10'), TJU_Workday('2019-12'))) # work-around

table(TJU_Workday('2022-12'))

table(TJU_Workday('2022 Q1', vacations = seq.Date(
 from = as.Date('2022-03-14'), to = as.Date('2022-03-18'), by = 1)))
 
table(TJU_Workday('2022 Q2', vacations = as.Date(c(
 '2022-05-22', '2022-05-30', '2022-06-01', '2022-07-04'))))
 
table(TJU_Workday(2021L))

table(TJU_Workday(c('2021-01', '2021-02')))

tryCatch(TJU_Workday(c('2019-10', '2019-12')), error = identity)
table(c(TJU_Workday('2019-10'), TJU_Workday('2019-12'))) # work-around

table(TJU_Workday('2022-12'))

table(TJU_Workday('2022 Q1', vacations = seq.Date(
 from = as.Date('2022-03-14'), to = as.Date('2022-03-18'), by = 1)))
 
table(TJU_Workday('2022 Q2', vacations = as.Date(c(
 '2022-05-22', '2022-05-30', '2022-06-01', '2022-07-04'))))
 
table(TJU_Workday(2021L))

Remove Leading/Trailing and Duplicated (Symbols that Look Like) White Spaces

Description

To remove leading/trailing and duplicated (symbols that look like) white spaces.

More aggressive than function trimws.

Usage

trimws_(x)
trimws_(x)

Arguments

`x`	an object with typeof being character

Details

Function trimws_ is more aggressive than trimws, that it removes

non-UTF-8 characters
duplicated white spaces
symbols that look like white space, such as ⁠\u00a0⁠ (no-break space)

Value

Function trimws_ returns an object of typeof character.

Note

gsub keeps attributes

Examples

(x = c(A = ' a  b  ', b = 'a .  s', ' a  ,  b ; ', '\u00a0  ab '))
base::trimws(x)
# raster::trim(x) # do not want to 'Suggests'
trimws_(x)

(xm = matrix(x, nrow = 2L))
trimws_(xm)

cat(x0 <- ' ab  \xa0cd ')
tryCatch(base::trimws(x0), error = identity)
# tryCatch(raster::trim(x0), error = identity)
trimws_(x0)

#library(microbenchmark)
#microbenchmark(trimws(x), trimws_(x))

(x = c(A = ' a  b  ', b = 'a .  s', ' a  ,  b ; ', '\u00a0  ab '))
base::trimws(x)
# raster::trim(x) # do not want to 'Suggests'
trimws_(x)

(xm = matrix(x, nrow = 2L))
trimws_(xm)

cat(x0 <- ' ab  \xa0cd ')
tryCatch(base::trimws(x0), error = identity)
# tryCatch(raster::trim(x0), error = identity)
trimws_(x0)

#library(microbenchmark)
#microbenchmark(trimws(x), trimws_(x))

5-digit US Zip Code

Description

Usage

zip5(x)
zip5(x)

Arguments

`x`	character vector

Details

Function zip5 converts all US zip codes to 5-digit.

Value

Function zip5 returns a character vector of nchar-5.

Examples

zip5(c('14901', '41452-1423'))
zip5(c('14901', '41452-1423'))

Package 'ThomasJeffersonUniv'

Help Index

Number of Anniversaries Between Two Dates

Description

Usage

Arguments

Details

Value

Assign Key to bibentry

Description

Usage

Arguments

Note

factor Return for case_match

Description

Usage

Arguments

Details

Value

Examples

Positive Counts in a logical vector

Description

Usage

Arguments

Value

Examples

Inspect Duplicated Rows in a data.frame

Description

Usage

Arguments

Value

Examples

Divide Numeric Into Intervals

Description

Usage

Arguments

Details

Examples

Concatenate a Date and a difftime Object

Description

Usage

Arguments

Value

Note

Examples

Concatenate Date and Time

Description

Usage

Arguments

Details

Value

Examples

Clopper-Pearson Exact Binomial Confidence Interval

Description

Usage

Arguments

Value

Note

References

Examples

max and min of factor

Description

Usage

Arguments

Value

Examples

Syntactic Sugar to Set (ordered) factor levels

Description

Usage

Arguments

Value

Examples

File Modification Time by Pattern

Description

Usage

Arguments

See Also

Examples

Force to logical vector

Description

grepl, but recognize `NA`

The $n$ th Lower Edge(s) of a Matrix