phint_squash() and datetime_squash() merge overlapping or adjacent
intervals into a single <phinterval> element containing a minimal set of
non-overlapping, non-adjacent time spans.
phint_squash() takes a <phinterval> or <Interval> vector. datetime_squash() takes separate start and end datetime vectors.
phint_squash_by() and datetime_squash_by() merge intervals within groups
defined by the by argument. The result is a <phinterval> vector containing
one element per unique value of by.
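The merging rule described above can be sketched in base R on plain numeric endpoints. This is only an illustration of the idea, not the package's implementation: merge_spans() is a hypothetical helper, it treats spans as closed numeric ranges, and it does not handle NA values or time zones.

```r
# Hypothetical helper, not part of phinterval: merge overlapping or
# touching numeric spans into a minimal set of disjoint spans.
merge_spans <- function(start, end) {
  stopifnot(length(start) == length(end), all(start <= end))
  if (length(start) == 0L) {
    return(data.frame(start = numeric(), end = numeric()))
  }
  # Sort spans by start so that overlaps become neighbours.
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  # Furthest end point reached by the spans seen so far.
  reach <- cummax(end)
  # A new merged span begins where a start lies strictly past that reach.
  is_new <- c(TRUE, start[-1] > reach[-length(reach)])
  span_id <- cumsum(is_new)
  data.frame(
    start = start[is_new],
    end = as.numeric(tapply(reach, span_id, max))
  )
}

# Day-of-month endpoints mirroring the Examples section:
# spans 1-5 and 3-9 overlap, 11-12 stands alone.
merged <- merge_spans(start = c(1, 3, 11), end = c(5, 9, 12))

# Adjacent spans (one ends where the next begins) are merged as well.
touching <- merge_spans(start = c(1, 5), end = c(5, 8))
```

Here merged holds the spans 1 to 9 and 11 to 12, and touching holds the single span 1 to 8.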
Usage
phint_squash(phint, na_rm = TRUE, empty_to = c("hole", "na"))
datetime_squash(start, end, na_rm = TRUE, empty_to = c("hole", "na"))
phint_squash_by(
  phint,
  by,
  na_rm = TRUE,
  empty_to = c("hole", "na"),
  order_by = TRUE
)
datetime_squash_by(
  start,
  end,
  by,
  na_rm = TRUE,
  empty_to = c("hole", "na"),
  order_by = TRUE
)
Arguments
- phint
  [phinterval / Interval] A <phinterval> or <Interval> vector.
- na_rm
  [TRUE / FALSE] Should NA elements be removed before squashing? If FALSE and any NA elements are present, the result is NA. Defaults to TRUE.
- empty_to
  ["hole" / "na"] How to handle empty inputs (length-0 vectors):
  - "hole" (default): Return a hole.
  - "na": Return an NA phinterval.
- start
  [POSIXct / POSIXlt / Date] A vector of start times. Must be recyclable with end. Only used in datetime_squash() and datetime_squash_by().
- end
  [POSIXct / POSIXlt / Date] A vector of end times. Must be recyclable with start. Only used in datetime_squash() and datetime_squash_by().
- by
  [vector / data.frame] A grouping vector or data frame. Intervals are grouped by by and merged separately within each group, returning one <phinterval> element per unique value of by.
  For datetime_squash_by(), by must be recyclable with the recycled length of start and end.
  by may be any vector in the vctrs sense. See vctrs::obj_is_vector() for details.
- order_by
  [TRUE / FALSE] Should the output be ordered by the values in by? If TRUE (the default), the output is sorted by the unique values of by. If FALSE, the output order matches the first appearance of each group in by. Only used in phint_squash_by() and datetime_squash_by().
Value
phint_squash() and datetime_squash() return a length-1 <phinterval>
vector.
phint_squash_by() and datetime_squash_by() return a tibble::tibble()
with columns by and phint, with one row per unique value of by.
Details
These functions are particularly useful in aggregation workflows with
dplyr::summarize() to combine intervals within groups.
The phint_squash_by() and datetime_squash_by() variants are designed to
replicate a call to dplyr::group_by() followed by dplyr::summarize(), but
are typically faster. In particular, calling phint_squash_by() with a by argument produces the same result as grouping by by and summarizing each group with phint_squash().
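The equivalence can be sketched with the start/end variants, which keep the input as plain Date columns. The data and the names spans, via_helper, and via_dplyr are illustrative assumptions, and the sketch assumes that dplyr is installed and that R is version 4.1 or later for the native pipe.

```r
library(phinterval)
library(dplyr)

# Illustrative data mirroring the Examples section.
spans <- tibble(
  by = c("A", "A", "B"),
  start = as.Date(c("2000-01-01", "2000-01-03", "2000-01-11")),
  end = as.Date(c("2000-01-05", "2000-01-09", "2000-01-12"))
)

# Grouped helper: a single call, one row per unique value of `by`.
via_helper <- datetime_squash_by(spans$start, spans$end, by = spans$by)

# The dplyr pipeline that the helper is described as replicating.
via_dplyr <- spans |>
  group_by(by) |>
  summarize(phint = datetime_squash(start, end))
```

Both objects should be tibbles with a by column and a phint column, one row per group.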
See also
phint_flatten() and datetime_flatten() to merge a <phinterval>
vector into a vector of scalar spans rather than a single element.
Examples
jan_1_to_5 <- interval(as.Date("2000-01-01"), as.Date("2000-01-05"))
jan_3_to_9 <- interval(as.Date("2000-01-03"), as.Date("2000-01-09"))
jan_11_to_12 <- interval(as.Date("2000-01-11"), as.Date("2000-01-12"))
# phint_squash: merge intervals from a phinterval/Interval vector
phint_squash(c(jan_1_to_5, jan_3_to_9, jan_11_to_12))
#> <phinterval<UTC>[1]>
#> [1] {2000-01-01--2000-01-09, 2000-01-11--2000-01-12}
# datetime_squash: merge intervals from start/end vectors
datetime_squash(
  start = as.Date(c("2000-01-01", "2000-01-03", "2000-01-11")),
  end = as.Date(c("2000-01-05", "2000-01-09", "2000-01-12"))
)
#> <phinterval<UTC>[1]>
#> [1] {2000-01-01--2000-01-09, 2000-01-11--2000-01-12}
# NA values are removed by default
phint_squash(c(jan_1_to_5, jan_3_to_9, jan_11_to_12, NA))
#> <phinterval<UTC>[1]>
#> [1] {2000-01-01--2000-01-09, 2000-01-11--2000-01-12}
# Set na_rm = FALSE to propagate NA values
phint_squash(c(jan_1_to_5, jan_3_to_9, jan_11_to_12, NA), na_rm = FALSE)
#> <phinterval<UTC>[1]>
#> [1] <NA>
# empty_to determines the result of empty inputs
phint_squash(phinterval(), empty_to = "hole")
#> <phinterval<UTC>[1]>
#> [1] <hole>
phint_squash(phinterval(), empty_to = "na")
#> <phinterval<UTC>[1]>
#> [1] <NA>
# phint_squash_by: squash within groups, returning a tibble
phint_squash_by(
  c(jan_1_to_5, jan_3_to_9, jan_11_to_12),
  by = c("A", "A", "B")
)
#> # A tibble: 2 × 2
#>   by    phint
#>   <chr> <phint<UTC>>
#> 1 A     {2000-01-01--2000-01-09}
#> 2 B     {2000-01-11--2000-01-12}
# datetime_squash_by: squash from start/end vectors within groups
datetime_squash_by(
  start = as.Date(c("2000-01-01", "2000-01-03", "2000-01-11")),
  end = as.Date(c("2000-01-05", "2000-01-09", "2000-01-12")),
  by = c("A", "A", "B")
)
#> # A tibble: 2 × 2
#>   by    phint
#>   <chr> <phint<UTC>>
#> 1 A     {2000-01-01--2000-01-09}
#> 2 B     {2000-01-11--2000-01-12}
# Control output order with order_by
phint_squash_by(
  c(jan_1_to_5, jan_3_to_9, jan_11_to_12),
  by = c(2, 2, 1),
  order_by = TRUE
)
#> # A tibble: 2 × 2
#>      by phint
#>   <dbl> <phint<UTC>>
#> 1     1 {2000-01-11--2000-01-12}
#> 2     2 {2000-01-01--2000-01-09}
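For contrast, the same grouping can be run with order_by = FALSE. The sketch below uses the start/end variant so that it is self-contained; first_seen is an illustrative name. According to the order_by description, the rows should then follow the first appearance of each group in by, so group 2 is expected before group 1.

```r
library(phinterval)

# Same spans and groups as the example above, but keeping
# first-appearance order instead of sorting by `by`.
first_seen <- datetime_squash_by(
  start = as.Date(c("2000-01-01", "2000-01-03", "2000-01-11")),
  end = as.Date(c("2000-01-05", "2000-01-09", "2000-01-12")),
  by = c(2, 2, 1),
  order_by = FALSE
)

# Inspect the group order of the result.
first_seen$by
```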
