*Author: Ildi Czeller, @czeildi on Twitter and Github*

`## Warning: package 'data.table' was built under R version 3.5.2`

It caused me a great deal of headache to work around `as.difftime`

to get what I wanted but also I learned much more than I expected along the way so I wanted to share the journey with you.

## Initial experience

I had a data frame with two columns containing timestamps and I wanted to calculate the difference between them.

```
## send_time open_time
## 1: 2018-04-01 00:01:00 2018-04-01 00:07:00
## 2: 2018-04-02 00:01:00 2018-04-02 02:01:00
```

In the first row, there are 6 minutes between send and open time and in the second row there are 2 hours. The essence of the unexpected behavior I experienced:

`dt[, open_time - send_time][2]`

`## Time difference of 120 mins`

`dt[2, open_time - send_time]`

`## Time difference of 2 hours`

Basically the result for one row depends on other rows in the data frame.

### Finding a workaround

My first idea was to explicitly provide the `units`

argument. However, it did not help.

`dt[, as.difftime(open_time - send_time, units = "hours")][2]`

`## Time difference of 120 mins`

I found that if I create a difftime object with `difftime`

and not with subtraction it works as I expect.

`dt[, difftime(open_time, send_time, units = "hours")][2]`

`## Time difference of 2 hours`

## Digging deeper

### How does timestamp subtraction actually work?

```
base_time <- lubridate::ymd_hms("2018-04-01 00:00:00")
later_times <- base_time + c(
lubridate::period(1, "month"),
lubridate::period(1, "day"),
lubridate::period(1, "minute"),
lubridate::period(1, "second")
)
```

`later_times[1:1] - base_time`

`## Time difference of 30 days`

`later_times[1:2] - base_time`

```
## Time differences in days
## [1] 30 1
```

`later_times[1:3] - base_time`

```
## Time differences in mins
## [1] 43200 1440 1
```

`later_times[1:4] - base_time`

```
## Time differences in secs
## [1] 2592000 86400 60 1
```

Based on this I am quite confident to say that firstly the smallest unit will be used for all values and secondly subtraction won’t result in units greater than a day.

### Why does `units`

not help?

It still puzzled me why the explicit `units`

argument did not help. Time to look at the source code at last!

`as.difftime`

```
function (tim, format = "%X", units = "auto")
{
if (inherits(tim, "difftime"))
return(tim)
if (is.character(tim)) {
difftime(strptime(tim, format = format), strptime("0:0:0",
format = "%X"), units = units)
}
else {
# ...
}
}
```

Only by skimming the code we can see that it behaves differently depending on the type of its argument. What did we pass?

`class(dt[2, open_time - send_time])`

`## [1] "difftime"`

In the first two rows we get our answer: if the argument to `as.difftime`

is already a `difftime`

object, then the `units`

argument will have no effect.

`units="auto"`

My next question was where is the code to determine the used unit? As `as.difftime`

makes no transformation it must happen when we subtract two vectors.

Now it is useful to know that `-`

is an S3 generic which means it selects what S3 method to call based on the type (S3 class) of its argument(s). In base R subtraction is defined specifically for timestamp objects (of class `POSIXt`

). Note that you can call S3 methods directly with `[generic_name].[class_name]`

.

``-.POSIXt``

```
function (e1, e2)
{
# ...
if (!inherits(e1, "POSIXt"))
stop("can only subtract from \"POSIXt\" objects")
if (nargs() == 1)
stop("unary '-' is not defined for \"POSIXt\" objects")
if (inherits(e2, "POSIXt"))
return(difftime(e1, e2))
# if ...
# ...
}
```

We subtract two values of class `POSIXt`

so the relevant part for us is `return(difftime(e1, e2))`

. Let’s look at `difftime`

now.

`difftime`

```
function (time1, time2, tz, units = c("auto", "secs", "mins",
"hours", "days", "weeks"))
{
# ...
z <- unclass(time1) - unclass(time2)
attr(z, "tzone") <- NULL
units <- match.arg(units)
if (units == "auto")
units <- if (all(is.na(z)))
"secs"
else {
zz <- min(abs(z), na.rm = TRUE)
if (!is.finite(zz) || zz < 60)
"secs"
else if (zz < 3600)
"mins"
else if (zz < 86400)
"hours"
else "days"
}
# ...
}
```

Without understanding every detail we can see that if `units`

are not specified as in our case it will depend on the smallest absolute value. What makes these computations work is that `unclass()`

on a `POSIXct`

type object gives the number of seconds by default since `1970-01-01`

.

Finally I took the time to read the longish help file of `difftime`

which contained most of this information. Still I would go this way again as I believe I learned more.

### Conclusions

Look at the source code earlier and always be explicit with conversions if possible.