Working with ISO8601 Dates and Times.
iso8601
R-package
The package has special functions for transforming ISO8601 (ISO 8601-1:2019) strings into dates, date-times and times. These functions transform the strings into the corresponding R objects: ‘Date’, ‘POSIXct’ and ‘Time’ (which is a subclass of ‘POSIXct’, see below):
library(iso8601)
iso8601todate("2019-08-17")
## [1] "2019-08-17"
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601totime("T16:15:14")
## [1] "T16:15:14"
Dates
For converting to ‘Date’ the package should accept all valid formats described by ISO8601 as shown below:
iso8601todate("2019-08-17")
## [1] "2019-08-17"
iso8601todate("2019-08")
## [1] "2019-08-01"
iso8601todate("2019")
## [1] "2019-01-01"
iso8601todate("20190817")
## [1] "2019-08-17"
iso8601todate("2019-W33-6")
## [1] "2019-08-17"
iso8601todate("2019-W33")
## [1] "2019-08-12"
iso8601todate("2019W336")
## [1] "2019-08-17"
iso8601todate("2019W33")
## [1] "2019-08-12"
iso8601todate("2019-229")
## [1] "2019-08-17"
iso8601todate("2019229")
## [1] "2019-08-17"
iso8601todate("−0009-123")
## [1] "-9-05-03"
iso8601todate("-0009")
## [1] "-9-01-01"
iso8601todate("+002019-229", ndigitsyear = 6)
## [1] "2019-08-17"
As can be seen from the examples above, for incomplete dates, the missing parts are substituted by 1 as the ‘Date’ object cannot handle incomplete dates. It is also possible to mix different formats in one character vector.
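The week and ordinal formats above map onto ordinary calendar dates with a little arithmetic. The following is a minimal base R sketch of that mapping, not the package's implementation; the helper names `ordinal_to_date` and `week_to_date` are hypothetical. It relies on the ISO 8601 rule that week 1 is the week containing 4 January and that weeks start on Monday.

```r
# Hypothetical helpers in base R; not part of the iso8601 package.

# Ordinal date: day `yday` of `year`, where day 1 is 1 January.
ordinal_to_date <- function(year, yday) {
  as.Date(sprintf("%04d-01-01", year)) + (yday - 1)
}

# Week date: week 1 contains 4 January; weekday 1 is Monday.
# A missing weekday defaults to 1, as for other incomplete dates.
week_to_date <- function(year, week, weekday = 1) {
  jan4 <- as.Date(sprintf("%04d-01-04", year))
  iso_wday <- as.integer(format(jan4, "%u"))  # 1 = Monday, ..., 7 = Sunday
  monday_week1 <- jan4 - (iso_wday - 1)
  monday_week1 + (week - 1) * 7 + (weekday - 1)
}

format(ordinal_to_date(2019, 229))  # "2019-08-17"
format(week_to_date(2019, 33, 6))   # "2019-08-17"
format(week_to_date(2019, 33))      # "2019-08-12"
```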
Date-times
Date-time strings consist of a date and a time separated by the character ‘T’. For the date part all complete date strings mentioned above are allowed. The time part can be specified in extended format (with colons) or in basic (compact) format, and as a complete or an incomplete time. In the latter case the missing parts are substituted by 0:
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15Z")
## [1] "2019-08-17 16:15:00 GMT"
iso8601todatetime("2019-08-17T1615")
## [1] "2019-08-17 16:15:00 CEST"
iso8601todatetime("2019-08-17T16Z")
## [1] "2019-08-17 16:00:00 GMT"
iso8601todatetime("+002019-08-17T16:15:14Z", ndigitsyear = 6)
## [1] "2019-08-17 16:15:14 GMT"
Fractional times are also allowed:
iso8601todatetime("2019-08-17T16:15:14,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T161514,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15,24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T1615.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T1615,24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16.2539Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16,2539Z")
## [1] "2019-08-17 16:15:14 GMT"
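The fraction always applies to the last time component given, so `16.2539` is a fractional hour and `16:15.24` a fractional minute. A minimal base R sketch of that arithmetic follows; `split_fraction` is a hypothetical helper, and truncating to whole seconds is an assumption chosen to mirror how the results above are displayed.

```r
# Hypothetical helper: spread a fractional hour or minute over h/m/s.
split_fraction <- function(hour, minute = 0, second = 0) {
  total <- hour * 3600 + minute * 60 + second  # seconds since midnight
  c(hour   = total %/% 3600,
    minute = (total %% 3600) %/% 60,
    second = floor(total %% 60))               # drop fractional seconds
}

split_fraction(16.2539)    # hour 16, minute 15, second 14
split_fraction(16, 15.24)  # hour 16, minute 15, second 14
```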
When the date and time are in extended format, the ‘T’ can be omitted and replaced by a space:
iso8601todatetime("2019-08-17 16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15:14,00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15:14.00Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15Z")
## [1] "2019-08-17 16:15:00 GMT"
iso8601todatetime("2019-08-17 16:15.24Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17 16:15,24Z")
## [1] "2019-08-17 16:15:14 GMT"
Time zones can be indicated by ‘Z’ (as in the examples above), which denotes UTC or Zulu time, or by an offset in hours or in hours and minutes. When there is no time zone indicator, the time is assumed to be local time. Which time zone that is has to be communicated by other means; the package assumes it is the local time zone of the system on which R is running. A positive offset indicates a time zone east of the prime meridian, whose time is ahead of UTC; a negative offset indicates a time zone west of the prime meridian, whose time is behind UTC.
iso8601todatetime("2019-08-17T16:15:14Z")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14+01:00")
## [1] "2019-08-17 15:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14±00:00")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14-01")
## [1] "2019-08-17 17:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14−00:00")
## [1] "2019-08-17 16:15:14 GMT"
iso8601todatetime("2019-08-17T16:15:14")
## [1] "2019-08-17 16:15:14 CEST"
As shown above, when all date-times have either an offset or are in UTC, the times are converted and shown in UTC (for which R uses the string ‘GMT’). Otherwise, the date-times are shown in the local time zone.
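The direction of the conversion is easy to get wrong: because a positive offset means the clock is ahead of UTC, the offset is subtracted to obtain UTC. A minimal base R sketch of that arithmetic, assuming the offset is already available as a signed number of minutes (the helper `to_utc` is hypothetical and not part of the package):

```r
# Hypothetical helper: convert a wall-clock time with a known UTC offset to UTC.
# `offset_minutes` is signed: +60 for "+01:00", -330 for "-05:30".
to_utc <- function(datetime, offset_minutes = 0) {
  wall <- as.POSIXct(datetime, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
  wall - offset_minutes * 60  # subtract the offset (in seconds)
}

format(to_utc("2019-08-17T16:15:14", 60), "%H:%M:%S")    # "15:15:14"
format(to_utc("2019-08-17T16:15:14", -60), "%H:%M:%S")   # "17:15:14"
format(to_utc("2019-08-17T16:15:14", -330), "%H:%M:%S")  # "21:45:14"
```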
iso8601todatetime returns a ‘POSIXct’ object that has an additional ‘timezone’ attribute that contains the original time zones:
t <- iso8601todatetime(c(
"2019-08-17T16:15:14+01:00",
"2019-08-17T16:15:14+00",
"2019-08-17T16:15:14Z",
"2019-08-17T16:15:14",
"2019-08-17T16:15:14-05:30"
))
print(t)
## [1] "2019-08-17 17:15:14 CEST" "2019-08-17 18:15:14 CEST"
## [3] "2019-08-17 18:15:14 CEST" "2019-08-17 16:15:14 CEST"
## [5] "2019-08-17 23:45:14 CEST"
attr(t, "timezone")
## [1] "+01:00" "GMT" "GMT" "" "-05:30"
Times
The function iso8601totime converts times (without a date). It accepts the following formats:
iso8601totime("T16:15:14")
## [1] "T16:15:14"
iso8601totime("T16:15:14,00")
## [1] "T16:15:14"
iso8601totime("T16:15:14.00")
## [1] "T16:15:14"
iso8601totime("T161514")
## [1] "T16:15:14"
iso8601totime("T161514.00")
## [1] "T16:15:14"
iso8601totime("T161514,00")
## [1] "T16:15:14"
iso8601totime("T16:15.24")
## [1] "T16:15:14"
iso8601totime("T16:15,24")
## [1] "T16:15:14"
iso8601totime("T1615.24")
## [1] "T16:15:14"
iso8601totime("T1615,24")
## [1] "T16:15:14"
iso8601totime("T16.2539")
## [1] "T16:15:14"
iso8601totime("T16,2539")
## [1] "T16:15:14"
When calling iso8601totime we know that we are dealing with times; therefore, the ‘T’ can be omitted:
iso8601totime("16:15:14")
## [1] "T16:15:14"
iso8601totime("16:15:14,00")
## [1] "T16:15:14"
iso8601totime("16:15:14.00")
## [1] "T16:15:14"
iso8601totime("16:15.24")
## [1] "T16:15:14"
iso8601totime("16:15,24")
## [1] "T16:15:14"
iso8601totime("161514")
## [1] "T16:15:14"
iso8601totime("161514,00")
## [1] "T16:15:14"
iso8601totime("161514.00")
## [1] "T16:15:14"
iso8601totime("1615")
## [1] "T16:15:00"
iso8601totime("1615.24")
## [1] "T16:15:14"
iso8601totime("1615,24")
## [1] "T16:15:14"
Time zones are ignored as these are meaningless without a date.
The object returned is of class c("Time", "POSIXct", "POSIXt"). It is therefore a subclass of ‘POSIXct’. As a ‘POSIXct’ object encodes date-times, the times are encoded as times on 1970-01-01. The ‘Time’ class handles proper display of the object. Otherwise, it can be handled as a regular ‘POSIXct’ object.
t <- iso8601totime("T16:15:14Z")
print(t)
## [1] "T16:15:14"
class(t)
## [1] "Time" "POSIXct" "POSIXt"
class(t) <- class(t)[-1]
print(t)
## [1] "1970-01-01 16:15:14 GMT"
Generic conversion
The function iso8601todataframe will parse ISO8601 strings and split these into the separate parts. Only the parts present in any of the strings are returned.
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+00",
"2019229T161514",
"T16:15"
))
## type year month day week weekday yearday hour minutes seconds
## 1 Date 2019 8 17 NA NA NA NA NA NA
## 2 Date 2019 NA NA 33 6 NA NA NA NA
## 3 Datetime 2019 8 17 NA NA NA 16 15 14
## 4 Datetime 2019 NA NA NA NA 229 16 15 14
## 5 Time NA NA NA NA NA NA 16 15 NA
## tzoffsethours tzoffsetminutes
## 1 NA NA
## 2 NA NA
## 3 0 0
## 4 NA NA
## 5 NA NA
The ‘type’ column contains the type of ISO8601 string. For parts not present in the string, NA is returned.
It is also possible to transform the dates to one format: either year-month-day or year-day:
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
), transformdate = "toyearmonthday")
## type year month day hour minutes seconds tzoffsethours tzoffsetminutes
## 1 Date 2019 8 17 NA NA NA NA NA
## 2 Date 2019 8 17 NA NA NA NA NA
## 3 Datetime 2019 8 17 16 15 14 1 0
## 4 Datetime 2019 8 17 16 15 14 NA NA
## 5 Time NA NA NA 16 15 NA NA NA
iso8601todataframe(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
), transformdate = "toyearday")
## type year yearday hour minutes seconds tzoffsethours tzoffsetminutes
## 1 Date 2019 229 NA NA NA NA NA
## 2 Date 2019 229 NA NA NA NA NA
## 3 Datetime 2019 229 16 15 14 1 0
## 4 Datetime 2019 229 16 15 14 NA NA
## 5 Time NA NA 16 15 NA NA NA
Helper functions
iso8601type returns a character vector whose elements indicate the type of ISO8601 string:
iso8601type(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17T16:15:14+01",
"2019229T161514",
"T16:15"
))
## [1] "YMD" "YWD" "YMDTHMS±Z" "YDTHMS" "THM"
iso8601standardise transforms the dates into one standard extended format:
iso8601standardise(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17 16:15:14+01",
"2019229T161514",
"T16:15"
))
## [1] "2019-08-17" "2019-08-17" "2019-08-17T15:15:14Z"
## [4] "2019-08-17T16:15:14" "T16:15:00"
The fillmissing argument fills in missing parts (1 for dates and 0 for times), toymd transforms all dates to year-month-day, and tozulu applies any time zone offsets and transforms the times to UTC (times in local time are not affected):
iso8601standardise(c(
"2019-08-17",
"2019-W33-6",
"2019-08-17 16:15:14+01",
"2019229T161514",
"T16:15"
), fillmissing = TRUE, toymd = TRUE, tozulu = TRUE)
Other R-packages
Other options for parsing date, time and date-time strings are listed below. None of these packages supports time-only strings.
parsedate has the function parse_iso_8601 that supports ISO8601 dates and date-times. This function is significantly slower than those from iso8601.
anytime has the functions anytime and anydate to convert strings to date-time (‘POSIXct’) and ‘Date’ objects. It supports many of the year-month-day formats but not year-week-day or ordinal dates. It also accepts many more formats, which can be an advantage or a disadvantage.
lubridate supports many or most of the year-month-day formats.
as.Date, as.POSIXct and strptime from base R will also support most of the ISO8601 formats, except for time zones, when given an appropriate format string. This, however, requires that all strings have the same format.
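The restriction to a single format can be illustrated with base R alone. In the sketch below, one format string parses only the elements that match it; the hypothetical helper `parse_mixed_dates` works around this by trying a fixed list of formats in turn. It is shown for comparison only and covers far fewer formats than the package functions.

```r
x <- c("2019-08-17", "20190817", "2019-229")

# A single format string: elements in another format become NA.
single <- as.Date(x, format = "%Y-%m-%d")

# Hypothetical helper: try each format on the elements not yet parsed.
parse_mixed_dates <- function(x, formats = c("%Y-%m-%d", "%Y%m%d", "%Y-%j")) {
  res <- rep(as.Date(NA), length(x))
  for (f in formats) {
    todo <- is.na(res)
    res[todo] <- as.Date(x[todo], format = f)
  }
  res
}

mixed <- parse_mixed_dates(x)
is.na(single)  # FALSE TRUE TRUE
format(mixed)  # all three elements are "2019-08-17"
```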