Accept all RFC3339-compliant timestamps (#1093)
**Issue:**
`aws_date_time` did not accept "2024-02-23 23:06:27+00:00", despite that being a valid RFC 3339 timestamp.

**Background:**
- [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601): is a standard for date and time, but has a lot of options in it.
    - `aws_date_time` has enums for ISO_8601, but does not claim to support every single option
    - One option it did support though, was Basic vs Extended format.
        - Basic: no separators between "YYYYMMDD" and "HHMMSS"
        - Extended: has separators, as in "YYYY-MM-DD" and "HH:MM:SS"
- [RFC 3339](https://datatracker.ietf.org/doc/html/rfc3339): details a subset of ISO 8601 for use in internet timestamps
    - [Smithy 2.0](https://smithy.io/2.0/spec/protocol-traits.html#timestamp-formats) says that timestamps should use RFC 3339
    - `aws_date_time` does not fully support RFC 3339. The unsupported features in the string above were:
        1) Space instead of "T" between date and time
        2) "+00:00" offset instead of "Z" after time
- I personally found the [parsing code for ISO_8601](https://github.com/awslabs/aws-c-common/blob/15a25349d59852e2655c0920835644f2eb948d77/source/date_time.c#L326) hard to follow, and the [ISO_8601_BASIC code](https://github.com/awslabs/aws-c-common/blob/15a25349d59852e2655c0920835644f2eb948d77/source/date_time.c#L221) was a big copy/paste of that.
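
To make the two unsupported features concrete, here is a minimal sketch that only detects them in an Extended-format timestamp. `rfc3339_features` and `detect_features` are hypothetical names used for illustration; they are not part of aws-c-common, and the sketch assumes the date portion is the 10-character "YYYY-MM-DD" form.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in aws-c-common): reports which RFC 3339
 * features a timestamp uses that the old parser rejected. */
struct rfc3339_features {
    bool space_separator; /* ' ' instead of 'T' between date and time */
    bool numeric_offset;  /* "+HH:MM" or "-HH:MM" instead of 'Z' */
};

static struct rfc3339_features detect_features(const char *s) {
    struct rfc3339_features f = {false, false};
    size_t len = strlen(s);

    /* Assumes an Extended date "YYYY-MM-DD" (10 chars), so the
     * date/time separator sits at index 10. */
    if (len > 10 && s[10] == ' ') {
        f.space_separator = true;
    }

    /* A numeric offset with a colon occupies the final 6 chars:
     * sign, two hour digits, ':', two minute digits. */
    if (len >= 6) {
        const char *off = s + len - 6;
        if ((off[0] == '+' || off[0] == '-') && off[3] == ':') {
            f.numeric_offset = true;
        }
    }
    return f;
}
```

For "2024-02-23 23:06:27+00:00" both flags are set; for "2024-02-23T23:06:27Z" neither is.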

**Description of changes:**
- The code is rewritten to be simpler. It is no longer a state machine.
- Combine parsing code for `ISO_8601` and `ISO_8601_BASIC` into 1 function.
    - The parse is lenient now, accepting Basic or Extended timestamps, regardless of whether the `ISO_8601` or `ISO_8601_BASIC` enum is passed in.
        - Our `ISO_8601` code already allowed separators to be omitted from the time, which suggests this kind of leniency is useful.
        - This matches the leniency of Python's [dateutil.parser.isoparse(str)](https://dateutil.readthedocs.io/en/stable/parser.html) which allows "2024-02-23T23:06:27Z" or "20240223T230627Z" or "2024-02-23T230627Z" or "20240223T23:06:27Z"
    - Allow space instead of "T"
    - Allow lowercase "t" and "z"
    - Support offsets like "+12:34", instead of just "Z"
    - Allow ",123" for fractional seconds, not just ".123"
        - The Basic parse code had a bug where it did not expect any separator character between seconds and fractional seconds. That is wrong, and Python's dateutil rejects such input.
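
The rules above can be sketched as a straight-line parser with no state machine. This is an illustrative sketch, not the code from this commit: `parsed_time`, `read_digits`, `skip_optional`, and `parse_iso8601_lenient` are hypothetical names. It also makes assumptions the description does not state: the offset is required, the colon in the offset is optional, fractional digits are discarded, and field ranges (month 1-12, etc.) are not validated.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct parsed_time {
    int year, month, day, hour, minute, second;
    int offset_minutes; /* offset from UTC; 0 for 'Z' or 'z' */
};

/* Read exactly n decimal digits at *p and advance past them.
 * Stops at the first non-digit (including '\0'), so it never reads past the end. */
static bool read_digits(const char **p, int n, int *out) {
    int v = 0;
    for (int i = 0; i < n; ++i) {
        if (!isdigit((unsigned char)(*p)[i])) return false;
        v = v * 10 + ((*p)[i] - '0');
    }
    *p += n;
    *out = v;
    return true;
}

/* Skip separator c if present; this is what makes Basic vs Extended lenient. */
static void skip_optional(const char **p, char c) {
    if (**p == c) ++*p;
}

static bool parse_iso8601_lenient(const char *s, struct parsed_time *out) {
    const char *p = s;
    memset(out, 0, sizeof(*out));

    /* Date: YYYY[-]MM[-]DD */
    if (!read_digits(&p, 4, &out->year)) return false;
    skip_optional(&p, '-');
    if (!read_digits(&p, 2, &out->month)) return false;
    skip_optional(&p, '-');
    if (!read_digits(&p, 2, &out->day)) return false;

    /* Date/time separator: 'T', 't', or space. */
    if (*p != 'T' && *p != 't' && *p != ' ') return false;
    ++p;

    /* Time: HH[:]MM[:]SS */
    if (!read_digits(&p, 2, &out->hour)) return false;
    skip_optional(&p, ':');
    if (!read_digits(&p, 2, &out->minute)) return false;
    skip_optional(&p, ':');
    if (!read_digits(&p, 2, &out->second)) return false;

    /* Optional fraction: '.' or ',' then at least one digit (discarded here). */
    if (*p == '.' || *p == ',') {
        ++p;
        if (!isdigit((unsigned char)*p)) return false;
        while (isdigit((unsigned char)*p)) ++p;
    }

    /* Offset: 'Z', 'z', or (+|-)HH[:]MM. Required in this sketch. */
    if (*p == 'Z' || *p == 'z') {
        ++p;
    } else if (*p == '+' || *p == '-') {
        int sign = (*p == '-') ? -1 : 1;
        int oh = 0, om = 0;
        ++p;
        if (!read_digits(&p, 2, &oh)) return false;
        skip_optional(&p, ':');
        if (!read_digits(&p, 2, &om)) return false;
        out->offset_minutes = sign * (oh * 60 + om);
    } else {
        return false;
    }

    /* Reject trailing characters. */
    return *p == '\0';
}
```

Because each field is read in order with optional separators, the same function accepts "2024-02-23T23:06:27Z", "20240223T230627Z", and mixed forms, while digits that follow the seconds without a "." or "," are rejected.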
graebm authored Mar 5, 2024
1 parent 15a2534 commit fcadc0d
Showing 4 changed files with 243 additions and 304 deletions.
7 changes: 5 additions & 2 deletions include/aws/common/date_time.h
@@ -80,11 +80,14 @@ AWS_COMMON_API void aws_date_time_init_epoch_secs(struct aws_date_time *dt, doub
  * Initializes dt to be the time represented by date_str in format 'fmt'. Returns AWS_OP_SUCCESS if the
  * string was successfully parsed, returns AWS_OP_ERR if parsing failed.
  *
+ * The parser is lenient regarding AWS_DATE_FORMAT_ISO_8601 vs AWS_DATE_FORMAT_ISO_8601_BASIC.
+ * Regardless of which you pass in, both "2002-10-02T08:05:09Z" and "20021002T080509Z" would be accepted.
+ *
  * Notes for AWS_DATE_FORMAT_RFC822:
  * If no time zone information is provided, it is assumed to be local time (please don't do this).
  *
- * If the time zone is something other than something indicating Universal Time (e.g. Z, UT, UTC, or GMT) or an offset
- * from UTC (e.g. +0100, -0700), parsing will fail.
+ * Only time zones indicating Universal Time (e.g. Z, UT, UTC, or GMT),
+ * or offsets from UTC (e.g. +0100, -0700), are accepted.
  *
  * Really, it's just better if you always use Universal Time.
  */