Bugfix/466 standardization support for second fractions #679
Merged: benedeki merged 10 commits into develop from bugfix/466-standardization-support-for-second-fractions on Jul 27, 2019
Commits:
b7c5baa #466: Standardization support for second fractions (benedeki)
a280691 #466: Standardization support for second fractions (benedeki)
7785145 #466: Standardization support for second fractions (benedeki)
d995710 #466: Standardization support for second fractions (benedeki)
4b6da16 #466: Standardization support for second fractions (benedeki)
27c7d00 #466: Standardization support for second fractions (benedeki)
078b26d #466: Standardization support for second fractions (benedeki)
91131f0 #466: Standardization support for second fractions - addressing PR co… (benedeki)
24b4110 #466: Standardization support for second fractions - addressing PR co… (benedeki)
3e7a5eb #466: Standardization support for second fractions - addressing PR co… (benedeki)
@@ -223,44 +223,52 @@ To enable processing of time entries from other systems **Standardization** offe
string and even numeric values to timestamp or date types. It's done using Spark's ability to convert strings to
timestamp/date with some enhancements. The pattern placeholders and their usage are described in Java's
[`SimpleDateFormat` class description](https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html) with
-the addition of recognizing two keywords `epoch` and `milliepoch` (case insensitive) to denote the number of
-seconds/milliseconds since epoch (1970/01/01 00:00:00.000 UTC).
+the addition of recognizing some keywords (like `epoch` and `milliepoch` (case insensitive)) to denote the number of
+seconds/milliseconds since epoch (1970/01/01 00:00:00.000 UTC) and some additional placeholders.
+It should be noted explicitly that `epoch` and `milliepoch` are considered a pattern including time zone.
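
The remark that epoch-based patterns include a time zone can be illustrated with a short sketch. It is written in Python purely for illustration and is not the project's implementation; the helper name `epoch_seconds_to_instant` is hypothetical. The value `1557136493` is the example this README uses for `epoch`.

```python
from datetime import datetime, timedelta, timezone


def epoch_seconds_to_instant(seconds: int) -> datetime:
    """Interpret whole seconds since 1970-01-01 00:00:00 UTC as an aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# An epoch value identifies an absolute instant, so no separate zone is needed.
instant = epoch_seconds_to_instant(1557136493)
print(instant.isoformat())  # 2019-05-06T09:54:53+00:00

# Rendering the same instant in another zone changes the wall-clock time only.
utc_minus_8 = timezone(timedelta(hours=-8))
local_view = instant.astimezone(utc_minus_8)
print(local_view.isoformat())  # 2019-05-06T01:54:53-08:00
print(local_view == instant)   # True
```

Because the count is anchored to UTC, two systems in different zones that exchange the same epoch value agree on the instant it denotes.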

Summary:

-| placeholder | Description | Example |
-| --- | --- | --- |
-| G | Era designator | AD |
-| y | Year | 1996; 96 |
-| Y | Week year | 2009; 09 |
-| M | Month in year (context sensitive) | July; Jul; 07 |
-| L | Month in year (standalone form) | July; Jul; 07 |
-| w | Week in year | 27 |
-| W | Week in month | 2 |
-| D | Day in year | 189 |
-| d | Day in month | 10 |
-| F | Day of week in month | 2 |
-| E | Day name in week | Tuesday; Tue |
-| u | Day number of week (1 = Monday, ..., 7 = Sunday) | 1 |
-| a | Am/pm marker | PM |
-| H | Hour in day (0-23) | 0 |
-| k | Hour in day (1-24) | 24 |
-| K | Hour in am/pm (0-11) | 0 |
-| h | Hour in am/pm (1-12) | 12 |
-| m | Minute in hour | 30 |
-| s | Second in minute | 55 |
-| S | Millisecond | 978 |
-| z | General time zone | Pacific Standard Time; PST; GMT-08:00 |
-| Z | RFC 822 time zone | -0800 |
-| X | ISO 8601 time zone | -08; -0800; -08:00 |
-| _epoch_ | Seconds since 1970/01/01 00:00:00 | 1557136493 |
-| _milliepoch_ | Milliseconds since 1970/01/01 00:00:00.0000 | 15571364938124 |
+| placeholder | Description | Example | Note |
+| --- | --- | --- | --- |
+| G | Era designator | AD | |
+| y | Year | 1996; 96 | |
+| Y | Week year | 2009; 09 | |
+| M | Month in year (context sensitive) | July; Jul; 07 | |
+| L | Month in year (standalone form) | July; Jul; 07 | |
+| w | Week in year | 27 | |
+| W | Week in month | 2 | |
+| D | Day in year | 189 | |
+| d | Day in month | 10 | |
+| F | Day of week in month | 2 | |
+| E | Day name in week | Tuesday; Tue | |
+| u | Day number of week (1 = Monday, ..., 7 = Sunday) | 1 | |
+| a | Am/pm marker | PM | |
+| H | Hour in day (0-23) | 0 | |
+| k | Hour in day (1-24) | 24 | |
+| K | Hour in am/pm (0-11) | 0 | |
+| h | Hour in am/pm (1-12) | 12 | |
+| m | Minute in hour | 30 | |
+| s | Second in minute | 55 | |
+| S | Millisecond | 978 | |
+| z | General time zone | Pacific Standard Time; PST; GMT-08:00 | |
+| Z | RFC 822 time zone | -0800 | |
+| X | ISO 8601 time zone | -08; -0800; -08:00 | |
+| _epoch_ | Seconds since 1970/01/01 00:00:00 | 1557136493, 1557136493.136 | |
+| _epochmilli_ | Milliseconds since 1970/01/01 00:00:00.0000 | 1557136493128, 1557136493128.001 | |
+| _epochmicro_ | Microseconds since 1970/01/01 00:00:00.0000 | 1557136493128789, 1557136493128789.999 | |
+| _epochnano_ | Nanoseconds since 1970/01/01 00:00:00.0000 | 1557136493128789101 | See the remark below regarding the loss of precision in _nanoseconds_ |
+| i | Microsecond | 111, 321001 | |
+| n | Nanosecond | 999, 542113879 | See the remark below regarding the loss of precision in _nanoseconds_ |


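The relationship between the epoch-based keywords in the table can be sketched as a power-of-ten shift to seconds. This is an illustrative Python sketch under stated assumptions, not the library's code: the mapping `KEYWORD_EXPONENT` and the helper `to_epoch_seconds` are hypothetical names, and only the four keyword spellings from the 4-column table are covered. `Decimal` is used so the fractional examples keep all their digits.

```python
from decimal import Decimal

# Power-of-ten shift from each keyword's unit to seconds.
# Keyword names follow the summary table; the mapping itself is illustrative.
KEYWORD_EXPONENT = {
    "epoch": 0,
    "epochmilli": -3,
    "epochmicro": -6,
    "epochnano": -9,
}


def to_epoch_seconds(value: str, keyword: str) -> Decimal:
    """Scale an epoch-based value to seconds; the keyword is matched case-insensitively."""
    exponent = KEYWORD_EXPONENT[keyword.lower()]
    # scaleb shifts the decimal exponent, so no digits are lost to float rounding.
    return Decimal(value).scaleb(exponent)


print(to_epoch_seconds("1557136493.136", "epoch"))           # 1557136493.136
print(to_epoch_seconds("1557136493128.001", "EpochMilli"))   # 1557136493.128001
print(to_epoch_seconds("1557136493128789101", "epochnano"))  # 1557136493.128789101
```

Lower-casing the keyword before the lookup is one simple way to get the case-insensitive matching the README describes.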
**NB!** Spark uses US Locale and because on-the-fly conversion would be complicated, at the moment we stick to this
hardcoded locale as well. E.g. `am/pm` for `a` placeholder, English names of days and months etc.

**NB!** The keywords are case **insensitive**. Therefore, there is no difference between `epoch` and `EpoCH`.

+**NB!** While _nanoseconds_ designation is supported on input, it's not supported in storage or further usage. So any
+value beyond microsecond precision will be truncated.
+

Review comment on this note (truncated in the source): Please put some

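The truncation to microsecond precision described in the note above can be illustrated with plain integer arithmetic. This is a minimal Python sketch assuming non-negative nanosecond counts; `truncate_nanos_to_micros` is a hypothetical helper, not a function of the library.

```python
NANOS_PER_MICRO = 1_000


def truncate_nanos_to_micros(nanos: int) -> int:
    """Drop the sub-microsecond digits of a non-negative count (truncation, not rounding)."""
    return nanos // NANOS_PER_MICRO


# `epochnano` example from the table: the trailing 101 ns are lost.
print(truncate_nanos_to_micros(1557136493128789101))  # 1557136493128789
# `n` placeholder example: 542113879 ns become 542113 microseconds.
print(truncate_nanos_to_micros(542113879))  # 542113
# Truncation rather than rounding: 999 ns do not round up to 1 microsecond.
print(truncate_nanos_to_micros(999))  # 0
```
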
##### Time Zone support
As it has been mentioned, it's highly recommended to use timestamps with the time zone. But it's not unlikely that the
I like that we have such a good description of pattern characters. I had to look up for this every time I needed to write a pattern. Now README is the place to look. 👍