Should ECMA-402 spec text for time zone canonicalization refer to CLDR or to IANA as authoritative? #825

@justingrant

CLDR in unicode-org/cldr#3105 will soon provide the ability to fetch modern canonical IDs for currently-problematic time zones like Asia/Calcutta and Europe/Kiev. ICU will also be adding an API to expose the IANA canonical ID. This will enable V8 and JSC to finally expose modern IANA canonical names like SpiderMonkey does, but without the separate IANA-based overrides that SpiderMonkey has had to maintain.
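
For context, the engine difference described above can be observed through `Intl.DateTimeFormat`. This is only a minimal sketch: the helper name `resolvedTimeZone` is made up for illustration, and which of the two names comes back depends on the engine and its ICU/CLDR data version, so no particular output is claimed.

```typescript
// Ask the engine which identifier it reports for a legacy-named zone.
// Depending on engine and data version, this may echo the legacy name
// (e.g. "Asia/Calcutta") or return the modern one (e.g. "Asia/Kolkata").
function resolvedTimeZone(id: string): string {
  return new Intl.DateTimeFormat("en", { timeZone: id }).resolvedOptions().timeZone;
}

for (const id of ["Asia/Calcutta", "Europe/Kiev"]) {
  console.log(`${id} -> ${resolvedTimeZone(id)}`);
}
```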

Before this change, it didn't make sense to have normative spec text in 402 to define which IDs should be primary (the new 262 term for canonical time zone ID) vs. non-primary. But with this CLDR/ICU change, it's finally practical to specify normative rules for determining which IDs are primary (and, if not, which primary ID they resolve to).

I can draft a PR with this normative text for discussion, but first there's one main question to answer: should we specify the rules solely as using CLDR data, solely as using IANA data, or should we specify rules for which IANA IDs are canonical that just happen to match what CLDR is doing?

As an illustration, here are two possible directions that we could take this spec text. Don't worry about the particular text used (it's very rough and will change); the general approach of depending on CLDR vs. depending on IANA is where I'm most looking for feedback.

Which one is better? @sffc @gibson042 @anba @Constellation @FrankYFTang

Option A - Defer to CLDR

The Unicode Common Locale Data Repository (CLDR) defines which available named time zone identifiers are primary or non-primary, as well as which non-primary time zone identifiers resolve to which primary time zone identifiers. The following exceptions are applied:

  • For historical reasons, "UTC" is a primary time zone identifier while "Etc/UTC", "Etc/GMT", and "GMT" (and the identifiers that resolve to them) are non-primary time zone identifiers that resolve to "UTC".
  • CLDR identifiers that are not present in the IANA Time Zone Database ("Etc/Unknown", "Canada/East-Saskatchewan", and "US/Pacific-New") are not supported.
  • "Factory" is not supported.

The following spec text would replace the steps of https://tc39.es/proposal-temporal/#sup-availablenamedtimezoneidentifiers:

  1. Let result be a new empty List.
  2. For each <type> element that contains an alias attribute in timezone.xml in the Unicode Common Locale Data Repository (CLDR), do
    1. Let aliases be a new List, populated by splitting the space-delimited alias attribute.
    2. If an iana attribute is present, let primary be the String value of that attribute; otherwise, let primary be the first element in aliases.
    3. If primary is one of "Etc/UTC", "Etc/GMT", or "GMT", set primary to "UTC".
    4. If primary is neither "Etc/Unknown" nor "Factory", then
      1. Let record be the Time Zone Identifier Record { [[Identifier]]: primary, [[PrimaryIdentifier]]: primary }.
      2. Append record to result.
      3. For each element identifier in aliases, do
        1. If identifier is not primary and is neither "Canada/East-Saskatchewan" nor "US/Pacific-New", then
          1. Set record to the Time Zone Identifier Record { [[Identifier]]: identifier, [[PrimaryIdentifier]]: primary }.
          2. Append record to result.
  3. Sort result lexicographically by UTF-16 code unit of each element's [[Identifier]] field, in ascending order.
  4. Assert: result contains a Time Zone Identifier Record r such that r.[[Identifier]] is "UTC" and r.[[PrimaryIdentifier]] is "UTC".
  5. Return result.
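
The steps above could be sketched in TypeScript roughly as follows. This is an illustration under assumptions, not proposed spec text: the `CldrZoneType` shape stands in for a parsed `<type>` element, the sample entries are hand-written rather than copied from timezone.xml, and every name in the block is hypothetical. Each non-primary alias is recorded with its own name as [[Identifier]] and the primary as [[PrimaryIdentifier]].

```typescript
// Hypothetical, simplified shape of one parsed <type> element from timezone.xml.
interface CldrZoneType {
  alias: string; // space-delimited list of identifiers
  iana?: string; // optional preferred IANA identifier
}

interface TimeZoneIdentifierRecord {
  identifier: string;
  primaryIdentifier: string;
}

const UTC_EQUIVALENTS = new Set(["Etc/UTC", "Etc/GMT", "GMT"]);
const UNSUPPORTED_PRIMARIES = new Set(["Etc/Unknown", "Factory"]);
const NON_IANA_ALIASES = new Set(["Canada/East-Saskatchewan", "US/Pacific-New"]);

function availableNamedTimeZoneIdentifiers(types: CldrZoneType[]): TimeZoneIdentifierRecord[] {
  const result: TimeZoneIdentifierRecord[] = [];
  for (const type of types) {
    // Step 2.1: split the space-delimited alias attribute.
    const aliases = type.alias.split(" ").filter((s) => s.length > 0);
    // Step 2.2: prefer the iana attribute, else the first alias.
    let primary = type.iana ?? aliases[0];
    // Step 2.3: fold the UTC-equivalent identifiers into "UTC".
    if (UTC_EQUIVALENTS.has(primary)) primary = "UTC";
    // Step 2.4: skip unsupported primaries entirely.
    if (UNSUPPORTED_PRIMARIES.has(primary)) continue;
    result.push({ identifier: primary, primaryIdentifier: primary });
    for (const identifier of aliases) {
      if (identifier !== primary && !NON_IANA_ALIASES.has(identifier)) {
        result.push({ identifier, primaryIdentifier: primary });
      }
    }
  }
  // Step 3: the < and > operators on strings compare by UTF-16 code unit.
  result.sort((a, b) => (a.identifier < b.identifier ? -1 : a.identifier > b.identifier ? 1 : 0));
  // Step 4: the result must contain the UTC record.
  if (!result.some((r) => r.identifier === "UTC" && r.primaryIdentifier === "UTC")) {
    throw new Error("Assertion failed: missing UTC record");
  }
  return result;
}

// Hand-written sample input for illustration only.
const sampleTypes: CldrZoneType[] = [
  { alias: "Asia/Calcutta Asia/Kolkata", iana: "Asia/Kolkata" },
  { alias: "Etc/UTC Etc/UCT UTC Zulu" },
  { alias: "Etc/Unknown" },
];

for (const r of availableNamedTimeZoneIdentifiers(sampleTypes)) {
  console.log(`${r.identifier} -> ${r.primaryIdentifier}`);
}
```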

Option B - Define using IANA only

Each Zone in the IANA Time Zone Database must be a primary time zone identifier and each Link name in the IANA Time Zone Database must be a non-primary time zone identifier that resolves to its corresponding Zone name, with the following exceptions:

  • For historical reasons, "UTC" must be a primary time zone identifier. "Etc/UTC", "Etc/GMT", and "GMT", as well as all Link names that resolve to any of them, must be non-primary time zone identifiers that resolve to "UTC".
  • Any Link name in the TZ column of zone.tab of the IANA Time Zone Database must be a primary time zone identifier.
  • Any Link name that represents a geographical area entirely contained within the territory of a single ISO 3166-1 alpha-2 country code must resolve to a primary identifier that also represents a geographical area entirely contained within the territory of the same ISO 3166-1 alpha-2 country code. For example, "Atlantic/Jan_Mayen" must resolve to "Arctic/Longyearbyen".
  • The following legacy POSIX identifiers must resolve to their Continent/City equivalents, as shown in the table below:
| Legacy POSIX Zone Name | Primary Time Zone Identifier |
| --- | --- |
| EST | Etc/GMT+5 |
| MST | Etc/GMT+7 |
| HST | Etc/GMT+10 |
| EST5EDT | America/New_York |
| CST6CDT | America/Chicago |
| MST7MDT | America/Denver |
| PST8PDT | America/Los_Angeles |
| WET | Europe/Lisbon |
| CET | Europe/Berlin |
| MET | Europe/Vienna |
| EET | Europe/Athens |
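
To make the Option B rules concrete, here is a rough TypeScript sketch of how an implementation might resolve an identifier. Everything in it is an assumption for illustration: the `TzdbSnapshot` shape, the function name, and the tiny hand-written data set are hypothetical and not taken from any tzdb release. The country-containment rule needs geographic knowledge that the sketch does not compute, so it is modeled as a precomputed `countryOverrides` map.

```typescript
// Hypothetical in-memory view of the IANA data needed by the rules above.
interface TzdbSnapshot {
  zones: Set<string>;                    // Zone names
  links: Map<string, string>;            // Link name -> target name
  zoneTabIds: Set<string>;               // names in the TZ column of zone.tab
  countryOverrides: Map<string, string>; // assumed precomputed same-country targets
}

const UTC_LINK_TARGETS = new Set(["Etc/UTC", "Etc/GMT", "GMT"]);

// Legacy POSIX names and their primary identifiers, from the table above.
const LEGACY_POSIX: ReadonlyMap<string, string> = new Map([
  ["EST", "Etc/GMT+5"], ["MST", "Etc/GMT+7"], ["HST", "Etc/GMT+10"],
  ["EST5EDT", "America/New_York"], ["CST6CDT", "America/Chicago"],
  ["MST7MDT", "America/Denver"], ["PST8PDT", "America/Los_Angeles"],
  ["WET", "Europe/Lisbon"], ["CET", "Europe/Berlin"],
  ["MET", "Europe/Vienna"], ["EET", "Europe/Athens"],
]);

function resolvePrimaryIdentifier(id: string, data: TzdbSnapshot): string {
  if (!data.zones.has(id) && !data.links.has(id)) {
    throw new RangeError(`Unknown time zone identifier: ${id}`);
  }
  if (id === "UTC") return "UTC";
  // Legacy POSIX names use the fixed table.
  const legacy = LEGACY_POSIX.get(id);
  if (legacy !== undefined) return legacy;
  // Links that must stay within the same country.
  const override = data.countryOverrides.get(id);
  if (override !== undefined) return override;
  // A Link listed in zone.tab is itself primary.
  if (data.links.has(id) && data.zoneTabIds.has(id)) return id;
  // Otherwise follow Links until a non-Link name is reached (guarding against cycles).
  let target = id;
  const seen = new Set<string>([id]);
  let next = data.links.get(target);
  while (next !== undefined && !seen.has(next)) {
    seen.add(next);
    target = next;
    next = data.links.get(target);
  }
  // Anything that lands on a UTC-equivalent name resolves to "UTC".
  return UTC_LINK_TARGETS.has(target) ? "UTC" : target;
}

// Hand-written sample data for illustration only.
const sampleTzdb: TzdbSnapshot = {
  zones: new Set(["Asia/Kolkata", "Etc/UTC", "Etc/GMT+5", "Europe/Berlin", "EST"]),
  links: new Map<string, string>([
    ["Asia/Calcutta", "Asia/Kolkata"],
    ["UTC", "Etc/UTC"],
    ["Zulu", "Etc/UTC"],
    ["Arctic/Longyearbyen", "Europe/Berlin"],
    ["Atlantic/Jan_Mayen", "Europe/Berlin"],
  ]),
  zoneTabIds: new Set(["Asia/Kolkata", "Europe/Berlin", "Arctic/Longyearbyen"]),
  countryOverrides: new Map<string, string>([["Atlantic/Jan_Mayen", "Arctic/Longyearbyen"]]),
};

for (const id of ["Asia/Calcutta", "Zulu", "Arctic/Longyearbyen", "Atlantic/Jan_Mayen", "EST"]) {
  console.log(`${id} -> ${resolvePrimaryIdentifier(id, sampleTzdb)}`);
}
```

The order of the checks is a design choice of the sketch, not something the draft text prescribes; a real implementation would more likely precompute the whole mapping once per tzdb release.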

Option C - Define IANA rules, but explain how to use timezone.xml

We could also merge both (A) and (B): define the IANA rules *and* explain how to use timezone.xml data to satisfy those rules. I'm not going to draft text for this yet because I'm unsure if anyone would want it, but I wanted to include this option here for discussion.

Labels: Small (smaller change solvable in a pull request); c: datetime (component: dates, times, timezones); s: in progress (status: the issue has an active proposal)