Official MIME Type? #465

Closed
hairyhenderson opened this issue Jun 4, 2017 · 22 comments

Comments

@hairyhenderson

There doesn't seem to be a registered MIME Type for TOML - are there any plans to register one?

From some quick google searching I see one use of text/x-toml, but I'd suggest application/toml as a more appropriate type.

If an application hasn't been started yet, here is the place to start: https://www.iana.org/form/media-types.

Thanks!

@Hrxn

Hrxn commented Jun 5, 2017

Yes, application/toml

+1, this would be along the lines of JSON.

@ChristianSi
Contributor

-1, if there is a MIME type, it should be text/toml. TOML, as opposed to JSON, is very much for files that are meant to be read and written by humans, such as config files.

@hairyhenderson
Author

I don't feel too strongly about this, but based on https://tools.ietf.org/html/rfc2046#section-3 I think application/toml is still more appropriate.

Re: text (from the above - section 3.1):

          ...
          Other subtypes are to be used for enriched text in
          forms where application software may enhance the
          appearance of the text, but such software must not be
          required in order to get the general idea of the
          content.  Possible subtypes of "text" thus include any
          word processor format that can be read without
          resorting to software that understands the format.

Based on that, I think subtypes of text are intended more for unstructured (but possibly formatted) text, whereas application seems more appropriate for structured data, like TOML.

I'd argue that, while TOML emphasizes human-readability, it's still primarily a data format, intended to be read by applications. By contrast, a markdown file (text/markdown) conveys its full meaning without being processed further by an application, whereas TOML files are generally meaningless outside of the context of the application they are configuring.

@lilydjwg

But we have text/{css,vcard,csv,html}. application/* sounds like it's not intended for people to read and write even if it's text (like JSON or mbox). application/javascript is strange since many source types are text/* or text/x-*.

@hairyhenderson
Author

application/* sounds like it's not intended for people to read and write even if it's text (like JSON or mbox).

@lilydjwg I agree that's how it sounds... But neither JSON nor mbox is primarily intended to be read by humans. Whether they're human-readable or not is incidental IMO.

As for TOML, I still maintain that it's meaningless outside of the context of an application that parses it. A file written in TOML must be processed by some application in order for it to gain meaning. In contrast (and, in theory but not always in practice), you could read a file written in text/html without missing any of the intended meaning.

I'd also contend that we shouldn't use the existing registered MIME types, especially older ones, as particularly good examples of RFC 2046 conformance 😉

@hairyhenderson
Author

Thanks @mojombo!

@ChristianSi
Contributor

The README now says: "the appropriate MIME type is application/toml."

However, this seems very bad advice, since https://www.iana.org/assignments/media-types/media-types.xhtml knows no such MIME type. Unless the MIME type has officially been registered (assuming that anyone even plans to do that), application/x-toml should be used (or something else with x- in it).

@ChristianSi
Contributor

I propose to re-open this and fix the README.

@hairyhenderson
Author

@ChristianSi the x- is discouraged as per https://tools.ietf.org/html/rfc6838#section-3.4

IMO it's totally reasonable to recommend application/toml before it's registered. Also IMO, it's up to @mojombo to initiate the registration process (it is, after all Tom's Own ... 😉).

Again, the link to the registration form is https://www.iana.org/form/media-types...

@ChristianSi
Contributor

@hairyhenderson I stand corrected. However, I also note that the same RFC section says: "with the simplified registration procedures described above for vendor and personal trees, it should rarely, if ever, be necessary to use unregistered types." Hence I take it that the RFC does NOT recommend using x-free MIME types without registration.

Also, assuming some Tim one day designs Tim's Original Multimedia Layout (TOML) and registers the appropriate MIME type before @mojombo comes around to do it. Then we would be in trouble....

Hence, forget about the x-, but swiftly registering the MIME type still seems a good idea.

@pradyunsg
Member

+1 to registering the MIME type asap.

@hairyhenderson
Author

the RFC does NOT recommend using x-free MIME types without registration.

Of course it doesn't - type registration is the subject of that RFC, after all 😉

but swiftly registering the MIME type still seems a good idea.

I agree! But AFAIK, there's only one person who can reasonably do that 🙂

Also, assuming some Tim one day designs Tim's Original Multimedia Layout (TOML) and registers the appropriate MIME type before @mojombo comes around to do it. Then we would be in trouble....

I think this is unlikely. TOML's a well-enough-known format by now... And encouraging common usage of application/toml is IMO a good way to prevent this from happening - prior art, and all that.
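For readers who want to follow the application/toml recommendation in their own tooling, here is a minimal sketch using Python's standard `mimetypes` module. The helper name `content_type_for` and the fallback to `application/octet-stream` are assumptions made for illustration, and whether a given Python version already maps `.toml` out of the box is not something this thread establishes, so the sketch registers the mapping explicitly.

```python
import mimetypes

# Assumption: application/toml is the type recommended in this thread.
# Registering it explicitly avoids relying on the platform's type tables.
mimetypes.add_type("application/toml", ".toml")


def content_type_for(path: str) -> str:
    """Hypothetical helper: pick a Content-Type value for a file path."""
    guessed, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


print(content_type_for("config.toml"))
```

The explicit `add_type` call is the design choice here: it makes the behaviour the same whether or not the local type database knows about TOML.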

@patcon

patcon commented Oct 28, 2018

In case anyone wants to help drive the registration process: #574 ❤️

@DeadWisdom

DeadWisdom commented May 16, 2020

@hairyhenderson -- You've got this wrong here. The point of the text/* media type vs application/* is toward displaying the file to the user. An application/* object should make no sense to a user, and so a system should not even attempt to present it except within the context of its application.

For instance, given an attachment in an email, and the user clicks on it, the client should not present it, unless it knows how to handle that type. A text/* object however, should be displayed as text even if the client doesn't know what to do with it.

The point for text/* isn't that it should represent long-textual data, but rather that it is likely to be fully readable by a human as opposed to application/*, which is not. The former is the very point of TOML.
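The display rule described above can be sketched as a small policy function. This is only an illustration of the argument, not how any particular mail client behaves; the function name `should_display_inline` and the `known_types` parameter are hypothetical.

```python
def should_display_inline(media_type: str, known_types: set[str]) -> bool:
    """Hypothetical client policy for showing an attachment to the user.

    - Types the client has a handler for are always presented.
    - Unknown text/* types fall back to being shown as plain text.
    - Unknown application/* (and other) types are not presented.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in known_types:
        return True
    return essence.startswith("text/")


print(should_display_inline("text/toml; charset=utf-8", set()))
print(should_display_inline("application/toml", set()))
```

Under this policy a client with no TOML support would still show a text/toml attachment as text, while an application/toml attachment would only be presented once the client registers a handler for it.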

As for JSON, "application/json" is correct, because although the user can parse it, the purpose of JSON was always to be a limited data format, not necessarily an easily readable one. Hence the long disputed decision that it cannot have comments. Nor does it dictate any human readable whitespace / newlines.

Sorry to re-open a long dead, closed issue, and I don't mean to bikeshed, but whereas one might see this as trivial, I see a correct mime type as very important.

@hairyhenderson
Author

The point of the text/* media type vs application/* is toward displaying the file to the user

I think that over-simplifies things, but even with this simplification, application/toml still makes more sense than text/toml.

Again, from the MIME RFC:

Other subtypes are to be used for enriched text in
forms where application software may enhance the
appearance of the text, but such software must not be
required in order to get the general idea of the
content. Possible subtypes of "text" thus include any
word processor format that can be read without
resorting to software that understands the format.

The common usage of TOML is not primarily as an "enriched text" format. I wouldn't write a blog post in TOML, for example.

From this repo's README:

TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics.

TOML is specifically defined as a config format, and even though it may be easy to read, it's still impossible for a human to gain the intended value of a TOML file on its own. To actually gain the value of a TOML file, I'd need to pair it with some software to process it.

Unfortunately RFC 2046 is somewhat unclear, likely due to the rarity back in 1996 of human-readable non-binary configuration formats. Besides, the use of MIME types has long since escaped the realm of "Internet Mail", in ways that I'm sure the authors did not at all expect!

Ultimately, until someone goes through the process of registering with IANA, there won't be any official type, and any argument around which one to use will simply be bikeshedding 😉

I see a correct mime type as very important.

If so, please help out with #574!

@patcon

patcon commented May 16, 2020

Or both :) Ceci n'est pas une bikeshed

From inspecting the mimetype list:

https://www.iana.org/assignments/media-types/text/xml
https://www.iana.org/assignments/media-types/application/xml

(*/rtf is registered under both too, and likely some others)

@hairyhenderson
Author

@patcon That's an interesting point, though XML is a bit "special" in that it can be used to represent both formatted text and configuration/data intended to be processed by applications.

Comparing TOML to XML is a bit of an apples-to-oranges comparison. XML is a generalized Markup Language, and a file in XML format is completely devoid of semantics outside of the context of a schema. Mapping XML to a data structure requires a schema to do so unambiguously, whereas TOML can be mapped unambiguously without a schema.

I think it's probably more appropriate to compare TOML with JSON or YAML, of which only one has a registered media type: application/json. And YAML is explicitly a superset of JSON, which could be construed as a reason to treat application/yaml as the correct type for that format. Though both text/yaml and application/yaml appear in the wild (as well as many other variations).

@eksortso
Contributor

I'm late to this shindig, but here's my take. I've read convincing arguments for both application/toml and text/toml here, but I'm siding with application/toml because TOML is essentially a data format, and there's a lot of precedent to use application/* for data, even when humans can read it easily.

I scanned https://www.iana.org/assignments/media-types/media-types.xhtml and looked for prominent keywords.

  • Almost all occurrences, and there are lots, of xml appear in the application section. Only two occurrences show up in text.
  • ini doesn't show up anywhere, which isn't surprising.
  • yaml doesn't show up at all. That is surprising.
  • json mainly shows up in application. No occurrences in text.
  • Here's the killer argument for me. The word config appears almost entirely in application. One instance in audio. None in text. None at all.

The registered types certainly don't force usage, and they don't necessarily reflect what's used in the wild, but they do give a clear indication of what is intended by the content type.

Maybe the dual approach suggested by @patcon would be best. But based on current usage and on previous types, application/toml ought to come first. Via #574 of course.

@hairyhenderson
Author

That's a good summary @eksortso, thanks for digging into it.

  • yaml doesn't show up at all. That is surprising.

Yeah - I don't know why... FWIW I just filed yaml/yaml-spec#49 to suggest it 😉

@DeadWisdom

I'm gonna shut up cause I'm clearly bikeshedding, then. Sorry about that.

But I will leave this conversation on XML's mimetype:
https://mailarchive.ietf.org/arch/msg/xml-mime/jGvJ-bYob0oqV8W9SYjYF5vCy5o/

I am not sure how to help on #574 -- But I will try!

@IS4Code

IS4Code commented Aug 6, 2023

One argument against using text/ for the current version of TOML here is that text files commonly undergo line ending and encoding conversions when transmitted, and the default encoding is ASCII. Hence, TOML being restricted to UTF-8, the actual MIME type would be text/toml;charset=utf-8. Additionally, when transmitted to Mac using CR-style line endings, the file would not be parseable, since it does not treat a sole CR as a newline (could be fixed by defining it in terms of lines). Multiline strings would also change their value depending on the target line ending sequence.

It should be noted however that there are configuration formats in text/, contra @eksortso's observation, and relatively recent ones too ‒ text/turtle for example, being a general RDF graph storage format (see its encoding consideration for inspiration), could be used for configuration too, so TOML would still be in good company there (alongside text/n3, text/csv, text/dns, text/tab-separated-values, text/shaclc, text/vcard and so on).

I support using both text/ (for readability) and application/ (for safety of processing). I could even imagine myself using TOML for structured information presented only to humans (as opposed to human-unfriendly JSON), and there are far less human-readable formats in text/ anyway.

@arp242
Contributor

arp242 commented Aug 6, 2023

when transmitted to Mac using CR-style line endings, the file would not be parseable, since it does not treat a sole CR as a newline

Only the old "Classic MacOS" used CR as line endings, as did some other old systems like the Commodore 64, but none of that is really relevant any more. Current OS X/macOS uses LF (\n), like other Unix and Unix-y systems.

10 participants