From a354b247f546894c2e9784f1ffa8ecad262e29f4 Mon Sep 17 00:00:00 2001
From: Peter Desmet
Date: Mon, 28 Oct 2024 11:01:16 +0100
Subject: [PATCH 1/3] Rename data-package-csvw.md to csvw-data-package.md (#1054)

---
 .../docs/guides/{data-package-csvw.md => csvw-data-package.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename content/docs/guides/{data-package-csvw.md => csvw-data-package.md} (100%)

diff --git a/content/docs/guides/data-package-csvw.md b/content/docs/guides/csvw-data-package.md
similarity index 100%
rename from content/docs/guides/data-package-csvw.md
rename to content/docs/guides/csvw-data-package.md

From 936269650fd79015a63d3856ad0d403a65d395f6 Mon Sep 17 00:00:00 2001
From: roll
Date: Mon, 28 Oct 2024 10:02:56 +0000
Subject: [PATCH 2/3] Add a `dialect.type` property

---
 content/docs/standard/table-dialect.mdx | 119 +++++++++++++++++-------
 1 file changed, 86 insertions(+), 33 deletions(-)

diff --git a/content/docs/standard/table-dialect.mdx b/content/docs/standard/table-dialect.mdx
index fb291d8d..50ea2b19 100644
--- a/content/docs/standard/table-dialect.mdx
+++ b/content/docs/standard/table-dialect.mdx
@@ -47,44 +47,31 @@ Table Dialect supersedes [CSV Dialect](https://specs.frictionlessdata.io/csv-dia
 
 ## Descriptor
 
-Table Dialect descriptor `MUST` be a descriptor as per [Descriptor](/standard/glossary/#descriptor) definition. A list of standard properties that can be included into a descriptor is defined in the [Properties](#properties) section.
+Table Dialect descriptor `MUST` be a descriptor as per [Descriptor](/standard/glossary/#descriptor) definition. The descriptor `MAY` include a `type` property to indicate which of the optional dialect properties `SHOULD` be considered when reading the target data format. Some [properties](#properties) are generic and can be used for multiple formats, while others are specific to one format. A list of standard dialect types is defined in the [Table Dialect Types](#table-dialect-types) section. The properties that can be used with these types are defined in the [Properties](#properties) section.
+
+When the `type` property is not set, it `SHOULD` be assumed that the descriptor is a `delimited` dialect type. In this case, the [`$schema`](#dollar-schema) property `SHOULD` default to `https://datapackage.org/profiles/1.0/tabledialect.json`, to maintain backward compatibility with [CSV Dialect](https://specs.frictionlessdata.io/csv-dialect/).
 
 An example of a Table Dialect descriptor:
 
 ```json
 {
+  "type": "delimited",
   "header": false,
   "delimiter": ";",
   "quoteChar": "'"
 }
 ```
 
-## Tabular Data Formats
-
-Table Dialect can be used for different data formats, such as delimited text files, semi-structured formats and spreadsheets. Some [properties](#properties) are generic and can be used for multiple formats, while others are specific to one format.
-
-A property `MUST` be ignored if it is no applicable for an arbitrary data format. For example, SQL databases do not have a concept of a header row.
+## Table Dialect Types
 
-For the sake of simplicity, most of examples are written in the CSV data format. For example, this data file without providing any Table Dialect properties:
+Table Dialect can be used for different data formats, such as delimited text files, semi-structured formats and spreadsheets. The list of supported dialect types with associated data formats and related properties is as follows.
 
-```csv
-id,name
-1,apple
-2,organe
-```
+### `delimited`
 
-`SHOULD` output this data:
+Delimited formats are textual formats such as CSV and TSV. Their characteristics can be expressed by the following properties:
 
-```javascript
-{id: 1, name: "apple"}
-{id: 2, name: "orange"}
-```
-
-### Delimited
-
-Delimited formats is a group of textual formats such as CSV and TSV. Their charactistics can be expressed the following properties:
-
-- [$schema](#dollar-schema): `https://datapackage.org/profiles/1.0/tabledialect.json` by default
+- [type](#type): `delimited`
+- [$schema](#dollar-schema): `https://datapackage.org/profiles/2.0/tabledialect.json` by default
 - [header](#header): `true` by default
 - [headerRows](#headerRows): `1` by default
 - [headerJoin](#headerJoin): ` ` by default
@@ -102,6 +89,7 @@ An example of a well-defined Table Dialect descriptor for a CSV format:
 
 ```json
 {
+  "type": "delimited",
   "header": false,
   "commentChar": "#"
   "delimiter": ";",
@@ -112,21 +100,23 @@ An example of a well-defined Table Dialect descriptor for a CSV format:
 }
 ```
 
-### Structured
+### `structured`
 
-Structured formats is a group of structured or semi-structured formats such as JSON and YAML. Their charactistics can be expressed the following properties:
+Structured formats are structured or semi-structured formats such as JSON and YAML. Their characteristics can be expressed by the following properties:
 
-- [$schema](#dollar-schema): `https://datapackage.org/profiles/1.0/tabledialect.json` by default
+- [type](#type): `structured`
+- [$schema](#dollar-schema): `https://datapackage.org/profiles/2.0/tabledialect.json` by default
 - [header](#header): `true` by default
 - [property](#property): undefined by default
 - [itemType](#itemType): undefined by default
 - [itemKeys](#itemKeys): undefined by default
 
-### Spreadsheet
+### `spreadsheet`
 
-Spreadsheet formats is a group of sheet-based formats such as Excel or ODS. Their charactistics can be expressed the following properties:
+Spreadsheet formats are sheet-based formats such as Excel or ODS. Their characteristics can be expressed by the following properties:
 
-- [$schema](#dollar-schema): `https://datapackage.org/profiles/1.0/tabledialect.json` by default
+- [type](#type): `spreadsheet`
+- [$schema](#dollar-schema): `https://datapackage.org/profiles/2.0/tabledialect.json` by default
 - [header](#header): `true` by default
 - [headerRows](#headerRows): `1` by default
 - [headerJoin](#headerJoin): ` ` by default
@@ -135,23 +125,34 @@ Spreadsheet formats is a group of sheet-based formats such as Excel or ODS. Thei
 - [sheetNumber](#sheetNumber): `1` by default
 - [sheetName](#sheetName): undefined by default
 
-### Database
+### `database`
 
 Database formats is a group of formats accessing data from databases like SQLite. Their charactistics can be expressed the following properties:
 
-- [$schema](#dollar-schema): `https://datapackage.org/profiles/1.0/tabledialect.json` by default
-- [table](#table): undefined by default
+- [type](#type): `database`
+- [$schema](#dollar-schema): `https://datapackage.org/profiles/2.0/tabledialect.json` by default
+- [table](#table): (required)
 
 ## Properties
 
+### `type`
+
+**Dialect Types:** All
+
+A Table Dialect descriptor `MAY` contain a `type` property that `MUST` be a string with one of the following values: `delimited`, `structured`, `spreadsheet`, or `database`. The default value is `delimited`.
+
 ### `$schema` {#dollar-schema}
 
+**Dialect Types:** All
+
 A root level Table Dialect descriptor `MAY` have a `$schema` property that `MUST` be a profile as per [Profile](/standard/glossary/#profile) definition that `MUST` include all the metadata constraints required by this specification.
 
 The default value is `https://datapackage.org/profiles/1.0/tabledialect.json` and the recommended value is `https://datapackage.org/profiles/2.0/tabledialect.json`.
 
 ### `header`
 
+**Dialect Types:** [delimited](#delimited), [structured](#structured), [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `header` property that `MUST` be boolean with default value `true`. This property indicates whether the file includes a header row. If `true` the first row in the file `MUST` be interpreted as a header row, not data.
 
 For example, this data file:
@@ -165,6 +166,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "header": false
 }
 ```
@@ -180,6 +182,8 @@ Where `field1` and `field2` names are implementation-specific and used here only
 
 ### `headerRows` {#headerRows}
 
+**Dialect Types:** [delimited](#delimited), [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `headerRows` property that `MUST` be an array of positive integers starting from 1 with default value `[1]`. This property specifies the row numbers for the header. It is `RECOMMENDED` to be used for multiline-header files.
 
 For example, this data file:
@@ -195,6 +199,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "headerRows": [1, 2]
 }
 ```
@@ -208,6 +213,8 @@ With this dialect definition:
 
 ### `headerJoin` {#headerJoin}
 
+**Dialect Types:** [delimited](#delimited), [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `headerJoin` property that `MUST` be a string with default value `" "`. This property specifies how multiline-header files have to join the resulting header rows.
 
 For example, this data file:
@@ -223,6 +230,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "headerRows": [1, 2],
   "headerJoin": "-"
 }
@@ -237,6 +245,8 @@ With this dialect definition:
 
 ### `commentRows` {#commentRows}
 
+**Dialect Types:** [delimited](#delimited), [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `commentRows` property that `MUST` be an array of positive integers starting from 1; undefined by default. This property specifies what rows have to be omitted from the data.
 
 For example, this data file:
@@ -252,6 +262,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "commentRows": [2]
 }
 ```
@@ -265,6 +276,8 @@ With this dialect definition:
 
 ### `commentChar` {#commentChar}
 
+**Dialect Types:** [delimited](#delimited), [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `commentChar` property that `MUST` be a string of one or more characters; undefined by default. This property specifies what rows have to be omitted from the data based on the row's first characters.
 
 For example, this data file:
@@ -280,6 +293,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "commentChar": "#"
 }
 ```
@@ -293,6 +307,8 @@ With this dialect definition:
 
 ### `delimiter`
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `delimiter` property that `MUST` be a string; with default value `,` (comma). This property specifies the character sequence which separates fields in the data file.
 
 For example, this data file:
@@ -307,6 +323,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "delimiter": "|"
 }
 ```
@@ -320,6 +337,8 @@ With this dialect definition:
 
 ### `lineTerminator` {#lineTerminator}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `lineTerminator` property that `MUST` be a string; with default value `\r\n`. This property specifies the character sequence which terminates rows.
 
 For example, this data file:
@@ -332,6 +351,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "lineTerminator": ";"
 }
 ```
@@ -345,6 +365,8 @@ With this dialect definition:
 
 ### `quoteChar` {#quoteChar}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `quoteChar` property that `MUST` be a string of one character length with default value `"` (double quote). This property specifies a character to use for quoting in case the `delimiter` needs to be used inside a data cell.
 
 For example, this data file:
@@ -359,6 +381,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "quoteChar": "'"
 }
 ```
@@ -372,6 +395,8 @@ With this dialect definition:
 
 ### `doubleQuote` {#doubleQuote}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `doubleQuote` property that `MUST` be boolean with default value `true`. This property controls the handling of `quoteChar` inside data cells. If true, two consecutive quotes are interpreted as one.
 
 For example, this data file:
@@ -386,6 +411,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "doubleQuote": true
 }
 ```
@@ -399,6 +425,8 @@ With this dialect definition:
 
 ### `escapeChar` {#escapeChar}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `escapeChar` property that `MUST` be a string of one character length; undefined by default. This property specifies a one-character string to use for escaping, for example, `\`, mutually exclusive with `quoteChar`.
 
 For example, this data file:
@@ -413,6 +441,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "escapeChar": "|"
 }
 ```
@@ -426,6 +455,8 @@ With this dialect definition:
 
 ### `nullSequence` {#nullSequence}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `nullSequence` property that `MUST` be a string; undefined by default. This property specifies specifies the null sequence, for example, `\N`.
 
 For example, this data file:
@@ -440,6 +471,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "nullSequence": "NA"
 }
 ```
@@ -453,6 +485,8 @@ With this dialect definition:
 
 ### `skipInitialSpace` {#skipInitialSpace}
 
+**Dialect Types:** [delimited](#delimited)
+
 A Table Dialect descriptor `MAY` have the `skipInitialSpace` property that `MUST` be boolean with default value `false`. This property specifies how to interpret whitespace which immediately follows a delimiter; if `false`, it means that whitespace immediately after a delimiter is treated as part of the following field.
 
 For example, this data file:
@@ -467,6 +501,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "delimited",
   "skipInitialSpace": true
 }
 ```
@@ -480,6 +515,8 @@ With this dialect definition:
 
 ### `property`
 
+**Dialect Types:** [structured](#structured)
+
 A Table Dialect descriptor `MAY` have the `property` property that `MUST` be a string; undefined by default. This property specifies where a data array is located in the data structure.
 
 For example, this data file:
@@ -497,6 +534,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "structured",
   "property": "rows"
 }
 ```
@@ -510,6 +548,8 @@ With this dialect definition:
 
 ### `itemType` {#itemType}
 
+**Dialect Types:** [structured](#structured)
+
 A Table Dialect descriptor `MAY` have the `itemType` property that `MUST` be a string with value `array` or `object`; undefined by default. This property specifies whether the data `property` contains an array of arrays or an array of objects.
 
 For example, this data file:
@@ -526,6 +566,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "structured",
   "itemType": "array"
 }
 ```
@@ -539,6 +580,8 @@ With this dialect definition:
 
 ### `itemKeys` {#itemKeys}
 
+**Dialect Types:** [structured](#structured)
+
 A Table Dialect descriptor `MAY` have the `itemKeys` property that `MUST` be array of strings; undefined by default. This property specifies the way of extracting rows from data arrays with `itemType` is `object`.
 
 For example, this data file:
@@ -554,6 +597,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "structured",
   "itemKeys": ["id", "name"]
 }
 ```
@@ -567,6 +611,8 @@ With this dialect definition:
 
 ### `sheetNumber` {#sheetNumber}
 
+**Dialect Types:** [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `sheetNumber` property that `MUST` be an integer with default value `1`. This property specifies a sheet number of a table in the spreadsheet file.
 
 For example, this data file:
@@ -580,6 +626,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "spreadsheet",
   "sheetNumber": 2
 }
 ```
@@ -588,6 +635,8 @@ With this dialect definition:
 
 ### `sheetName` {#sheetName}
 
+**Dialect Types:** [spreadsheet](#spreadsheet)
+
 A Table Dialect descriptor `MAY` have the `sheetName` property that `MUST` be a string; undefined by default. This property specifies a sheet name of a table in the spreadsheet file.
 
 For example, this data file:
@@ -601,6 +650,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "spreadsheet",
   "sheetName": "Sheet 2"
 }
 ```
@@ -609,7 +659,9 @@ With this dialect definition:
 
 ### `table`
 
-A Table Dialect descriptor `MAY` have the `table` property that `MUST` be a string; undefined by default. This property specifies a name of the table in the database.
+**Dialect Types:** [database](#database)
+
+A Table Dialect of type `database` `MUST` have a `table` property of type string. This property specifies a name of the table in the database.
 
 For example, the database with the tables below:
 
@@ -622,6 +674,7 @@ With this dialect definition:
 
 ```json
 {
+  "type": "database",
   "table": "table2"
 }
 ```

From 7dc87c050cac8a48f3176f16cda08f1237a39fd5 Mon Sep 17 00:00:00 2001
From: roll
Date: Mon, 28 Oct 2024 10:09:09 +0000
Subject: [PATCH 3/3] Fixed linting

---
 .prettierignore | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.prettierignore b/.prettierignore
index 00aecf6d..45131ed9 100644
--- a/.prettierignore
+++ b/.prettierignore
@@ -1,2 +1,2 @@
-content/docs/guides/data-package-csvw.md
-public/profiles
\ No newline at end of file
+content/docs/guides/csvw-data-package.md
+public/profiles
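
The defaulting and applicability rules that PATCH 2/3 adds to the spec can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `APPLICABLE`, `REQUIRED`, `effective_type`, `not_considered`, and `missing_required` are hypothetical, and the property sets are transcribed from the **Dialect Types** annotations in the diff above. Reporting a property as "not considered" is one possible reading of the `SHOULD` wording; the patch does not say that such properties are errors.

```python
# Hypothetical helper names; property sets transcribed from the
# "Dialect Types" annotations added in PATCH 2/3.
GENERIC = {"type", "$schema"}  # annotated "Dialect Types: All"

HEADER_AND_COMMENTS = {
    "header", "headerRows", "headerJoin", "commentRows", "commentChar",
}

APPLICABLE = {
    "delimited": HEADER_AND_COMMENTS | {
        "delimiter", "lineTerminator", "quoteChar", "doubleQuote",
        "escapeChar", "nullSequence", "skipInitialSpace",
    },
    "structured": {"header", "property", "itemType", "itemKeys"},
    "spreadsheet": HEADER_AND_COMMENTS | {"sheetNumber", "sheetName"},
    "database": {"table"},
}

# The patch marks `table` as required for the `database` type.
REQUIRED = {"database": {"table"}}


def effective_type(dialect):
    """Return the dialect type, assuming `delimited` when `type` is absent."""
    kind = dialect.get("type", "delimited")
    if kind not in APPLICABLE:
        raise ValueError(f"unknown dialect type: {kind!r}")
    return kind


def not_considered(dialect):
    """List properties present in the descriptor but not annotated for its type."""
    allowed = APPLICABLE[effective_type(dialect)] | GENERIC
    return sorted(name for name in dialect if name not in allowed)


def missing_required(dialect):
    """List required properties that are absent from the descriptor."""
    required = REQUIRED.get(effective_type(dialect), set())
    return sorted(required - set(dialect))


if __name__ == "__main__":
    print(effective_type({"delimiter": ";"}))
    print(not_considered({"type": "structured", "property": "rows", "delimiter": "|"}))
    print(missing_required({"type": "database"}))
```

Keeping the applicability table as plain data mirrors how the patch annotates each property, so a new dialect type would only need a new entry in `APPLICABLE` (and in `REQUIRED` if it has mandatory properties).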