Data Format

Version: 0.1, Sept. 27, 2012

This document details a data format for the publication of transactional expenditure data. Data is published in a set of CSV files. See below for a description of the purpose and status of each file.

This specification is strongly inspired by the General Transit Feed Specification which is widely used for the dissemination of public transit information.

Feed Files

This specification defines the following files along with their associated content:

| Name | Required | Purpose |
| --- | --- | --- |
| transactions.txt | Required | Core information about transactions in the dataset (fact table). |
| suppliers.txt | Required | Supplier (beneficiary) information to identify the recipient of funds. |
| entities.txt | Required | Data regarding departments or agencies involved in the transactions. |
| projects.txt | Optional | Project-centric information. |
| programmes.txt | Optional | Descriptions of government programmes authorizing transactions. |
| accounts.txt | Optional | Account specification. Represents a chart of accounts for the publisher. |
| functions.txt | Optional | Taxonomy codesheet for functional classifications of expenditure. |
| economic.txt | Optional | Taxonomy codesheet for economic classifications of expenditure. |
| sectors.txt | Optional | Taxonomy codesheet for sector classifications of expenditure. |
| To be discussed / specified | | |
| metadata.json | Required | Dublin Core metadata in a simple representation. |
| indicators.json | Optional | Indicator measurements related to a particular transaction or programme. |
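A consumer might start by checking that a feed archive contains the required files. The following is a minimal sketch, not part of the specification: the helper name `missing_required_files` is hypothetical, and it assumes the files sit at the top level of the zip archive, which the specification does not state.

```python
import io
import zipfile

# Required CSV files from the table above. metadata.json is left out here
# because it is still listed as "to be discussed / specified".
REQUIRED_FILES = {"transactions.txt", "suppliers.txt", "entities.txt"}


def missing_required_files(archive):
    """Return the required file names that are absent from the archive.

    Assumes the files sit at the top level of the zip (an assumption; the
    specification does not state the archive layout).
    """
    return REQUIRED_FILES - set(archive.namelist())


# Usage: build a small in-memory feed that lacks suppliers.txt.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as feed:
    feed.writestr("transactions.txt", "id,amount_executed\n")
    feed.writestr("entities.txt", "id,name\n")

with zipfile.ZipFile(buffer) as feed:
    missing = missing_required_files(feed)

print(missing)
```

Returning the set of missing names, rather than a boolean, lets a publisher report every gap in one pass.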

File Requirements

The file requirements for GTFS files apply to this specification:

  • All files in a feed must be saved as comma-delimited text.
  • The first line of each file must contain field names. Each subsection of the File specifications section corresponds to one of the files in a feed and lists the field names you may use in that file.
  • All field names are case-sensitive.
  • Field values may not contain tabs, carriage returns or new lines.
  • Field values that contain quotation marks or commas must be enclosed within quotation marks. In addition, each quotation mark in the field value must be preceded by another quotation mark. This is consistent with the way Microsoft Excel outputs comma-delimited (CSV) files. For more information on the CSV file format, see http://tools.ietf.org/html/rfc4180.
  • The following example demonstrates how a field value would appear in a comma-delimited file:
    • Original field value: Contains "quotes", commas and text
    • Field value in CSV file: "Contains ""quotes"", commas and text"
  • Field values must not contain HTML tags, comments or escape sequences.
  • Remove any extra spaces between fields or field names. Many parsers consider the spaces to be part of the value, which may cause errors.
  • Each line must end with a CRLF or LF linebreak character.
  • Files should be encoded in UTF-8 to support all Unicode characters. Files that include the Unicode byte-order mark (BOM) character are acceptable. Please see the Unicode FAQ for more information on the BOM character and UTF-8.
  • Zip the files in your feed.
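The quoting and encoding rules above match the default behaviour of Python's standard `csv` module. The sketch below is illustrative only: the function names `encode_row` and `decode_rows` are hypothetical, and rejecting tabs and line breaks with a `ValueError` is one possible way to enforce the rule, not something the specification prescribes.

```python
import csv
import io


def encode_row(values):
    """Serialize one row, doubling embedded quotation marks as required."""
    for value in values:
        # Tabs, carriage returns and new lines are not allowed in field values.
        if any(char in value for char in "\t\r\n"):
            raise ValueError(f"forbidden whitespace in field value: {value!r}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue()


def decode_rows(raw):
    """Parse raw file bytes into rows, tolerating a leading UTF-8 BOM."""
    text = raw.decode("utf-8-sig")  # "utf-8-sig" strips the BOM if present
    return list(csv.reader(io.StringIO(text, newline="")))


line = encode_row(["1", 'Contains "quotes", commas and text'])
rows = decode_rows(b'\xef\xbb\xbfid,name\r\n1,"a ""b"", c"\r\n')
```

Here `line` holds the field value from the example above in its quoted form, and `decode_rows` accepts both CRLF and LF line endings.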

Data Types

Each field in the files has a specified type. The following types are recognized:

| Name | Example | Description |
| --- | --- | --- |
| string | "Some text" | A text string, UTF-8. |
| date | 2012-09-27 | A date string in ISO 8601 format. This can optionally be given as a date/time combination, expanding to the full format given in the ISO specification. |
| money | 102.44 | A numeric financial value, without any specification of currency. Money should always be stated in single units, not thousands or millions. Implementors are advised to use appropriate internal data structures to represent financial values (e.g. decimal). The number is given with a dot as the decimal separator and no thousands separators. |
| identifier | 34 | Identifiers are integers used to refer between data files. They should be used in the sense of auto-incrementing primary keys. |
| url | http://spendingdata.org/resource#section | A fully qualified URL for a resource available on the web (i.e. via HTTP or HTTPS). |
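As an illustration of how these types could be read, the sketch below maps each one to a Python value. All function names are hypothetical. Two assumptions are worth flagging: negative money values are accepted, and only the ISO 8601 forms understood by the standard library's `fromisoformat` methods are parsed, which is narrower than the full ISO specification on older Python versions.

```python
import re
from datetime import date, datetime
from decimal import Decimal
from urllib.parse import urlparse

# Digits with an optional dot-separated fraction; no thousands separators.
_MONEY = re.compile(r"-?[0-9]+(\.[0-9]+)?")
_IDENTIFIER = re.compile(r"[0-9]+")


def parse_money(text):
    """Parse a money value into a Decimal to avoid binary float rounding."""
    if not _MONEY.fullmatch(text):
        raise ValueError(f"not a valid money value: {text!r}")
    return Decimal(text)


def parse_identifier(text):
    """Parse an identifier into an int."""
    if not _IDENTIFIER.fullmatch(text):
        raise ValueError(f"not a valid identifier: {text!r}")
    return int(text)


def parse_date(text):
    """Parse a plain date, falling back to a date/time combination."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return datetime.fromisoformat(text)


def parse_url(text):
    """Accept only fully qualified HTTP or HTTPS URLs."""
    parts = urlparse(text)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a fully qualified web URL: {text!r}")
    return text


amount = parse_money("102.44")
when = parse_date("2012-09-27")
```

Using `Decimal` follows the advice in the table to avoid floating-point types for financial values.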

Glossary

  • Transaction: the most granular level of expenditure, which identifies a single beneficiary receiving a specified amount of funds for some service, under a government programme or as an entitlement.

File specifications

transactions.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this transaction which is unique within the dataset and all similar datasets released by the same publisher. |
| code | string | Optional | Internal identifier or transaction reference used by the data publisher (can be identical to the id). |
| status | string | Optional | Specify the current status of this transaction. No enumeration is specified yet, but concise, English-language descriptions are recommended. Example: IATI Activity Status. |
| description | string | Optional | A textual description of the purpose and context of this transaction, if available. |
| financial_type | string | Optional | Specify the type of financial instrument underlying this transaction. No enumeration is specified yet, but concise, English-language descriptions are recommended. Example: IATI Finance Type. |
| invoice_number | string | Optional | Invoice number assigned by the vendor or supplier. |
| rev_exp | string | Required | Specifies whether the transaction represents government revenue or expenditure. Valid values: REVENUE, EXPENDITURE. |
| budget_line_item | string | Optional | Line item from the budget authorizing this expenditure. |
| amount_budgeted | money | Optional | Monetary amount initially budgeted for this transaction, including any taxes. |
| amount_allocated | money | Optional | Monetary amount allocated for expenditure for this transaction, including any taxes. |
| amount_executed | money | Required | Monetary amount actually disbursed under this transaction, including any taxes. |
| date_budgeted | date | Optional | Date when amount_budgeted was decided. |
| date_allocated | date | Optional | Date when the allocation was made. |
| date_completed | date | Required | Date when the transaction was completed. |
| date_reported | date | Optional | Date when the transaction was reported to the publishing body. |
| project_id | identifier | Optional | Identifier linking to the project-centric metadata in projects.txt. |
| entity_id | identifier | Required | Identifier of the government entity responsible for this transaction, as described in entities.txt. |
| purchaser_id | identifier | Optional | Identifier of the government entity acting as a purchaser for this transaction, if different from the institution controlling the project. Described in entities.txt. |
| supplier_id | identifier | Required | Identifier of the supplier, as described in suppliers.txt. |
| programme_id | identifier | Optional | Identifier of the underlying government programme, as described in programmes.txt. |
| account_id | identifier | Optional | Identifier of the account in which this transaction was registered, as described in accounts.txt. |
| economic_id | identifier | Optional | Identifier of the economic classification of this transaction, as described in economic.txt. |
| function_id | identifier | Optional | Identifier of the functional classification of this transaction, as described in functions.txt. |
| sector_id | identifier | Optional | Identifier of the sector of this transaction, as described in sectors.txt. |
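The required fields and the rev_exp enumeration lend themselves to a simple row-level check. The following sketch is one possible approach under stated assumptions: the function name `transaction_errors` and the error message wording are hypothetical, and a field consisting only of whitespace is treated as missing.

```python
import csv
import io

# Fields marked "Required" in the transactions.txt table.
REQUIRED_TRANSACTION_FIELDS = (
    "id", "rev_exp", "amount_executed",
    "date_completed", "entity_id", "supplier_id",
)
VALID_REV_EXP = {"REVENUE", "EXPENDITURE"}


def transaction_errors(row):
    """Return a list of problems found in one transactions.txt row."""
    errors = []
    for field in REQUIRED_TRANSACTION_FIELDS:
        # A missing column and an empty value are treated the same way.
        if not (row.get(field) or "").strip():
            errors.append(f"missing required field: {field}")
    rev_exp = row.get("rev_exp") or ""
    if rev_exp and rev_exp not in VALID_REV_EXP:
        errors.append(f"invalid rev_exp: {rev_exp!r}")
    return errors


# Usage: the second row lacks date_completed and has an unknown rev_exp.
sample = (
    "id,rev_exp,amount_executed,date_completed,entity_id,supplier_id\n"
    "1,EXPENDITURE,102.44,2012-09-27,7,12\n"
    "2,SPENDING,50.00,,7,12\n"
)
report = {
    row["id"]: transaction_errors(row)
    for row in csv.DictReader(io.StringIO(sample))
}
```

Collecting every error per row, instead of stopping at the first, gives publishers a complete list to fix before republishing.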

suppliers.txt

Suppliers are usually assumed to be privately incorporated companies, sole traders or not-for-profit institutions. In some cases, transactions may detail intra-governmental exchanges of funds, in which case a synthetic supplier definition pointing at the actual government entity should be used.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this supplier which is unique within the dataset and all similar datasets released by the same publisher. |
| entity_id | identifier | Optional | If the transaction describes intra-governmental transfers, the receiving department will be identified using this identifier, while the remainder of the supplier record will be disregarded. |
| name | string | Required | Full legal name of the supplier. |
| acronym | string | Optional | Acronym commonly used by the supplier. |
| legal_form | string | Optional | Legal form of the supplier, i.e. the company type. |
| code | string | Optional | Company code used within the purchasing government entity, e.g. vendor identification. |
| tax_identification | string | Optional | (Value-added) tax identification number assigned to the supplier. |
| opencorporates_uri | url | Optional | Identifier used on the OpenCorporates identity resolution service. |
| duns_number | string | Optional | Dun & Bradstreet D-U-N-S number used for the supplier in the US. |
| street | string | Optional | Street part of the supplier's trading address. |
| post_code | string | Optional | Post code part of the supplier's trading address. |
| city | string | Optional | City of the supplier's trading address. |
| country_name | string | Optional | Country name of the supplier's trading address. |
| country_code | string | Optional | 2-letter ISO 3166-1 alpha-2 country code of the supplier's trading address. |

entities.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this government entity which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | The identifier of the superior government entity, e.g. the responsible department, ministry or other organ. This hierarchy can have any depth and should be precise. |
| name | string | Required | Full name of the government entity. |
| acronym | string | Optional | Acronym commonly used by the government entity. |
| code | string | Optional | Department/entity code used within the government for financial purposes. |
| class | string | Optional | Type of the government entity, e.g. department, unit, arms-length body. No taxonomy is specified at the moment, but one should be made available in the future. |
| ministerial_level | boolean | Optional | Flag to indicate this government entity is at ministerial level. |
| street | string | Optional | Street part of the government entity's trading address. |
| post_code | string | Optional | Post code part of the government entity's trading address. |
| city | string | Optional | City of the government entity's trading address. |
| country_name | string | Optional | Country name of the government entity's trading address. |
| country_code | string | Optional | 2-letter ISO 3166-1 alpha-2 country code of the government entity's trading address. |

projects.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this project which is unique within the dataset and all similar datasets released by the same publisher. |
| name | string | Required | Name of the project. |
| code | string | Optional | Internal code used to identify this project or activity. |
| description | string | Optional | A textual description of the purpose and context of this project. |
| start_date | date | Optional | Date when the project began or is scheduled to begin. |
| end_date | date | Optional | Date when the project ended or is scheduled to be complete. |

programmes.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this programme which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | An identifier for a wider programme, into which the specified programme is hierarchically embedded. |
| name | string | Required | Name of the programme. |
| code | string | Optional | Internal code used to identify this programme. |
| budget_line_item | string | Optional | Line item from the budget establishing this programme. |
| description | string | Optional | A textual description of the purpose and context of this programme. |

accounts.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this account which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | An identifier for a wider account, into which the specified account is hierarchically embedded. |
| name | string | Required | Name of the account. |
| code | string | Optional | Internal code used to identify this account. |
| level | number | Optional | Hierarchy level of this account. |
| description | string | Optional | A textual description of the purpose and context of this account. |

functions.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this function which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | An identifier for a wider function, into which the specified function is hierarchically embedded. |
| name | string | Required | Name of the function. |
| code | string | Optional | Internal code used to identify this function. |
| level | number | Optional | Hierarchy level of this function. |
| description | string | Optional | A textual description of this function. |

economic.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this economic type which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | An identifier for a wider economic type, into which the specified economic type is hierarchically embedded. |
| name | string | Required | Name of the economic type. |
| code | string | Optional | Internal code used to identify this economic type. |
| level | number | Optional | Hierarchy level of this economic type. |
| description | string | Optional | A textual description of this economic type. |

sectors.txt

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | identifier | Required | An identifier for this sector which is unique within the dataset and all similar datasets released by the same publisher. |
| parent_id | identifier | Optional | An identifier for a wider sector, into which the specified sector is hierarchically embedded. |
| name | string | Required | Name of the sector. |
| code | string | Optional | Internal code used to identify this sector. |
| level | number | Optional | Hierarchy level of this sector. |
| description | string | Optional | A textual description of this sector. |
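Several of the files above (entities, programmes, accounts, functions, economic types and sectors) share the same id/parent_id pattern. The sketch below shows one way a consumer might walk such a hierarchy. The function names are hypothetical, and the choices to raise an error on a cycle or on a parent_id that points at no record are assumptions, since the specification does not say how such data should be handled.

```python
def build_parent_index(rows):
    """Map each record id to its parent id (None for top-level records)."""
    return {
        int(row["id"]): int(row["parent_id"]) if row.get("parent_id") else None
        for row in rows
    }


def ancestor_ids(parents, record_id):
    """Return the ids of all ancestors, nearest parent first."""
    chain = []
    seen = {record_id}
    current = parents[record_id]
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in hierarchy at id {current}")
        if current not in parents:
            raise KeyError(f"parent_id {current} does not match any record")
        chain.append(current)
        seen.add(current)
        current = parents[current]
    return chain


# Usage with rows as csv.DictReader would yield them from sectors.txt.
rows = [
    {"id": "1", "parent_id": "", "name": "Health"},
    {"id": "2", "parent_id": "1", "name": "Hospitals"},
    {"id": "3", "parent_id": "2", "name": "Emergency care"},
]
parents = build_parent_index(rows)
path = ancestor_ids(parents, 3)
```

The sketch deliberately does not derive the level field from the chain length, because the specification does not state whether levels start at 0 or 1.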