Merge remote-tracking branch 'origin/master' into netex-validation/polling-of-results
ptitfred committed Dec 10, 2024
2 parents 69c1ce0 + 9ea927e commit 566d743
Showing 17 changed files with 795 additions and 27 deletions.
13 changes: 9 additions & 4 deletions README.md
@@ -73,7 +73,7 @@ To have an up to date database schema run `mix ecto.migrate`.
### Restoring the production database

The production database does not contain any sensitive data, so you can retrieve it for development purposes.
* You can retrieve the [latest clever-cloud backup](https://console.clever-cloud.com/organisations/orga_f33ebcbc-4403-4e4c-82f5-12305e0ecb1b/addons/addon_beebaa5e-c3a4-4c57-b124-cf9d1473450a) (you need some permissions to access it, if you don't have them, you can ask someone on the team to give you the database)
* You can retrieve the [latest Clever Cloud backup](https://console.clever-cloud.com/organisations/orga_f33ebcbc-4403-4e4c-82f5-12305e0ecb1b/addons/addon_beebaa5e-c3a4-4c57-b124-cf9d1473450a) (you need some permissions to access it, if you don't have them, you can ask someone on the team to give you the database)
* On the clever-cloud website, under transport-site-postgresql, there is a Backups section with download links.
* Restore the downloaded backup on your database: `./restore_db.sh <path_to_the_backup>`

@@ -130,7 +130,7 @@ If you need to login via `demo.data.gouv.fr`, follow these steps:

```elixir
config :oauth2, Datagouvfr.Authentication,
# go to CleverCloud staging site and pick `DATAGOUVFR_CLIENT_ID`
# go to Clever Cloud staging site and pick `DATAGOUVFR_CLIENT_ID`
client_id: "TODO-REPLACE",
# same but use `DATAGOUVFR_CLIENT_SECRET`
client_secret: "TODO-REPLACE"
@@ -280,9 +280,9 @@ The following domain names are currently in use by the deployed Elixir app:
* jobs: https://workers.prochainement.transport.data.gouv.fr
* proxy: https://proxy.prochainement.transport.gouv.fr

These names are [configured via a CNAME on CleverCloud](https://www.clever-cloud.com/doc/administrate/domain-names/#using-personal-domain-names).
These names are [configured via a CNAME on Clever Cloud](https://www.clever-cloud.com/doc/administrate/domain-names/#using-personal-domain-names).

The corresponding SSL certificates are auto-generated via Let's Encrypt and CleverCloud.
The corresponding SSL certificates are auto-generated via Let's Encrypt and Clever Cloud.

# Uptime monitoring (updown.io)

@@ -294,6 +294,11 @@ The following URLs are currently monitored via updown.io (with email & Mattermos
* https://metabase.transport.data.gouv.fr (https://updown.io/f9rd) every 5 minutes
* https://prochainement.transport.data.gouv.fr/health-check (https://updown.io/2pvz) every 5 minutes

# Useful changelogs

* https://developers.clever-cloud.com/changelog/ for Clever Cloud components (e.g. Postgres)
* [.tool-versions](.tool-versions) for Elixir & Erlang

# Blog

The project [blog](https://blog.transport.data.gouv.fr/) code and articles are hosted in the [blog](https://github.com/etalab/transport-site/tree/blog/blog) folder of the blog branch. A specific blog branch has been created with less restrictive merge rules, to allow publishing articles directly from the CMS without needing a github code review.
495 changes: 495 additions & 0 deletions apps/shared/meta/schema-irve-statique.json

Large diffs are not rendered by default.

98 changes: 98 additions & 0 deletions apps/transport/lib/irve/data_frame.ex
@@ -0,0 +1,98 @@
defmodule Transport.IRVE.DataFrame do
@moduledoc """
Tooling supporting the parsing of an IRVE static file into `Explorer.DataFrame`
"""

@doc """
Helper function to convert TableSchema types into DataFrame ones.
There is no attempt to make this generic at this point: it focuses solely
on the static IRVE use case.
iex> Transport.IRVE.DataFrame.remap_schema_type(:geopoint)
:string
iex> Transport.IRVE.DataFrame.remap_schema_type(:number)
{:u, 16}
iex> Transport.IRVE.DataFrame.remap_schema_type(:literally_anything)
:literally_anything
"""
def remap_schema_type(input_type) do
case input_type do
:geopoint -> :string
:number -> {:u, 16}
type -> type
end
end

@doc """
Parse an in-memory binary of CSV content into a typed `Explorer.DataFrame` for IRVE use.
Current behaviour is that the embedded static IRVE schema enforces the field type, for fields
that are known.
For instance, a `string` field in the input schema will be considered as a `string` in the `DataFrame`:
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("id_pdc_itinerance\\nABC123")
#Explorer.DataFrame<
Polars[1 x 1]
id_pdc_itinerance string ["ABC123"]
>
Even if it contains something that would be considered a float (the schema type spec wins):
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("id_pdc_itinerance\\n22.0")
#Explorer.DataFrame<
Polars[1 x 1]
id_pdc_itinerance string ["22.0"]
>
An `integer` field will be mapped to an `integer` (here, signed 64-bit):
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("nbre_pdc\\n123")
#Explorer.DataFrame<
Polars[1 x 1]
nbre_pdc s64 [123]
>
A `boolean` field in the schema, similarly, will correctly result in a `boolean` `DataFrame` field:
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("reservation\\nfalse")
#Explorer.DataFrame<
Polars[1 x 1]
reservation boolean [false]
>
And dates are also handled correctly:
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("date_mise_en_service\\n2024-10-02")
#Explorer.DataFrame<
Polars[1 x 1]
date_mise_en_service date [2024-10-02]
>
Other, unknown columns are kept at this point, and their types are inferred:
iex> Transport.IRVE.DataFrame.dataframe_from_csv_body!("foo,bar\\n123,14.0")
#Explorer.DataFrame<
Polars[1 x 2]
foo s64 [123]
bar f64 [14.0]
>
Congratulations for reading this far.
"""
def dataframe_from_csv_body!(body, schema \\ Transport.IRVE.StaticIRVESchema.schema_content()) do
dtypes =
schema
|> Map.fetch!("fields")
|> Enum.map(fn %{"name" => name, "type" => type} ->
{
String.to_atom(name),
String.to_atom(type)
|> Transport.IRVE.DataFrame.remap_schema_type()
}
end)

Explorer.DataFrame.load_csv!(body, dtypes: dtypes)
end
end
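
The dtype-building step of `dataframe_from_csv_body!/2` can be tried in isolation, without Explorer. The following is a minimal sketch, assuming only that the schema is a map with a `"fields"` list of `"name"`/`"type"` entries, as the code above expects. The module name `DtypesSketch` and the four sample fields are hypothetical and chosen for illustration; they are not the full embedded schema.

```elixir
defmodule DtypesSketch do
  # Hypothetical, reduced stand-in for the embedded IRVE schema.
  # Only the "fields" key with "name"/"type" entries is assumed.
  def sample_schema do
    %{
      "fields" => [
        %{"name" => "id_pdc_itinerance", "type" => "string"},
        %{"name" => "nbre_pdc", "type" => "integer"},
        %{"name" => "puissance_nominale", "type" => "number"},
        %{"name" => "coordonneesXY", "type" => "geopoint"}
      ]
    }
  end

  # Same mapping as remap_schema_type/1 in the diff, written as clauses.
  def remap(:geopoint), do: :string
  def remap(:number), do: {:u, 16}
  def remap(type), do: type

  # Build the keyword list of {column_name, dtype} pairs.
  def dtypes(schema) do
    schema
    |> Map.fetch!("fields")
    |> Enum.map(fn %{"name" => name, "type" => type} ->
      {String.to_atom(name), type |> String.to_atom() |> remap()}
    end)
  end
end

DtypesSketch.sample_schema() |> DtypesSketch.dtypes() |> IO.inspect()
```

The result is a keyword list in schema order, which is the shape passed as the `dtypes:` option in the diff. Columns absent from the schema get no entry, which is consistent with the doctest where unknown columns have their types inferred.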
19 changes: 19 additions & 0 deletions apps/transport/lib/irve/static_irve_schema.ex
@@ -0,0 +1,19 @@
defmodule Transport.IRVE.StaticIRVESchema do
@moduledoc """
A module providing programmatic access to the static IRVE schema,
as stored in the source code.
"""

@doc """
Read & decode the content of the IRVE static schema.
NOTE: this is not cached at the moment.
"""
def schema_content do
__DIR__
|> Path.join("../../../shared/meta/schema-irve-statique.json")
|> Path.expand()
|> File.read!()
|> Jason.decode!()
end
end
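
The `@doc` notes that the schema is not cached, so each call reads and decodes the JSON file again. One possible approach, not part of this commit, is to memoize the decoded value with `:persistent_term`. The sketch below assumes a hypothetical helper module `CachedLoaderSketch` and a caller-supplied key; the loader is a plain zero-arity function so the example stays free of file and JSON dependencies.

```elixir
defmodule CachedLoaderSketch do
  # Hypothetical helper (not in the commit): returns the value stored
  # under `key`, calling `loader` and storing its result on first use.
  def fetch(key, loader) when is_function(loader, 0) do
    case :persistent_term.get(key, :not_cached) do
      :not_cached ->
        value = loader.()
        :persistent_term.put(key, value)
        value

      value ->
        value
    end
  end
end

# Usage: the loader runs only on the first call for a given key.
# In the application, the loader could be
# &Transport.IRVE.StaticIRVESchema.schema_content/0 (assumption).
schema = CachedLoaderSketch.fetch({CachedLoaderSketch, :demo}, fn -> %{"fields" => []} end)
IO.inspect(schema)
```

`:persistent_term` suits values that are written once and read often, since reads avoid copying while updates are expensive. Two processes racing on the first call may both run the loader, which is harmless here because the result is identical.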
31 changes: 21 additions & 10 deletions apps/transport/lib/jobs/new_datagouv_datasets_job.ex
@@ -15,12 +15,10 @@ defmodule Transport.Jobs.NewDatagouvDatasetsJob do
tags:
MapSet.new([
"bus",
"deplacements",
"déplacement",
"déplacements",
"horaires",
"mobilite",
"mobilité",
"temps-reel",
"temps-réel",
"transport",
"transports"
@@ -53,13 +51,13 @@ defmodule Transport.Jobs.NewDatagouvDatasetsJob do
"etalab/schema-comptage-mobilites-measure",
"etalab/schema-comptage-mobilites-site"
],
tags: MapSet.new(["cyclable", "parking", "stationnement", "velo", "vélo"]),
tags: MapSet.new(["cyclable", "parking", "stationnement", "vélo"]),
formats: MapSet.new([])
},
%{
category: "Covoiturage et ZFE",
schemas: ["etalab/schema-lieux-covoiturage", "etalab/schema-zfe"],
tags: MapSet.new(["covoiturage", "zfe"]),
tags: MapSet.new(["covoiturage", "zfe", "zfe-m", "zone à faible émission"]),
formats: MapSet.new([])
},
%{
@@ -71,8 +69,7 @@ defmodule Transport.Jobs.NewDatagouvDatasetsJob do
"borne de recharge",
"irve",
"sdirve",
"électrique",
"electrique"
"électrique"
]),
formats: MapSet.new([])
}
@@ -223,12 +220,12 @@ defmodule Transport.Jobs.NewDatagouvDatasetsJob do
defp string_matches?(nil, _rule), do: false

defp string_matches?(str, %{formats: formats, tags: tags} = _rule) when is_binary(str) do
searches = MapSet.union(formats, tags) |> MapSet.to_list()
str |> String.downcase() |> String.contains?(searches)
searches = MapSet.union(formats, tags) |> MapSet.to_list() |> Enum.map(&normalize/1)
str |> normalize() |> String.contains?(searches)
end

defp tags_is_relevant?(%{"tags" => tags} = _dataset, rule) do
tags |> Enum.map(&string_matches?(String.downcase(&1), rule)) |> Enum.any?()
tags |> Enum.map(&string_matches?(&1, rule)) |> Enum.any?()
end

defp resource_is_relevant?(%{} = resource, rule) do
@@ -250,4 +247,18 @@ defmodule Transport.Jobs.NewDatagouvDatasetsJob do
end

defp resource_schema_is_relevant?(%{}, _rule), do: false

@doc """
Clean up a string, lowercase it and replace accented letters with ASCII letters.
iex> normalize("Paris")
"paris"
iex> normalize("vélo")
"velo"
iex> normalize("Châteauroux")
"chateauroux"
"""
def normalize(value) do
value |> String.normalize(:nfd) |> String.replace(~r/[^A-Za-z]/u, "") |> String.downcase()
end
end
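
The matching change above relies on normalizing both the search terms and the candidate string before calling `String.contains?/2`. Here is a minimal, self-contained sketch of that idea. The module name `NormalizeSketch` and the `matches?/2` helper are hypothetical. The character class is written as `[^A-Za-z]`, on the assumption that only ASCII letters are meant to survive, because the range `A-z` also covers the six ASCII punctuation characters that sit between `Z` and `a`.

```elixir
defmodule NormalizeSketch do
  # NFD decomposition splits "é" into "e" plus a combining accent;
  # the regex then drops everything that is not an ASCII letter
  # (accents, digits, spaces, hyphens), and the rest is lowercased.
  def normalize(value) do
    value
    |> String.normalize(:nfd)
    |> String.replace(~r/[^A-Za-z]/u, "")
    |> String.downcase()
  end

  # True when any normalized search term occurs in the normalized string.
  def matches?(str, searches) when is_binary(str) and is_list(searches) do
    String.contains?(normalize(str), Enum.map(searches, &normalize/1))
  end
end

IO.inspect(NormalizeSketch.normalize("Châteauroux"))
IO.inspect(NormalizeSketch.matches?("Vélos en libre-service", ["vélo", "zfe"]))
```

Because spaces and hyphens are stripped as well, a multi-word tag such as `"borne de recharge"` is compared as a single run of letters. This is why accent-free duplicates like `"velo"` or `"temps-reel"` are no longer needed in the tag sets.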
8 changes: 4 additions & 4 deletions apps/transport/lib/mix/tasks/logs.ex
@@ -1,8 +1,8 @@
defmodule Mix.Tasks.Clever.Logs do
@shortdoc "Fetches logs from CleverCloud"
@shortdoc "Fetches logs from Clever Cloud"

@moduledoc """
The CleverCloud logs command currently has strong limitations, including a maximum of
The Clever Cloud logs command currently has strong limitations, including a maximum of
1000 lines of logs per command invocation (https://github.com/CleverCloud/clever-tools/issues/429)
and a lack of auto-pagination.
@@ -14,10 +14,10 @@ defmodule Mix.Tasks.Clever.Logs do
mix clever.logs --since "2021-01-25T04:00:00Z" --before "2021-01-25T04:10:00Z" --alias "the-app"
```
The switches mimic the CleverCloud logs ones:
The switches mimic the Clever Cloud logs ones:
* `--since`: start time (ISO8601 Z). Defaults to "24 hours ago".
* `--before`: end time (ISO8601 Z). Defaults to "now".
* `--alias`: name of the CleverCloud app.
* `--alias`: name of the Clever Cloud app.
"""

require Logger
1 change: 1 addition & 0 deletions apps/transport/mix.exs
@@ -113,6 +113,7 @@ defmodule Transport.Mixfile do
{:unzip, "~> 0.8"},
{:protobuf, "~> 0.11"},
{:nimble_csv, "~> 1.2.0"},
{:explorer, "~> 0.10.0"},
{:kino, "~> 0.6", only: :dev},
# db
{:ecto, "~> 3.12"},
57 changes: 57 additions & 0 deletions apps/transport/test/support/factory.ex
@@ -449,4 +449,61 @@ defmodule DB.Factory do
"schema" => %{"name" => Keyword.get(opts, :schema_name), "version" => Keyword.get(opts, :schema_version)}
}
end

defmodule IRVE do
@moduledoc """
Factory part relevant to IRVE.
"""

@doc """
Generate a row following the IRVE static schema.
See:
- https://schema.data.gouv.fr/etalab/schema-irve-statique/
"""
def generate_row do
%{
"nom_amenageur" => "Métropole de Nulle Part",
"siren_amenageur" => "123456782",
"contact_amenageur" => "[email protected]",
"nom_operateur" => "Opérateur de Charge",
"contact_operateur" => "[email protected]",
"telephone_operateur" => "0199456782",
"nom_enseigne" => "Réseau de recharge",
"id_station_itinerance" => "FRPAN99P12345678",
"id_station_local" => "station_001",
"nom_station" => "Ma Station",
"implantation_station" => "Lieu de ma station",
"adresse_station" => "26 rue des écluses, 17430 Champdolent",
"code_insee_commune" => "17085",
"coordonneesXY" => "[-0.799141,45.91914]",
"nbre_pdc" => 1,
"id_pdc_itinerance" => "FRPAN99E12345678",
"id_pdc_local" => "pdc_001",
"puissance_nominale" => 22,
"prise_type_ef" => false,
"prise_type_2" => true,
"prise_type_combo_ccs" => false,
"prise_type_chademo" => false,
"prise_type_autre" => false,
"gratuit" => false,
"paiement_acte" => true,
"paiement_cb" => true,
"paiement_autre" => true,
"tarification" => "2,50€ / 30min puis 0,025€ / minute",
"condition_acces" => "Accès libre",
"reservation" => false,
"horaires" => "24/7",
"accessibilite_pmr" => "Accessible mais non réservé PMR",
"restriction_gabarit" => "Hauteur maximale 2.30m",
"station_deux_roues" => false,
"raccordement" => "Direct",
"num_pdl" => "12345678912345",
"date_mise_en_service" => "2024-10-02",
"observations" => "Station située au niveau -1 du parking",
"date_maj" => "2024-10-17",
"cable_t2_attache" => false
}
end
end
end
65 changes: 65 additions & 0 deletions apps/transport/test/transport/irve/irve_data_frame_test.exs
@@ -0,0 +1,65 @@
defmodule Transport.IRVE.DataFrameTest do
use ExUnit.Case, async: true
doctest Transport.IRVE.DataFrame

test "schema content" do
data =
Transport.IRVE.StaticIRVESchema.schema_content()
|> Map.fetch!("fields")
|> Enum.at(0)
|> Map.take(["name", "type"])

assert data == %{"name" => "nom_amenageur", "type" => "string"}
end

test "dataframe roundtrip (encode + decode)" do
body = [DB.Factory.IRVE.generate_row()] |> CSV.encode(headers: true) |> Enum.join()
df = Transport.IRVE.DataFrame.dataframe_from_csv_body!(body)
maps = Explorer.DataFrame.to_rows(df)

assert maps == [
%{
"nom_amenageur" => "Métropole de Nulle Part",
"siren_amenageur" => "123456782",
"contact_amenageur" => "[email protected]",
"nom_operateur" => "Opérateur de Charge",
"contact_operateur" => "[email protected]",
"telephone_operateur" => "0199456782",
"nom_enseigne" => "Réseau de recharge",
"id_station_itinerance" => "FRPAN99P12345678",
"id_station_local" => "station_001",
"nom_station" => "Ma Station",
"implantation_station" => "Lieu de ma station",
"adresse_station" => "26 rue des écluses, 17430 Champdolent",
"code_insee_commune" => "17085",
"coordonneesXY" => "[-0.799141,45.91914]",
"nbre_pdc" => 1,
"id_pdc_itinerance" => "FRPAN99E12345678",
"id_pdc_local" => "pdc_001",
"puissance_nominale" => 22,
"prise_type_ef" => false,
"prise_type_2" => true,
"prise_type_combo_ccs" => false,
"prise_type_chademo" => false,
"prise_type_autre" => false,
"gratuit" => false,
"paiement_acte" => true,
"paiement_cb" => true,
"paiement_autre" => true,
"tarification" => "2,50€ / 30min puis 0,025€ / minute",
"condition_acces" => "Accès libre",
"reservation" => false,
"horaires" => "24/7",
"accessibilite_pmr" => "Accessible mais non réservé PMR",
"restriction_gabarit" => "Hauteur maximale 2.30m",
"station_deux_roues" => false,
"raccordement" => "Direct",
"num_pdl" => "12345678912345",
"date_mise_en_service" => ~D[2024-10-02],
"observations" => "Station située au niveau -1 du parking",
"date_maj" => ~D[2024-10-17],
"cable_t2_attache" => false
}
]
end
end