Document the output file format
tpwo committed Jul 10, 2024
1 parent 3443e65 commit d5f8a4b
Showing 2 changed files with 9 additions and 1 deletion.
8 changes: 8 additions & 0 deletions README.md
@@ -58,6 +58,14 @@ $ python -m event_scrapper_srt --output-path output.json
2024-07-10 21:44:22,058 - INFO - Saved 16 events to `/home/tpwo/ws/event-scrapper-srt/output.json`
```

### Output file structure

The output file is in [Newline Delimited JSON](https://github.com/ndjson/ndjson-spec) format, which means that every line is a separate JSON object. Each line has the following structure:

```json
{"title": "Lindy Hop dla początkujacych | intensywne warsztaty", "description": "<p>Daj się zarazić swingowym bakcylem...<snipped>", "place_name": "Studio Swing Revolution Trójmiasto", "place_address": "Łąkowa 35/38, Gdańsk", "online_locations": ["https://swingrevolution.pl/warsztaty-lindy-hop-od-podstaw/"], "start_datetime": 1722074400, "end_datetime": 1722085200, "multidate": 1, "tags": ["swing"], "image_url": "https://swingrevolution.pl/wp-content/uploads/2022/04/351150267_646835474155254_2037209978322475013_n.jpg"}
```
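
A minimal sketch of reading such a file in Python. It assumes that `start_datetime` and `end_datetime` are Unix timestamps in seconds (the sample values suggest this, but it is not stated here). `read_events`, `parse_events` and `to_datetime` are hypothetical helper names, not part of `event_scrapper_srt`:

```python
import json
from datetime import datetime, timezone


def parse_events(lines):
    """Parse NDJSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def read_events(path):
    """Read all events from an NDJSON file such as `output.json`."""
    with open(path, encoding='utf-8') as f:
        return parse_events(f)


def to_datetime(timestamp):
    """Convert a timestamp to an aware UTC datetime.

    Assumption: the timestamp is in seconds since the Unix epoch.
    """
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Shortened stand-in for a single line of the output file.
sample_lines = [
    '{"title": "Lindy Hop", "start_datetime": 1722074400, '
    '"end_datetime": 1722085200, "tags": ["swing"]}\n',
]

events = parse_events(sample_lines)
start = to_datetime(events[0]['start_datetime'])
end = to_datetime(events[0]['end_datetime'])
duration = end - start
```

For a real file, `read_events('output.json')` returns the same kind of list of dictionaries, one per line.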

## Development

### Run unit tests and static checks
2 changes: 1 addition & 1 deletion event_scrapper_srt/main.py
@@ -26,7 +26,7 @@ def main(argv: list[str] | None = None) -> int:
     parser.add_argument(
         '--output-path',
         default=f'output/{datetime.now().isoformat()}.json',
-        help='JSON with scrapped events is saved there (default: %(default)s)',
+        help='NDJSON with scrapped events is saved there (default: %(default)s)',
     )
     args = parser.parse_args(argv)
