Goal
The primary goal is to write a reusable parser to collect data on public transport routes. It would also be nice to have an example of a resulting dataset for a particular date.
Tasks
The website http://marshrut.info/ presents data on routes and schedules of buses and trolleybuses in Yerevan. Write a reusable parser in your preferred language to grab the data from the website and pack them into a machine-readable structure, stored in a format such as JSON or XML. These two would be preferable, because the data are likely to require a hierarchical structure, for instance:
[
  ...
  {
    "vehicle": STRING, // specify if it is a bus or a trolleybus
    "number": STRING, // the route's "number"; in fact, it should be a string in case it's alphanumeric
    "interval": NUMBER, // how often the vehicle arrives
    "measure": STRING, // looks like it's all in minutes, but just in case specify for each entry
    "stops_forward": ARRAY, // make it an array of stop names (strings)
    "stops_backward": ARRAY, // make it an array of stop names (strings) in the reverse order if the route back is the same or store the specific back route stops in this array
  }
  ...
]
This is just an example of a possible structure. If you can think of something more convenient, you're most welcome to implement it.
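As an illustration only, the structure above could be modelled in Python roughly as follows. The class name `Route`, the helper `routes_to_json`, and the placeholder stop names are assumptions made for this sketch, not part of the task; the field names follow the example structure.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Route:
    vehicle: str               # "bus" or "trolleybus"
    number: str                # kept as a string in case it is alphanumeric
    interval: float            # how often the vehicle arrives
    measure: str               # unit of the interval, e.g. "minutes"
    stops_forward: List[str]   # stop names in travel order
    stops_backward: Optional[List[str]] = None  # None means "same route back"

    def __post_init__(self):
        # If no specific return route is given, assume the forward
        # route is travelled in reverse order.
        if self.stops_backward is None:
            self.stops_backward = list(reversed(self.stops_forward))


def routes_to_json(routes):
    """Serialize a list of Route objects to a JSON string."""
    # ensure_ascii=False keeps Armenian stop names readable in the output.
    return json.dumps([asdict(r) for r in routes], ensure_ascii=False, indent=2)


# Placeholder values, not real data from the website.
example = Route("bus", "5A", 10, "minutes", ["Stop A", "Stop B", "Stop C"])
print(routes_to_json([example]))
```

Keeping `number` as a string and storing `measure` per entry mirrors the comments in the example structure.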
The key idea is to make the parser as reusable and maintainable as possible. Schedules change quite often, so it would be great to be able to run the script at least daily to collect the current data.
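One possible way to support daily runs is to write each run's output to a file named after the date, so that snapshots do not overwrite each other. The sketch below assumes the parsed routes are already available as a list of dictionaries; the function names and the `routes-YYYY-MM-DD.json` naming scheme are hypothetical choices, not requirements.

```python
import json
from datetime import date
from pathlib import Path


def snapshot_path(out_dir, day):
    """Build a dated file path such as <out_dir>/routes-2020-01-31.json."""
    return Path(out_dir) / f"routes-{day.isoformat()}.json"


def save_snapshot(routes, out_dir, day=None):
    """Write the routes (a list of dicts) to a dated JSON file and return its path."""
    day = day or date.today()
    path = snapshot_path(out_dir, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(routes, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return path
```

A scheduler such as cron could then call the script once a day; the scheduling itself is outside this sketch.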
It would of course also be nice to have an example output of these data as a dataset for a particular date.
The website is in Armenian only, but in fact its structure is rather clear and simple, so if you don't know the language, it shouldn't be a problem. If you still run into language troubles that you cannot solve even with the help of Google Translate, please don't hesitate to contact us.
Context
The data presented at http://marshrut.info/ have huge potential. They could be used in very helpful web and mobile apps to build optimal routes and predict arrival times, especially if combined with some spatial data. Unfortunately, they are not published as an API, so the first step to make use of these data is to parse the HTML pages.
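To give a rough idea of what parsing the HTML pages might involve, here is a minimal sketch using only Python's standard library. The markup it expects (a `<ul class="stops">` list with one `<li>` per stop) is purely hypothetical; the real pages would need to be inspected and the tag and class names adjusted accordingly.

```python
from html.parser import HTMLParser


class StopListParser(HTMLParser):
    """Collect the text of <li> items inside a <ul class="stops"> list.

    The tag and class names are assumptions for this sketch, not the
    actual structure of the website.
    """

    def __init__(self):
        super().__init__()
        self.stops = []
        self._in_list = False
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul" and "stops" in (attrs.get("class") or "").split():
            self._in_list = True
        elif tag == "li" and self._in_list:
            self._in_item = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            # Join the text fragments and drop surrounding whitespace.
            name = "".join(self._buffer).strip()
            if name:
                self.stops.append(name)
            self._in_item = False
        elif tag == "ul" and self._in_list:
            self._in_list = False

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)


def parse_stops(html):
    """Return the list of stop names found in an HTML string."""
    parser = StopListParser()
    parser.feed(html)
    parser.close()
    return parser.stops
```

Keeping the page-specific details inside one small class means that only this part would need to change if the site's markup changes.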
Requirements
A public GitHub repository should be created to store and publish the code and possibly the data under one of the free and open licenses, such as Creative Commons or MIT. Please make the code as reusable and maintainable as possible and provide it with some instructions and requirements.
Wishes
It would be best if you also comment your code, so that even beginners can understand what it does.
Resources
http://marshrut.info/
Prepared by
The Open Data Armenia team prepared this task.