Skip to content
abyrd edited this page Apr 17, 2013 · 5 revisions

Large and Long Distance Graphs

OpenTripPlanner has been developed and tested mostly using data sets representing a single city or urban area. Many of the default settings are more suited for data sets of this size. It is certainly possible to combine several feeds and make a large regional graph, but search speed and memory use may suffer. Don't hestitate to try: we appreciate feedback on new configurations. This page will be used to collect configuration changes and other issues specific to large graphs.

To our knowledge, the largest system on which OTP is used in production is the Netherlands, where open data coverage (and real-time coverage) is almost complete: around 13 million stop times, 1969 routes, and 66000 stops. The stock OpenTripPlanner configuration is too slow on graphs of this size (mostly due to re-exploring the street network for every conceivable transfer). We have worked out a custom configuration that is quite responsive, though it sacrificies a bit of flexibility by pre-computing transfers (fixed maximum distance for transfers, no bike triangle on transfers). Try it out for some long-distance routing in the Netherlands at [http://opentripplanner.nl].

We are interested in scaling OTP up to provide international trip planning, and have even received research and development funding to experiment with a next-generation rework of the routing code with an eye on reducing resource consumption.

The Dutch Setup

OTP was largely designed, tested, and deployed in medium-sized metropolitan areas (approximately 2 million inhabitants). We did have some concern about scalability when moving to data sets for territories the size of the Netherlands or BeNeLux, with some responses taking tens of seconds or more. However, we have had reassuring results testing the set of strategies used together in the “SimplifiedPathService”. They attempt to solve “99th percentile” trip planning requests efficiently rather than being infinitely flexible as existing OTP deployments are. They assume that transfers will all be less than some maximum distance, and that while walk or bicycle speed may be user-defined, other turning or hill climbing preferences are not. The component strategies are:

  • Precalculated transfers. Rather than leaving the transit network for the street network at every stop, we create single edges for each possible transfer. This limits transfer distance, but that is not a problem for the vast majority of requests which only transfer over short distances.

  • Threaded Bidirectional Heuristic. OTP’s original Bidirectional heuristic was intended for multiobjective transit searches. It is an admissible (strictly under-estimating) A* heuristic that works backward from the destination, skipping some of the more complex boarding calculations so it is faster than the main (“forward”) search. Its value at a given vertex can be quite close to the true cost to reach the destination; in essence it captures the non-time-dependent constants of the transit network, such as transfer walking/biking times and minimum inter-stop times. It has the disadvantage of being prohibitively slow to calculate since it needs to explore the entire graph. The threaded version of the bidirectional heuristic has been modified to run at the same time as the main search, and does not need to completely explore the graph to return meaningful heuristic values. In cases where the main search is slow, the threaded heuristic search progresses farther and guides it.

  • Constrained on-street search. The heuristic’s initial on-street search is limited to an area around the endpoints determined by the maxWalk parameter. The search then continues within only the transit network and pre-calculated transfer edges, and the rest of the street network is considered inaccessible. This ensures that the main search does not waste time repeatedly exploring huge numbers of streets at intermediate transfer locations where almost everyone will just take an obvious short transfer. Excess street exploration especially becomes a problem at transfers to infrequent services (the poor temporal connectivity problem) and late at night.

  • A PathParser which defines acceptable trips in terms of a series of edge types: any number of on-street edges, followed by any number of transit legs connected by pre-computed transfer edges, followed by any number of on-street edges. All other strings of edge types in the language of paths are detected on the fly and rejected, which greatly limits the number of possibilities. This is somewhat redundant when used with a constrained on-street search, but prevents other aberrations like chaining transfer edges or alighting and re-boarding transit near the origin.

First, you will need to build your graph with the StreetlessStopLinker module:

<bean id="graphBuilderTask" class="org.opentripplanner.graph_builder.GraphBuilderTask">
   <property name="path" value="build" />
   <property name="graphBuilders">
      <list>
         <ref bean="gtfsBuilder" />
         <ref bean="osmBuilder" />
         <bean class="org.opentripplanner.graph_builder.impl.PruneFloatingIslands" />
         <bean class="org.opentripplanner.graph_builder.impl.TransitToStreetNetworkGraphBuilderImpl" />
         <bean class="org.opentripplanner.graph_builder.impl.StreetlessStopLinker">
            <property name="radius" value="400" />
         </bean>
      </list>
   </property>
</bean>

Then, it otp-api-webapp's application-context.xml, you'll want to specify the SimplifiedPathService instead of the RetryingPathServiceImpl or Raptor:

<bean id="pathService" class="org.opentripplanner.routing.impl.SimplifiedPathServiceImpl">
   <property name="timeout" value="6" />
</bean>

Remember this code is not heavily used yet, and it might take some tweaking or patches to get the results you want. Contributions welcome!

Multi-path Timeouts

Sometimes the first itinerary is easy to find, but the second or subsequent itineraries are more difficult. Perhaps they don't exist at all, and the router has to do quite a bit of work to discover that fact. This is especially true in very large graphs. It is preferable to return several itineraries to the user, but maybe all you really need is one. You can decide to tolerate smaller numbers of itineraries in exchange for faster response times by specifying timeout values in your application-context.

For example, if you are using the RetryingPathServiceImpl, you could use the following config xml:

<bean id="pathService" class="org.opentripplanner.routing.impl.RetryingPathServiceImpl">
    <property name="firstPathTimeout" value="10.0" />
    <property name="multiPathTimeout" value="1.0" />
</bean>

This will take up to 10 seconds to find the first itinerary, but once an itinerary has been found it will bail out of the search if more than 1 second has elapsed.

Here is a similar snippet for the RAPTOR path service:

<bean id="pathService" class="org.opentripplanner.routing.impl.raptor.Raptor">
    <property name="multiPathTimeout" value="1.0" />
</bean>

The MaxTransfers Query Parameter

The router constrains the number of transit vehicles boarded. By default, the number of transfers is limited to 2, so up to three vehicles are used in an itinerary. This may not be enough for a complicated intercity path, so if your graph contains origin-destination pairs that may require more than three vehicles to connect efficiently you will want to increase this value using the maxTransfers query parameter. Otherwise the router may spend a great deal of time examining every combination of routes looking for a non-existant 3-vehicle option. Note that there is a hard-coded upper limit on the number of transfers at the top of org.opentripplanner.routing.core.RoutingRequest (CLAMP_TRANSFERS=6).

Clone this wiki locally