Skip to content

2012.08.02 Weekly Check In

demory edited this page Aug 9, 2012 · 1 revision
  • 13:32 <demory> hi everyone
  • 13:32 <novalis_dt> Hi!
  • 13:32 <demory> mele grant_h, think we should wait for Frank?
  • 13:33 <mele> yeah, probably. looks like he's just been pushing some commits so i imagine he'll be right in
  • 13:34 <demory> ok
  • 13:35 <mele> Our big release is very soon so Frank has been working on getting all of the T-s crossed and I-s dotted
  • 13:36 -!- FrankP [1815504e@gateway/web/freenode/ip.24.21.80.78] has joined #opentripplanner
  • 13:37 <demory> alright, why don't we get started
  • 13:37 <demory> i think we have everyone -- kpw is off this week btw
  • 13:37 <demory> my update -- just updated deployer to 0.7.10 and am currently rebuilding the DC bikeshare graph -- hopefully will fix the elev-related issues kpw was having earlier. also working on a blog post about recent updates. otherwise focused on general deployer stuff and the leaflet-based ui work
  • 13:37 <novalis_dt> I've been working on optimizing and improving RAPTOR. There have been a few big wins so far: (1) reusing walking SPTs between rounds, (2) more aggressive target bounding (this reduces the number of alternate paths found but is probably worth it), some micro-optimizations in board/alight (which will even help non-raptor code), and, most significantly (architecturally), replacing the edge-based graph with a plain graph and usin
  • 13:37 <novalis_dt> g an edge-based Dijkstra (what pgRouting calls "shooting star"). Looking at the heatmap of the search space, I'm still seeing a lot of time spent in midtown, so I'm trying another variant of Tim Merrifield's heuristic to try to improve target bounding. If target bounding were strong enough, I could attempt to build some bidirectionality into the search to find some bounding path earlier.
  • 13:38 <novalis_dt> (I've also fixed some misc bugs in other areas)
  • 13:39 <FrankP> just added an example base layer to config.js that shows how to limit the extent of the map (in answer to a user question).
  • 13:39 <abyrd> novalis_dt, sounds interesting
  • 13:40 <novalis_dt> abyrd, for what it's worth, the edge-based graph uses about 30% more time than the plain graph.
  • 13:41 <abyrd> so it is both faster and conceptually simpler. sounds like a win.
  • 13:42 <novalis_dt> Yeah.
  • 13:42 <novalis_dt> The one bad thing is that it requires some changes to our state model
  • 13:43 <abyrd> I have been looking at input data for bicycle routing in Paris, including elevation data sets. Also looking at input data for the new york study and how to prep/filter it.
  • 13:43 <abyrd> Currently adapting the service ID refactor to the realtimeStoptimes branch
  • 13:43 <abyrd> novalis_dt, what must change in the state model?
  • 13:44 <novalis_dt> abyrd, we need to be able to clone/edit states outside a traversal
  • 13:44 <novalis_dt> Because when we are done traversing an edge, we need to generate new states for each of the following edges, with different costs depending on turn angles
  • 13:44 <abyrd> oh right, you mentioned that when I was in the office
  • 13:46 <novalis_dt> Yeah.
  • 13:46 <novalis_dt> So, one question that all of this Raptor stuff brings up is: what is a good path?
  • 13:47 <novalis_dt> Raptor, by default, finds the earliest arrival
  • 13:47 <novalis_dt> Well
  • 13:47 <novalis_dt> It finds the earliest arrival in under N boardings
  • 13:48 <novalis_dt> And along the way, it finds all pareto-optimal paths between (boardings, walk distance, arrival time)
  • 13:48 <novalis_dt> I have pruned this by assuming that the worst total time allowable is no more than 1.5x the first-found (lowest transfers) path
  • 13:49 <novalis_dt> Anyway, so I get a bunch of paths which go some distance out of the way to avoid walking
  • 13:49 <novalis_dt> Does this make any sense?
  • 13:49 <novalis_dt> What sorts of paths do people actually want to see?
  • 13:49 <demory> so, the core raptor algorithm doesn't take into account user prefs for transfers vs time, walking dist, etc?
  • 13:50 <novalis_dt> demory, it takes into account max walk distance
  • 13:50 <demory> it just generates every logical path, and then we choose from those?
  • 13:50 <demory> ok
  • 13:50 <mele> I guess we just need to define how far we're willing to go out of the way to avoid walking/how much extra time on the bus for xx distance walking
  • 13:51 <demory> but we can still have a user preference for transfers vs. travel time, and use that to inform which path(s) we return
  • 13:51 <novalis_dt> mele, by default, presently, walking time is considered 2x as bad as all other time
  • 13:51 <novalis_dt> demory, yes, we could do that, to prune the list of paths.
  • 13:51 <novalis_dt> The other question is when we cut off the searches. Search proceeds in order of number of boardings. So, first we find pure-walk trips, then one-boarding trips, and so on
  • 13:52 <novalis_dt> The number of transfers we allow is the major contributor to compute time
  • 13:52 <mele> ok great, so it's not sticking to that, or is it?
  • 13:52 <novalis_dt> Right now, it's finding all combinations of earliest arrival/short walk
  • 13:52 <novalis_dt> (within some epsilon)
  • 13:53 <demory> would it make sense to have user-specified weighted preferences (a la the bike triangle)?
  • 13:53 <demory> a transit triangle? :)
  • 13:53 <novalis_dt> It would only sort of make sense
  • 13:53 <novalis_dt> That is, we could use that to sort the paths found
  • 13:54 <novalis_dt> Also, the nontransit portion already uses the bike triangle, for what that's worth.
  • 13:54 <demory> yeah, that's true
  • 13:55 <demory> also, currently OTP will ignore the max walk if no results are found at first, right? would that be the same w/ raptor?
  • 13:56 <novalis_dt> We would have to rerun the search
  • 13:56 <novalis_dt> But I think it would tend to be fast
  • 13:56 <demory> ok
  • 13:57 <novalis_dt> The typical problem with short walk distances is that they don't get you out of your origin/destination space
  • 13:57 <novalis_dt> And that will be discovered pretty immediately by raptor.
  • 13:59 <demory> ok, so we could just double the walk radius, and if still no results, double again, etc..
  • 13:59 <novalis_dt> Yes, which is what we do now
  • 13:59 <demory> maybe cap that at 2 or 3 passes
  • 14:00 <demory> so, say we stick w/ the existing inputs -- shortest trip vs fewest transfers, and max walk/bike. (plus the bike prefs, if applicable)
  • 14:00 <novalis_dt> Well, "fewest transfers" is not a real preference
  • 14:00 <novalis_dt> We implement that with a vast transfer penalty
  • 14:01 <novalis_dt> I actually did see raptor find the insane min-transfer route that I love so much
  • 14:02 <novalis_dt> We definitely would use the transfer penalty if we decide to rank returned paths by weight. We just have to make sure not to cut off the search just because some paths have been found
  • 14:02 <novalis_dt> in the case where the found paths are all insane
  • 14:02 <novalis_dt> What defines "insane" is an interesting question, however.
  • 14:02 <FrankP> Dave, where did Raptor come from? Any links to a paper describing the algorithm?
  • 14:02 <novalis_dt> FrankP, yes, there's a paper. It's from Microsoft Research
  • 14:03 <novalis_dt> http://research.microsoft.com/apps/pubs/default.aspx?id=156567
  • 14:03 <novalis_dt> The thing that the paper does not really address is how "footpaths" (in its terminology) work.
  • 14:03 <FrankP> thanks
  • 14:04 <novalis_dt> But the reason that the algorithm works well for cases like wilsonville->sandy is that it goes in order of number of boardings rather than absolute weight. And since there is some path with relatively few transfers between those two places, it does find that path.
  • 14:06 <demory> so when you say "goes in order of # of boardings" what does that mean? does it mean it considers all 0-transfer trips first, then 1-transfer, etc.?
  • 14:06 <novalis_dt> Exactly.
  • 14:07 <demory> yeah, this actually sounds similar in some ways to the approach i took when I wrote my 1st trip planner in Atlanta (I did a presentation on that at the last workshop in Portland)
  • 14:08 <demory> so, if a user wants a "fewest transfer" trip, we can just stop after the first "round" that produces any results, right?
  • 14:08 <novalis_dt> We could
  • 14:08 <demory> and return the "best" one of those (by time)
  • 14:08 <novalis_dt> But nobody really wants that
  • 14:08 <demory> well, technically that is the fewest-transfer trip, right?
  • 14:08 <novalis_dt> The "fewest transfers" route from Grand Army Plaza to 28th & lex is:
  • 14:08 <abyrd> actually, I think people often do really want least transfers as long as the trip is not totally insane
  • 14:08 <novalis_dt> Wait until 1:00am. Then the 4 train will be running local.
  • 14:09 <demory> ok. isee
  • 14:09 <novalis_dt> Whereas the reasonable path is: take the 2/3 to the 4/5 to the 6. These are all across-the-platform transfers
  • 14:09 <abyrd> you just need to apply some sanity checks to the first result you get back. if it fails then try more transfers.
  • 14:09 <novalis_dt> And another option is: take the Q to the 6
  • 14:10 <abyrd> of course one criterion is how much worse the 0-transfer trip is than the 1+ transfer trips... which you'll never know unless you find them
  • 14:10 <novalis_dt> Exactly.
  • 14:11 <abyrd> but assuming the 0-transfer trip is not insane (which could be judged almost entirely on the wait time for boarding) the user will be happy with that result
  • 14:12 <novalis_dt> Yeah.
  • 14:12 <novalis_dt> And we don't have to prune out insane waits if all of our waits turn out to be insane.
  • 14:12 <novalis_dt> Like, if you plan a trip at 1:00am almost anywhere in Portland, you're going to be waiting 4 hours until the system starts up again
  • 14:14 <novalis_dt> So, yeah, that will definitely help
  • 14:15 <demory> so, likewise, if someone wants the "fastest trip" result, are there cases where the literal fastest trip is not actually what makes sense to return?
  • 14:15 <novalis_dt> demory, when it involves six transfers
  • 14:15 <novalis_dt> To go a tiny distance
  • 14:15 <abyrd> and I really do think it's as simple as examining wait time to decide whether to proceed with more transfers
  • 14:16 <abyrd> yeah, I think you definitely want some kind of combined weight to order all the pareto-optimal options
  • 14:16 <novalis_dt> What's a "reasonable" wait time?
  • 14:16 <novalis_dt> Just to put some numbers on it: a long trip (southampton to the bronx) at 11am. 3 boardings: 1 second. 5 boardings: 3.5 seconds. Of course I expect to get both of these numbers down.
  • 14:16 <abyrd> and I suspect that trip planner users will not want to set those parameters themselves
  • 14:16 <abyrd> weighting transfers, walking, etc. properly is part of the trip planner design
  • 14:17 <novalis_dt> Yep
  • 14:17 <novalis_dt> And our current weighting system is pretty good
  • 14:17 <abyrd> people are more personally involved with the safety and energy expenditure of biking, so enjoy setting those params, but not in transit
  • 14:17 <novalis_dt> And since the way I have implemented raptor produces a traditional path, we have those weightings
  • 14:18 <abyrd> cool, then the simplest thing to do is return pareto-optimal paths in order of generalized cost, just like MOA*
  • 14:19 <novalis_dt> OK.
  • 14:19 <abyrd> with the slight difference that in RAPTOR you don't know that you've found the lowest generalized cost until you iterate all the way up to max transfers
  • 14:20 <abyrd> whereas MOA* will always spit out paths in that order
  • 14:20 <novalis_dt> Right.
  • 14:20 <novalis_dt> But probably weight can be used to bound the search
  • 14:20 <novalis_dt> Just as we now do for walk distance and time
  • 14:20 <demory> btw, how is max_transfers determined? that heavily influences the overall running time right?
  • 14:20 <novalis_dt> I think I tried a variant of that and it didn't improve speed, but I will try again because I have made a bunch of changes since
  • 14:21 <novalis_dt> demory, yes, it's huge
  • 14:21 <abyrd> but usually you can get up to 3+ transfers very quickly right? you'd only have to bail early for really awful searches.
  • 14:21 <novalis_dt> demory, almost all of the compute time happens in later rounds
  • 14:21 <demory> yeah, that makes sense
  • 14:21 <novalis_dt> abyrd, well, "very quickly" is relative, but basically, yes
  • 14:22 <abyrd> later rounds are what, >3 or something crazy like the 8 in the paper?
  • 14:22 <novalis_dt> I don't recall exactly how they define rounds
  • 14:22 <novalis_dt> For us, max rounds = max transfers + 2
  • 14:22 <abyrd> well, I can judge from your numbers above (1 vs 3.5 sec)
  • 14:23 <novalis_dt> That's +1 for the zero-transfer case and +1 because n transfers = n+1 boardings
  • 14:23 <abyrd> right
  • 14:23 <novalis_dt> My numbers are in terms of transfers
  • 14:23 <novalis_dt> so, 3 transfers -> 4 boardings -> 5 rounds
  • 14:24 <abyrd> as for what constitutes insane wait time, that's hard to define. I think it's going to be determined through experience, but I'd start with something like 1 hour.
  • 14:25 <novalis_dt> 1 hour total?
  • 14:25 <abyrd> or how about the lowest route frequency in the graph, plus a few minutes?
  • 14:25 <novalis_dt> The lowest frequency will likely be > 12 hours
  • 14:25 <novalis_dt> because routes run differently at night
  • 14:25 <novalis_dt> That late-night 4 only runs for like 4 hours total.
  • 14:26 <abyrd> I was thinking of during the day, not over service day boundaries, but still there could be a service that only runs twice a day
  • 14:26 <abyrd> so yeah that idea is out
  • 14:26 <novalis_dt> Yeah, pretty often the first/last runs of the day are special
  • 14:27 <abyrd> I'd say anything that involves waiting an hour or more for the first boarding is a but crazy
  • 14:27 <abyrd> and should only be returned if all other options involve waiting that much
  • 14:28 <novalis_dt> Yeah, that's fair.
  • 14:28 <abyrd> or paths that involve waiting more than ah hour (or whatever arbitrary value we choose) for /any one/ boarding
  • 14:29 <novalis_dt> Yeah, that's fair.
  • 14:29 <abyrd> those should all be a hint to try another round
  • 14:29 <abyrd> up to your absolute max
  • 14:30 <demory> so maybe insane wait time is defined relatively, i.e. > 1 hr more than median wait time of all trips returned?
  • 14:30 <novalis_dt> all trips returned when?
  • 14:31 <novalis_dt> Like, ever?
  • 14:32 <demory> well, for a given search
  • 14:32 <novalis_dt> Here, we're contemplating when to stop the search early, tho.
  • 14:34 <demory> so, take the case of the 2am search in Portland, when every trip will have like a 4+ hr wait
  • 14:34 <abyrd> hey, following discussion on transit-developers, OTP Deployer and Analyst are both mentioned in the draft TRB paper just posted by Aaron Antrim and Sean Barbeau
  • 14:34 <novalis_dt> Awesome.
  • 14:34 <novalis_dt> demory, ah, for a given start time
  • 14:34 <demory> normally, seeing those waits would normally tell us to keep doing more rounds, right?
  • 14:35 <demory> but in this case we're never going to do better than that
  • 14:35 <novalis_dt> Normally in A*?
  • 14:35 <novalis_dt> Or in raptor?
  • 14:35 <demory> so, i mean in raptor
  • 14:35 <novalis_dt> Yeah, we could definitely use time of day information for that
  • 14:35 <novalis_dt> That's going to be hard to benchmark, but probably reasonable in practice.
  • 14:35 <abyrd> in the case where low-transfers results have a lot of waiting, we have to do more rounds to verify that they are not insane, in the given context
  • 14:40 <demory> so, just so i understand -- we're talking about having typical waits for, say, each hour of the day and using that as a baseline for evaluating trips returned for a given search at that time?
  • 14:41 <novalis_dt> I think that sounds like a good idea
  • 14:41 <novalis_dt> Just to tell you when to keep searching after you find N paths
  • 14:42 <demory> isn't there also geographic component -- i.e. in NYC at 2am, typical waits in Manhattan (where trains still run every 30 min) are going to differ considerably compared to a more suburban area (where the next bus isn't until 6am)
  • 14:42 <novalis_dt> Yes.
  • 14:42 <novalis_dt> But if we ignore that, we will still probably get good results
  • 14:43 <demory> ok.
  • 14:44 <demory> i need to step away for an errand pretty soon
  • 14:45 <demory> but sounds like you are making good progress on this
  • 14:45 <novalis_dt> OK, let's call this a day.
  • 14:45 <demory> ok, thanks
Clone this wiki locally