Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguous spec expressions #14

Open
jpn-- opened this issue May 6, 2024 · 2 comments
Open

Ambiguous spec expressions #14

jpn-- opened this issue May 6, 2024 · 2 comments

Comments

@jpn--
Copy link
Member

jpn-- commented May 6, 2024

In the joint tour frequency and composition component, we have (for example):

util_constant_for_children_party_shopping_tour,Constant for Children Party/ Shopping Tour,@(df.purpose1==5)*(df.party1==2)+(df.purpose2==5)*(df.party2==2),coef_constant_for_children_party_shopping_tour

This expression is summarized as (bool * bool) + (bool * bool). The two parenthetical terms each neatly and correctly resolves to a binary value regardless of whether the operands are treated as literal boolean values or their (0,1) numerical equivalent. However, + operator is not so clean; if both operands are True, we could arrive at different results:

  1. Interpret as numeric, so1 + 1 = 2, or
  2. Interpret as boolean, so True + True = True.

The numexpr engine of pandas.eval will (with arguably good reason) punt on solving this, throwing a NotImplementedError. Pandas can fall back to numpy logic, which will solve the expression based on the logic (2) and get True. Sharrow converts the booleans to numbers, using logic (1).

It would be better to write expressions so they are less ambiguous, and (obviously) so they resolve the same with or without sharrow. Based on context clues from the rest of the spec, it appears the intention of these expressions is following logic (1). @dhensle can you (or whomever at RSG crafted this spec) confirm the preferred interpretation?

@i-am-sijia
Copy link
Contributor

I can answer this since I worked on the initial conversion of joint tour frequency composition model.

This is what's coded in the BayDAG CT-RAMP UEC for the joint tour frequency composition model:

image

In CT-RAMP, expressions are returned in numeric. A==B returns either numeric 1 or 0, which means this expression should return 0, 1, or 2.

So this expression should be rewritten to follow logic (1).

@dhensle do you agree? If we correct the expressions, it will cause abm3 results to change in the sharrow off mode, perhaps slightly, depending on how many 2+ joint tour households are there.

@dhensle
Copy link
Contributor

dhensle commented May 6, 2024

Yes, agreed. We should be replicating the original CT-RAMP logic/results.

cc: @aletzdy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants