Testing CPS tax-unit benefit #135

Open: wants to merge 6 commits into base: master

Conversation

Amy-Xu
Member

@Amy-Xu Amy-Xu commented Dec 6, 2017

This PR addresses the issues brought up in #115. Specifically, it compares summaries of the new CPS tax-unit database with those of previous versions in terms of benefit aggregates and distributions.

All three tests passed using a previous version of cps_benefit.csv on top of PR #133. I will add the distribution and tabulation files to this PR once I run the tests with the latest version of the benefit data.
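As a rough illustration of the kind of aggregate comparison described above (this is not the test code in this PR), the sketch below checks that the weighted total of one benefit column agrees between two versions of the data within a relative tolerance. The function names, the 1% tolerance, and the toy records are assumptions made for the example; only the column names `tanf_ben` and `s006` come from the data discussed in this thread.

```python
import math


def weighted_total(records, column, weight="s006"):
    """Sum column * weight over a list of dict records."""
    return sum(rec[column] * rec[weight] for rec in records)


def aggregates_match(new, old, column, rel_tol=0.01):
    """Return True if the weighted totals of `column` agree within rel_tol.

    rel_tol is a hypothetical tolerance chosen for illustration.
    """
    return math.isclose(
        weighted_total(new, column),
        weighted_total(old, column),
        rel_tol=rel_tol,
    )


# Toy records (hypothetical values, not taken from cps_benefit.csv).
old = [{"tanf_ben": 1000.0, "s006": 2.0}, {"tanf_ben": 0.0, "s006": 3.0}]
new = [{"tanf_ben": 1005.0, "s006": 2.0}, {"tanf_ben": 0.0, "s006": 3.0}]

print(aggregates_match(new, old, "tanf_ben"))                 # within 1%
print(aggregates_match(new, old, "tanf_ben", rel_tol=0.001))  # tighter bound
```

A check like this only guards the total, so it would need to be paired with a distributional check to catch shifts that leave the aggregate unchanged.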

@Amy-Xu
Member Author

Amy-Xu commented Dec 7, 2017

It seems all three tests passed locally on my laptop.
[Screenshot: 2017-12-07, 3:52:49 PM]

The tests in this script require both benefit and recipient information, but the latter is not included in the current zipped file in order to save space. My solution is to ask contributors and users to create a temporary version of the benefit data locally in order to run the checks. I describe the procedure in detail in the script's docstring. I have finished most of the parts I had in mind at this point and would love to hear any thoughts, feedback, or suggestions.

@martinholmer @hdoupe @andersonfrailey @MattHJensen

@Amy-Xu changed the title from "[WIP] Testing CPS tax-unit benefit" to "Testing CPS tax-unit benefit" on Dec 7, 2017
@andersonfrailey
Collaborator

@Amy-Xu, sorry for leaving this unchecked for so long. When you get the chance, can you check that the tests in this PR are still accurate? I'll merge once that's been done.

@martinholmer
Contributor

@Amy-Xu and @andersonfrailey, I'm not convinced that the benefit of these new tests (proposed nine months ago in taxdata pull request #135) is worth the maintenance cost of adding them to the taxdata test suite. My main reason for saying this is that these tests do not do a good job of alerting us to the presence of filing units with extremely large (and therefore, probably incorrect) imputed benefit amounts. Perhaps some tests should be added to the C-TAM repository, which seems to have no tests at all.

I'll be providing more information soon on extremely large imputed benefits in the CPS data, but for now consider this CPS record:

1 age_head 46
2 age_spouse 39
3 e00200p 865902
7 e00900s 233836
9 a_lineno 1
14 s006 300
16 h_seq 41675
17 ffpos 1
18 fips 11
21 nu18 2
23 n21 2
30 tanf_ben 136088
33 nu13 2
34 nu05 2
35 n24 2
37 f2441 2
38 EIC 2
39 XTOT 4
40 filer 1
41 FLPDYR 2014
42 MARS 2
49 e19200 30670
50 e18500 7695
53 RECID 76511
54 e18400 51322
55 e00900 233836
57 e00300 27
60 e19800 7383
61 e20100 1619
64 agi_bin 14
68 e00200 865902

I really don't understand how a husband and wife, who between them earn over one million dollars and live with their two young children, can be getting a TANF benefit of more than $136,000. People using the CPS benefit data are going to ask that question, especially after they implement a UBI-with-benefit-repeal reform and see that this filing unit experiences a large loss in income.

If the imputed TANF amount for this family is incorrect, then we need to add tests to C-TAM that detect these kinds of problems, and then fix the code (either in C-TAM or taxdata) that generated the incorrect imputation. On the other hand, if this imputed TANF benefit amount is correct, then the C-TAM repository needs to add documentation that explains why this extremely large TANF benefit for millionaires is accurate.
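To make the kind of test I have in mind concrete, here is a minimal sketch that flags filing units receiving a positive benefit while having high earned income. The function name, the use of `e00200 + e00900` as the earnings measure, and the $200,000 cap are all assumptions chosen for illustration; none of this is existing C-TAM or taxdata code. The first record uses the values from the CPS record shown above, and the second is a made-up low-income unit.

```python
def flag_implausible_benefits(records, benefit="tanf_ben",
                              earnings_cols=("e00200", "e00900"),
                              earnings_cap=200_000):
    """Return (RECID, earned income, benefit) for suspicious units.

    A unit is flagged when it has a positive benefit amount and its
    earned income exceeds earnings_cap (a hypothetical threshold).
    """
    flagged = []
    for rec in records:
        earned = sum(rec.get(col, 0) for col in earnings_cols)
        if rec.get(benefit, 0) > 0 and earned > earnings_cap:
            flagged.append((rec["RECID"], earned, rec[benefit]))
    return flagged


records = [
    # Values copied from the record quoted in this comment.
    {"RECID": 76511, "e00200": 865902, "e00900": 233836, "tanf_ben": 136088},
    # Hypothetical low-income unit that should not be flagged.
    {"RECID": 1, "e00200": 12000, "e00900": 0, "tanf_ben": 3000},
]

flagged = flag_implausible_benefits(records)
print(flagged)
```

A test built on this would assert that the flagged list is empty, or at least that it does not grow from one data version to the next.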

@MattHJensen @feenberg @MaxGhenis

@martinholmer
Contributor

@Amy-Xu, I see you wrote the C-TAM code that imputes TANF benefits. Does the $136,088 tanf_ben for this filing unit, which has more than a million dollars in earned income, come straight off one of the three raw CPS files? Or is it imputed using the C-TAM TANF imputation algorithm? If it's in the raw data, don't you think we need some kind of explanation in the C-TAM repository about why this rich family can be receiving such a large TANF benefit? If it's imputed by the algorithm, then don't you think the C-TAM algorithm needs to be revised to screen out rich families? What are your thoughts on this matter?

@MattHJensen @feenberg @andersonfrailey @MaxGhenis

4 participants