Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicator doesn't support entries with price and inferred amount #98

Open
henrikssn opened this issue Jan 17, 2022 · 7 comments · May be fixed by #99
Open

Deduplicator doesn't support entries with price and inferred amount #98

henrikssn opened this issue Jan 17, 2022 · 7 comments · May be fixed by #99

Comments

@henrikssn
Copy link

henrikssn commented Jan 17, 2022

I have this in my ledger:

2016-09-25 * "Hooli Vest Event"
  Assets:US:Schwab:HOOL               1 HOOL {786.9 USD}
  Income:Salary:Hooli:GSU     CHF @ 0.968892 USD
  Assets:US:Hooli:Unvested:C123456  -1 HOOL.UNVEST
  Expenses:Hooli:Vested              1 HOOL.UNVEST

Exception:

  Traceback (most recent call last):
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/__init__.py", line 86, in _extract
      entries = extract.extract_from_file(importer, filename, existing_entries)
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/extract.py", line 44, in extract_from_file
      entries = importer.deduplicate(entries, existing_entries)
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/importer.py", line 167, in deduplicate
      return extract.mark_duplicate_entries(entries, existing, window, self.cmp)
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/extract.py", line 125, in mark_duplicate_entries
      if compare(entry, target):
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/importer.py", line 145, in cmp
      return compare(a, b)
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/similar.py", line 101, in __call__
      amounts1 = self.cache[id(entry1)] = amounts_map(entry1)
    File "/home/erik/.local/lib/python3.7/site-packages/beangulp/similar.py", line 151, in amounts_map
      amounts[key] += posting.units.number
  TypeError: unsupported operand type(s) for +=: 'decimal.Decimal' and 'type'
@henrikssn henrikssn linked a pull request Jan 17, 2022 that will close this issue
@dnicolodi
Copy link
Collaborator

Interesting, I didn't know that this is valid Beancount syntax. To clarify, the ledger snippet above is emitted by your importer or is it in the existing ledger?

@henrikssn
Copy link
Author

The snippet above is in my existing ledger (both v2 and v3 accepts it), and it is actually quite tricky to fill in the blank as there is a currency conversion involved.

@henrikssn henrikssn reopened this Jan 17, 2022
@dnicolodi
Copy link
Collaborator

dnicolodi commented Jan 17, 2022

Then I don't understand why the deduplication code sees the MISSING value: the booking code should take care of filling in the missing value. A quick test indeed confirms that using the snippet above as existing ledger in beangulp extract downloads/ -e existing.beans works just fine.

Do you invoke the deduplication code in a custom way that uses parse_file() instead than load_file() to parse the existing ledger? From the traceback you reporter, it would not seem so. I'm confused. Can you post a more exact reproducer?

@blais
Copy link
Member

blais commented Jan 17, 2022

What's the content of your accounts' inventories just before the transactions?
(use bean-doctor context)

@henrikssn
Copy link
Author

henrikssn commented Jan 18, 2022

Here is a minimal ledger which reproduces the issue:

2016-09-25 open Assets:US:Schwab:HOOL HOOL
2016-09-25 open Income:Salary:Hooli:GSU CHF
2016-09-25 open Assets:US:Hooli:Unvested:C123456 HOOL.UNVEST
2016-09-25 open Expenses:Hooli:Vested HOOL.UNVEST

2016-09-25 * "Hooli Grant C123456"
  Income:US:Hooli:Awards                -100 HOOL.UNVEST
  Assets:US:Hooli:Unvested:C123456      100 HOOL.UNVEST

reproduce_issue98.py

import dateutil.parser

from beancount.core.amount import Amount
from beancount.core.data import EMPTY_SET, new_metadata, Posting, Transaction
from beancount.core.number import Decimal, MISSING
from beancount.core.position import Cost

import beangulp
from beangulp import mimetypes
from beangulp.cache import cache
from beangulp.testing import main

class Importer(beangulp.Importer):
  """Reproduce beancount/beangulp/issues/98."""

  def __init__(self):
    pass

  def identify(self, filepath):
    if not "reproduce_issue98.csv" in filepath.lower():
        return False
    return True

  def filename(self, filepath):
    return 'reproduce_issue98.csv'

  def account(self, filepath):
    return None

  def date(self):
    return dateutil.parser.parse("2016-09-25").date()

  def extract(self, filepath, existing):
    date = dateutil.parser.parse("2016-09-25").date()
    return [
      Transaction(
      meta=new_metadata(filepath, 0),
      date=date,
      flag="*",
      payee="",
      narration=f"Hooli Vest Event",
      tags=EMPTY_SET,
      links=EMPTY_SET,
      postings=[
        Posting("Assets:US:Hooli:GOOG", Amount(Decimal("1"), "GOOG"), Cost(Decimal("786.9"), "USD", date, None), None, None, None),
        Posting("Expenses:Hooli:Vested", Amount(Decimal("1"), "GOOG.UNVEST"), None, None, None, None),
        Posting("Assets:US:Hooli:Unvested:C123456", -Amount(Decimal("1"), "GOOG.UNVEST"), None, None, None, None),
        Posting(f"Income:Salary:TY2016:Google:GSU", Amount(MISSING, "CHF"), None, Amount(Decimal("0.968892"), "USD"), None, None),
        ])
    ]


if __name__ == '__main__':
    importer = Importer()
    main(importer)

Commands to reproduce:

touch testdata/reproduce_issue98.csv
python3 importers/reproduce_issue98.py extract -e test.beancount testdata

bean-doctor output:

$ bean-doctor context test.beancount 6
** Transaction Id --------------------------------

Hash:38ae45045f6a38c872e50ee91eb452e7
Location: /config/beancount/test.beancount:6


** Balances before transaction --------------------------------

  Income:US:Hooli:Awards                                                             

  Assets:US:Hooli:Unvested:C123456                                                   


** Unbooked Transaction --------------------------------

2016-09-25 * "Hooli Grant C123456"
  Income:US:Hooli:Awards            -100 HOOL.UNVEST
  Assets:US:Hooli:Unvested:C123456   100 HOOL.UNVEST


** Transaction --------------------------------

2016-09-25 * "Hooli Grant C123456"
  Income:US:Hooli:Awards            -100 HOOL.UNVEST
  Assets:US:Hooli:Unvested:C123456   100 HOOL.UNVEST


** Residual and Tolerances --------------------------------



** Balances after transaction --------------------------------

* Income:US:Hooli:Awards                                             -100 HOOL.UNVEST

* Assets:US:Hooli:Unvested:C123456                                    100 HOOL.UNVEST

@henrikssn
Copy link
Author

Interesting, I didn't know that this is valid Beancount syntax. To clarify, the ledger snippet above is emitted by your importer or is it in the existing ledger?

To clarify, the snippet did exist in both the ledger and the import output, but it seems to be the latter which causes the issue (see above).

@dnicolodi
Copy link
Collaborator

Thank you for the reproducer. The problematic transaction containing the MISSING amount is indeed the one produced by the importer. Now things make sense. I'll thing about what's the best strategy to handle those in the deduplication code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants