Introduce the new GeneanetForGramps plugin #473
Nice tool! Note that lxml might also be a dependency... At a glance, it seems that birth and death dates are ignored (marriage dates are parsed correctly).
https://github.com/bcornec/addons-source/blob/master/GeneanetForGramps/GeneanetForGramps.py#L647 I am not sure of the cause there, but if it were an issue with the localized date handler (there is an attribute for setting the locale on dates/events, e.g. in reports), then I should not get a date on the marriage event either, and I would expect at least a date in textual format as a fallback. Anyway, I suppose this can be checked quickly.
Maybe a hardcoded string (in French) for _("in") would limit issues with translation handling? Otherwise, a description will be added on created events. That's a great idea. Maybe it could go further! All records imported from Geneanet could also be marked with one custom tag added during import? Just an idea! |
Which creates a problem for packagers. Please use the markup API of the Python standard library. |
My bad! |
I added lxml as a prerequisite indeed, and also documented the fact that it only works with the new SQLite DB format of 5.1, not BSDDB. |
Will try to understand how it works and implement it. Cf: https://github.com/bcornec/GeneanetForGramps/issues/23 |
Because Gramps is distributed as an all-in-one bundle on macOS and Microsoft Windows. Every additional dependency adds to the effort to create those bundles and to the resulting download size. The python in those bundles doesn't include pip and even if it did would require a UAC to run on Windows and would violate macOS's code signing security so everything must be in the bundles. Add to that that Gramps already depends directly on libxml2's python API so lxml, while a little easier to use and a little more pythonic, is superfluous. |
…ons, which is also ok for me
Hummm, too bad people are still using proprietary OSes to run FLOSS software :-( Ok, I will have a look at what modifications are required, but I'd like to concentrate first on getting feedback from people already able to run it (so Linux users) and see how it works for them before breaking stuff to make it work on other platforms. |
Also, is requests part of your existing macOS and Windows packages? Because this time it would be very difficult for me to work without it.
|
Copied from a forum (in French!):
A Gramps user (@patlx) on Windows cannot get it working, despite 'requests' being installed. |
Yes. Gramps on Windows does not seem to interact with any Python packages installed outside of Gramps. Telling it where to look through the Windows PATH variable doesn't change anything, even with Python and its installed packages as the first directory in the PATH. Gramps in a Linux VM or in Docker may be an answer, but I would rather not add such a layer to my OS. And I'm aware that's a complex issue: |
That's because Gramps on Windows and macOS has its own python interpreter, libpython, and sys.path. This is required in order to use the Gtk GUI stack as its python interface, |
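A quick way to see this in practice is to ask the interpreter that is actually running which executable it is, what its sys.path contains, and whether a given module can be found from there. The following is only a minimal sketch using the standard library; the function name `report_environment` is made up for illustration and is not part of Gramps or of the plugin.

```python
import importlib.util
import sys


def report_environment(modules=("requests", "lxml")):
    """Describe the running interpreter and which modules it can import.

    `report_environment` is a hypothetical helper, not a Gramps API.
    """
    found = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        # on this interpreter's sys.path.
        found[name] = importlib.util.find_spec(name) is not None
    return {
        "executable": sys.executable,  # the bundled python, not the system one
        "path": list(sys.path),        # directories searched for imports
        "modules": found,
    }
```

Run from inside the bundled Gramps interpreter, the "path" entry should show why packages installed into a separate system Python are not picked up.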
Note: I just made a minor local change to how the description is added before doing more testing, and hit an issue with a person without a name: some data was added to an unrelated person and the gender was changed to unknown! Fortunately, the local change makes this easy to track down. e.g., this url where the starting url was this one. It was my testing database, but this could be an 'overwrite/collapse' issue for users who have people without a name (surname, first name) in their database. The mistake seems to occur when checking the spouse and marriage. People and events (records) with colored tags (and the black one) were already there before the import (my data); have a look at the new relations and dates... |
Is it not possible to parse html stuff via gramps' Html class? I did tests in the past. |
I suppose that the fallback for spouse surname should be improved?
@@ -1228,14 +1228,16 @@ def from_geneanet(self,purl):
         if verbosity >= 2:
             print(_("Spouse name:"), sname[s])
     except:
-        sname.append("")
+        from uuid import uuid4
+        sname.append(str(uuid4()))
     try:
         sref.append(str(spouse.xpath('a/attribute::href')[0]))
         if verbosity >= 2:
             print(_("Spouse ref:"), ROOTURL+sref[s])
     except:
-        sref.append("")
+        from uuid import uuid4
+        sname.append(str(uuid4()))
     self.spouseref.append(ROOTURL+sref[s])
     try:
maybe uuid3 will be more useful for this tool (e.g., hash on url)? |
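For reference, the uuid3 idea could look roughly like the sketch below: unlike uuid4, a name-based UUID is derived from its input, so the same URL always yields the same placeholder. The helper name `placeholder_for` and the example URL are hypothetical and only illustrate the standard-library call.

```python
import uuid


def placeholder_for(url):
    """Return a stable placeholder string derived from a URL.

    Hypothetical helper: uuid3 hashes the namespace plus the name (MD5),
    so repeated imports of the same page produce the same value.
    """
    return str(uuid.uuid3(uuid.NAMESPACE_URL, url))


# Example with a made-up URL: both calls return the same string.
first = placeholder_for("https://example.org/tree?p=jean&n=dupont")
second = placeholder_for("https://example.org/tree?p=jean&n=dupont")
```

A stable value would at least let a later run recognize a placeholder it created earlier, which a random uuid4 cannot do.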
The problem is somewhere else... |
Copied debug log section:
|
Maybe one section is missing on find_grampsf() for a common use?
|
I am not certain I fully understand everything that is expected, but I have a workaround (it needs polishing or refactoring) for the issue when data already exists (outside the current Geneanet page):
@@ -792,6 +792,9 @@ def find_grampsf(self):
and self.mother and mother and mother.gramps_id == self.mother.gid:
return(f)
#TODO: What about preexisting families not created in this run ?
+ else:
+ print(i)
+ return(f)
return(None)
def from_geneanet(self):
It seems that a proper fix could exist somewhere else (earlier in the traceback), but this workaround (a simple addition) should not break anything already checked. The only remaining issue is maybe that we need to merge the family and the marriage event (but this avoids the uncontrolled overwrite!). My specific case may be that some data is missing at the URL (Geneanet): I just wanted to add some data from a distant cousin. I am not certain that the multiple marriages (families/spouses) or the empty name is the cause of the issue. This might be a problem for Gramps users, as the tool will overwrite existing relations and families. |
Well, from my first reading I don't think there is an equivalent in it to the XPath features I'm getting with lxml.
What puzzles me then is that it seems that at least another gramplet is using lxml: addons-source/lxml/superclasses.py Line 13 in 18ec9ad
So if that's the case, why would importing lxml on my side be a problem in fact ? |
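For what it is worth, part of the `a/attribute::href` lookup quoted earlier in the thread can be approximated with the standard library alone, without XPath. This is only a sketch under that assumption, not the plugin's code: `HrefCollector` and `collect_hrefs` are hypothetical names, and the parser collects every anchor in the fragment it is fed rather than anchors relative to one selected element.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag seen (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(markup):
    """Return the hrefs of all anchors in an HTML fragment, in document order."""
    parser = HrefCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs
```

Selecting the right fragment first (for example, one spouse entry) would still have to be done by hand, which is the part XPath makes easy.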
I think that's probably the origin of the issue you're seeing, as I didn't have such cases when I ran my tests. |
You didn't dig deeply enough. That's an experimental gramplet @romjerome wrote a few years ago. It's not distributed: Note its absence from the listing. |
Well, it was an experimental set of gramplets which aimed to explore lxml features with the Gramps XML file. The superclass module was another experiment... |
Yes, my bad! |
There is maybe a more global issue! The Geneanet/Geneweb model is family-centric. I just ran the "check and repair" tool (Tools menu -> Repair Family Tree). So, either my workaround is incomplete or it generates a new issue... Trying to get a better return value (fallback), I also played with name/id by forcing a custom/random ID related to the URL: This may not be a problem if all data is imported from/for/by Geneanet, but many Gramps users will not be able to spot this issue quickly. After a merge (overwrite) it becomes difficult to go back, so Gramps developers will have to provide help for getting safe data back... ps: Geneanet has a private API (I do not remember the endpoint well...) |
@patlx Note: if you want to test it, there may be one additional issue on non-Linux systems around logging (and log file creation), but this should not be a problem if you do not use the debug flag/mode in your Gramps session. There are still some minor sections to check... |
@bcornec
Maybe it would be possible to add support for Windows (and macOS), but I was not able to test it. So, I just added a simple test and check around OS and platform. There are still some limitations with the 'Requests' module, as 'geneanet.org' returns a 201 status code (generating a new response...). Also, after checking some pages, the server will generate a 302 code! I use a workaround but should 'Requests' Otherwise, it works fine. |
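To illustrate what a simple OS and platform check could look like, here is a minimal sketch with the standard library only. It assumes (the thread does not confirm this) that the non-Linux trouble comes from a log file location that only exists on Linux; the function names and the log file name are hypothetical and not taken from the plugin.

```python
import os
import platform
import tempfile


def default_log_path(filename="geneanetforgramps.log"):
    """Return a log path inside the temp directory of the current OS.

    Hypothetical helper and file name; tempfile.gettempdir() resolves to an
    existing directory on Linux, Windows and macOS alike.
    """
    return os.path.join(tempfile.gettempdir(), filename)


def describe_platform():
    """Summarize the OS so callers can branch on it if they need to."""
    system = platform.system()  # e.g. 'Linux', 'Windows' or 'Darwin'
    return {
        "system": system,
        "is_linux": system == "Linux",
        "log_path": default_log_path(),
    }
```

Keeping the path logic in one place means the debug mode can stay the same on every platform.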
My local changes should be visible here. |
New plugin for Gramps to easily import Geneanet subtrees of a person.
Code pushed to addons-source and soon to addons
Wiki pages proposed as well: