Commit

rename core.modules.json_new -> core.modules.json and core.modules.xml_clean -> core.modules.xml

I think it was originally named like this because I was running it as `bleanser/core/modules/xml.py prune ...`

Running a file directly puts its directory (here src/bleanser/core/modules) at the front of sys.path, so any module in it that shares a name with a standard library module (like xml or json) shadows the real one, which results in all sorts of trouble (e.g. segfaults).

Running as `python3 -m bleanser.core.modules.xml prune ...` is much cleaner anyway and causes less trouble in other respects as well, so let's just do that and rename the core modules for ease of discovery.
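The difference between the two invocation styles can be reproduced in isolation. The sketch below is only an illustration, not code from this repository: the package name `demo_pkg` and the files `tool.py` and `json.py` are hypothetical, and it assumes a default interpreter configuration (for example, `PYTHONSAFEPATH` is not set). It builds a throwaway package containing a local `json.py`, then runs the same file once as a script and once with `-m`, reporting whether `import json` resolved to the local file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(args, cwd):
    """Run the current interpreter with `args` in `cwd`, returning stdout."""
    proc = subprocess.run(
        [sys.executable, *args],
        cwd=cwd, capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: demo_pkg/modules/{__init__,json,tool}.py
        modules = Path(tmp) / "demo_pkg" / "modules"
        modules.mkdir(parents=True)
        (modules.parent / "__init__.py").write_text("")
        (modules / "__init__.py").write_text("")
        # A local module named after a standard library one.
        (modules / "json.py").write_text("SHADOW = True\n")
        # The tool reports whether `import json` picked up the local file.
        (modules / "tool.py").write_text(
            "import json\nprint(hasattr(json, 'SHADOW'))\n"
        )

        # Direct invocation: the script's directory becomes sys.path[0].
        as_script = run_python([str(modules / "tool.py")], cwd=tmp)
        # Module invocation: the working directory becomes sys.path[0].
        as_module = run_python(["-m", "demo_pkg.modules.tool"], cwd=tmp)
        return as_script, as_module


if __name__ == "__main__":
    print(demo())
```

In the script case the local `json.py` sits in the first `sys.path` entry and wins the lookup; with `-m` only the working directory is added, so the top-level name `json` still resolves to the standard library.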
karlicoss committed Oct 14, 2023
1 parent 4388c15 commit a924b13
Showing 24 changed files with 24 additions and 28 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -71,7 +71,7 @@ if __name__ == "__main__":

 This is **always** acting on the data loaded into memory/temporary files, it is not modifying the files itself. Once it determines an input file can be pruned, it will warn you by default, and you can specify `--move` or `--remove` with the CLI (see below) to remove it.

-There are particular normalisers for different filetypes, e.g. [`json`](./src/bleanser/core/modules/json_new.py), [`xml`](./src/bleanser/core/modules/xml_clean.py), [`sqlite`](./src/bleanser/core/modules/sqlite.py) which might work if your data is especially basic, but typically this requires subclassing one of those and writing some custom code to 'cleanup' the data, so it can be properly compared/diffed.
+There are particular normalisers for different filetypes, e.g. [`json`](./src/bleanser/core/modules.json.py), [`xml`](./src/bleanser/core/modules/xml_clean.py), [`sqlite`](./src/bleanser/core/modules/sqlite.py) which might work if your data is especially basic, but typically this requires subclassing one of those and writing some custom code to 'cleanup' the data, so it can be properly compared/diffed.

 ### do_cleanup

@@ -117,7 +117,7 @@ As it can be a bit difficult to follow, generally this is doing something like.
 For example, the JSON normaliser calls a `cleanup` function before it starts processing the data. If you wanted to remove the `images` key like shown above, you could do so like:

 ```python
-from bleanser.core.modules.json_new import JsonNormaliser, delkeys, Json
+from bleanser.core.modules.json import JsonNormaliser, delkeys, Json


 class Normaliser(JsonNormaliser):
@@ -136,8 +136,8 @@ if __name__ == '__main__':
 For common formats, the helper classes handle all the tedious bits like loading/parsing data, managing the temporary files. The `Normaliser.main` calls the CLI, which looks like this:

 ```
-$ python3 -m bleanser.core.modules.json_new prune --help
-Usage: python -m bleanser.core.modules.json_new prune [OPTIONS] PATH
+$ python3 -m bleanser.core.modules.json prune --help
+Usage: python -m bleanser.core.modules.json prune [OPTIONS] PATH
 Options:
 --glob Treat the path as glob (in the glob.glob sense)
src/bleanser/core/modules/extract.py: mode changed 100755 → 100644, no content changes.
File renamed without changes.
src/bleanser/core/modules/sqlite.py: mode changed 100755 → 100644, no content changes.
4 changes: 0 additions & 4 deletions src/bleanser/core/modules/xml_clean.py → src/bleanser/core/modules/xml.py
100755 → 100644
@@ -1,8 +1,4 @@
 #!/usr/bin/env python3
-"""
-Ugh, wtf?? If I name it simply 'xml', I get all sorts of weird behaviours... presumably because it conflicts with some system modules..
-"""
-
 from lxml import etree

 from contextlib import contextmanager
2 changes: 1 addition & 1 deletion src/bleanser/modules/bumble_android.py
@@ -1,6 +1,6 @@
 #!/usr/bin/env python3
 from bleanser.core.modules.sqlite import SqliteNormaliser, Tool
-from bleanser.core.modules.json_new import delkeys
+from bleanser.core.modules.json import delkeys

 import json

2 changes: 1 addition & 1 deletion src/bleanser/modules/foursquare.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 from __future__ import annotations

-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 TARGET = object()
2 changes: 1 addition & 1 deletion src/bleanser/modules/ghexport.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json
+from bleanser.core.modules.json import JsonNormaliser, Json


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/goodreads.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.xml_clean import Normaliser as XmlNormaliser
+from bleanser.core.modules.xml import Normaliser as XmlNormaliser


 class Normaliser(XmlNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/instagram_android.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import delkeys, patch_atoms
+from bleanser.core.modules.json import delkeys, patch_atoms
 from bleanser.core.modules.sqlite import SqliteNormaliser, Tool

 import json
4 changes: 2 additions & 2 deletions src/bleanser/modules/json_new.py
@@ -1,8 +1,8 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import * # noqa: F403, F401
+from bleanser.core.modules.json import * # noqa: F403, F401

 import warnings
-warnings.warn("Module 'bleanser.modules.json_new' is deprecated. Use 'bleanser.core.modules.json_new' instead.", DeprecationWarning)
+warnings.warn("Module 'bleanser.modules.json_new' is deprecated. Use 'bleanser.core.modules.json' instead.", DeprecationWarning)


 if __name__ == '__main__':
2 changes: 1 addition & 1 deletion src/bleanser/modules/lastfm.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json
+from bleanser.core.modules.json import JsonNormaliser, Json


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/monzo.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/pinboard.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser
+from bleanser.core.modules.json import JsonNormaliser


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/pocket.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/reddit.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 from itertools import chain

-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 REDDIT_IGNORE_KEYS = {
2 changes: 1 addition & 1 deletion src/bleanser/modules/rescuetime.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser
+from bleanser.core.modules.json import JsonNormaliser


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/skype_android.py
@@ -2,7 +2,7 @@
 import json

 from bleanser.core.modules.sqlite import SqliteNormaliser, Tool
-from bleanser.core.modules.json_new import delkeys
+from bleanser.core.modules.json import delkeys


 class Normaliser(SqliteNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/smscalls.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.xml_clean import Normaliser as XmlNormaliser
+from bleanser.core.modules.xml import Normaliser as XmlNormaliser


 class Normaliser(XmlNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/spotify.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/spotifyexport.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, delkeys, Json
+from bleanser.core.modules.json import JsonNormaliser, delkeys, Json


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/stackexchange.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from bleanser.core.modules.json_new import JsonNormaliser, Json, delkeys
+from bleanser.core.modules.json import JsonNormaliser, Json, delkeys


 class Normaliser(JsonNormaliser):
2 changes: 1 addition & 1 deletion src/bleanser/modules/xml_clean.py
@@ -1,4 +1,4 @@
-from bleanser.core.modules.xml_clean import * # noqa: F401, F403
+from bleanser.core.modules.xml import * # noqa: F401, F403

 import warnings
 warnings.warn("Module 'bleanser.modules.xml_clean' is deprecated. Use 'bleanser.core.modules.xml_clean' instead.", DeprecationWarning)
2 changes: 1 addition & 1 deletion src/bleanser/tests/test_hypothesis.py
@@ -2,7 +2,7 @@

 import pytest

-from bleanser.core.modules.json_new import JsonNormaliser as Normaliser
+from bleanser.core.modules.json import JsonNormaliser as Normaliser

 from bleanser.tests.common import TESTDATA, actions, hack_attribute

