Merge branch 'develop' into main
vst committed Feb 17, 2021
2 parents 21d663b + a59935f commit da9e575
Showing 13 changed files with 435 additions and 124 deletions.
66 changes: 57 additions & 9 deletions README.md
@@ -1,23 +1,71 @@
# REMAP DEMAIL File Classifier
# reforg - Organize Files Based on Regular Expressions

`remap-demail-classifer` is a command line application written in
Python3. It classifies fetched DEMAIL attachment files using a regex
specification.
`reforg` is a command line application written in Python(3). It reorganizes
files under given directories based on a set of regex-powered rules.

The application has no requirements other than Python 3.6 or
later.

## Installation

```
curl -o - https://raw.githubusercontent.com/telostat/reforg/main/install.sh | sudo sh -x
```

## Usage

CLI arguments are as follows:

```
./remap-demail-classifer <SPEC-FILE> <PREFIX-REGEX> <IGNORE-FILE> <DEMAIL-DIR>
$ ./reforg --help
usage: reforg [-h] --spec SPEC-FILE --root ROOT-DIR [--dry-run] [--metadata]
              [--force]
              DIR [DIR ...]

Organize files based on regular expressions

positional arguments:
  DIR               Directory to list files of to process. Note that special
                    entries in the directory and sub-directories are not
                    listed and therefore not processed.

optional arguments:
  -h, --help        show this help message and exit
  --spec SPEC-FILE  Specification file
  --root ROOT-DIR   Root directory to copy processed files to
  --dry-run         Dry run (do not copy anything)
  --metadata        Preserve metadata while copying
  --force           Force copy if target exists (overwrite existing files)

reforg -- v0.0.1.dev0
```
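
The options in the help text above could be declared with `argparse` roughly as follows. This is a minimal sketch and not the actual `reforg` source: the function name `build_parser` and the destination name `dirs` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the help text (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="reforg",
        description="Organize files based on regular expressions",
    )
    parser.add_argument("--spec", metavar="SPEC-FILE", required=True,
                        help="Specification file")
    parser.add_argument("--root", metavar="ROOT-DIR", required=True,
                        help="Root directory to copy processed files to")
    parser.add_argument("--dry-run", action="store_true",
                        help="Dry run (do not copy anything)")
    parser.add_argument("--metadata", action="store_true",
                        help="Preserve metadata while copying")
    parser.add_argument("--force", action="store_true",
                        help="Force copy if target exists (overwrite existing files)")
    # One or more source directories; "dirs" is an assumed attribute name.
    parser.add_argument("dirs", metavar="DIR", nargs="+",
                        help="Directory to list files of to process")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```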

Example:

```
./remap-demail-classifer spec.csv \
'^(?P<recdate>[0-9]{4}\-[0-9]{2}\-[0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2}Z_[A-Z0-9]{32}_[A-Z0-9]{32}_' \
tmp/ignore.dat \
/data/remap/tenants/deployment/demail/downloaded
./reforg --spec example/spec.json --root example/target/ --metadata --force --dry-run example/source/
```
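
The example above combines `--metadata`, `--force` and `--dry-run`. One way these flags might interact is sketched below. The helper `copy_file` and its return convention are hypothetical; the real tool may behave differently, for instance in what it reports during a dry run.

```python
import shutil
from pathlib import Path


def copy_file(source, target, *, metadata=False, force=False, dry_run=False):
    """Copy source to target; return True if a copy was (or would be) made.

    Hypothetical helper, not taken from the reforg source.
    """
    source, target = Path(source), Path(target)
    if target.exists() and not force:
        return False  # Existing target is kept unless --force is given.
    if dry_run:
        return True  # Report the intended copy without touching the disk.
    target.parent.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 also preserves timestamps and similar metadata;
    # shutil.copy copies the contents and permission bits only.
    copier = shutil.copy2 if metadata else shutil.copy
    copier(source, target)
    return True
```

With this shape, a dry run answers the question "would this file be copied?" by applying the same overwrite check as a real run.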

## Specification Format

See [./example/spec.json](./example/spec.json) for an example.

Note that the specification file format is JSON. YAML (or maybe even TOML)
would be a much better format. However, we want to stick to the idea of
*no external dependencies* for easier deployment. We may change that
in the future.

For convenience, you may wish to write the specification in YAML (as in
[./example/spec.yaml](./example/spec.yaml)) and then convert to JSON:

```
cd example/
yq . < spec.yaml > spec.json
```

... or pipe converted JSON directly to the command (note the `--spec -`
argument):

```
yq . < example/spec.yaml | ./reforg --spec - --root example/target/ --metadata --force --dry-run example/source/
```
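
Accepting `-` as the specification path needs only the standard library, which fits the no-dependencies goal. The following is a minimal sketch, assuming a hypothetical helper named `load_spec`; it is not taken from the `reforg` source.

```python
import json
import sys


def load_spec(path):
    """Load the JSON specification from a file, or from stdin when path is "-"."""
    if path == "-":
        return json.load(sys.stdin)
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```
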
6 empty files.
20 changes: 20 additions & 0 deletions example/spec.json
@@ -0,0 +1,20 @@
{
  "prefix": "^(?P<recdate>[0-9]{4}\\-[0-9]{2}\\-[0-9]{2})_",
  "rules": [
    {
      "regex": "A(?P<id>[0-9]+).txt$",
      "filename": "{recdate}_{id}.txt",
      "directory": "./txt/A"
    },
    {
      "regex": "(?P<code>[B-Z])(?P<id>[0-9]+).dat$",
      "filename": "{recdate}_{id}.dat",
      "directory": "./dat/{code}"
    }
  ],
  "ignore": [
    "^.*ignore.*$",
    "A012.txt",
    "Z013.dat"
  ]
}
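
How such a specification might be applied to a file name can be sketched as below. The semantics here are assumptions, not confirmed by the commit: the `prefix` pattern is concatenated with each rule's `regex`, the `ignore` patterns are tested with `re.search` against the bare file name, and the first matching rule wins. The function name `resolve` is hypothetical.

```python
import re

SPEC = {
    "prefix": r"^(?P<recdate>[0-9]{4}\-[0-9]{2}\-[0-9]{2})_",
    "rules": [
        {"regex": r"A(?P<id>[0-9]+).txt$",
         "filename": "{recdate}_{id}.txt",
         "directory": "./txt/A"},
        {"regex": r"(?P<code>[B-Z])(?P<id>[0-9]+).dat$",
         "filename": "{recdate}_{id}.dat",
         "directory": "./dat/{code}"},
    ],
    "ignore": [r"^.*ignore.*$", r"A012.txt", r"Z013.dat"],
}


def resolve(name, spec):
    """Return (directory, filename) for name, or None if ignored or unmatched."""
    # Assumption: ignore patterns take precedence over rules.
    if any(re.search(pattern, name) for pattern in spec["ignore"]):
        return None
    for rule in spec["rules"]:
        # Assumption: the shared prefix and the rule regex form one pattern,
        # so named groups from both are available to the templates.
        match = re.search(spec["prefix"] + rule["regex"], name)
        if match:
            groups = match.groupdict()
            return (rule["directory"].format(**groups),
                    rule["filename"].format(**groups))
    return None


if __name__ == "__main__":
    for candidate in ["2021-02-17_A001.txt", "2021-02-17_C42.dat", "notes.md"]:
        print(candidate, "->", resolve(candidate, SPEC))
```

Note that the dots in the example patterns are unescaped, so `.txt` matches any character followed by `txt`; writing `\.` would make the match strict.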
12 changes: 12 additions & 0 deletions example/spec.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
prefix: ^(?P<recdate>[0-9]{4}\-[0-9]{2}\-[0-9]{2})_
rules:
  - regex: A(?P<id>[0-9]+).txt$
    filename: "{recdate}_{id}.txt"
    directory: ./txt/A
  - regex: (?P<code>[B-Z])(?P<id>[0-9]+).dat$
    filename: "{recdate}_{id}.dat"
    directory: ./dat/{code}
ignore:
  - ^.*ignore.*$
  - A012.txt
  - Z013.dat
2 changes: 2 additions & 0 deletions example/target/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
*
!.gitignore
15 changes: 15 additions & 0 deletions install.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
#!/usr/bin/env sh

set -e

## Target installation directory:
REFORG_INSTALL_DIR="${REFORG_INSTALL_DIR:-"/usr/local/bin"}"

## Download the script:
curl -fsS -o - https://raw.githubusercontent.com/telostat/reforg/main/reforg > "${REFORG_INSTALL_DIR}/reforg"

## Change file permissions:
chmod +x "${REFORG_INSTALL_DIR}/reforg"

## Run:
"${REFORG_INSTALL_DIR}/reforg" --help