Commit

Add support for glob alike pattern matching
Also revamped README.
satazor committed Dec 4, 2024
1 parent faf57c5 commit 16d3191
Showing 4 changed files with 634 additions and 183 deletions.
312 changes: 257 additions & 55 deletions README.md
@@ -1,105 +1,307 @@
# anonymizer
Object redaction with whitelist and blacklist. Blacklist items have higher priority and will always supersede the whitelist.

## Arguments
1. `whitelist` _(Array)_: The whitelist array.
2. `blacklist` _(Array)_: The blacklist array.
3. `options` _(Object)_: An object with optional options.
Object redaction library that supports whitelisting, blacklisting and wildcard matching.

`options.replacement` _(Function)_: A function that allows customizing the replacement value (default implementation is `--REDACTED--`).
## Installation

`options.serializers` _(List[Object])_: A list of serializers to apply. Each serializer must contain two properties: `path` (path for the value to be serialized, must be a `string`) and `serializer` (function to be called on the path's value).
```bash
npm install @uphold/anonymizer
```

`options.trim` _(Boolean)_: A flag that enables trimming all redacted values, saving their keys to a `__redacted__` list (default value is `false`).
## Usage

### Example
### Basic example

```js
const { anonymizer } = require('@uphold/anonymizer');
const whitelist = ['foo.key', 'foo.depth.*', 'bar.*', 'toAnonymize.baz', 'toAnonymizeSuperString'];
const blacklist = ['foo.depth.innerBlacklist', 'toAnonymize.*'];
import { anonymizer } from '@uphold/anonymizer';

const whitelist = ['key1', 'key2.foo'];
const anonymize = anonymizer({ whitelist });

const data = {
key1: 'bar',
key2: {
foo: 'bar',
bar: 'baz',
baz: {
foo: 'bar',
bar: 'baz'
}
}
};

anonymize(data);

// {
// key1: 'bar',
// key2: {
// foo: 'bar',
// bar: '--REDACTED--',
// baz: {
// foo: '--REDACTED--',
// bar: '--REDACTED--'
// }
// }
// }
```

### Wildcard matching example

Using `*` allows you to match any character in a key, except for `.`.
This is similar to how `glob` allows you to use `*` to match any character, except for `/`.

```js
import { anonymizer } from '@uphold/anonymizer';

const whitelist = ['key2.*'];
const anonymize = anonymizer({ whitelist });

const data = {
key1: 'bar',
key2: {
foo: 'bar',
bar: 'baz',
baz: {
foo: 'bar',
bar: 'baz'
}
}
};

anonymize(data);

// {
// key1: '--REDACTED--',
// key2: {
// foo: 'bar',
// bar: 'baz',
// baz: {
// foo: '--REDACTED--',
// bar: '--REDACTED--'
// }
// }
// }
```

### Double wildcard matching example

Using `**` allows you to match any nested key.
This is similar to how `glob` allows you to use `**` to match any nested directory.

```js
import { anonymizer } from '@uphold/anonymizer';

const whitelist = ['key2.**', '**.baz'];
const blacklist = ['key2.bar'];
const anonymize = anonymizer({ blacklist, whitelist });

const data = {
foo: { key: 'public', another: 'bar', depth: { bar: 10, innerBlacklist: 11 } },
bar: { foo: 1, bar: 2 },
toAnonymize: { baz: 11, bar: 12 },
toAnonymizeSuperString: 'foo'
key1: 'bar',
key2: {
foo: 'bar',
bar: 'baz',
baz: {
foo: 'bar',
bar: 'baz'
}
},
key3: {
foo: {
baz: 'biz'
}
}
};

anonymize(data);

// {
// foo: {
// key: 'public',
// another: '--REDACTED--',
// depth: { bar: 10, innerBlacklist: '--REDACTED--' }
// key1: '--REDACTED--',
// key2: {
// foo: 'bar',
// bar: '--REDACTED--',
// baz: {
// foo: 'bar',
// bar: 'baz'
// }
// },
// bar: { foo: 1, bar: 2 },
// toAnonymize: { baz: '--REDACTED--', bar: '--REDACTED--' },
// toAnonymizeSuperString: '--REDACTED--'
// key3: {
// foo: {
// baz: 'biz'
// }
// }
// }
```
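
The single and double wildcard rules described above can be pictured as a translation from a pattern to a regular expression. The snippet below is only a minimal sketch under that assumption: `patternToRegExp` is a hypothetical helper written for illustration, not a function exported by this library, and the actual matching implementation may differ.

```js
// Hypothetical helper, not part of @uphold/anonymizer: converts a glob-like
// pattern into a RegExp that is tested against dot-separated object paths.
const patternToRegExp = pattern => {
  const source = pattern
    // Escape regex metacharacters, leaving `*` untouched for now.
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    // `**` may span nested keys, while `*` stops at the next `.`.
    .replace(/\*\*|\*/g, match => (match === '**' ? '.*' : '[^.]*'));

  return new RegExp(`^${source}$`);
};

patternToRegExp('key2.*').test('key2.foo'); // true
patternToRegExp('key2.*').test('key2.baz.foo'); // false
patternToRegExp('key2.**').test('key2.baz.foo'); // true
patternToRegExp('**.baz').test('key3.foo.baz'); // true
```

This mirrors the examples above: `key2.*` only reaches the direct children of `key2`, whereas `key2.**` and `**.baz` cross any number of nesting levels.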

#### Example using serializers
### Custom replacement example

By default, the replacement value is `--REDACTED--`. You can customize it by passing a `replacement` function in the options.

Here's an example that keeps strings partially redacted:

```js
const { anonymizer } = require('@uphold/anonymizer');
const whitelist = ['foo.key', 'foo.depth.*', 'bar.*', 'toAnonymize.baz'];
const blacklist = ['foo.depth.innerBlacklist'];
const serializers = [
{ path: 'foo.key', serializer: () => 'biz' },
{ path: 'toAnonymize', serializer: () => ({ baz: 'baz' }) }
]
const anonymize = anonymizer({ blacklist, whitelist }, { serializers });
import { anonymizer } from '@uphold/anonymizer';

const replacement = (key, value, path) => {
if (typeof value !== 'string') {
return '--REDACTED--';
}

// Keep the first half of the string and redact the rest.
const charsToKeep = Math.floor(value.length / 2);

return value.substring(0, charsToKeep) + '*'.repeat(Math.min(value.length - charsToKeep, 100));
};

const anonymize = anonymizer({}, { replacement });

const data = {
foo: { key: 'public', another: 'bar', depth: { bar: 10, innerBlacklist: 11 } },
bar: { foo: 1, bar: 2 },
toAnonymize: {}
key1: 'bar',
key2: {
foo: 'bar',
bar: 'baz',
baz: {
foo: 'bar',
bar: 'baz'
}
}
};

anonymize(data);

// {
// foo: {
// key: 'biz',
// another: '--REDACTED--',
// depth: { bar: 10, innerBlacklist: '--REDACTED--' }
// },
// bar: { foo: 1, bar: 2 },
// toAnonymize: { baz: 'baz' }
// key1: 'b**',
// key2: {
// foo: 'b**',
// bar: 'b**',
// baz: {
// foo: 'b**',
// bar: 'b**'
// }
// }
// }
```

### Default serializers
### Trim redacted values to keep output shorter

The introduction of serializers also added the possibility of using serializer functions exported by our module. The list of default serializers is presented below:
- error
In certain scenarios, you may want to trim redacted values to keep the output shorter. One such scenario is redacting logs before sending them to a provider that charges based on the amount of data sent and stored.

#### Example
This can be achieved by setting the `trim` option to `true`, like so:

```js
const { anonymizer, defaultSerializers } = require('@uphold/anonymizer');
const serializers = [
{ path: 'foo', serializer: defaultSerializers.error }
];
const whitelist = ['key1', 'key2.foo'];
const anonymize = anonymizer({ whitelist }, { trim: true });

const anonymize = anonymizer({}, { serializers });
const data = {
key1: 'bar',
key2: {
foo: 'bar',
bar: 'baz',
baz: {
foo: 'bar',
bar: 'baz'
}
}
};

const data = { foo: new Error('Foobar') };
anonymize(data);

// {
// __redacted__: ['key2.bar', 'key2.baz.foo', 'key2.baz.bar'],
// key1: 'bar',
// key2: {
// foo: 'bar'
// }
// }
```

### Serializers example

Serializers allow you to apply custom transformations to specific values before they are redacted.

Here's an example:

```js
const { anonymizer } = require('@uphold/anonymizer');
const whitelist = ['foo.key'];
const serializers = [
{ path: 'foo.key', serializer: () => 'biz' },
]
const anonymize = anonymizer({ whitelist }, { serializers });

const data = {
foo: { key: 'public' },
};

anonymize(data);

// {
// foo: {
// name: '--REDACTED--',
// message: '--REDACTED--',
// stack: '--REDACTED--'
// key: 'biz'
// }
// }
```

Take a look at the [built-in serializers](#serializers) for common use cases.

## API

### anonymizer({ whitelist, blacklist }, options)

Returns a function that redacts a given object based on the provided whitelist and blacklist.

#### whitelist

Type: `Array`
Default: `[]`

An array of whitelisted patterns to use when matching against object paths that should not be redacted.

#### blacklist

Type: `Array`
Default: `[]`

An array of blacklisted patterns to use when matching against object paths that should be redacted.

By default, every value is redacted. However, the blacklist can be used in conjunction with a whitelist. The values that match the blacklist will be redacted, even if they match the whitelist.
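
The precedence described above can be summarized as a small decision rule. This is a hedged sketch, not the library's code: `shouldRedact` is a hypothetical name, and exact string comparison stands in for the real pattern matching.

```js
// Hypothetical sketch of the precedence rule. Exact path comparison is used
// here in place of the library's glob-like pattern matching.
const shouldRedact = (path, { whitelist = [], blacklist = [] } = {}) => {
  // A blacklist match always wins, even if the path is also whitelisted.
  if (blacklist.includes(path)) {
    return true;
  }

  // Otherwise, everything is redacted unless it is whitelisted.
  return !whitelist.includes(path);
};

const lists = { whitelist: ['key2.foo', 'key2.bar'], blacklist: ['key2.bar'] };

shouldRedact('key2.foo', lists); // false (whitelisted)
shouldRedact('key2.bar', lists); // true (blacklisted, despite the whitelist)
shouldRedact('key1', lists); // true (redacted by default)
```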

#### options

##### options.replacement

Type: `Function`
Default: `(key, value, path) => '--REDACTED--'`

A function that allows customizing the replacement value (the default implementation returns `--REDACTED--`).

It receives the following arguments: `key` _(String)_, `value` _(Any)_, and `path` _(String)_.
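
As an illustration of these arguments, the sketch below builds a replacement that records the type and path of the redacted value. The output format is made up for this example, and the function is called directly here with a dot-separated path on the assumption that this is the shape the library passes.

```js
// Illustrative replacement function: the marker format is an assumption
// chosen for this example, not something the library prescribes.
const replacement = (key, value, path) => `--REDACTED ${typeof value} at ${path}--`;

// Calling it directly to show what it returns for a given key, value and path.
replacement('bar', 'baz', 'key2.bar'); // '--REDACTED string at key2.bar--'
replacement('count', 42, 'stats.count'); // '--REDACTED number at stats.count--'
```

Such a function would be passed as `anonymizer({ whitelist }, { replacement })`, in the same way as the custom replacement example in the Usage section.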

##### options.serializers

Type: `Array`
Default: `[]`

A list of serializers to apply. Each serializer must contain two properties: `path` (path for the value to be serialized, must be a `string`) and `serializer` (function to be called on the path's value).
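
A serializer entry could look like the sketch below. `serializeDate` and the `createdAt` path are hypothetical names chosen for illustration; only the `path` and `serializer` properties come from the description above.

```js
// Hypothetical serializer: converts a Date into an ISO 8601 string and
// leaves any other value untouched.
const serializeDate = value => (value instanceof Date ? value.toISOString() : value);

// Each entry pairs a path with the function to call on that path's value.
const serializers = [{ path: 'createdAt', serializer: serializeDate }];

serializeDate(new Date(0)); // '1970-01-01T00:00:00.000Z'
serializeDate('unchanged'); // 'unchanged'
```

The resulting list would then be passed in the options, for example `anonymizer({ whitelist: ['createdAt'] }, { serializers })`.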

##### options.trim

Type: `Boolean`
Default: `false`

A flag that enables trimming all redacted values, saving their keys to a `__redacted__` list. Please note that trimming is only applied when the replacement value is `--REDACTED--`.
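
To make the effect of trimming concrete, here is a minimal sketch that post-processes an already redacted object. `trimRedacted` is a hypothetical function written for this illustration; the library applies trimming internally and its implementation may differ, for instance in how it treats arrays or objects that were empty to begin with.

```js
const REDACTED = '--REDACTED--';

// Hypothetical sketch, not the library's implementation: removes redacted
// values and collects their paths into a `__redacted__` list.
const trimRedacted = input => {
  const redacted = [];

  const walk = (node, prefix) => {
    const result = {};

    for (const [key, value] of Object.entries(node)) {
      const path = prefix ? `${prefix}.${key}` : key;

      if (value === REDACTED) {
        redacted.push(path);
      } else if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
        const child = walk(value, path);

        // Drop objects that are empty after trimming (this sketch also drops
        // objects that were empty originally).
        if (Object.keys(child).length > 0) {
          result[key] = child;
        }
      } else {
        result[key] = value;
      }
    }

    return result;
  };

  const output = walk(input, '');

  return redacted.length > 0 ? { __redacted__: redacted, ...output } : output;
};

const trimmed = trimRedacted({
  key1: 'bar',
  key2: { foo: 'bar', bar: REDACTED, baz: { foo: REDACTED, bar: REDACTED } }
});

// {
//   __redacted__: ['key2.bar', 'key2.baz.foo', 'key2.baz.bar'],
//   key1: 'bar',
//   key2: { foo: 'bar' }
// }
```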

### serializers

Built-in serializer functions you may use in the `serializers` option.

#### error

Serializes an `Error` object.

#### datadogSerializer

Serializes an `Error` object for the purpose of sending it to Datadog, adding a `kind` property based on the error class name.
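
The idea behind both serializers can be sketched as follows. This is an assumption-laden illustration: `serializeError` and `serializeErrorWithKind` are hypothetical names, and the fields produced by the built-in serializers may differ from the ones shown here.

```js
// Hypothetical sketch: turns an Error into a plain object so that its
// fields can be matched against whitelist and blacklist patterns.
const serializeError = error => ({
  name: error.name,
  message: error.message,
  stack: error.stack
});

// Hypothetical variant that adds a `kind` property taken from the error
// class name, as described for the Datadog serializer.
const serializeErrorWithKind = error => ({
  ...serializeError(error),
  kind: error.constructor.name
});

class PaymentError extends Error {}

const serialized = serializeErrorWithKind(new PaymentError('Declined'));

serialized.message; // 'Declined'
serialized.kind; // 'PaymentError'
```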

## Release process

The release of a version is automated via the [release](https://github.com/uphold/anonymizer/.github/workflows/release.yml) GitHub workflow. Run it by clicking the "Run workflow" button.
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "@uphold/anonymizer",
"version": "5.2.0",
"description": "Object redaction with whitelist as main feature.",
"description": "Object redaction library that supports whitelisting, blacklisting and wildcard matching",
"homepage": "https://github.com/uphold/anonymizer#readme",
"bugs": {
"url": "https://github.com/uphold/anonymizer/issues"