Export an anonymized copy of your PostgreSQL database. Sensitive data is replaced using faker, and the output is written to a file that you can easily import with psql.
Run this command with a connection string and an output file name (no installation needed, thanks to npx):
npx pg-anonymizer postgres://user:secret@localhost:1234/mydb -o dump.sql
☝️ This command requires pg_dump. It is usually already available if PostgreSQL is installed.
Use the --list option with a comma-separated list of column names:
npx pg-anonymizer postgres://localhost/mydb \
--list=email,firstName,lastName,phone
Specifying a list via --list replaces the default list of automatically anonymized columns:
email,name,description,address,city,country,phone,comment,birthdate
You can also choose which faker function is used to replace the data (default is faker.random.word):
npx pg-anonymizer postgres://localhost/mydb \
--list=firstName:faker.name.firstName,lastName:faker.name.lastName
👉 You don't need to specify a faker function, since the command tries to find a suitable one from the column name.
You can also use plain text for static replacements:
npx pg-anonymizer postgres://localhost/mydb \
--list=textcol:hello,jsoncol:{},intcol:12
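To make the --list format easier to picture, here is a minimal sketch of how such a value could be split into per-column rules. It is not pg-anonymizer's actual code: `parseList`, the `kind` labels, and the rule shape are hypothetical names chosen for illustration, and the only assumption taken from the examples above is the `column` or `column:replacement` syntax.

```javascript
// Hypothetical sketch: break a --list value into column rules.
// parseList and the rule shape are illustrative, not pg-anonymizer's internals.
function parseList(list) {
  return list.split(',').map((item) => {
    // Split on the first colon only, so a replacement may itself contain colons.
    const idx = item.indexOf(':');
    if (idx === -1) {
      // Bare column name: the replacement is left for the tool to choose.
      return { column: item, kind: 'auto', replacement: null };
    }
    const column = item.slice(0, idx);
    const replacement = item.slice(idx + 1);
    // Values starting with "faker." name a faker function; anything else is static text.
    const kind = replacement.startsWith('faker.') ? 'faker' : 'static';
    return { column, kind, replacement };
  });
}

console.log(parseList('email,firstName:faker.name.firstName,intcol:12'));
```

In this sketch a static value that contains a comma would be split into two entries, so such values would need a different approach.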
You can even use custom replacement functions from your own JavaScript module. Here is a simple example that masks all emails:
npx pg-anonymizer postgres://localhost/mydb \
--extension ./myExtension.js \
--list=email:extension.maskEmail
// myExtension.js
module.exports = {
  maskEmail: (email) => {
    const [name, domain] = email.split('@');
    const { length: len } = name;
    const maskedName = name[0] + '...' + name[len - 1];
    const maskedEmail = maskedName + '@' + domain;
    return maskedEmail;
  }
};
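The example above assumes every value is a well-formed email with a local part of at least two characters. As a rough sketch, a more defensive variant of the same module might look like the following. It assumes, as in the example above, that an extension function receives the column value and returns its replacement; `hashEmail` and the `example.com` placeholder domain are hypothetical additions, not part of pg-anonymizer.

```javascript
// myExtension.js (illustrative variant; hashEmail is a hypothetical addition)
const crypto = require('crypto');

// Mask the local part of an email, tolerating short or malformed values.
function maskEmail(email) {
  if (typeof email !== 'string') return email;
  const at = email.lastIndexOf('@');
  if (at <= 0) return '...'; // no usable local part: hide the whole value
  const name = email.slice(0, at);
  const domain = email.slice(at + 1);
  // Only show the last character when the local part has more than one.
  const maskedName =
    name.length > 1 ? name[0] + '...' + name[name.length - 1] : name[0] + '...';
  return maskedName + '@' + domain;
}

// Replace an email with a stable pseudonym: equal inputs give equal outputs.
function hashEmail(email) {
  const digest = crypto.createHash('sha256').update(String(email)).digest('hex');
  return 'user_' + digest.slice(0, 12) + '@example.com';
}

module.exports = { maskEmail, hashEmail };
```

Following the pattern shown above, the second function would be selected with --list=email:extension.hashEmail. A stable pseudonym can help when the same address appears in several tables and the rows should still match after anonymization.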
Use -m to change the pg_dump output memory limit (e.g. 512).
Use -l to change the locale used by faker (default: en).
The anonymized output file is plain SQL text, so you can import it with psql:
psql -d mylocaldb < output.sql
There are several alternatives, but I failed to get them working:
postgresql_anonymizer may be hard to set up and cumbersome for simple usage. Still, I guess it's the best solution.
pganonymize fails when the database does not use the public schema or when column names contain uppercase characters.
pganonymizer also fails on simple cases. Its failures are either silent or reported with unclear error messages.