Fixed case sensitivity bug and updated packages
dandraka committed May 31, 2024
1 parent 4a05fd8 commit 9d1f788
Showing 15 changed files with 69 additions and 57 deletions.
12 changes: 6 additions & 6 deletions MaskingTypes.md
@@ -50,7 +50,7 @@ Example
* MaskType = None
* Asterisk: Ignored
* Expression: Ignored
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: Ignored
* QueryReplacement: Ignored
* RegExGroupToReplace: Ignored
@@ -77,7 +77,7 @@ With the usage of a regular expression, it is possible to change all or only par
* MaskType = Similar
* Asterisk: Ignored
* Expression: Ignored
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: Ignored
* QueryReplacement: Ignored
* RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -126,7 +126,7 @@ With the usage of a regular expression, it is possible to change all or only par
* MaskType = Asterisk
* Asterisk: Optional. A character that will replace the original data. Defaults to an asterisk (*).
* Expression: Ignored
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: Ignored
* QueryReplacement: Ignored
* RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -164,7 +164,7 @@ With the usage of a regular expression, it is possible to change all or only par
* MaskType = Expression
* Asterisk: Ignored
* Expression: Mandatory. An expression consisting of fixed text and fields/JsonPaths enclosed in double curly brackets.
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: Ignored
* QueryReplacement: Ignored
* RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -256,7 +256,7 @@ In the case of Json, only one list with an empty selector (a.k.a. fallback) is a
* MaskType = List
* Asterisk: Ignored
* Expression: Ignored
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: A list of ```<Replacement>``` items. Each item must contain:
* a Selector attribute, which can be either empty (fallback) or contain a field name from the data, the equality sign (=) and a constant value. E.g. ```Selector="Country=Greece"```.
* and a List attribute, which is a comma-separated list of strings. E.g. ```List="Feta,Olives,Kasseri"```.
@@ -299,7 +299,7 @@ The field contents are substituted with a randomly picked item of one or more gi
* MaskType = Query
* Asterisk: Ignored
* Expression: Ignored
* FieldName: Mandatory. The name of the field being sought.
* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
* ListOfPossibleReplacements: Ignored
* QueryReplacement: Mandatory. Must contain all of the following attributes:
* SelectorField: The name of the field _from the original data_ which will be used to match the reference records.
1 change: 1 addition & 0 deletions README.md
@@ -58,6 +58,7 @@ Please see the [generated docs](https://github.com/dandraka/Zoro/blob/master/doc

### Notes on usage

- Field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
- If using a database to write data (DataDestination=Database), all names of parameters in SqlCommand (@field for SqlServer or $field elsewhere) must have a corresponding FieldMask, even if the MaskType is None. Also, currently connection types of ```System.Data.SqlClient``` and ```System.Data.OleDb``` are supported, but if anything else (e.g. MySql, Oracle) is needed please open an issue; adding more is trivial.
- If input is a JSON file (DataSource=JsonFile) and one or more FieldMasks are type List (FieldMask.MaskType=List), one 1 Replacement entry is allowed, which has to have an empty Selector (Selector="").
- If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed. This is planned to be supported in a later version.
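
To illustrate the first note in this list, here is a minimal FieldMask fragment for a JSON source. It is a sketch modeled on the test data changed in this commit (a JSON object with `Id` and `Name` properties), not a complete config file. Because JSON field names are case-sensitive, both `FieldName` and the JsonPath inside `Expression` have to match the property casing exactly.

```xml
<!-- Sketch only: assumes a JSON input such as { "Id": "1", "Name": "Aleksander Singh" } -->
<FieldMask>
  <!-- Must be "Name", not "name": JSON field names are matched case-sensitively -->
  <FieldName>Name</FieldName>
  <MaskType>Expression</MaskType>
  <!-- The JsonPath is case-sensitive too: {{$.id}} would not find "Id" -->
  <Expression>Customer-{{$.Id}}</Expression>
</FieldMask>
```

With a CSV file or a database query as the source, the same mask would match `name`, `Name` or `NAME` alike, since field names are case-insensitive there.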
7 changes: 4 additions & 3 deletions Zoro.Processor/Dandraka.Zoro.nuspec
@@ -2,7 +2,7 @@
<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
<metadata>
<id>Dandraka.Zoro</id>
<version>2.3.1</version>
<version>2.3.2</version>
<authors>Jim (Dimitrios) Andrakakis</authors>
<requireLicenseAcceptance>true</requireLicenseAcceptance>
<icon>icon.png</icon>
@@ -17,13 +17,14 @@
Added full JSON support.
Added masking types reference documentation.
Added developer docs.
Addressed SQL Data Provider Security Feature Bypass Vulnerability CVE-2024-0056 on Microsoft.Data.SqlClient and System.Data.SqlClient
Addressed SQL Data Provider Security Feature Bypass Vulnerability CVE-2024-0056 on Microsoft.Data.SqlClient and System.Data.SqlClient.
Fixed case-sensitivity bug and updated to latest package versions.
</releaseNotes>
<copyright>Copyright (c) 2017 Jim (Dimitrios) Andrakakis</copyright>
<tags>data anonymization masking gdpr sql csv json</tags>
<dependencies>
<group targetFramework=".netstandard2.1">
<dependency id="GenericParsing" version="1.2.2" />
<dependency id="GenericParsing" version="1.5.0" />
<dependency id="System.Data.SqlClient" version="4.8.6" />
<dependency id="System.Data.OleDb" version="8.0.0" />
</group>
2 changes: 1 addition & 1 deletion Zoro.Processor/DataMasking.cs
@@ -284,7 +284,7 @@ private string GetExpressionString(DataRow row, string expression, JProperty jso

foreach (Group rxGroup in r.Groups)
{
string fieldName = rxGroup.Value.Replace("{{", "").Replace("}}", "").ToLower();
string fieldName = rxGroup.Value.Replace("{{", "").Replace("}}", "");

switch (this.config.DataSource)
{
2 changes: 1 addition & 1 deletion Zoro.Processor/FieldMask.cs
@@ -20,7 +20,7 @@ public FieldMask()
}

/// <summary>
/// The name of the field.
/// The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files.
/// </summary>
public string FieldName { get; set; }

4 changes: 2 additions & 2 deletions Zoro.Processor/FieldNotFoundException.cs
@@ -18,15 +18,15 @@ public FieldNotFoundException()
/// Creates a FieldNotFoundException specifying the field name.
/// </summary>
public FieldNotFoundException(string field)
: base($"Field or JsonPath {field} was not found.")
: base($"Field or JsonPath {field} was not found. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.")
{
}

/// <summary>
/// Creates a FieldNotFoundException specifying the field name and an inner exception.
/// </summary>
public FieldNotFoundException(string field, Exception inner)
: base($"Field or JsonPath {field} was not found.", inner)
: base($"Field or JsonPath {field} was not found. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.", inner)
{
}
}
46 changes: 28 additions & 18 deletions Zoro.Processor/README.md
@@ -58,9 +58,10 @@ Please see the [generated docs](https://github.com/dandraka/Zoro/blob/master/doc

### Notes on usage

- If using a database to write data (DataDestination=Database), the names of parameters in SqlCommand ($field) must match the names of FieldMasks.
- Field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
- If using a database to write data (DataDestination=Database), all names of parameters in SqlCommand (@field for SqlServer or $field elsewhere) must have a corresponding FieldMask, even if the MaskType is None. Also, currently connection types of ```System.Data.SqlClient``` and ```System.Data.OleDb``` are supported, but if anything else (e.g. MySql, Oracle) is needed please open an issue; adding more is trivial.
- If input is a JSON file (DataSource=JsonFile) and one or more FieldMasks are type List (FieldMask.MaskType=List), one 1 Replacement entry is allowed, which has to have an empty Selector (Selector="").
- If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed.
- If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed. This is planned to be supported in a later version.

## Examples:

@@ -114,45 +115,54 @@ ID;Name;BankAccount
<FieldMask>
<FieldName>ID</FieldName>
<MaskType>None</MaskType>
</FieldMask>
</FieldMask>
<FieldMask>
<FieldName>CustomerFullname</FieldName>
<MaskType>None</MaskType>
</FieldMask>
<FieldMask>
<FieldName>CountryISOCode</FieldName>
<FieldName>CustomerCountry</FieldName>
<MaskType>None</MaskType>
</FieldMask>
<FieldMask>
<FieldName>MainPhone</FieldName>
<MaskType>Similar</MaskType>
<RegExMatch>^(\+\d\d)?(.*)$</RegExMatch>
<RegExGroupToReplace>2</RegExGroupToReplace>
</FieldMask>
</FieldMask>
<FieldMask>
<FieldName>Street</FieldName>
<MaskType>List</MaskType>
<ListOfPossibleReplacements>
<Replacement Selector="Country=Netherlands" List="Bergselaan,Schieweg,Nootdorpstraat,Nolensstraat" />
<Replacement Selector="Country=Switzerland" List="Bahnhofstrasse,Clarahofweg,Sperrstrasse,Erlenstrasse" />
<Replacement Selector="Country=Liechtenstein" List="Lettstrasse,Bangarten,Beckagässli,Haldenweg" />
<Replacement Selector="Country=Germany" List="Bahnhofstraße,Freigaße,Hauptstraße" />
<Replacement Selector="Country=Belgium" List="Rue d'Argent,Rue d'Assaut,Rue de l'Ecuyer,Rue du Persil" />
<Replacement Selector="Country=Austria" List="Miesbachgasse,Kleine Pfarrgasse,Heinestraße" />
<Replacement Selector="Country=France" List="Rue Nationale,Boulevard Vauban,Rue des Stations,Boulevard de la Liberté" />
<Replacement Selector="CustomerCountry=NL" List="Bergselaan,Schieweg,Nootdorpstraat,Nolensstraat" />
<Replacement Selector="CustomerCountry=CH" List="Bahnhofstrasse,Clarahofweg,Sperrstrasse,Erlenstrasse" />
<Replacement Selector="CustomerCountry=LI" List="Lettstrasse,Bangarten,Beckagässli,Haldenweg" />
<Replacement Selector="CustomerCountry=DE" List="Bahnhofstraße,Freigaße,Hauptstraße" />
<Replacement Selector="CustomerCountry=BE" List="Rue d'Argent,Rue d'Assaut,Rue de l'Ecuyer,Rue du Persil" />
<Replacement Selector="CustomerCountry=FR" List="Rue Nationale,Boulevard Vauban,Rue des Stations,Boulevard de la Liberté" />
<!--- fallback when nothing matches; MUST be the last one --->
<Replacement Selector="" List="Bedford Gardens,Sheffield Terrace,Kensington Palace Gardens" />
</ListOfPossibleReplacements>
</FieldMask>
</FieldMask>
<FieldMask>
<FieldName>City</FieldName>
<FieldName>CustomerCity</FieldName>
<MaskType>Query</MaskType>
<QueryReplacement SelectorField="CountryISOCode" GroupField="countrycode" ValueField="cityname" Query="SELECT cityname, countrycode FROM cities" />
<QueryReplacement
SelectorField="CustomerCountry"
GroupField="cityCountryName"
ValueField="cityName"
Query="SELECT cityName, cityCountryName FROM cities" />
</FieldMask>
</FieldMasks>
<DataSource>Database</DataSource>
<DataDestination>Database</DataDestination>
<ConnectionString>Server=DBSRV1;Database=appdb;Trusted_Connection=yes;</ConnectionString>
<!-- Currently System.Data.SqlClient and System.Data.OleDb are supported, but if needed, adding more is trivial -->
<ConnectionType>System.Data.SqlClient</ConnectionType>
<SqlSelect>SELECT ID, MainPhone, Street, CountryISOCode FROM customers</SqlSelect>
<SqlCommand>INSERT INTO customers_anonymous (ID, MainPhone, Street, City, CountryISOCode) VALUES ($ID, $MainPhone, $Street, $City, $CountryISOCode)</SqlCommand>
</MaskConfig>
<SqlSelect>SELECT ID, CustomerFullname, CustomerCity, CustomerCountry FROM customers</SqlSelect>
<!-- Note that the parameter character is @ for Sql Server, $ elsewhere -->
<SqlCommand>INSERT INTO customers_anonymous (ID, CustomerFullname, CustomerCity, CustomerCountry) VALUES (@ID, @CustomerFullname, @CustomerCity, @CustomerCountry)</SqlCommand>
</MaskConfig>
```

**Sample config file for JSON source and destination using an Expression and a List**
Binary file modified Zoro.Processor/Zoro.Processor.csproj
Binary file not shown.
24 changes: 12 additions & 12 deletions Zoro.Tests/DataMasking_Tests.cs
@@ -463,9 +463,9 @@ public void T14_Mask_JSON_Expression_Test()
config.FieldMasks.Clear();
config.FieldMasks.Add(new FieldMask()
{
FieldName = "name",
FieldName = "Name",
MaskType = MaskType.Expression,
Expression = "Customer-{{$.id}}"
Expression = "Customer-{{$.Id}}"
});
config.InputFile = config.InputFile.Replace("data2.json", "data5.json");
config.OutputFile = config.OutputFile.Replace("data2.json", "data5.json");
@@ -481,8 +481,8 @@ public void T14_Mask_JSON_Expression_Test()

var jsonObjOrig = JObject.Parse(File.ReadAllText(config.InputFile));
var jsonObjMasked = JObject.Parse(File.ReadAllText(config.OutputFile));
string origId = jsonObjOrig.Value<string>("id");
string maskedName = jsonObjMasked.Value<string>("name");
string origId = jsonObjOrig.Value<string>("Id");
string maskedName = jsonObjMasked.Value<string>("Name");
Assert.Equal($"Customer-{origId}", maskedName);
}

@@ -494,7 +494,7 @@ public void T15_Mask_JSON_Expression_WrongJsonPath_Test()
config.FieldMasks.Clear();
config.FieldMasks.Add(new FieldMask()
{
FieldName = "name",
FieldName = "Name",
MaskType = MaskType.Expression,
Expression = "Customer-{{$.SomeAttributeThatDoesntExist}}"
});
@@ -531,18 +531,18 @@ public void T16_Mask_JSON_Expression_Node_Test()
// and for name=Alicja Bakshi the id should be 2.
config.FieldMasks.Add(new FieldMask()
{
FieldName = "name",
FieldName = "Name",
MaskType = MaskType.Expression,
Expression = "Customer {{$.id}}"
Expression = "Customer {{$.Id}}"
});
// This field mask tests getting data
// from the Json root ($).
// So for all names the id should be 1.
config.FieldMasks.Add(new FieldMask()
{
FieldName = "spouse",
FieldName = "Spouse",
MaskType = MaskType.Expression,
Expression = "Spouse of Customer {{$.employees[0].id}}"
Expression = "Spouse of Customer {{$.Employees[0].Id}}"
});

// === Act ===
@@ -559,9 +559,9 @@ public void T16_Mask_JSON_Expression_Node_Test()

for (int i = 0; i < 2; i++)
{
string origId = jsonObjOrig.SelectToken($"employees[{i}].id").Value<string>();
string maskedCustomerName = jsonObjMasked.SelectToken($"employees[{i}].name").Value<string>();
string maskedSpouseName = jsonObjMasked.SelectToken($"employees[{i}].spouse").Value<string>();
string origId = jsonObjOrig.SelectToken($"Employees[{i}].Id").Value<string>();
string maskedCustomerName = jsonObjMasked.SelectToken($"Employees[{i}].Name").Value<string>();
string maskedSpouseName = jsonObjMasked.SelectToken($"Employees[{i}].Spouse").Value<string>();
Assert.Equal($"Customer {origId}", maskedCustomerName);
Assert.Equal($"Spouse of Customer 1", maskedSpouseName);
}
Binary file modified Zoro.Tests/Zoro.Tests.csproj
Binary file not shown.
10 changes: 5 additions & 5 deletions Zoro.Tests/data/data5.json
@@ -1,7 +1,7 @@
{
"id": "1",
"name": "Aleksander Singh",
"salary": 117300,
"married": true,
"spouse": "Ingrid Díaz"
"Id": "1",
"Name": "Aleksander Singh",
"Salary": 117300,
"Married": true,
"Spouse": "Ingrid Díaz"
}
14 changes: 7 additions & 7 deletions Zoro.Tests/data/data6.json
@@ -1,14 +1,14 @@
{
"employees": [
"Employees": [
{
"id": "1",
"name": "Aleksander Singh",
"spouse": "Ingrid Díaz"
"Id": "1",
"Name": "Aleksander Singh",
"Spouse": "Ingrid Díaz"
},
{
"id": "2",
"name": "Alicja Bakshi",
"spouse": "Ellinore Alvarez"
"Id": "2",
"Name": "Alicja Bakshi",
"Spouse": "Ellinore Alvarez"
}
]
}
Binary file modified Zoro/Zoro.csproj
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/Dandraka.Zoro.Processor/FieldMask.md
@@ -13,7 +13,7 @@ public class FieldMask
| [FieldMask](FieldMask/FieldMask.md)() | Creates an instance of FieldMask class. |
| [Asterisk](FieldMask/Asterisk.md) { get; set; } | In case of `MaskType.Asterisk`, the character to apply. The default is asterisk (*). |
| [Expression](FieldMask/Expression.md) { get; set; } | In case of `MaskType.Expression`, the expression to use. The field contents are substituted with a combination of a constant string and values from other fields. Must be filled with a constant string and field names enclosed in double curly brackets. For example "Customer-{{CustomerID}}" (without the quotes). When the data source is Json, a JsonPath is expected in the place of field name. The JsonPath will be applied on the root of the Json. For example "Customer-{{$.CustomerID}}" (without the quotes). |
| [FieldName](FieldMask/FieldName.md) { get; set; } | The name of the field. |
| [FieldName](FieldMask/FieldName.md) { get; set; } | The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files. |
| [ListOfPossibleReplacements](FieldMask/ListOfPossibleReplacements.md) { get; set; } | In case of `MaskType.List`, the comma-separated list of items to choose from. |
| [MaskType](FieldMask/MaskType.md) { get; set; } | The type of masking to apply. The default is None. |
| [QueryReplacement](FieldMask/QueryReplacement.md) { get; set; } | In case of `MaskType.Query`, the SQL query to get the list of replacements from. |
2 changes: 1 addition & 1 deletion docs/Dandraka.Zoro.Processor/FieldMask/FieldName.md
@@ -1,6 +1,6 @@
# FieldMask.FieldName property

The name of the field.
The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files.

```csharp
public string FieldName { get; set; }
