diff --git a/MaskingTypes.md b/MaskingTypes.md
index 8530db4..cf7e7aa 100644
--- a/MaskingTypes.md
+++ b/MaskingTypes.md
@@ -50,7 +50,7 @@ Example
 * MaskType = None
 * Asterisk: Ignored
 * Expression: Ignored
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: Ignored
 * QueryReplacement: Ignored
 * RegExGroupToReplace: Ignored
@@ -77,7 +77,7 @@ With the usage of a regular expression, it is possible to change all or only par
 * MaskType = Similar
 * Asterisk: Ignored
 * Expression: Ignored
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: Ignored
 * QueryReplacement: Ignored
 * RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -126,7 +126,7 @@ With the usage of a regular expression, it is possible to change all or only par
 * MaskType = Asterisk
 * Asterisk: Optional. A character that will replace the original data. Defaults to an asterisk (*).
 * Expression: Ignored
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: Ignored
 * QueryReplacement: Ignored
 * RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -164,7 +164,7 @@ With the usage of a regular expression, it is possible to change all or only par
 * MaskType = Expression
 * Asterisk: Ignored
 * Expression: Mandatory. An expression consisting of fixed text and fields/JsonPaths enclosed in double curly brackets.
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: Ignored
 * QueryReplacement: Ignored
 * RegExGroupToReplace: Optional. A number that specifies which regex group will be replaced.
@@ -256,7 +256,7 @@ In the case of Json, only one list with an empty selector (a.k.a. fallback) is a
 * MaskType = List
 * Asterisk: Ignored
 * Expression: Ignored
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: A list of `````` items. Each item must contain:
   * a Selector attribute, which can be either empty (fallback) or contain a field name from the data, the equality sign (=) and a constant value. E.g. ```Selector="Country=Greece"```.
   * and a List attribute, which is a comma-separated list of strings. E.g. ```List="Feta,Olives,Kasseri"```.
@@ -299,7 +299,7 @@ The field contents are substituted with a randomly picked item of one or more gi
 * MaskType = Query
 * Asterisk: Ignored
 * Expression: Ignored
-* FieldName: Mandatory. The name of the field being sought.
+* FieldName: Mandatory. The name of the field being sought. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 * ListOfPossibleReplacements: Ignored
 * QueryReplacement: Mandatory. Must contain all of the following attributes:
   * SelectorField: The name of the field _from the original data_ which will be used to match the reference records.
diff --git a/README.md b/README.md
index 7857e48..dbf1e82 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,7 @@ Please see the [generated docs](https://github.com/dandraka/Zoro/blob/master/doc
 ### Notes on usage
+- Field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
 - If using a database to write data (DataDestination=Database), all names of parameters in SqlCommand (@field for SqlServer or $field elsewhere) must have a corresponding FieldMask, even if the MaskType is None. Also, currently connection types of ```System.Data.SqlClient``` and ```System.Data.OleDb``` are supported, but if anything else (e.g. MySql, Oracle) is needed please open an issue; adding more is trivial.
 - If input is a JSON file (DataSource=JsonFile) and one or more FieldMasks are type List (FieldMask.MaskType=List), one 1 Replacement entry is allowed, which has to have an empty Selector (Selector="").
 - If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed. This is planned to be supported in a later version.
diff --git a/Zoro.Processor/Dandraka.Zoro.nuspec b/Zoro.Processor/Dandraka.Zoro.nuspec
index 48bdf85..91f5773 100644
--- a/Zoro.Processor/Dandraka.Zoro.nuspec
+++ b/Zoro.Processor/Dandraka.Zoro.nuspec
@@ -2,7 +2,7 @@
  Dandraka.Zoro
- 2.3.1
+ 2.3.2
  Jim (Dimitrios) Andrakakis
  true
  icon.png
@@ -17,13 +17,14 @@
  Added full JSON support.
  Added masking types reference documentation.
  Added developer docs.
- Addressed SQL Data Provider Security Feature Bypass Vulnerability CVE-2024-0056 on Microsoft.Data.SqlClient and System.Data.SqlClient
+ Addressed SQL Data Provider Security Feature Bypass Vulnerability CVE-2024-0056 on Microsoft.Data.SqlClient and System.Data.SqlClient.
+ Fixed case-sensitivity bug and updated to latest package versions.
  Copyright (c) 2017 Jim (Dimitrios) Andrakakis
  data anonymization masking gdpr sql csv json
-
+
diff --git a/Zoro.Processor/DataMasking.cs b/Zoro.Processor/DataMasking.cs
index f16114c..077bf18 100644
--- a/Zoro.Processor/DataMasking.cs
+++ b/Zoro.Processor/DataMasking.cs
@@ -284,7 +284,7 @@ private string GetExpressionString(DataRow row, string expression, JProperty jso
  foreach (Group rxGroup in r.Groups)
  {
- string fieldName = rxGroup.Value.Replace("{{", "").Replace("}}", "").ToLower();
+ string fieldName = rxGroup.Value.Replace("{{", "").Replace("}}", "");
  switch (this.config.DataSource)
  {
diff --git a/Zoro.Processor/FieldMask.cs b/Zoro.Processor/FieldMask.cs
index c972ae9..f3b0fc6 100644
--- a/Zoro.Processor/FieldMask.cs
+++ b/Zoro.Processor/FieldMask.cs
@@ -20,7 +20,7 @@ public FieldMask()
  }
  ///
- /// The name of the field.
+ /// The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files.
  ///
  public string FieldName { get; set; }
diff --git a/Zoro.Processor/FieldNotFoundException.cs b/Zoro.Processor/FieldNotFoundException.cs
index 7a5966c..11bfdfb 100644
--- a/Zoro.Processor/FieldNotFoundException.cs
+++ b/Zoro.Processor/FieldNotFoundException.cs
@@ -18,7 +18,7 @@ public FieldNotFoundException()
  /// Creates a FieldNotFoundException specifying the field name.
  ///
  public FieldNotFoundException(string field)
- : base($"Field or JsonPath {field} was not found.")
+ : base($"Field or JsonPath {field} was not found. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.")
  {
  }
@@ -26,7 +26,7 @@ public FieldNotFoundException(string field)
  /// Creates a FieldNotFoundException specifying the field name and an inner exception.
  ///
  public FieldNotFoundException(string field, Exception inner)
- : base($"Field or JsonPath {field} was not found.", inner)
+ : base($"Field or JsonPath {field} was not found. Note that field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.", inner)
  {
  }
  }
diff --git a/Zoro.Processor/README.md b/Zoro.Processor/README.md
index 967e3a8..dbf1e82 100644
--- a/Zoro.Processor/README.md
+++ b/Zoro.Processor/README.md
@@ -58,9 +58,10 @@ Please see the [generated docs](https://github.com/dandraka/Zoro/blob/master/doc
 ### Notes on usage
-- If using a database to write data (DataDestination=Database), the names of parameters in SqlCommand ($field) must match the names of FieldMasks.
+- Field names are case-insensitive for CSV files & DB queries, but case-sensitive for JSON files.
+- If using a database to write data (DataDestination=Database), all names of parameters in SqlCommand (@field for SqlServer or $field elsewhere) must have a corresponding FieldMask, even if the MaskType is None. Also, currently connection types of ```System.Data.SqlClient``` and ```System.Data.OleDb``` are supported, but if anything else (e.g. MySql, Oracle) is needed please open an issue; adding more is trivial.
 - If input is a JSON file (DataSource=JsonFile) and one or more FieldMasks are type List (FieldMask.MaskType=List), one 1 Replacement entry is allowed, which has to have an empty Selector (Selector="").
-- If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed.
+- If input is a JSON file (DataSource=JsonFile), FieldMasks that perform a database query (FieldMask.MaskType=Query) are not allowed. This is planned to be supported in a later version.
 ## Examples:
@@ -114,9 +115,13 @@ ID;Name;BankAccount
  ID
  None
-
+
+
+ CustomerFullname
+ None
+
- CountryISOCode
+ CustomerCountry
  None
@@ -124,35 +129,40 @@ ID;Name;BankAccount
  Similar
  ^(\+\d\d)?(.*)$
  2
-
+
  Street
  List
-
-
-
-
-
-
-
+
+
+
+
+
+
-
+
- City
+ CustomerCity
  Query
-
+
  Database
  Database
  Server=DBSRV1;Database=appdb;Trusted_Connection=yes;
+ System.Data.SqlClient
- SELECT ID, MainPhone, Street, CountryISOCode FROM customers
- INSERT INTO customers_anonymous (ID, MainPhone, Street, City, CountryISOCode) VALUES ($ID, $MainPhone, $Street, $City, $CountryISOCode)
-
+ SELECT ID, CustomerFullname, CustomerCity, CustomerCountry FROM customers
+
+ INSERT INTO customers_anonymous (ID, CustomerFullname, CustomerCity, CustomerCountry) VALUES (@ID, @CustomerFullname, @CustomerCity, @CustomerCountry)
+
 ```
 **Sample config file for JSON source and destination using an Expression and a List**
diff --git a/Zoro.Processor/Zoro.Processor.csproj b/Zoro.Processor/Zoro.Processor.csproj
index 368abf8..898dd3d 100644
Binary files a/Zoro.Processor/Zoro.Processor.csproj and b/Zoro.Processor/Zoro.Processor.csproj differ
diff --git a/Zoro.Tests/DataMasking_Tests.cs b/Zoro.Tests/DataMasking_Tests.cs
index 46dbdba..d91e5ea 100644
--- a/Zoro.Tests/DataMasking_Tests.cs
+++ b/Zoro.Tests/DataMasking_Tests.cs
@@ -463,9 +463,9 @@ public void T14_Mask_JSON_Expression_Test()
  config.FieldMasks.Clear();
  config.FieldMasks.Add(new FieldMask()
  {
- FieldName = "name",
+ FieldName = "Name",
  MaskType = MaskType.Expression,
- Expression = "Customer-{{$.id}}"
+ Expression = "Customer-{{$.Id}}"
  });
  config.InputFile = config.InputFile.Replace("data2.json", "data5.json");
  config.OutputFile = config.OutputFile.Replace("data2.json", "data5.json");
@@ -481,8 +481,8 @@ public void T14_Mask_JSON_Expression_Test()
  var jsonObjOrig = JObject.Parse(File.ReadAllText(config.InputFile));
  var jsonObjMasked = JObject.Parse(File.ReadAllText(config.OutputFile));
- string origId = jsonObjOrig.Value("id");
- string maskedName = jsonObjMasked.Value("name");
+ string origId = jsonObjOrig.Value("Id");
+ string maskedName = jsonObjMasked.Value("Name");
  Assert.Equal($"Customer-{origId}", maskedName);
  }
@@ -494,7 +494,7 @@ public void T15_Mask_JSON_Expression_WrongJsonPath_Test()
  config.FieldMasks.Clear();
  config.FieldMasks.Add(new FieldMask()
  {
- FieldName = "name",
+ FieldName = "Name",
  MaskType = MaskType.Expression,
  Expression = "Customer-{{$.SomeAttributeThatDoesntExist}}"
  });
@@ -531,18 +531,18 @@ public void T16_Mask_JSON_Expression_Node_Test()
  // and for name=Alicja Bakshi the id should be 2.
  config.FieldMasks.Add(new FieldMask()
  {
- FieldName = "name",
+ FieldName = "Name",
  MaskType = MaskType.Expression,
- Expression = "Customer {{$.id}}"
+ Expression = "Customer {{$.Id}}"
  });
  // This field mask tests getting data
  // from the Json root ($).
  // So for all names the id should be 1.
  config.FieldMasks.Add(new FieldMask()
  {
- FieldName = "spouse",
+ FieldName = "Spouse",
  MaskType = MaskType.Expression,
- Expression = "Spouse of Customer {{$.employees[0].id}}"
+ Expression = "Spouse of Customer {{$.Employees[0].Id}}"
  });
  // === Act ===
@@ -559,9 +559,9 @@ public void T16_Mask_JSON_Expression_Node_Test()
  for (int i = 0; i < 2; i++)
  {
- string origId = jsonObjOrig.SelectToken($"employees[{i}].id").Value();
- string maskedCustomerName = jsonObjMasked.SelectToken($"employees[{i}].name").Value();
- string maskedSpouseName = jsonObjMasked.SelectToken($"employees[{i}].spouse").Value();
+ string origId = jsonObjOrig.SelectToken($"Employees[{i}].Id").Value();
+ string maskedCustomerName = jsonObjMasked.SelectToken($"Employees[{i}].Name").Value();
+ string maskedSpouseName = jsonObjMasked.SelectToken($"Employees[{i}].Spouse").Value();
  Assert.Equal($"Customer {origId}", maskedCustomerName);
  Assert.Equal($"Spouse of Customer 1", maskedSpouseName);
  }
diff --git a/Zoro.Tests/Zoro.Tests.csproj b/Zoro.Tests/Zoro.Tests.csproj
index f1095ae..41d7a96 100644
Binary files a/Zoro.Tests/Zoro.Tests.csproj and b/Zoro.Tests/Zoro.Tests.csproj differ
diff --git a/Zoro.Tests/data/data5.json b/Zoro.Tests/data/data5.json
index b2041ee..cfb7bfd 100644
--- a/Zoro.Tests/data/data5.json
+++ b/Zoro.Tests/data/data5.json
@@ -1,7 +1,7 @@
 {
- "id": "1",
- "name": "Aleksander Singh",
- "salary": 117300,
- "married": true,
- "spouse": "Ingrid Díaz"
+ "Id": "1",
+ "Name": "Aleksander Singh",
+ "Salary": 117300,
+ "Married": true,
+ "Spouse": "Ingrid Díaz"
 }
\ No newline at end of file
diff --git a/Zoro.Tests/data/data6.json b/Zoro.Tests/data/data6.json
index 7035b38..b25788a 100644
--- a/Zoro.Tests/data/data6.json
+++ b/Zoro.Tests/data/data6.json
@@ -1,14 +1,14 @@
 {
- "employees": [
+ "Employees": [
  {
- "id": "1",
- "name": "Aleksander Singh",
- "spouse": "Ingrid Díaz"
+ "Id": "1",
+ "Name": "Aleksander Singh",
+ "Spouse": "Ingrid Díaz"
  },
  {
- "id": "2",
- "name": "Alicja Bakshi",
- "spouse": "Ellinore Alvarez"
+ "Id": "2",
+ "Name": "Alicja Bakshi",
+ "Spouse": "Ellinore Alvarez"
  }
  ]
 }
\ No newline at end of file
diff --git a/Zoro/Zoro.csproj b/Zoro/Zoro.csproj
index 62471cc..de1a2b1 100644
Binary files a/Zoro/Zoro.csproj and b/Zoro/Zoro.csproj differ
diff --git a/docs/Dandraka.Zoro.Processor/FieldMask.md b/docs/Dandraka.Zoro.Processor/FieldMask.md
index 8617016..3eca0fe 100644
--- a/docs/Dandraka.Zoro.Processor/FieldMask.md
+++ b/docs/Dandraka.Zoro.Processor/FieldMask.md
@@ -13,7 +13,7 @@ public class FieldMask
 | [FieldMask](FieldMask/FieldMask.md)() | Creates an instance of FieldMask class. |
 | [Asterisk](FieldMask/Asterisk.md) { get; set; } | In case of `MaskType.Asterisk`, the character to apply. The default is asterisk (*). |
 | [Expression](FieldMask/Expression.md) { get; set; } | In case of `MaskType.Expression`, the expression to use. The field contents are substituted with a combination of a constant string and values from other fields. Must be filled with a constant string and field names enclosed in double curly brackets. For example "Customer-{{CustomerID}}" (without the quotes). When the data source is Json, a JsonPath is expected in the place of field name. The JsonPath will be applied on the root of the Json. For example "Customer-{{$.CustomerID}}" (without the quotes). |
-| [FieldName](FieldMask/FieldName.md) { get; set; } | The name of the field. |
+| [FieldName](FieldMask/FieldName.md) { get; set; } | The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files. |
 | [ListOfPossibleReplacements](FieldMask/ListOfPossibleReplacements.md) { get; set; } | In case of `MaskType.List`, the comma-separated list of items to choose from. |
 | [MaskType](FieldMask/MaskType.md) { get; set; } | The type of masking to apply. The default is None. |
 | [QueryReplacement](FieldMask/QueryReplacement.md) { get; set; } | In case of `MaskType.Query`, the SQL query to get the list of replacements from. |
diff --git a/docs/Dandraka.Zoro.Processor/FieldMask/FieldName.md b/docs/Dandraka.Zoro.Processor/FieldMask/FieldName.md
index 91fe547..b66e2cc 100644
--- a/docs/Dandraka.Zoro.Processor/FieldMask/FieldName.md
+++ b/docs/Dandraka.Zoro.Processor/FieldMask/FieldName.md
@@ -1,6 +1,6 @@
 # FieldMask.FieldName property
-The name of the field.
+The name of the field. Note that field names are case-insensitive for CSV files and DB queries, but case-sensitive for JSON files.
 ```csharp
 public string FieldName { get; set; }