diff --git a/docs/docs/evaluating-performance.md b/docs/docs/evaluating-performance.md index fc92a83..a282c87 100644 --- a/docs/docs/evaluating-performance.md +++ b/docs/docs/evaluating-performance.md @@ -23,7 +23,7 @@ To evaluate Philter's performance you need: #### Configuring Philter -Before we can begin our evaluation we need to create a policy. A [policy](policies/filter_policies.md) is a file that defines the types of sensitive information that will be redacted and how it will be redacted. The policies are stored on the Philter instance under `/opt/Philter/policies`. You can edit the policies directly there using a text editor or you can use Philter's [API](policies-api.md) to upload a policy. In this case we recommend just using a text editor on the Philter instance to create a policy. +Before we can begin our evaluation we need to create a policy. A [policy](policies/filter_policies.md) is a file that defines the types of sensitive information that will be redacted and how it will be redacted. The policies are stored on the Philter instance under `/opt/philter/policies`. You can edit the policies directly there using a text editor or you can use Philter's [API](policies-api.md) to upload a policy. In this case we recommend just using a text editor on the Philter instance to create a policy. When using a text editor to create and edit a policy, be sure to save the policy often. Frequent saving can make editing a policy easier. @@ -33,9 +33,9 @@ We also recommend considering to place your policy directory under source contro Make a copy of the default policy, and we will modify the copy for our needs. -`cp /opt/Philter/policies/default.json /opt/Philter/policies/evaluation.json` +`cp /opt/philter/policies/default.json /opt/philter/policies/evaluation.json` -Now open `/opt/Philter/policies/evaluation.json` in a text editor. 
(The content of `evaluation.json` will be similar to what's shown below but may have minor differences between different versions of Philter.) +Now open `/opt/philter/policies/evaluation.json` in a text editor. (The content of `evaluation.json` will be similar to what's shown below but may have minor differences between different versions of Philter.) ``` { diff --git a/docs/docs/other_features/alerts.md b/docs/docs/other_features/alerts.md index b9e0cde..4a6b583 100644 --- a/docs/docs/other_features/alerts.md +++ b/docs/docs/other_features/alerts.md @@ -1,12 +1,12 @@ # Alerts -Phileas can optionally generate alerts when a particular type of sensitive information is identified. +Philter can optionally generate alerts when a particular type of sensitive information is identified. ### Alert Conditions In a policy, each type of sensitive information can have zero or more filter strategies. Each filter strategy can optionally have a condition associated with it. When a condition is present, the filter strategy will only be satisfied when the condition is satisfied. For example, a condition may be created to only filter phone numbers that start with the digits `123` or only filter names that start with `John`. Filter strategy conditions give you granular control over the filtering process. -When a filter strategy condition is satisfied, Phileas can optionally generate an alert. This feature allows you to be notified when a particular type of sensitive information is identified. +When a filter strategy condition is satisfied, Philter can optionally generate an alert. This feature allows you to be notified when a particular type of sensitive information is identified. ### Enabling Alerts @@ -47,4 +47,4 @@ An alert contains the following information: ### Retrieving and Deleting Alerts -The alerts that Phileas has generated are available through Phileas' [alerts API](alerts-api.md). This API allows for retrieving and deleting alerts. 
Using this API you can build sophisticated notification systems around Phileas' capabilities. +The alerts that Philter has generated are available through Philter's [alerts API](alerts-api.md). This API allows for retrieving and deleting alerts. Using this API you can build sophisticated notification systems around Philter's capabilities. diff --git a/docs/docs/policies/filter_policies.md b/docs/docs/policies/filter_policies.md index ac8ad00..18f369e 100644 --- a/docs/docs/policies/filter_policies.md +++ b/docs/docs/policies/filter_policies.md @@ -1,6 +1,6 @@ # Filter Policies -The types of sensitive information identified by Philter and how that information is de-identified are controlled through policies. A policy is a file stored under Philter’s `policies` directory, which by default is located at `/opt/Philter/policies/`. You can have an unlimited number of policies. +The types of sensitive information identified by Philter and how that information is de-identified are controlled through policies. A policy is a file stored under Philter’s `policies` directory, which by default is located at `/opt/philter/policies/`. You can have an unlimited number of policies. Each policy has a `name` that is used by Philter to apply the appropriate de-identification methods. The `name` is passed to Philter’s [API](../api_and_sdks/api/filtering_api.md) along with the text to be filtered when submitting text to Philter. This provides flexibility and allows you to de-identify different types of documents in differing manners with a single instance of Philter. For example, you may have a policy for bankruptcy documents and a separate policy for financial documents. @@ -51,7 +51,7 @@ The name of the policy is `email-and-phone-numbers`. Policies can be named anyth ### Applying a Policy to Text -To use this policy we will save it as `/opt/Philter/profiles/email-and-phone-numbers.json`. We must restart Philter for the new profile to be available for use. 
To apply the policy we will pass the policy's name to Philter when making a filter request, as shown in the example request below. +To use this policy we will save it as `/opt/philter/profiles/email-and-phone-numbers.json`. We must restart Philter for the new profile to be available for use. To apply the policy we will pass the policy's name to Philter when making a filter request, as shown in the example request below. ``` curl -k -X POST "https://localhost:8080/api/filter?c=context&p=email-and-phone-numbers" \ diff --git a/docs/docs/policies/filters.md b/docs/docs/policies/filters.md index 18c52ac..3268328 100644 --- a/docs/docs/policies/filters.md +++ b/docs/docs/policies/filters.md @@ -1,12 +1,12 @@ # Filters -A "filter" corresponds to a type of sensitive information. Phileas has filters for sensitive information such as names, addresses, ages, and lots of others. +A "filter" corresponds to a type of sensitive information. Philter has filters for sensitive information such as names, addresses, ages, and lots of others. -These are predefined filters that are ready to be used as well as custom filters that let you define your own Phileas to identify sensitive information outside of what the predefined filters can identify. An example of a custom filter is a filter to identify your patient account numbers, where the structure of an account number is specific to your organization. +These are predefined filters that are ready to be used as well as custom filters that let you define your own Philter to identify sensitive information outside of what the predefined filters can identify. An example of a custom filter is a filter to identify your patient account numbers, where the structure of an account number is specific to your organization. Each filter is capable of identifying and redacting a specific type of sensitive information. For example, there is a filter for phone numbers, a filter for US social security numbers, and a filter for person's names. 
You can enable any combination of these filters based on the types of sensitive information you need to redact. -This section of the documentation describes the filters available in Phileas. The configuration options for each filter can vary due to the type of the sensitive information. For instance, only the zip code filter has a configuration to truncate the zip code. +This section of the documentation describes the filters available in Philter. The configuration options for each filter can vary due to the type of the sensitive information. For instance, only the zip code filter has a configuration to truncate the zip code. A selection of filters and their configurations is called a [policy](filter_policies.md). A policy describes how to de-identify a document. @@ -14,7 +14,7 @@ A selection of filters and their configurations is called a [policy](filter_poli ### Person's Names -Phileas uses several methods to identify person's names. +Philter uses several methods to identify person's names. | Type | Description | |-------------------------------------------------------------------------|----------------------------------------------------------------------| @@ -55,9 +55,9 @@ Phileas uses several methods to identify person's names. ## Custom Filter Types of Sensitive Information -In addition to the predefined types of sensitive information listed in the table above, you can also define your own types of sensitive information. Through custom identifiers and dictionaries, Phileas can identify many other types of information that may be sensitive in your use-case. For example, if you have patient identifiers that follow a pattern of `AA-00000` you can define a custom identifier for this sensitive information. +In addition to the predefined types of sensitive information listed in the table above, you can also define your own types of sensitive information. 
Through custom identifiers and dictionaries, Philter can identify many other types of information that may be sensitive in your use-case. For example, if you have patient identifiers that follow a pattern of `AA-00000` you can define a custom identifier for this sensitive information. -Phileas can be configured to look identify sensitive information based on custom dictionaries. When a term in the dictionary is found in the text, Phileas will treat the term as sensitive information and apply the given filter strategy. +Philter can be configured to identify sensitive information based on custom dictionaries. When a term in the dictionary is found in the text, Philter will treat the term as sensitive information and apply the given filter strategy. Custom dictionaries support fuzziness to accommodate for misspellings. The replacement strategy for a custom dictionary has a `sensitivityLevel` that controls the amount of allowed fuzziness. diff --git a/docs/docs/policies/ignoring_specific_information.md b/docs/docs/policies/ignoring_specific_information.md index 24c8e0b..d45392f 100644 --- a/docs/docs/policies/ignoring_specific_information.md +++ b/docs/docs/policies/ignoring_specific_information.md @@ -1,8 +1,8 @@ # Ignoring Specific Information -Phileas can optionally ignore a list of terms and prevent those terms from being redacted. For example, if the name `John Smith` is being redacted and you do not want it to be redacted, you can add `John Smith` to an ignore list. Each time Phileas identifies sensitive information it will check the ignore lists to see if the sensitive information is to be ignored. +Philter can optionally ignore a list of terms and prevent those terms from being redacted. For example, if the name `John Smith` is being redacted and you do not want it to be redacted, you can add `John Smith` to an ignore list. 
Each time Philter identifies sensitive information it will check the ignore lists to see if the sensitive information is to be ignored. -> Phileas can ignore terms and patterns per-policy, meaning each policy can have its own unique list of terms or patterns to ignore. +> Philter can ignore terms and patterns per-policy, meaning each policy can have its own unique list of terms or patterns to ignore. ## Ignore Lists @@ -84,7 +84,7 @@ In the policy shown below, an ignore list is set at the level of a filter. The t ## Ignoring Patterns -Phileas can ignore information based on a regular expression pattern. An example use of this feature is to ignore terms that are present in your text but are dynamic, such as logged timestamps. When using the date filter these timestamps may be identified as being sensitive but you do not want them redacted. With an ignore pattern we can ignore the logged timestamps. +Philter can ignore information based on a regular expression pattern. An example use of this feature is to ignore terms that are present in your text but are dynamic, such as logged timestamps. When using the date filter these timestamps may be identified as being sensitive but you do not want them redacted. With an ignore pattern we can ignore the logged timestamps. ## Ignore Patterns diff --git a/docs/docs/policies/sample_policies.md b/docs/docs/policies/sample_policies.md index 0ef4d1a..cc6c90b 100644 --- a/docs/docs/policies/sample_policies.md +++ b/docs/docs/policies/sample_policies.md @@ -2,9 +2,9 @@ This page lists some sample policies. You can use these policies either as-is or as starting points for customizing them to meet your specific de-identification needs. - + -> These policies are examples and not an exhaustive list of all the sensitive information Phileas can identify. Items from each of these policies can be combined to make policies to meet your use-cases. 
+> These policies are examples and not an exhaustive list of all the sensitive information Philter can identify. Items from each of these policies can be combined to make policies to meet your use-cases. ### Email Addresses and Phone Numbers diff --git a/docs/docs/settings.md b/docs/docs/settings.md index cbbbab4..73af1e3 100644 --- a/docs/docs/settings.md +++ b/docs/docs/settings.md @@ -1,18 +1,18 @@ # Settings -Phileas has settings to control how it operates. The settings and how to configure each are described below. +Philter has settings to control how it operates. The settings and how to configure each are described below. -> The configuration for the types of sensitive information that Phileas identifies are defined in [filter policies](policies/filter_policies.md) outside of Phileas' configuration properties described on this page. +> The configuration for the types of sensitive information that Philter identifies are defined in [filter policies](policies/filter_policies.md) outside of Philter's configuration properties described on this page. -## Configuring Phileas +## Configuring Philter -### The Phileas Settings File +### The Philter Settings File -Phileas looks for its settings in an `application.properties` file. +Philter looks for its settings in an `application.properties` file. ### Using Environment Variables -Properties set via environment variables take precedence over properties set in Phileas' settings file. +Properties set via environment variables take precedence over properties set in Philter's settings file. All following properties can also be set as environment variables by prepending `PHILTER_` to the property name and changing periods to underscores. 
For example, the property `filter.profiles.directory` can be set using the environment variable `PHILTER_FILTER_PROFILES_DIRECTORY` by: @@ -20,7 +20,7 @@ All following properties can also be set as environment variables by prepending export PHILTER_FILTER_PROFILES_DIRECTORY=/profiles/ ``` -Using environment variables to configure Phileas instead of using Phileas' settings file can allow for easier configuration management when deploying Phileas. +Using environment variables to configure Philter instead of using Philter's settings file can allow for easier configuration management when deploying Philter. ## Policies @@ -30,7 +30,7 @@ Using environment variables to configure Phileas instead of using Phileas' setti ## Span Disambiguation -These values configure Phileas' span disambiguation feature to determine the most appropriate type of sensitive information when duplicate spans are identified. In a deployment of multiple Phileas instances, you must enable the cache service for span disambiguation to work as expected. +These values configure Philter's span disambiguation feature to determine the most appropriate type of sensitive information when duplicate spans are identified. In a deployment of multiple Philter instances, you must enable the cache service for span disambiguation to work as expected. | | Description | Allowed Values | Default Value | | ----------------------------- | --------------------------------------------- | --------------- | ------------- | @@ -38,9 +38,9 @@ These values configure Phileas' span disambiguation feature to determine the mos ## Cache Service -The cache service is required to use [consistent anonymization](other_features/consistent_anonymization.md) and policies stored in Amazon S3. Phileas supports Redis as the backend cache. When Redis is not used, an in-memory cache is used instead. The in-memory cache is not recommended because all contents will be stored in memory on the local Phileas instance. 
+The cache service is required to use [consistent anonymization](other_features/consistent_anonymization.md) and policies stored in Amazon S3. Philter supports Redis as the backend cache. When Redis is not used, an in-memory cache is used instead. The in-memory cache is not recommended because all contents will be stored in memory on the local Philter instance. -The cache will contain sensitive information. It is important that you take the necessary precautions to secure the cache itself and all communication between Phileas and the cache. +The cache will contain sensitive information. It is important that you take the necessary precautions to secure the cache itself and all communication between Philter and the cache. | Setting | Description | Allowed Values | Default Value | | ------------------------ | ----------------------------------------------------------------- | ------------------------- | ------------- | diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index d7fc669..bba7437 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -52,6 +52,39 @@ nav: - 'Sample Policies': 'policies/sample_policies.md' - 'Filter Policies': 'policies/filter_policies.md' - 'Filters': 'policies/filters.md' + - 'All Filters': + - 'Common Filters': + - 'Ages': 'policies/filters/common_filters/ages.md' + - 'Bank Routing Numbers': 'policies/filters/common_filters/bank-routing-numbers.md' + - 'Bitcoin Addresses': 'policies/filters/common_filters/bitcoin-addresses.md' + - 'Credit Card Numbers': 'policies/filters/common_filters/creditcards.md' + - 'Dates': 'policies/filters/common_filters/dates.md' + - 'Drivers License Numbers': 'policies/filters/common_filters/drivers-license-numbers.md' + - 'Email Addresses': 'policies/filters/common_filters/email-addresses.md' + - 'IBAN Codes': 'policies/filters/common_filters/iban-codes.md' + - 'IP Addresses': 'policies/filters/common_filters/ip-addresses.md' + - 'MAC Addresses': 'policies/filters/common_filters/mac-addresses.md' + - 'Passport 
Numbers': 'policies/filters/common_filters/passport-numbers.md' + - 'Phone Numbers': 'policies/filters/common_filters/phone-numbers.md' + - 'Phone Number Extensions': 'policies/filters/common_filters/phone-number-extensions.md' + - 'Sections': 'policies/filters/common_filters/sections.md' + - 'SSNs/TINs': 'policies/filters/common_filters/ssns-and-tins.md' + - 'Tracking Numbers': 'policies/filters/common_filters/tracking-numbers.md' + - 'URLs': 'policies/filters/common_filters/urls.md' + - 'VINs': 'policies/filters/common_filters/vins.md' + - 'Zip Codes': 'policies/filters/common_filters/zip-codes.md' + - 'Location Filters': + - 'Cities': 'policies/filters/locations/cities.md' + - 'Counties': 'policies/filters/locations/counties.md' + - 'Hospitals': 'policies/filters/locations/hospitals.md' + - 'Hospital Abbreviations': 'policies/filters/locations/hospital-abbreviations.md' + - 'States': 'policies/filters/locations/states.md' + - 'State Abbreviations': 'policies/filters/locations/state-abbreviations.md' + - 'Persons Names Filters': + - 'First Names': 'policies/filters/persons_names/first-names.md' + - 'Persons Names (NER)': 'policies/filters/persons_names/persons-names-ner.md' + - 'Physician Names (NER)': 'policies/filters/persons_names/physician-names-ner.md' + - 'Surnames': 'policies/filters/persons_names/surnames.md' - 'Filter Strategies': 'policies/filter_strategies.md' - 'Document Analysis': - 'Document Analysis': 'policies/document_analysis.md'