Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
201ce14
commit 96ff2d1
Showing
93 changed files
with
1,541 additions
and
107 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# Phileas Documentation | ||
# Philter Documentation | ||
|
||
The documentation files here are markdown files used by MkDocs. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# API | ||
|
||
Philter's API is divided into three parts: the [Filtering API](api/filtering_api.md), the [Policies API](api/policies_api.md), and the [Alerts API](api/alerts_api.md). | ||
|
||
* [Filtering API](api/filtering_api.md) - The filtering API is used to redact text. With the filtering API, you can send text or PDF documents to Philter and receive back the redacted text or PDF document. | ||
* [Policies API](api/policies_api.md) - The policies API allows you to create, modify, and delete [policies](../policies/filter_policies.md). Policies can also be created manually with access to Philter, but the API provides a programmatic way to manage policies. | ||
* [Alerts API](api/alerts_api.md) - The alerts API allows you to get and delete [alerts](../other_features/alerts.md) that were generated during redaction. | ||
|
||
The Philter [SDKs](sdks.md) provide convenient methods for calling Philter's API from several programming languages. | ||
|
||
## Securing Philter's API | ||
|
||
Philter's API supports one-way and two-way SSL/TLS authentication. See the [settings](../settings.md) for more information. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Alerts API | ||
|
||
The Alerts API provides endpoints for retrieving and deleting alerts. Alerts can optionally be generated when a filter strategy's condition is met. See [Alerts](alerts.md) for more information on Philter alerts. | ||
|
||
> The `curl` example commands on this page assume that Philter has SSL enabled and is using a self-signed certificate. If launched from a cloud marketplace, SSL will be enabled automatically with a self-signed SSL certificate. See the [SSL/TLS](settings.md#ssl-tls) settings for more information. | ||
{style="note"} | ||
|
||
## Get Alerts | ||
|
||
| Method | Endpoint | Description | | ||
| ------ | -------- | ----------- | | ||
| `GET` | `/api/alerts` | Get alerts. | | ||
|
||
Example request: | ||
|
||
``` | ||
curl -k https://localhost:8080/api/alerts | ||
``` | ||
|
||
## Delete an Alert | ||
|
||
| Method | Endpoint | Description | | ||
|----------|-------------------------|--------------------------------------------------------------------| | ||
| `DELETE` | `/api/alerts/{alertId}` | Delete an alert, where `alertId` is the ID of the alert to delete. | | ||
|
||
Example request to delete an alert with ID `12345`: | ||
|
||
``` | ||
curl -k -X DELETE https://localhost:8080/api/alerts/12345 | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
# Filtering API | ||
|
||
Philter’s filtering API provides access to Philter’s ability to filter sensitive information from text and to retrieve the health status of Philter. | ||
|
||
> The `curl` example commands on this page assume that Philter has SSL enabled and is using a self-signed certificate. If launched from a cloud marketplace, SSL will be enabled automatically with a self-signed SSL certificate. See the [SSL/TLS](settings.md#ssl-tls) settings for more information. | ||
{style="note"} | ||
|
||
Each filter request can optionally have a `context`. When not provided, the context defaults to `none`. Contexts provide a means for logically grouping your documents during filtering. For example, documents pertaining to one health care provider may be submitted under the context `hospital1`, and documents pertaining to another health care provider may be submitted under the context `hospital2`. | ||
|
||
The context for each filter request impacts how sensitive information is replaced when found in the text. [Consistent anonymization](anonymization.md) can be enabled at either the context or document level. When enabled at the context level, all instances of a given piece of sensitive information will be replaced consistently by the same value. This allows for maintaining meaning across all documents in the context. | ||
|
||
Each filter request submitted to Philter is automatically assigned a document identifier. The document identifier is an alphanumeric value unique to that request. No two documents should be assigned the same document identifier. The document identifier is returned in the `x-document-id` header with each `filter` or `explain` API response. | ||
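A client will often want to record the document identifier alongside the redacted output. The following is a minimal Python sketch of reading it from a set of response headers. The function name and the dictionary of example headers are illustrative assumptions; only the `x-document-id` header name comes from this page.

```python
from typing import Mapping, Optional


def document_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the value of the x-document-id header, or None if it is absent.

    HTTP header names are case-insensitive, so names are compared in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == "x-document-id":
            return value
    return None


# A plain dict stands in for the headers of a real filter or explain response.
example_headers = {
    "Content-Type": "text/plain",
    "X-Document-Id": "7a906866-4fc9-44d6-9bc3-22728b93a602",
}
print(document_id_from_headers(example_headers))
```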
|
||
## Filter | ||
|
||
The `filter` endpoint receives plain text or a PDF document and returns the redacted text or redacted PDF document. | ||
|
||
The types of sensitive information found and how each type is redacted are determined by the chosen policy. | ||
|
||
| Method | Endpoint | Description | | ||
|--------|---------------|------------------------| | ||
| `POST` | `/api/filter` | Filter the given text. | | ||
|
||
### Query Parameters | ||
|
||
* `d` - A document ID that uniquely identifies the text being submitted. Leave empty and Philter will generate a document ID derived from a hash of the submitted text. | ||
* `p` - The name of the policy to use for filtering. Defaults to `default` if not provided. | ||
* `c` - The filtering context. Defaults to `none` if not provided. | ||
|
||
### Headers | ||
|
||
* `Content-Type` - The value should be set to `text/plain` or `application/pdf`. | ||
|
||
Example request to filter plain text: | ||
|
||
``` | ||
curl -k -X POST "https://localhost:8080/api/filter" --data-binary @file.txt -H "Content-Type: text/plain" | ||
``` | ||
|
||
Example request to filter a PDF document: | ||
|
||
``` | ||
curl -k -X POST "https://localhost:8080/api/filter" --data-binary @file.pdf -H "Content-Type: application/pdf" -o redacted.zip | ||
``` | ||
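The query parameters and `Content-Type` header described above can also be assembled programmatically. The sketch below uses only the Python standard library and builds the request without sending it. The function name `build_filter_request` and the sample text are assumptions made for illustration; the endpoint, the `d`, `p`, and `c` parameters, and the header come from this page.

```python
from typing import Optional
from urllib.parse import urlencode
import urllib.request


def build_filter_request(
    base_url: str,
    body: bytes,
    content_type: str = "text/plain",
    policy: Optional[str] = None,
    context: Optional[str] = None,
    document_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/filter endpoint."""
    # Only include the query parameters the caller actually supplied, so that
    # Philter's documented defaults apply to the rest.
    params = {}
    if document_id:
        params["d"] = document_id
    if policy:
        params["p"] = policy
    if context:
        params["c"] = context
    url = base_url.rstrip("/") + "/api/filter"
    if params:
        url += "?" + urlencode(params)
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


request = build_filter_request(
    "https://localhost:8080",
    b"George Washington was a patient.",
    policy="default",
    context="hospital1",
)
print(request.get_method(), request.full_url)
```

To send the request to a server with a self-signed certificate, the call would also need an SSL context that skips certificate verification, which is what `curl -k` does in the examples above.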
|
||
## Explain | ||
|
||
The `explain` endpoint behaves much like the `filter` endpoint in that it receives plain text and returns the redacted plain text. However, the `explain` endpoint also provides a detailed explanation describing how the text was redacted. Unlike `filter`, the `explain` endpoint does not support PDF documents. | ||
|
||
The types of sensitive information found and how each type is redacted are determined by the chosen policy. | ||
|
||
| Method | Endpoint | Description | | ||
|--------|----------------|-----------------------------------------------------------| | ||
| `POST` | `/api/explain` | Filter the given text and provide a detailed explanation. | | ||
|
||
### Query Parameters | ||
|
||
* `d` - A document ID that uniquely identifies the text being submitted. Leave empty and Philter will generate a document ID derived from a hash of the submitted text. | ||
* `p` - The name of the policy to use for filtering. Defaults to `default` if not provided. | ||
* `c` - The filtering context. Defaults to `none` if not provided. | ||
|
||
### Headers | ||
|
||
* `Content-Type` - The value should be set to `text/plain`. | ||
|
||
Example explain request: | ||
|
||
``` | ||
curl -k -X POST "https://localhost:8080/api/explain" --data-binary @file.txt -H "Content-Type: text/plain" | ||
``` | ||
|
||
Example explain response: | ||
|
||
``` | ||
{ | ||
"filteredText": "{{{REDACTED-entity}}} was a patient and his ssn was {{{REDACTED-ssn}}}.", | ||
"context": "none", | ||
"documentId": "7a906866-4fc9-44d6-9bc3-22728b93a602", | ||
"explanation": { | ||
"appliedSpans": [ | ||
{ | ||
"id": "c78fb69c-84d6-4189-b376-63791793cbd2", | ||
"characterStart": 0, | ||
"characterEnd": 17, | ||
"filterType": "NER_ENTITY", | ||
"context": "C1", | ||
"documentId": "7a906866-4fc9-44d6-9bc3-22728b93a602", | ||
"confidence": 0.9189682900905609, | ||
"text": "George Washington", | ||
"replacement": "{{{REDACTED-entity}}}", | ||
"ignored": false | ||
}, | ||
{ | ||
"id": "f4556f62-2f80-4edc-96f0-aa1d44802157", | ||
"characterStart": 48, | ||
"characterEnd": 59, | ||
"filterType": "SSN", | ||
"context": "C1", | ||
"documentId": "7a906866-4fc9-44d6-9bc3-22728b93a602", | ||
"confidence": 1, | ||
"text": "123-45-6789", | ||
"replacement": "{{{REDACTED-ssn}}}", | ||
"ignored": false | ||
} | ||
], | ||
"ignoredSpans": [] | ||
} | ||
} | ||
``` | ||
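One way to read the explanation is to check that the applied spans account for the difference between the input and `filteredText`. The Python sketch below does this for an abridged copy of the example response. It rests on two assumptions: the original input text is reconstructed here from the span texts and offsets, and `characterEnd` is treated as exclusive, which is consistent with the offsets in the example.

```python
import json

# Abridged copy of the example explain response.
explain_response = json.loads("""
{
  "filteredText": "{{{REDACTED-entity}}} was a patient and his ssn was {{{REDACTED-ssn}}}.",
  "explanation": {
    "appliedSpans": [
      {"characterStart": 0, "characterEnd": 17, "filterType": "NER_ENTITY",
       "text": "George Washington", "replacement": "{{{REDACTED-entity}}}"},
      {"characterStart": 48, "characterEnd": 59, "filterType": "SSN",
       "text": "123-45-6789", "replacement": "{{{REDACTED-ssn}}}"}
    ],
    "ignoredSpans": []
  }
}
""")

# Assumption: the input text, reconstructed from the span texts and offsets.
original_text = "George Washington was a patient and his ssn was 123-45-6789."


def apply_spans(text, spans):
    """Replace each [characterStart, characterEnd) range with its replacement."""
    # Work from the end of the text so that earlier offsets stay valid.
    for span in sorted(spans, key=lambda s: s["characterStart"], reverse=True):
        start, end = span["characterStart"], span["characterEnd"]
        text = text[:start] + span["replacement"] + text[end:]
    return text


spans = explain_response["explanation"]["appliedSpans"]

# Each span's text should equal the slice of the original text that it covers.
spans_match = all(
    original_text[s["characterStart"]:s["characterEnd"]] == s["text"] for s in spans
)

rebuilt = apply_spans(original_text, spans)
print(spans_match, rebuilt == explain_response["filteredText"])
```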
|
||
## Status | ||
|
||
The `status` endpoint is useful in determining the current state of Philter. The `status` endpoint can be used by monitoring software to assess Philter's availability or by your cloud provider for purposes of determining Philter's health when deployed behind a load balancer. | ||
|
||
| Method | Endpoint | Description | | ||
|--------|---------------|-----------------------------| | ||
| `GET` | `/api/status` | Gets the status of Philter. | | ||
|
||
Example request: | ||
|
||
``` | ||
curl -k "https://localhost:8080/api/status" | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# Policies API | ||
|
||
The Policies API provides endpoints for retrieving, uploading, and deleting [policies](policies_README.md). | ||
|
||
> The `curl` example commands on this page assume that Philter has SSL enabled and is using a self-signed certificate. If launched from a cloud marketplace, SSL will be enabled automatically with a self-signed SSL certificate. See the [SSL/TLS](settings.md#ssl-tls) settings for more information. | ||
{style="note"} | ||
|
||
## Get Policy Names | ||
|
||
| Method | Endpoint | Description | | ||
| ------ |-----------------|--------------------------------| | ||
| `GET` | `/api/policies` | Get the names of all policies. | | ||
|
||
|
||
Example request: | ||
|
||
``` | ||
curl -k https://localhost:8080/api/policies | ||
``` | ||
|
||
## Get a Policy | ||
|
||
| Method | Endpoint | Description | | ||
| ------ |------------------------------|-----------------------------------------------------------------------------------| | ||
| `GET` | `/api/policies/{policyName}` | Get the content of a policy, where {policyName} is the name of the policy to get. | | ||
|
||
Example request: | ||
|
||
``` | ||
curl -k https://localhost:8080/api/policies/my-policy | ||
``` | ||
|
||
Example response: | ||
|
||
``` | ||
{ | ||
"name": "just-phone-numbers", | ||
"ignored": [ | ||
], | ||
"identifiers": { | ||
"dictionaries": [ | ||
], | ||
"phoneNumber": { | ||
"phoneNumberFilterStrategies": [ | ||
{ | ||
"strategy": "REDACT", | ||
"redactionFormat": "{{{REDACTED-%t}}}" | ||
} | ||
] | ||
} | ||
} | ||
} | ||
``` | ||
|
||
## Upload a Policy | ||
|
||
| Method | Endpoint | Description | | ||
| ------ |------------------------------|-----------------------------------------------------------------------------------| | ||
| `PUT` | `/api/policies/{policyName}` | Upload a policy, where {policyName} is the name of the policy to upload. If a policy with this name already exists, it will be overwritten. | | ||
|
||
Example request: | ||
|
||
``` | ||
curl -X PUT -H "Content-Type: application/json" -k https://localhost:8080/api/policies/my-policy -d @policy.json | ||
``` | ||
|
||
## Delete a Policy | ||
|
||
| Method | Endpoint | Description | | ||
|----------|------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------| | ||
| `DELETE` | `/api/policies/{policyName}` | Delete a policy, where {policyName} is the name of the policy to delete. | | ||
|
||
Example request: | ||
|
||
``` | ||
curl -X DELETE -k https://localhost:8080/api/policies/exprofile | ||
``` |
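The upload endpoint can be exercised from code as well as from `curl`. The following Python sketch builds, but does not send, a `PUT` request for the policy shown in the "Get a Policy" example response. The function name `build_upload_request` is hypothetical, and using the policy's `name` field as the `{policyName}` path segment is an assumption.

```python
import json
import urllib.request
from urllib.parse import quote

# The policy from the "Get a Policy" example response.
policy = {
    "name": "just-phone-numbers",
    "ignored": [],
    "identifiers": {
        "dictionaries": [],
        "phoneNumber": {
            "phoneNumberFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        },
    },
}


def build_upload_request(base_url, policy):
    """Build (but do not send) a PUT request that uploads the given policy."""
    # Assumption: the policy's own name is used as the {policyName} segment.
    url = base_url.rstrip("/") + "/api/policies/" + quote(policy["name"], safe="")
    return urllib.request.Request(
        url,
        data=json.dumps(policy).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_upload_request("https://localhost:8080", policy)
print(request.get_method(), request.full_url)
```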
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Client SDKs | ||
|
||
Philter SDKs are available for use in your projects. The SDKs are licensed under the Apache License, version 2.0. Refer to the GitHub project for your language of choice below for usage examples. | ||
|
||
* [Java](https://github.com/philterd/philter-sdk-java) | ||
* [.NET](https://github.com/philterd/philter-sdk-net) | ||
* [Go](https://github.com/philterd/philter-sdk-golang) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# De-identification Methods | ||
|
||
There are several ways data can be de-identified, and which you use depends on the types of data you want to de-identify and your use case for de-identifying the data. The names of the different methods are often used interchangeably, but the methods themselves differ. | ||
|
||
> In this User's Guide, we may use the terms `filter` and `redact` interchangeably. | ||
|
||
In Philter, de-identification methods vary for each type of sensitive information. For example, all types can be replaced or redacted, but only dates can be shifted and only zip codes can be truncated. How a de-identification method is applied by Philter is called a filter strategy. Each type of sensitive information can have one or more filter strategies, and the combination of the filter strategies you select is called a policy. A policy determines how a document will be de-identified. | ||
|
||
The following is a list of de-identification methods that describes how each method works and its applicability to Philter. De-identifying a document is likely to require a combination of the following methods. For instance, you may want to redact names, encrypt credit card numbers, and shift appointment dates. | ||
|
||
## Summary of De-identification Methods | ||
|
||
<table><thead><tr><th width="268">De-identification Method</th><th>Description</th></tr></thead><tbody><tr><td>Replacement</td><td>Replaces sensitive information with a defined value. For example, you might want to replace a credit card number with the literal value "CREDIT_CARD_NUMBER".</td></tr><tr><td>Redaction and Masking</td><td>Removes sensitive information. Philter gives you a choice of how to remove the sensitive information, whether it is by replacing it with ***** (masking) or by some other set of characters.</td></tr><tr><td>Encryption</td><td>Encrypts sensitive information.</td></tr><tr><td>Date Shifting</td><td>Shifts dates either forward or backward by some interval.</td></tr><tr><td>Bucketing</td><td>Categorizes data into buckets based on the data. For example, Philter can bucket dates into years and zip codes by population.</td></tr></tbody></table> | ||
|
||
> A difference between [Philter](https://philterd.ai/philter/) and other services is that Philter does not send your data to a third party for de-identification. Philter runs in your cloud and your data stays in your cloud. | ||
## De-identification Methods | ||
|
||
### Redaction and Masking | ||
|
||
Redaction and masking are two methods of de-identification that are often used interchangeably. The term redaction refers to removing a sensitive value from a document. When we hear the term redaction we often think of an image of a document with black bars across pieces of the text. | ||
|
||
Masking is similar to redaction but allows for configuring how the sensitive value is removed. The most common example is using asterisks (e.g., \*\*\*\*\*\*) in place of a sensitive value. | ||
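As a generic illustration of masking, and not a description of how Philter implements it, the Python sketch below overwrites a character range with a configurable mask character. The function name, the sample sentence, and the choice to preserve the original length are all assumptions made for the example.

```python
def mask(text: str, start: int, end: int, mask_char: str = "*") -> str:
    """Replace text[start:end] with the same number of mask characters."""
    return text[:start] + mask_char * (end - start) + text[end:]


# Mask the 11-character value that begins at offset 12.
masked = mask("His ssn was 123-45-6789.", 12, 23)
print(masked)
```

Keeping the masked value the same length as the original preserves the layout of the text, but it also reveals the length of the sensitive value; a fixed-width mask avoids that.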
|
||
### Replacement | ||
|
||
Replacement is a method of de-identification that simply replaces a sensitive value with another value. Replacement is useful when the sensitive value is not needed once the document has been de-identified. Philter can replace a sensitive value with a preset value or with a random value. | ||
|
||
In Philter's filter strategies, replacement is achieved by using the `REDACT`, `STATIC_REPLACE`, or `RANDOM_REPLACE` strategy. | ||
|
||
### Bucketing | ||
|
||
### Date Shifting | ||
|
||
### Encryption |