Presidio Image Redactor - improve scalability and design #1049
Hello! I'm trying to get a better understanding of how the image processing flow could happen, from the beginning up to the redaction step. Something along these lines:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


class RecognizerBase(ABC):
    """A class representing an abstract visual objects recognizer."""

    @abstractmethod
    def recognize(self, image: object) -> List["RecognizerResult"]:
        """Recognize visual objects.

        :param image: PIL Image/numpy array to be processed
        :return: List of the recognized objects
        """
        ...


@dataclass
class RecognizerResult:
    """Represents the result of analysing the image with a recognizer."""

    entity_type: str
    recognizer_name: str
    bbox: "Bbox"  # bounding box of the detected object
    text: Optional[str] = None
    polygon: Optional["Polygon"] = None
```

In the text case, for example, the result obtained by TesseractOCR could look like this:

```python
ocr_result = RecognizerResult(
    entity_type="text",
    recognizer_name="TesseractOCR",
    text="My name is John Doe, My phone number is 212-555-5555",
    bbox=...,  # bounding box of the full text region
)

# ...and we get two PII entities from presidio-analyzer:
#   type: PERSON,        start: 11, end: 19, score: 0.85
#   type: PHONE_NUMBER,  start: 40, end: 52, score: 0.75

# Now we need to somehow map them back to the boxes/polygons.
```

And depending on the type of recognizer, the mapping logic may vary (for example, it will be different for TesseractOCR and QR codes). One of the options is to create a separate class for recognizers that deal with text data:

```python
class TextRecognizer(RecognizerBase, ABC):
    @abstractmethod
    def map_pii_to_boxes(
        self, result: List[RecognizerResult], pii: List["PresidioResults"]
    ) -> List[RecognizerResult]:
        """Map PII entities found by presidio-analyzer back to boxes/polygons."""
        ...
```

Or something like that. Do you guys see the image processing flow roughly in the same direction?
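For the TesseractOCR case, the mapping itself could be as simple as intersecting the analyzer's character spans with per-word OCR offsets. Here is a minimal, self-contained sketch; the `WordBox` layout and the helper name are just assumptions for illustration, not existing Presidio types:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordBox:
    """A single OCR word with its character offsets in the full text and its box."""
    text: str
    start: int  # character offset of the word in the concatenated text
    end: int
    bbox: Tuple[int, int, int, int]  # left, top, width, height


def map_pii_to_word_boxes(
    words: List[WordBox], pii_spans: List[Tuple[str, int, int]]
) -> List[Tuple[str, List[Tuple[int, int, int, int]]]]:
    """Map analyzer character spans back to the OCR word boxes they overlap."""
    mapped = []
    for entity_type, start, end in pii_spans:
        boxes = [w.bbox for w in words if w.start < end and w.end > start]
        mapped.append((entity_type, boxes))
    return mapped


# Example: the beginning of "My name is John Doe, ..." split into words
words = [
    WordBox("My", 0, 2, (10, 5, 22, 12)),
    WordBox("name", 3, 7, (36, 5, 40, 12)),
    WordBox("is", 8, 10, (80, 5, 18, 12)),
    WordBox("John", 11, 15, (102, 5, 42, 12)),
    WordBox("Doe,", 16, 20, (148, 5, 40, 12)),
]
# The PERSON span 11..19 overlaps the "John" and "Doe," word boxes
print(map_pii_to_word_boxes(words, [("PERSON", 11, 19)]))
```

Tesseract's word-level output (e.g. `image_to_data`) already gives per-word boxes, so the recognizer would only need to track each word's offset in the concatenated text.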
Context / Problem Statement
The Presidio Image Redactor package was developed as a beta as part of the effort on Presidio V2 in January 2021. It features a simple OCR pipeline which extracts text, parses it, and sends it to the Presidio Analyzer package. Once PII is identified, bounding boxes are matched with `RecognizerResult` objects and those bounding boxes are redacted. Two significant contributions to the package, one focusing on DICOM and the other on QR code scanning, together with the limited accuracy and extensibility of the existing design, make the case for improving the package's performance and its generalizability to new use cases: starting with DICOM and extending to non-textual PII objects such as faces, or semi-textual objects like QR codes and license plate numbers.
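For context, the current usage of the package looks roughly like this, based on the package's documented example (file names are placeholders):

```python
from PIL import Image
from presidio_image_redactor import ImageRedactorEngine

# Load an image containing text with PII (placeholder file name)
image = Image.open("image_with_pii.png")

# OCR -> presidio-analyzer -> match spans to bounding boxes -> fill the boxes
engine = ImageRedactorEngine()
redacted_image = engine.redact(image, (0, 0, 0))
redacted_image.save("image_redacted.png")
```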
This ADR proposes an architecture change to `presidio-image-redactor` to make it more similar to the structure of `presidio-analyzer` and `presidio-anonymizer`, which are used in the text version of Presidio. The proposed high-level flow is described in more detail below.
Like the structure of the `AnalyzerEngine` object, the new `ImageAnalyzerEngine` would contain a list of recognizers, one for each type of detection logic. Each recognizer would contain the logic for its specific kind of detection (see the sketch below).
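As an illustration only, here is a minimal sketch of how such an engine could drive its recognizers, reusing the `RecognizerBase`/`RecognizerResult` sketch from the comment above; the class and method names are assumptions based on this proposal, not the existing API:

```python
from typing import List


class ImageAnalyzerEngine:
    """Runs every registered recognizer against the image and collects the results."""

    def __init__(self, recognizers: List[RecognizerBase]):
        self.recognizers = recognizers

    def analyze(self, image: object) -> List[RecognizerResult]:
        results: List[RecognizerResult] = []
        for recognizer in self.recognizers:
            # Each recognizer (OCR text, QR code, face, ...) contributes its own results
            results.extend(recognizer.recognize(image))
        return results


# Hypothetical usage:
# engine = ImageAnalyzerEngine(recognizers=[TesseractTextRecognizer(), QrCodeRecognizer()])
# results = engine.analyze(image)
```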
The output of the `ImageAnalyzerEngine` would then be passed to the `ImageAnonymizerEngine`, which would expose operators such as `redact` or `validate`, and users would be able to create additional types of operators (see the sketch below).
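For illustration, a custom operator could be as small as the sketch below; the `BlurBoxOperator` class, its `operate` method, and the way the engine would register it are assumptions for the sake of the example, not the existing API:

```python
from typing import List, Tuple

from PIL import Image


class BlurBoxOperator:
    """A hypothetical operator that pixelates bounding boxes instead of filling them."""

    name = "blur"

    def operate(
        self, image: Image.Image, bboxes: List[Tuple[int, int, int, int]]
    ) -> Image.Image:
        result = image.copy()
        for left, top, width, height in bboxes:
            box = (left, top, left + width, top + height)
            # Downscale and upscale the region to obscure its content
            region = result.crop(box).resize((8, 8)).resize((width, height))
            result.paste(region, box)
        return result


# Hypothetical registration and use:
# engine = ImageAnonymizerEngine(operators=[BlurBoxOperator()])
# anonymized = engine.anonymize(image, analyzer_results, operator="blur")
```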
Consequences
Links