Please use the origin
repository for internal development, syncronize only the main and develop branches between origin
and github
repos.
-
origin: Internal remote repository
origin [email protected]:metascan/mdcl-menlo-middleware.git
-
github: Public remote repository
github [email protected]:OPSWAT/metadefender-menlo-integration.git
# clone repo from bitbucket
git clone [email protected]:metascan/mdcl-menlo-middleware.git
# change directory to project root
cd mdcl-menlo-middleware
# add public remote repo
git remote add github [email protected]:OPSWAT/metadefender-menlo-integration.git
# init git-flow
git flow init
# [gitflow "branch"]
# master = main
# develop = develop
# [gitflow "prefix"]
# feature = feature/
# bugfix = bugfix/
# release = release/
# hotfix = hotfix/
# support = support/
# versiontag =
# init python virtual env
python3 -m venv .venv
# activate virtual env
source .venv/bin/activate
# install requirements
pip install -r requirements.txt
Use git-flow workflow for development
Make sure you have python3.5
or above installed.
This Middleware leverages python's async mechanism introduced in python3.5
Before you run it, install all dependencies in a virtual env:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
To run the app:
python3 -m metadefender_menlo
Middleware should be configured using the config file:
service: metadefender # MetaDefender Core service is the only current integration allowed
api:
type: core # mandatory, should be either `core` or `cloud`
params:
apikey: null # optional for MetaDefender Core (in case is configured to request it), but `mandatory for MetaDefender Cloud`
url:
core: https://localhost:8008 # MetaDefender Core URL - can be HTTP or HTTPS (certificate validation is disabled)
cloud: https://api.metadefender.com/v4 # MetaDefender Cloud URL - this doesn't need to be updated, will be consumed when api type is `cloud`
server:
port: 3000
host: "0.0.0.0"
api_version: /api/v1
https:
load_local: false # enable https by using locally stored certs
crt: /tmp/certs/server.crt # location of the public key
key: /tmp/certs/server.key # location of the private key
logging:
enabled: true # enable or disable logging.
level: INFO # select from (NOTSET, DEBUG, INFO, WARNING, ERROR, CRITICAL) (see https://docs.python.org/3/library/logging.html#levels)
logfile: /var/log/md-menlo/app.log # absolute path to the logfile. If path doesn't exist will be created (make sure the user has the right permissions)
interval: 24 # the interval (in hours) for log rotation
backup_count: 30 # for how many intervals should the logs be kept (e.g. 30 logs for 24h each -> 30 days logs)
Menlo requires all communication to be done over https, so either deploy an reverse proxy (nginx) in front of it to handle SSL or use the configuration https
in the config.yml
.
You can add self-signed X509 certs, but have to be signed by a (custom) Certificate Authority. The rootCA public key will be required to be uploaded to Menlo Security Web Policy configuration.
The location can be altered also by modifiying metadefender_menlo/__main__.py
at lines 22-23.
To change the environemnt, update .env
file.
# Use `local` for local development.
# Use `dev` for testing the deployment on https://menlo-dev.metadefender.com
# Use `prod` for deploying a new version to prod env: https://menlo.metadefender.com
ENVIRONMENT=<env>
Follow the next steps after the env is set.
# 1. Go to the kubernetes folder
cd kubernetes
# 2. Build a new image version.
# set different <version> for each deployment,
# you can append a letter to the current version number while testing. e.g. 1.1.0-c
./deploy.aws.sh build_image 1.1.0-c
# 3.a Login to AWS ECR. Once in a while if your session expires
./deploy.aws.sh ecr_login
# 3. Push image to remote
./deploy.aws.sh push_image 1.1.0-c
# 4. Deploy the new image to the dev cluster
./deploy.aws.sh apply_deployment
# Check the new deployment on
https://menlo-dev.metadefender.com
Some Bitbucket pipelines have been configured in order for everybody to easily build and deploy the Menlo middleware to any environment from the same place (Bitbucket pipelines that trigger cloud agents), thus eliminating potential deviations in the build/deploy script or others (OS, software etc.). Plus, this way all builds/deploys can be tracked.
There are 3 pipelines:
- branches: develop - automatically triggered to build/deploy to dev when there is a merge to develop
- branches: main - automatically triggered to build/deploy to prod when there is a merge to main
- custom: manual - manually triggered pipeline to deploy to dev from any branch other than develop/main
All of the environment variables used in the scripts can be viewed and configured in Bitbucket->Repository Settings->Deployments OR Repository variables
For local deployments, use the gitignored .env
file which will be sourced in the script.
First, you’re required to build a container. There’s a Dockerfile
in the repo, that you can use to build the container and push it to your registry. Once you have it, you will have to modify deployment.yaml and specify your own container image.
Also, check the deploy.sh
script. It was built specifically for Google Cloud to leverage GKE (Google Kubernetes Engine). But it can be easily adapted to run in any Kubernetes supported environment.
For AWS deployment please check deploy.aws.sh
.
The GCP specific part is building the cluster and (if needed) the static IP.
Modify the deploy script to use your own cluster and certificates as needed. If you don’t require self-signed certs, you can remove the entire portion with openssl and jump directly to adding your certificates as Kubernetes secrets.
Check deployment.yaml
and place the real apikey
(if you plan to use MetaDefender Cloud), or just remove that environment variable entirely if it is not going to be used. You can also set the apikey in the config file, if you prefer to have it hardcoded in the container image, instead of passing it as an environment variable.
containers:
- name: "metadefender-menlo-integration"
image: "<<middleware-container>>" # Build your own container using the Dockerfile provided in the repository
env:
- name: "apikey"
value: "<<MetaDefender-Cloud-apikey>>" # Set your private apikey if you will integrate with MetaDefender Cloud
This document describes the integration of Menlo Security with MetaDefender Core and Cloud, using the current middleware
Menlo Security has the ability to call an external API (called Sanitization API), which allows to offload the file analysis to MetaDefender and retrieve the sanitized file once is available.
The Sanitization API specification includes:
Submit metadata
- not implemented in this Middleware to maintain a stateless systemCheck Existing Report
- SHA256 lookupFile Submission
Analysis Report
- Menlo will keep polling this endpoint until the status is completedRetrieve Sanitized file
The middleware is using python3
and requires minimum python3.5 to run.
A list of all libraries are included in requirements.txt
The application can run on the same VM as MetaDefender Core or a separate system. Can also run in a Docker container, a Dockerfile is made available.
First, clone this repository:
git clone https://github.com/georgeprichici/metadefender-menlo-integration.git
Second, install all dependencies:
pip install -r requirements.txt
Middleware should be configured using the config file.
Make sure you place the certs in the right folder (certs
) and under the right name, since Menlo requires all communication to be done over https.
service: metadefender
api:
type: cloud
params:
apikey: null
url:
core: https://1.2.3.4:8008
cloud: https://api.metadefender.com/v4
server:
port: 3000
host: "0.0.0.0"
api_version: /api/v1
https:
load_local: false
crt: /tmp/certs/server.crt
key: /tmp/certs/server.key
logging:
enabled: true
level: INFO
logfile: /var/log/md-menlo/app.log
interval: 24
backup_count: 30
See the Middleware Documentation for details.
Environment variables (will overwrite some of the config.yml values)
MENLO_MD_ENV --> env # environment: local, dev, qa, prod, etc.
MENLO_MD_AWS_REGION --> region # aws region
MENLO_MD_BITBUCKET_COMMIT_HASH --> commitHash # git commit hash for versioning
MENLO_MD_MDCLOUD_RULE --> scanRule # MD Cloud Scan Workflow
MENLO_MD_APIKEY --> config.yml:api.params.apikey # MD Cloud or MD API Key
MENLO_MD_URL --> config.yml:api.url[api.type] # MD Cloud or MD Url
MENLO_MD_SENTRY_DSN --> sentryDsn # Sentry endpoint if enabled
MENLO_MD_KAFKA_ENABLED --> config.logging.kafka.enabled # enable writing logs to kafka
MENLO_MD_KAFKA_CLIENT_ID --> config.logging.kafka.client_id # kafka client id
MENLO_MD_KAFKA_SERVER --> config.logging.kafka.server # list of kafka brokers
MENLO_MD_KAFKA_TOPIC --> config.logging.kafka.topic # kafka topic
MENLO_MD_KAFKA_SSL --> config.logging.kafka.ssl # use ssl or not
MENLO_MD_SNS_ENABLED --> config.logging.sns.enabled # enable sns notifications for errors
MENLO_MD_SNS_ARN --> config.logging.sns.arn # sns topic arn
MENLO_MD_SNS_REGION --> config.logging.sns.region # sns topic region
MENLO_MD_FALLBACK_TO_ORIGINAL
Overwrites: config.yml:fallbackToOriginal
Description: Send 204 to Menlo when the sanitized version is not available
Values: true | false
- Login to Menlo Admin console (
https://admin.menlosecurity.com
) - Go to Web Policy > Content Inspection
- Edit
Menlo File REST API
- Add the following details:
- Plugin Name:
MetaDefender Core
- Plugin Description:
MetaDefender Core API Integration
- Base URL: The url of the middleware (e.g.
https://1.2.3.4:3000
)⚠️ HTTPS is required!
- Certificate: the certificate (X.509) used for the Middleware
- This will be used for certificate validation
- Type of transfers:
- Enable Downloads if MetaDefender is used to scan and sanitize downloaded files
- Enable Uploads if MetaDefender will be used to also redact uploaded files
- Authorization Header:
- For MetaDefender Core: you can put any dummy data, since this header will be ignored on the Middleware side
- For MetaDefender Cloud:
- you can choose to input here the apikey instead in the config file
- if so, you'll need to enable the functionality
- uncomment the prepare function in api.handlers.base_handler.py
- The main benefit is the ability to switch keys in Menlo admin console, maybe use different keys for different departments/groups.
- Hash Check: leave it unchecked, to force a file analysis and sanitization on every file download
- Metadata Check: leave it unchecked since is not supported in this integration
- Allow File Replacement: Enable this functionality if you plan to use MetaDefender to sanitize the downloaded files or redact uploaded files
- Plugin Name:
- Navigate to
https://safe.menlosecurity.com
- Search for a PDF
- Try to download the PDF
- You should see the File Download request in the Admin Console (Logs > Web Logs)
- Click on the table entry and you'll see the analyis details on the right side.
SERVICE > TYPE > DETAILS
SERVICE
- MenloPlugin
- MetaDefenderCloud
TYPE:
- Request
- Response
- Internal
DETAILS:
- <JSON Object>
- <Order>
Exapmple:
MenloPlugin > Request > {method: GET, endpoint: "/api/v1/file/<dataId>"}
MetadeDefenderCloud > Request > {https://api.metadefender.com/v4/file/converted/<dataId> | {'apikey': <apikey>, 'x-forwarded-for': '81.196.34.198', 'x-real-ip': '81.196.34.198'}}
MetadeDefenderCloud > Response > {status_500}
MeloPlugin > Internal > "Error from md cloud"
MenloPlugin > Response > {500 response: sanitized file (binary data)}
To generate the api documentation from menlo-sanitization-openapi.yaml run the below command: redocly build-docs ./api/menlo-sanitization-openapi.yaml --output './docs/Menlo Sanitization API.html'