Web Application Firewall country blacklist heuristic. #8

Open
wants to merge 25 commits into
base: develop
25 commits
2f35bc0
Initial WIP country blacklist structure.
amcgregor Dec 19, 2020
f231f04
Add request de-serialization protections.
amcgregor Dec 19, 2020
1647e14
Improved hinting, now storing packed IPs.
amcgregor Dec 19, 2020
643b9b7
Correct missing import, pre-trigger cached de-serialization operation…
amcgregor Dec 20, 2020
b177978
Additional optional installation USE flags.
amcgregor Dec 20, 2020
48d6dca
Document addition of WAF extension.
amcgregor Dec 20, 2020
dfc0069
Add API definition for persistent blacklists.
amcgregor Dec 20, 2020
5644df9
Permit persistence of the blacklist and exemptions.
amcgregor Dec 20, 2020
c5f3a99
Imports, docstring.
amcgregor Dec 20, 2020
5c1ffcc
Deserialization errors are better handled in core at time of collect …
amcgregor Dec 20, 2020
309e450
Initial WIP country blacklist structure.
amcgregor Dec 19, 2020
5ffe0f4
Sorting of example countries.
amcgregor Dec 19, 2020
1ac34bb
Merge; careful wording.
amcgregor Dec 20, 2020
7b9e49f
Register the WAF heuristics as plugins.
amcgregor Dec 20, 2020
6dd43d5
Use correct code, we are not a client ourselves.
amcgregor Dec 20, 2020
71c27b8
Pass client IP down to heuristics, prime query string arguments.
amcgregor Dec 20, 2020
86592c0
Heuristics are now passed the client IP.
amcgregor Dec 20, 2020
7b5c734
IP2Location utilization.
amcgregor Dec 20, 2020
72d19e1
Ban-by-country implementation.
amcgregor Dec 20, 2020
7460ee3
Hosting combined heuristic default extensions.
amcgregor Dec 20, 2020
5a8ba97
Who needs a temporary variable?
amcgregor Dec 20, 2020
182dbe6
Additional example country, short and long name for logs.
amcgregor Dec 27, 2020
011cbe3
Adjustments to logging levels and extras.
amcgregor Dec 27, 2020
a2807c0
Blacklist serialization and deserialization.
amcgregor Feb 18, 2021
19e498d
Correction for escaping within a quoted string, additional example ge…
amcgregor Feb 18, 2021
24 changes: 24 additions & 0 deletions README.rst
Expand Up @@ -68,6 +68,27 @@ and submit a pull request. This process is beyond the scope of this documentati
`GitHub's documentation <http://help.github.com/>`_.


Installation "Use" Flags
------------------------

Several `extras_require` dependency sets are declared to bundle the installation of tools needed only for optional
features, not for basic usage. To use these flags, append them, comma-separated and within square brackets, to the
project name or on-disk project location when executing `pip install`:

pip install -U -e '.[development,geo]'

Quoting will be required in most shells, as square brackets would ordinarily be "expanded".

* `development` — Install a standard suite of development-time support packages, testing framework, and testing components.

* `ecdsa` — Require an efficient ECDSA implementation for use of Elliptic Curve signing operations.

* `geo` — This project utilizes IP2Location LITE data available from http://www.ip2location.com to blacklist users by
country of origin. Enabling this flag installs the official `IP2Location` library; the dataset itself
must be downloaded separately.
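Because the `geo` extra is optional, code that depends on it may want to check for the library before enabling country lookups. A minimal sketch, assuming the installed module is importable as `IP2Location` (matching the distribution name declared in `setup.py`); the helper name `has_optional` is hypothetical and not part of this project:

```python
from importlib.util import find_spec


def has_optional(module_name: str) -> bool:
    """Return True if the named top-level module can be imported."""
    return find_spec(module_name) is not None


# Assumption: the `geo` extra provides a module importable as "IP2Location".
GEO_AVAILABLE = has_optional("IP2Location")
```

A country heuristic could then be registered only when `GEO_AVAILABLE` is true, rather than failing at import time.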



Version History
===============

Expand All @@ -78,6 +99,9 @@ Version 3.0

* **Removed Python 2 support and version specific code.** The project has been updated to modern Python packaging standards, including modern namespace use. Modern namespaces are wholly incompatible with the previous namespacing mechanism; this project can not be simultaneously installed with any Marrow project that is Python 2 compatible.

* **Added Web Application Firewall extension.** Protects your application against passive scanning attempts, requests for tools written in languages that are not present (e.g. PHP, ColdFusion, Adobe Flex, …), and malicious probes, and can restrict access by geographic location.


Version 2.0
-----------

15 changes: 12 additions & 3 deletions setup.py
Expand Up @@ -80,9 +80,10 @@
],

extras_require = dict(
development = tests_require + ['pre-commit'],
ecdsa = ['ecdsa'],
fastecdsa = ['fastecdsa>=1.0.3'],
development = tests_require + ['pre-commit', 'bandit', 'e', 'pudb', 'ptipython'],
ecdsa = ['fastecdsa>=1.0.3'],
fastecdsa = ['fastecdsa>=1.0.3'], # Deprecated reference.
geo = ['IP2Location'],
),

tests_require = tests_require,
Expand All @@ -101,5 +102,13 @@
'matches = web.security.predicate:ContextMatch',
'contains = web.security.predicate:ContextContains',
],
'web.security.heuristic': [
'dns = web.security.waf:ClientDNSHeuristic',
'path = web.security.waf:PathHeuristic',
'php = web.security.waf:PHPHeuristic',
'wordpress = web.security.waf:WordpressHeuristic',
'hosting = web.security.waf:HostingCombinedHeuristic',
'country = web.security.waf:GeoCountryHeuristic',
]
},
)
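The `web.security.heuristic` group registered above can be enumerated with the standard library. The following is a sketch of one way to do that, not necessarily how the framework itself loads plugins; the function name `discover_heuristics` is hypothetical:

```python
from importlib.metadata import entry_points


def discover_heuristics(group: str = "web.security.heuristic") -> dict:
    """Map plugin names to their (not yet loaded) entry points for a group."""
    found = entry_points()

    if hasattr(found, "select"):  # Python 3.10+ selection interface.
        selected = found.select(group=group)
    else:  # Python 3.8/3.9 return a plain mapping of group name to entries.
        selected = found.get(group, ())

    return {ep.name: ep for ep in selected}
```

Calling `.load()` on an entry imports the target, for example `web.security.waf:GeoCountryHeuristic` for the `country` key, so it only succeeds once the package is installed.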
22 changes: 11 additions & 11 deletions web/ext/acl.py
Expand Up @@ -249,7 +249,7 @@ def __init__(self, *_policy, default=None, policy=None):
def prepare(self, context):
"""Called to prepare the request context by adding an `acl` attribute."""

if __debug__: log.debug("Populating request context with ACL.", extra=dict(request=id(context)))
if __debug__: log.trace("Populating request context with ACL.", extra=context.extra)

context.acl = ACL(context=context, policy=self.policy)

Expand All @@ -262,24 +262,24 @@ def dispatch(self, context, crumb):
acl = getattr(crumb.handler, '__acl__', ())
inherit = getattr(crumb.handler, '__acl_inherit__', True)

if __debug__: log.debug(f"Handling dispatch event: {crumb.handler!r} {acl!r}", extra=dict(
request = id(context),
consumed = crumb.path,
handler = safe_name(crumb.handler),
endpoint = crumb.endpoint,
acl = [repr(i) for i in acl],
inherit = inherit,
))
if __debug__: log.trace(f"Handling dispatch event: {crumb.handler!r} {acl!r}", extra={
'consumed': crumb.path,
'handler': safe_name(crumb.handler),
'endpoint': crumb.endpoint,
'acl': [repr(i) for i in acl],
'inherit': inherit,
**context.extra
})

if not inherit:
if __debug__: log.info("Clearing collected access control list.")
if __debug__: log.warning("Clearing collected access control list.")
del context.acl[:]

context.acl.extend((Path(context.request.path), i, handler) for i in acl)

def collect(self, context, handler, args, kw):
if not context.acl:
if __debug__: log.debug("Skipping validation of empty ACL.", extra=dict(request=id(context)))
if __debug__: log.debug("Skipping validation of empty ACL.", extra=context.extra)
return

grant = context.acl.is_authorized
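The calls to `log.trace` in this file are not part of the standard `logging.Logger` API, so the project presumably registers that method elsewhere. As an illustration only, and assuming a level value of 5 (the diff does not state one), a trace level and the merging of a `context.extra` style mapping into the record could look like this:

```python
import logging

TRACE = 5  # Assumed numeric level, below DEBUG (10).
logging.addLevelName(TRACE, "TRACE")


def trace(self, message, *args, **kwargs):
    """Log a message at the hypothetical TRACE level."""
    if self.isEnabledFor(TRACE):
        self._log(TRACE, message, args, **kwargs)


logging.Logger.trace = trace  # Attach the method to every logger.

log = logging.getLogger("acl.sketch")
log.setLevel(TRACE)

base = {"request": 1234}  # Stand-in for `context.extra`.

# Keys from both mappings become attributes on the resulting log record.
log.trace("Handling dispatch event.", extra={"consumed": "/admin", **base})
```

Unpacking the shared mapping last, as the diff does, means its keys win if a name collides with one of the per-call keys.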
108 changes: 85 additions & 23 deletions web/ext/waf.py
Expand Up @@ -10,13 +10,19 @@
* https://www.cloudflare.com/en-ca/waf/
"""

from abc import ABCMeta, abstractmethod
from html import escape
from pathlib import Path
from re import compile as re
from socket import inet_aton

from typeguard import check_argument_types
from uri import URI
from webob import Request
from webob.exc import HTTPBadRequest

from web.core.typing import Any, Dict, Union, Callable, ClassVar, Path, Set, Pattern, Iterable, MutableSet, Optional
from web.core.typing import Any, Union, Callable, ClassVar, Generator, Iterable, Optional
from web.core.typing import Dict, Path, Set, Pattern, MutableSet
from web.core.typing import Context, WSGI, WSGIEnvironment, WSGIStartResponse, Request, Response, Tags
from web.core.context import Context
from web.security.waf import WAFHeuristic
Expand All @@ -26,15 +32,55 @@
log = __import__('logging').getLogger(__name__) # A standard logger object.


ClientSet = MutableSet[str]
ClientSet = MutableSet[bytes]

class PersistentClientSet(ClientSet, metaclass=ABCMeta):
"""An ABC describing a mutable set that exposes methods for persisting and restoring its contents."""

@abstractmethod
def persist(self, context:Context) -> None:
"""Persist the state of the set.

It is up to the individual implementation to decide how to do this. Typically this would involve serialization
on-disk or the use of some form of data store, such as SQLite, PostgreSQL, or MongoDB.
"""

raise NotImplementedError()

@abstractmethod
def restore(self, context:Context) -> None:
"""Restore the state of the set.

It is up to the individual implementation to decide how to do this. Typically this involves deserialization
from disk or the use of some form of data store, such as SQLite, PostgreSQL, or MongoDB.
"""

raise NotImplementedError()

class WebApplicationFirewallExtension:
"""A basic rules-based Web Application Firewall implementation.

class LineSerializedSet(set, PersistentClientSet):
location:Path # The target path to read and write data from/to.

def __init__(self, *args, location:Union[str,Path]):
self.location = Path(location)

WIP.
"""
def persist(self, context:Context) -> None:
with self.location.open('w') as fh:
for element in sorted(self):
fh.write(str(element) + "\n")

def restore(self, context:Context) -> None:
self.clear()

with self.location.open('r') as fh:
for line in fh.readlines():
self.add(int(line.strip()))
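Since `ClientSet` holds packed `bytes`, `str(element)` writes a `bytes` repr and `int(line.strip())` will not read it back. Assuming the intent is to round-trip IPv4 addresses through a text file, one possible variant is sketched below; the class name `PackedAddressSet` is hypothetical and it omits the ABC for brevity:

```python
from pathlib import Path
from socket import inet_aton, inet_ntoa


class PackedAddressSet(set):
    """A set of packed IPv4 addresses stored one dotted-quad per line."""

    def __init__(self, *args, location):
        super().__init__(*args)  # Initialize the underlying set contents.
        self.location = Path(location)

    def persist(self, context=None) -> None:
        with self.location.open('w') as fh:
            for element in sorted(self):
                fh.write(inet_ntoa(element) + "\n")  # Packed bytes to text.

    def restore(self, context=None) -> None:
        self.clear()

        if not self.location.exists():
            return  # Nothing saved yet; start empty.

        with self.location.open('r') as fh:
            for line in fh:
                line = line.strip()
                if line:
                    self.add(inet_aton(line))  # Text back to packed bytes.
```

Storing the human-readable form keeps the file easy to inspect and edit by hand, at the cost of a conversion on load and save.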


class WebApplicationFirewallExtension:
"""A basic rules-based Web Application Firewall implementation."""

uses:ClassVar[Tags] = {'timing.prefix'} # We want our execution time to be counted.
provides:ClassVar[Tags] = {'waf'} # A set of keywords usable in `uses` and `needs` declarations.
first:ClassVar[bool] = True # Always try to be first: if truthy, become a dependency for all non-first extensions.
extensions:ClassVar[Tags] = {'waf.rule'} # A set of entry_point namespaces to search for related plugin registrations.
Expand All @@ -56,45 +102,52 @@ def __init__(self, *heuristics, blacklist:Optional[ClientSet]=None, exempt:Optio
super().__init__()

self.heuristics = heuristics
self.blacklist = set() if blacklist is None else blacklist # Permit custom backing stores to be passed in.
self.exempt = set() if exempt is None else exempt # Permit custom backing stores to be passed in.

# Permit custom backing stores to be passed in; we optimize by storing packed binary values, not strings.
self.blacklist = set() if blacklist is None else blacklist.__class__(inet_aton(i) for i in blacklist)

# Permit custom backing stores to be passed in for the exemptions, as well.
self.exempt = set() if exempt is None else exempt
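`inet_aton` handles IPv4 only, so an IPv6 client address would raise an error at the membership test. If IPv6 clients are in scope (an assumption, the diff does not say), the standard `ipaddress` module can pack both families. The helper name `pack_address` is hypothetical:

```python
from ipaddress import ip_address
from socket import inet_aton


def pack_address(address: str) -> bytes:
    """Pack an IPv4 (4 byte) or IPv6 (16 byte) address string."""
    return ip_address(address).packed


blacklist = {pack_address("203.0.113.7")}

# For dotted-quad IPv4 input the result matches what inet_aton produces.
assert pack_address("203.0.113.7") == inet_aton("203.0.113.7")
assert pack_address("203.0.113.7") in blacklist
```

Note that `ip_address` is stricter than `inet_aton`, which also accepts shorthand such as `127.1`. Separately, rebuilding the store with `blacklist.__class__(...)` passes only an iterable, so a backing class whose constructor requires extra keyword arguments, such as `location`, would likely need different handling.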

def __call__(self, context:Context, app:WSGI) -> WSGI:
"""Wrap the WSGI application callable in our 'web application firewall'."""

assert check_argument_types()

def inner(environ:WSGIEnvironment, start_response:WSGIStartResponse):
# Identify the remote user.
try:
request: Request = Request(environ) # This will be remembered and re-used as a singleton later.
uri: URI = URI(request.url)
request.GET # As will this "attempt to access query string parameters", malformation detection.

request: Request = Request(environ)
uri: URI = URI(request.url)
except Exception as e: # Protect against de-serialization errors.
return HTTPBadRequest(f"Encountered error de-serializing the request: {e!r}")(environ, start_response)

# https://docs.pylonsproject.org/projects/webob/en/stable/api/request.html#webob.request.BaseRequest.client_addr
# Ref: https://www.nginx.com/resources/wiki/start/topics/examples/forwarded/
client: str = request.client_addr

try:
# Immediately reject known bad actors.
if request.client_addr in self.blacklist:
if inet_aton(request.client_addr) in self.blacklist:
return HTTPClose()(environ, start_response) # No need to re-blacklist.

# Validate the heuristic rules.
for heuristic in self.heuristics:
try:
heuristic(environ, uri)
heuristic(environ, uri, client)
except HTTPClose as e:
log.error(f"{heuristic} {e.args[0].lower()}")
raise

# Invoke the wrapped application if everything seems OK. Note that this pattern of wrapping permits
# your application to raise HTTPClose if wishing to blacklist the active connection.
# your application to raise HTTPClose if wishing to blacklist the active connection for any reason.
return app(environ, start_response)

except HTTPClose as e:
if request.client_addr not in self.exempt:
log.warning(f"Blacklisting: {request.client_addr}")
self.blacklist.add(request.client_addr)
self.blacklist.add(inet_aton(request.client_addr))

if not __debug__: e = HTTPClose() # Do not disclose the reason in production environments.
elif ': ' in e.args[0]: # XXX: Not currently effective.
Expand All @@ -112,32 +165,41 @@ def start(self, context: Context) -> None:

Any of the actions you wanted to perform during `__init__` you should do here.
"""
...


# Permit the storage objects to resume from a saved state.
if hasattr(self.blacklist, 'restore'): self.blacklist.restore(context)
if hasattr(self.exempt, 'restore'): self.exempt.restore(context)

def stop(self, context: Context) -> None:
"""Executed during application shutdown after the last request has been served.

The first argument is the global context class, not request-local context instance.
"""
...

# As per startup, permit the storage objects to persist their state.
if hasattr(self.blacklist, 'persist'): self.blacklist.persist(context)
if hasattr(self.exempt, 'persist'): self.exempt.persist(context)

def graceful(self, context: Context, **config) -> None:
def graceful(self, context: Context) -> None:
"""Called when a SIGHUP is sent to the application.

The first argument is the global context class, not request-local context instance.

Allows your code to re-load configuration and your code should close then re-open sockets and files.
"""
...

# Ask the storage object to persist its state, if able.
if hasattr(self.blacklist, 'persist'): self.blacklist.persist(context)
if hasattr(self.exempt, 'persist'): self.exempt.persist(context)

def status(self, context: Context) -> None:
def status(self, context: Context) -> Generator[str, None, None]:
"""Report on the current status of the Web Application Firewall."""

def plural(quantity, single, plural):
return single if quantity == 1 else plural

c = len(self.heuristics)
yield f"**Rules:** {c} {plural(c, 'entry', 'entries')}"
yield f"Rules: {c} {plural(c, 'entry', 'entries')}"

c = len(self.blacklist)
yield f"**Blacklist:** {c} {plural(c, 'entry', 'entries')}"
yield f"Blacklist: {c} {plural(c, 'entry', 'entries')}"
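With this change heuristics are invoked as `heuristic(environ, uri, client)` and signal rejection by raising `HTTPClose`. The `GeoCountryHeuristic` itself is not shown in this diff, so the following is only a sketch of a callable with that shape. The `lookup` function is injected rather than tied to any particular geolocation library, and `HTTPClose` is a local stand-in for `web.security.exc.HTTPClose`:

```python
class HTTPClose(Exception):
    """Stand-in for web.security.exc.HTTPClose, for illustration only."""


class CountryBanSketch:
    """Reject clients whose looked-up country code is in a banned set."""

    def __init__(self, lookup, banned):
        self.lookup = lookup  # Hypothetical callable: client IP -> country code or None.
        self.banned = frozenset(code.upper() for code in banned)

    def __call__(self, environ, uri, client):
        country = self.lookup(client)

        if country and country.upper() in self.banned:
            raise HTTPClose(f"Access from banned country: {country}")


# Usage with a fixed table standing in for a real geolocation database.
table = {"203.0.113.7": "XX"}
heuristic = CountryBanSketch(table.get, banned={"xx"})
heuristic({}, "/", "198.51.100.1")  # Unknown client: returns without raising.
```

Injecting the lookup keeps the rule testable without the dataset, which the README notes must be downloaded separately.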
4 changes: 2 additions & 2 deletions web/security/exc.py
Expand Up @@ -4,6 +4,6 @@
class HTTPClose(HTTPClientError):
"""Indicate to the front-end load balancer (FELB) that it should hang up on the client."""

code = 499
title = "Client Closed Request"
code = 444
title = "Connection Closed Without Response"
explanation = "The server did not accept your request."