Improve non-standard type encoding (#12)
This PR improves the JSON encoding of non-standard types by introducing
and using the `.defaults` module. The `.defaults` module adds helper
functions that can test and apply formatting for types not supported by
a given encoder.

Please note that in doing so, some outputs of the `JsonFormatter` have
changed. That said, these changes return more "reasonable" results than
the original `str(o)` fallback.

For a more detailed list of changes to the encoders, see the CHANGELOG.
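
As a rough illustration of the test-and-apply pattern described above, the sketch below pairs a `use_*` check with a `*_default` formatter inside a `default` callable for the standard library `json.dumps`. The two helpers are re-declared locally (mirroring the ones added in `defaults.py`) so the snippet runs without the package installed. `example_default` is a hypothetical name used only for this example and is not part of the library.

```python
import json
from typing import Any


def use_exception_default(obj: Any) -> bool:
    # Test step: is this a type the formatter below knows how to handle?
    return isinstance(obj, BaseException)


def exception_default(obj: BaseException) -> str:
    # Apply step: "pretty print" as "<ExceptionName>: <message>".
    return f"{obj.__class__.__name__}: {obj}"


def example_default(obj: Any) -> Any:
    # Hypothetical default function: check each supported type in turn,
    # then fall back to str() for anything else.
    if use_exception_default(obj):
        return exception_default(obj)
    return str(obj)


encoded = json.dumps({"error": ValueError("bad value passed")}, default=example_default)
print(encoded)  # {"error": "ValueError: bad value passed"}
```

Each encoder can chain only the checks for types it does not already handle natively, which appears to be why the per-encoder default functions in this commit differ slightly.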

## Test Plan
Added additional tests; the tests now check for specific output.
nhairs authored May 14, 2024
1 parent 59439e9 commit b37c54b
Showing 8 changed files with 327 additions and 112 deletions.
25 changes: 22 additions & 3 deletions CHANGELOG.md
@@ -4,14 +4,27 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [3.1.0.rc1](https://github.com/nhairs/python-json-logger/compare/v3.0.1...v3.1.0.rc1) - 2023-05-03
## [3.1.0.rc2](https://github.com/nhairs/python-json-logger/compare/v3.0.1...v3.1.0.rc2) - 2023-05-03

This splits common functionality out to allow supporting other JSON encoders. Although this is a large refactor, backwards compatibility has been maintained.

### Added
- `.core` - more details below.
- Orjson encoder support via `.orjson.OrjsonFormatter`.
- MsgSpec encoder support via `.msgspec.MsgspecFormatter`.
- `.defaults` module that provides many functions for handling unsupported types.
- Orjson encoder support via `.orjson.OrjsonFormatter` with the following additions:
  - bytes are URL safe base64 encoded.
  - Exceptions are "pretty printed" using the exception name and message e.g. `"ValueError: bad value passed"`
  - Enum values use their value, Enum classes now return all values as a list.
  - Tracebacks are supported
  - Classes (aka types) are supported
  - Will fall back on `__str__` if available, else `__repr__` if available, else will use `__could_not_encode__`
- MsgSpec encoder support via `.msgspec.MsgspecFormatter` with the following additions:
  - Exceptions are "pretty printed" using the exception name and message e.g. `"ValueError: bad value passed"`
  - Enum classes now return all values as a list.
  - Tracebacks are supported
  - Classes (aka types) are supported
  - Will fall back on `__str__` if available, else `__repr__` if available, else will use `__could_not_encode__`
  - Note: msgspec only supports enum values of type `int` or `str` [jcrist/msgspec#680](https://github.com/jcrist/msgspec/issues/680)

### Changed
- `.jsonlogger` has been moved to `.json` with core functionality moved to `.core`.
@@ -21,6 +34,12 @@ This splits common functionality out to allow supporting other JSON encoders. Al
- `style` can now support non-standard arguments by setting `validate` to `False`
- `validate` allows non-standard `style` arguments or prevents calling `validate` on standard `style` arguments.
- `default` is ignored.
- `.json.JsonEncoder` default encodings changed:
  - bytes are URL safe base64 encoded.
  - Exception formatting detected using `BaseException` instead of `Exception`. Now "pretty prints" the exception using the exception name and message e.g. `"ValueError: bad value passed"`
  - Dataclasses are now supported
  - Enum values now use their value, Enum classes now return all values as a list.
  - Will fall back on `__str__` if available, else `__repr__` if available, else will use `__could_not_encode__`

### Deprecated
- `.jsonlogger` is now `.json`
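
To make two of the changelog entries above concrete, here is a small standard-library sketch of what "URL safe base64" and "Enum classes return all values as a list" mean in practice. `Color` and the sample bytes are hypothetical and chosen only for illustration; the snippet does not call the library itself.

```python
import base64
import enum


class Color(enum.Enum):  # hypothetical enum, for illustration only
    RED = "red"
    BLUE = "blue"


raw = b"\xfb\xff"  # sample bytes chosen so the two alphabets differ

# Standard base64 may emit "+" and "/", which need escaping in URLs.
standard = base64.b64encode(raw).decode("utf8")
# The URL safe variant substitutes "-" and "_" for those characters.
url_safe = base64.urlsafe_b64encode(raw).decode("utf8")

# An enum member is encoded as its value...
enum_value = Color.RED.value
# ...while the enum class is encoded as the list of all member values.
enum_class_values = [member.value for member in Color]

print(standard, url_safe, enum_value, enum_class_values)
```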
3 changes: 2 additions & 1 deletion pylintrc
@@ -75,8 +75,9 @@ disable=raw-checker-failed,
# cases. Disable rules that can cause conflicts
line-too-long,
# Module docstrings are not required
missing-module-docstring
missing-module-docstring,
## Project Disables
duplicate-code

# Enable the message, report, category or checker with the given id(s). You can
# either give multiple identifier separated by comma (,) or put this option
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "python-json-logger"
version = "3.1.0.rc1"
version = "3.1.0.rc2"
description = "JSON Log Formatter for the Python Logging Package"
authors = [
{name = "Zakaria Zajac", email = "[email protected]"},
@@ -55,6 +55,8 @@ dev = [
## Test
"pytest",
"freezegun",
"backports.zoneinfo;python_version<'3.9'",
"tzdata",
## Build
"build",
]
146 changes: 146 additions & 0 deletions src/pythonjsonlogger/defaults.py
@@ -0,0 +1,146 @@
# pylint: disable=missing-function-docstring

### IMPORTS
### ============================================================================
## Future
from __future__ import annotations

## Standard Library
import base64
import dataclasses
import datetime
import enum
import sys
from types import TracebackType
from typing import Any
import traceback
import uuid

if sys.version_info >= (3, 10):
    from typing import TypeGuard
else:
    from typing_extensions import TypeGuard

## Installed

## Application


### FUNCTIONS
### ============================================================================
def unknown_default(obj: Any) -> str:
    try:
        return str(obj)
    except Exception:  # pylint: disable=broad-exception-caught
        pass
    try:
        return repr(obj)
    except Exception:  # pylint: disable=broad-exception-caught
        pass
    return "__could_not_encode__"


## Types
## -----------------------------------------------------------------------------
def use_type_default(obj: Any) -> TypeGuard[type]:
    return isinstance(obj, type)


def type_default(obj: type) -> str:
    return obj.__name__


## Dataclasses
## -----------------------------------------------------------------------------
def use_dataclass_default(obj: Any) -> bool:
    return dataclasses.is_dataclass(obj) and not isinstance(obj, type)


def dataclass_default(obj) -> dict[str, Any]:
    return dataclasses.asdict(obj)


## Dates and Times
## -----------------------------------------------------------------------------
def use_time_default(obj: Any) -> TypeGuard[datetime.time]:
    return isinstance(obj, datetime.time)


def time_default(obj: datetime.time) -> str:
    return obj.isoformat()


def use_date_default(obj: Any) -> TypeGuard[datetime.date]:
    return isinstance(obj, datetime.date)


def date_default(obj: datetime.date) -> str:
    return obj.isoformat()


def use_datetime_default(obj: Any) -> TypeGuard[datetime.datetime]:
    return isinstance(obj, datetime.datetime)


def datetime_default(obj: datetime.datetime) -> str:
    return obj.isoformat()


def use_datetime_any(obj: Any) -> TypeGuard[datetime.time | datetime.date | datetime.datetime]:
    return isinstance(obj, (datetime.time, datetime.date, datetime.datetime))


def datetime_any(obj: datetime.time | datetime.date | datetime.datetime) -> str:
    return obj.isoformat()


## Exception and Tracebacks
## -----------------------------------------------------------------------------
def use_exception_default(obj: Any) -> TypeGuard[BaseException]:
    return isinstance(obj, BaseException)


def exception_default(obj: BaseException) -> str:
    return f"{obj.__class__.__name__}: {obj}"


def use_traceback_default(obj: Any) -> TypeGuard[TracebackType]:
    return isinstance(obj, TracebackType)


def traceback_default(obj: TracebackType) -> str:
    return "".join(traceback.format_tb(obj)).strip()


## Enums
## -----------------------------------------------------------------------------
def use_enum_default(obj: Any) -> TypeGuard[enum.Enum | enum.EnumMeta]:
    return isinstance(obj, (enum.Enum, enum.EnumMeta))


def enum_default(obj: enum.Enum | enum.EnumMeta) -> Any | list[Any]:
    if isinstance(obj, enum.Enum):
        return obj.value
    return [e.value for e in obj]  # type: ignore[var-annotated]


## UUIDs
## -----------------------------------------------------------------------------
def use_uuid_default(obj: Any) -> TypeGuard[uuid.UUID]:
    return isinstance(obj, uuid.UUID)


def uuid_default(obj: uuid.UUID) -> str:
    return str(obj)


## Bytes
## -----------------------------------------------------------------------------
def use_bytes_default(obj: Any) -> TypeGuard[bytes | bytearray]:
    return isinstance(obj, (bytes, bytearray))


def bytes_default(obj: bytes | bytearray, url_safe: bool = True) -> str:
    if url_safe:
        return base64.urlsafe_b64encode(obj).decode("utf8")
    return base64.b64encode(obj).decode("utf8")
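
The fallback chain in `unknown_default` above (`str`, then `repr`, then a fixed placeholder) can be exercised with a short sketch. The function body is copied from the diff so the snippet runs without the package installed; `BrokenStr` and `BrokenBoth` are hypothetical classes invented only to trigger each branch.

```python
from typing import Any


def unknown_default(obj: Any) -> str:
    # Copied from defaults.py in this diff: try str(), then repr(),
    # then give up and return a fixed marker string.
    try:
        return str(obj)
    except Exception:
        pass
    try:
        return repr(obj)
    except Exception:
        pass
    return "__could_not_encode__"


class BrokenStr:  # hypothetical: __str__ fails, __repr__ works
    def __str__(self) -> str:
        raise RuntimeError("no str")

    def __repr__(self) -> str:
        return "<BrokenStr>"


class BrokenBoth:  # hypothetical: both __str__ and __repr__ fail
    def __str__(self) -> str:
        raise RuntimeError("no str")

    def __repr__(self) -> str:
        raise RuntimeError("no repr")


results = [
    unknown_default(42),
    unknown_default(BrokenStr()),
    unknown_default(BrokenBoth()),
]
print(results)
```

Because the function always returns a string, a log call should not fail just because one field could not be encoded.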
39 changes: 22 additions & 17 deletions src/pythonjsonlogger/json.py
@@ -10,15 +10,14 @@
from __future__ import annotations

## Standard Library
from datetime import date, datetime, time
from inspect import istraceback
import datetime
import json
import traceback
from typing import Any, Callable, Optional, Union
import warnings

## Application
from . import core
from . import defaults as d


### CLASSES
@@ -31,33 +30,39 @@ class JsonEncoder(json.JSONEncoder):
    """

    def default(self, o: Any) -> Any:
        if isinstance(o, (date, datetime, time)):
        if d.use_datetime_any(o):
            return self.format_datetime_obj(o)

        if istraceback(o):
            return "".join(traceback.format_tb(o)).strip()
        if d.use_exception_default(o):
            return d.exception_default(o)

        # pylint: disable=unidiomatic-typecheck
        if type(o) == Exception or isinstance(o, Exception) or type(o) == type:
            return str(o)
        if d.use_traceback_default(o):
            return d.traceback_default(o)

        if d.use_enum_default(o):
            return d.enum_default(o)

        if d.use_bytes_default(o):
            return d.bytes_default(o)

        if d.use_dataclass_default(o):
            return d.dataclass_default(o)

        if d.use_type_default(o):
            return d.type_default(o)

        try:
            return super().default(o)

        except TypeError:
            try:
                return str(o)

            except Exception:  # pylint: disable=broad-exception-caught
                return None
            return d.unknown_default(o)

    def format_datetime_obj(self, o):
    def format_datetime_obj(self, o: datetime.time | datetime.date | datetime.datetime) -> str:
        """Format datetime objects found in self.default
        This allows subclasses to change the datetime format without understanding the
        internals of the default method.
        """
        return o.isoformat()
        return d.datetime_any(o)


class JsonFormatter(core.BaseJsonFormatter):
19 changes: 18 additions & 1 deletion src/pythonjsonlogger/msgspec.py
@@ -4,12 +4,29 @@
from __future__ import annotations

## Standard Library
from typing import Any

## Installed
import msgspec.json

## Application
from . import core
from . import defaults as d


### FUNCTIONS
### ============================================================================
def msgspec_default(obj: Any) -> Any:
    """msgspec default encoder function for non-standard types"""
    if d.use_exception_default(obj):
        return d.exception_default(obj)
    if d.use_traceback_default(obj):
        return d.traceback_default(obj)
    if d.use_enum_default(obj):
        return d.enum_default(obj)
    if d.use_type_default(obj):
        return d.type_default(obj)
    return d.unknown_default(obj)


### CLASSES
@@ -25,7 +42,7 @@ class MsgspecFormatter(core.BaseJsonFormatter):
    def __init__(
        self,
        *args,
        json_default: core.OptionalCallableOrStr = None,
        json_default: core.OptionalCallableOrStr = msgspec_default,
        **kwargs,
    ) -> None:
        """
21 changes: 20 additions & 1 deletion src/pythonjsonlogger/orjson.py
@@ -4,12 +4,31 @@
from __future__ import annotations

## Standard Library
from typing import Any

## Installed
import orjson

## Application
from . import core
from . import defaults as d


### FUNCTIONS
### ============================================================================
def orjson_default(obj: Any) -> Any:
    """orjson default encoder function for non-standard types"""
    if d.use_exception_default(obj):
        return d.exception_default(obj)
    if d.use_traceback_default(obj):
        return d.traceback_default(obj)
    if d.use_bytes_default(obj):
        return d.bytes_default(obj)
    if d.use_enum_default(obj):
        return d.enum_default(obj)
    if d.use_type_default(obj):
        return d.type_default(obj)
    return d.unknown_default(obj)


### CLASSES
@@ -25,7 +44,7 @@ class OrjsonFormatter(core.BaseJsonFormatter):
    def __init__(
        self,
        *args,
        json_default: core.OptionalCallableOrStr = orjson_default,
        json_default: core.OptionalCallableOrStr = None,
        json_indent: bool = False,
        **kwargs,
    ) -> None: