Revise the class-based interface #27

Open
saxbophone opened this issue May 2, 2018 · 8 comments
Labels
enhancement v1 This issue should be closed before version 1 is released

Comments

@saxbophone
Owner

saxbophone commented May 2, 2018

This needs tidying up; there are two or three routes I can go down:

  • Remove the need to instantiate an encoder class by making the methods @classmethod
  • Change the paradigm to having the encoding settings set at object construction (decided against)
  • Use class inheritance far more to allow customisation at the inheritance level, using mixin classes e.g. there might be a StreamEncoder base class and a MappingEncoder mixin class. This would probably require deprecating the functional interface entirely.
@saxbophone
Owner Author

saxbophone commented May 2, 2018

Illustrations of how certain approaches might look to the user:

class Base64EncoderBase(BaseEncoder):
    input_base = 256
    output_base = 64
    input_ratio = 3
    output_ratio = 4

class Base64MappingMixin(MappingEncoderMixin, Base64EncoderBase):
    input_mapping = [chr(x) for x in range(256)]
    output_mapping = [chr(x) for x in range(0x21, 0x21 + 64)]
    padding_character = '='

class Base64StreamingEncoder(Base64MappingMixin, StreamEncoder):
    pass

class Base64Encoder(Base64MappingMixin, Encoder):
    pass

@saxbophone
Owner Author

Other things to consider:

  • Automatically generating the input/output ratio for a given base pair and the maximum number of symbols desired in a chunk.
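
A minimal sketch of how such a ratio could be derived, assuming "best" means the least wasted output capacity. The name `best_ratio` and its parameters are hypothetical and not part of the library. Efficiency is proportional to input symbols divided by output symbols, so exact fractions can be compared without floating point:

```python
from fractions import Fraction


def best_ratio(input_base, output_base, max_symbols):
    """
    Return (input_symbols, output_symbols) for the most efficient chunking
    in which neither side exceeds max_symbols, or None if nothing fits.
    Both bases are assumed to be integers >= 2.
    """
    best = None
    for n_in in range(1, max_symbols + 1):
        # smallest n_out whose capacity covers n_in input symbols
        needed = input_base ** n_in
        n_out = 1
        while output_base ** n_out < needed:
            n_out += 1
        if n_out > max_symbols:
            break  # n_out never shrinks as n_in grows, so stop here
        # a higher n_in / n_out means less wasted space; ties keep the
        # smaller chunk found first
        if best is None or Fraction(n_in, n_out) > Fraction(*best):
            best = (n_in, n_out)
    return best


print(best_ratio(256, 64, 4))  # (3, 4)
print(best_ratio(256, 85, 5))  # (4, 5)
```

With a limit of four symbols this reproduces the familiar 3:4 ratio of base64, and with a limit of five it gives the 4:5 ratio used by base 85.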

@saxbophone saxbophone added the v1 This issue should be closed before version 1 is released label Oct 19, 2018
@saxbophone
Owner Author

I've been doing some more thinking about this and here are some of my thoughts:

  • Perhaps create an EncoderTemplate class, which all custom encoders should inherit from to produce a generic encoder template for that class, for instance:
    • the user creates a class Base64EncoderTemplate, which inherits from EncoderTemplate and sets some class variables defining how base64 is encoded
    • the user is then able to create derivative mixin classes or subclasses of Base64EncoderTemplate as required (perhaps these need to be created manually, or could be produced via class methods provided in EncoderTemplate); for example, from this class we might then derive the subclasses Base64StreamEncoder, Base64Encoder or Base64RawEncoder.
  • Allow post-processing and pre-processing of the input and output for encoding and decoding operations. This will be needed for some encodings such as ASCII85 (base 85 for PDF), which IIRC uses some special run-length-encoding on runs of multiple zeros in the output. The easiest way to provide support for this in the library is probably to provide some stubbed methods that inheritors can override as needed (e.g. EncoderTemplate().encoder_pre_process(), EncoderTemplate().encoder_post_process(), EncoderTemplate().decoder_pre_process(), etc...)
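
A rough sketch of how the stubbed hooks might look, under stated assumptions: `EncoderTemplate.encode_core` and the `Ascii85Sketch` class are hypothetical names, only the encoding-side hooks are shown, and padding of a trailing partial group is left out. The post-processing hook applies the Ascii85 shorthand of writing an all-zero group as `z`:

```python
class EncoderTemplate(object):
    """Hypothetical template: identity hooks wrapped around a core encoder."""

    @classmethod
    def encoder_pre_process(cls, data):
        return data  # stub: subclasses override as needed

    @classmethod
    def encoder_post_process(cls, groups):
        return groups  # stub: subclasses override as needed

    @classmethod
    def encode_core(cls, data):
        raise NotImplementedError()

    @classmethod
    def encode(cls, data):
        data = cls.encoder_pre_process(data)
        return cls.encoder_post_process(cls.encode_core(data))


class Ascii85Sketch(EncoderTemplate):
    @classmethod
    def encode_core(cls, data):
        # plain base-256 to base-85 conversion, 4 bytes -> 5 characters
        if len(data) % 4:
            raise ValueError('sketch only handles whole 4-byte groups')
        groups = []
        for start in range(0, len(data), 4):
            value = int.from_bytes(data[start:start + 4], 'big')
            chars = []
            for _ in range(5):
                value, digit = divmod(value, 85)
                chars.append(chr(33 + digit))
            groups.append(''.join(reversed(chars)))
        return groups

    @classmethod
    def encoder_post_process(cls, groups):
        # an all-zero group ('!!!!!') is abbreviated to 'z'
        return ['z' if group == '!!!!!' else group for group in groups]


print(Ascii85Sketch.encode(b'\x00\x00\x00\x00test'))  # ['z', 'FCfN8']
```

Keeping the hooks as identity functions by default means encodings without special cases (such as base64) would not need to override anything.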

@saxbophone
Owner Author

I am inclined to deprecate the functional interface of the library if, once I begin implementing my new ideas, it becomes easier to simply integrate its functionality into the class-based one.

However, if the functions are left in, they will not be deprecated. Any deprecation, if needed, should be done by v1.0.0 (breaking change).

@saxbophone
Owner Author

Another thought: make all methods @classmethod. Encoder classes don't really need to be instantiated, as they don't need any state. Inheritance is useful for layering up their behaviour.

@saxbophone
Owner Author

Base classes needed:

  • Class for raw encoding
    • Class for mapped encoding
    • Class for non-generator encoding (all other classes will work as generators by default)

Overrideable, customisable and hookable attributes and methods required in these classes:

  • class attributes for the basic details of the encoding configuration:
    • input/output symbol ratio
    • input base
    • output base
  • well-defined public generator class methods which do the main encoding and decoding (for extending as necessary)
  • well-defined public class methods which do the main encoding and decoding (for non-generator class variants and for extension as required)

@saxbophone
Owner Author

saxbophone commented Nov 3, 2018

Example:

class RawStreamingEncoder(object):
    """
    Base class for generator-based encoders which operate on raw ints
    """
    @classmethod
    def encode(cls, data):
        # NOTE: should be implemented in real life!
        # this method will be a generator which implements most of the
        # encoding work, yielding raw ints in the output base
        raise NotImplementedError()

    @classmethod
    def decode(cls, data):
        # NOTE: should be implemented in real life!
        # this method will be a generator which implements most of the
        # decoding work, yielding raw ints in the input base
        raise NotImplementedError()


class MappedStreamingEncoder(RawStreamingEncoder):
    """
    Base class for generator-based encoders which operate on mapped symbols
    """
    @classmethod
    def encode(cls, data):
        # wrap generator with another generator, one which maps the symbols
        for symbol in super(MappedStreamingEncoder, cls).encode(data):
            # call the hook on cls so that subclasses can supply or override it
            yield cls.encoding_mapping_operation(symbol)

    @classmethod
    def decode(cls, data):
        # wrap generator with another generator, one which maps the symbols
        for symbol in super(MappedStreamingEncoder, cls).decode(data):
            # call the hook on cls so that subclasses can supply or override it
            yield cls.decoding_mapping_operation(symbol)


class RawEncoder(RawStreamingEncoder):
    """
    Base class for encoders which return a list of raw ints
    """
    @classmethod
    def encode(cls, data):
        # convert generator into list
        return list(super(RawEncoder, cls).encode(data))

    @classmethod
    def decode(cls, data):
        # convert generator into list
        return list(super(RawEncoder, cls).decode(data))


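class Encoder(RawEncoder, MappedStreamingEncoder):
    """
    Base class for encoders which return a list of symbols
    """
    # RawEncoder comes first in the MRO so that its list conversion wraps
    # the mapping generator from MappedStreamingEncoder
    pass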


class TypedEncoder(Encoder):
    """
    Base class for encoders which return symbols coerced to a custom type
    (for example, outputting a string rather than a list of bytes)
    """
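    @classmethod
    def encode(cls, data):
        # encode via the parent classes, then coerce the result
        # (coerce_input is a hook expected to be supplied by subclasses)
        return cls.coerce_input(super(TypedEncoder, cls).encode(data))

    @classmethod
    def decode(cls, data):
        # decode via the parent classes, then coerce the result
        # (coerce_output is a hook expected to be supplied by subclasses)
        return cls.coerce_output(super(TypedEncoder, cls).decode(data))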


class Base64RawStreamingEncoder(RawStreamingEncoder):
    input_base = 256
    output_base = 64
    encoding_ratio = (3, 4)


class Base64StreamingEncoder(Base64RawStreamingEncoder, MappedStreamingEncoder):
    input_alphabet = list(range(256))
    output_alphabet = list(
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
    )
    padding = '='


class Base64Encoder(Base64StreamingEncoder, Encoder):
    pass
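
For illustration, the `encode` stub in `RawStreamingEncoder` could be filled in roughly as follows. This is only a sketch under stated assumptions: the helper `_convert_chunk` is a hypothetical name, and padding is not handled, so the input must be a whole number of chunks.

```python
class RawStreamingEncoder(object):
    input_base = None
    output_base = None
    encoding_ratio = None  # (input symbols, output symbols) per chunk

    @classmethod
    def encode(cls, data):
        # generator: consume input ints chunk by chunk, yield output ints
        n_in, n_out = cls.encoding_ratio
        chunk = []
        for symbol in data:
            chunk.append(symbol)
            if len(chunk) == n_in:
                for digit in cls._convert_chunk(chunk, n_out):
                    yield digit
                chunk = []
        if chunk:
            raise ValueError('input is not a whole number of chunks')

    @classmethod
    def _convert_chunk(cls, chunk, n_out):
        # read the chunk as one big number in the input base...
        value = 0
        for symbol in chunk:
            value = value * cls.input_base + symbol
        # ...then write it out as n_out digits in the output base
        digits = []
        for _ in range(n_out):
            value, digit = divmod(value, cls.output_base)
            digits.append(digit)
        return reversed(digits)


class Base64RawStreamingEncoder(RawStreamingEncoder):
    input_base = 256
    output_base = 64
    encoding_ratio = (3, 4)


print(list(Base64RawStreamingEncoder.encode(b'Man')))  # [19, 22, 5, 46]
```

The four ints are the positions of `T`, `W`, `F` and `u` in the base64 alphabet, matching the usual encoding of `Man` as `TWFu`.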

@saxbophone
Owner Author

saxbophone commented Nov 3, 2018

Maybe we need EncoderTemplate classes too, as the above can be a bit unwieldy if one wants to instantiate multiple different combinations of encoder types from one template (e.g. both mapped and un-mapped, generator and non-generator base64).
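
One way this could look, as a sketch only: the template holds the configuration and a hypothetical `variant` class method assembles each combination with `type()`. All names here (`MappedMixin`, `ListMixin`, `variant`, `HexDigitTemplate`) are assumptions for illustration, and the raw encoder is a stand-in that passes ints through unchanged.

```python
class RawStreamingEncoder(object):
    @classmethod
    def encode(cls, data):
        # stand-in for real base conversion: yields the input ints unchanged
        for value in data:
            yield value


class MappedMixin(object):
    @classmethod
    def encode(cls, data):
        # wrap the next encoder in the MRO, mapping ints to symbols
        for value in super(MappedMixin, cls).encode(data):
            yield cls.output_alphabet[value]


class ListMixin(object):
    @classmethod
    def encode(cls, data):
        # wrap the next encoder in the MRO, collecting its output in a list
        return list(super(ListMixin, cls).encode(data))


class EncoderTemplate(object):
    @classmethod
    def variant(cls, name, *mixins):
        # mixins listed first wrap those listed later
        return type(name, mixins + (cls, RawStreamingEncoder), {})


class HexDigitTemplate(EncoderTemplate):
    output_alphabet = '0123456789abcdef'


HexStreaming = HexDigitTemplate.variant('HexStreaming', MappedMixin)
HexList = HexDigitTemplate.variant('HexList', ListMixin, MappedMixin)
RawList = HexDigitTemplate.variant('RawList', ListMixin)

print(HexList.encode([1, 10, 15]))          # ['1', 'a', 'f']
print(list(HexStreaming.encode([0, 15])))   # ['0', 'f']
print(RawList.encode([3, 4]))               # [3, 4]
```

The configuration is written once on the template, and each mapped or un-mapped, generator or list variant becomes a one-line declaration.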
