Is there any way to compress the data while using mongo persistence with NEventStore? #62

Open · josevfrancis opened this issue Jul 19, 2022 · 11 comments

@josevfrancis

I'm working with C#, Dotnet core, and NeventStore( version- 9.0.1), trying to evaluate various persistence options that it supports out of the box.

More specifically, when trying to use the mongo persistence, the payload is getting stored without any compression being applied.

Note: Payload compression is happening perfectly when using the SQL persistence of NEventStore whereas not with the mongo persistence.

I'm using the below code to create the event store and initialize:

```csharp
private IStoreEvents CreateEventStore(string connectionString)
{
    var store = Wireup.Init()
        .UsingMongoPersistence(connectionString, new NEventStore.Serialization.DocumentObjectSerializer())
        .InitializeStorageEngine()
        .UsingBsonSerialization()
        .Compress()
        .HookIntoPipelineUsing()
        .Build();
    return store;
}
```

And, I'm using the below code for storing the events:

```csharp
public async Task AddMessageTostore(Command command)
{
    using (var stream = _eventStore.CreateStream(command.Id))
    {
        stream.Add(new EventMessage { Body = command });
        stream.CommitChanges(Guid.NewGuid());
    }
}
```

The workaround we tried: implementing the PreCommit(CommitAttempt attempt) and Select methods of IPipelineHook and applying gzip compression logic there, which did achieve compression of the events in MongoDB.
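For illustration, a minimal sketch of such a hook (a sketch only, not NEventStore-provided code; PipelineHookBase is the library's base class, while GzipPipelineHook and the injected ISerialize are assumptions):

```csharp
using System.IO;
using System.IO.Compression;
using NEventStore;
using NEventStore.Serialization;

// Hypothetical hook that gzips event bodies before persistence and
// un-gzips them when commits are read back.
public class GzipPipelineHook : PipelineHookBase
{
    private readonly ISerialize _inner; // turns event bodies into bytes (assumption)

    public GzipPipelineHook(ISerialize inner) { _inner = inner; }

    public override bool PreCommit(CommitAttempt attempt)
    {
        foreach (var msg in attempt.Events)
        {
            using (var buffer = new MemoryStream())
            {
                using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
                    _inner.Serialize(gzip, msg.Body);
                msg.Body = buffer.ToArray(); // the persisted body is now compressed bytes
            }
        }
        return true; // allow the commit to proceed
    }

    public override ICommit Select(ICommit committed)
    {
        foreach (var msg in committed.Events)
        {
            using (var buffer = new MemoryStream((byte[])msg.Body))
            using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
                msg.Body = _inner.Deserialize<object>(gzip);
        }
        return committed;
    }
}
```

The hook would then be registered through `.HookIntoPipelineUsing(new GzipPipelineHook(...))` in the wireup above.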

Attaching data store screenshots of both SQL and mongo persistence: *(two screenshots attached)*

So, the questions are:

1. Is there some other option or setting I'm missing so that the events get compressed while saving (a fluent way of calling the Compress method)?
2. Is the workaround mentioned above sensible, or does it add performance overhead?

@josevfrancis (Author)

@AGiorgetti @andreabalducci @Iridio Can you guys at least share your initial thoughts? We're really gated with this. Thanks.

@AGiorgetti (Member)

Hi @josevfrancis , you can try this:

  • replace DocumentObjectSerializer with ByteStreamDocumentSerializer; an IDocumentSerializer lets you apply custom serialization to each event payload (DocumentObjectSerializer is a no-op serializer).
  • pass an instance of GzipSerializer to the ByteStreamDocumentSerializer (take a look at the NEventStore serialization tests).

Let me know if it solves your problem.
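A minimal sketch of that wireup (the inner JsonSerializer from the NEventStore.Serialization.Json package is an assumption; any ISerialize implementation would do):

```csharp
// Sketch: Mongo persistence with per-payload gzip, per the suggestion above.
var store = Wireup.Init()
    .UsingMongoPersistence(connectionString,
        new ByteStreamDocumentSerializer(   // stores each payload as a byte stream
            new GzipSerializer(             // gzips whatever the inner serializer writes
                new JsonSerializer())))     // inner ISerialize (assumption)
    .InitializeStorageEngine()
    .Build();
```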

@andreabalducci (Member)

IMHO it's useless unless you want to encrypt. Just enable compression in MongoDB (https://www.mongodb.com/blog/post/new-compression-options-mongodb-30).
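For reference, server-side block compression is a WiredTiger storage-engine setting; a minimal mongod.conf sketch (zstd requires MongoDB 4.2+; snappy is the default):

```yaml
# mongod.conf -- compress collection data at the storage-engine level
storage:
  wiredTiger:
    collectionConfig:
      blockCompressor: zstd   # alternatives: snappy (default), zlib, none
```

Comparing a collection's stats().size (uncompressed BSON) with stats().storageSize (bytes on disk) shows the actual savings.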

@josevfrancis (Author) commented Jul 22, 2022

> replace DocumentObjectSerializer with ByteStreamDocumentSerializer … pass an instance of GzipSerializer to the ByteStreamDocumentSerializer …

Hi @AGiorgetti

Thanks for the quick response.
We tried replacing DocumentObjectSerializer with ByteStreamDocumentSerializer, passing in a new GzipSerializer(new BinarySerializer()). This resulted in a "BinaryFormatter serialization and deserialization are disabled within this application" error.

Please let me know if I'm doing something wrong.

@josevfrancis (Author) commented Jul 22, 2022

> IMHO it's useless unless you want to encrypt. Just enable compression in MongoDB …

Hi @andreabalducci

Thanks for the quick response.
I'm just being loud and stupid with my questions here.

I understand that we can use the MongoDB compression options, but wouldn't saving data that is already compressed reduce storage space even further?
BTW, I'm trying to compress the data by adding my compression logic within the PreCommit() method of the IPipelineHook.

Apart from that, when trying SQL as the commits persistence, the SerializationWireup.Compress() method reduced the payload to roughly 50% of its uncompressed size. We are trying to replicate the same with MongoDB as the persistence.
Note: our commits table may grow significantly, so we're trying to optimize its size as much as possible.
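For context, the SQL wireup where Compress() took effect looked roughly like this (a sketch; the named connection string "EventStore" and MsSqlDialect are assumptions):

```csharp
// Sketch of the SQL persistence wireup where the fluent Compress() works:
// Compress() wraps the ISerialize produced by UsingJsonSerialization() in a GzipSerializer.
var store = Wireup.Init()
    .UsingSqlPersistence("EventStore")   // named connection string (assumption)
    .WithDialect(new MsSqlDialect())     // dialect is an assumption
    .InitializeStorageEngine()
    .UsingJsonSerialization()            // returns a SerializationWireup
    .Compress()
    .Build();
```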

Are there any best practices around this?

@AGiorgetti (Member)

> We tried replacing DocumentObjectSerializer with ByteStreamDocumentSerializer, passing in a new GzipSerializer(new BinarySerializer()). This resulted in a "BinaryFormatter serialization and deserialization are disabled within this application" error.

Hi @josevfrancis , the BinaryFormatter was deprecated a long time ago by the .NET team.
The current BinarySerializer implementation still uses the old BinaryFormatter (it's still there for testing purposes, and because I'm too lazy to replace it). You should be able to write your own implementation of the ISerialize interface that reads and writes bytes to a Stream; it's pretty straightforward.
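A minimal sketch of such an implementation (assumptions: Newtonsoft.Json as the JSON library and the class name JsonNetSerializer; TypeNameHandling.All is one way to make polymorphic event bodies round-trip, with the usual deserialization-security caveats):

```csharp
using System.IO;
using System.Text;
using Newtonsoft.Json;
using NEventStore.Serialization;

// Hypothetical ISerialize backed by Newtonsoft.Json.
public class JsonNetSerializer : ISerialize
{
    private static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
    {
        // embeds type information so event bodies deserialize to their concrete types
        TypeNameHandling = TypeNameHandling.All
    };

    public void Serialize<T>(Stream output, T graph)
    {
        // leaveOpen: true so a wrapping GzipSerializer can finish flushing the stream
        using (var writer = new StreamWriter(output, Encoding.UTF8, 1024, leaveOpen: true))
            writer.Write(JsonConvert.SerializeObject(graph, Settings));
    }

    public T Deserialize<T>(Stream input)
    {
        using (var reader = new StreamReader(input, Encoding.UTF8, false, 1024, leaveOpen: true))
            return JsonConvert.DeserializeObject<T>(reader.ReadToEnd(), Settings);
    }
}
```

It would then be wired in as `new ByteStreamDocumentSerializer(new GzipSerializer(new JsonNetSerializer()))`.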

@josevfrancis (Author)

> reads and writes bytes to a Stream

Hi @AGiorgetti,

I have now replaced the BinarySerializer with a custom serializer having the below methods:

```csharp
public virtual void Serialize<T>(Stream output, T graph)
{
    using (StreamWriter streamWriter = new StreamWriter(output, Encoding.UTF8))
        this.Serialize((JsonWriter)new JsonTextWriter((TextWriter)streamWriter), (object)graph);
}

protected virtual void Serialize(JsonWriter writer, object graph)
{
    using (writer)
        _serializer.Serialize(writer, graph);
}
```

And used it like `new ByteStreamDocumentSerializer(new GzipSerializer(new CustomSerializer()))`.

But while adding the events to the stream, I am getting an "Unable to cast object of type 'System.Byte[]' to type 'NEventStore.EventMessage'" error.

Please suggest if something is wrong.

@josevfrancis (Author)

> I have now replaced the BinarySerializer with a custom serializer … I am getting an "Unable to cast object of type 'System.Byte[]' to type 'NEventStore.EventMessage'" error.

@AGiorgetti Sorry for disturbing you with back-to-back questions.
Can you please let us know your thoughts whenever you have some time?

@josevfrancis (Author)

> I understand that we can use the MongoDB compression options, but wouldn't saving data that is already compressed reduce storage space even further? … Are there any best practices around this?

@andreabalducci @Iridio Do you have any thoughts around the quoted message above?

@andreabalducci (Member)

Double compression is a waste of CPU and adds little or no benefit (it could even make things worse).
Mongo has its own strategies for space allocation. Enabling compression in Mongo will simplify all your data maintenance, querying, diagnostics, and management.
My 2c.

@josevfrancis (Author)

@andreabalducci
Thanks for your comments.
Your 2 cents 💯 will add a lot of value to people with similar questions.
