Update the comparison table to make it fair #250

Kerollmops · 2024-08-24T11:09:59Z

As stated in this comment, it would be preferable to be fair when comparing against competitors.

Change Summary

Remove a false claim that Meilisearch is not suited for production. The Meilisearch company runs thousands of instances with an availability of more than 98.9% and 99.9% for big customers.
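For a sense of scale, the availability figures above can be converted into downtime per year. This is a minimal arithmetic sketch, assuming a 365-day year and that downtime is simply the complement of availability; the function name `downtime_hours_per_year` is hypothetical and not part of any Meilisearch tooling.

```python
# Hypothetical helper: converts an availability percentage into the
# number of hours per year a service would be unavailable.
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime in hours implied by an availability figure."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR


if __name__ == "__main__":
    for figure in (98.9, 99.9):
        hours = downtime_hours_per_year(figure)
        print(f"{figure}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 98.9% corresponds to roughly four days of downtime per year and 99.9% to a little under nine hours.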

PR Checklist

I have read and signed the Contributor License Agreement.

jasonbosco · 2024-08-25T00:03:29Z

For a search datastore, an important element of being production-ready is that it does not become a single point of failure when it goes down. This is usually solved by having some replication or clustering mechanism built into the datastore, and then running multiple nodes of the datastore, so that one node going down doesn't cause availability issues.

So until a search datastore has a built-in HA mechanism, I would not consider it production-ready. This is not a false claim - it's an industry-standard definition.

> The Meilisearch company runs thousands of instances with an availability of more than 98.9% and 99.9% for big customers.

In Meilisearch Cloud, do you run multiple compute nodes for each Meilisearch deployment, and replicate user data in near realtime between all the nodes?

If so, do you use some external solution that you've built to do this? Or is that HA mechanism built into a non-public version of Meilisearch that you run in Meilisearch Cloud? In either case, if such a multi-node solution exists in your Cloud product, let me know and we can maybe clarify that in the table.

Kerollmops · 2024-08-28T07:50:23Z

> In Meilisearch Cloud, do you run multiple compute nodes for each Meilisearch deployment, and replicate user data in near realtime between all the nodes?

> If so, do you use some external solution that you've built to do this? Or is that HA mechanism built into a non-public version of Meilisearch that you run in Meilisearch Cloud? In either case, if such a multi-node solution exists in your Cloud product, let me know and we can maybe clarify that in the table.

We replicate the disks on our different clusters and also take periodic snapshots; as Meilisearch boots in milliseconds, we don't have downtime. Please remove this claim that Meilisearch is not production-ready. "Having some replication or clustering mechanism built into the datastore" is not mandatory for high availability; there are other solutions that we at Meilisearch rely on.

In addition, a good availability ratio is the only requirement for a solution to be considered production-ready. The nature of the algorithm used has nothing to do with this claim, only the quality of service.

jasonbosco · 2024-08-29T05:16:50Z

> We replicate the disks on our different clusters

I assume you're talking about your infrastructure provider's disk replication feature. So it sounds like you're still running a single Meilisearch process at any given time, and the OS running that single process has this disk mounted on it?

That is the classic definition of a single point of failure.

For example, if the underlying hardware running that single instance of Meilisearch fails, and you're unable to provision a new replacement node (due to the same hardware issues - trust me, this happens at scale), having your data on a replicated disk doesn't help and you experience downtime.

This is why you need to have multiple independent instances of the Meilisearch process running, in different data centers, each with its own independent disk and hardware, replicating data in real-time between each other, to prevent various kinds of unexpected issues like the above.

That way, even if one node goes down, the other independent nodes will continue servicing traffic, uninterrupted. Or alternatively, in some systems you'd promote one of the replicas (that already has the data) to become the primary, but it's still an independent node with its own disk and hardware, running its own instance of the software.
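The failover behaviour described above can be illustrated with a small in-memory sketch. This is an assumption-laden toy, not how Meilisearch, Typesense, or any real client library works: each "node" is just a Python callable, and `NodeDown` and `query_with_failover` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence


class NodeDown(Exception):
    """Hypothetical error raised when a node cannot serve a request."""


# A node is modelled as a callable that takes a query and returns a result,
# or raises NodeDown if the node is unavailable.
Node = Callable[[str], str]


def query_with_failover(nodes: Sequence[Node], query: str) -> str:
    """Try each independent node in turn and return the first successful answer."""
    last_error: Optional[NodeDown] = None
    for node in nodes:
        try:
            return node(query)
        except NodeDown as error:
            # This node is down; fall through to the next replica.
            last_error = error
    # Only reached when every node failed (or no nodes were given).
    raise NodeDown("all nodes are down") from last_error


if __name__ == "__main__":
    def failed_node(query: str) -> str:
        raise NodeDown("hardware failure")

    def healthy_replica(query: str) -> str:
        return f"results for {query!r}"

    # With a replica available, the request still succeeds.
    print(query_with_failover([failed_node, healthy_replica], "shoes"))
```

With a single node in the list, any failure of that node surfaces directly to the caller, which is the single-point-of-failure case discussed in this comment.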

Here's another way to think about it - think of a typical web app. There is a reason why at scale you run multiple web server processes on independent nodes and put them behind a load balancer, instead of just running one large single node to serve all the traffic (even though the web server process itself boots up in milliseconds). Besides being able to distribute the load, it's also so that that single node doesn't become an SPOF in case of hardware failures - regardless of how quickly you can provision a replacement node or spin up the web server process inside the node.

So without the ability to run independent distributed / replicated Meilisearch processes, you have a single point of failure (SPOF), even with a replicated disk. And when you have an SPOF in a system, it is not considered highly available.
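To put rough numbers on this argument, the sketch below estimates the availability of a deployment with several replicas. It assumes node failures are fully independent and that the service is up whenever at least one replica is up; both are idealizations (real failures are often correlated), and the function name `combined_availability` is hypothetical.

```python
def combined_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent nodes is up.

    `node_availability` is a fraction between 0 and 1 (e.g. 0.989 for 98.9%).
    Assumes failures are independent, which real infrastructure only approximates.
    """
    if not 0 <= node_availability <= 1:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at the same time.
    probability_all_down = (1 - node_availability) ** replicas
    return 1 - probability_all_down


if __name__ == "__main__":
    for count in (1, 2, 3):
        availability = combined_availability(0.989, count)
        print(f"{count} replica(s): {availability:.6%}")
```

Under these assumptions, adding independent replicas shrinks the chance of a full outage multiplicatively, whereas a single node can never exceed its own availability.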

You simply don't run anything with an SPOF in a production environment if uptime is important for that service.

> a good availability ratio is the only requirement for a solution to be considered production-ready.

Ummm. I can agree that a good availability percentage is one of the requirements for production-readiness, but it is most certainly not the only requirement.

Final question for you to hopefully get my point across:

For the database you use to power your Cloud application - the one that stores the users that sign up, the projects they create, etc. (think MySQL, Postgres, MongoDB, etc.) - are you running a single node, with a single instance of that database process (even with replicated disks)? Or do you have a full replica setup for that database, running on a different machine in parallel?

The answer to that question should hopefully reveal the importance (or not) of everything I tried to explain above.

Kerollmops added 2 commits August 24, 2024 13:07

Update the comparison table to make it fair

d90f9ca

As stated [in this comment](meilisearch/meilisearch#1148 (comment)), it would be preferable to be fair when comparing against competitors.

Fix the table

d99578d
