-- ds.page: Replication

**Note**: The content is taken from [Designing Data-Intensive Applications by
Martin Kleppmann](https://public.nikhil.io/Designing%20Data%20Intensive%20Applications.pdf).

In this chapter we will assume that your dataset is so small that each machine
can hold a copy of the entire dataset. In a later chapter we will relax that
assumption and discuss partitioning (sharding) of datasets that are too big for
a single machine.

-- ds.h1: Leaders and Followers

- One of the replicas is designated the leader (also known as master or
primary). When clients want to write to the database, they must send their
requests to the leader, which first writes the new data to its local storage.
- The other replicas are known as followers (read replicas, slaves,
secondaries, or hot standbys). Whenever the leader writes new data to its local
storage, it also sends the data change to all of its followers as part of a
replication log or change stream.
- When a client wants to read from the database, it can query either the leader
or any of the followers. However, writes are only accepted on the leader (the
followers are read-only from the client’s point of view).

-- ds.image:
src: $assets.files.ddd.data-intensive-application.images.5-1.png

-- ds.markdown:

This mode of replication is a built-in feature of many relational databases,
such as PostgreSQL (since version 9.0), MySQL, Oracle Data Guard, and SQL
Server’s AlwaysOn Availability Groups. It is also used in some nonrelational
databases, including MongoDB, RethinkDB, and Espresso. Finally, leader-based
replication is not restricted to databases: distributed message brokers such as
Kafka and RabbitMQ’s highly available queues also use it, and some network
filesystems and replicated block devices such as DRBD are similar.

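To make the write path concrete, here is a minimal in-memory sketch of
leader-based replication (an illustration of the concept only, not any
particular database’s implementation; all class and method names are
hypothetical): the leader applies each write to its local storage, appends it
to a replication log, and streams the change to its followers, which are
read-only.

-- ds.code:
lang: py

# Minimal in-memory sketch of leader-based replication (hypothetical names).

class Follower:
    """Read-only replica: applies changes received from the leader."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)

class Leader:
    def __init__(self, followers):
        self.data = {}
        self.log = []             # replication log: ordered change stream
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value    # 1. write to local storage first
        self.log.append((key, value))
        for f in self.followers:  # 2. stream the change to every follower
            f.apply(key, value)

    def read(self, key):
        return self.data.get(key)

followers = [Follower(), Follower()]
leader = Leader(followers)
leader.write("user:1", "Alice")     # writes must go to the leader
print(followers[0].read("user:1"))  # reads can go to any replica -> Alice
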
-- ds.h2: Synchronous Versus Asynchronous Replication

-- ds.image:
src: $assets.files.ddd.data-intensive-application.images.5-2.png

-- ds.markdown:

In the figure above, the replication to follower 1 is synchronous: the leader
waits until follower 1 has confirmed that it received the write before
reporting success to the user. The replication to follower 2 is asynchronous:
the leader sends the message, but doesn’t wait for a response from the
follower.

The **advantage** of synchronous replication is that the follower is guaranteed
to have an up-to-date copy of the data that is consistent with the leader. If
the leader suddenly fails, we can be sure that the data is still available on
the follower. The **disadvantage** is that if the synchronous follower doesn’t
respond (because it has crashed, or there is a network fault, or for any other
reason), the write cannot be processed. The leader must block all writes and
wait until the synchronous replica is available again.

For that reason, it is impractical for all followers to be synchronous. In
practice, if you enable synchronous replication on a database, it usually means
that one of the followers is synchronous, and the others are asynchronous. If
the synchronous follower becomes unavailable or slow, one of the asynchronous
followers is made synchronous. This guarantees that you have an up-to-date copy
of the data on at least two nodes. This configuration is sometimes also called
semi-synchronous.

Often, leader-based replication is configured to be completely asynchronous. In
this case, if the leader fails and is not recoverable, any writes that have not
yet been replicated to followers are lost. This means that a write is not
guaranteed to be durable, even if it has been confirmed to the client.

Weakening durability may sound like a bad trade-off, but asynchronous
replication is nevertheless widely used, especially if there are many followers
or if they are geographically distributed.

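The trade-off can be sketched in code. In this hypothetical model (not any real
database’s API), the leader confirms a write only after its single synchronous
follower acknowledges it, while the remaining followers are replicated to
asynchronously, without waiting for acknowledgments.

-- ds.code:
lang: py

# Hypothetical sketch of semi-synchronous replication: the write is
# confirmed only once one synchronous follower has acknowledged it.

class Follower:
    def __init__(self, available=True):
        self.available = available
        self.data = {}

    def replicate(self, key, value):
        # Returns True (an acknowledgment) only if the follower is reachable.
        if not self.available:
            return False
        self.data[key] = value
        return True

class Leader:
    def __init__(self, sync_follower, async_followers):
        self.data = {}
        self.sync_follower = sync_follower
        self.async_followers = async_followers

    def write(self, key, value):
        self.data[key] = value  # local write first
        # Synchronous: block until the sync follower acknowledges. In a
        # real system the leader would retry or promote an asynchronous
        # follower to synchronous; here we simply report failure.
        if not self.sync_follower.replicate(key, value):
            return False
        # Asynchronous: send and ignore the acknowledgment; these
        # followers may lag, and their writes can be lost on a crash.
        for f in self.async_followers:
            f.replicate(key, value)
        return True  # confirmed: the data is on at least two nodes

leader = Leader(Follower(), [Follower(), Follower(available=False)])
print(leader.write("k", "v"))  # True: leader and sync follower have it
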
-- ds.h2: Setting Up New Followers

Simply copying data files from one node to another is typically not sufficient:
clients are constantly writing to the database, and the data is always in flux,
so a standard file copy would see different parts of the database at different
points in time.

You could make the files on disk consistent by locking the database (making it
unavailable for writes), but that would go against our goal of high
availability. Fortunately, setting up a follower can usually be done without
downtime. Conceptually, the process looks like this (a code sketch follows the
list):

- Take a consistent snapshot of the leader’s database at some point in time—if
possible, without taking a lock on the entire database.
- Copy the snapshot to the new follower node.
- The follower connects to the leader and requests all the data changes that
have happened since the snapshot was taken. This requires that the snapshot is
associated with an exact position in the leader’s replication log. That
position has various names: for example, PostgreSQL calls it the *log sequence
number*, and MySQL calls it the *binlog coordinates*.
- When the follower has processed the backlog of data changes since the
snapshot, we say it has *caught up*.

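Here is a hypothetical sketch of that bootstrap sequence, using a plain list
index as the log position (standing in for PostgreSQL’s log sequence number or
MySQL’s binlog coordinates):

-- ds.code:
lang: py

# Hypothetical sketch: bootstrapping a new follower from a consistent
# snapshot plus the tail of the leader's replication log.

class Leader:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # A consistent snapshot paired with its exact log position.
        return dict(self.data), len(self.log)

    def changes_since(self, position):
        return self.log[position:]

leader = Leader()
leader.write("a", 1)

# 1-2. Take a consistent snapshot and copy it to the new follower.
follower_data, position = leader.snapshot()

leader.write("b", 2)  # writes keep arriving while the copy happens

# 3. Catch up: request every change since the snapshot's log position.
for key, value in leader.changes_since(position):
    follower_data[key] = value

# 4. The follower has processed the backlog: it has caught up.
print(follower_data == leader.data)  # True

-- ds.markdown:
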
The practical steps of setting up a follower vary significantly by database. In
some systems the process is fully automated, whereas in others it can be a
somewhat arcane multi-step workflow that needs to be manually performed by an
administrator.

-- ds.h2: Handling Node Outages

Any node in the system can go down, perhaps unexpectedly due to a fault, but
just as likely due to planned maintenance (for example, rebooting a machine to
install a kernel security patch).

How do you achieve high availability with leader-based replication?

-- ds.h3: Follower failure: Catch-up recovery

On its local disk, each follower keeps a log of the data changes it has
received from the leader. If a follower crashes and is restarted, or if the
network between the leader and the follower is temporarily interrupted, the
follower can recover quite easily: from its log, it knows the last transaction
that was processed before the fault occurred. Thus, the follower can connect to
the leader and request all the data changes that occurred during the time when
the follower was disconnected. When it has applied these changes, it has caught
up to the leader and can continue receiving a stream of data changes as before.

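The same idea from the follower’s side, as a small hypothetical sketch: because
the follower durably records the position of the last change it applied,
recovery is just “reconnect and replay everything after that position”.

-- ds.code:
lang: py

# Hypothetical sketch of catch-up recovery: the follower remembers the
# last log position it applied, so after a crash or network interruption
# it requests only the changes that occurred while it was disconnected.

leader_log = [("a", 1), ("b", 2)]  # the leader's replication log

class Follower:
    def __init__(self):
        self.data = {}
        self.applied_up_to = 0  # kept on local disk in a real system

    def catch_up(self, log):
        for key, value in log[self.applied_up_to:]:
            self.data[key] = value
        self.applied_up_to = len(log)

f = Follower()
f.catch_up(leader_log)       # initial sync: applies ("a", 1) and ("b", 2)
leader_log.append(("c", 3))  # a write happens while the follower is down
f.catch_up(leader_log)       # after recovery: replays only ("c", 3)
print(f.data)                # {'a': 1, 'b': 2, 'c': 3} -- caught up
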
-- ds.h3: Leader failure: Failover

Handling a failure of the leader is trickier: one of the followers needs to be
promoted to be the new leader, clients need to be reconfigured to send their
writes to the new leader, and the other followers need to start consuming data
changes from the new leader. This process is called **failover**.

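Those steps can be sketched as follows. This is a deliberately simplified,
hypothetical model: real failover must also decide that the leader is actually
dead (usually via timeouts) and cope with problems such as lost writes and
split brain.

-- ds.code:
lang: py

# Hypothetical sketch of failover: promote the most up-to-date follower
# and repoint the remaining followers (and clients) at it.

class Node:
    def __init__(self, name, log_position, alive=True):
        self.name = name
        self.log_position = log_position  # how far replication has gotten
        self.alive = alive
        self.leader = None

def failover(leader, followers):
    if leader.alive:
        return leader  # nothing to do
    # 1. Choose the new leader: usually the follower with the most
    #    up-to-date replication log, to minimize lost writes.
    new_leader = max(followers, key=lambda n: n.log_position)
    # 2. Reconfigure the other followers (and, implicitly, clients) to
    #    consume changes from, and send writes to, the new leader.
    for f in followers:
        if f is not new_leader:
            f.leader = new_leader
    return new_leader

old_leader = Node("leader", log_position=100, alive=False)
followers = [Node("f1", log_position=98), Node("f2", log_position=100)]
print(failover(old_leader, followers).name)  # f2: the most caught-up follower
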
-- end: ds.page |