-
Notifications
You must be signed in to change notification settings - Fork 138
Rev Tree Storage on Couchbase Server
Revision Trees, or "Rev Trees" for short, are required to support semantics surrounding conflicts and conflict resolution. The best explanation of the problem is in the Conflicts chapter of the CouchDB guide.
-
Rev Trees are stored in the special
_sync
metadata field, in the Couchbase Server document itself (as opposed to in a metadata field) -
Sync Gateway enforces the Rev Tree semantics
-
Internally, Sync Gateway stores the Rev Tree in memory via this structure, however when serialized to JSON it uses a more space-efficient structure.
When the document contains conflicting revisions, and has multiple branches off of a single revision, then the body of the leaf revision of each branch is stored in the document.
Create a document via Sync Gateway REST API, which results in the following Couchbase Document with a _sync
metadata field:
{
"_sync": {
"rev": "1-51ba9d966e99179007b295b601b0e013",
"sequence": 64756,
"recent_sequences": [
64756
],
"history": {
"revs": [
"1-51ba9d966e99179007b295b601b0e013"
],
"parents": [
-1
],
"channels": [
[
"NBC"
]
]
},
"channels": {
"NBC": null
},
"time_saved": "2015-06-18T19:17:40.469072739Z"
},
"channels": [
"NBC"
],
"type": "test_doc"
}
Update the doc -- add a new revision which adds the doc to another channel.
{
"_sync": {
"rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
"sequence": 64757,
"recent_sequences": [
64756,
64757
],
"history": {
"revs": [
"1-51ba9d966e99179007b295b601b0e013",
"2-e2c395c6006f14e16d0fdd1884c3aedf"
],
"parents": [
-1,
0
],
"channels": [
[
"NBC"
],
[
"ABC",
"NBC"
]
]
},
"channels": {
"ABC": null,
"NBC": null
},
"time_saved": "2015-06-18T19:18:31.930967967Z"
},
"channels": [
"NBC",
"ABC"
],
"type": "test_doc"
}
Create a conflicting revision:
{
"_sync": {
"rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
"new_rev": "2-44ba9d966e99179007b295b601b0e013",
"flags": 28,
"sequence": 64759,
"recent_sequences": [
64756,
64757,
64759
],
"history": {
"revs": [
"1-51ba9d966e99179007b295b601b0e013",
"2-e2c395c6006f14e16d0fdd1884c3aedf",
"2-44ba9d966e99179007b295b601b0e013"
],
"parents": [
-1,
0,
0
],
"bodymap": {
"2": "{\"channels\":[\"CBS\"],\"type\":\"test_doc_updated\"}"
},
"channels": [
[
"NBC"
],
[
"ABC",
"NBC"
],
[
"CBS"
]
]
},
"channels": {
"ABC": null,
"NBC": null
},
"time_saved": "2015-06-18T20:51:42.248325853Z"
},
"channels": [
"NBC",
"ABC"
],
"type": "test_doc"
}
And, adding another branch to the rev tree:
{
"_sync": {
"rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
"new_rev": "2-33ba9d966e99179007b295b601b0e013",
"flags": 28,
"sequence": 64760,
"recent_sequences": [
64756,
64757,
64759,
64760
],
"history": {
"revs": [
"2-e2c395c6006f14e16d0fdd1884c3aedf",
"2-44ba9d966e99179007b295b601b0e013",
"2-33ba9d966e99179007b295b601b0e013",
"1-51ba9d966e99179007b295b601b0e013"
],
"parents": [
3,
3,
3,
-1
],
"bodymap": {
"1": "{\"channels\":[\"CBS\"],\"type\":\"test_doc_updated\"}",
"2": "{\"channels\":[\"PBS\"],\"type\":\"test_doc_updated_second_conflict\"}"
},
"channels": [
[
"ABC",
"NBC"
],
[
"CBS"
],
[
"PBS"
],
[
"NBC"
]
]
},
"channels": {
"ABC": null,
"NBC": null
},
"time_saved": "2015-06-18T20:54:01.006421523Z"
},
"channels": [
"NBC",
"ABC"
],
"type": "test_doc"
}
Suppose that you have a revision history with a single branch, which has 1000 revisions. Pruning is applied, with a MaxRevTreeDepth of 100, leaving only the last 100 revisions in the branch (900-1000). Then, there is an attempt to create a new revision off of revision 1, which has been pruned away. (note: this can be done via the _bulk_docs endpoint, with new_edits set to false)
What happens?
It creats a new "root revision branch" if it couldn't find a match for the anything in the incoming revision history in the existing history. (TODO: check if we have any unit tests to prove this)
-
Due to conflicts that limit the prunability (see Scenario 2 above). This results in storing excessive ancestor revision metadata (but the number of stored revision bodies are not affected by this)
-
Excessive number of conflicts (wide trees), will result in many revision bodies being stored, because:
-
As pruning only prunes revision metadata, and not bodies.
-
Leaf revision bodies for all branches must be kept indefinitely, and cannot be compacted away.
-