Skip to content

Rev Tree Storage on Couchbase Server

Traun Leyden edited this page Jun 20, 2015 · 5 revisions

What problems do Revision Trees solve?

Revision Trees, or "Rev Trees" for short, are required to support semantics surrounding conflicts and conflict resolution. The best explanation of the problem is in the Conflicts chapter of the CouchDB guide.

Rev Tree + Sync Gateway implementation overview

  • Rev Trees are stored in the special _sync metadata field, in the Couchbase Server document itself (as opposed to in a metadata field)

  • Sync Gateway enforces the Rev Tree semantics

  • Internally, Sync Gateway stores the Rev Tree in memory via this structure, however when serialized to JSON it uses a more space-efficient structure.

Which revision bodies are stored

When the document contains conflicting revisions, and has multiple branches off of a single revision, then the body of the leaf revision of each branch is stored in the document.

diagram

Rev tree examples

Create a document via Sync Gateway REST API, which results in the following Couchbase Document with a _sync metadata field:

{
  "_sync": {
    "rev": "1-51ba9d966e99179007b295b601b0e013",
    "sequence": 64756,
    "recent_sequences": [
      64756
    ],
    "history": {
      "revs": [
        "1-51ba9d966e99179007b295b601b0e013"
      ],
      "parents": [
        -1
      ],
      "channels": [
        [
          "NBC"
        ]
      ]
    },
    "channels": {
      "NBC": null
    },
    "time_saved": "2015-06-18T19:17:40.469072739Z"
  },
  "channels": [
    "NBC"
  ],
  "type": "test_doc"
}

Update the doc -- add a new revision which adds the doc to another channel.

{
  "_sync": {
    "rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
    "sequence": 64757,
    "recent_sequences": [
      64756,
      64757
    ],
    "history": {
      "revs": [
        "1-51ba9d966e99179007b295b601b0e013",
        "2-e2c395c6006f14e16d0fdd1884c3aedf"
      ],
      "parents": [
        -1,
        0
      ],
      "channels": [
        [
          "NBC"
        ],
        [
          "ABC",
          "NBC"
        ]
      ]
    },
    "channels": {
      "ABC": null,
      "NBC": null
    },
    "time_saved": "2015-06-18T19:18:31.930967967Z"
  },
  "channels": [
    "NBC",
    "ABC"
  ],
  "type": "test_doc"
}

Create a conflicting revision:

{
  "_sync": {
    "rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
    "new_rev": "2-44ba9d966e99179007b295b601b0e013",
    "flags": 28,
    "sequence": 64759,
    "recent_sequences": [
      64756,
      64757,
      64759
    ],
    "history": {
      "revs": [
        "1-51ba9d966e99179007b295b601b0e013",
        "2-e2c395c6006f14e16d0fdd1884c3aedf",
        "2-44ba9d966e99179007b295b601b0e013"
      ],
      "parents": [
        -1,
        0,
        0
      ],
      "bodymap": {
        "2": "{\"channels\":[\"CBS\"],\"type\":\"test_doc_updated\"}"
      },
      "channels": [
        [
          "NBC"
        ],
        [
          "ABC",
          "NBC"
        ],
        [
          "CBS"
        ]
      ]
    },
    "channels": {
      "ABC": null,
      "NBC": null
    },
    "time_saved": "2015-06-18T20:51:42.248325853Z"
  },
  "channels": [
    "NBC",
    "ABC"
  ],
  "type": "test_doc"
}

And, adding another branch to the rev tree:

{
  "_sync": {
    "rev": "2-e2c395c6006f14e16d0fdd1884c3aedf",
    "new_rev": "2-33ba9d966e99179007b295b601b0e013",
    "flags": 28,
    "sequence": 64760,
    "recent_sequences": [
      64756,
      64757,
      64759,
      64760
    ],
    "history": {
      "revs": [
        "2-e2c395c6006f14e16d0fdd1884c3aedf",
        "2-44ba9d966e99179007b295b601b0e013",
        "2-33ba9d966e99179007b295b601b0e013",
        "1-51ba9d966e99179007b295b601b0e013"
      ],
      "parents": [
        3,
        3,
        3,
        -1
      ],
      "bodymap": {
        "1": "{\"channels\":[\"CBS\"],\"type\":\"test_doc_updated\"}",
        "2": "{\"channels\":[\"PBS\"],\"type\":\"test_doc_updated_second_conflict\"}"
      },
      "channels": [
        [
          "ABC",
          "NBC"
        ],
        [
          "CBS"
        ],
        [
          "PBS"
        ],
        [
          "NBC"
        ]
      ]
    },
    "channels": {
      "ABC": null,
      "NBC": null
    },
    "time_saved": "2015-06-18T20:54:01.006421523Z"
  },
  "channels": [
    "NBC",
    "ABC"
  ],
  "type": "test_doc"
}

Pruning

The problem that is being solved by pruning

screenshot

Pruning scenarios

Scenario 1: winning branch is pruned

diagram

Scenario 2: conflict prevents pruning

diagram

Scenario 3: conflict losing branch is pruned

diagram

What happens when new revisions are created off pruned revisions?

Suppose that you have a revision history with a single branch, which has 1000 revisions. Pruning is applied, with a MaxRevTreeDepth of 100, leaving only the last 100 revisions in the branch (900-1000). Then, there is an attempt to create a new revision off of revision 1, which has been pruned away. (note: this can be done via the _bulk_docs endpoint, with new_edits set to false)

What happens?

It creats a new "root revision branch" if it couldn't find a match for the anything in the incoming revision history in the existing history. (TODO: check if we have any unit tests to prove this)

Ways in which Rev Trees can "blow up" in size

  • Due to conflicts that limit the prunability (see Scenario 2 above). This results in storing excessive ancestor revision metadata (but the number of stored revision bodies are not affected by this)

  • Excessive number of conflicts (wide trees), will result in many revision bodies being stored, because:

    • As pruning only prunes revision metadata, and not bodies.

    • Leaf revision bodies for all branches must be kept indefinitely, and cannot be compacted away.

diagram

Clone this wiki locally