Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait for createDocument to be loaded for subsequent createConnections #822

Merged
merged 5 commits into from
May 17, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 15 additions & 5 deletions packages/server/src/Hocuspocus.ts
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ export class Hocuspocus {
onDestroy: () => new Promise(r => r(null)),
}

loadingDocuments: Map<string, Promise<Document>> = new Map()

documents: Map<string, Document> = new Map()

server?: HocuspocusServer
Expand Down Expand Up @@ -414,15 +416,23 @@ export class Hocuspocus {
/**
* Create a new document by the given request
*/
public async createDocument(documentName: string, request: Partial<Pick<IncomingMessage, 'headers' | 'url'>>, socketId: string, connection: ConnectionConfiguration, context?: any): Promise<Document> {
if (this.documents.has(documentName)) {
const document = this.documents.get(documentName)
public createDocument(documentName: string, request: Partial<Pick<IncomingMessage, 'headers' | 'url'>>, socketId: string, connection: ConnectionConfiguration, context?: any): Promise<Document> {
if (this.loadingDocuments.has(documentName)) {
const documentPromise = this.loadingDocuments.get(documentName)

if (document) {
return document
if (documentPromise) {
return documentPromise
}
}
janthurau marked this conversation as resolved.
Show resolved Hide resolved

const loadDocPromise = this.loadDocument(documentName, request, socketId, connection, context)

this.loadingDocuments.set(documentName, loadDocPromise)

return loadDocPromise
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This map will just grow in memory and always short circuit fetching the document (since it will have the promise in memory).
I'd suggest doing a .finally to remove this documentName from the map afterwards to clean it up. We don't want this to act as a cache

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch here on this growing infinitely!

However, I dont quite follow using .finally here. The issue that I am thinking of is that we use loadingDocuments as the state of whether a doc has been loaded in the form of a promise. The only time it seems acceptable to delete from this map is when we call unloadDocument. I just pushed a change that removes the Promise<Document> when unloadDocument is called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jordangarcia I'm unsure that your change got pushed up here, can you check that?

I think that we are thinking of this mapping a bit differently. I am thinking of this map only being useful for "re-using" a fetch to the loading document. This would make it so that two clients requesting the same document while loading would wait for the same document to finish loading.

Client A fetches Doc 1
Client A waits for Doc 1 to load
Client B fetches Doc 1
Client B waits on the same promise as Client A
Doc loads
Client A syncs
Client B syncs

Whereas if I'm understanding you correctly, you want to store the promise here and use this as the source of truth for documents that are loaded or not. When this is what the documents map is meant to do.

Please let me know your thoughts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My interpretation of the code is that the this.documents map isn't quite intended to keep track of "loaded" documents. Since we insert into that map before the document is loaded. this.documents seems to represent the documents that this hocuspocus instance is managing, regardless of their loading state. Changing this.documents to represent documents to be loaded seems like a much bigger API change since many functions rely on this mapping to understand the state of the system.

The loadDocuments map would be used as the state of the loading documents and a way for createDocument to guarantee a singular doc always being returned (or a promise to the resolved value). The fact that loadDocuments sets the created document in this.documents then returns the document gaurantees referential equality between the two maps. I do think this may seem a bit fragile to code changes that wouldn't keep these two in sync.

The key issue here is that createDocument was using the this.documents map as a cache of the documents. To be consistent we always want createDocument to be Promise<Document> of the loaded document. I dont really see a way around not using loadingDocuments as a cache here for createDocument. We could add more state to the system and keep track of Map<string, boolean> for whether a document is loaded and then use this.documents instead of this.loadingDocuments but that just seems more complex to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea this makes sense to me as well. createDocument is idempotent and will always return a Promise that resolves to the singular (referential) source of truth Document.

I do see how this technically is adding more "redundant" state, but I think it's quite clear and simple.

@nperez0111 Curious to hear your thoughts!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this with @janthurau and we came up with the commit he added on top.

How we were thinking of it is that loadingDocument is just a temporary cache for documents that are still in progress of loading and that anyone who calls createDocument will just receive the same promise that is currently loading if there is one.

We also maintained the signature of Promise<Document> with Promise.resolve of this.documents.get(documentName)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good solution as well. Thanks for helping out!

Would love to get this out as soon as possible.


async loadDocument(documentName: string, request: Partial<Pick<IncomingMessage, 'headers' | 'url'>>, socketId: string, connection: ConnectionConfiguration, context?: any): Promise<Document> {
const requestHeaders = request.headers ?? {}
const requestParameters = getParameters(request)

Expand Down
211 changes: 52 additions & 159 deletions tests/server/onLoadDocument.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
import test from 'ava'
import { newHocuspocus, newHocuspocusProvider } from '../utils/index.js'

const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))

test('executes the onLoadDocument callback', async t => {
await new Promise(async resolve => {
const server = await newHocuspocus({
Expand Down Expand Up @@ -130,6 +132,56 @@ test('multiple simultaneous connections do not create multiple documents', async
})
})

test('multiple simultaneous connections wait for the document to be loaded', async t => {
t.plan(6)

await new Promise(async resolve => {
let resolveOnLoadDocument: () => void = () => {}

const server = await newHocuspocus({
onLoadDocument({ document }) {
// delay more accurately simulates a database fetch
return new Promise(async innerResolve => {
resolveOnLoadDocument = () => {
document.getArray('foo').insert(0, ['bar'])
innerResolve(document)
}
})
},
})

const provider1 = newHocuspocusProvider(server)
const provider2 = newHocuspocusProvider(server)
let provider1Synced = false
let provider2Synced = false

provider1.on('synced', () => {
provider1Synced = true
const value = provider1.document.getArray('foo').get(0)
t.is(value, 'bar')
})
provider2.on('synced', () => {
provider2Synced = true
const value = provider2.document.getArray('foo').get(0)
t.is(value, 'bar')
})

await sleep(100)

t.false(provider1Synced, 'provider1Synced')
t.false(provider2Synced, 'provider2Synced')

resolveOnLoadDocument()

await sleep(100)

t.true(provider1Synced, 'provider1Synced')
t.true(provider2Synced, 'provider2Synced')

resolve('done')
})
})

test('has the server instance', async t => {
await new Promise(async resolve => {
const server = await newHocuspocus({
Expand Down Expand Up @@ -164,60 +216,6 @@ test('stops when an error is thrown in onLoadDocument', async t => {
})
})

test('disconnects all clients related to the document when an error is thrown in onLoadDocument', async t => {
const resolvesNeeded = 4

await new Promise(async resolve => {

const server = await newHocuspocus({
async onLoadDocument() {
return new Promise((resolve, fail) => {
setTimeout(() => {
// eslint-disable-next-line prefer-promise-reject-errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is removed because this behavior should not exist anymore. This test is asserting that if the onLoadDocument throws then provider2 should still be able to connect to it, which is what we do not want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the error may be transient, so if it throws it should be able to retry, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure i follow. Wouldn't it be up to the HocuspocusProvider to handle retry logic once the connection fails to be established.

Copy link
Contributor Author

@jordangarcia jordangarcia May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I add the following console logs to the test i see:

test.only('disconnects all clients related to the document when an error is thrown in onLoadDocument', async t => {
  const resolvesNeeded = 4

  await new Promise(async resolve => {

    const server = await newHocuspocus({
      async onLoadDocument() {
        return new Promise((resolve, fail) => {
          setTimeout(() => {
            // eslint-disable-next-line prefer-promise-reject-errors
            fail('ERROR')
          }, 250)
        })
      },
      async onStoreDocument(data) {
        t.fail('MUST NOT call onStoreDocument')
      },
    })

    let resolvedNumber = 0
    const resolver = () => {
      resolvedNumber += 1

      if (resolvedNumber >= resolvesNeeded) {
        t.is(server.documents.size, 0)
        t.is(server.getConnectionsCount(), 0)
        resolve('done')
      }
    }

    const provider1 = newHocuspocusProvider(server, {
      onConnect() {
        console.log('provider1 connect')
        resolver()
      },
      onClose(event) {
        provider1.disconnect()
        console.log('provider1 disconnect')
        resolver()
      },
    })

    const provider2 = newHocuspocusProvider(server, {
      onConnect() {
        console.log('provider2 connect')
        resolver()
      },
      onClose() {
        provider2.disconnect()
        console.log('provider2 disconnect')
        resolver()
      },
    })

  })

})
provider2 connect
provider1 disconnect
provider2 disconnect
provider2 disconnect

I feel like this is not what i would expect this behavior to do in the first place.

fail('ERROR')
}, 250)
})
},
async onStoreDocument(data) {
t.fail('MUST NOT call onStoreDocument')
},
})

let resolvedNumber = 0
const resolver = () => {
resolvedNumber += 1

if (resolvedNumber >= resolvesNeeded) {
t.is(server.documents.size, 0)
t.is(server.getConnectionsCount(), 0)
resolve('done')
}
}

const provider1 = newHocuspocusProvider(server, {
onConnect() {
resolver()
},
onClose(event) {
provider1.disconnect()
resolver()
},
})

const provider2 = newHocuspocusProvider(server, {
onConnect() {
resolver()
},
onClose() {
provider2.disconnect()
resolver()
},
})

})

})

test('stops when an error is thrown in onLoadDocument, even when authenticated', async t => {
await new Promise(async resolve => {
const server = await newHocuspocus({
Expand All @@ -241,108 +239,3 @@ test('stops when an error is thrown in onLoadDocument, even when authenticated',
})
})
})

test('if a new connection connects while the previous connection still fetches the document, it will just work properly', async t => {
let callsToOnLoadDocument = 0
const resolvesNeeded = 11

await new Promise(async resolve => {

let resolvedNumber = 0
const resolver = () => {
resolvedNumber += 1

if (resolvedNumber >= resolvesNeeded) {
t.is(callsToOnLoadDocument, 1)
resolve('done')
}
}

const server = await newHocuspocus({
onLoadDocument({ document }) {
return new Promise(async resolve => {
setTimeout(() => {
callsToOnLoadDocument += 1
document.getArray('foo').insert(0, [`bar-${callsToOnLoadDocument}`])
resolve(document)
}, 5000)
})
},
})

let provider1MessagesReceived = 0
const provider = newHocuspocusProvider(server, {
onSynced({ state }) {
// if (!state) return
t.is(server.documents.size, 1)

const value = provider.document.getArray('foo').get(0)
t.is(value, 'bar-1')

setTimeout(() => {
provider.document.getArray('foo').insert(0, ['bar-updatedAfterProvider1Synced'])
}, 100)

resolver()
},
onMessage() {
if (!provider.isSynced) return
provider1MessagesReceived += 1

const value = provider.document.getArray('foo').get(0)

if (provider1MessagesReceived === 1) {
// do nothing, this is just the ACK for the sync
} else if (provider1MessagesReceived === 2) {
// do nothing, this is just the ACK for the received update (set "bar-updatedAfterProvider1Synced")
} else if (provider1MessagesReceived === 3) {
t.is(value, 'bar-updatedAfterProvider1Synced')
} else {
t.is(value, 'bar-updatedAfterProvider2ReceivedMessageFrom1')
}

resolver()
},
})

let provider2MessagesReceived = 0
setTimeout(() => {
const provider2 = newHocuspocusProvider(server, {
onSynced({ state }) {
// if (!state) return

t.is(server.documents.size, 1)

const value = provider.document.getArray('foo').get(0)
t.is(value, undefined) // document hasnt loaded yet because it loads for 5sec, but this runs after ~2sec

resolver()
},
onMessage(data) {
if (!provider2.isSynced) return
provider2MessagesReceived += 1

const value = provider.document.getArray('foo').get(0)

if (provider2MessagesReceived === 1) {
// do nothing, this is just the ACK for the sync
t.is(value, undefined)
} else if (provider2MessagesReceived === 2) {
// initial state is now synced
t.is(value, undefined)
} else if (provider2MessagesReceived === 3) {
t.is(value, 'bar-updatedAfterProvider1Synced')
setTimeout(() => {
provider.document.getArray('foo').insert(0, ['bar-updatedAfterProvider2ReceivedMessageFrom1'])
}, 100)
} else {
t.is(value, 'bar-updatedAfterProvider2ReceivedMessageFrom1')
}

resolver()
},
})

}, 2000)
})
})