
feat(backup): [DO NOT MERGE] support importing multiplatform backups [WPB-11230] #2163

Draft: wants to merge 1 commit into develop
Conversation

@vitorhugods (Member) commented Nov 18, 2024

Issue

We have different backup formats on different platforms, which is not great for users.

The goal is to have a single format for all platforms. In the future, the same backup could also be used to sync history between clients, e.g. client A sending its history to client B when it logs in.

Important notes

  1. This PR is not meant to be merged.
  2. The multiplatform backup library is a work in progress.
  3. The goal is to get feedback and showcase its usage.
  4. Encryption/decryption is still being developed.

Because of [2], the backup.xcframework is not yet published to a regular dependency repository; it is added to the project directly, just to get quick feedback before automated deployment is set up.

Testing

  1. Download this file: https://github.com/user-attachments/assets/db06c630-153b-41c1-b533-8755babaa0c1
  2. Rename it to "something.ios_wbu" (the suffix is needed for now; GitHub doesn't allow all file formats).
  3. Log in on the iOS app.
  4. Import the file as a backup.
  5. Use an empty password (encryption is WIP).
  6. Check the console:

[screenshot: console output]


Checklist

  • Title contains a reference JIRA issue number like [WPB-XXX].
  • Description is filled and free of optional paragraphs.
  • Adds/updates automated tests.

@@ -208,6 +239,16 @@ extension CoreDataStack {
let backupStoreFile = backupDirectory.appendingPathComponent(databaseDirectoryName).appendingStoreFile()
let metadataURL = backupDirectory.appendingPathComponent(metadataFilename)

let multiplatformBackupFileName = MPBackup().ZIP_ENTRY_DATA
Collaborator

What are MPBackup() and ZIP_ENTRY_DATA?

Member Author

It was supposed to be a static constant, accessed as MPBackup.ZIP_ENTRY_DATA. It holds the name of the file that every multiplatform backup zip must contain.

So, if that entry exists, it's an MP Backup.

Considering that this lib might have to deal with encryption/decryption as well, this might change: we might make the whole file completely opaque to clients.

Collaborator

Yeah, it would make sense to me to simply pass the URL and have the importer throw an error if it's not what it expects.
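A rough sketch of that throwing-importer shape. All names here (MPBackupImporterSketch, MPBackupError, the marker entry name, the injected entry lister) are invented for illustration; the real library API may look different:

```swift
import Foundation

// Hypothetical errors for an importer that rejects non-multiplatform backups.
enum MPBackupError: Error {
    case notAMultiplatformBackup
}

struct MPBackupImporterSketch {
    // Name of the marker entry every multiplatform backup zip must contain
    // (stands in for MPBackup.ZIP_ENTRY_DATA; actual value unknown).
    static let markerEntryName = "data.pb"

    // Hypothetical entry-listing hook; a real implementation would read the
    // zip central directory (e.g. via a zip library).
    var listEntries: (URL) throws -> [String]

    // The caller just passes a URL; the importer throws if the file is not
    // what it expects.
    func validate(url: URL) throws {
        let entries = try listEntries(url)
        guard entries.contains(Self.markerEntryName) else {
            throw MPBackupError.notAMultiplatformBackup
        }
    }
}
```

With the lister injected, the detection rule ("the marker entry exists, therefore it's an MP Backup") stays testable without touching the filesystem.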

) {
// TODO: Figure out the actual self-user domain before importing
let importer = MPBackupImporter(selfUserDomain: "wire.com")
let result = importer.import(multiplatformBackupFilePath: mpBackupFile.path())
Collaborator

I would have expected the import to be asynchronous. Is it blocking?

Member Author

It is blocking at the moment.

I am considering making it async as well in the next iterations.

Collaborator

Would it be async or would it be a completion handler?
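Either shape can be layered over the current blocking call. A sketch, with a hypothetical blockingImport stand-in (the real MPBackupImporter API and result type may differ), showing a completion-handler wrapper and an async wrapper built on top of it:

```swift
import Foundation

// Stand-in for the currently blocking import call (hypothetical).
func blockingImport(path: String) -> String {
    // ...synchronously unzip and parse the backup here...
    return "imported:\(path)"
}

// Completion-handler shape: hop to a background queue, report back when done.
func importBackup(path: String, completion: @escaping (String) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        completion(blockingImport(path: path))
    }
}

// async/await shape, layered on top of the completion-handler version.
func importBackup(path: String) async -> String {
    await withCheckedContinuation { continuation in
        importBackup(path: path) { continuation.resume(returning: $0) }
    }
}
```

Offering the async overload keeps call sites clean, while the completion-handler version remains available for non-concurrency consumers.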

Comment on lines +201 to +213
let backupData = success.backupData
for user in backupData.users {
// TODO: Import users
print("Imported User \(user)")
}
for conversation in backupData.conversations {
// TODO: Import conversations
print("Imported Conversation \(conversation)")
}
for message in backupData.messages {
// TODO: Import messages
print("Imported Message \(message)")
}
Collaborator

Does this load the entire backup contents into memory? If so, do you think it could be a problem?

Member Author

This is a really good point.

I thought about this in the past, and it was my main concern when opting for Protobuf as the underlying backup format.
Although compact and efficient, Protobuf has few parsers that can read it as a stream; AFAIK it is not recommended for big data sets.

Since then, my idea has shifted from having a single entry within the zip file to having multiple entries, each one a Protobuf with a chunk of messages.

I was planning to have a single big Protobuf for the POC and then work internally within the library to make this more optimal in the future.

However, I am now seeing the opportunity to attack this issue earlier 🤔
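The chunking step itself is cheap. A sketch of how the multiple zip entries could be derived from a message list; the chunk size and the "messages_%03d" naming scheme are made up for illustration:

```swift
import Foundation

// Split items into fixed-size batches; each batch would become one
// Protobuf-encoded entry in the backup zip ("messages_000", "messages_001", ...).
func chunkEntryNames<T>(_ items: [T], chunkSize: Int) -> [(name: String, batch: ArraySlice<T>)] {
    precondition(chunkSize > 0)
    return stride(from: 0, to: items.count, by: chunkSize).enumerated().map { index, start in
        let end = min(start + chunkSize, items.count)
        return (String(format: "messages_%03d", index), items[start..<end])
    }
}
```

Writing each batch as its own entry lets the importer decode one entry at a time instead of materializing the whole data set.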

Collaborator

Breaking it up into chunks of manageable batches (i.e. one file for a bunch of objects) makes sense to me. From the consumer's point of view, I would imagine a kind of paging API:

for await conversations in backup.conversations {
  importConversations(conversations)
}

for await users in backup.users {
  importUsers(users)
}

for await messages in backup.messages {
  importMessages(messages)
}

Since these pages would be delivered asynchronously, the library could internally manage its memory consumption by parsing protobuf objects in batches.
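One possible shape for that, sketched with AsyncStream's lazy unfolding initializer and a fake chunk decoder (all names hypothetical). Each iteration of the consumer's for-await loop pulls and decodes exactly one batch, so only one chunk's objects are alive at a time:

```swift
import Foundation

// Hypothetical paged backup reader.
struct BackupReaderSketch {
    let chunkCount: Int

    // Stand-in for "unzip and Protobuf-decode entry N" of the backup file.
    func decodeChunk(_ index: Int) -> [String] {
        ["message-\(index)a", "message-\(index)b"]
    }

    // Lazily produces one decoded batch per iteration of a for-await loop.
    var messageBatches: AsyncStream<[String]> {
        var nextChunk = 0
        let count = chunkCount
        return AsyncStream(unfolding: {
            guard nextChunk < count else { return nil }   // end of stream
            defer { nextChunk += 1 }
            return self.decodeChunk(nextChunk)
        })
    }
}
```

A consumer would then write `for await batch in reader.messageBatches { ... }`, and the library could tune batch sizes internally without changing the API.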
