feat(cozy-clisk):Implementing selective file dowload and fixing metadata dedupli… #930

LucsT · 2023-03-17T10:09:11Z

This PR brings a new exported function to cozy-clisk for the flagship app.
The launcher needs this function to evaluate early whether the file can already be found on the cozy.
This code will be used by this PR on the flagship app: cozy/cozy-flagship-app#670

-> Separation and adaptation of the function getFileIfExists in the lib
-> Evaluation of some parameters sooner in the process (folderPath, sourceAccountIdentifier, ...)
-> Adding some unit tests that cover the function getFileIfExists.
-> Fixing deduplication on metadata, which was not operational before because of the structure of the received data

Needs to be done before merge
-> Adapting the timing wrapper to this new lib function.

Should be done soon
-> Moving 'shouldReplace' sooner in the process
-> We lost the 'free' metadata update when the file is already present, as we no longer update the file each time. This needs to be adapted (e.g. adding a qualification to an already downloaded file, ...)
-> At minimum, write basic saveFiles tests.

…data deduplication

Crash-- · 2023-03-20T17:42:09Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+
+module.exports = getFileIfExists
+
+async function getFileIfExists({ client, entry, options, folderPath, slug }) {


JSDoc please, with // @ts-check

If we make a request to the cozy, then the method should be called fetchSomething (https://github.com/cozy/cozy-guidelines/#naming-of-functions)

Get should only be used to get something in memory
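
To illustrate the convention, here is a minimal sketch. It assumes a hypothetical `remote` object with an async `request()` method standing in for the cozy; `fetchFileByPath` and `getFileFromCache` are illustrative names, not functions from cozy-clisk.

```javascript
// @ts-check

/**
 * Fetches a file from a remote source: a request is made, hence "fetch".
 * Hypothetical helper, for illustration only.
 *
 * @param {{ request: (path: string) => Promise<{ data: object | null }> }} remote - stand-in for the cozy
 * @param {string} path - path of the file to look up
 * @returns {Promise<object | null>} the file document, or null when absent
 */
async function fetchFileByPath(remote, path) {
  const { data } = await remote.request(path)
  return data || null
}

/**
 * Gets a file from an in-memory cache: no request is made, hence "get".
 *
 * @param {Map<string, object>} cache - files already loaded in memory
 * @param {string} path - path of the file to look up
 * @returns {object | null} the cached file document, or null when absent
 */
function getFileFromCache(cache, path) {
  return cache.get(path) || null
}
```

The name alone then tells the caller whether a call may hit the network.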

Crash-- · 2023-03-20T17:42:35Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+
+async function getFileIfExists({ client, entry, options, folderPath, slug }) {
+  const fileIdAttributes = options.fileIdAttributes
+  const sourceAccountIdentifier = options.sourceAccountIdentifier


nit: const { fileIdAttributes, sourceAccountIdentifier } = options

Crash-- · 2023-03-20T17:43:47Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+  } else {
+    log.debug('Rolling back on detection by filename')
+    return getFileFromPath(client, entry, folderPath)
+  }


We prefer the early return, it's a lot easier to read:

if (!isReadyForFileMetadata) { return getFileFromPath } else { ... ... }
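
A minimal sketch of the early-return shape, under stated assumptions: the two lookup functions are injected as parameters (`fetchFromPath`, `fetchFromMetadata`, both hypothetical names) so that the example stays self-contained, and the condition is renamed `hasEnoughMetadata` for readability.

```javascript
/**
 * Chooses the lookup strategy with an early return.
 * fetchFromPath and fetchFromMetadata are hypothetical injected lookups.
 */
async function fetchFileIfExists({
  entry,
  options,
  slug,
  fetchFromPath,
  fetchFromMetadata
}) {
  const { fileIdAttributes, sourceAccountIdentifier } = options
  const hasEnoughMetadata = Boolean(
    fileIdAttributes && slug && sourceAccountIdentifier
  )

  if (!hasEnoughMetadata) {
    // Not enough metadata: fall back to detection by filename and stop here.
    return fetchFromPath(entry)
  }

  // Main path, no longer nested inside an else branch.
  return fetchFromMetadata(entry, fileIdAttributes)
}
```

With the fallback handled first, the main path reads top to bottom without extra nesting.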

Crash-- · 2023-03-20T17:46:14Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+  }
+}
+
+async function getFileFromMetaData(


same here for naming convention. Please follow https://github.com/cozy/cozy-guidelines/#naming-of-functions

Crash-- · 2023-03-20T17:46:19Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+  fileIdAttributes,
+  sourceAccountIdentifier,
+  slug
+) {


jsdoc please

Crash-- · 2023-03-20T17:48:00Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+
+  const isReadyForFileMetadata =
+    fileIdAttributes && slug && sourceAccountIdentifier
+  if (isReadyForFileMetadata) {


To me, isReadyForFileMetadata suggests that this method will be called several times in a short period of time, but I don't think that is the meaning here. It's more like hasEnoughMetadata or something, no?

Yes, this is to decide whether enough metadata is available.
I can change some of the naming, but this kind of naming was inherited from the existing saveFiles.
The same goes for the function names you pointed out.

Crash-- · 2023-03-20T17:57:24Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+        'cozyMetadata.sourceAccountIdentifier',
+        'cozyMetadata.createdByApp'
+      ])
+  )


1/ We should really avoid the queryAll() method. It can return a lot of documents and can take a long time to execute. Can't we restrict the returned data, for example by querying only the last 12 months?

2/ I don't know if you need all the fields from this io.cozy.files, but you can reduce the size of the returned data by selecting only the ones you need, adding: .select(['name', 'cozyMetadata.createdAt'])

3/ You call queryAll(), but after that you only return the first result. You should use something like this instead:

client.query(Q('io.cozy.files').where...limitBy(1))

Be careful when changing to client.query(), since query() returns an object with a data property.

Yes, the queryAll is only here to know whether more than one document satisfies the request, so that a warning can be logged when more than one file does.

My mistake in the first place.

To get the same feature, we should then use client.query with limitBy(2).
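
The limitBy(2) idea can be sketched as follows. This assumes, as noted in the review, that `client.query()` resolves to an object with a `data` array; `fetchFirstMatchingFile` is a hypothetical name, and the client and query definition are passed in as parameters so the sketch does not depend on how the query is built.

```javascript
/**
 * Returns the first matching file, warning when the match is not unique.
 * Hypothetical helper, not the code from this PR.
 *
 * @param {{ query: Function }} client - object exposing an async query()
 * @param {{ limitBy: Function }} queryDefinition - query exposing limitBy()
 * @param {{ warn: Function }} [log] - logger, defaults to console
 * @returns {Promise<object | null>} the first match, or null when none
 */
async function fetchFirstMatchingFile(client, queryDefinition, log = console) {
  // Two results are enough to tell "unique" from "duplicated".
  const { data } = await client.query(queryDefinition.limitBy(2))

  if (!data || data.length === 0) {
    return null
  }
  if (data.length > 1) {
    log.warn('More than one file matches this metadata, using the first one')
  }
  return data[0]
}
```

Fetching at most two documents keeps the duplicate warning without loading every match.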

Crash-- · 2023-03-20T17:59:29Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+) {
+  log.debug(
+    `Checking existence of ${calculateFileKey(entry, fileIdAttributes)}`
+  )


const fileKey = calculateFileKey(entry, fileIdAttributes)

and use fileKey later instead of calculating it 3 times

Crash-- · 2023-03-20T18:01:10Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+        'metadata.fileIdAttributes',
+        'trashed',
+        'cozyMetadata.sourceAccountIdentifier',
+        'cozyMetadata.createdByApp'


Since the order is important (https://docs.cozy.io/en/tutorials/data/advanced/#indexes-performances-and-design), I'd change it to:

[ 'metadata.fileIdAttributes', 'cozyMetadata.sourceAccountIdentifier', 'cozyMetadata.createdByApp', 'trashed' ]

I'd also suggest using a partial index for the trashed attribute:

Q('io.cozy.files')
  .where({
    metadata: { fileIdAttributes: calculateFileKey(entry, fileIdAttributes) },
    cozyMetadata: { sourceAccountIdentifier, createdByApp: slug }
  })
  .indexFields([
    'metadata.fileIdAttributes',
    'cozyMetadata.sourceAccountIdentifier',
    'cozyMetadata.createdByApp'
  ])
  .partialIndex({ trashed: false })

Why is that? I thought the same at first (putting it in a partial index), but then I checked the trashed attribute and it seems that it is always there. So I don't understand why it should go in a partial index. :'(

Since it's always there, it will always be indexed, but we only need the trashed: false docs. So the partial index will allow a lighter index with only non-trashed files
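
A plain JavaScript illustration of why the partial index is lighter. This is a toy model, not CouchDB code: `buildIndex` is a hypothetical helper that maps a key to document ids and skips any document rejected by the partial filter.

```javascript
// Builds a toy index (key -> document ids), optionally restricted by a partial filter.
function buildIndex(docs, keyOf, partialFilter = () => true) {
  const index = new Map()
  for (const doc of docs) {
    if (!partialFilter(doc)) continue // filtered-out docs never enter the index
    const key = keyOf(doc)
    const ids = index.get(key) || []
    ids.push(doc._id)
    index.set(key, ids)
  }
  return index
}

const docs = [
  { _id: 'a', trashed: false, metadata: { fileIdAttributes: 'k1' } },
  { _id: 'b', trashed: true, metadata: { fileIdAttributes: 'k2' } },
  { _id: 'c', trashed: false, metadata: { fileIdAttributes: 'k3' } }
]
const keyOf = doc => doc.metadata.fileIdAttributes

const fullIndex = buildIndex(docs, keyOf)
const partialIndex = buildIndex(docs, keyOf, doc => doc.trashed === false)

console.log(fullIndex.size, partialIndex.size) // prints "3 2"
```

The trashed document is simply absent from the partial index, so lookups for non-trashed files scan fewer entries.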

Crash-- · 2023-03-20T18:01:36Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+  }
+}
+
+async function getFileFromPath(client, entry, folderPath) {


jsdoc 🙏

Crash-- · 2023-03-20T18:01:54Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+
+async function getFileFromPath(client, entry, folderPath) {
+  try {
+    log.debug(`Checking existence of ${getFilePath({ entry, folderPath })}`)


store the result of getFilePath and use it later 🙏

Crash-- · 2023-03-20T18:03:12Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+    return files[0]
+  } else {
+    log.debug('File not found')
+    return false


should return null. Not boolean

Crash-- · 2023-03-20T18:03:30Z

packages/cozy-clisk/src/launcher/getFileIfExists.js

+    return result.data
+  } catch (err) {
+    log.debug(err.message)
+    return false


same should return null
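
A minimal sketch combining the last remarks on the path lookup: the path is computed once by the caller and reused, and the "not found" case returns null rather than a boolean. `statByPath` is a hypothetical injected function standing in for the actual request, and `fetchFileFromPath` is an illustrative name.

```javascript
/**
 * Looks a file up by path, returning null when it cannot be found.
 * statByPath is a hypothetical injected async function resolving to { data }.
 *
 * @param {(path: string) => Promise<{ data: object }>} statByPath
 * @param {string} filePath - path computed once by the caller
 * @param {{ debug: Function }} [log] - logger, defaults to console
 * @returns {Promise<object | null>}
 */
async function fetchFileFromPath(statByPath, filePath, log = console) {
  try {
    log.debug(`Checking existence of ${filePath}`)
    const result = await statByPath(filePath)
    return result.data
  } catch (err) {
    log.debug(err.message)
    return null // "no file" is an absent value, not a boolean
  }
}
```

Callers can then rely on a single return type: a file document or null.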

Crash-- · 2023-03-21T07:36:41Z

Something bothers me about this: we make the requests file by file. If there are a lot of files, that is really not going to be great.

Crash-- · 2023-03-21T07:38:54Z

Also, we are supposed to have a context in the clisk with the files already retrieved, etc. Shouldn't we pass it in here?

doubleface · 2023-03-21T08:22:44Z

@Crash-- Also, we are supposed to have a context in the clisk with the files already retrieved, etc. Shouldn't we pass it in here?

We removed the context retrieval back when the files did not have the right createdByApp, and we have not put it back since. I think we can create a card for that.

doubleface · 2023-03-21T08:27:52Z

@Crash-- Something bothers me about this: we make the requests file by file. If there are a lot of files, that is really not going to be great.

Isn't an index supposed to make this fast enough? Otherwise, we will have to rebuild an index in memory anyway to know whether a file already exists.

At the same time, it's true that here we are on a phone, and the network response time is likely to be higher. I can't really say by how much.

But I suppose it's a good thing that we can now take measurements more easily.
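
The alternative discussed above (one request, then an index rebuilt in memory) can be sketched as follows. This is only an illustration of the trade-off: `buildExistingFilesIndex` and `findExistingFile` are hypothetical names, and the document shape (`metadata.fileIdAttributes` holding the computed file key) is an assumption based on the query in this review.

```javascript
/**
 * Indexes already fetched files by their file key.
 * Assumes each file stores its key in metadata.fileIdAttributes.
 *
 * @param {Array<object>} existingFiles - files fetched in a single request
 * @returns {Map<string, object>} file key -> first file found with that key
 */
function buildExistingFilesIndex(existingFiles) {
  const index = new Map()
  for (const file of existingFiles) {
    const key = file.metadata && file.metadata.fileIdAttributes
    // Keep the first file seen for a key; files without a key are skipped.
    if (key && !index.has(key)) {
      index.set(key, file)
    }
  }
  return index
}

/**
 * Resolves an entry locally, without any request.
 *
 * @param {Map<string, object>} index
 * @param {string} fileKey
 * @returns {object | null}
 */
function findExistingFile(index, fileKey) {
  return index.get(fileKey) || null
}
```

This trades one larger request and some memory for the absence of per-file round trips, which is the cost that would need to be measured on a phone.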

Crash-- · 2024-01-02T07:17:01Z

@doubleface @LucsT what is the status of this PR?

LucsT changed the title ~~feat[cozy-clisk]:Implementing selective file dowload and fixing metadata dedupli…~~ feat(cozy-clisk):Implementing selective file dowload and fixing metadata dedupli… Mar 17, 2023

LucsT force-pushed the feat/SelectiveFileDownload branch 2 times, most recently from 2c89c24 to ecb3e4e Compare March 17, 2023 10:16

feat(cozy-clisk): Implementing selective file dowload and fixing meta…

c61d290

…data deduplication

LucsT force-pushed the feat/SelectiveFileDownload branch from ecb3e4e to c61d290 Compare March 17, 2023 10:22

LucsT mentioned this pull request Mar 17, 2023

Feat/selective file download cozy/cozy-flagship-app#670

Closed

LucsT marked this pull request as ready for review March 20, 2023 15:26

LucsT requested a review from a team as a code owner March 20, 2023 15:26

