-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cozy-clisk):Implementing selective file dowload and fixing metadata dedupli… #930
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
import Minilog from '@cozy/minilog' | ||
import { Q } from 'cozy-client' | ||
import get from 'lodash/get' | ||
|
||
const log = Minilog('getFileIfExists') | ||
|
||
module.exports = getFileIfExists | ||
|
||
async function getFileIfExists({ client, entry, options, folderPath, slug }) { | ||
const fileIdAttributes = options.fileIdAttributes | ||
const sourceAccountIdentifier = options.sourceAccountIdentifier | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: const { fileIdAttributes, sourceAccountIdentifier } = options |
||
|
||
const isReadyForFileMetadata = | ||
fileIdAttributes && slug && sourceAccountIdentifier | ||
if (isReadyForFileMetadata) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. to me There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is to decide if enough Metadata are available. |
||
log.debug('Detecting file with metadata') | ||
const file = await getFileFromMetaData( | ||
client, | ||
entry, | ||
fileIdAttributes, | ||
sourceAccountIdentifier, | ||
slug | ||
) | ||
if (!file) { | ||
// no file with correct metadata, maybe the corresponding file already exist in the default | ||
// path from a previous version of the connector | ||
log.debug('Rolling back on detection by filename') | ||
return getFileFromPath(client, entry, folderPath) | ||
} else { | ||
return file | ||
} | ||
} else { | ||
log.debug('Rolling back on detection by filename') | ||
return getFileFromPath(client, entry, folderPath) | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We prefer the early return, it's a lot easier to read:
|
||
} | ||
|
||
async function getFileFromMetaData( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same here for naming convention. Please follow https://github.com/cozy/cozy-guidelines/#naming-of-functions |
||
client, | ||
entry, | ||
fileIdAttributes, | ||
sourceAccountIdentifier, | ||
slug | ||
) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. jsdoc please |
||
log.debug( | ||
`Checking existence of ${calculateFileKey(entry, fileIdAttributes)}` | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
and use fileKey later instead of calculating it 3 times |
||
const files = await client.queryAll( | ||
Q('io.cozy.files') | ||
.where({ | ||
metadata: { | ||
fileIdAttributes: calculateFileKey(entry, fileIdAttributes) | ||
}, | ||
trashed: false, | ||
cozyMetadata: { | ||
sourceAccountIdentifier, | ||
createdByApp: slug | ||
} | ||
}) | ||
.indexFields([ | ||
'metadata.fileIdAttributes', | ||
'trashed', | ||
'cozyMetadata.sourceAccountIdentifier', | ||
'cozyMetadata.createdByApp' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since the order is important (https://docs.cozy.io/en/tutorials/data/advanced/#indexes-performances-and-design) I'll changed to:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd also suggest to use a partial index for the trashed attribute: Q('io.cozy.files')
.where({
metadata: {
fileIdAttributes: calculateFileKey(entry, fileIdAttributes)
},
cozyMetadata: {
sourceAccountIdentifier,
createdByApp: slug
}
})
.indexFields([
'metadata.fileIdAttributes',
'cozyMetadata.sourceAccountIdentifier',
'cozyMetadata.createdByApp'
])
.partialIndex({
trashed: false
}) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why that? I though the same at first (putting it in a partial index) but then I checked There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since it's always there, it will always be indexed, but we only need the |
||
]) | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 1/ We should really avoid the queryAll() method. This method can return a lot of documents and can be pretty long to execute. Can't we restrict the returned data by querying only the last 12 months or else? 2/ I don't know if you need all the fields from this 3/ You queryAll() but after that, you only return the first result. You should use something like this instead:
be careful by changing to client.query() since query() will return an object with data. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, the queryAll is here only to know if more than one document satisfies the request, to log a waring if more than on file does it. My mistake in the first place. The get the same feature, we should then use client.query with limitBy(2) |
||
if (files && files[0]) { | ||
if (files.length > 1) { | ||
log.warn( | ||
`Found ${files.length} files corresponding to ${calculateFileKey( | ||
entry, | ||
fileIdAttributes | ||
)}` | ||
) | ||
} | ||
return files[0] | ||
} else { | ||
log.debug('File not found') | ||
return false | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should return null. Not boolean |
||
} | ||
} | ||
|
||
async function getFileFromPath(client, entry, folderPath) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. jsdoc 🙏 |
||
try { | ||
log.debug(`Checking existence of ${getFilePath({ entry, folderPath })}`) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. store the result of getFilePath and use it later 🙏 |
||
const result = await client | ||
.collection('io.cozy.files') | ||
.statByPath(getFilePath({ entry, folderPath })) | ||
return result.data | ||
} catch (err) { | ||
log.debug(err.message) | ||
return false | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same should return null |
||
} | ||
} | ||
|
||
function getFilePath({ file, entry, folderPath }) { | ||
if (file) { | ||
return folderPath + '/' + getAttribute(file, 'name') | ||
} else if (entry) { | ||
return folderPath + '/' + getFileName(entry) | ||
} | ||
} | ||
|
||
function getAttribute(obj, attribute) { | ||
return get(obj, `attributes.${attribute}`, get(obj, attribute)) | ||
} | ||
|
||
function calculateFileKey(entry, fileIdAttributes) { | ||
return fileIdAttributes | ||
.sort() | ||
.map(key => get(entry, key)) | ||
.join('####') | ||
} | ||
|
||
function getFileName(entry) { | ||
let filename | ||
if (entry.filename) { | ||
filename = entry.filename | ||
} else { | ||
log.error('Could not get a file name for the entry') | ||
return false | ||
} | ||
return sanitizeFileName(filename) | ||
} | ||
|
||
function sanitizeFileName(filename) { | ||
return filename.replace(/^\.+$/, '').replace(/[/?<>\\:*|":]/g, '') | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
import getFileIfExists from './getFileIfExists' | ||
|
||
describe('getFileIfExists', function () { | ||
function setup() { | ||
const statByPath = jest.fn() | ||
const queryAll = jest.fn() | ||
const client = { | ||
queryAll, | ||
collection: () => ({ | ||
statByPath | ||
}) | ||
} | ||
return { | ||
statByPath, | ||
queryAll, | ||
client | ||
} | ||
} | ||
|
||
it('should use metadata if fileIdAttributes, slug and sourceAccountIdentifier are present', async () => { | ||
const { client, queryAll } = setup() | ||
queryAll.mockResolvedValue([{ _id: 'abcd' }]) // Mocking the return with a simple file obj | ||
const file = await getFileIfExists({ | ||
client, | ||
entry: { | ||
filename: 'filename' | ||
}, | ||
options: { | ||
fileIdAttributes: ['filename'], | ||
sourceAccountIdentifier: 'Identifier' | ||
}, | ||
folderPath: '/fakepath', | ||
slug: 'slug' | ||
}) | ||
// Test that the Q request cotains our parameters | ||
expect(queryAll).toHaveBeenCalledWith( | ||
expect.objectContaining({ | ||
selector: expect.objectContaining({ | ||
cozyMetadata: expect.objectContaining({ | ||
createdByApp: 'slug', | ||
sourceAccountIdentifier: 'Identifier' | ||
}), | ||
metadata: expect.objectContaining({ fileIdAttributes: 'filename' }) | ||
}) | ||
}) | ||
) | ||
expect(file).toStrictEqual({ _id: 'abcd' }) | ||
}) | ||
|
||
it('should roll back to filename when with metadata return nothing', async () => { | ||
const { client, queryAll, statByPath } = setup() | ||
queryAll.mockResolvedValue(undefined) // Mocking an empty return | ||
statByPath.mockResolvedValue({ data: { _id: 'abcd' } }) | ||
const file = await getFileIfExists({ | ||
client, | ||
entry: { | ||
filename: 'filename' | ||
}, | ||
options: { | ||
fileIdAttributes: ['filename'], | ||
sourceAccountIdentifier: 'Identifier' | ||
}, | ||
folderPath: '/fakepath', | ||
slug: 'slug' | ||
}) | ||
expect(queryAll).toHaveBeenCalledWith( | ||
expect.objectContaining({ | ||
selector: expect.objectContaining({ | ||
cozyMetadata: expect.objectContaining({ | ||
createdByApp: 'slug', | ||
sourceAccountIdentifier: 'Identifier' | ||
}), | ||
metadata: expect.objectContaining({ fileIdAttributes: 'filename' }) | ||
}) | ||
}) | ||
) | ||
expect(statByPath).toHaveBeenCalledWith('/fakepath/filename') | ||
expect(file).toStrictEqual({ _id: 'abcd' }) | ||
}) | ||
|
||
it('shoud use filename if sourceAccountIdentifier is missing', async () => { | ||
const { statByPath, client } = setup() | ||
statByPath.mockResolvedValue({ data: { _id: 'abcd' } }) | ||
const file = await getFileIfExists({ | ||
client, | ||
entry: { | ||
filename: 'filename' | ||
}, | ||
options: { fileIdAttributes: ['filename'] }, | ||
folderPath: '/fakepath', | ||
slug: 'slug' | ||
}) | ||
expect(statByPath).toHaveBeenCalledWith('/fakepath/filename') | ||
expect(file).toStrictEqual({ _id: 'abcd' }) | ||
}) | ||
|
||
it('shoud use filename if slug is missing', async () => { | ||
const { statByPath, client } = setup() | ||
statByPath.mockResolvedValue({ data: { _id: 'abcd' } }) | ||
const file = await getFileIfExists({ | ||
client, | ||
entry: { | ||
filename: 'filename' | ||
}, | ||
options: { | ||
fileIdAttributes: ['filename'], | ||
sourceAccountIdentifier: 'Identifier' | ||
}, | ||
folderPath: '/fakepath' | ||
}) | ||
expect(statByPath).toHaveBeenCalledWith('/fakepath/filename') | ||
expect(file).toStrictEqual({ _id: 'abcd' }) | ||
}) | ||
it('shoud use filename if fileIdAttributes is missing', async () => { | ||
const { statByPath, client } = setup() | ||
statByPath.mockResolvedValue({ data: { _id: 'abcd' } }) | ||
const file = await getFileIfExists({ | ||
client, | ||
entry: { | ||
filename: 'filename' | ||
}, | ||
options: { sourceAccountIdentifier: 'Identifier' }, | ||
folderPath: '/fakepath', | ||
slug: 'slug' | ||
}) | ||
expect(statByPath).toHaveBeenCalledWith('/fakepath/filename') | ||
expect(file).toStrictEqual({ _id: 'abcd' }) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jsdoc please with
//ts-check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make a request to the cozy, then the method should be called fetchSomething (https://github.com/cozy/cozy-guidelines/#naming-of-functions)
Get should only be used to get something in memory