[FEATURE] Separation of file meta data #345
Comments
This is more of a discussion than an issue.
I need more time to check this and think it over. I will keep you updated.
Similar to #301.
Regarding the creation of a reference_objects table instance per allocation: as per more recent discussions, we should also include allocation_updates in this optimization. The full allocation_id (varchar(64)) is too long to use as a suffix, e.g. reference_objects_xxxxxxxx, so we should use a unique index generated by the allocations table itself. I propose that the standard for this be obj_idn, referring to a unique hash of obj_id, so in the case of allocations,
If wrapper functions can be implemented that initially return the existing table until the new model is implemented, then the vast bulk of the change can be made in readiness without a breaking change.
Note: the stats functions, as returned by _stats, seem to be the only place where queries need to reference multiple allocations. That method is inefficient and needs replacing with a more modular one anyway.
This issue is incorporated in #627.
Currently we put the file metadata of all allocations into a single table, `reference_objects`. Since there can be multiple allocations, each allocation can have a multitude of files and directories, and soft delete is implemented (i.e. a row is not removed by a delete query; instead its `deleted_at` field is set to the current timestamp), the `reference_objects` table grows very large.
Also, the primary id of file metadata in `reference_objects` won't be the same across blobbers for the same allocation. It is important for the file ref id to be unique. It will have use cases in future applications, just like the inode number on a Linux system; currently it is required by at least 0fs.
One solution for the file ref id issue would be to add a new column, say `unique_id`, to the `reference_objects` table and make it unique within an allocation.
The other solution would be to add a new file metadata table for each allocation. This makes file metadata more granular. The primary id of file metadata would also be consistent across blobbers. When an allocation expires, the blobber can simply drop the table. Since queries for each allocation would look only at that allocation's own table, performance should improve.
A further benefit later on would be to let clients choose indexing as per their requirements.
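The per-allocation table option can be sketched as follows. This is an illustration only, written in Python against SQLite rather than the blobber's real database layer; the integer `idn` suffix, the reduced column set, and every function name here are assumptions made for the example.

```python
import sqlite3
import time


def table_name(idn):
    # `idn` must be an int; the table name is built only from a formatted
    # integer, so no user-supplied text ends up in the SQL identifier.
    return f"reference_objects_{idn:08d}"


def create_allocation_table(conn, idn):
    # One file metadata table per allocation (hypothetical minimal schema).
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table_name(idn)} ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT NOT NULL,"
        " deleted_at REAL)"
    )


def add_ref(conn, idn, path):
    cur = conn.execute(
        f"INSERT INTO {table_name(idn)} (path) VALUES (?)", (path,)
    )
    return cur.lastrowid


def soft_delete(conn, idn, ref_id):
    # Soft delete: keep the row and only stamp deleted_at.
    conn.execute(
        f"UPDATE {table_name(idn)} SET deleted_at = ? WHERE id = ?",
        (time.time(), ref_id),
    )


def live_refs(conn, idn):
    return conn.execute(
        f"SELECT id, path FROM {table_name(idn)}"
        " WHERE deleted_at IS NULL ORDER BY id"
    ).fetchall()


def drop_allocation_table(conn, idn):
    # On allocation expiry the whole table goes away, including
    # soft-deleted rows.
    conn.execute(f"DROP TABLE IF EXISTS {table_name(idn)}")
```

Because each table numbers its own rows, two blobbers that apply the same writes in the same order would assign the same ids for an allocation, regardless of what other allocations they host. That consistency depends on the ordering assumption and is not guaranteed by the schema alone.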