[FEATURE] Separation of file meta data #345

lpoli · 2021-09-20T03:40:11Z

Currently, the file metadata for all allocations is stored in a single table, reference_objects. There can be many allocations, each allocation can hold a large number of files and directories, and deletes are soft (a delete query does not remove the row; it sets the row's deleted_at field to the current timestamp), so the reference_objects table grows very large.

Also, the primary ID of a file's metadata in reference_objects is not the same across blobbers for the same allocation. It is important for the file ref ID to be unique. It will have use cases in future applications, much like the inode number on a Linux system; currently it is required by at least 0fs.

One solution to the file ref ID issue would be to add a new column to the reference_objects table, say unique_id, and make it unique within an allocation.

The other solution would be to create a separate file metadata table for each allocation. This makes file metadata more granular, and the primary ID of file metadata would be consistent across blobbers. When an allocation expires, the blobber can simply drop the table. Since queries for each allocation would only look at that allocation's own table, performance should improve as well.
A further benefit later on would be to let clients choose indexing according to their requirements.
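
As a rough illustration of the per-allocation table lifecycle, here is a minimal Go sketch that builds the create and drop statements. It assumes PostgreSQL (for the `LIKE ... INCLUDING ALL` clause), and the helper names `refTableName`, `createRefTableSQL` and `dropRefTableSQL` are hypothetical, not existing blobber code. Because a table name cannot be passed as a bound query parameter, the suffix is validated before it is spliced into SQL.

```go
package main

import (
	"fmt"
	"regexp"
)

// suffixPattern restricts suffixes to characters that are safe inside an
// unquoted SQL identifier, since table names cannot be bound as parameters.
var suffixPattern = regexp.MustCompile(`^[a-z0-9_]+$`)

// refTableName builds the (hypothetical) per-allocation table name.
func refTableName(suffix string) (string, error) {
	if !suffixPattern.MatchString(suffix) {
		return "", fmt.Errorf("unsafe table suffix %q", suffix)
	}
	return "reference_objects_" + suffix, nil
}

// createRefTableSQL returns DDL that clones the structure of the shared
// reference_objects table. Assumes PostgreSQL.
func createRefTableSQL(suffix string) (string, error) {
	name, err := refTableName(suffix)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("CREATE TABLE IF NOT EXISTS %s (LIKE reference_objects INCLUDING ALL)", name), nil
}

// dropRefTableSQL returns the statement a blobber could run when the
// allocation expires, instead of soft-deleting rows one by one.
func dropRefTableSQL(suffix string) (string, error) {
	name, err := refTableName(suffix)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("DROP TABLE IF EXISTS %s", name), nil
}

func main() {
	builders := []func(string) (string, error){createRefTableSQL, dropRefTableSQL}
	for _, build := range builders {
		stmt, err := build("42")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(stmt)
	}
}
```

The suffix `42` is only a placeholder; what the suffix should actually be is an open question in this proposal.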

lpoli · 2021-09-20T03:40:39Z

This is more of a discussion than an issue.

cnlangzi · 2021-09-23T02:18:27Z

I need more time to check and think about it. I will keep you updated.

cnlangzi · 2021-10-02T01:17:38Z

Similar to #301.

sculptex · 2022-03-15T11:01:21Z

Regarding the creation of per-allocation instances of the reference_objects table:

As per more recent discussions, we should also include allocation_updates in this optimization.

The full allocation_id (varchar(64)) is too long to use as a table-name suffix, e.g. reference_objects_xxxxxxxx, so we should use a unique index generated by the allocations table itself.

I propose a standard naming convention for this: obj_idn, referring to a unique hash of obj_id. In the case of allocations:
allocation_idn (int) is added as the primary key, and
allocation_id becomes an indexed field.
References to allocation_id can then be replaced by the more compact allocation_idn, for example as the suffix of reference_objects_nnnn and allocation_updates_nnnn. There is also the potential to replace such obj_id key references in other tables, e.g. replacing
allocation_id varchar(64) with
allocation_idn (int)
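
To make the length argument concrete, here is a small Go sketch comparing the two suffix styles. It assumes PostgreSQL with its default identifier limit of 63 bytes; the allocation ID is a made-up 64-character string, and the helper names (`tableWithID`, `tableWithIDN`, `fitsIdentifier`) are hypothetical.

```go
package main

import "fmt"

// maxIdentLen is PostgreSQL's default identifier limit (NAMEDATALEN - 1).
const maxIdentLen = 63

// exampleAllocationID is a made-up 64-character ID, used only for illustration.
const exampleAllocationID = "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"

// tableWithID suffixes the base table name with the full allocation_id.
func tableWithID(base, allocationID string) string {
	return base + "_" + allocationID
}

// tableWithIDN suffixes the base table name with the compact integer key.
func tableWithIDN(base string, allocationIDN int) string {
	return fmt.Sprintf("%s_%d", base, allocationIDN)
}

// fitsIdentifier reports whether name is within the identifier limit.
func fitsIdentifier(name string) bool {
	return len(name) <= maxIdentLen
}

func main() {
	for _, base := range []string{"reference_objects", "allocation_updates"} {
		long := tableWithID(base, exampleAllocationID)
		short := tableWithIDN(base, 1234)
		fmt.Printf("%s: %d chars, fits=%v\n", short, len(short), fitsIdentifier(short))
		fmt.Printf("%s...: %d chars, fits=%v\n", long[:24], len(long), fitsIdentifier(long))
	}
}
```

With a 64-character ID the suffix alone already exceeds the limit, whatever the base name is, whereas an integer suffix leaves plenty of room.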

If wrapper functions can be implemented that initially return the existing table until the new model is in place, then the vast bulk of the change can be made in readiness, without a breaking change.
For example, references to the reference_objects table would be replaced by get_reference_objects_table(allocation_id). Initially this just returns the reference_objects table, but once the model change is implemented, the function can be switched to return the reference_objects_nnnn table (which can be fetched from an in-memory map[]). The simultaneous dropping of the (now redundant) allocation_id field from the table and struct would also require handling.
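
A minimal Go sketch of such a wrapper might look like the following. The type and method names (`tableResolver`, `register`, `setPerAllocation`, `referenceObjectsTable`) are hypothetical, and the sketch assumes the allocation_id to allocation_idn mapping has already been loaded into memory from the allocations table.

```go
package main

import (
	"fmt"
	"sync"
)

// tableResolver maps an allocation_id to the table that holds its file
// metadata. While perAllocation is false it always returns the shared table,
// so callers can be migrated before the data model changes.
type tableResolver struct {
	mu            sync.RWMutex
	perAllocation bool
	idn           map[string]int // allocation_id -> allocation_idn
}

func newTableResolver(perAllocation bool) *tableResolver {
	return &tableResolver{perAllocation: perAllocation, idn: make(map[string]int)}
}

// register records the compact key for an allocation.
func (r *tableResolver) register(allocationID string, allocationIDN int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.idn[allocationID] = allocationIDN
}

// setPerAllocation switches between the shared table and per-allocation tables.
func (r *tableResolver) setPerAllocation(enabled bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.perAllocation = enabled
}

// referenceObjectsTable plays the role of get_reference_objects_table.
func (r *tableResolver) referenceObjectsTable(allocationID string) (string, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if !r.perAllocation {
		return "reference_objects", nil
	}
	idn, ok := r.idn[allocationID]
	if !ok {
		return "", fmt.Errorf("unknown allocation %q", allocationID)
	}
	return fmt.Sprintf("reference_objects_%d", idn), nil
}

func main() {
	r := newTableResolver(false)
	r.register("alloc-a", 7)

	before, _ := r.referenceObjectsTable("alloc-a")
	r.setPerAllocation(true)
	after, _ := r.referenceObjectsTable("alloc-a")

	fmt.Println(before, after)
}
```

Call sites only ever see the resolver, so flipping the switch does not require touching them again. An equivalent resolver could cover allocation_updates.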

Note: the stats functions, as returned by _stats, seem to be the only place where queries need to reference multiple allocations. This method is inefficient and needs replacing with a more modular method anyway.

lpoli · 2022-04-15T02:17:35Z

This issue is incorporated in #627.

lpoli assigned cnlangzi and guruhubb Sep 20, 2021

cnlangzi added this to the v1.0.2 milestone Oct 2, 2021

kushthedude added the hacktoberfest label Oct 3, 2021

kushthedude removed the hacktoberfest label Oct 25, 2021

moldis changed the title ~~Separation of file meta data~~ [FEATURE] Separation of file meta data Oct 25, 2021

cnlangzi assigned lpoli and unassigned cnlangzi Feb 24, 2022

cnlangzi added post-mainnet and removed post-mainnet labels Mar 24, 2022

lpoli closed this as completed Apr 15, 2022
