-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug when renaming multiple objects #1607
base: api-breakers
Are you sure you want to change the base?
Conversation
06ab024
to
c542e63
Compare
@n8maninger I assigned you as a reviewer Nate, in this particular case though I really need Chris to take a look at it as well. It's only couple of days and I'm sure he'll want to have a pass on it before we consider hot-fixing it, especially since he recently refactored the whole thing, his understanding of the various pitfalls and caveats should be quite fresh in his memory. |
// Directories returns the directories for the given path. When explicit is | ||
// true, the returned directories do not include the path itself should it be a | ||
// directory. The root path ('/') is always excluded. | ||
func Directories(path string, explicit bool) (dirs []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChrisSchinnerl I'm not a big fan of the naming here, I went with explicit
because we have this test called TestObjectsExplicitDir
re-reading it now makes me think I have to inverse it actually. Holding off on that for now until we review the entire thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh I'm not a fan of having the bool. It feels a bit too forced. The way I understand it is that it's only required for InsertDirectoriesForRename
. Looking at that method it seems very verbose. I'm pretty sure your first instinct was to just update the directory names instead of doing the whole mapping thing but you ran into issues due to the fact that the directories
table doesn't contain the bucket.
I think it might be worth adding the bucket id to the directories table and extending the unique key to cover that.
Then simplify InsertDirectoriesForRename
and finally get rid of the explicit
bool since we should be able to just update all directories of the bucket that we rename within. My guess is that that's also a lot faster than looping and inserting directories.
That makes the migration more complicated but if we ever decide to use the directories
table as a cache for fields like size
and health
of contained objects, we will need to introduce buckets to that table anyway.
"/firefly/trailer", | ||
"/firefly/s1/ep1", | ||
"/firefly/s1/ep2", | ||
"/firefly/s2/ep1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming feels a bit too shady ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes debugging so much easier but ok 😁 I was on the fence whether to rename it last minute myself.
// Directories returns the directories for the given path. When explicit is | ||
// true, the returned directories do not include the path itself should it be a | ||
// directory. The root path ('/') is always excluded. | ||
func Directories(path string, explicit bool) (dirs []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh I'm not a fan of having the bool. It feels a bit too forced. The way I understand it is that it's only required for InsertDirectoriesForRename
. Looking at that method it seems very verbose. I'm pretty sure your first instinct was to just update the directory names instead of doing the whole mapping thing but you ran into issues due to the fact that the directories
table doesn't contain the bucket.
I think it might be worth adding the bucket id to the directories table and extending the unique key to cover that.
Then simplify InsertDirectoriesForRename
and finally get rid of the explicit
bool since we should be able to just update all directories of the bucket that we rename within. My guess is that that's also a lot faster than looping and inserting directories.
That makes the migration more complicated but if we ever decide to use the directories
table as a cache for fields like size
and health
of contained objects, we will need to introduce buckets to that table anyway.
This was quite the rabbit hole, I'll try and concisely describe:
Aside from the above I noticed
MakeDirsForPath
doesn't work properly handle multi-byte UTF-8 characters because we useutf8.RuneCountInString
and then access the bytes of the string at certain position, not the runes.This PR also adds a migration which queries the objects that were corrupted by a rename and recreates their directory structure and points them to the proper parent. Has to be tested still on
arequipa
or another node that has corrupted directory structure.Fixes #1599
Directories
In
renterd
every object is linked to a directory. Important to note here is that directories are not scoped to a bucket. Objects from different buckets can point to the same directory if the path from the root is identical. They are merely used to store structure and to speed up listing queries.Adding an object
When you add an object we pre-create the directories using
MakeDirsForPath
. What's important about this method is that it returns the ID of the parent directory, which we'll use when we insert the object. Another important thing to mention is that if you supply it a path that ends with a/
it will return the ID of the parent folder.Renaming objects
When we rename an object we pre-create the directories again using
MakeDirsForPath
. We then update existing objects and update the key of an object to whatever renamed value. Important here is that the target object can already exist, in which case a rename fails unless you rename by force.Renaming multiple objects
Renaming multiple objects is used in the UI to rename folder names. That makes sense because you want all contents of that folder to still be in the same place after renaming the folder. This feature reuses
MakeDirsForPath
, and this was the problem because we use whatever parent directory ID to set as the directory ID of the objects we are renaming.Imagine having a file (and directory structure) like
/firefly/s1/ep1
and I want to renames1
tos01
. I would make directories for this new path andMakeDirsForPath
would return me the ID of thefirefly
directory. It would then go and update the episode 1 and point it to the firefly directory -> wreaking total havocInsertDirectoriesForRename
which return a mapping of old directory IDs to renamed directory IDsobject.Directories
which return directories for a path, it takes a special boolean that returns the path itself should the path be a directory (this is important when we rename multiple objects at a time)This very quickly becomes quite complicated because the UI also creates dummy objects for directories. Typing it all out here is quite lengthy so for now I'll refer to
TestRenameObjectsRegression
where I showcase some tricky renames in a regression test.