[Feature request]Custom column list support for csv dump #122

Open
Anutrix opened this issue Feb 6, 2024 · 6 comments

Comments

@Anutrix

Anutrix commented Feb 6, 2024

We don't always need every column when generating the CSV/JSON dump. I'd like a way to supply a custom column list for the CSV dump.
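
To make the request concrete, here is a minimal sketch of what selecting columns could look like. It uses only the standard library; the helper names `select_indices` and `project_row` and the column names in the usage note are made up for illustration and are not part of the library's API. It also skips CSV quoting, which a real writer would need.

```rust
/// Hypothetical helper: map a user-supplied column list onto positions
/// in the full header. Indices are returned in the order requested;
/// an unknown column name is reported as an error.
fn select_indices(header: &[&str], wanted: &[&str]) -> Result<Vec<usize>, String> {
    wanted
        .iter()
        .map(|name| {
            header
                .iter()
                .position(|h| h == name)
                .ok_or_else(|| format!("unknown column: {name}"))
        })
        .collect()
}

/// Hypothetical helper: project one record onto the selected indices and
/// join the fields with commas. No quoting or escaping is done here.
fn project_row(row: &[&str], indices: &[usize]) -> String {
    indices
        .iter()
        .map(|&i| row.get(i).copied().unwrap_or(""))
        .collect::<Vec<_>>()
        .join(",")
}
```

For example, with an illustrative header `["EntryId", "FullPath", "FileSize"]` and a requested list `["FileSize", "EntryId"]`, `select_indices` yields the positions of those two columns, and `project_row` then emits only those fields for each record.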

@omerbenamram
Owner

@Anutrix
Author

Anutrix commented Feb 6, 2024

I thought fewer columns during load and dump would mean a small speed boost.

I am trying to reduce the time for MFT-to-CSV conversion, so xsv won't help.

@omerbenamram
Owner

My intuition is that this wouldn't be meaningful: simply converting fields to CSV is a trivial amount of compute. You're welcome to modify the code and check the results if you'd like, or you can integrate the library into whatever downstream task you're doing for better performance.

@Anutrix
Author

Anutrix commented Feb 12, 2024

Reducing columns didn't help much. So I tried a different approach.

I tried simply skipping the parse step in the from_entry function when the entry is deleted:

// An entry without the ALLOCATED flag is treated as deleted.
let is_deleted = !entry.header.flags.contains(EntryFlags::ALLOCATED);

if is_deleted {
    return None;
}

And I skipped serializing/writing it in the print_csv_entry function:

// Only write entries that from_entry did not skip.
if let Some(entry) = flat_entry {
    writer.serialize(entry)?;
}

Conversion got about 60% faster for my data.
I know skipping deleted files might only suit my use case, but it worked. Thanks for the awesome library.
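
Put together, the two changes amount to an early return plus a filtered write loop. The sketch below shows that shape with standard-library code only. `Entry`, `FlatEntry`, the `ALLOCATED` constant and `dump` are simplified stand-ins invented for this example, not the library's real types, and the actual speedup will depend on the data.

```rust
use std::io::{self, Write};

/// Stand-in for the entry flags; only the one bit used here is modeled.
const ALLOCATED: u16 = 0x0001;

/// Hypothetical, heavily simplified stand-ins for the library's types.
struct Entry {
    flags: u16,
    name: String,
}

struct FlatEntry {
    name: String,
}

/// Returns None for deleted (unallocated) entries before any further
/// work is done, so the expensive parsing is never reached for them.
fn from_entry(entry: &Entry) -> Option<FlatEntry> {
    let is_deleted = (entry.flags & ALLOCATED) == 0;
    if is_deleted {
        return None;
    }
    // The costly attribute parsing would happen here.
    Some(FlatEntry { name: entry.name.clone() })
}

/// Writes one line per surviving entry and returns how many were written.
fn dump<W: Write>(entries: &[Entry], out: &mut W) -> io::Result<usize> {
    let mut written = 0;
    for flat in entries.iter().filter_map(from_entry) {
        writeln!(out, "{}", flat.name)?;
        written += 1;
    }
    Ok(written)
}
```

Using `filter_map` keeps the skip decision in one place (`from_entry`), so the writer never sees a `None`.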

@omerbenamram
Owner

sounds very reasonable, i'm glad you found something that works for you :)

@Anutrix
Author

Anutrix commented Feb 14, 2024

> sounds very reasonable, i'm glad you found something that works for you :)

Should I make an MR out of my changes so that this works behind a flag like --exclude-deleted, or is it not needed?
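
As a rough idea of how such a flag could gate the behavior, here is a standard-library-only sketch. The `--exclude-deleted` flag is only the proposal from this comment and does not exist in the tool; `Options`, `parse_options` and `should_emit` are hypothetical names, and the real CLI may well parse its arguments differently.

```rust
/// Hypothetical options struct; the flag is only a proposal in this thread.
#[derive(Debug, Default, PartialEq)]
struct Options {
    exclude_deleted: bool,
}

/// Scan the arguments for the proposed flag, ignoring everything else.
fn parse_options<I, S>(args: I) -> Options
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut opts = Options::default();
    for arg in args {
        if arg.as_ref() == "--exclude-deleted" {
            opts.exclude_deleted = true;
        }
    }
    opts
}

/// Decide whether an entry should be written: allocated entries always are,
/// deleted ones only when the flag is absent (the current default behavior).
fn should_emit(is_allocated: bool, opts: &Options) -> bool {
    is_allocated || !opts.exclude_deleted
}
```

In a binary this could be fed with `parse_options(std::env::args().skip(1))`. Keeping the default at "emit everything" means existing users would see no change unless they opt in.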
