[RFC]: achieve feature parity with builtin Node.js fs module #72

Closed
6 tasks done
Daniel777y opened this issue Mar 31, 2024 · 9 comments
Labels
2024 2024 GSoC proposal. rfc Project proposal.

Comments

@Daniel777y

Daniel777y commented Mar 31, 2024

Full name

Dexu (Daniel) Yu

University status

Yes

University name

Northeastern University

University program

Computer Science

Expected graduation

2025

Short biography

I am currently pursuing a Master's degree in Computer Science at the Oakland campus of Northeastern University, having previously earned a Bachelor's degree in Software Engineering. My technical skill set encompasses programming languages such as C/C++, JavaScript, and Python. Additionally, I possess a strong foundation in Docker, Linux, MySQL, and Firebase.

During my undergraduate and graduate studies, I took various courses in computer science, such as Data Structures, Operating System, Programming Design Paradigm, Web Development, and Software Test.

My passion for problem-solving has drawn me to competitive programming, where I have honed my abilities in algorithms and optimization. Moreover, I feel a great sense of achievement in developing personal applications and working on group projects, which allow me to bring my innovative ideas to life.

Timezone

US Pacific Time

Contact details

GitHub: Daniel777y, e-mail: [email protected], [email protected]

Platform

Mac

Editor

Vim and tmux are my favorite choices for coding. The best thing about Vim is that I can start coding on any computer with just a few settings. Once I got the hang of its shortcuts and how it works, I found I could code much more efficiently. When I'm working on bigger projects, I also use VSCode because it has a lot of plugins and helps me manage files better, making everything smoother.

Programming experience

My programming experience includes competitive programming as well as personal and group projects, covering a wide range of technologies, such as React.js, Vite.js, Vue.js, Node.js, Django, MySQL, C/C++, Bootstrap, and Tailwind CSS. Here are some of my recent projects:

  • FlashFingers: A web typing game | React.js, Vite, Node.js, firebase

    • Designed dynamic content generation for unique typing challenges, improving user engagement and learning.
    • Implemented real-time feedback with color-coded text and sound effects for immediate accuracy assessment.
    • Engineered performance metrics to calculate and display typing speed (WPM) and accuracy, facilitating user progress tracking.
    • Achieved a seamless user experience through performance optimization and an intuitive UI/UX design.
  • DevDeck: A tech stack selector of a project management tool | React.js, Vite, Node.js, firebase

    • Implemented a component for users to choose tech stack before they start a project.
    • Utilized firebase to manage data.
  • BeiBei Words: A program for memorizing English words | Python, Uni-app, Vue.js

    • Designed a mobile app aimed at aiding users in memorizing English words, offering features such as word lists, and adaptive word repetition, collection, and search based on user proficiency.
    • Employed Python crawlers to extract words, definitions, example sentences, and related data from Bing Dictionary's public database tailored for major English exams, storing the data in a structured JSON format.
    • Utilized the Uni-App framework to craft the core logic and Vue.js to design an intuitive front-end interface.
    • Constructed a serverless, lightweight back-end architecture through uniCloud, packaged word data and transformed it into an accessible API that users can query for word information.
  • WuMo Drawing: A mobile game for children to write and draw | JavaScript, Cocos Engine

    • Collaborated with a team of four to develop a game that educates users on writing characters by providing visual guides and assessing accuracy through strategically placed points on the screen.
    • Utilized JavaScript for logic design and Cocos Creator for an intuitive front-end interface.

JavaScript experience

I use JavaScript intensively to develop various full-stack applications, for course work as well as personal projects. Recently, I also started contributing to stdlib by implementing the math/base/tools/normhermitepolyf package for evaluating a normalized Hermite polynomial using single-precision floating-point arithmetic.

I really appreciate JavaScript's ease of learning, flexibility and widespread adoption. It allows me to organize code in traditional OOP style or functional programming. Moreover, it streamlines and simplifies the development process significantly. For instance, I can craft user interfaces with React.js or Vue.js, and develop server-side applications with Express.js.

However, one limitation of JavaScript is its performance in computation-intensive tasks. While it's a popular choice for web development, for tasks requiring heavy computation, such as data analysis or machine learning, developers might prefer Python or R. Imagine if we could do such tasks in a browser; that would be exciting!

Node.js experience

In my full-stack projects, I usually utilize Node.js and Express.js for backend development. This includes tasks like database connectivity, API implementation, and file management among others.

C/Fortran experience

I learned C in my freshman year of undergraduate, and I have applied it in multiple course projects, including developing a library management tool and file system in Linux. Beyond these applications, I have been using C/C++ in competitive programming contests for over five years, which has helped me build a strong foundation in this language.

Interest in stdlib

When I first delved into competitive programming, the majority of participants favored C/C++ and Java. However, in recent years, more and more people have started using Python, particularly with libraries like NumPy for computational tasks, while the use of Java has significantly dwindled and even the dominance of C/C++ has declined.

As I've mentioned, JavaScript's popularity in web development is undeniable, offering ease of implementation for ideas and product demonstrations. However, for data retrieval and analysis, developers still turn to alternative languages. Can JavaScript go further?

stdlib attracts me because it enhances JavaScript's flexibility and capability, for example by adding numerical and scientific computation functionality. This expansion not only broadens JavaScript's applicability but also shows the potential for intensive computing tasks to be executed directly in the browser. Especially with the rising importance of machine learning and data science, I believe there will be more and more innovative applications built with JavaScript, necessitating robust libraries like stdlib to support these advancements.

Version control

Yes

Contributions to stdlib

Pull Request

Issue

My first contribution is implementing the single precision equivalent for math/base/tools/normhermitepoly.

Though not aligned with the project I am proposing, this experience has given me a good understanding of the community's standards and the development process.

Also, the task of reimplementing single-precision functions shares similarities with the work involved in implementing the fs module, as both tasks are guided by a related overarching approach.

Goals

The primary goal of this project is to achieve complete feature parity with the Node.js fs module, thereby providing users with a full set of file system operations within stdlib.

Additionally, this project will enhance compatibility with older versions of Node.js. Therefore, developers, even if they use older Node.js versions, can access and utilize new file management features through stdlib.

Moreover, I will implement some Promise APIs for these functionalities, which will be beneficial for developers who prefer using Promises over callbacks.

The successful implementation of this project is expected to significantly enhance stdlib's flexibility and utility.

Functionality

Here are some of the functionalities I am planning to implement (asynchronous versions):

  • mkdir:
/**
* Asynchronously creates a directory.
*
* @param {(string|Buffer)} path - directory path
* @param {(Options|integer)} [options] - options
* @param {boolean} [options.recursive] - whether to create directories recursively
* @param {(string|integer)} [options.mode=0o777] - directory mode
* @param {Function} clbk - callback to invoke after creating a directory
* @returns {void}
*/
  • mkdtemp:
/**
* Asynchronously creates a temporary directory.
*
* @param {(string|Buffer)} path - directory path
* @param {Options} [options] - options
* @param {string} [options.encoding='utf8'] - encoding mode
* @param {Function} clbk - callback to invoke after creating a temporary directory
* @returns {void}
*/
  • rmdir:
/**
* Asynchronously deletes a directory.
*
* @param {(string|Buffer)} path - directory path
* @param {(Options|string)} [options] - options
* @param {integer} [options.maxRetries=0] - maximum number of retries
* @param {boolean} [options.recursive] - whether to delete directories recursively
* @param {integer} [options.retryDelay=100] - delay between retries
* @param {Function} clbk - callback to invoke after deleting a directory
* @returns {void}
*/
  • opendir:
/**
* Asynchronously opens a directory.
*
* @param {(string|Buffer)} path - directory path
* @param {Options} [options] - options
* @param {string} [options.encoding='utf8'] - encoding mode
* @param {integer} [options.bufferSize=32] - number of directory entries buffered internally when reading from the directory
* @param {boolean} [options.recursive=false] - whether to recursively open directories
* @param {Function} clbk - callback to invoke after opening a directory
* @returns {void}
*/
  • cp:
/**
* Asynchronously copies a directory.
*
* @param {(string|Buffer)} src - source directory path
* @param {(string|Buffer)} dest - destination directory path
* @param {Options} [options] - options
* @param {boolean} [options.dereference=false] - whether to dereference symbolic links
* @param {boolean} [options.errorOnExist=false] - whether to throw an error if the destination exists
* @param {Function} [options.filter=undefined] - filter function
* @param {boolean} [options.force=false] - whether to force copying
* @param {integer} [options.mode=0] - operation mode
* @param {boolean} [options.preserveTimestamps=false] - whether to preserve timestamps
* @param {boolean} [options.recursive] - whether to copy directories recursively
* @param {boolean} [options.verbatimSymlinks=false] - whether to skip resolving symbolic links
* @param {Function} clbk - callback to invoke after copying a directory
* @returns {void}
*/
  • copyFile:
/**
* Asynchronously copies a file.
*
* @param {(string|Buffer)} src - source file path
* @param {(string|Buffer)} dest - destination file path
* @param {integer} [mode=0] - operation mode
* @param {Function} clbk - callback to invoke after copying a file
* @returns {void}
*/
  • rm:
/**
* Asynchronously deletes files and directories.
*
* @param {(string|Buffer)} path - file or directory path
* @param {Options} [options] - options
* @param {boolean} [options.force=false] - whether to force deleting
* @param {integer} [options.maxRetries=0] - maximum number of retries
* @param {boolean} [options.recursive=false] - whether to delete directories recursively
* @param {integer} [options.retryDelay=100] - delay between retries
* @param {Function} clbk - callback to invoke after deleting files and directories
* @returns {void}
*/
  • move:
/**
* Asynchronously moves a file.
*
* @param {(string|Buffer)} src - source file path
* @param {(string|Buffer)} dest - destination file path
* @param {Options} [options] - options
* @param {Function} clbk - callback to invoke after moving a file
* @returns {void}
*/
  • ensureFile:
/**
* Asynchronously ensures that a file exists, creating the file if it does not.
*
* @param {(string|Buffer)} path - file path
* @param {Options} [options] - options
* @param {Function} clbk - callback to invoke after ensuring a file
* @returns {void}
*/
  • ensureDir:
/**
* Asynchronously ensures that a directory exists, creating the directory if it does not.
*
* @param {(string|Buffer)} path - directory path
* @param {Options} [options] - options
* @param {Function} clbk - callback to invoke after ensuring a directory
* @returns {void}
*/
  • emptyDir:
/**
* Asynchronously ensures that a directory exists and is empty. If the directory does not exist, it is created; if it is not empty, all files and directories within it are deleted.
*
* @param {(string|Buffer)} path - directory path
* @param {Options} [options] - options
* @param {Function} clbk - callback to invoke after ensuring an empty directory
* @returns {void}
*/
  • access:
/**
* Asynchronously tests a user's permissions for a file or directory.
*
* @param {(string|Buffer)} path - file or directory path
* @param {integer} [mode=fs.constants.F_OK] - operation mode
* @param {Function} clbk - callback to invoke after testing a user's permissions
* @returns {void}
*/
  • stat:
/**
* Asynchronously returns information about a file or directory.
*
* @param {(string|Buffer)} path - file or directory path
* @param {Options} [options] - options
* @param {boolean} [options.bigint=false] - whether to return numeric values as bigint
* @param {Function} clbk - callback to invoke after returning information about a file or directory
* @returns {void}
*/
  • utimes:
/**
* Asynchronously changes the timestamp of a file.
*
* @param {(string|Buffer)} path - file path
* @param {number|string|Date} atime - access time
* @param {number|string|Date} mtime - modification time
* @param {Function} clbk - callback to invoke after changing the timestamp of a file
* @returns {void}
*/
  • chmod:
/**
* Asynchronously changes the permissions of a file.
*
* @param {(string|Buffer)} path - file path
* @param {integer} mode - file mode
* @param {Function} clbk - callback to invoke after changing the permissions of a file
* @returns {void}
*/
  • chown:
/**
* Asynchronously changes the owner of a file.
*
* @param {(string|Buffer)} path - file path
* @param {integer} uid - user id
* @param {integer} gid - group id
* @param {Function} clbk - callback to invoke after changing the owner of a file
* @returns {void}
*/
  • link: creates a link between two paths.
/**
* Asynchronously creates a link between two paths.
*
* @param {(string|Buffer)} existingPath - existing path
* @param {(string|Buffer)} newPath - new path
* @param {Function} clbk - callback to invoke after creating a link between two paths
* @returns {void}
*/
  • readlink:
/**
* Asynchronously reads the contents of the symbolic link.
*
* @param {(string|Buffer)} path - path
* @param {Options} [options] - options
* @param {string} [options.encoding='utf8'] - encoding mode
* @param {Function} clbk - callback to invoke after reading the contents of the symbolic link
* @returns {void}
*/
  • read:
/**
* Asynchronously reads data from a file.
*
* @param {integer} fd - file descriptor
* @param {Buffer} buffer - buffer to which the data will be written
* @param {integer} offset - buffer offset
* @param {integer} length - number of bytes to read
* @param {integer} position - file position
* @param {Function} clbk - callback to invoke after reading data from a file
* @returns {void}
*/
  • write:
/**
* Asynchronously writes buffer to a file.
*
* @param {integer} fd - file descriptor
* @param {Buffer} buffer - buffer containing the data to write
* @param {integer} [offset=0] - buffer offset
* @param {integer} [length=buffer.byteLength] - number of bytes to write
* @param {(integer|null)} [position=null] - file position
* @param {Function} clbk - callback to invoke after writing buffer to a file
* @returns {void}
*/
  • appendFile:
/**
* Asynchronously appends data to a file, if it does not yet exist, creates a new file.
*
* @param {(string|Buffer)} path - path
* @param {(string|Buffer)} data - data to append
* @param {Options} [options] - options
* @param {string} [options.encoding='utf8'] - encoding mode
* @param {integer} [options.mode=0o666] - operation mode
* @param {string} [options.flag='a'] - flag
* @param {boolean} [options.flush=false] - whether to flush before closing
* @param {Function} clbk - callback to invoke after appending data to a file
* @returns {void}
*/
  • truncate:
/**
* Asynchronously truncates a file.
*
* @param {(string|Buffer)} path - path
* @param {integer} len - file length after truncation
* @param {Function} clbk - callback to invoke after truncating a file
* @returns {void}
*/
  • createReadStream:
/**
* Asynchronously creates a readable stream from a file.
*
* @param {(string|Buffer)} path - path
* @param {Options} [options] - options
* @param {string} [options.flags='r'] - flags
* @param {string} [options.encoding=null] - encoding mode
* @param {integer} [options.fd=null] - file descriptor
* @param {integer} [options.mode=0o666] - operation mode
* @param {boolean} [options.autoClose=true] - whether to close the file after reading
* @param {boolean} [options.emitClose=true] - whether to emit close event
* @param {integer} [options.start=0] - start position
* @param {integer} [options.end=Infinity] - end position
* @param {integer} [options.highWaterMark=64*1024] - high water mark
* @param {Object} [options.fs=null] - file system
* @returns {fs.ReadStream} readable stream
*/
  • createWriteStream:
/**
* Asynchronously creates a writable stream to a file.
*
* @param {(string|Buffer)} path - path
* @param {Options} [options] - options
* @param {string} [options.flags='w'] - flags
* @param {string} [options.encoding=null] - encoding mode
* @param {integer} [options.fd=null] - file descriptor
* @param {integer} [options.mode=0o666] - operation mode
* @param {boolean} [options.autoClose=true] - whether to close the file after writing
* @param {boolean} [options.emitClose=true] - whether to emit close event
* @param {integer} [options.start=0] - start position
* @param {integer} [options.highWaterMark=16384] - high water mark
* @param {Object} [options.fs=null] - file system
* @param {boolean} [options.flush=true] - whether to flush before closing
* @returns {fs.WriteStream} writable stream
*/
  • watch:
/**
* Asynchronously watches for changes in a file or directory.
*
* @param {(string|Buffer)} path - path
* @param {Options} [options] - options
* @param {boolean} [options.persistent=true] - whether to keep the process running as long as files are being watched
* @param {boolean} [options.recursive=false] - whether to watch all files and directories in the given directory
* @param {string} [options.encoding='utf8'] - encoding mode
* @param {Function} [listener] - listener callback function
* @returns {fs.FSWatcher} file system watcher
*/
  • watchFile:
/**
* Asynchronously watches for changes in a file.
*
* @param {(string|Buffer)} path - path
* @param {Options} [options] - options
* @param {boolean} [options.bigint=false] - whether to use bigint
* @param {boolean} [options.persistent=true] - whether to keep the process running as long as files are being watched
* @param {integer} [options.interval=5007] - polling interval (in milliseconds)
* @param {Function} listener - listener callback function
* @returns {fs.StatWatcher} file system watcher
*/
  • unwatchFile:
/**
* Asynchronously stops watching for changes in a file.
*
* @param {(string|Buffer)} path - path
* @param {Function} listener - listener previously attached
*/
  • constants: returns an object containing commonly used constants for file system operations.

Besides asynchronous functions, I will also implement synchronous versions of them.

Other than these functionalities, I will also implement utility functions to polyfill the older versions of Node.js.
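
To make one of the less obvious entries above concrete, here is a minimal sketch of what a synchronous ensureFile could look like. The name ensureFileSync, the choice to return an error instead of throwing, and the error message wording are assumptions made for illustration; the sketch also assumes the parent directory already exists.

```javascript
var fs = require( 'fs' );

/**
* Ensures that a file exists, creating an empty file if it does not (hypothetical sketch).
*
* @param {(string|Buffer)} file - file path
* @returns {(Error|null)} error object or null
*/
function ensureFileSync( file ) {
    var stats;
    var fd;
    try {
        stats = fs.statSync( file );
    } catch ( err ) {
        // A missing path is expected; any other failure is reported:
        if ( err.code !== 'ENOENT' ) {
            return err;
        }
    }
    if ( stats ) {
        if ( stats.isFile() ) {
            return null;
        }
        return new Error( 'invalid argument. Path exists but is not a file. Path: `' + file + '`.' );
    }
    try {
        // Opening in append mode creates the file without truncating existing content:
        fd = fs.openSync( file, 'a' );
        fs.closeSync( fd );
    } catch ( err ) {
        return err;
    }
    return null;
}
```

Opening with the 'a' flag keeps the operation safe if another process creates the file between the stat call and the open call.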

Compatibility

Currently, stdlib is compatible with Node.js v0.10 and above. Therefore, to maintain compatibility with older versions of Node.js, I need to provide polyfills for some functionalities. To do this, I can borrow ideas from readable-stream. Here's a general example:

var fs = require( 'fs' );

function cp( src, dest, options, callback ) {
    // Support the signature in which `options` is omitted:
    if ( arguments.length < 4 ) {
        callback = options;
        options = {};
    }
    // Check whether the current Node.js version provides a native `fs.cp`:
    if ( typeof fs.cp === 'function' ) {
        fs.cp( src, dest, options, callback );
    } else {
        // Polyfill
        // ...
    }
}

module.exports = cp;

In this case, I might need to manually implement some helper functions and handle the various options.
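
As an illustration of such a helper, below is a rough sketch of a recursive directory creation routine that does not depend on the `recursive` option of fs.mkdir, which older Node.js versions ignore. The names mkdirpSync, tryMkdir, and isDirectory are hypothetical, and returning the error instead of throwing is an assumption about the eventual API.

```javascript
var fs = require( 'fs' );
var path = require( 'path' );

// Attempts to create a single directory, returning the error (if any):
function tryMkdir( dir ) {
    try {
        fs.mkdirSync( dir );
    } catch ( err ) {
        return err;
    }
    return null;
}

// Tests whether a path exists and is a directory:
function isDirectory( dir ) {
    try {
        return fs.statSync( dir ).isDirectory();
    } catch ( err ) {
        return false;
    }
}

/**
* Creates a directory and any missing parent directories (hypothetical sketch).
*
* @param {string} dir - directory path
* @returns {(Error|null)} error object or null
*/
function mkdirpSync( dir ) {
    var parent;
    var err;

    err = tryMkdir( dir );
    if ( err === null ) {
        return null;
    }
    if ( err.code === 'ENOENT' ) {
        // A parent is missing; create the parent chain first and then retry once:
        parent = path.dirname( dir );
        if ( parent === dir ) {
            return err;
        }
        err = mkdirpSync( parent );
        if ( err ) {
            return err;
        }
        err = tryMkdir( dir );
        if ( err === null ) {
            return null;
        }
    }
    // Any remaining error is acceptable only if the directory already exists:
    return ( isDirectory( dir ) ) ? null : err;
}
```

A polyfill for cp or ensureDir could build on a helper of this kind, with the remaining options handled separately.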

Promise

stdlib also plans to provide Promise APIs in the long run. Therefore, I will also try to implement Promise versions of these functionalities. However, I will first focus on the callback and synchronous versions and then move on to @stdlib/fs/promise/* later. I will also need to provide a polyfill for older Node.js versions that do not support native Promises. To do this, I can borrow ideas from promise-polyfill and implement @stdlib/promise.
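
One possible way to derive the Promise versions from the callback versions is a small wrapper such as the sketch below. The function name promisify and its optional second argument (a Promise constructor, which is where a polyfill could be injected) are assumptions for illustration and not an existing stdlib API.

```javascript
var fs = require( 'fs' );

/**
* Wraps a callback-style function so that it returns a promise (hypothetical sketch).
*
* @param {Function} fcn - function whose last argument is an error-first callback
* @param {Function} [PromiseCtor] - Promise constructor (e.g., a polyfill)
* @throws {Error} must have access to a Promise implementation
* @returns {Function} wrapped function
*/
function promisify( fcn, PromiseCtor ) {
    var Ctor = PromiseCtor;
    if ( Ctor === void 0 && typeof Promise === 'function' ) {
        // Fall back to the native implementation when one is available:
        Ctor = Promise;
    }
    if ( typeof Ctor !== 'function' ) {
        throw new Error( 'invalid invocation. A Promise implementation is required.' );
    }
    return wrapped;

    function wrapped() {
        var args = Array.prototype.slice.call( arguments );
        var self = this;
        return new Ctor( executor );

        function executor( resolve, reject ) {
            // Append an error-first callback which settles the promise:
            args.push( done );
            fcn.apply( self, args );

            function done( err, result ) {
                if ( err ) {
                    return reject( err );
                }
                resolve( result );
            }
        }
    }
}

var stat = promisify( fs.stat );
```

Calling stat( path ) then returns a promise which resolves with the Stats object or rejects with the underlying error.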

Performance

To ensure correctness and performance, every implementation will be tested using the tape framework and benchmarked, with results reported in TAP format.

All the functionalities will provide concise error messages and handle potential edge cases, such as invalid paths and permission errors.
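
As a small sketch of what concise error handling for an invalid path might look like, consider the validation helper below. The function name validatePath and the exact message wording are assumptions for illustration; the actual messages would follow whatever conventions the mentors prefer.

```javascript
/**
* Validates a path argument (hypothetical sketch).
*
* @param {*} value - value to validate
* @returns {(TypeError|null)} error object or null
*/
function validatePath( value ) {
    if ( typeof value === 'string' ) {
        // Paths containing null bytes are rejected by the native fs module as well:
        if ( value.indexOf( '\u0000' ) !== -1 ) {
            return new TypeError( 'invalid argument. Path must not contain null bytes.' );
        }
        return null;
    }
    if ( Buffer.isBuffer( value ) ) {
        return null;
    }
    return new TypeError( 'invalid argument. Path must be a string or Buffer. Received type: `' + ( typeof value ) + '`.' );
}
```

Returning the error object lets both the callback and the synchronous variants decide how to surface it.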

Documentation and Examples

I will adhere to the development guidelines, offering comprehensive examples and documentation for each function to help users understand their usage and support developers in code maintenance.

Why this project?

File management is a core operation for developers, and stdlib focuses on numerical and scientific computing, making file system access crucial for handling data files. By contributing to this project, I will be enhancing the capabilities for high-performance applications that run in browsers with stdlib, which I find immensely exciting.

Additionally, this experience will deepen my understanding of JavaScript and Node.js. While I have previously worked with file systems within various frameworks, engaging with this project will provide me with a deeper understanding of the mechanics of the file system and other functionalities.

As a newcomer to open source, this project presents a valuable opportunity to contribute meaningfully to the real world and to learn best practices in software development. Working on math/base/tools/normhermitepolyf was an eye-opening experience for me. I was particularly struck by the detailed coding standards, the structured development cycles, and the thorough testing norms. It's exciting to anticipate a long-term involvement with this community.

Finally, being part of stdlib's expanding community of both contributors and users is motivational and an honor. I am eager to make a significant impact on the development of excellent applications and to collaborate with dedicated mentors and fellow contributors.

Qualifications

  • Programming Skill:
    With over five years of experience in competitive programming primarily in C/C++, coupled with various projects in JavaScript, Python, Node.js, I have strong knowledge of these technologies and understand the requirements and scope of this project. My experiences have equipped me with a robust grasp of algorithms and data structures, as well as the capability to implement complex programs efficiently.

  • Problem-Solving Skills and Self-Driven Attitude:
    I enjoy tackling difficult challenges and remain motivated even when faced with obstacles. I will leverage online resources, learn to use advanced tools, and consult experts when needed to find solutions.

  • Collaboration and Knowledge Sharing:
    I used to write blogs and offer free lessons on algorithms, sharing my knowledge with others. Additionally, my active participation in team projects reflects my cooperative nature and openness to collaboration.

  • Prior Contributions to stdlib:
    My prior contribution to stdlib, specifically the addition of the math/base/tools/normhermitepolyf package, has made me well-acquainted with the community's standards. This experience has paved the way for a smooth transition into working on this project.

Prior art

In this project, I will implement the features of the Node.js fs module, so I'll mainly use the Node.js source code as a guide.

Here are some additional packages I might refer to:

  • graceful-fs and fs-extra: provide replacements for the native fs module.
  • readable-stream: gives an example on polyfilling.
  • fs-minipass: provides implementations of streams.
  • chokidar: to watch file changes.
  • promise-polyfill: to polyfill promises.

Additionally, I might borrow approaches from file systems in other languages, like Python's PyFilesystem.

Commitment

My semester ends on April 29th, so I’ll be completely free to work on this project from May to August. Since I don't have any other commitments, I can put in more than 30 hours each week during the summer. After the GSoC program ends, I plan to keep contributing to the community for about 10 hours a week.

Schedule

Assuming a 12-week schedule with 4 extra weeks.

  • Community Bonding Period:
    In this phase, I will deepen my familiarity with stdlib, including its norms and source code, while starting to work on related issues. In addition, I will discuss with mentors and consult other resources to improve my project plan and determine the scope of work. For example, I will try to find out which functionalities need polyfills and the best approach to implementing them.

  • Week 1 to Week 3:
    I will mainly work on basic file and directory management functionalities and open PRs for mkdir, mkdtemp, rmdir, ensureFile, ensureDir, emptyDir, and cp.

  • Week 4:
    This week, I'll deal with any backlogs and seek feedback from mentors to evaluate the quality of my work. Additionally, I may revise the work plan for the upcoming phase based on feedback and my progress.

  • Week 5 to Week 7:
    I will start developing file manipulation functionalities and open PRs for copyFile, rm, move, read, write, appendFile, and truncate. Week 6 marks the midterm evaluation period. By this time, I aim to have completed half of the planned functionalities. This will also be an opportune moment to collect feedback from my mentors.

  • Week 8 to Week 9:
    During this period, I will implement and open PRs for metadata and permission functionalities, such as constants, access, stat, utimes, chmod, and chown. I will also handle any backlog.

  • Week 10 to Week 11:
    By this time, my attention will turn to link and stream functionalities, such as link, readlink, createReadStream, createWriteStream. Moreover, I will discuss with mentors to ensure the quality of my work and plan for the final phase.

  • Week 12 to Week 13:
    I will focus on file watching functionalities, such as watch, watchFile, and unwatchFile, and start working on @stdlib/promise.

  • Week 14 to Week 16:
    In this final phase, I will try to implement Promise versions of fs functions. Lastly, I will test and review all of my work, and handle any final tasks.

  • Post GSoC:
    After completing this project, I plan to continue contributing to the stdlib community, both by continuing to implement the Promise APIs and by supporting other packages.

Notes:

  • The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
  • Usually, even week 1 deliverables include some code.
  • By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
  • By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
  • During the final week, you'll be submitting your project.

Potential risks

The potential risks and obstacles concern the scope of work and the timeline. As mentioned previously, I will need to implement some utility functions to ensure backward compatibility, handle functionalities with many options, and try to provide Promise APIs, so I need to balance priorities against difficulty of implementation. On the bright side, there are many references and examples in other packages, and I am also free to extend the timeline to 16 weeks to ensure I have enough time to learn and implement the features.

Related issues

#10 [Idea]: achieve feature parity with builtin Node.js fs module

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@Daniel777y Daniel777y added 2024 2024 GSoC proposal. rfc Project proposal. labels Mar 31, 2024
@Daniel777y
Author

Hello, @kgryte @Planeshifter @Pranavchiku.

This draft proposal comes a bit late but I'm eager for the chance to contribute to this community and learn from its exceptional members. Any feedback or suggestions you might have would be greatly appreciated. Thank you.

@Planeshifter
Member

Thanks @Daniel777y for your proposal and desire to contribute to stdlib!

This is definitely an area where it would be good to make progress. A few comments and suggestions for strengthening the proposal:

  • What potential risks or obstacles do you envision and how would you go about managing them?
  • There's no explicit mention of performance goals or benchmarks; this may be good to touch upon
  • What does "complete feature parity" mean in the context of stdlib's existing capabilities? For example, long-term it would also be good to have promise-based equivalents like in Node.js fs/promise, although we currently lack Promise-APIs
  • Have you looked beyond the native fs APIs and studied packages such as fs-extra; any functions from there or other missing ones that should be included in stdlib's fs namespace?

@kgryte
Member

kgryte commented Apr 1, 2024

Thanks for working on this proposal. One follow-up question I have is

  • Node's filesystem APIs have continued to evolve over the years, adding support for new options and functionality. E.g., fs.cp was added in v16.7.0. One of the core principles of stdlib is ensuring backward compatibility. Historically, we've supported all the way back to Node.js v0.10. While we could potentially relax this requirement, how do you plan to support stdlib APIs which can accommodate and smooth over the differences across Node.js versions?

Ideally, any fs API we provide should work across all versions of Node.js that we support. And thus, for Node.js versions with missing functionality, we'd need to provide polyfills. And this could potentially be quite involved, and, if so, could affect your project timeline.

@Daniel777y
Author

Daniel777y commented Apr 1, 2024

@kgryte Yes, thanks for your response. As you mentioned, I do need to consider backward compatibility.

I walked through the implementation of readable-stream and tried to understand how it works. For example, the isReadable function. What it does is like:

var isReadable = require('stream').isReadable || require('readable-stream').isReadable;

That is, if the native stream module has an isReadable function, or readable-stream is disabled for some reason, we use the one from stream; otherwise, we use the one from readable-stream. So I might have to implement some functionality manually, such as cp, instead of relying on the native fs.
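As a minimal sketch of how the same "prefer native, otherwise fall back" pattern might look for an fs function, consider fs.copyFileSync, which was added in Node.js v8.5.0. The name copyFileSyncPolyfill is hypothetical and is not an existing stdlib API; the fallback reads the whole file into memory, so it is only an illustration and not a production-ready polyfill.

```javascript
var fs = require( 'fs' );
var os = require( 'os' );
var path = require( 'path' );

// Hypothetical fallback for Node.js versions lacking `fs.copyFileSync`.
// Assumption: files are small enough to be read into memory in one go.
function copyFileSyncPolyfill( src, dest ) {
    fs.writeFileSync( dest, fs.readFileSync( src ) );
}

// Prefer the native implementation when present; otherwise, use the fallback:
var copyFileSync = ( typeof fs.copyFileSync === 'function' ) ?
    fs.copyFileSync :
    copyFileSyncPolyfill;

// Usage: copy a small temporary file and read back the copy.
var src = path.join( os.tmpdir(), 'copy-sketch-' + process.pid + '-boop.txt' );
var dest = path.join( os.tmpdir(), 'copy-sketch-' + process.pid + '-foo.txt' );

fs.writeFileSync( src, 'beep' );
copyFileSync( src, dest );

console.log( fs.readFileSync( dest, 'utf8' ) );
// => 'beep'
```

A full cp polyfill would be considerably more involved than this, since it has to handle directories, symbolic links, and the various options.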

In the isReadable case, its implementation is:

function isReadable(stream) {
  if (stream && stream[kIsReadable] != null) return stream[kIsReadable]
  if (typeof (stream === null || stream === undefined ? undefined : stream.readable) !== 'boolean') return null
  if (isDestroyed(stream)) return false
  return isReadableNodeStream(stream) && stream.readable && !isReadableFinished(stream)
}

isDestroyed, isReadableNodeStream, and isReadableFinished are manually implemented utility functions as well.

Do you think this idea is enough to resolve the backward compatibility issue in stdlib?

As for the complexity, I think I can extend my timeline to 16 weeks to ensure I have enough time to learn and implement the features.

I had guessed that graceful-fs also provides polyfills across different Node.js versions, but unfortunately it does not seem to.

@kgryte
Member

kgryte commented Apr 1, 2024

Yes, potentially. readable-stream is a similar idea, but arguably more complex than is necessary. I would anticipate needing to manually implement in a number of cases. For those APIs having many options, could be a bit of a slog to polyfill and ensure adequate testing.

@Daniel777y
Author

Daniel777y commented Apr 1, 2024

@Planeshifter Thanks for your feedback!

Yes, I also walked through graceful-fs and fs-extra to see how they implement fs functionality. Some other modules, like fs-minipass and chokidar, also provide references for specific fs features. I will discuss priorities in detail with mentors and decide which extra features to implement and in what order.

As for Promise APIs, it would indeed be good to provide a modern Promise style in stdlib, although I noticed that stdlib currently provides sync/async APIs. If Promise APIs are needed, then for functions whose native counterparts support Promises, I think they can be implemented like the sync/async APIs. For those that do not support Promises, I can implement the async version first and then "universalify" it to Promise style, like what fs-extra and universalify do.

As an example, for the rename function, I can do something like:

var rename = require( '@stdlib/fs/rename' );

function universalify( fn ) {
    return Object.defineProperty( function ( ...args ) {
        if ( typeof args[ args.length - 1 ] === 'function' ) {
            // A callback was provided, so call through unchanged:
            fn.apply( this, args );
        } else {
            // No callback, so append one which settles a Promise:
            return new Promise( ( resolve, reject ) => {
                args.push( ( err, res ) => ( err != null ) ? reject( err ) : resolve( res ) );
                fn.apply( this, args );
            } );
        }
    }, 'name', { value: fn.name } );
}

var universalRename = universalify( rename );

That is, when users rename a file asynchronously, if they pass a callback function, it will use the callback; otherwise, it will return a Promise:

universalRename( './beep/boop.txt', './beep/foo.txt', done );

// or

universalRename( './beep/boop.txt', './beep/foo.txt' ).then( done );

This is the general idea. Do you think it is good enough for Promise APIs? I can universalify existing stdlib APIs to Promise style in the coming days to give it a try. If the Promise APIs turn out to be time-consuming, maybe we can divide them into sub-projects, and I would love to continue working on them after GSoC.

The potential risks and obstacles concern the scope of work and the timeline. As mentioned previously, I will need to implement some utility functions to ensure backward compatibility, handle functions with many options, and try to provide Promise APIs, so I need to balance priorities against difficulty of implementation. On the bright side, there are many references and examples in other packages, and I am also free to extend the timeline to 16 weeks to ensure I have enough time to learn and implement the features.

As for correctness and performance, every implementation will be tested with the Tape framework and benchmarked with results reported in TAP format. All functions will provide concise error messages and will be polyfilled for older Node.js versions. Another concern is that one emphasis of stdlib is scientific computing, so I suppose it will need to process large or multiple files. Do you think it is necessary to handle such edge cases to avoid potential crashes?

@kgryte
Member

kgryte commented Apr 2, 2024

I'd advocate for providing dedicated promise APIs and not the pattern of "if no callback, return a promise". That would fundamentally change error handling for legitimate use cases where a user intentionally does not provide a callback.

@kgryte
Member

kgryte commented Apr 2, 2024

In general, I'd focus first on callback APIs. Then move to promise APIs (e.g. @stdlib/fs/promise/*). For the promise APIs, the main prerequisite is that we need to create @stdlib/promise/ctor with a polyfill fallback for older environments not having native Promise support.
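A minimal sketch of what a dedicated promise API could look like, kept separate from the callback API so that omitting a callback never changes behavior. Everything named here is hypothetical: renamePromise is not an existing stdlib function, and PromiseCtor is only a stand-in for the proposed @stdlib/promise/ctor, which would fall back to a polyfill instead of null.

```javascript
var fs = require( 'fs' );
var os = require( 'os' );
var path = require( 'path' );

// Assumption: stand-in for the proposed `@stdlib/promise/ctor` package.
// A real package would resolve a polyfill here rather than `null`.
var PromiseCtor = ( typeof Promise === 'function' ) ? Promise : null;

// Hypothetical dedicated promise API: it never accepts a callback.
function renamePromise( oldPath, newPath ) {
    if ( PromiseCtor === null ) {
        throw new Error( 'unsupported environment. Promise support is required.' );
    }
    return new PromiseCtor( function executor( resolve, reject ) {
        fs.rename( oldPath, newPath, function onRename( err ) {
            if ( err ) {
                return reject( err );
            }
            resolve();
        });
    });
}

// Usage: renaming a file which does not exist rejects the returned promise.
var missing = path.join( os.tmpdir(), 'promise-sketch-' + process.pid + '-missing.txt' );

renamePromise( missing, missing + '.bak' ).then( onDone, onError );

function onDone() {
    console.log( 'renamed' );
}

function onError( err ) {
    console.log( 'rename failed: ' + err.code );
}
```

With this split, the callback API keeps its existing error-handling semantics, and users opt in to promises explicitly by choosing the promise variant.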

@Daniel777y
Author

Daniel777y commented Apr 2, 2024

@kgryte Thank you very much for your suggestions! This way the stdlib code will be easier to reuse and maintain. I suppose @stdlib/promise/* may take up part of the timeline, but I'd love to give it a try. I found promise-polyfill as a reference.

@kgryte kgryte closed this as completed Apr 30, 2024