Infrastructure tracking: compilation, source mapping, and TypeScript #1338
@tomka I was looking at the webpack feature branch again, curious how much work it would be to move dependencies out-of-tree and support TypeScript, etc. It now seems fairly straightforward to avoid name mangling while still separating libraries into separate bundles, and we're most of the way there if we wanted to incrementally migrate libraries or code out of pipeline.
|
For testing and benchmarking alternative implementations of Arbor.js and friends, I wrote https://github.com/clbarnes/arbor-harness / https://www.npmjs.com/package/arbor-harness, which fetches and mangles the original implementation from raw.github . If those bits of catmaid-lib (arbor, arbor-parser, synapse-clustering) were on npm, that would get easier. Hardly isolated, though! Would we run into issues surrounding the order in which everything is imported, IIFEs executed, and stuff added to the |
Making components available elsewhere as their own libraries would be a goal (besides better tooling and organization within CATMAID). First we'd separate them into modules in-tree to sort things out, then later move them out.
It's a consideration, but my plan for incremental conversion would be to start from the "top" of the import order, and have a pseudo webpack entry point before the real legacy CATMAID scripts that would just import whatever is ES6 moduled/webpackified. It's similar to how we've done several other modernization/big cleanup sweeps in the codebase. If everything were converted at once, it wouldn't be a problem. But I think that would be too high of an activation energy. |
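As a rough illustration of the "pseudo entry point" idea, a webpack configuration along the following lines might be a starting point. This is only a sketch, not the feature branch's actual config: the entry path, the bundle name and the `CATMAIDModules` global are hypothetical, and `ts-loader` is assumed as the TypeScript loader.

```javascript
// Hypothetical sketch of a webpack config for incremental migration.
// In a real setup this object would be exported from webpack.config.js.
const webpackConfig = {
  mode: "development",
  // A single "modern" entry point, loaded before the legacy CATMAID scripts.
  entry: {
    modern: "./frontend/modern-entry.ts", // hypothetical path
  },
  output: {
    filename: "[name].bundle.js",
    // Expose the entry's exports as a global so legacy scripts can reach them.
    library: "CATMAIDModules", // hypothetical global name
    libraryTarget: "var",
  },
  // Emit source maps so debugging points at the original sources.
  devtool: "source-map",
  resolve: { extensions: [".ts", ".js"] },
  module: {
    // Assumes ts-loader is installed; any TypeScript loader would do.
    rules: [{ test: /\.ts$/, use: "ts-loader", exclude: /node_modules/ }],
  },
  // Disable minification so identifiers are not mangled.
  optimization: { minimize: false },
};
```

The resulting `modern.bundle.js` would then be listed ahead of the legacy scripts, so anything it attaches to the global scope exists by the time the old code runs.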
Yes, I would merge such a change. The
During regular development, I suppose, that would only be needed if NPM dependencies have changed (or initial setup), i.e. I don't need to call it normally?
This would also only need to be called if the NPM dependencies or our own TypeScript and webpack bundles change, wouldn't it? If so, unless work is done in TypeScript or bundles, there is no need to call this a lot either. Both would of course be part of a regular production server update.
👍
I believe so too and think we should try to do this as an incremental update. |
I was doing something yesterday that required me to edit on one computer, then reload and debug very slowly on another; waiting a minute to refresh a page and set up some widgets just to find out I'd typoed a property name was very motivating to want to use TS.
Correct, basically
Also correct, although there is also a server/watch mode that will watch for changes and automatically do incremental updates to the bundles, which would be useful if frequently edited code is bundled. The "neat" deep end of this (not that we'd use it) is being able to do things like React component hot reloading, where you edit React components and they live-update in the browser because the state of the component is separate from the UI/controls that are getting reloaded. |
I think it would be nice to move towards using the webpack dev server; by constraining usage of TS / modern JS to a small set of new files to avoid having to manually recompile all the time, we're obviously losing out on a lot of the incentive to switch. Related, I've been fumbling with catmaid-lib in this context today and have got it to a stage where webpack can bundle it and it doesn't immediately error when loaded in a browser. I haven't really touched any of the IIFEs, so the modules don't really add any value, but it's there as an experiment. But there's still bits of catmaid-lib which depend on globals or |
Would it be easier to factor out smaller libraries (like arbor) first and bundle those, or do you think it's best to bundle all of CATMAID-lib at once, then factor it into focused libraries later?
Agreed, it makes sense for CATMAID-lib to define the |
To add, a concern with bundling all of CATMAID-lib at once and moving it to an NPM package is that some parts of CATMAID-lib still have high churn. The update process is a bit more cumbersome for those if it requires editing and publishing in a separate repo, publishing to NPM, and updating in CATMAID. That would make sense for stable parts like arbor. However, there may be nothing stopping us from having multiple packages in-tree here. So CATMAID-lib can stay in this repo, with local dependencies from the main CATMAID front-end package to the CATMAID-lib packages, and each of the individual CATMAID-lib packages gets published separately to NPM as part of the release process. I like that option if we can get it set up, as there's no new impedance to our existing workflow but all the benefits of separate, published libraries. |
Yes, probably. My current factoring doesn't really benefit from the modularisation because it's still dependent on the import order, and most of the imports are only being done for the side-effect of adding stuff to the global The exports are:
Those in bold are good targets for excising first.
Agreed; this is an opportunity to take a more systematic approach and separate them properly. The other such point I found is that Minor point, here for documentation: d3 will need to remain an in-tree dependency because node 7+ can't build d3 <4. There may be similar issues with other dependencies; I haven't looked too closely at their version constraints yet.
https://github.com/lerna/lerna seems to be popular for managing interdependent JS projects in a monorepo, and handles symlinking to each other as local dependencies, deduplicating shared external dependencies, synchronising version updates and publishing changes. I've seen a fair bit of advice recommending that frontend and backend are treated essentially as separate projects, which took me a while to get my head around given our current repo structure. If we're considering making "real" JS libraries and building the frontend, it might be worth separating the concerns a bit by giving them (closer to) top-level directories rather than keeping them buried in django/applications/catmaid/static/thekitchensink/etc ; especially if we were to move towards not having them managed by django at all. The compiled bundles could always go into those django/etc paths: that could make our lives a bit simpler when it comes to not serving source files. So our repo could look more like
For reference: https://github.com/dan-kez/lerna-webpack-example |
Removing the IIFE could (/should) mean unindenting most of the file, thus trashing the git blame anyway, thus removing the block against using an autoformatter. |
While I'm a fan of autoformatting, because of the peculiarities of JS, especially the heavy use of anon functions, I'm initially skeptical that one that wouldn't hurt readability exists. Happy to be persuaded, though. OTOH we should autoformat and static analyze the hell out of the backend. I aspirationally put flake8 in Travis years ago and for awhile played the minigame of reducing warnings. |
Also with any formatting tools, they have to be compatible with all of our dev environments, which can be a tall order:
Even getting jshint working for everyone wasn't painless. |
Relatedly, if we ES6ify a lot of classes at the same time we remove the IIFE, not that much indentation should change. |
Yeah, I have noticed prettier can be pretty funky about putting way too many lines around short expressions. I don't envision the environment being a big issue; if we had an npm script to do it, called in a pre-commit hook (and then it was checked in CI to make sure people were using the hook) the editor wouldn't need to be involved. It's a good point about saving ourselves the blame churn by ES6ing stuff at the same time, though. |
The only decent JS autoformatter I've been able to find is Prettier. A big personal nit is that when it line breaks single statements, it uses one tab-width instead of two, and there's no way to configure this:
    myObject
      .foo();
But I could probably bring myself to live with that. More annoying is that it changes code that looks like this:
    if (stackViewer.isLayerRemovable(key)) {
      container.append($('<div class="layerClose">')
          .append($('<input type="button" value="x" class="remove"/>')
              .click(makeRemoveHandler(stackViewer, key))));
    }
Into this:
    if (stackViewer.isLayerRemovable(key)) {
      container.append(
        $('<div class="layerClose">').append(
          $('<input type="button" value="x" class="remove"/>').click(
            makeRemoveHandler(stackViewer, key),
          ),
        ),
      );
    }
Config:
    {
      "arrowParens": "always",
      "bracketSpacing": true,
      "htmlWhitespaceSensitivity": "css",
      "insertPragma": false,
      "jsxBracketSameLine": false,
      "jsxSingleQuote": false,
      "parser": "flow",
      "proseWrap": "preserve",
      "requirePragma": false,
      "semi": true,
      "singleQuote": true,
      "tabWidth": 2,
      "trailingComma": "all",
      "useTabs": false,
      "printWidth": 88
    }
|
I am not in favor of having an autoformatter or enforcing a particular style. Of course, everyone is free to use them on their code personally, but I feel the general style decisions we make and the code style we follow are reasonable enough. Leaving some room for variation and evolution of styles is useful too, in my opinion. Also, to me the idea of introducing webpack/etc. first for a small set of our library functions or just a dependency seems preferable to changing everything in one go. I'd like to gain some experience with the new workflow and design options first, before we decide if it is a good idea to change the whole code base at once. I won't be able to spend much time on this and can't really judge the implications and required work for all this yet. The same is true for general project layout and publishing options (npm publishing, lerna, versions, etc.). I am curious and happy to discuss these things, but I feel like it would be useful if we could decouple this decision from including Generally I would prefer fewer tools we need to build and set things up. Having to remember a >5 step update process when updating a lot of production servers isn't ideal. Of course it could be automated and abstracted, but simpler would be nicer, I believe.
👍
That's certainly reasonable for the type of application CATMAID is, and reorganizing our code base accordingly is certainly useful and something I would support (e.g. like you suggested above). Such a codebase update would certainly also benefit from more modular design options, possibly even with individually publishable components. How we would do this exactly is also affected by how well webpack and npm dependencies work for us. Therefore, I'd like to gain some experience with webpack and npm dependencies first before I can offer a more qualified opinion on such a change. :-) |
This is a good point. There's a school of thought in favour of committing the bundles (webpack's output) to git. If we did this, then the deployment complexity wouldn't change at all: the only difference would be that the javascript code living in django/applications/catmaid/static would be webpack bundles built on the dev machines rather than source, which would live elsewhere, and collectstatic wouldn't need to do any concatenation etc. I'd be in favour of this; we'd just need to run the webpack build on CI as well to prove that the bundles are in sync with the source. That would also mean that our dependencies in the output files are in-tree just like they are now, agnostic of whether they are fetched on the dev machine by npm or from in-tree. |
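The CI check mentioned here could be as small as comparing content hashes of the committed bundles against a fresh build. The sketch below assumes the hashes have already been computed (e.g. SHA-256 of each file) and that the webpack build is deterministic; the function name and file names are hypothetical.

```javascript
// Hypothetical helper: report bundles whose committed contents differ from a
// fresh build. Both arguments map bundle filename -> content hash string.
function findStaleBundles(committed, rebuilt) {
  // Consider every bundle that appears on either side.
  const names = new Set([...Object.keys(committed), ...Object.keys(rebuilt)]);
  const stale = [];
  for (const name of names) {
    // A missing entry on either side also counts as out of sync.
    if (committed[name] !== rebuilt[name]) {
      stale.push(name);
    }
  }
  return stale.sort();
}

// Example with made-up hashes: only arbor.js differs.
const staleExample = findStaleBundles(
  { "catmaid.js": "abc123", "arbor.js": "def456" },
  { "catmaid.js": "abc123", "arbor.js": "0ff1ce" }
);
console.log(staleExample);
```

CI would fail the build whenever the returned list is non-empty.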
Agreed on this front, at least in that it's a completely independent topic from the webpack infrastructure being discussed here, it would have to be a unanimous decision, and there's no compelling reason at the moment. I like autoformatting in some contexts, because it does a few things:
That said, I don't think there's good reason to do it in the CATMAID frontend now, and I don't think nontrivial JS is suited to it because of the syntax quirks and idioms in the language, unlike Rust or Python. JS also doesn't have much syntax that's difficult to manually format consistently (unlike, e.g., fn signatures in Rust with multiple generics, HRTBs, bounds). I ran several files through Prettier, and with the exception of some specific cases with lots of existing syntax noise, I was almost always happier with the original. If we still had a lot of ice age code from pre-strict-mode, pre-jshint, pre-IIFE, I would be in favor of a one-off run, but as things are I agree we've converged on a reasonable style.
Honestly I dislike this practice. It makes the repo huge, requires setting additional defaults on Having a script that does
    pip install -r requirements.txt
    manage.py migrate --all
    npm install
    webpack
    manage.py collectstatic
etc., possibly with a mode that does a non-modifying version of each first to check and display which needs to run, seems like a better amelioration for these tasks. |
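The "check first, then run" mode could be sketched like this. The commands are the ones listed above; everything else (the `planUpdate` helper, the shape of the `state` object and the individual checks) is a hypothetical stand-in for whatever each tool can actually report without modifying anything.

```javascript
// Hypothetical update planner: each step carries a non-modifying check that
// decides whether the step needs to run at all.
function planUpdate(steps, state) {
  return steps
    .filter((step) => step.needsRun(state))
    .map((step) => step.command);
}

// The checks below are illustrative assumptions, not real tool output.
const updateSteps = [
  { command: "pip install -r requirements.txt", needsRun: (s) => s.requirementsChanged },
  { command: "manage.py migrate --all", needsRun: (s) => s.pendingMigrations > 0 },
  { command: "npm install", needsRun: (s) => s.packageJsonChanged },
  { command: "webpack", needsRun: (s) => s.packageJsonChanged || s.frontendSourcesChanged },
  { command: "manage.py collectstatic", needsRun: (s) => s.packageJsonChanged || s.frontendSourcesChanged },
];

// Dry run: only frontend sources changed, so only the last two steps are listed.
const dryRunPlan = planUpdate(updateSteps, {
  requirementsChanged: false,
  pendingMigrations: 0,
  packageJsonChanged: false,
  frontendSourcesChanged: true,
});
console.log(dryRunPlan.join("\n"));
```

A real script would print this plan in check mode and execute the commands in order otherwise.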
Agreed that formatting is not a priority here; it was a throwaway comment I made when I recognised that my catmaid-lib experiments were necessarily blowing away the git history anyway. A third way of doing deployment (although whether it's simpler is up for debate) would be for travis to bundle the static files and throw a tarball onto an FTP server (on tagged releases) which could be untarred onto each catmaid server. Saves having to manage a node environment as well as python on every server, and the build step in deployment. But it would make it harder to include third-party apps, it would make the development deployment very different to production, and we'd probably want to put all of our CSS, images etc. through the same wringer, which doesn't sound much like an MVP for now. |
Noting down issues as I hit them while experimenting: If we end up using lerna (which I think is a good idea if we were to split out and put components on npm), in-tree dependencies (as d3 will have to be) would need to be owned by a particular subproject. If that dependency was needed by multiple subprojects, we'd either need to duplicate it (unpleasant), or have it owned by one subproject and re-exported for consumption by others (introduces more cross-dependencies than strictly necessary). |
Here's my experimentation repo. I haven't tested it very hard but it loads without error... It includes 3 subprojects: Arbor (which includes SynapseClustering), catmaid-lib (depends on local Arbor, vendorised d3, and some npm libraries), and CATMAID (depends on catmaid-lib, handles all the globals). catmaid-lib has been refactored to strip out IIFEs and use ES6 imports. CATMAID has some typescript and ESnext in it, which gets transpiled down to target IE11 and a few versions back of major desktop browsers; it's just got a few stubby classes which are required by catmaid-lib. I haven't looked at how bad the name mangling is. The typing isn't enforced (cf mypy). The output bundles include sourcemaps. I haven't checked whether common dependencies are deduplicated. |
Chris and I talked about the next iteration of the experiment targeting ES6 straight from TypeScript so there's only bundling, no babel, and thus minimal/no name mangling. |
Further offline discussions: As per #1955, an option for a performant 64-bit ndarray implementation is a wrapper around Rust's ndarray; a build system would be useful so we don't have to commit another large wasm blob to the source tree. Typing still has many advantages for entering, maintaining, and expanding the codebase. The build prototype doesn't represent an MVP: we could simplify it significantly for the first pass by keeping all of our JS in the same package, with vendorised dependencies (possibly switching one small, modern library over to an npm dependency, as a proof of concept). This avoids the extra layer of config and redirection that is lerna; we could then split out libraries at a later date if we wanted. The downside of keeping the vendorised JS is that it all works by mutating the global scope. Side-effect-only imports are an option, although I believe we'll lose the benefits of typescript for libraries which function that way. Heavily-used features could get typescript type definition files (cf mypy stubs) in the interim. We would still benefit from using typescript in our own code, or even just the import system, which lends itself to static analysis. Also for the interim period, we could have an Uncertain points (for me at least) are how much effort it'll take for typescript to become useful when so much of what we touch can't make use of it without us adding definitions, and also what the development workflow will look like (replacing django-pipeline, updating the extension system etc.). |
A bit more tracking: The build prototype above is just about splitting and refactoring existing JS code into a webpack-y fashion - it gets us closer to having resolvable dependencies and so on once webpack is integrated with CATMAID, but doesn't do anything to get us there, and so is probably more of a stage 2 thing. I did do some work on a prototype to get django and webpack playing with each other, and got it working to an extent. Basically, you run a webpack dev server alongside the django dev server. The webpack dev server detects changes to frontend files and recompiles them in memory, while dumping out a file giving information on its last build. The django dev server reads that file and knows to make requests for static files to the webpack dev server rather than its own static directory. The webpack bundles can still happily make requests to django's REST endpoints: the landing page is handled by django's templating, but loads the static files from the webpack dev server. In theory, all we need to do is write a monster I'll push it when I'm home so you can take a look. Still some outstanding queries:
e.g.
    (function(CATMAID) {
      CATMAID.variable = "potato";
    })(CATMAID);
becomes
    export default function(CATMAID) {
      CATMAID.variable = "potato";
    };
and
    const CATMAID = {};
    import myLibFn from "./myLib.js";
    myLibFn(CATMAID)
    import myOtherLibFn from "./myOtherLib.js";
    myOtherLibFn(CATMAID)
    ...
This would just be for the first pass of getting webpack to do what pipeline is doing currently. In the longer term we want files to explicitly import what they need from each other. Exports not currently tied to the
Re. timeline: we may possibly be having a CATMAID hackathon some time around May, possibly with a view to hiring another dev or two. If we're going to make significant changes to the dev workflow, we probably want to do that before undertaking hackathon projects and introducing people to the codebase. |
Django + webpack prototype, with a CATMAID-y backend layout, but frontend files put into their own top-level directory. This is a proof of concept for a dev setup: haven't tested a production deployment. |
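The handoff between the two dev servers described above (webpack writes a file about its last build, and the backend uses it to decide where static files come from) boils down to logic like the following. In the prototype this would live on the Django side; it is written in JavaScript here only to stay in the thread's language. The stats format, field names and URLs are assumptions for illustration, not the prototype's actual file layout.

```javascript
// Assumed stats format, written by the webpack dev server after each build:
//   { status: "done",
//     publicPath: "http://localhost:8080/static/",
//     chunks: { main: ["main.js"] } }
function resolveBundleUrls(stats, chunkName, fallbackPrefix) {
  // No finished dev-server build: fall back to the regular static directory.
  if (!stats || stats.status !== "done") {
    return [fallbackPrefix + chunkName + ".js"];
  }
  // Otherwise point the page at the files served by the webpack dev server.
  const files = stats.chunks[chunkName] || [];
  return files.map((file) => stats.publicPath + file);
}

const devStats = {
  status: "done",
  publicPath: "http://localhost:8080/static/",
  chunks: { main: ["main.js"] },
};
console.log(resolveBundleUrls(devStats, "main", "/static/"));
console.log(resolveBundleUrls(null, "main", "/static/"));
```

The landing page template would then emit one script tag per returned URL, while REST requests keep going to Django as before.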
This is an internal tracking ticket for infrastructure for compiling client code through Django-pipeline that @tomka and I have been discussing for some time.
Motivation
Requirements
These are the capabilities any compiled pipeline must maintain:
NOTE: The below plan is no longer valid. See the discussion beginning in 2019 below for a webpack-based approach.
Draft Plan
Drawbacks
Alternatives