Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Enable mangling of global references #1326

Open
wants to merge 2 commits into
base: harmony
Choose a base branch
from
Open

WIP: Enable mangling of global references #1326

wants to merge 2 commits into from

Conversation

AshleyScirra
Copy link
Contributor

This is an experimental work-in-progress patch to fix issue #1313.

The problem is currently window.Foo will mangle "Foo", but then Foo.bar will not mangle "Foo", breaking code. This patch attempts to recognise global references which are not identified as referencing any local variables, then also mangle them. The intent is to mangle "Foo" in both cases in the previous example.

The caller must pass a new treat_as_global array. If this is non-empty it enables the new mode. The caller can pass e.g. window to then recognise window.Foo as declaring "Foo" as global.

I don't really understand how the existing APIs work and a failing test is also included because I can't figure out how to get it to work. The intent is that if the left side of a dot expression refers to a local variable, then it should not treat it as defining a global property. For example if the caller passes "self" in treat_as_global, then the following should work:

// treat "Foo" as global
self.Foo = {};

function test() {
 var self = this;

 // do not treat "Bar" as global - 'self' refers to local var
 self.Bar = {};
};

To try to identify this I call node.global() but it does not seem to properly identify this case. I am also unsure as to whether to call figure_out_scope, which seems to be a destructive operation that appeared to partially undo previous mangling in my testing, so I'm not sure what's happening there.

There is also one more big hack: renaming globals opens the possibility of name collisions with local variables. For example:

window.Foo = {};

function test(p) {
 Foo.bar();
};

could be mangled as:

window.p = {};

function test(p) {
 p.bar(); // oops - now collides with parameter
};

To avoid this I simply prefix all globally-mangled names with g_, in the hope that g_XYZ will never collide with local variables (which I assume possibly incorrectly can never start with g_). Obviously this isn't a great solution and there are a range of alternative ways to solve this. For example local variables could use only lowercase names, and globals uppercase. Alternatively local and global mangling could share dictionaries of the names they use so globals will never mangle to the name of a local variable used anywhere in the program. These require wider changes though, and I would prefer to hear from other devs before implementing one of them.

Sorry about the partial state of all this, it's a complex codebase which I am unfamiliar with.

This enables mangling of global references using a given list of global
alias names, which can be e.g. "window", "self", etc. depending on
context.
@@ -83,7 +83,8 @@ function mangle_properties(ast, options) {
cache : null,
only_cache : false,
regex : null,
ignore_quoted : false
ignore_quoted : false,
treat_as_global : []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a trailing comma to make smaller diffs for future options


// TODO: don't know if this is necessary to get scope information?
//if (mangle_globals)
// ast.figure_out_scope();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary.

node.expression.global() &&
is_global_alias(node.expression.name) &&
!is_global_alias(node.property))
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uglify code style puts the appends { to the previous line while leaving indenting as you have it. Yes it's unusual.

if (mangle_globals &&
node.global() &&
!is_global_alias(node.name))
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style

// HACK: to avoid mangled global references colliding with local variable names, use
// a prefix on all mangled global names to effectively move them to a different namespace.
if (mangle_globals && name in cache.global_defs)
mangled = "g_" + mangled;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not crazy about this hack. Collisions can be avoided in a similar fashion to the other PR you wrote.

if (mangle_globals &&
node.global() &&
node.name in cache.global_defs)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style

@kzc
Copy link
Contributor

kzc commented Oct 10, 2016

Neutral on the PR, but the concept seems okay behind a flag defaulting to false.

Needs a command line flag and documentation.

Please put a WIP: prefix in PR title.

@AshleyScirra
Copy link
Contributor Author

Can you point me in the direction of how to fix the failing test case? I just need a reliable way to identify if a symbol refers to a local variable or not.

@kzc
Copy link
Contributor

kzc commented Oct 11, 2016

AST_SymbolRef#global() will tell you whether it's global or not. Mind you if a global is aliased via assignment or passed via a function argument all bets are off. There's no data flow analysis in uglify.

Update style and remove some commented code
@AshleyScirra AshleyScirra changed the title Enable mangling of global references WIP: Enable mangling of global references Oct 11, 2016
@AshleyScirra
Copy link
Contributor Author

I only intend to statically identify local variables, so the var self = this doesn't treat self.foo as a global definition in that scope if "self" is one of the global aliases. As far as I can tell I am calling AST_SymbolRef#global() on the left side of the dot (i.e. the self referring to a local variable), but it still returns true for some reason. I guessed maybe it was because it hadn't figured out the scope? That's why I had that commented-out code in there. If that's not the problem I'm at a loss as to how to fix it.

@kzc
Copy link
Contributor

kzc commented Oct 11, 2016

AST_SymbolRef#global() does work. Something else is wrong in your algorithm. Add some debug print statements.

You still have to handle AST_Sub cases and mangle collisions as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants