Regex with negative character group is not correctly handled #2373

Closed
8 tasks done
danny0838 opened this issue Nov 18, 2022 · 1 comment
Labels: bug (Something isn't working), fixed (issue has been addressed)

danny0838 commented Nov 18, 2022

Prerequisites

  • I verified that this is not a filter list issue. Report any issues with filter lists or broken website functionality in the uAssets issue tracker.
  • This is not a support issue or a question. For any support, questions or help, visit /r/uBlockOrigin.
  • I performed a cursory search of the issue tracker to avoid opening a duplicate issue.
  • The issue is not present after disabling uBO in the browser.
  • I checked the documentation to understand that the issue I am reporting is not normal behavior.

I tried to reproduce the issue when...

  • uBO is the only extension.
  • uBO uses default lists and settings.
  • using a new, unmodified browser profile.

Description

The current token extraction algorithm does not check whether a character group is negated, so [^&] is treated as [&], which results in an invalid token.

A variant of this regex might be something like [^0-9A-Za-z%,;], used to match a specific separator.
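To illustrate why the negation matters, here is a minimal standalone sketch. The TOKEN_CHARS value and the classMatchesTokenChar helper are assumptions made for this example, not uBO's actual definitions; the point is only that [&] and [^&] give opposite answers to "can this class match a token character?".

```javascript
// Assumption: a stand-in for the set of characters allowed in a token.
const TOKEN_CHARS = '0123456789abcdefghijklmnopqrstuvwxyz%';

// Hypothetical helper: true if the character class can match at least
// one of the token characters.
function classMatchesTokenChar(classSource) {
  const re = new RegExp(classSource);
  return [...TOKEN_CHARS].some(c => re.test(c));
}

// [&] can only match '&', which is not a token character.
const positive = classMatchesTokenChar('[&]');
// [^&] matches everything except '&', including token characters.
const negative = classMatchesTokenChar('[^&]');
// The separator variant excludes every character in TOKEN_CHARS.
const separator = classMatchesTokenChar('[^0-9A-Za-z%,;]');

console.log({ positive, negative, separator });
```

Treating [^&] as [&] therefore flips the result for the first two cases, which is the mismatch described above.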

To fix the issue, we may need to implement full support for negated character groups and character ranges. One strategy is to rebuild a regex from the character class and test whether it matches a string containing all token characters:

case 8: /* T_CHARGROUP, 'CharacterGroup' */ {
  let re = [];
  for (let i = 0, I = node.val.length; i < I; i++) {
    const n = node.val[i];
    switch (n.type) {
      case 32: /* T_UNICODECHAR, 'UnicodeChar' */ {
        re.push('\\u' + n.flags.Code);
        break;
      }
      case 64: /* T_HEXCHAR, 'HexChar' */ {
        re.push('\\x' + n.flags.Code);
        break;
      }
      case 128: /* T_SPECIAL, 'Special' */ {
        re.push('\\' + n.val);
        break;
      }
      case 256: /* T_CHARS, 'Characters' */ {
        n.val.forEach(v => re.push(this.escapeCharacterClass(v)));
        break;
      }
      case 512: /* T_CHARRANGE, 'CharacterRange' */ {
        const start = this.charFromCharacterRangeVal(n.val[0]);
        const end = this.charFromCharacterRangeVal(n.val[1]);
        re.push(this.escapeCharacterClass(start) + '-' + this.escapeCharacterClass(end));
        break;
      }
    }
  }
  re = new RegExp('[' + (node.flags.NegativeMatch ? '^' : '') + re.join('') + ']');
  return re.test(TOKEN_CHARS) ? '\x01' : '\x00';
}

Related additional methods for the regex class:

static escapeCharacterClass(s) {
  return ['^', '-', ']'].includes(s) ? '\\' + s : s;
}

static charFromCharacterRangeVal(val) {
  if (typeof val === 'string') {
    return val;
  }
  /* T_HEXCHAR or T_UNICODECHAR */
  return val.flags.Char;
}
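As a rough check of the escaping helper, the sketch below copies escapeCharacterClass into a standalone function and uses a hypothetical buildClass wrapper (not part of the proposal) to rebuild a class from literal characters. It is meant only to show why '^', '-' and ']' need escaping when the class is reassembled.

```javascript
// Standalone copy of the helper above, for illustration only.
function escapeCharacterClass(s) {
  return ['^', '-', ']'].includes(s) ? '\\' + s : s;
}

// Hypothetical wrapper: rebuild a character class source from a list of
// literal characters.
function buildClass(chars, negative) {
  return '[' + (negative ? '^' : '') +
    chars.map(escapeCharacterClass).join('') + ']';
}

// Without escaping, these characters would yield "[^-]]", which parses as
// a negated class followed by a literal ']'.
const source = buildClass(['^', '-', ']'], false);
const re = new RegExp(source);

console.log(source, re.test('-'), re.test(']'), re.test('^'), re.test('a'));
```

Note that this helper does not escape a literal backslash; whether it needs to depends on how the parser represents such characters, which is not shown here.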

A specific URL where the issue occurs.

https://example.com/?p=fooadsbar

Steps to Reproduce

  1. Add a user rule /^https?://(?:example\.com|ex\.com)/\?(?:p|pat)=[^&]*ads[^&]*(?:[&#]|$)/$document
  2. Visit https://example.com/?p=fooadsbar
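As a sanity check independent of uBO, the regex from the rule above can be run directly against the URL. This sketch only confirms that the pattern itself matches, so the missed block is not caused by the regex; the second URL is a made-up counterexample.

```javascript
// Pattern copied verbatim from the user rule (without the surrounding
// slashes and the $document option).
const pattern = String.raw`^https?://(?:example\.com|ex\.com)/\?(?:p|pat)=[^&]*ads[^&]*(?:[&#]|$)`;
const ruleRe = new RegExp(pattern);

// URL from the report.
const matchesReported = ruleRe.test('https://example.com/?p=fooadsbar');

// Made-up counterexample: "ads" appears only after the '&', so [^&]*
// cannot reach it.
const matchesOther = ruleRe.test('https://example.com/?p=foo&x=ads');

console.log({ matchesReported, matchesOther });
```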

Expected behavior

The page should be blocked.

Actual behavior

The page is not blocked.

uBO version

1.44.4

Browser name and version

107.0.5304.107

Operating System and version

Independent

gorhill added a commit to gorhill/uBlock that referenced this issue Nov 18, 2022
gorhill commented Jan 8, 2023

Closing as fixed as per #2374 (comment).

gorhill closed this as completed Jan 8, 2023
gorhill added the fixed and bug labels Jan 8, 2023