text masking settings apply to inputs #1097

Open
wants to merge 1 commit into
base: master

mdellanoce
Contributor

@mdellanoce mdellanoce commented Jan 17, 2023

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from eae89e8 to c4980bc Compare January 17, 2023 15:29
@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from c4980bc to 96fb991 Compare February 1, 2023 17:36
if (classMatchesRegex(el, maskTextClass, true)) return true;
}
const maskDistance = distanceToMatch(el, maskTextClass, maskTextSelector);
const unmaskDistance = distanceToMatch(
Member

I think skipping this calculation when maskAllText is false could improve the performance.

Contributor Author

updated this so that if maskAllText is true, and unmasking either is not configured (best case) or fails to find a match (worst case), then the masking distance is not computed.

Contributor

According to the documentation, unmaskTextClass also works with the conventional masking techniques. Maybe we need to check whether a mask is applied first, before we check the unmask distance.

Contributor Author

I added a branch for when maskAllText is false, where masking is checked first and unmasking is checked only if masking applies to the element. The opposite happens when maskAllText is true, which addresses @YunFeng0817's original comment.

@Juice10
Contributor

Juice10 commented Feb 3, 2023

Hi @mdellanoce, thanks for creating this Pull Request, I think this is a great idea to help give people more privacy and this should definitely be included in rrweb.

I did notice one potential issue with this implementation: the needMaskingText function has been a performance problem in the past because it gets called on every single node that gets recorded (sometimes tens of thousands or even hundreds of thousands of nodes get recorded at once), and because the new needMaskingText now compares distances for both masked and unmasked parents, it's doing twice as much work.
Maybe you could implement some escape hatches to simplify the work it has to do. For example, if there is no masked parent, then there is no need to figure out what the unmasked distance is.

Thanks again for submitting this, I think if we can get this performant this would be a great addition to rrweb!

@mdellanoce
Contributor Author

@Mark-Fenng @Juice10 thanks for the feedback, totally understand the performance concerns. I think the maskAllText suggestion will work, and I have a few other ideas. I'll see what I can do and hopefully ping you back in a few days.

@YunFeng0817 YunFeng0817 linked an issue Feb 6, 2023 that may be closed by this pull request
1 task
mdellanoce added a commit to pendo-io/rrweb that referenced this pull request Feb 13, 2023
@changeset-bot

changeset-bot bot commented Feb 13, 2023

🦋 Changeset detected

Latest commit: b1a1922

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 8 packages
Name Type
rrweb-snapshot Patch
rrweb Patch
rrdom Patch
rrdom-nodejs Patch
rrweb-player Patch
@rrweb/types Patch
@rrweb/web-extension Patch
rrvideo Patch

mdellanoce added a commit to pendo-io/rrweb that referenced this pull request Feb 13, 2023
@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from ac3ea6e to 69e5fb3 Compare February 13, 2023 20:58
mdellanoce added a commit to pendo-io/rrweb that referenced this pull request Feb 13, 2023
@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 69e5fb3 to 0f6a611 Compare February 13, 2023 21:13
@mdellanoce
Contributor Author

@YunFeng0817 @Juice10 sorry for the delay, but I added 3 commits to address performance:

  1. If maskAllText is true, and unmasking either is not configured (best case) or fails to find a match (worst case), then the masking distance is not computed, and needMaskingText returns true early.
  2. I removed the default for unmaskTextClass, so that unmasking overhead won't "surprise" anyone with existing setups until they opt in to using it.
  3. I reworked the distance computation so that it is a single walk from the element to the root node. Each step checks for a class match first, then checks selectors. Also, in the masking distance computation, if the distance at a given step exceeds what was found for the unmasking distance, it stops walking upwards immediately and returns a non-match.

Hope that all makes sense. Let me know what you think.
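
For readers following along, points 1 and 3 could be sketched roughly as below. This is a simplified illustration and not the code in this PR: `NodeLike`, `distanceToClass` and `needMaskingTextSketch` are made-up names, only class matching is shown (no selectors), and the tie-break (a mask at the same distance as an unmask still counts) is an assumption read from the description above.

```typescript
// Minimal stand-in for a DOM element so the sketch runs outside a browser.
interface NodeLike {
  classList: string[];
  parent: NodeLike | null;
}

// Steps from `el` up to the nearest node (itself included) carrying
// `className`, or -1 if there is none within `limit` steps.
function distanceToClass(
  el: NodeLike,
  className: string,
  limit: number = Infinity,
): number {
  let distance = 0;
  for (let node: NodeLike | null = el; node !== null; node = node.parent) {
    if (distance > limit) return -1; // a closer competing match already exists
    if (node.classList.includes(className)) return distance;
    distance++;
  }
  return -1;
}

function needMaskingTextSketch(
  el: NodeLike,
  maskClass: string,
  unmaskClass: string | null, // null: unmasking is not configured
  maskAllText: boolean,
): boolean {
  const unmaskDistance =
    unmaskClass === null ? -1 : distanceToClass(el, unmaskClass);
  // Point 1: nothing can unmask this element, so skip the mask walk entirely.
  if (maskAllText && unmaskDistance < 0) return true;
  // Point 3: cap the mask walk at the unmask distance (no cap if no unmask match).
  const limit = unmaskDistance < 0 ? Infinity : unmaskDistance;
  return distanceToClass(el, maskClass, limit) >= 0;
}
```

With maskAllText on and no unmask class configured, the function returns after at most one cheap check, which is the best case described in point 1.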

attributes.value = maskInputValue({
type: attributes.type,
tagName,
value,
maskInputOptions,
maskInputFn,
forceMask,
Contributor

Does this mean that maskAllText will also mask inputs? Do we need (un)maskInput(Selector|Class) as well so they can be selectively (un)masked?

Contributor Author

the masking and unmasking options in general apply to inputs, including maskAllText. So I was thinking unmaskSelector/Class already cover this.

the forceMask parameter is a little awkward, maybe there's a better way to do that. I'm evaluating the masking/unmasking settings outside of this function call to get a true/false value to pass to forceMask.

mdellanoce added a commit to pendo-io/rrweb that referenced this pull request Mar 14, 2023
@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 0f6a611 to cc38118 Compare March 14, 2023 19:32
@Juice10
Contributor

Juice10 commented Mar 15, 2023

@mdellanoce This PR is coming along really nicely. Since performance is such an important part of this, it probably makes sense to re-run the existing benchmarks and add some benchmarks of our own to understand the performance impact.
Check out PR #903, it should give you a handy guideline. Specifically, check out packages/rrweb/test/benchmark/dom-mutation.test.ts, which contains these benchmarks.

mdellanoce added a commit to pendo-io/rrweb that referenced this pull request Apr 6, 2023
@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch 2 times, most recently from 39c1af0 to 79e113e Compare April 10, 2023 16:29
@mdellanoce
Contributor Author

mdellanoce commented Apr 10, 2023

@Juice10 I added 2 new benchmarks for masking and unmasking.

Here are the results from master:

[four benchmark result screenshots; images not captured]

Note: the unmasking test on master is effectively the same as the masking test, since unmasking is unsupported there.

Here are the results from my branch:

[four benchmark result screenshots; images not captured]

(commits with changes to follow soon, currently in rebase hell) rebased

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 3b7ded7 to 32b726b Compare April 10, 2023 16:49
@mdellanoce
Contributor Author

@Juice10 ready for another look, i think

Contributor

@billyvg billyvg left a comment

@mdellanoce I meant to make these PRs to your fork, but I updated my local branches with master and it's causing bad diffs, so I'll just link the PRs from my fork. If you update your branch with rrweb master, I can send out PRs.

  1. fix tests with maskAllText, add textarea to mask-text.html (getsentry/rrweb#97): This fixes tests that were not properly using maskAllText, and shows that unmasking does not work. I've also added a <textarea> to mask-text.html, as it has a textContent prop that can conflict with its value attribute when masking.

  2. Add test for dynamically added inputs (getsentry/rrweb#98): adds additional tests for dynamically added inputs and different configurations of maskAllInputs.

There are additional comments within these PRs that point out the failing/desired test results.

Let me know if you have any questions and if I can assist in any way!

@@ -597,6 +597,7 @@ export function generateRecordSnippet(options: recordOptions<eventWithTime>) {
maskTextSelector: ${JSON.stringify(options.maskTextSelector)},
maskAllInputs: ${options.maskAllInputs},
maskInputOptions: ${JSON.stringify(options.maskAllInputs)},
maskInputFn: ${options.maskInputFn},
Contributor

You're not passing maskAllText here, so it's not being tested at all. To see this, add a <div> with text content as a direct child of <body>.

Contributor Author

fixed. also looks like the unmasking issue was the same (not passing unmaskTextSelector here), so I updated that as well. The test snapshots look correct to me now, but would definitely appreciate your eyes on them as well.

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 5ffcb9f to afec5a2 Compare July 13, 2023 14:09
@mdellanoce
Contributor Author

mdellanoce commented Jul 13, 2023

@billyvg i synced the branch with master, i'm looking into the unmasking issue you pointed out

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch 2 times, most recently from 2f726fb to 0bccb5b Compare July 13, 2023 14:52
\\"childNodes\\": [
{
\\"type\\": 3,
\\"textContent\\": \\"*******\\\\n \\",
Contributor

Should this be unmasked since it has rr-unmask class?

Contributor Author

i don't think so. one of the later commits makes unmaskTextClass empty by default, so since i didn't specify it in the test, this shouldn't be unmasked. I could remove the rr-unmask from the test file though, since I can see how that is confusing.

\\"childNodes\\": [
{
\\"type\\": 3,
\\"textContent\\": \\"unmask2\\",
Contributor

Looks correct, parent has [data-masking="false"]

@mdellanoce
Contributor Author

@billyvg

  1. okay, I see this, I must be missing some case for textarea, b/c it looks like it masked the value attribute, but not the textContent. Will fix that up.
  2. This one is because maskAllInputs has precedence over the text masking settings. We decided to give the existing input masking options (maskAllInputs and maskInputOptions) highest priority here, so our customers don't end up accidentally unmasking something sensitive. Maybe I just need to update the docs to be more explicit about the masking precedence?

@billyvg
Contributor

billyvg commented Jul 14, 2023

> @billyvg
>
> 1. okay, I see this, I must be missing some case for textarea, b/c it looks like it masked the value attribute, but not the textContent. Will fix that up.

What I did on my fork was strip textContent completely from textarea, but my implementation of maskAllText was different because it was independent of maskAllInputs. So you could have maskAllText != maskAllInputs, and textareas could have only one of value/textContent masked, and the other unmasked. Not sure if this is the case with your impl of maskAllText

> 2. This one is because maskAllInputs has precedence over the text masking settings. We decided to give the existing input masking options (maskAllInputs and maskInputOptions) highest priority here, so our customers don't end up accidentally unmasking something sensitive. Maybe I just need to update the docs to be more explicit about the masking precedence?

Ah I see, that makes sense then, i was purely going from what I saw in the test snapshots (the snapshots make the tests hard to reason about). Documenting the masking precedence would definitely be helpful!

@mdellanoce
Contributor Author

> but my implementation of maskAllText was different because it was independent of maskAllInputs. So you could have maskAllText != maskAllInputs, and textareas could have only one of value/textContent masked, and the other unmasked. Not sure if this is the case with your impl of maskAllText

Yeah, that's what is happening here too, as far as I can tell. The test sets maskAllInputs to true, which masks the value of the textarea, but the textContent is not masked because that is considered "not an input" and maskAllText is not set to true. Definitely seems incorrect.

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 0bccb5b to d74b662 Compare July 17, 2023 17:49
@mdellanoce
Contributor Author

@billyvg fixed the textarea masking. it can still be a little wonky with the default masking functions, b/c one (maskTextFn) excludes whitespace from masking, while the other (maskInputFn) masks every character regardless.
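
To make the mismatch concrete, here is a small sketch of the two behaviours as described in this comment. The function names are made up for illustration, and the bodies are assumed from the description (one skips whitespace, the other masks every character) rather than copied from rrweb.

```typescript
// Assumed behaviour of the default text masking: whitespace is left alone.
const defaultMaskTextSketch = (text: string): string =>
  text.replace(/\S/g, '*');

// Assumed behaviour of the default input masking: every character is masked.
const defaultMaskInputSketch = (value: string): string =>
  '*'.repeat(value.length);

// The same textarea content ends up different depending on which path masks it.
const sample = 'line one\nline two';
const asText = defaultMaskTextSketch(sample); // keeps the space and the newline
const asInput = defaultMaskInputSketch(sample); // a solid run of asterisks
console.log(JSON.stringify(asText), JSON.stringify(asInput));
```

So a textarea whose textContent goes through one function and whose value goes through the other can show two different masked strings for the same content.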

@billyvg
Contributor

billyvg commented Jul 17, 2023

> @billyvg fixed the textarea masking. it can still be a little wonky with the default masking functions, b/c one (maskTextFn) excludes whitespace from masking, while the other (maskInputFn) masks every character regardless.

Should we have a default masking regex/function?

@Juice10 @YunFeng0817 Can you take another look when you all get the chance?

if (type === 'radio' || type === 'checkbox') {
isChecked = (target as HTMLInputElement).checked;
} else if (
maskInputOptions[tagName.toLowerCase() as keyof MaskInputOptions] ||
- maskInputOptions[type as keyof MaskInputOptions]
+ maskInputOptions[type as keyof MaskInputOptions] ||
+ forceMask
Contributor

I don't think we can re-use needMaskingText/forceMask here because we could have an unmaskTextSelector that matches which means forceMask is false, but maskInputOptions is true.

forceMask needs to have 3 outcomes I suppose: 1) matches mask text class, 2) matches unmask, 3) does not match any masking OR unmasking

Only in the 3rd case should we fallback to checking maskInputOptions
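
The three-outcome idea in this comment could look something like the sketch below. It illustrates the reviewer's suggestion only, not what the PR does (the PR keeps the input masking options at highest priority); `TextMaskMatch` and `shouldMaskInputSketch` are hypothetical names.

```typescript
// The three outcomes proposed above for the text mask/unmask check.
type TextMaskMatch = 'mask' | 'unmask' | 'none';

function shouldMaskInputSketch(
  match: TextMaskMatch,
  inputOptionEnabled: boolean, // e.g. the maskInputOptions entry for this input type
): boolean {
  if (match === 'mask') return true; // 1) matched a mask class/selector
  if (match === 'unmask') return false; // 2) matched an unmask class/selector
  return inputOptionEnabled; // 3) no match either way: fall back to input options
}
```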

Contributor Author

not sure i follow

> because we could have an unmaskTextSelector that matches which means forceMask is false, but maskInputOptions is true

in this case, it'd still be masked b/c maskInputOptions is true, and the input masking options have priority on inputs like we discussed above?

Contributor

🤦 sorry - maskAllInputs sets defaults on maskInputOptions, overriding the options I pass it.

@mdellanoce
Contributor Author

@Juice10 @YunFeng0817 something I've been playing with in a different branch:

pendo-io@6516f54

I noticed when taking a snapshot or processing a large mutation, we'd be evaluating masking selectors for the same nodes over and over. This commit caches those results temporarily to avoid re-evaluating the selectors when the result for a given node is known already. In my testing, it seems to provide a minor improvement (~10ms faster for a 100ms snapshot). Thought it might provide a bigger boost, so that was a bit of a disappointment, though I've seen it be quite a bit faster in certain configurations.

Not sure my approach here is the best/cleanest, but figured I'd throw it out there in case performance concerns are still a sticking point on this PR.
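
As a rough picture of the caching idea (not the code in the linked commit), the sketch below memoizes a per-node "is masked" result in a WeakMap for the duration of one snapshot or mutation batch. `NodeLike`, `makeCachedMaskCheck` and the class-only matching are assumptions made to keep the example self-contained.

```typescript
// Minimal stand-in for a DOM node so the sketch runs outside a browser.
interface NodeLike {
  classList: string[];
  parent: NodeLike | null;
}

function makeCachedMaskCheck(maskClass: string) {
  let cache = new WeakMap<NodeLike, boolean>();
  let evaluations = 0; // counts real (uncached) checks, for illustration only

  function isMasked(node: NodeLike | null): boolean {
    if (node === null) return false;
    const cached = cache.get(node);
    if (cached !== undefined) return cached; // result already known for this node
    evaluations++;
    const result =
      node.classList.includes(maskClass) || isMasked(node.parent);
    cache.set(node, result);
    return result;
  }

  return {
    isMasked,
    // Call when the snapshot/mutation batch is done, since classes may change later.
    reset: (): void => {
      cache = new WeakMap<NodeLike, boolean>();
    },
    evaluations: (): number => evaluations,
  };
}
```

Siblings then share the work done for their common ancestors: the second sibling only evaluates itself and hits the cache for its parent.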

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from d74b662 to 2665e50 Compare August 28, 2023 18:43
@eoghanmurray
Contributor

eoghanmurray commented Nov 17, 2023

I haven't read this PR in detail, but one idea could be that maskDistance be passed down the call chain (and incremented at each level), like the boolean needsMask is passed down in #1349 (✓ that is merged to trunk now), although some sort of distanceToMatch would likely still be needed for checking whether a mask is needed in a newly added node from a mutation.

@eoghanmurray
Contributor

eoghanmurray commented Nov 24, 2023

So #1349 is merged now which I wrote before being aware of this PR; it fixes performance problems with the prior approach namely the constant looking back up the tree when recursing during full snapshot.
The approach from that PR should render the need for caching moot, as each node should only be checked once (with the exception of mutations, where you always need to look back up the tree after insertion)
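
For anyone who hasn't looked at #1349, the general shape of "pass the flag down instead of looking back up" is something like this. It is a toy serializer written for this comment, not the code from that PR; `TreeNode` and `serializeTextSketch` are made-up names and only class matching is shown.

```typescript
// Toy tree: element-like nodes with optional text and children.
interface TreeNode {
  classList: string[];
  text?: string;
  children: TreeNode[];
}

function serializeTextSketch(
  node: TreeNode,
  maskClass: string,
  needsMask: boolean = false, // inherited from the parent call
  out: string[] = [],
): string[] {
  // Once an ancestor has matched, no further check is needed for descendants.
  const masked = needsMask || node.classList.includes(maskClass);
  if (node.text !== undefined) {
    out.push(masked ? node.text.replace(/\S/g, '*') : node.text);
  }
  for (const child of node.children) {
    serializeTextSketch(child, maskClass, masked, out);
  }
  return out;
}
```

Each node is checked at most once on the way down, which is why a separate cache stops being necessary for full snapshots.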

I've tried to rebase this PR on top of that, but the merge is non-trivial; I might be able to do it given half a day.

Possibly it needs a new approach now:

  • having a boolean variable which controls mask/unmask passing down the tree might now do the job of distance calculation (in #1349 we can stop further checking after this is found to be true, as we don't have to consider the 'unmask' case)
  • for mutations, where both mask & unmask are set, we could do closestParent = el.closest(maskSelector + ', ' + unmaskSelector), which might be good enough to find the closest element matching either one, at which point we could check closestParent.matches(maskSelector) vs. closestParent.matches(unmaskSelector)
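
A minimal sketch of the second bullet, assuming both selectors are non-empty. `ElementLike` is a hypothetical structural type that only requires the standard `closest` and `matches` methods, so the function can be exercised without a browser; the rule that mask wins when the nearest element matches both selectors is also an assumption.

```typescript
// Only the two DOM methods the idea relies on.
interface ElementLike {
  closest(selector: string): ElementLike | null;
  matches(selector: string): boolean;
}

function needsMaskAfterMutationSketch(
  el: ElementLike,
  maskSelector: string,
  unmaskSelector: string,
  maskAllText: boolean,
): boolean {
  // One upward search finds the nearest element (el included) matching either selector.
  const nearest = el.closest(`${maskSelector}, ${unmaskSelector}`);
  if (nearest === null) return maskAllText; // nothing matched: use the global default
  // Assumption: if the nearest element matches both, masking wins.
  return nearest.matches(maskSelector);
}
```

This replaces two distance walks with a single `closest` call plus one `matches` call on the result.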

Another option as a first step would be to disallow maskTextClass/Selector when maskAllText is on, in order to avoid any distance calculation.

(Also, it might be nice to first merge the maskTextClass/Selector options for simplicity, and/or combine all into a single 'maskTextOptions' object similar to MaskInputOptions)

@mdellanoce
Contributor Author

@eoghanmurray i'll work on rebasing and modifying the approach

@eoghanmurray
Contributor

If you wish to contribute the fix for #874 as a separate PR that might make review easier, we should be able to fasttrack that.

@mdellanoce mdellanoce force-pushed the md-mask-enhancements-1096 branch from 2665e50 to fe88c35 Compare December 7, 2023 16:16
@mdellanoce mdellanoce changed the title text & input mask enhancements text masking settings apply to inputs Dec 7, 2023
@mdellanoce
Contributor Author

@eoghanmurray i rebased and updated this PR to only address #874. I removed all the new unmasking logic, since I can apply that through the maskText/InputFn callbacks now that they receive the element as a parameter.

I also added a check for maskTextSelector == '*' to return true faster from needMaskingText without needing to run matches or closest.
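
The shortcut is roughly the following. This is a sketch rather than the PR's code: `needMaskingTextShortcut` is a made-up name, and `runSelectorChecks` stands in for whatever class/selector matching normally runs.

```typescript
function needMaskingTextShortcut(
  maskTextSelector: string | null,
  runSelectorChecks: () => boolean, // the usual matches/closest based path
): boolean {
  // '*' matches every element, so the answer is known without touching the DOM.
  if (maskTextSelector === '*') return true;
  return runSelectorChecks();
}
```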

@colingm colingm force-pushed the md-mask-enhancements-1096 branch from fe88c35 to fc9b27b Compare May 1, 2024 17:44
@colingm
Contributor

colingm commented May 1, 2024

@eoghanmurray it has been a while since we commented on this. This PR has been cut down to just address #874. Could we get another look here?

@colingm colingm force-pushed the md-mask-enhancements-1096 branch from 39defb8 to b1a1922 Compare May 1, 2024 19:56
Successfully merging this pull request may close these issues.

[Feature Request]: text & input masking enhancements
maskTextClass and maskTextSelector not working
6 participants