Skip to content

Commit

Permalink
[Security Solution] Fix incorrect changes highlighting in diff view (e…
Browse files Browse the repository at this point in the history
…lastic#205138)

**Resolves: elastic#202016

## Summary

This PR resolves an issue where the diff view incorrectly marked certain
characters as changed (using bold font) in some cases.

## Root Cause
The issue arises from a bug in the `diff` library (v5). The library is
used to compute two-way diffs between strings (old field value and new
field value), producing an array of change objects that is then used for
rendering.

Conditions for the bug:
- `diff` v5 library is in use (fixed in v6 and above) and
`DiffMethod.WORDS` is passed as a parameter to it.
- The old field value contains a line with an addition separated by a
space (see example below).
- The next line contains some changes (additions, deletions, or
updates).


For example, for these input strings:
```
foo bar
spring
```
```
foo
sprint
```

| You would expect to see this diff | But you actually see this |
|----------|----------|
| <img width="119" alt="expected"
src="https://github.com/user-attachments/assets/c41b3dec-e578-4a12-8eb8-91fbb60d7247"
/> | <img width="118" alt="actual"
src="https://github.com/user-attachments/assets/f2a33fee-5de2-4291-876a-e7575ea07079"
/> |

**A more real-life example**
<img width="1661" alt="more_real_life"
src="https://github.com/user-attachments/assets/91ebfe93-81ad-45c8-8f9b-e173c2cf940b"
/>


## Solution
Switching to `DiffMethod.WORDS_WITH_SPACE` avoids this issue. 
Screenshot showing the difference between `DiffMethod.WORDS` and
`DiffMethod.WORDS_WITH_SPACE`:
<img width="675" alt="words_vs_words_with_space"
src="https://github.com/user-attachments/assets/3c91e1d2-63fc-4fcd-a762-a905878bfc3a"
/>

## Other changes
- Removed `DiffMethod.TRIMMED_LINES` since it's now
[deprecated](kpdecker/jsdiff#486) in the `diff`
library and we are not using it anyways.
- Stopped using the "zip" option since I believe it produces a less
readable diff, especially for cases when there's a different number of
lines in the original value vs updated value.

<details>
<summary><strong>Screenshots: with and without "zip" (click to
expand)</strong></summary>
<strong>With the "zip" option (how it was before)</strong>
<img width="1918" alt="zip"
src="https://github.com/user-attachments/assets/272ed849-47d6-4fef-8acc-ab1b22c9f42e"
/>

<strong>No "zip" (this branch)</strong>
<img width="1919" alt="no_zip"
src="https://github.com/user-attachments/assets/417303bf-9570-4ee1-98c5-8a78f59c7956"
/>
</details>

## Testing

I thoroughly tested with `DiffMethod.WORDS_WITH_SPACE` across various
inputs and scenarios, including:
- Single-line and multi-line strings.
- Numbers, arrays, and objects.
- Additions, deletions, and updates at different positions (start,
middle, and end) within and across lines.

I also validated diffs against real prebuilt rules by installing an
older Fleet package version and observed no issues.

You can test by trying different input strings and settings in
Storybook.
**Run Storybook**: `yarn storybook security_solution`.


https://github.com/user-attachments/assets/0440b73c-a4d7-40cf-9cee-e632146d292e

You can notice that `ComparisonSide` stories are broken, but that's
unrelated to these changes and needs to be handled separately.

## Compatibility with future upgrades

There's an open [PR](elastic#202622) that
will upgrade the `diff` library from v5 to v7. I verified the behavior
of `DiffMethod.WORDS_WITH_SPACE` on v7 and found no differences compared
to v5, so it should be safe to upgrade to v7 without any changes on our
end.

Work started on 23-Dec-2024.
  • Loading branch information
nikitaindik authored Dec 30, 2024
1 parent bb877cf commit 140c2e0
Show file tree
Hide file tree
Showing 3 changed files with 77 additions and 9 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import React from 'react';
import type { Story } from '@storybook/react';
import type { DiffViewProps } from './diff_view';
import { DiffView } from './diff_view';
import { DiffMethod } from './mark_edits';

export default {
component: DiffView,
title: 'Rule Management/Prebuilt Rules/Upgrade Flyout/ThreeWayDiff/DiffView',
argTypes: {
oldSource: {
control: 'text',
},
newSource: {
control: 'text',
},
diffMethod: {
control: 'select',
options: [
DiffMethod.WORDS_WITH_SPACE,
DiffMethod.WORDS,
DiffMethod.CHARS,
DiffMethod.LINES,
DiffMethod.SENTENCES,
],
defaultValue: DiffMethod.WORDS_WITH_SPACE,
},
zip: {
control: 'boolean',
defaultValue: false,
},
},
};

const Template: Story<DiffViewProps> = ({ oldSource, newSource, diffMethod, zip }) => {
return (
<DiffView
oldSource={oldSource}
newSource={newSource}
diffMethod={diffMethod}
zip={zip}
viewType="unified"
/>
);
};

export const Default = Template.bind({});
Default.args = {
oldSource:
'from logs-endpoint.alerts-*\n| where event.code in ("malicious_file", "memory_signature", "shellcode_thread") and rule.name is not null\n| keep host.id, rule.name, event.code\n| stats hosts = count_distinct(host.id) by rule.name, event.code\n| where hosts >= 3',
newSource:
'from logs-endpoint.alerts-*\n| where event.code in ("malicious_file", "memory_signature", "shellcode_thread")\n| stats hosts = count_distinct(host.id)\n| where hosts >= 3',
};
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ import type {
HunkTokens,
} from 'react-diff-view';
import unidiff from 'unidiff';
import type { Change } from 'diff';
import { useEuiTheme, COLOR_MODES_STANDARD } from '@elastic/eui';
import { Hunks } from './hunks';
import { markEdits, DiffMethod } from './mark_edits';
Expand Down Expand Up @@ -89,7 +88,7 @@ const useTokens = (

try {
/*
Synchroniously apply all the enhancers to the hunks and return an array of tokens.
Synchronously apply all the enhancers to the hunks and return an array of tokens.
*/
return tokenize(hunks, options);
} catch (ex) {
Expand Down Expand Up @@ -128,7 +127,7 @@ const convertChangesToUnifiedDiffString = (changes: Change[]): string => {
return unifiedDiff;
};

const convertToDiffFile = (oldSource: string, newSource: string) => {
const convertToDiffFile = (oldSource: string, newSource: string, zip?: boolean) => {
/*
"diffLines" call converts two strings of text into an array of Change objects.
*/
Expand Down Expand Up @@ -156,7 +155,7 @@ const convertToDiffFile = (oldSource: string, newSource: string) => {
Hunks represent changed lines of code plus a few unchanged lines above and below for context.
*/
const [diffFile] = parseDiff(unifiedDiff, {
nearbySequences: 'zip',
nearbySequences: zip ? 'zip' : undefined,
});

return diffFile;
Expand Down Expand Up @@ -255,18 +254,25 @@ const CustomStyles: FC<PropsWithChildren<unknown>> = ({ children }) => {
);
};

interface DiffViewProps extends Partial<DiffProps> {
export interface DiffViewProps extends Partial<DiffProps> {
oldSource: string;
newSource: string;
diffMethod?: DiffMethod;
viewType?: 'split' | 'unified';
/*
When "zip" is set to true, the change lines will be rendered in an interlaced style.
For an example, refer to:
https://github.com/otakustay/react-diff-view/blob/8a2dbdf97af0890aff6e563ed435e7da13c5e7b1/README.md#parse-diff-text
*/
zip?: boolean;
}

export const DiffView = ({
oldSource,
newSource,
diffMethod = DiffMethod.WORDS,
diffMethod = DiffMethod.WORDS_WITH_SPACE,
viewType = 'split',
zip = false,
}: DiffViewProps) => {
/*
"react-diff-view" components consume diffs not as a strings, but as something they call "hunks".
Expand All @@ -277,7 +283,10 @@ export const DiffView = ({
/*
"diffFile" is essentially an object containing an array of hunks plus some metadata.
*/
const diffFile = useMemo(() => convertToDiffFile(oldSource, newSource), [oldSource, newSource]);
const diffFile = useMemo(
() => convertToDiffFile(oldSource, newSource, zip),
[oldSource, newSource, zip]
);

/*
Sections of diff without changes are hidden by default, because they are not present in the "hunks" array.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,6 @@ export enum DiffMethod {
WORDS = 'diffWords',
WORDS_WITH_SPACE = 'diffWordsWithSpace',
LINES = 'diffLines',
TRIMMED_LINES = 'diffTrimmedLines',
SENTENCES = 'diffSentences',
CSS = 'diffCss',
}
Expand Down Expand Up @@ -262,7 +261,7 @@ function diffChangeBlock(
* The format of the strings is as follows:
*/
export function markEdits(hunks: HunkData[], diffMethod: DiffMethod): TokenizeEnhancer {
/*
/*
changeBlocks is an array that contains information about the lines that have changes (additions or deletions).
Unchanged lines are not included.
*/
Expand Down

0 comments on commit 140c2e0

Please sign in to comment.