
Investigate parallelisation of the longest common subsequence algorithm. #132

Closed
sageserpent-open opened this issue Dec 27, 2024 · 7 comments
Labels
enhancement New feature or request

Comments


sageserpent-open commented Dec 27, 2024

The example merge from #35 takes around 20 seconds to perform per-file merging at the end.

No progress bar is shown during this period, which leaves the user with a sense of the application having hung.

A spike to add a progress bar in #35 yielded very unsatisfactory results - rather than devote more time to that, how about simply speeding it up via some parallelisation?

If the execution time comes down to less than a minute, I think that beats a psychological barrier.

The implementation uses an imperative cache of subproblem solutions, but perhaps the use of swathes described in #107 could allow each swathe's subproblems to be processed in parallel?

@sageserpent-open sageserpent-open added the enhancement New feature or request label Dec 27, 2024

sageserpent-open commented Dec 28, 2024

An outline of how to implement this...

  1. As per Revisit LongestCommonSubsequence.of to use a top-down recursive approach. #107, organise the subproblems into swathes, where each swathe is labelled by the maximum index from any of the base, left or right sides. This means that as the longest common subsequence's dynamic programming algorithm progresses, the swathes get larger, up to a loose upper bound of 1 + BaseSequenceSize * LeftSequenceSize + BaseSequenceSize * RightSequenceSize + LeftSequenceSize * RightSequenceSize (this may be an overestimate if the label index is smaller than any of the sequence sizes, but the point is that there is an upper bound and it isn't just the product of all three sizes).
  2. Evaluate the subproblems for each swathe, working sequentially through swathes of increasing label index. This ensures that each swathe's subproblems depend only on other subproblems in the same swathe or on subproblems in the previous swathe, so it is only necessary to retain the swathe whose subproblems are being evaluated and the previous one, which is consulted only for its solutions.
  3. Store the subproblem solutions in a swathe as futures (or something of that ilk) that allows computation to take place on a separate thread and allows the result to be shared once it is available.
  4. Tie subproblems together by using applicative composition, so they can be evaluated in parallel from the point of view of the consumer. This allows multiple consumers to share results from overlapping subproblems.
  5. The swathe can use random access into an array or whatever to fetch a subproblem future, unlike the current implementation that needs to do a binary search to access a subproblem result.
  6. The result is awaited from the final top-level subproblem in the final swathe - or should LongestCommonSubsequence.of be cut over to yield an effect, which could be Id or IO?
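The swathe organisation in items 1, 2 and 5 can be sketched for the simpler two-sequence case (the real algorithm tracks base, left and right sides, so its swathes have three faces rather than the two here). This is a hedged illustration, not the project's code: `SwatheLcs`, `Swathe` and the cell layout are invented names, and the futures of items 3 and 4 are omitted to keep the sequential skeleton clear.

```scala
object SwatheLcs {
  def lcsLength[A](base: IndexedSeq[A], other: IndexedSeq[A]): Int = {
    val (m, n) = (base.length, other.length)

    // One swathe of subproblem solutions, labelled by the maximum prefix
    // index: the column holds solutions for (i, label), the row for (label, j).
    final case class Swathe(label: Int, column: Array[Int], row: Array[Int]) {
      def apply(i: Int, j: Int): Int = if (j == label) column(i) else row(j)
    }

    val initial = Swathe(0, Array(0), Array(0))

    val last = (1 to math.max(m, n)).foldLeft(initial) { (previous, label) =>
      val column  = Array.ofDim[Int](math.min(label, m) + 1)
      val row     = Array.ofDim[Int](math.min(label, n) + 1)
      val current = Swathe(label, column, row)

      // Consult the current swathe for cells already filled in on it, falling
      // back to the previous swathe - no older swathe is ever needed.
      def lookup(i: Int, j: Int): Int =
        if (math.max(i, j) == label) current(i, j) else previous(i, j)

      def solve(i: Int, j: Int): Int =
        if (i == 0 || j == 0) 0
        else if (base(i - 1) == other(j - 1)) lookup(i - 1, j - 1) + 1
        else math.max(lookup(i - 1, j), lookup(i, j - 1))

      // The column and row are independent linear strands; the shared corner
      // cell is evaluated last, once both strands are complete.
      if (label <= n)
        for (i <- 0 to (if (label <= m) label - 1 else m)) column(i) = solve(i, label)
      if (label <= m)
        for (j <- 0 to (if (label <= n) label - 1 else n)) row(j) = solve(label, j)
      if (label <= m && label <= n) {
        val corner = solve(label, label)
        column(label) = corner
        row(label) = corner
      }
      current
    }

    last(m, n)
  }
}
```

Note how each swathe's cells are solved strictly in order along each strand, so every `lookup` lands either on an already-computed cell of the current swathe or on the previous one.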


sageserpent-open commented Dec 30, 2024

As of 66261b3, have implemented the first, second and fifth items from above to make a sequential implementation, just to make sure the idea is correct and can pass the tests.

That was a good idea, as it has been a really hard and painful slog trying to deliver just that part!

The ordering of subproblems was really tricky to get right, and the implementation, while encapsulated, is a sprawling mess of imperative loops. Got to love it.

The payoff, though, is that execution times for the example merge from #35 have come down a bit to:

1 minute 13 seconds, 1 minute 12 seconds, 1 minute 11 seconds, 1 minute 12 seconds.

This is without introducing any parallelisation.

It has become apparent that there is a very obvious way of parallelising three linear strands of computation of subproblems within each leading swathe without having to worry about potential overlap, so we may be in for a pleasant surprise next...
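The strand idea can be sketched with plain `Future`s (a hedged illustration with invented names - `ParallelStrands` and the arithmetic stand in for real subproblem evaluation): each strand scans its own cells in index order, reading only its own earlier cells plus the previous swathe, so the strands never overlap and can run concurrently, with only the shared corner waiting on both.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ParallelStrands {
  // Hypothetical per-swathe evaluation: each strand is a linear left-to-right
  // dependency over its own cells, seeded from the previous swathe.
  def evaluateSwathe(previousSwathe: Vector[Int]): Int = {
    def strand(seed: Int): Vector[Int] =
      previousSwathe.scanLeft(seed)(_ + _)

    val columnStrand = Future(strand(0))
    val rowStrand    = Future(strand(1))

    // Applicative-style composition: both strands run in parallel, and the
    // shared corner cell is computed once both results are available.
    val corner = columnStrand.zipWith(rowStrand)((c, r) => c.last + r.last)

    Await.result(corner, Duration.Inf)
  }
}
```

Because neither strand writes to cells the other reads, no locking is needed; the only synchronisation point is the corner.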


sageserpent-open commented Dec 31, 2024

The code is a lot tidier as of fe7023c. This is important, because part of the aforementioned really hard and painful slog was having to keep sizeable, unreadable, near-duplicated blocks of code consistent with each other as bug fixes were applied. This didn't always happen!

The duplication has been removed, and performance hasn't dropped noticeably as a result of using a more principled approach.

@sageserpent-open

Introduced parallelisation in d36e70a.

Execution times for the example merge from #35 before the change:

1 minute 13 seconds, 1 minute 12 seconds, 1 minute 14 seconds, 1 minute 12 seconds

Execution times after the change:

1 minute 1 second, 1 minute 1 second, 1 minute, 1 minute 1 second.

Result!

@sageserpent-open

Right, got to b6885ee.

This fuses the swathes' future computations together to make one big IO. Had to mix Future and IO because:

  1. I don't know how to write a catamorphism that unfolds futures analogous to Seq.unfold, so had to use Monad.whileM_ and thus IO (Future does not play well with the control constructs in Monad), and
  2. Just using IO everywhere instead of Future results in noticeably poorer performance. This is over and above the gotcha that the *> operator means applicative combination for Future, but sequential flat-mapping for IO - which did bite me to start with; ended up using &>.
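The gotcha in point 2 can be seen with the standard library alone (Cats is not assumed here; `EagerFutures` and `task` are invented names): Future is eager, so two futures constructed up front already run concurrently, and even monadic sequencing of them (what *> amounts to) does not serialise their work - whereas a deferred effect like IO runs nothing until sequenced, so its *> really is sequential and the parallel &> is needed to regain the overlap.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object EagerFutures {
  def demo(): (Int, Int) = {
    val started = new AtomicInteger(0)
    def task(result: Int): Future[Int] =
      Future { started.incrementAndGet(); result }

    val first  = task(1)
    val second = task(2) // already running: Future is eager

    // Monadic sequencing orders the *results*, but both computations began
    // at construction, so their work still overlaps - unlike IO, where the
    // second effect would not start until the first finished.
    val combined = first.flatMap(_ => second)

    (Await.result(combined, Duration.Inf), started.get())
  }
}
```

Both tasks have run by the time the combined future completes, even though only flatMap was used - which is why swapping Future for IO wholesale changes the concurrency behaviour unless &> is substituted for *>.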

It's possible that something was misconfigured - perhaps the Cats threading model needs tweaking for this?

Anyway, the execution times are now:

1 minute, 1 minute, 1 minute.

Let's call it a wrap!

@sageserpent-open

Branch main fast-forwarded to b6885ee.

@sageserpent-open

This went out in Release 1.4.0.
