Unable To Decrypt meta issue #245

Open
44 of 86 tasks
BillCarsonFr opened this issue Apr 28, 2022 · 45 comments
Labels
A-E2EE; O-Frequent (Affects or can be seen by most users regularly or impacts most users' first experience); S-Major (Severely degrades major functionality or product features, with no satisfactory workaround); T-Defect; T-Epic (Issue is at Epic level); Team: Crypto; Z-Chronic

Comments

@BillCarsonFr
Member

BillCarsonFr commented Apr 28, 2022

Unable to decrypt Epic Issue

Meta issue relating to all "Unable to Decrypt" problems, a.k.a "Waiting for this message, this may take a while".

[A decryption failure normally manifests when the recipient doesn't receive the keys for a particular encryption session; hence the acronym UISI: "Unknown inbound session ID". Nowadays the acronym UTD (unable to decrypt) is generally preferred.]

Needed information to resolve a reported issue

Reports from users are hugely helpful in identifying and prioritising the causes of Unable to Decrypt errors.

In order to properly debug an Unable To Decrypt error, we need logs from the receiver of the message (the one seeing the issue) and those from the sender. We can't debug issues without logs from both sides. Instructions on sending problem reports (aka "rageshakes") from within various clients are below.

In your problem report, please use the word "decrypt" or "UTD", and identify which event you can't read. Either include the event ID, or something like "message from @dave:example.com at 10:10". For example, you might write "UTD $173031096849423vSpOu:matrix.org".
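As a rough illustration of the description format requested above, here is a minimal Python sketch. The function name `utd_description` and the loose event ID check are hypothetical and not part of any Element client; the sketch only shows the two accepted shapes of a description (event ID, or sender plus time).

```python
import re

# Matrix event IDs start with "$". This is a loose sanity check only,
# not a full validation of the event ID grammar.
EVENT_ID_RE = re.compile(r"^\$\S+$")


def utd_description(event_id=None, sender=None, time=None):
    """Build a problem-report description for an undecryptable event."""
    if event_id is not None:
        if not EVENT_ID_RE.match(event_id):
            raise ValueError(f"not an event ID: {event_id!r}")
        return f"UTD {event_id}"
    if sender and time:
        # Fallback when the event ID is not known.
        return f"UTD message from {sender} at {time}"
    raise ValueError("need an event ID, or a sender and a time")


print(utd_description(event_id="$173031096849423vSpOu:matrix.org"))
# prints: UTD $173031096849423vSpOu:matrix.org
print(utd_description(sender="@dave:example.com", time="10:10"))
# prints: UTD message from @dave:example.com at 10:10
```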

On the sender side, the description is less important: just write "Bob can't decrypt me" or something.

If you'd like us to feed back to you on why you can't decrypt a message, open an issue in the GitHub project for your client, and mention the issue in your problem report.

Sending a rageshake from Element Web:

  • Click on your avatar in the top left part of the screen.
  • Select "All Settings".
  • Select "Help & About".
  • Fill in the description as above.
  • Then click on the Send logs button.

(Protip: although the dialog tells you that you need to create a GitHub issue, that is only necessary if you want us to feed back to you about what went wrong.)

Alternatively:

  • From the composer in any room, type /rageshake <description>, where <description> is as above.
  • A dialog box opens. Click on the Send logs button.

Sending a rageshake from Element X:

  • Click on your avatar in the top left part of the screen.
  • Select "Report a problem".
  • Enter a description, as above.
  • Ensure that "Allow logs" is enabled.
  • Click "Send".

Sending a rageshake from (classic) Element Android:

  • Tap on the top right button on the home screen.
  • Select Report Bug.

Sending a rageshake from (classic) Element iOS:

  • Open the top left drawer.
  • Select the Feedback button at the bottom.

Causes of Unable to Decrypt errors

🟢: We believe that this will be fixed with Element R
🕛: Affects a feature which is not yet supported in Element R

We categorize the main sources of UISI errors as follows:

Client Issues

Sender Side

Receiver Side

Server Issues

Key Backups

Protocol Issues

Missing features

UX

Expected UTD

  • Some room history (pre-invite, for example) should never be decryptable by you. This kind of history should probably be hidden or displayed differently.
  • Expected UTDs: Hide device-relative historical UTDs #2313: New devices cannot decrypt existing history until they have access to key backup

User Config

@BillCarsonFr
Member Author

Interesting related blog post https://blog.neko.dev/posts/unable-to-decrypt-matrix.html

@jittygitty

@BillCarsonFr Wow what a great write-up you found at https://blog.neko.dev/posts/unable-to-decrypt-matrix.html

Please correct me if I'm wrong, but having looked at that, it seems that just about all of those errors could be resolved via wizard/options provided to the user by the client which could try to resend encryption key with help of server or to even create a new one for the channel/room. I don't see any 'technical' issue/blocker other than currently missing functionality client-side to prompt user for permission to, with the help of in-between server, do what's necessary to fix room encryption/decryption.

If there's a 'security' issue with properly re-identifying the correct user, for example to re-provide keys to the other user while they've gone offline so they can pick them up from server when they are online again, can't we have a simple "challenge/response" ie question/answer to "pick-up" the keys? (I'm trying to avoid creation of a "new" room/channel and instead fix one already created but giving errors of unable to decrypt. The sender's device has not sent us the keys for this message.)

It seems the reason certain steps aren't by default taken to automatically fix such issues, is that it could be a security risk in certain situations. But if the "creator" of a room gives needed permission, it seems in a sense trivial to resolve such decryption/key errors etc. Or am I wrong and missed something?

And to the author of:
https://blog.neko.dev/posts/unable-to-decrypt-matrix.html
Many thanks!

@BillCarsonFr
Member Author

Please correct me if I'm wrong, but having looked at that, it seems that just about all of those errors could be resolved via wizard/options provided to the user by the client which could try to resend encryption key

FTR, there used to be such a UI, but it was prone to social attacks and annoying to use.
It's something we have explored, but it's a bit hard to have a "fix all encryption problems" button. We are trying to get to the bottom of the root causes of distribution failures by moving to the Rust SDK.

@jittygitty

@BillCarsonFr Ah ok guess I'm relatively new so didn't know of the older UI. Personally, I'd be ok with the "social attack" risks and UI annoyance versus the embarrassment of inviting users to my new chat and getting decryption errors.

My concern was that it may be impossible to fix the root for all cases of decryption issues, especially if some are due to security measures which may need to be over-ridden by user consent in order to fix, which brings us back to that UI annoyance.

But regardless, it's great to hear the underlying sdk is being improved to hopefully eliminate or greatly reduce these issues. I had heard of rust in conduit (I run go-dendrite) but didn't know Kotlin SDK is being all redone in rust, is that right?

Anyway, thanks again to everyone working on this! (I look forward to some beta-testing with the new SDK when it's ready for that.)

@anon8675309

The link to send debug logs does not appear on Element Web on my server. Did this UI change, or is there some option that an admin needs to enable to get that to show up?

@t3chguy
Member

t3chguy commented Mar 17, 2023

@anon8675309 your config.json must have the URL to send debug logs to, like the example https://github.com/vector-im/element-web/blob/develop/config.sample.json#L25

@anon8675309

Can you confirm that this is still the case? I have the bug_report_endpoint_url entry from the sample and the link to send debug logs does not appear. If it's working as expected for you, I'll set up a new server and open a new ticket with the minimal steps to reproduce. (I searched for a report of this issue and didn't find anything, but I'll do so again before opening a new issue).
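For anyone checking their own deployment, here is a minimal Python sketch that reads a config.json and reports whether `bug_report_endpoint_url` is set. It assumes the key sits at the top level of the file, as in the linked sample; the helper name `rageshake_endpoint` and the example URL are placeholders, and the check says nothing about any other condition Element Web may apply before showing the link.

```python
import json


def rageshake_endpoint(config_text):
    """Return bug_report_endpoint_url from a config.json string, or None."""
    config = json.loads(config_text)
    if not isinstance(config, dict):
        return None
    url = config.get("bug_report_endpoint_url")
    # Treat a missing, non-string, or blank value as "not configured".
    if isinstance(url, str) and url.strip():
        return url
    return None


# Placeholder URL for illustration only; use your own rageshake server.
sample = '{"bug_report_endpoint_url": "https://rageshake.example.com/api/submit"}'
print(rageshake_endpoint(sample))
# prints: https://rageshake.example.com/api/submit
print(rageshake_endpoint("{}"))
# prints: None
```

To run it against a real deployment, pass the contents of the served config.json, for example `rageshake_endpoint(open("config.json").read())`.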

@hieronymousch

Was able to send feedback from Element desktop... and have this issue with one single user on my home server, both users on the same server. Upgraded the room to version 10 and got the error again after not even 10 messages

@Ezwen

Ezwen commented Jun 1, 2023

Hi there, I have been encountering this problem a lot recently. I've already sent logs from element-web and element-android as the person who received encrypted messages that cannot be decrypted, but I have not yet been able to send logs as someone sending such problematic messages.

Question: when this situation happens in a matrix room, where a given user ends up only sending messages that cannot be decrypted by other room members, is there any (even intricate) known workaround? The only "workaround" I used so far was to upgrade the room to a newer room version, which solves the issue by creating a new room, but I can't really call this a proper solution… and right now I have this problem on a room that is already using version 10…

@lousando

lousando commented Jun 2, 2023

@Ezwen I've found that asking the sender of the messages to run /discardsession usually fixes any messages going forward, though it does not fix the messages that already have the encryption issue.

@kegsay

kegsay commented Jun 20, 2024

Recently I've been giving updates for this on This Week in Matrix. If you fail to decrypt a message please:

  • send a bug report to us and mention:
    • the event ID which failed to decrypt,
    • whether it was 1 or many events which failed to decrypt,
    • if many events, are they all from different people?
  • ask the sender to also send a bug report mentioning the event ID

We often need both sides of the conversation to fix the issue.

It would also be helpful for us if you can opt-in to analytics, as that feeds into our graphs which plot UTDs in aggregate. The general trend of the past few months has thankfully been fewer UTDs across clients that opt-in, but there is more work to be done here.

@yennor

yennor commented Jun 20, 2024

I usually got this kind of problem when either I or the peer was in an area with a bad mobile phone connection. By bad I mean really bad: you get disconnected from the network for several minutes at a time, randomly get connected again for a few seconds, or are connected but almost no data gets through.
I haven't been there (a rural area of Colombia) since last year and won't be for a few months, so I can't tell if the situation has improved with the new clients.
But maybe for your test suite, simulate random TCP packet drops (a very high percentage) with a high RTT (several seconds; sometimes I measured up to 20 seconds, and around 5-8 seconds is normal), and send a few thousand messages there and back.
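One possible way to approximate such conditions on a Linux test machine is the kernel's netem queueing discipline, configured through `tc`. The Python sketch below only builds the command line and does not run it. The interface name `eth0`, the 40% loss figure, and the helper name `netem_command` are assumptions chosen for illustration; the delay of 6500 ms with 1500 ms of jitter is picked to cover the 5-8 second range mentioned above. Applying the command requires root.

```python
def netem_command(interface, delay_ms, jitter_ms, loss_percent):
    """Build a `tc ... netem` command that adds delay, jitter and packet loss."""
    if not 0 <= loss_percent <= 100:
        raise ValueError("loss_percent must be between 0 and 100")
    if delay_ms < 0 or jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    return [
        "tc", "qdisc", "add", "dev", interface, "root", "netem",
        "delay", f"{delay_ms}ms", f"{jitter_ms}ms",  # mean delay, then jitter
        "loss", f"{loss_percent}%",                  # random packet loss
    ]


# Roughly 5-8 seconds of added delay with heavy loss (assumed values).
print(" ".join(netem_command("eth0", 6500, 1500, 40)))
# prints: tc qdisc add dev eth0 root netem delay 6500ms 1500ms loss 40%
```

Note that netem shapes outgoing traffic on the chosen interface, so the delay is added in one direction only unless it is applied on both ends. The rule can be removed afterwards with `tc qdisc del dev eth0 root`.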

@jittygitty

Does matrix-org/matrix-js-sdk#454 already "implement" solution for: #647 ?
And if so how would my friend request keys from me (on Desktop Element) on a new fresh install of his?

Are we able now to send decryption keys to whomever we wish?

@t3chguy
Member

t3chguy commented Jul 1, 2024

@jittygitty yes but Rust Crypto does not support that and that is what Element uses, so the js-sdk PR is unrelated

@jittygitty

@t3chguy That's good news, so I guess if I wanted to make use of this feature, I would have to install the "Web" version client of element chat on my webserver? (I currently run Dendrite already, but with mobile and Desktop clients only.)

Or is there another client app that uses js-sdk? (Element Desktop does not use it?) Or will rust crypto gain it soon?

I use another account on 'matrix.org' using desktop+mobile, for that acct. we can simply login at chat.element.io and if I'm logged in via my other devices, I can send keys to my web chat.element.io and resend to friend who lost his? thx

@t3chguy
Member

t3chguy commented Jul 1, 2024

Element Desktop = Element Web + Electron. Element Web only supports Rust Crypto at this time. As for other matrix-js-sdk consumers I suggest finding a place discussing Matrix rather than Element specifically.

@jittygitty

Apologies, indeed, seems I'm confused as to what matrix SDK is used by what "client", and I had not noticed the matrix-org/matrix-js-sdk#454 was outside of element-hq repositories. So, I guess you are saying that likely none of the clients under https://github.com/element-hq are able to leverage that matrix-js-sdk pull 454?

Is there a place I can find all such pertinent information on the various clients available? If anyone can point me in the right direction with a link would be appreciated. Otherwise, I'll try some search engine lookups to dig for such info. thx

@mpeter50

mpeter50 commented Jul 1, 2024

It seems Cinny uses the matrix-js-sdk, and maybe there are more, but I haven't found another client that does.
Whether Cinny makes use of this feature of the SDK is another question. You could ask about that in Cinny's support room.

@richvdh
Member

richvdh commented Jul 4, 2024

Another round-up of recent updates. (Most of these have already been reported in TWIM, but I think it's handy to have a record here too.)

  • We found and fixed a cause of broken Olm sessions on Element iOS. As with the previous fix to EX iOS: you don't have to be using Element iOS yourself to notice UTDs in this scenario: it causes problems on the sender and recipient side alike.
  • We also have more fixes to both Element X iOS (#2944) and Element X Android (#3050) which could cause broken Olm Sessions.
  • We found and fixed a bug in matrix-rust-sdk affecting Element X.
  • We added a workaround to the sliding-sync proxy which would guard against messages being lost after the database was dropped, potentially causing UTDs for Element X clients.
  • We rolled out a change to Synapse and all Element clients, which will identify messages sent before you joined a room, and let you know you shouldn't expect to decrypt them.

A reminder that we're still at war with this issue, and it's incredibly helpful for people to send debug logs when they come across UTD errors.

That said, our analytics show that we are starting to make progress here:

[Image: analytics graph]

@richvdh
Member

richvdh commented Nov 6, 2024

I realised we're overdue for an update of recent work done in this area.

First, some graphs of UTD rates, according to our metrics:

[Images: graphs of UTD rates (blue=android, purple=ios, green=web/desktop)]

We changed how the metrics were calculated back in August, to allow breakdown between EX and classic mobile clients. As you can see, there was a big drop back in June, and rates have been relatively static at 1-5% of users (depending on client) since then.

Focus recently has been on analysing reports to see where the most important areas for fixes are. #2356 is the most frequent root cause, so we're going to prioritise a fix for that.

Meanwhile we did fix a few problems:

There are a few new causes in the list above, but I won't enumerate them here, as they are still under investigation and haven't been seen in many cases.

@ToddCrimson

Hello, just checking in if this has a clear RCA yet?
Saw this pop on our side yesterday :( :(

@androclus

androclus commented Dec 27, 2024

BTW, if it is any help, I did give my friends problems (the infamous "Unable to decrypt: The sender's device has not sent us the keys for this message." error message, as in Issue #19748) on their end.

Two things I did:

  1. I made a new Linux laptop the lazy way (I copied my home directory over verbatim from my desktop, including ~/.config/Element ), and then ran both at the same time.

  2. I also installed Element on a second Android phone (the correct way). I don't know if this caused it, either.

At that point, I had at least 4 Elements running. Then the problems seemed to start when I turned on the laptop and used its Element (in a coffee shop) and later in the day shut it down, came home, and started up the desktop copy (which had not been running when I was at the coffee shop).

In the previous days I'd been noticing some lags between the phones, too.

If I have to, I'll remove Element from one of the phones, but after I signed out from the laptop's Element, then recursively removed its ~/.config/Element directory and restarted (thus creating a completely unique Element session, as I should have in the first place of course), things seem better now.

This experience is probably of no help, but I post it in case it does provide any clues.
