To do / to fix:
* Using a single request to send-and-then-receive data adds variable latency to the received data, since it has to wait in line behind the sent data (and there's no easy way to know how long it waited.)
* Subtle issue: The relative precision of unsynchronized computer clocks is no better than 1ppm and usually much worse (commodity crystal oscillators are typically specified at tens of ppm).
* Use request headers instead of query params so that the URL doesn't change. (A changing URL forces the CORS preflight to be redone.)
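Rough arithmetic for the clock-precision point above, as a Python sketch. The ppm values and the 10-minute session length are illustrative assumptions, not measurements; drift_ms and drift_samples are made-up helper names.

```python
def drift_ms(ppm: float, elapsed_s: float) -> float:
    """Drift in milliseconds between two clocks whose rates differ by `ppm`."""
    return ppm * 1e-6 * elapsed_s * 1000.0


def drift_samples(ppm: float, elapsed_s: float, sample_rate_hz: float = 44_100) -> float:
    """The same drift expressed in audio samples at `sample_rate_hz`."""
    return ppm * 1e-6 * elapsed_s * sample_rate_hz


if __name__ == "__main__":
    session_s = 600  # assumed 10-minute session
    for ppm in (1, 10, 50):  # assumed rate mismatches
        print(f"{ppm:>3} ppm over {session_s}s: "
              f"{drift_ms(ppm, session_s):.2f} ms, "
              f"{drift_samples(ppm, session_s):.0f} samples")
```

Even at the optimistic end the two clocks slide apart by whole samples within minutes, so any fixed client/server offset has to be re-checked over time rather than set once.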
Debugging notes from 2020-07-22:
* Weird as fuck: sometimes after it lags, when it comes back the pitch
is too high? ??? ?????? I have seen it too high by a consistent ~3%
for many seconds. (1/2 a semitone roughly)
* iiiinteresting, I can repro this using "main app thread" loopback.
* this matches with what we see in the visualizer, which is skipping _inside_ a batch sent to the server.
* Clients seem to get further and further behind. I think they are generally stable except when weird shit is happening, but weird shit causes them to slip, and eventually they slip too far and die. Connecting multiple clients seems to make this happen much faster but I can't tell if that's inherent, or just because it increases lossage due to bandwidth, CPU, etc.
* This may be specific to the bluetooth headset actually, and it may be related to the sporadic glitching I was getting with them many versions ago, which usually goes away if I close and reopen the device.
* Hmmmmmm, under heavy load (3 clients), my request payloads seem to end up empty, even when the source is constant audio data (Hamilton). That doesn't make any sense.
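Sanity check on the "~3% too high is roughly 1/2 a semitone" estimate above: a minimal Python sketch of the conversion between a playback-speed ratio and a pitch shift in equal-tempered semitones. The function names are made up for illustration.

```python
import math


def ratio_to_semitones(ratio: float) -> float:
    """Pitch shift in semitones for a given playback-speed ratio."""
    return 12.0 * math.log2(ratio)


def semitones_to_ratio(semitones: float) -> float:
    """Playback-speed ratio that produces the given pitch shift."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    print(f"1.03x speed   = {ratio_to_semitones(1.03):+.2f} semitones")
    print(f"+0.5 semitone = {semitones_to_ratio(0.5):.4f}x speed")
```

So a consistent 3% is almost exactly a half-semitone shift, which is the same as saying samples are being consumed about 3% faster than they were produced.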
Debugging notes from 2020-07-23 w/ jefftk:
* should add an alert if something goes wrong with the fixed-offset relationship between the read and write clocks going to the server
* a speaker-to-mic echo test with clicks seemed quite smooth, which is cool
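One possible shape for the fixed-offset alert mentioned above, as a Python sketch rather than the client code. It assumes each request carries a read clock and a write clock counted in samples; OffsetWatchdog, check and the tolerance parameter are hypothetical names, not anything in the codebase.

```python
class OffsetWatchdog:
    """Flags requests whose (write_clock - read_clock) differs from the first one seen."""

    def __init__(self, tolerance_samples: int = 0):
        self.tolerance_samples = tolerance_samples
        self.expected_offset = None  # learned from the first request

    def check(self, read_clock: int, write_clock: int) -> bool:
        """Return True if the offset still matches, False if it has slipped."""
        offset = write_clock - read_clock
        if self.expected_offset is None:
            self.expected_offset = offset
            return True
        return abs(offset - self.expected_offset) <= self.tolerance_samples


if __name__ == "__main__":
    watchdog = OffsetWatchdog()
    # (read_clock, write_clock) pairs; the last one has slipped by 100 samples.
    for read_clock, write_clock in [(1000, 500), (13800, 13300), (26600, 26000)]:
        ok = watchdog.check(read_clock, write_clock)
        print(read_clock, write_clock, "ok" if ok else "ALERT: offset changed")
```

Whether the tolerance should be zero depends on whether the offset is ever legitimately changed at runtime (e.g. by adjusting latency compensation), in which case the watchdog would need resetting.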
Debugging notes 2020-07-24:
* NOTE that changing the latency compensation at runtime messes up the continuity of the buffer in the audioworklet and I think virtually guarantees it will report overflow/underflow after wraparound.
* Would be good if latency window did not grow when the software / network glitches out
Debugging notes 2020-07-26:
* Automatic latency calibration is absolutely mandatory
* For further testing, if there's any doubt, offering multiple calibration options would be good, to make sure there's something that works for everyone, as long as they're all easy to try and it's obvious whether each one worked.
* Noise is obnoxiously additive
* I think a lot of this is quantization noise from naive downsampling to 8-bit. Some is just background noise. (Jeff suggests gating.)
* No point in sending audio if we know it's silent OR we know we're the caboose
* "Snap to caboose" feature would be nice
* Ray thinks it would be cool for people e.g. with iPhones to be able to just hear stuff and not send any stuff (since sending stuff doesn't work.)
* No point in receiving audio if we're not going to do anything with it; give us a way to just not request it.
* If things lag, sometimes we start up with an unexpectedly high "client total time consumed" from the very beginning. (~9s vs ~5s.) [I _think_ this makes sense due to how we're managing our server connection, but I need to think more about it and how to fix it.]
* I saw at one point that a client was experiencing "cascading read slippage", i.e. it was getting later and later (and slippage getting larger) on every request. I don't understand what could cause this. I took a profile using the dev tools profiler, and there's a lot of weird stuff in it, but I don't really know how to read it.
* The profile contains multiple tabs sharing threads, which makes sense in retrospect, but makes things confusing, and seems like it COULD be somehow related to the actual problem.
* Comparing to a non-slippage profile: Rendering of "frames" starts to take longer and longer. Hundreds, then thousands of ms. (With CPU time dozens of ms, up to over 100.) In the healthy state it takes consistently <10 ms and a decent fraction of that is CPU time. I don't know whether this is real or an artifact.
* Our tab's "frames" show up as stretching from a point when the OTHER tab completes a network request, until a point where we do. That makes no sense to me. I'm not sure whether it's an artifact but I think it could be.
*** UGH *** I may be screwing myself with my testing method. The blocking limit of XHRs per domain is SHARED if I have many windows to the same domain open. So I can rapidly run out if they overlap. The offending XHRs causing the apparent priority inversion are from DIFFERENT WINDOWS. The multiple entries on the audioworklet thread are also. [NOTE: During the offending tests we did not ever run out of parallel XHRs that I could see, but this is still a problem.]
* I can't figure out whether the apparent interaction between windows is real or an artifact, whether the "long frames" are real or a devtools bug, and, if they really are long, whether that's a Chrome bug or my bug somehow.
* It kind of seems like the XHRs are just happening at a consistent but too-slow interval. If the "long frames" are an artifact, then this would seemingly be caused by process not getting called often enough, or by us somehow not having the target amount of data to send until later?
* Could our process be getting starved or something by the other window's process, which kicks in if they happen to drift into alignment or something?
* We seem to be running very close to 3/4 target speed, which is .... weird unless something changed the sample rate or something broke in Web Audio.
*** We should start tracking things like how often our callbacks get called, how much data we accumulate, and whether it seems to match the purported sample rate. ***
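A minimal sketch of the tracking suggested above, in Python rather than in the audioworklet. It counts callbacks and accumulated frames against externally supplied timestamps and compares the result with the nominal sample rate. RateTracker and its methods are hypothetical names, and the timestamps are assumed to come from a wall clock, since a clock derived from the audio stream itself would hide the slowdown.

```python
class RateTracker:
    """Tracks callback count and frames delivered versus elapsed wall-clock time."""

    def __init__(self, nominal_rate_hz: float = 44_100.0):
        self.nominal_rate_hz = nominal_rate_hz
        self.callbacks = 0
        self.frames = 0      # frames delivered after the first callback
        self.first_t = None  # wall-clock time of the first callback (seconds)
        self.last_t = None   # wall-clock time of the latest callback (seconds)

    def on_callback(self, now_s: float, frames: int = 128) -> None:
        if self.first_t is None:
            # The measurement interval starts here, so this batch is not counted.
            self.first_t = now_s
        else:
            self.frames += frames
        self.callbacks += 1
        self.last_t = now_s

    def effective_rate_hz(self):
        """Observed frames per second, or None until two callbacks have been seen."""
        if self.first_t is None or self.last_t == self.first_t:
            return None
        return self.frames / (self.last_t - self.first_t)

    def speed_ratio(self):
        """Observed rate divided by nominal rate (1.0 means healthy)."""
        rate = self.effective_rate_hz()
        return None if rate is None else rate / self.nominal_rate_hz


if __name__ == "__main__":
    tracker = RateTracker()
    interval_s = 128 / 44_100 / 0.75  # simulate callbacks arriving at 3/4 speed
    for i in range(1000):
        tracker.on_callback(i * interval_s)
    print(f"{tracker.callbacks} callbacks, speed ratio {tracker.speed_ratio():.3f}")
```

A ratio that sits near 0.75 instead of 1.0 would confirm the "3/4 target speed" observation independently of the devtools profile.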
* Random note: go to chrome://flags, search for "worklet", and enable the flag named "Use realtime priority thread for Audio Worklet"
* NOTE: test and fix in Firefox
* Perhaps we can get audiocontext latency more stable if we request a specific value (perhaps we can measure and then request higher?)
* make calibration user interface nicer and easier to use -- guide people through volume settings, max out our click volume but warn them not to hurt their ears, see if automatic works, then suggest manual. Tell them when we think it's done.
* allow calibrating any time, whether or not connected to server. allow stopping
and starting at any time without destroying audio context.
* ideally, allow changing server offset and such without reconnecting
* getting everyone set up with offsets and so forth is really obnoxious, and so is
having to do the jump-to-end thing, since otherwise it's hard to hear what we did
* it would be good to be able to 'admin' configure server settings like short wrapping, and clearing on/off.
* would be nice to be able to admin-force people to offsets (this requires being
able to change offsets dynamically at all)
* Notes from 2020-08-05 testing:
* We really need a way to test audio I/O at the beginning before doing ANYTHING else, as several people had problems
* (On Debian Chrome the default device behaved oddly: it displayed as "default" with no description; for at least one person it didn't seem to work for input at first, and then changing the input selection possibly caused output to stop working.)
* Background noise remains annoying
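On the recurring noise complaints (8-bit quantization noise, gating, and not sending audio known to be silent), a rough Python sketch. quantization_snr_db uses the standard textbook approximation for a full-scale sine through an ideal N-bit uniform quantizer. The gate is a deliberately crude block-level one, and its threshold is an arbitrary assumption, not a tuned value.

```python
import math


def quantization_snr_db(bits: int) -> float:
    """Theoretical SNR of a full-scale sine through an ideal uniform quantizer."""
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


def gate_block(samples, threshold: float):
    """Return the block unchanged if its peak reaches `threshold`, else silence."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak >= threshold:
        return list(samples)
    return [0.0] * len(samples)


if __name__ == "__main__":
    for bits in (8, 16):
        print(f"{bits}-bit: ~{quantization_snr_db(bits):.1f} dB SNR at full scale")
    # The quantization noise floor is fixed relative to full scale, so a signal
    # 30 dB below full scale keeps only about 20 dB of SNR at 8 bits.
    print(f"8-bit, signal at -30 dBFS: ~{quantization_snr_db(8) - 30:.1f} dB SNR")
    print(gate_block([0.001, -0.002, 0.0015], threshold=0.01))
```

Because every client contributes its own noise floor to the mix, gating quiet blocks to exact silence also makes it easy to skip sending them at all.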
Time constants:
* Audioworklet "time quantum": 128 samples @ 44,100Hz ~= 3ms
* Our send buffer: SAMPLE_BATCH_SIZE * 128 samples ~= 290ms
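The arithmetic behind the two time constants above, as a Python sketch. SAMPLE_BATCH_SIZE = 100 is an assumption inferred from the ~290ms figure, not a value read from the code.

```python
SAMPLE_RATE_HZ = 44_100
QUANTUM_SAMPLES = 128    # audioworklet time quantum
SAMPLE_BATCH_SIZE = 100  # ASSUMPTION: inferred from the ~290ms figure


def quantum_ms(sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Duration of one 128-sample quantum in milliseconds."""
    return QUANTUM_SAMPLES / sample_rate_hz * 1000.0


def batch_ms(batch_size: int = SAMPLE_BATCH_SIZE,
             sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Duration of one send buffer (batch_size quanta) in milliseconds."""
    return batch_size * quantum_ms(sample_rate_hz)


if __name__ == "__main__":
    print(f"quantum:     {quantum_ms():.2f} ms")
    print(f"send buffer: {batch_ms():.1f} ms")
```

Both figures scale with the actual sample rate, so a device running at a different rate than 44,100Hz would shift them proportionally.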
Sources of latency to account for:
* Sending:
  * "Outside world latency":
    * Head-to-mic acoustic latency: <= 3ms (about 1ms/ft)
    * [Optional] bluetooth latency: 100-200 ms
    * System/JS audio processing latency: dozens of ms?
    * Buffer latency into audioworklet: ~3ms
  * Client side latency:
    * Buffer latency (our code): ~290ms
  * Network/backend latency:
    * XHR TCP connection establishment: ~1.5x RTT (unless conn is reused)
    * [Optional] wait for single-threaded HTTP server to be free
    * Upload time: Send buffer size / upload bandwidth
* Receiving:
  * [Time from XHR start until receiving begins]
    * This is time we have to compensate for when deciding which audio to ask for, but not inherently latency in getting it
  * Network/backend latency:
    * Download time
  * Client side latency:
    * Buffer latency (our code)
  * "Outside world latency":
    * System/JS audio processing latency
    * Optional bluetooth latency
    * Speaker-to-head acoustic latency
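A back-of-the-envelope total for the sending side of the list above, as a Python sketch. The fixed figures are the estimates from these notes. Everything else is an assumption for illustration: 30ms stands in for "dozens of ms", the server wait is taken as zero, and the RTT, bandwidth and payload size in the example call are made up.

```python
def send_latency_ms(rtt_ms: float,
                    upload_bytes_per_s: float,
                    payload_bytes: int,
                    bluetooth_ms: float = 0.0,   # notes say 100-200 when present
                    system_ms: float = 30.0,     # ASSUMPTION for "dozens of ms"
                    conn_reused: bool = False) -> float:
    """Rough sum of the sending-side latency components, in milliseconds."""
    acoustic_ms = 3.0        # head-to-mic, upper bound from the notes
    worklet_ms = 3.0         # buffer latency into the audioworklet
    client_buffer_ms = 290.0 # our send buffer
    conn_ms = 0.0 if conn_reused else 1.5 * rtt_ms
    upload_ms = payload_bytes / upload_bytes_per_s * 1000.0
    # Wait for the single-threaded HTTP server is unknown and left out (zero).
    return (acoustic_ms + bluetooth_ms + system_ms + worklet_ms
            + client_buffer_ms + conn_ms + upload_ms)


if __name__ == "__main__":
    # Assumed example: 40ms RTT, 128 kB/s upload, 12800-byte batch (8-bit samples).
    fresh = send_latency_ms(40, 128_000, 12_800)
    reused = send_latency_ms(40, 128_000, 12_800, conn_reused=True)
    print(f"new connection:    ~{fresh:.0f} ms")
    print(f"reused connection: ~{reused:.0f} ms")
```

Under these assumptions the send buffer dominates, with upload time and connection setup next, which suggests where shrinking the total would pay off first.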