-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
soundwire: add higher bus clock rate support #5160
base: topic/sof-dev
Are you sure you want to change the base?
soundwire: add higher bus clock rate support #5160
Conversation
if you select the 9.6 MHz clock, you also need to change the default_col_size to 8 instead of 4. Otherwise the frame size will not be correct. |
071461c
to
feca0c2
Compare
feca0c2
to
86a0798
Compare
if (curr_dr_freq == 19200000) | ||
mstr_prop->default_col = 8; | ||
else | ||
mstr_prop->default_col = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart Do you have a rule of selecting frame shape? Currently, I use a hack to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to start from the root frequency.
19.2: default_row = 50
12: default_row = 125
12.288: default_row = 64
Then you need to take the bus frequency, multiply it by to to get actual DDR rate, divide it by the frame rate and again by default row and you get the default_col.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to start from the root frequency. 19.2: default_row = 50 12: default_row = 125 12.288: default_row = 64
Then you need to take the bus frequency, multiply it by to to get actual DDR rate, divide it by the frame rate and again by default row and you get the default_col.
Changed to default_col = curr_dr_freq / m_rt->stream->params.rate / mstr_prop->default_row;
86a0798
to
b6e82e4
Compare
b6e82e4
to
0cd7cbf
Compare
__func__, curr_dr_freq); | ||
i++; | ||
goto select_clk; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why you would only increase the clock when there is a single lane. The clock should be increased even in the multi-lane case when there's no bandwidth left?
I must be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is about the priority of using multi-lane or higher clock rate. At this point, we tried multi-lane but can't find an available lane and it will return -EINVAL if it reaches the highest clock rate. But yeah, I didn't consider about using multi-lane with higher clock rate. I assume that we always have enough bandwidth if we can find an extra lane. Let me redo this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, lanes should be used before cranking the speed up, but an error should only be thrown when multilane+higher speed combined doesn't support the required bandwidth.
drivers/soundwire/bus.c
Outdated
* frequency base and scale registers are required for SDCA | ||
* devices. They may also be used for 1.2+/non-SDCA devices. | ||
* Driver can set the property, we will need a DisCo property | ||
* to discover this case from platform firmware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not able to follow this comment, consider rewording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was just copy and paste from the existing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe clarify the comment
"Driver can set the property directly, for now there's no DisCo property to discover support for the scaling registers from platform firmware'
28b5981
to
cf9ed92
Compare
On the second thought, I think the existing code partially support dynamic clock change already. What missed is change frame shape dynamically. So, it will be better if we add the dynamic frame shape support commit first and multi-lane support commits late. It is more reviewable, and the logic is clearer. |
drivers/soundwire/bus.c
Outdated
* frequency base and scale registers are required for SDCA | ||
* devices. They may also be used for 1.2+/non-SDCA devices. | ||
* Driver can set the property, we will need a DisCo property | ||
* to discover this case from platform firmware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe clarify the comment
"Driver can set the property directly, for now there's no DisCo property to discover support for the scaling registers from platform firmware'
/* Check if default_col can be changed */ | ||
list_for_each_entry(s_rt, &m_rt->slave_rt_list, m_rt_node) { | ||
if (!s_rt->slave->id.class_id) { | ||
dev_err(&s_rt->slave->dev, "The Peripheral doesn't comply with SDCA\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you probably need a comment to explain that some devices expose the class ID but can't support dynamic scaling.
I would actually move this loop to a helper that can be modified/quirked if needed, e.g. is_clock_scaling_supported().
frame_int = mstr_prop->default_row * mstr_prop->default_col; | ||
frame_freq = curr_dr_freq / frame_int; | ||
|
||
if ((curr_dr_freq - (frame_freq * SDW_FRAME_CTRL_BITS)) < |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is partially wrong, because we never use column0 for audio. so for a 50 row frame, this calculation would be off by 2 bits times the frame_freq. I think you need to use curr_dr_freq - default_row * frame_freq.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copy from sdw_select_row_col()
. Do we need to do the same change in sdw_select_row_col()
, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think so. I don't think anyone can use column0 with 19.2 or 12.228 root frequencies. It's only possible for the 125-row frame but that's such a corner case.
But the formula could be simplifed to avoid a division btw:
curr_dr_freq - (curr_dr_freq / (sdw_rows[r] * sdw_cols[c]) * sdw_rows[r])
curr_d_freq - curr_dr_freq / sdw_cols[c] < bandwidth
curr_d_freq * (sdw_cols[c] - 1) < bandwidth * sdw_cols[c]
drivers/soundwire/stream.c
Outdated
return ret; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clocks are always updated on prepare/deprepare, I am not sure why you want to remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we really need to update clocks on both prepare/deprepare. We update clock in sdw_compute_bus_params(), but we still reserve the bit slots for the deprepared stream in sdw_compute_port_params(). That doesn't look correct to me.
This is to see if any issue will happen if I remove this. Looks like CI has told me that compute_params() can not be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me re-explain. We have two bank switches, one on prepare and one on trigger. This is intentional, we want to go to the 'new' frequency and let the clock toggle for some time before enabling all the channels on the trigger.
So the point is that we do need to update the parameters and reduce the clock on deprepare. I am not sure why the prepare and de-prepare parts would become asymmetrical?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me re-explain. We have two bank switches, one on prepare and one on trigger. This is intentional, we want to go to the 'new' frequency and let the clock toggle for some time before enabling all the channels on the trigger.
So the point is that we do need to update the parameters and reduce the clock on deprepare.
Agree, but m_rt is removed after deprepared and list_for_each_entry(m_rt, &bus->m_rt_list, bus_node) is used by _sdw_compute_port_params(), sdw_compute_group_params(), etc. As a result, the bit slots of the deprepared stream are still reserved. If we reduce the clock, we will get a no sufficient bandwidth error.
I am not sure why the prepare and de-prepare parts would become asymmetrical?
I don't think it is asymmetrical. Below is the sequence when we start and stop aplay.
intel_hw_params
-> sdw_stream_add_master
asoc_sdw_prepare
-> sdw_prepare_stream
asoc_sdw_hw_free
-> sdw_deprepare_stream
intel_hw_free
-> sdw_stream_remove_master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not following this. Do we agree when the stream is deprepared, it should use exactly zero bitslots? I don't understand why the bits would remain reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but the code above still accounts for the bandwidth of the stream, so the stream is still taken into account.
It makes no sense to me to reduce the bandwidth count after performing the bandwidth allocation.
Sorry, completely lost here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the bandwidth reduction count after performing the bandwidth allocation is based on the fact that the m_rt has not been removed when we allocate the bit slots. To fix the issue, I can either move the bandwidth reduction count after performing the bandwidth as the current PR does or remove the m_rt before allocating the bit slots.
I choose the former because removing m_rt first seems break the add m_rt -> prepare -> deprepare ->remove m_rt sequence.
I know the side effects of moving the bandwidth reduction count after performing the bandwidth allocation is that we may keep the sdw bus clock or lane that is not needed. But it will happen only when we reach the threshold that need higher clock rate or more lane. So, I think it is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bit convoluted really.
The bits for the stream should no longer be accounted for after this routine:
int asoc_sdw_hw_free(struct snd_pcm_substream *substream)
{
struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream);
struct sdw_stream_runtime *sdw_stream;
struct snd_soc_dai *dai;
/* Find stream from first CPU DAI */
dai = snd_soc_rtd_to_cpu(rtd, 0);
sdw_stream = snd_soc_dai_get_stream(dai, substream->stream);
if (IS_ERR(sdw_stream)) {
dev_err(rtd->dev, "no stream found for DAI %s\n", dai->name);
return PTR_ERR(sdw_stream);
}
return sdw_deprepare_stream(sdw_stream);
}
but the problem is that we have a half-baked dai handling and comments
static int
intel_hw_free(struct snd_pcm_substream *substream, struct snd_soc_dai *dai)
{
struct sdw_cdns *cdns = snd_soc_dai_get_drvdata(dai);
struct sdw_intel *sdw = cdns_to_intel(cdns);
struct sdw_cdns_dai_runtime *dai_runtime;
int ret;
dai_runtime = cdns->dai_runtime_array[dai->id];
if (!dai_runtime)
return -EIO;
/*
* The sdw stream state will transition to RELEASED when stream->
* master_list is empty. So the stream state will transition to
* DEPREPARED for the first cpu-dai and to RELEASED for the last
* cpu-dai.
*/
ret = sdw_stream_remove_master(&cdns->bus, dai_runtime->stream);
if (ret < 0) {
dev_err(dai->dev, "remove master from stream %s failed: %d\n",
dai_runtime->stream->name, ret);
return ret;
}
That comment is wrong, the stream state will transition to DEPREPARED in the dailink hw_free(), and then to RELEASED when the last DAI hw_free() is called
That still doesn't really tell me why the bits are not released during the sdw_deprepare_stream() part. There should be no dependency on the m_rt list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asoc_sdw_hw_free() is called before intel_hw_free(). IOW, sdw_deprepare_stream() is called before sdw_stream_remove_master(). So, m_rt is not removed yet when sdw_deprepare_stream() is called.
_sdw_deprepare_stream
-> bus->compute_params(bus)
-> sdw_compute_params()
-> sdw_compute_bus_params(bus)
-> sdw_compute_port_params(bus)
port_bo = 1;
list_for_each_entry(m_rt, &bus->m_rt_list, bus_node) {
sdw_compute_master_ports(m_rt, ¶ms[i], &port_bo, hstop);
}
hstop = hstop - params[i].hwidth;
And we calculate hstart, hstop, and offset in sdw_compute_master_ports()
t_data.hstart = hstart;
t_data.hstop = hstop;
t_data.block_offset = *port_bo;
t_data.sub_block_offset = 0;
(*port_bo) += bps * ch;
So that we allocate bit slots for each m_rt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's clearly wrong, the generic bit allocation should take the states into account, and skip the deprepared streams.
See #5177 for compile-tested-only proposal.
@@ -353,6 +391,10 @@ static int sdw_compute_bus_params(struct sdw_bus *bus) | |||
clk_buf = NULL; | |||
} | |||
|
|||
m_rt = list_last_entry(&bus->m_rt_list, struct sdw_master_runtime, bus_node); | |||
s_rt = list_first_entry(&m_rt->slave_rt_list, struct sdw_slave_runtime, m_rt_node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not following what this is supposed to do and why you are not dealing with M and S in symmetrical ways, the last/first isn't clear to me at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use the last m_rt is because that one is recently added to the bus and it is the m_rt that we need to handle. And for the s_rt, it doesn't really matter to use what s_rt in the m_rt. The s_rt will be used to find the peripheral lane, and we will check if the lane is connected to all peripherals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's correct @bardliao. When there is a new stream, we need to compute the parameters for all streams and all _rt contexts. Think of it this way: at any bank switch the position of all bits could change. It could very well be that the new stream will have higher priority in the bandwidth allocation than the new one.
IOW, the generic allocation is NOT a purely additive mechanism. At every bank switch, the bus allocation is a blank slate and ALL streams/ports can be re-programmed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to list_for_each_entry(m_rt, &bus->m_rt_list, bus_node)
to check all m_rt
2297111
to
e6977d3
Compare
e6977d3
to
69dd90f
Compare
@@ -380,6 +406,10 @@ static int sdw_compute_bus_params(struct sdw_bus *bus) | |||
return -EINVAL; | |||
} | |||
|
|||
/* Get the last m_rt that is recently added to the bus. */ | |||
m_rt = list_last_entry(&bus->m_rt_list, struct sdw_master_runtime, bus_node); | |||
mstr_prop->default_col = curr_dr_freq / m_rt->stream->params.rate / mstr_prop->default_row; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh no no no, big confusion here!
The frame shape need to be computed with the SoundWire frame rate, which can be different to all the stream rates. Think e.g. of the example I presented with a 96 kHz frame rate to support two streams, one at 48kHz and one at 48 kHz.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, right, fixed now.
69dd90f
to
ac34bb3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks mostly good, I still think the code can be made a bit clearer/compact, see suggestions below.
@@ -354,6 +375,10 @@ static int sdw_compute_bus_params(struct sdw_bus *bus) | |||
clk_buf = NULL; | |||
} | |||
|
|||
/* If dynamic scaling is supported, don't try higher freq */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logical inversion in comment, should be "NOT supported"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
/* Run loop for all groups to compute transport parameters */ | ||
for (i = 0; i < count; i++) { | ||
port_bo = 1; | ||
for (l = 0; l < SDW_MAX_LANES; l++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if we can optimize and only deal with existing lanes. 90% of the time there will be a single lane, it's a bit odd to loop on all possible lanes.
ac34bb3
to
1f3769c
Compare
58739b7
to
ffc7d87
Compare
…p_params" This reverts commit 72e43ef. Signed-off-by: Bard Liao <[email protected]>
This reverts commit e56ee9f. Signed-off-by: Bard Liao <[email protected]>
The existing logic is problematic in that we deprepare all the ports, but still take into account the stream for bit allocation by just walking through the bus->m_rt list. This patch sets the state earlier, so that such DEPREPARED streams can be skipped in the bandwidth allocation (to be implemented in a follow-up patch). Signed-off-by: Pierre-Louis Bossart <[email protected]>
We should not blindly walk through all the m_rt list, since it will have the side effect of accounting for deprepared streams. This behavior is the result of the split implementation where the dailink hw_free() handles the stream state change and the bit allocation, and the dai hw_free() modifies the m_rt list. The bit allocation ends-up using m_rt entries in zoombie state, not longer relevant but still used. Signed-off-by: Pierre-Louis Bossart <[email protected]>
Currently, we only set peripheral frequency when the peripheral is initialized. However, curr_dr_freq may change to get required bandwidth. For example, curr_dr_freq may increase from 4.8MHz to 9.6MHz when the 4th stream is opened. Add a helper to get the scale index so that we can get the scale index and program it. Signed-off-by: Bard Liao <[email protected]>
ffc7d87
to
8958021
Compare
SOFCI TEST |
We need to program bus clock scale to adjust the bus clock if current bus clock doesn't fit the bandwidth. Signed-off-by: Bard Liao <[email protected]>
We need to recalculate frame shape when sdw bus clock is changed. And need to make sure all Peripherals connected to the Manager support dynamic clock change. Signed-off-by: Bard Liao <[email protected]>
…w_select_row_col The bits in Column 0 of Rows 0 to 47 are for control word and can not be used for audio. In practice, entire Column 0 is skipped. Signed-off-by: Bard Liao <[email protected]>
Currently, we check curr_dr_freq roughly by "if (curr_dr_freq <= bus->params.bandwidth)" in sdw_compute_bus_params() and check it accurately in sdw_select_row_col(). It works if we only support one freq. But, we need to check it accurately in sdw_select_row_col() to give it a chance to use a higher freq or use multi-lane. Signed-off-by: Bard Liao <[email protected]>
If a peripheral supports multi-lane, we can use data lane x to extend the bandwidth. The patch suggests to select data lane x where x > 0 when bandwidth is not enough on data lane 0. Signed-off-by: Bard Liao <[email protected]>
All active streams with the same parameters are grouped together and the params are store in the sdw_group struct. We compute the required bandwidth for each group. However, each lane has individual bandwidth. Therefore, we should separate different lanes in different params groups. Add lane variable to separate params groups. Signed-off-by: Bard Liao <[email protected]>
8958021
to
cd3144c
Compare
This reverts commit f2fc74f. To test without this commit. It seems work on my new LNL device. I would like to test with CI to see what the failure scenario is.