axi
RockyQLuo committed Nov 14, 2024
1 parent cb134df commit 03afb64
Showing 11 changed files with 267 additions and 2 deletions.
1 change: 1 addition & 0 deletions _posts/2024-10-10-环境配置.md
@@ -5,6 +5,7 @@ tags: []
img_path:
---
0️⃣ Some basics
- [ag](https://github.com/ggreer/the_silver_searcher)
- [CHISEL framework 1](https://github.com/light-ly/chisel-template)
- [CHISEL framework 2: the elaborator is split out, which suits writing multiple labs](https://github.com/alexfanqi/chisel-playground)
- [wnlen/clash-for-linux: clash-for-linux (github.com)](https://github.com/wnlen/clash-for-linux)
2 changes: 2 additions & 0 deletions _posts/协议类/2024-10-11-noc.md
@@ -8,6 +8,8 @@ math: true
img_path: /assets/img/image/
---

<script>var password = prompt("Please enter the password:"); if (password !== "xiaokai") {document.body.innerHTML = "Access denied."; } </script>

> [reference](https://shili2017.github.io)
1. [A network generation tool](https://github.com/crossroadsfpga/connect/blob/main/README) ([this project calls the Bluespec Compiler](https://github.com/B-Lang-org/bsc))
2. [AXI Verification IP (VIP) (xilinx.com)](https://www.xilinx.com/products/intellectual-property/axi-vip.html#documentation)
2 changes: 1 addition & 1 deletion _posts/协议类/2024-10-20-CHI随笔记.md
@@ -1,6 +1,6 @@
---
layout: post
title: AMBA CHI notes
title: AMBA CHI notes--giving-up edition
date: 2024-10-20 21:55 +0800
categories: [spec reading, protocol]
tags: []
73 changes: 73 additions & 0 deletions _posts/协议类/2024-11-4-AXI4随笔记.md
@@ -0,0 +1,73 @@
---
layout: post
title: AMBA AXI notes
date: 2024-11-04 20:23 +0800
categories: [spec reading, protocol]
tags: []
math: true
img_path: /assets/img/axi/
---

<script>var password = prompt("Please enter the password:"); if (password !== "xiaokai") {document.body.innerHTML = "Access denied."; } </script>


## Outstanding Transfer
**The next transaction can be issued without waiting for the previous transfer to complete.**

![outstanding]({{ page.img_path }}outstanding.png){: width="972" height="589" }
![outstanding_n3]({{ page.img_path }}outstanding_n3.png){: width="972" height="589" }
![outstanding_n4]({{ page.img_path }}outstanding_n4.png){: width="972" height="589" }

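Choosing the outstanding depth well achieves maximum bandwidth (it is a tradeoff). The optimal depth is roughly: $\frac{\text{time needed to complete one transaction}}{\text{time needed to transfer its data}}$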

Take an SoC as an example:

- Suppose BUS@500MHz (period = 2ns)
- Suppose each AXI command is a burst of 8 (ideal: 8 cycles to transfer)
- The latency is 200ns

Outstanding Num = 200ns / (2ns × 8) = 12.5, i.e. a depth in the 8~16 range
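The arithmetic above can be sketched as a small helper. This is only an illustration of the ratio given in the text; the function name and parameters are made up for this example, and it assumes the ideal case of one data beat per bus cycle.

```python
def outstanding_depth(latency_ns: float, period_ns: float, burst_len: int) -> float:
    """Number of transactions that must be in flight to hide the latency.

    Assumes the ideal case from the text: one beat per cycle, so a burst of
    burst_len beats occupies the data channel for period_ns * burst_len.
    """
    data_time_ns = period_ns * burst_len
    return latency_ns / data_time_ns


# The SoC example from the text: 500MHz bus, burst of 8, 200ns latency.
depth = outstanding_depth(latency_ns=200, period_ns=2, burst_len=8)
print(depth)  # 12.5
```

In practice the result is rounded to a convenient buffer size, which is where a range such as 8~16 comes from.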

## Out-of-order Transfer
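With multiple slaves, responses may return in a different order from the one in which the master issued the requests. For example, the master sends a read request to slave0 and then a read request to slave1. Because the slaves may differ in speed, <font color="#d99694">slave1's data is allowed to return before slave0's data</font>.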

![Out-of-order]({{ page.img_path }}Out-of-order.png){: width="972" height="589" }

The bubble here can be filled by returning C and D first.

Ordering rules:
- All transfers in a transaction share the same ID, and <font color="#d99694">transactions with the same ID must be ordered as issued</font>
- Slaves generally need a configurable ID width
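The same-ID ordering rule can be checked with a short sketch. The function below is a hypothetical checker written for this note, not part of any AXI library: it accepts any reordering between different IDs but rejects reordering within one ID.

```python
from collections import defaultdict


def respects_id_ordering(issued, returned):
    """Return True if responses keep the issue order within each ID.

    issued / returned are lists of (axi_id, tag) tuples; tag is just a label
    used to tell transactions apart in this example.
    """
    def per_id(seq):
        groups = defaultdict(list)
        for axi_id, tag in seq:
            groups[axi_id].append(tag)
        return dict(groups)

    return per_id(issued) == per_id(returned)


issued = [(0, "A"), (0, "B"), (1, "C"), (1, "D")]
# ID 1 overtakes ID 0: allowed, because the IDs differ.
ok = respects_id_ordering(issued, [(1, "C"), (1, "D"), (0, "A"), (0, "B")])
# B overtakes A under the same ID 0: not allowed.
bad = respects_id_ordering(issued, [(0, "B"), (0, "A"), (1, "C"), (1, "D")])
```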


### Interleaving Transfer
<font color="#d99694">Write interleaving was removed in AXI4</font>

Read interleaving **builds on out-of-order transfers by also allowing the data beats of different IDs to be interleaved with each other**

![interleaving]({{ page.img_path }}interleaving.png){: width="972" height="589" }




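## Appendix: signal meanings
- AxPROT[2:0]: used by CPUs. AxPROT[0]: privileged access; AxPROT[1]: Non-secure; AxPROT[2]: Instruction
- AxCACHE[3:0] [WA:RA:C/M:B]: write/read allocate; Cacheable (AXI3) / Modifiable (AXI4); Bufferable (Modifiable: The burst and transfer characteristics can change between source and destination)
- AxLOCK: normal and exclusive
  - locked access: blocks access from all other masters to the slave (<font color="#d99694">removed in AXI4</font>)
  - exclusive access: guards a memory region in the slave without blocking other masters; the exclusive write fails if another master has written to the region since the exclusive read
  - The exclusive access mechanism enables the implementation of semaphore type operations without requiring the bus to remain locked to a particular master for the duration of the operation.
  - semaphore: requires slave hardware support (an Exclusive Access Monitor) to lock the shared region
  - exclusive sequence: first read the address exclusively to obtain the lock, then write; a successful write clears the lock
- AxQOS[3:0]: encoding 0xF is the highest priority; arbiters and slaves can use it to adjust the order in which accesses are served
- AxREGION[3:0], usage models:
  - The master's signal can be used to tell which memory region is being accessed, which simplifies address decoding in the slave
  - The slave can restrict accesses and apply other behaviors per region
- Upgrading from AXI3 to AXI4
  - Burst support grows from the previous 16 beats to up to 256 beats (i.e. AxLEN is 0-255; the longer bursts apply to INCR bursts)
  - Write data interleaving and LOCK (locked access) were removed
- AXI-Lite: differences from AXI4
  - All accesses are Non-modifiable, Non-bufferable; Exclusive accesses are not supported
  - The data bus must be 32 or 64 bits and always used at full width, with a burst length of 1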
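The exclusive read/write sequence described under AxLOCK can be modelled with a toy monitor. This is a simplified sketch written for this note: the class, its per-address granularity and the master names are assumptions, and a real Exclusive Access Monitor tracks address ranges and IDs in hardware. The response names follow AXI, where a successful exclusive write returns EXOKAY and a failed one returns OKAY without updating memory.

```python
class ExclusiveMonitor:
    """Toy exclusive access monitor with one reservation per master (hypothetical model)."""

    def __init__(self):
        self.mem = {}
        self.reservations = {}  # master -> address it is monitoring

    def exclusive_read(self, master, addr):
        """Exclusive read: record the reservation and return the current value."""
        self.reservations[master] = addr
        return self.mem.get(addr, 0)

    def write(self, master, addr, data):
        """Normal write: always updates memory and breaks reservations on addr."""
        self.mem[addr] = data
        self._clear(addr)
        return "OKAY"

    def exclusive_write(self, master, addr, data):
        """Exclusive write: succeeds only if this master's reservation is intact."""
        if self.reservations.get(master) != addr:
            return "OKAY"  # exclusive access failed, memory is not updated
        self.mem[addr] = data
        self._clear(addr)  # a successful write clears the lock
        return "EXOKAY"

    def _clear(self, addr):
        for m in [m for m, a in self.reservations.items() if a == addr]:
            del self.reservations[m]


# Two masters race for a semaphore at address 0x100.
mon = ExclusiveMonitor()
mon.exclusive_read("m0", 0x100)
mon.exclusive_read("m1", 0x100)
first = mon.exclusive_write("m0", 0x100, 1)   # m0 wins
second = mon.exclusive_write("m1", 0x100, 2)  # m1 lost its reservation
```

Neither master ever blocks the bus; the loser simply sees the failed response and retries the read-then-write sequence.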
@@ -6,7 +6,6 @@ categories: [reading notes, paper]
tags: []
img_path: /assets/img/paper/
---
<script>var password = prompt("Please enter the password:"); if (password !== "xiaokai") {document.body.innerHTML = "Access denied."; } </script>


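The BLESS algorithm proposed in the paper is one implementation of bufferless routing. When the desired output port is occupied, the router deflects the packet onto a non-ideal path so that it keeps moving, which avoids blocking. Although bufferless routing has a significant energy-efficiency advantage, it also increases packet latency, especially when deflections occur. However, <font color="#d99694">the paper points out that in many real applications the network load is low and deflections are rare (a CPU keeps injecting, then stops injecting once it has gone a while without a response)</font>, so overall performance is not significantly affected. Bufferless routing cannot, however, be easily applied to networks with directed links, such as Butterfly networks, because a deflected packet may no longer be able to reach its destination.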
190 changes: 190 additions & 0 deletions _posts/项目学习/2024-11-5-电路设计tips.md
@@ -0,0 +1,190 @@
---
layout: post
title: Circuit design tips
date: 2024-11-05 18:26 +0800
categories: [project study, circuit design]
tags: []
math: true
img_path: /assets/img/image/
---

<script>var password = prompt("Please enter the password:"); if (password !== "xiaokai") {document.body.innerHTML = "Access denied."; } </script>

[A retrospective from a self-taught CPU designer](https://mp.weixin.qq.com/s/p2RRFLMBvNZg7PPde-jjiA)


# 1. Static timing analysis (STA)
## Timing basics: $T_{real}$ ($T_q+T_c$), $T_{setup}$ and $T_{hold}$

[An easy-to-follow explanation of setup and hold](https://www.bilibili.com/video/BV1sg411B7VZ/?spm_id_from=333.788&vd_source=aaf91522adc6826d87c67900ed8b01d9)

![1]({{ page.img_path }}Pasted image 20240701093311.png){: width="972" height="589" }

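* If the two clocks have different periods, $T_{period}$ here can be taken as the time between $T_{launch}$ and $T_{capture}$ ([see this explanation](https://www.bilibili.com/video/BV19P4y1e7jC/?spm_id_from=333.788&vd_source=aaf91522adc6826d87c67900ed8b01d9)). In practice, though, such clocks are usually treated as asynchronous and handled as a clock domain crossing
* During test and validation, to check for a setup violation, lower the clock frequency and see whether the failure goes away (setup depends on the clock period, hold does not)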
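The two checks can be written out as slack formulas. This is a minimal sketch of the standard reg2reg setup and hold checks, assuming $T_q$ is the clock-to-Q delay and $T_c$ the combinational delay (the figures are not reproduced here, so that mapping is an assumption); the function names and the numbers are illustrative only.

```python
def setup_slack(t_period, t_q, t_c_max, t_setup, t_skew=0.0):
    """Setup check: data launched on one edge must settle t_setup before the next capture edge.

    t_skew is capture-clock arrival minus launch-clock arrival.
    """
    return (t_period + t_skew) - (t_q + t_c_max + t_setup)


def hold_slack(t_q, t_c_min, t_hold, t_skew=0.0):
    """Hold check: new data must not arrive within t_hold after the same capture edge."""
    return (t_q + t_c_min) - (t_hold + t_skew)


# Illustrative numbers in ns (not from the original post).
fast = setup_slack(t_period=2.0, t_q=0.3, t_c_max=1.9, t_setup=0.2)  # negative: violation
slow = setup_slack(t_period=4.0, t_q=0.3, t_c_max=1.9, t_setup=0.2)  # positive: fixed
hold = hold_slack(t_q=0.3, t_c_min=0.1, t_hold=0.5)                  # negative at any period
```

The period only appears in the setup check, which is why slowing the clock separates setup failures from hold failures.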

![2]({{ page.img_path }}Pasted image 20240803230440.png){: width="972" height="589" }
![3]({{ page.img_path }}Pasted image 20240803230513.png){: width="972" height="589" }

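## Constraining a complex clock tree ([reference video](https://www.bilibili.com/video/BV1hf4y1n7zq/?spm_id_from=333.999.0.0&vd_source=aaf91522adc6826d87c67900ed8b01d9))

Steps for constraining multiple clock inputs:
* First create the multiple input clocks, then set their uncertainty
* Next, declare the two clocks that feed the mux and state that they never occur at the same time (set_clock_groups -logically_exclusive), so that the tool does not analyze paths that can never happen
* On the input side use -logically_exclusive (the clocks are never selected at the same time); on the output side use -physically_exclusive (the clocks never coexist physically, so there is no wire-to-wire interaction between them)
* The inputs of a downstream MUX, i.e. the outputs of the previous stage, need to be brought out with create_generated_clock, which then lets you complete the set_clock_groups declarations as well as the divider and duty-cycle definitions (-edges {1 3 7})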
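To see what an edge list such as {1 3 7} means for the divider and duty cycle, the edges can be converted to times. This is a sketch under stated assumptions: the master clock has a 50% duty cycle and rises at t=0, so master edge n (1 is the first rise, 2 the first fall, and so on) occurs at (n - 1) × period / 2; the helper name is made up for this example.

```python
def generated_clock_from_edges(master_period, edges):
    """Map three master edge numbers to the (rise, fall, next_rise) times of the generated clock.

    Assumes a 50% duty-cycle master clock whose first rising edge is at t=0.
    """
    rise, fall, next_rise = [(n - 1) * master_period / 2 for n in edges]
    return rise, fall, next_rise


rise, fall, next_rise = generated_clock_from_edges(master_period=2.0, edges=(1, 3, 7))
period = next_rise - rise      # 6.0: three master periods, so divide-by-3
duty = (fall - rise) / period  # high for one master period out of three
```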

![4]({{ page.img_path }}Pasted image 20240629150811.png){: width="972" height="589" }

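* Clocks generated from the same clock source (x0 and a0, which are treated as synchronous by default) are used to drive two different modules. The tool must be told that the two are unrelated (asynchronous)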

![5]({{ page.img_path }}Pasted image 20240629151635.png){: width="972" height="589" }

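- clock balance: the same clock reaches two registers at different times, so buffers are inserted on one path to make the two delays equal (only needed when the outputs of the two registers are related; otherwise it can be ignored)
- The reset tree is generally not built during synthesis; it is handled later during place and route (PR)

- For the timing constraints of a CDC synchronizer chain, set a false_path between the first and second registers and a max_delay between the second and third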

![6]({{ page.img_path }}Pasted image 20240731183442.png){: width="972" height="589" }


## [An explanation of input and output delay](https://www.bilibili.com/video/BV1uP411V7Ec/?spm_id_from=333.788&vd_source=aaf91522adc6826d87c67900ed8b01d9)


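## Some thoughts on STA
* (1) Can the STA setup/hold time be negative?

1. Why it can happen, from how the flip-flop works
Take a reg2reg path (rising-edge triggered) as an example. Suppose the data is not captured immediately after the rising clock edge arrives at the CK pin; instead there is a delay Dd.

Then RT = T + Dclks + Dd - setup = T + Dclks + (Dd - setup) = T + Dclks - (-Dd + setup)

When Dd is larger than setup, (-Dd + setup) is negative. Suppose that inside the FF we add a buffer on the clock path with a delay of 0.4ns; then (taking the original setup as 0.2ns) the setup time becomes -0.2ns.

**The real setup is never negative; the negative setup (-Dd + setup) is no longer setup in its original sense. What it means: when the setup time is negative, the signal may start to hold stable some time (the setup time) after the active clock edge, so the data path delay is allowed to grow and timing closure becomes easier.**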
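The required-time expression above can be evaluated directly. A minimal sketch: the function names are made up here, the 0.2ns intrinsic setup is the value implied by the 0.4ns buffer and the -0.2ns result, and T and Dclks are arbitrary illustrative numbers.

```python
def effective_setup(setup, dd):
    """The (-Dd + setup) term: the setup time as seen at the FF boundary."""
    return setup - dd


def required_time(t, dclks, setup, dd):
    """RT = T + Dclks - (-Dd + setup)."""
    return t + dclks - effective_setup(setup, dd)


# Assumed intrinsic setup of 0.2ns, internal clock buffer delay of 0.4ns.
eff = effective_setup(setup=0.2, dd=0.4)                       # about -0.2ns
rt_plain = required_time(t=2.0, dclks=0.1, setup=0.2, dd=0.0)  # about 1.9ns
rt_delayed = required_time(t=2.0, dclks=0.1, setup=0.2, dd=0.4)  # about 2.3ns
```

The later required time with the internal delay is what gives the data path its extra margin.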


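# 2. Clock domain crossing (CDC)
Data synchronization across clock domains falls mainly into these cases:

* Handling single-bit and multi-bit data (handshake or asynchronous FIFO).
* [On calculating asynchronous FIFO depth](https://blog.csdn.net/qq_39507748/article/details/122028575?spm=1001.2014.3001.5501)
* More on asynchronous FIFOs: (use an IP for the asynchronous FIFO, do not write your own. sky covered a method for calculating FIFO depth. Also, the Gray code only needs to span an even range 2N, it does not need 2^N, while the depth itself can be any value) <font color="#ff0000">In other words, it is enough to remove the same number of Gray codes from the front and from the back</font>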

![7]({{ page.img_path }}Pasted image 20240816160506.png){: width="972" height="589" }
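Removing the same number of codes from both ends of a Gray sequence keeps the wrap-around step at a single bit change, because the reflected Gray code is mirror-symmetric apart from its top bit. The sketch below checks this; the function names are hypothetical and the reflected binary Gray code `i ^ (i >> 1)` is assumed.

```python
def trimmed_gray_sequence(bits, depth):
    """Gray sequence of even length `depth`, made by dropping the same number
    of codes from both ends of the 2**bits reflected Gray code."""
    full = 1 << bits
    if depth % 2 or not 0 < depth <= full:
        raise ValueError("depth must be even and fit in the given width")
    k = (full - depth) // 2
    return [i ^ (i >> 1) for i in range(k, full - k)]


def is_cyclic_gray(seq):
    """True if every neighbouring pair, including last-to-first, differs in exactly one bit."""
    return all(bin(a ^ b).count("1") == 1 for a, b in zip(seq, seq[1:] + seq[:1]))


symmetric = trimmed_gray_sequence(bits=4, depth=10)  # drops 3 codes at each end
one_sided = [i ^ (i >> 1) for i in range(10)]        # drops 6 codes at the end only
print(is_cyclic_gray(symmetric), is_cyclic_gray(one_sided))  # True False
```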

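* Synchronizing from a slow clock domain to a fast one, and from a fast clock domain to a slow one.
* Level synchronization and pulse synchronization. With level synchronization, the synchronized signal stays high (or low) for at least two clock cycles in the destination domain; with pulse synchronization, the synchronized signal lasts one clock cycle in the destination domain.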

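[For everything on CDC, read this](https://www.cnblogs.com/lyc-seu/p/12441366.html)

Note that a signal being synchronized into another clock domain must not pass through combinational logic. If several crossing signals go through combinational logic before entering the synchronizer, glitches from the source clock domain propagate into the destination clock domain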

![cdc]({{ page.img_path }}cdc.png){: width="972" height="589" }

* [Handling pulse signals across asynchronous domains](https://www.bilibili.com/video/BV1vh411s7bQ/?spm_id_from=333.999.0.0&vd_source=aaf91522adc6826d87c67900ed8b01d9)
* [Without proper asynchronous reset with synchronous release, chip parameters can be abnormal after reset is released](https://www.bilibili.com/video/BV1NT4y1f7aT/?spm_id_from=333.999.0.0&vd_source=aaf91522adc6826d87c67900ed8b01d9)
* [Glitch-free clock switching circuit structure](https://www.bilibili.com/video/BV1kb421E7Wb/?spm_id_from=333.999.0.0&vd_source=aaf91522adc6826d87c67900ed8b01d9)

![8]({{ page.img_path }}Pasted image 20240708183147.png){: width="972" height="589" }


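## Handshake
Put simply, when the signals of an AXI master/slave channel show setup violations, **the signals that need to be registered are coupled in timing**, so the valid-ready protocol has to be handled at the same time as the registers are inserted. That is, the registers go into the two blank spots in the middle of the figure below

Below is the general-purpose scheme: [a low-frequency pulse, converted to a level signal, with a two-way handshake](https://www.bilibili.com/video/BV11y411z7V6/?spm_id_from=333.999.0.0&vd_source=aaf91522adc6826d87c67900ed8b01d9). It works whichever of clk1 and clk2 is faster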

![9]({{ page.img_path }}Pasted image 20240702213207.png){: width="972" height="589" }
![cdc_handshake]({{ page.img_path }}cdc_handshake.png){: width="972" height="589" }

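## Techniques for registering the valid/ready handshake

When the pipeline has many stages, the ready back-pressure signal (usually driven by the receiver through combinational logic) propagates backward stage by stage, and timing gets worse. There are three ways to register it:

- Forward Register Slice: registers only the valid and data signals
- Backward Register Slice: registers only the ready signal
- Full Register Slice: registers both the valid and the ready signals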

---
[Reference article 1: registering to improve timing is not as simple as it sounds](https://www.shangyexinzhi.com/article/3430057.html)
[Reference article 2](https://zhuanlan.zhihu.com/p/620498057)

* **[skid buffer](https://zhuanlan.zhihu.com/p/532012806)** (chip design: skid buffer, breaking the ready path; Zhihu)

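```scala
// Fragment from inside a Chisel Module; the IO and register declarations are not shown in the original.
// r_valid means the buffer holds data: when an input arrives and out cannot take it, write it into the buffer;
// once the output side can accept, hand the data to out and pull r_valid low to mark the buffer empty.
when(io.i_data.fire && (io.o_data.valid && !io.o_data.ready)){
  r_valid := true.B
}.elsewhen(io.o_data.ready){
  r_valid := false.B
}.otherwise{
  r_valid := r_valid
}
// Whenever there is data, it goes into the buffer
when(OPT_LOWPOWERReg && (!io.o_data.valid || io.o_data.ready)){
  r_data := 0.U
}.elsewhen((!OPT_LOWPOWERReg || !OPT_OUTREGReg || io.i_data.valid) && io.i_data.ready){
  r_data := io.i_data.bits
}
io.i_data.ready := !r_valid // input can be accepted when the buffer holds no data
```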

* **Forward Registered**

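```verilog
// 1. valid_dst (the reset branches here are reconstructed by analogy with the
//    Backward Registered block below; the original fragment starts at "else if")
always @(posedge clk or negedge rst_n) begin
    if (rst_n == 1'd0)
        valid_dst <= 1'd0;
    else if (valid_src == 1'd1)
        valid_dst <= #`DLY 1'd1;
    else if (ready_dst == 1'd1)
        valid_dst <= #`DLY 1'd0;
end
// valid_dst goes high when the master issues a request (valid_src high), and goes low once the
// master has no valid request and the slave can accept one (ready_dst high): one transfer is done
// 2. payload_dst
always @(posedge clk or negedge rst_n) begin
    if (rst_n == 1'd0)
        payload_dst <= 'd0;
    else if (valid_src == 1'd1 && ready_src == 1'd1)
        payload_dst <= #`DLY payload_src;
end
// 3. ready_src: even if the downstream cannot accept data, the register can still take one beat while it is empty
assign ready_src = (~valid_dst) | ready_dst;
```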
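To sanity-check the forward register slice, its three equations can be replayed cycle by cycle in software. This is a behavioral sketch written for this note, assuming a source that always offers the next item while it has one; it is not a replacement for RTL simulation.

```python
def simulate_forward_slice(items, ready_pattern):
    """Replay the forward register slice, one loop iteration per clock cycle.

    items: payloads the source wants to send, in order.
    ready_pattern: value of ready_dst in each cycle.
    Returns the payloads the destination accepted.
    """
    valid_dst, payload_dst = False, None
    pending = list(items)
    received = []
    for ready_dst in ready_pattern:
        valid_src = bool(pending)
        payload_src = pending[0] if pending else None
        ready_src = (not valid_dst) or ready_dst  # combinational assign
        if valid_dst and ready_dst:               # downstream handshake this cycle
            received.append(payload_dst)
        # Register updates at the clock edge
        if valid_src and ready_src:
            payload_dst = payload_src
            pending.pop(0)
        if valid_src:
            valid_dst = True
        elif ready_dst:
            valid_dst = False
    return received


out = simulate_forward_slice([1, 2, 3], [False, True, False, True, True, True, True])
print(out)  # [1, 2, 3]
```

Every item arrives once and in order even though ready_dst stalls twice, which is the property the valid/payload register is meant to preserve.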

* **Backward Registered**

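```verilog
// Buffer-occupied flag: set when the source sends a beat that the destination cannot take
always @(posedge clk or negedge rst_n) begin
    if (rst_n == 1'd0)
        valid_tmp0 <= 1'd0;
    else if (valid_src == 1'd1 && ready_dst == 1'd0 && valid_tmp0 == 1'd0)
        valid_tmp0 <= #`DLY 1'd1;
    else if (ready_dst == 1'd1)
        valid_tmp0 <= #`DLY 1'd0;
end
// Capture the stalled payload into the buffer
always @(posedge clk or negedge rst_n) begin
    if (rst_n == 1'd0)
        payload_tmp0 <= 'd0;
    else if (valid_src == 1'd1 && ready_dst == 1'd0 && valid_tmp0 == 1'd0)
        payload_tmp0 <= #`DLY payload_src;
end
// Drive the buffered payload first; otherwise pass the source payload straight through
assign payload_dst = (valid_tmp0 == 1'd1) ? payload_tmp0 : payload_src;
// ready is registered, so ready_src follows ready_dst one cycle later
always @(posedge clk or negedge rst_n) begin
    if (rst_n == 1'd0)
        ready_src <= 1'd0;
    else
        ready_src <= #`DLY ready_dst;
end
```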

* **Fully Registered**

Solved with a FIFO: use the <font color="#ff0000">not-empty signal as valid_dst and the payload FIFO's not-full signal as ready_src</font>

![handshake_eq_fifo]({{ page.img_path }}handshake_eq_fifo.png){: width="972" height="589" }
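The FIFO view of a full register slice can be sketched as a small behavioral model. The class below is hypothetical and the depth of 2 is an assumption chosen for illustration; the point is that both valid_dst and ready_src come from stored state only, so neither depends combinationally on the other side.

```python
from collections import deque


class FifoRegisterSlice:
    """Behavioral model: valid_dst = not empty, ready_src = not full."""

    def __init__(self, depth=2):
        self.depth = depth
        self.fifo = deque()

    @property
    def valid_dst(self):
        return len(self.fifo) > 0

    @property
    def ready_src(self):
        return len(self.fifo) < self.depth

    def clock(self, valid_src, payload_src, ready_dst):
        """Advance one cycle; return the payload handed downstream, or None."""
        # Both decisions use the state from before the clock edge.
        push = valid_src and self.ready_src
        pop = ready_dst and self.valid_dst
        out = self.fifo.popleft() if pop else None
        if push:
            self.fifo.append(payload_src)
        return out


s = FifoRegisterSlice(depth=2)
trace = [
    s.clock(True, "a", False),  # stored, nothing leaves
    s.clock(True, "b", False),  # stored, FIFO now full
    s.clock(True, "c", True),   # full: "c" is refused, "a" leaves
    s.clock(True, "c", True),   # "c" accepted, "b" leaves
    s.clock(False, None, True), # "c" leaves
]
```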

* Registering the handshake in one-to-many or many-to-one cases: [see this article](https://zhuanlan.zhihu.com/p/503806430)

---



serializer_a.io.reset := reset_noc
serializer_w.io.reset := reset_noc
deserializer_br.io.reset := reset_noc
flow_control_send.reset := reset_noc
flow_control_recv.reset := reset_noc

Binary file added assets/img/axi/Out-of-order.png
Binary file added assets/img/axi/interleaving.png
Binary file added assets/img/axi/outstanding.png
Binary file added assets/img/axi/outstanding_n3.png
Binary file added assets/img/axi/outstanding_n4.png
