<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>ZQA</title>
<link href="http://misakiz.github.io/atom.xml" rel="self"/>
<link href="http://misakiz.github.io/"/>
<updated>2024-01-10T08:29:15.313Z</updated>
<id>http://misakiz.github.io/</id>
<author>
<name>ZQA</name>
</author>
<generator uri="https://hexo.io/">Hexo</generator>
<entry>
<title>An Nginx Metrics Monitoring Solution Based on Flink SQL</title>
<link href="http://misakiz.github.io/2024/01/10/%E5%9F%BA%E4%BA%8EFlinkSql%E7%9A%84Nginx%E6%8C%87%E6%A0%87%E7%9B%91%E6%8E%A7%E6%96%B9%E6%A1%88/"/>
<id>http://misakiz.github.io/2024/01/10/%E5%9F%BA%E4%BA%8EFlinkSql%E7%9A%84Nginx%E6%8C%87%E6%A0%87%E7%9B%91%E6%8E%A7%E6%96%B9%E6%A1%88/</id>
<published>2024-01-10T11:43:54.000Z</published>
<updated>2024-01-10T08:29:15.313Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>An Nginx metrics monitoring solution based on Flink SQL</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I have not yet figured out how to configure comments in Hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h1>Background</h1><p><img src="/FlinkNg/1.png" alt="image-20230702200907879"></p><p>The original metric collection architecture is shown in the figure above: an exporter periodically runs Elasticsearch (ES) aggregation queries and reports the resulting metrics to Prometheus.<br><em><strong>Problems</strong></em><br>When ES consumes the topic, it has to write each message to disk and spend time building indexes, and the aggregations also consume CPU and memory, so <strong>metric collection latency is high</strong>.</p><p>The ES cluster is also shared: the Nginx ES cluster is not used only for collecting monitoring metrics. When something goes wrong in an environment, people query ES to investigate, which drives up the load on the ES servers, interferes with metric collection, and in turn causes metric <strong>data loss</strong> or <strong>inaccuracy</strong>.</p><h1>Metrics Monitoring Solution Based on Flink SQL</h1><p><img src="/FlinkNg/2.png" alt="image-20230702200907879"></p><h2 id="topic-cdc-数据清理">Topic CDC data cleaning</h2><p>The messages in the original Kafka topic are not in a standard format that Flink can consume directly, so a CDC processing layer is added first.</p><h2 id="指标采集处理">Metric collection and processing</h2><figure class="highlight go"><table><tr><td class="code"><pre><span class="line">CREATE FUNCTION UdafTP AS <span class="string">'udf.UdafTP'</span>;</span><br><span class="line">-- Create the input table, which reads from the Kafka topic</span><br><span class="line">CREATE TABLE nginx_logs (</span><br><span class="line"><span class="string">`domain`</span> STRING ,</span><br><span class="line"><span class="string">`cluster`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`status`</span> STRING,</span><br><span class="line"><span class="string">`row_time`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`upstream_response_time`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`request_time`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`size`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`upstream_status`</span> <span class="type">string</span>,</span><br><span class="line"><span class="string">`request_url`</span> <span class="type">string</span>,</span><br><span class="line">-- Computed columns</span><br><span class="line">tmp_time AS cast(REPLACE(SUBSTRING(row_time, <span class="number">1</span>, <span class="number">19</span>) ,<span class="string">'T'</span>,<span class="string">' '</span>) as timestamp(<span class="number">3</span>)),</span><br><span class="line"><span class="string">`cluster_nopool`</span> AS (</span><br><span class="line">CASE WHEN <span class="string">`cluster`</span> = <span class="string">''</span> THEN <span class="string">'-'</span></span><br><span class="line">ELSE SUBSTRING(<span class="string">`cluster`</span>, <span class="number">1</span>, CHARACTER_LENGTH(<span class="string">`cluster`</span>) - CHARACTER_LENGTH(<span class="string">'_pool'</span>))</span><br><span class="line">END</span><br><span class="line">),</span><br><span class="line">urtime AS 
cast(upstream_response_time AS FLOAT),</span><br><span class="line">rtime AS cast(request_time AS FLOAT),</span><br><span class="line"><span class="string">`tmp_size`</span> AS cast(<span class="string">`size`</span> AS INT),</span><br><span class="line">WATERMARK FOR <span class="string">`tmp_time`</span> AS <span class="string">`tmp_time`</span> - INTERVAL <span class="string">'5'</span> SECOND</span><br><span class="line">) WITH (</span><br><span class="line"><span class="string">'connector'</span> = <span class="string">'kafka'</span>,</span><br><span class="line">-- (the remaining connector options are omitted in this post)</span><br><span class="line">);</span><br><span class="line"></span><br><span class="line">-- Create the base_metric output table, which stores the aggregation results</span><br><span class="line">CREATE TABLE domain_cluster_metric (</span><br><span class="line">domain STRING,</span><br><span class="line">cluster <span class="type">string</span>,</span><br><span class="line">............,</span><br><span class="line">window_start TIMESTAMP(<span class="number">3</span>),</span><br><span class="line">window_end TIMESTAMP(<span class="number">3</span>),</span><br><span class="line">primary key(domain) NOT ENFORCED</span><br><span class="line">) WITH (</span><br><span class="line"><span class="string">'connector'</span> = <span class="string">'jdbc'</span>,</span><br><span class="line">-- (the remaining connector options are omitted in this post)</span><br><span class="line">);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"></span><br><span class="line">-- Run the aggregation query: the inner query aggregates by the salted key, the outer query merges the salted groups</span><br><span class="line">INSERT INTO domain_cluster_metric</span><br><span class="line"><span class="keyword">select</span></span><br><span class="line">split_index(domain_rand,<span class="string">'_'</span>,<span class="number">0</span>) as domain,</span><br><span class="line"><span class="string">`cluster_nopool`</span> as cluster,</span><br><span class="line">............,</span><br><span class="line">window_start,</span><br><span class="line">window_end</span><br><span class="line">from (</span><br><span 
class="line">SELECT</span><br><span class="line">domain || <span class="string">'_'</span> || cast(cast(RAND()*<span class="number">100</span> as <span class="type">int</span>) as <span class="type">string</span>) as domain_rand,</span><br><span class="line"><span class="string">`cluster_nopool`</span>,</span><br><span class="line">............,</span><br><span class="line">TUMBLE_START(tmp_time, INTERVAL <span class="string">'1'</span> MINUTE) AS window_start,</span><br><span class="line">TUMBLE_END(tmp_time, INTERVAL <span class="string">'1'</span> MINUTE) AS window_end</span><br><span class="line">FROM nginx_logs</span><br><span class="line">GROUP BY</span><br><span class="line">domain || <span class="string">'_'</span> || cast(cast(RAND()*<span class="number">100</span> as <span class="type">int</span>) as <span class="type">string</span>),</span><br><span class="line"><span class="string">`cluster_nopool`</span>,</span><br><span class="line">TUMBLE(tmp_time, INTERVAL <span class="string">'1'</span> MINUTE)</span><br><span class="line">)</span><br><span class="line">group by split_index(domain_rand,<span class="string">'_'</span>,<span class="number">0</span>),</span><br><span class="line"><span class="string">`cluster_nopool`</span>,</span><br><span class="line">window_start,</span><br><span class="line">window_end;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"></span><br><span class="line">-- Create the output table that stores the TP aggregation results</span><br><span class="line">CREATE TABLE tp_metric (</span><br><span class="line">domain STRING,</span><br><span class="line">tp90 FLOAT,</span><br><span class="line">tp95 FLOAT,</span><br><span class="line">tp99 FLOAT,</span><br><span class="line">tp999 FLOAT,</span><br><span class="line">window_start TIMESTAMP(<span class="number">3</span>),</span><br><span class="line">window_end TIMESTAMP(<span class="number">3</span>)</span><br><span class="line">) WITH (</span><br><span 
class="string">'connector'</span> = <span class="string">'jdbc'</span>,</span><br><span class="line">);</span><br><span class="line"></span><br><span class="line">-- tpView is not defined in this post; judging by the description below, it is presumably the per-window pre-aggregation of (domain, urtime, count as cnt)</span><br><span class="line">INSERT INTO tp_metric</span><br><span class="line">SELECT</span><br><span class="line"><span class="string">`domain`</span>,</span><br><span class="line">UdafTP(urtime,cnt,<span class="number">90</span>) AS tp90,</span><br><span class="line">UdafTP(urtime,cnt,<span class="number">95</span>) AS tp95,</span><br><span class="line">UdafTP(urtime,cnt,<span class="number">99</span>) AS tp99,</span><br><span class="line">UdafTP(urtime,cnt,<span class="number">999</span>) AS tp999,</span><br><span class="line">window_start,</span><br><span class="line">window_end</span><br><span class="line">FROM</span><br><span class="line">tpView</span><br><span class="line">GROUP BY window_start, window_end, <span class="string">`domain`</span>;</span><br></pre></td></tr></table></figure><h2 id="udaf函数">UDAF function</h2><p>We need TP (top percentile) statistics, but Flink has no built-in function for them, so a custom TP function is required.<br>Approach 1: collect all records, sort them, and take the value at the 99% position as TP99. Rejected: in practice the job went down almost as soon as it started, because far too much data was loaded into a single map for sorting.<br>Reference article: <a href="https://zhuanlan.zhihu.com/p/228518056">flink实战-使用自定义聚合函数统计网站TP指标</a> (Flink in practice: computing website TP metrics with a custom aggregate function)</p><p>The approach above was then optimized.<br>Approach 2: aggregate the data twice. First aggregate by urtime and domain to obtain each urtime value and its count, then sort by urtime to compute TP99.<br>Compared with approach 1, the amount of data held per window drops dramatically.<br>The main logic of the percentile calculation is as follows:</p><figure class="highlight java"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> java.util.Map;</span><br><span class="line"><span class="keyword">import</span> java.util.Objects;</span><br><span class="line"><span class="keyword">import</span> java.util.TreeMap;</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> org.apache.flink.table.functions.AggregateFunction;</span><br><span class="line"></span><br><span class="line"><span class="comment">// TPAccum is not shown in this post; it is assumed to expose a Map<Float, Long> field `map` and an Integer field `tp`.</span></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title class_">UdafTP</span> <span class="keyword">extends</span> <span class="title class_">AggregateFunction</span><Float, TPAccum> {</span><br><span class="line">    <span class="meta">@Override</span></span><br><span class="line">    <span class="keyword">public</span> TPAccum <span class="title function_">createAccumulator</span><span class="params">()</span> {</span><br><span class="line">        <span class="keyword">return</span> <span class="keyword">new</span> <span class="title class_">TPAccum</span>();</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    <span class="meta">@Override</span></span><br><span class="line">    <span class="keyword">public</span> Float <span class="title function_">getValue</span><span class="params">(TPAccum acc)</span> {</span><br><span class="line">        <span class="comment">// Return 0 if nothing has been accumulated</span></span><br><span class="line">        <span class="keyword">if</span> (acc.map.isEmpty()) {</span><br><span class="line">            <span class="keyword">return</span> <span class="number">0f</span>;</span><br><span class="line">        }</span><br><span class="line">        <span class="comment">// Sort the (urtime -> count) pairs by urtime</span></span><br><span class="line">        Map<Float, Long> sorted = <span class="keyword">new</span> <span class="title class_">TreeMap</span><>(acc.map);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Total number of records in this group (long, to avoid int overflow)</span></span><br><span class="line">        <span class="type">long</span> <span class="variable">totalCount</span> <span class="operator">=</span> <span class="number">0</span>;</span><br><span class="line">        <span class="keyword">for</span> (Long count : sorted.values()) {</span><br><span class="line">            totalCount += count;</span><br><span class="line">        }</span><br><span class="line">        <span class="comment">// tp is 90, 95 or 99; the special value 999 means the 99.9th percentile</span></span><br><span class="line">        <span class="type">double</span> <span class="variable">ratio</span> <span class="operator">=</span> (acc.tp == <span class="number">999</span>) ? <span class="number">0.999</span> : (<span class="type">double</span>) acc.tp / <span class="number">100</span>;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Rank of the record that sits at the requested percentile</span></span><br><span class="line">        <span class="type">long</span> <span class="variable">targetCount</span> <span class="operator">=</span> (<span class="type">long</span>) Math.ceil(totalCount * ratio);</span><br><span class="line"></span><br><span class="line">        <span class="type">long</span> <span class="variable">currentCount</span> <span class="operator">=</span> <span class="number">0</span>;</span><br><span class="line">        <span class="type">Float</span> <span class="variable">responseTime</span> <span class="operator">=</span> <span class="number">0f</span>;</span><br><span class="line">        <span class="comment">// Walk the sorted map until the cumulative count reaches the target rank</span></span><br><span class="line">        <span class="keyword">for</span> (Map.Entry<Float, Long> entry : sorted.entrySet()) {</span><br><span class="line">            currentCount += entry.getValue();</span><br><span class="line">            <span class="keyword">if</span> (currentCount >= targetCount) {</span><br><span class="line">                responseTime = entry.getKey();</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            }</span><br><span class="line">        }</span><br><span class="line">        <span class="keyword">return</span> responseTime;</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    <span class="keyword">public</span> <span class="keyword">void</span> <span class="title function_">accumulate</span><span class="params">(TPAccum acc, Float iValue, Long cnt, Integer tp)</span> {</span><br><span class="line">        acc.tp = tp;</span><br><span class="line">        <span class="comment">// Treat a null response time as 0</span></span><br><span class="line">        <span class="type">Float</span> <span class="variable">key</span> <span class="operator">=</span> Objects.isNull(iValue) ? Float.valueOf(<span class="number">0f</span>) : iValue;</span><br><span class="line">        <span class="comment">// Add to any existing count for the same key instead of overwriting it</span></span><br><span class="line">        acc.map.merge(key, cnt, Long::sum);</span><br><span class="line">    }</span><br><span class="line">}</span><br></pre></td></tr></table></figure><h2 id="数据倾斜问题处理">Handling data skew</h2><p>The data was severely skewed: of 20 threads, some received about 1 GB while the largest received 30 TB, a difference of roughly 30,000x. The overloaded threads caused a consumption backlog on the topic.<br><img src="/FlinkNg/3.png" alt="image-20230702200907879"><br>The fix is to append a random number to the grouping key to scatter the data, compute the PV count of each scattered group separately, and then aggregate the scattered results again.<br>In short: scatter, aggregate, then aggregate again.</p><p>After the optimization the load is spread evenly and the message backlog is resolved.</p><h1>Comparison before and after the optimization</h1><table><thead><tr><th>Approach</th><th>Stability</th><th>Latency</th><th>Accuracy</th></tr></thead><tbody><tr><td>ES approach</td><td>Unstable: stability is inversely related to server load, and server load is affected by many external factors</td><td>Latency of <strong>4 minutes</strong>, metric granularity of 30s</td><td>Affected by external factors; risk of <strong>data loss</strong> and <strong>inaccuracy</strong></td></tr><tr><td>Flink approach</td><td><strong>More stable</strong>: an <strong>independent process</strong> doing real-time computation <strong>in memory</strong></td><td><strong>Lower latency</strong>: latency of 1 minute, metric granularity of 15s</td><td><strong>More accurate data</strong>: no risk of inaccurate data, because the CDC computation is stateful and based on event time (the timestamp in the Nginx log). <strong>(This excludes cases where Kafka is down or malfunctioning.)</strong></td></tr></tbody></table>]]></content>
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>An Nginx metrics monitoring solution based on Flink SQL</p>
</div>
<div class="note re</summary>
<category term="Grafana" scheme="http://misakiz.github.io/tags/Grafana/"/>
<category term="Flink" scheme="http://misakiz.github.io/tags/Flink/"/>
<category term="Nginx" scheme="http://misakiz.github.io/tags/Nginx/"/>
<category term="Prometheus" scheme="http://misakiz.github.io/tags/Prometheus/"/>
</entry>
<entry>
<title>Grafana: Dynamic Threshold Configuration</title>
<link href="http://misakiz.github.io/2023/10/16/Grafana-%E5%8A%A8%E6%80%81%E9%98%88%E5%80%BC%E9%85%8D%E7%BD%AE/"/>
<id>http://misakiz.github.io/2023/10/16/Grafana-%E5%8A%A8%E6%80%81%E9%98%88%E5%80%BC%E9%85%8D%E7%BD%AE/</id>
<published>2023-10-16T12:25:49.000Z</published>
<updated>2023-10-16T12:37:17.722Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Comparing a value against a threshold that is itself obtained dynamically from a query function</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I have not yet figured out how to configure comments in Hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h2 id="动态阈值方案">Dynamic threshold approach</h2><p>Requirement: build a dashboard panel like the one below, where a success rate above the SLA is shown in green and a success rate below the SLA is shown in red. In other words, each domain's success rate has to be compared with its own SLA.</p><p><img src="/grafana/5.png" alt="image-20230702200907879"></p><p>However, every domain has a different SLA, so the colors cannot be distinguished by setting a static threshold in the panel.</p><h2 id="解决">Solution</h2><p>The idea is to achieve the effect at the query (function) level. Reference: <a href="https://community.grafana.com/t/highlight-high-or-critical-values/51922/4">https://community.grafana.com/t/highlight-high-or-critical-values/51922/</a></p><p><img src="/grafana/6.png" alt="image-20230702200907879"></p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">Assume the query result is C</span><br><span class="line">C and C > sla: color the series green</span><br><span class="line"></span><br><span class="line">C and C < sla: color the series red</span><br><span class="line"></span><br><span class="line">For any given series only one of the two queries above is true, which produces the dashboard effect.</span><br></pre></td></tr></table></figure><p><img src="/grafana/7.png" alt="image-20230702200907879"></p>]]></content>
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Comparing a value against a threshold that is itself obtained dynamically from a query function</p>
</div>
<div class="note red icon-</summary>
<category term="Grafana" scheme="http://misakiz.github.io/tags/Grafana/"/>
</entry>
<entry>
<title>"GoogleSRE-alerting-on-slos": Analysis and Practice</title>
<link href="http://misakiz.github.io/2023/07/28/%E3%80%8AGoogleSRE-alerting-on-slos%E3%80%8B-%E5%88%86%E6%9E%90%E4%B8%8E%E5%AE%9E%E8%B7%B5/"/>
<id>http://misakiz.github.io/2023/07/28/%E3%80%8AGoogleSRE-alerting-on-slos%E3%80%8B-%E5%88%86%E6%9E%90%E4%B8%8E%E5%AE%9E%E8%B7%B5/</id>
<published>2023-07-28T08:35:02.000Z</published>
<updated>2023-08-07T12:00:46.677Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>My reading and analysis of "GoogleSRE-alerting-on-slos": essentially a translation plus my own understanding. Some of the concepts in it are really abstract.</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I have not yet figured out how to configure comments in Hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h1>Alerting Considerations</h1><p>First, some terminology.</p><p><a href="https://zhuanlan.zhihu.com/p/93107394">一文看懂机器学习指标:准确率、精准率、召回率、F1、ROC曲线、AUC曲线</a> (an overview of machine learning metrics: accuracy, precision, recall, F1, ROC curve, AUC)</p><p><em><strong>Precision</strong></em></p><p>The proportion of detected events that were significant. If every alert corresponds to a significant event, precision is 100%.</p><p>Pay attention to low-traffic situations. For <strong>example</strong>, suppose the daily SLO target for request success rate is 99.99% and you receive only 1,000 requests a day. A single failed request then makes that day miss the target, since 1/1000 > (1 - SLO), so the alert becomes extremely sensitive.</p><p><em><strong>Recall</strong></em></p><p>The proportion of significant events that were detected. If every significant event results in an alert, recall is 100%. (Pushing for recall alone tends to raise the false alarm rate, that is, more noise.)</p><p><em><strong>Detection time</strong></em></p><p>How long it takes for the alert to fire once the alert condition is met.</p><p><em><strong>Reset time</strong></em></p><p>How long the alert keeps firing after the problem has been resolved, and therefore how long before a new alert can be sent.</p><p>For example, suppose your error rate detection window is 2h, that is, ErrorRate(2h). Errors spike during the first minute and trigger the alert.</p><p>Because the detection window is 2h, the alert fires only once within those 2h. If a second spike of errors happens at minute 60, no new alert is sent, because the alert is still firing within the same 2h window.</p><h1>Ways to Alert on Significant Events</h1><p>"<strong>Error budget</strong>" and "error rate" apply to all SLIs, not just those with "<strong>error</strong>" in their name. The section "What to Measure: Using SLIs" recommends SLIs that capture the ratio of good events to total events. The <strong>error budget</strong> is the number of <strong>bad events</strong> allowed, and the <strong>error rate</strong> is the ratio of bad events to total events.</p><p>A practical example:</p><p>SLI: HTTP request error rate (ErrorRate)</p><p>Request error rate = number of failed requests (4xx + 5xx) / total requests</p><p>The 30-day SLO is set to 99.99%.</p><h2 id="Target-Error-Rate-≥-SLO-Threshold">Target Error Rate ≥ SLO Threshold</h2><p>We pick a time window, say 10 minutes, and fire an alert if ErrorRate[10min] >= (1 - 99.99%).</p><p>The recording rule that computes this error rate is:</p><figure class="highlight go"><table><tr><td class="code"><pre><span class="line">record: job:slo_errors_per_request:ratio_rate10m</span><br><span class="line">expr:</span><br><span class="line">  sum(rate(slo_errors[<span class="number">10</span>m])) by (job)</span><br><span class="line">    /</span><br><span class="line">  sum(rate(slo_requests[<span class="number">10</span>m])) by (job)</span><br><span class="line"></span><br><span class="line"># If you don't export slo_errors and slo_requests from your job, you can create the time series by renaming a metric:</span><br><span class="line"></span><br><span class="line">record: slo_errors</span><br><span class="line">expr: http_errors</span><br></pre></td></tr></table></figure><p><img src="/googleSre/1.png" alt="image-20230702200907879"></p><p>Example: a class-A domain requires a daily request success rate of 99.99%, and the alerting window is 5m (alerting window size = 5m). With a reporting period of 1d (1440m), the alert can fire up to 288 times a day.</p><p><img src="/googleSre/2.png" alt="image-20230702200907879"></p><p>The figure below plots detection time as a function of error rate.</p><p>It has the shape f(x) = x^{-1} (x to the power of minus one), the graph of an odd function.</p><p><img src="/googleSre/12.png" alt="image-20230702200907879"></p><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>Good detection time: the alert fires as soon as the error rate over the 10-minute window reaches the SLO threshold (0.01%).</td><td>Very low precision: the alert fires often, and most of those alerts are meaningless noise. A window whose error rate only just reaches 0.01% is effectively a "false alarm" from the perspective of the long-term 30-day SLO.</td></tr><tr><td>High recall: any window that does not meet the SLO triggers an alert.</td><td>Suppose the daily request success rate ends up at exactly 99.99%. A single 5-minute window with ErrorRate >= 0.01% triggers the alert, yet over the whole day it contributes only about 0.000035% of error rate.<br />Over one day: 0.01% (target error rate) / 288 (number of 5-minute windows per day) = 0.00003472% (error budget per 5-minute window), that is, 1/288 (about 0.35%) of the daily error budget.</td></tr></tbody></table><h2 id="Increased-Alert-Window(slo99-9-为例)">Increased Alert Window (using a 99.9% SLO as the example)</h2><p>The approach above has very low precision. Enlarging the alerting window addresses the <strong>precision</strong> problem.</p><p>Example: only alert once an event has consumed 5% of the 30-day error budget (30d × 5% = 36h), which improves precision.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">- alert: HighErrorRate</span><br><span class="line">  expr: job:slo_errors_per_request:ratio_rate36h{job="myjob"} > 
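0.001</span><br></pre></td></tr></table></figure><p>The time needed to trigger the alert now becomes:</p><p><img src="/googleSre/2.png" alt="image-20230702200907879"></p><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>Good detection time: the alert fires once the errors within the 36h window have consumed 5% of the 30-day error budget.<br />With a complete outage (100% error rate), the alert fires after about 2 min 10 s (30d × 0.1% × 5%).</td><td>Very poor reset time: in the extreme case of a 100% error rate, the alert fires after 2 min 10 s but only recovers 36h later, so there is a 36h gap in which the alert detects nothing new.</td></tr><tr><td>More precise than approach 1, because the error rate has to be sustained for longer.</td><td>At a 0.1% error rate it takes 36h to trigger the alert, and computing a rate over such a long window involves a large number of data points, so it is expensive.</td></tr></tbody></table><p><img src="/googleSre/3.png" alt="image-20230702200907879"></p><h2 id="Incrementing-Alert-Duration">Incrementing Alert Duration</h2><p>Add a duration to the alert: the alert does not fire unless the value stays above the threshold for a sustained period.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">- alert: HighErrorRate</span><br><span class="line">  expr: job:slo_errors_per_request:ratio_rate1m{job="myjob"} > 0.001</span><br><span class="line">  for: 10m</span><br></pre></td></tr></table></figure><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>Higher precision: requiring a sustained error rate before firing means the alert is more likely to correspond to a significant event.</td><td>Poor recall and detection time: whether the error rate is an extreme 100% or an ordinary 1%, the alert fires only after 10 minutes, so sensitivity does not depend on the error rate. Also, if the average error rate is extremely high for the first 5 of the 10 minutes and consumes a large share of the error budget, but the service recovers at minute 5, those errors are never reported.</td></tr></tbody></table><p><img src="/googleSre/4.png" alt="image-20230702200907879"></p><h2 id="引入BrunRate和ErrorBudget两个概念">Introducing two concepts: BurnRate and ErrorBudget</h2><h3 id="BurnRate">BurnRate</h3><p>To improve on the previous approaches, we want an alerting strategy with both good detection time and high precision. To that end we introduce BurnRate (the speed at which the error budget is consumed), which lets us shrink the alerting window while keeping the error budget spend the same.</p><p>An analogy: think of the error budget as distance and BurnRate as speed. If the error budget (distance) stays the same and the time shrinks, the speed (BurnRate) goes up: S = vt.</p><p>Question: why can BurnRate be used as the speed at which errors are consumed? Aren't the error budget of each period and the rate at which it is consumed constant?</p><p><strong>The SLO is an aggregation of the SLI: roughly, summing the SLI over the month, like differentiating the SLI and then integrating it.</strong></p><p><strong>The figure below shows the count of abnormal status codes. The x-axis is time and the y-axis is the total count of abnormal status codes (analogous to distance). Once the time range is stretched out, the slope is approximately constant.</strong></p><p><img src="/googleSre/5.png" alt="image-20230702200907879"></p><h3 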
id="ErrorBudget">ErrorBudget</h3><p>A reference implementation of Error Budget:</p><p>1. Compute the error budget for the past 30 days</p><p>Take success rate as an example: a class-A domain requires a 30-day success rate of at least 99.99%.</p><p>A 30-day request success rate target of 99.99% corresponds to an Error Budget of 259.2 s.</p><p>Error Budget = resource × (1 - SLO)</p><table><thead><tr><th style="text-align:left">SLO</th><th style="text-align:left">Error budget (30d)</th><th style="text-align:left">Error budget (1d)</th><th style="text-align:left">Error budget (5% of the 1d window)</th></tr></thead><tbody><tr><td style="text-align:left">99.99%</td><td style="text-align:left">30d × 0.01% = 259.2 s</td><td style="text-align:left">1d × 0.01% = 8.64 s</td><td style="text-align:left">1d × 0.01% × 5% = 0.432 s</td></tr></tbody></table><h2 id="Alert-on-Burn-Rate">Alert on Burn Rate</h2><p>The figure below shows the relationship between burn rate and error budget.</p><p><img src="/googleSre/6.png" alt="image-20230702200907879"></p><p>For example, with a 99.9% SLO, an error rate of 0.1% corresponds to a burn rate of 1, and an error rate of 10% corresponds to a burn rate of 100. The table below shows how long it takes to exhaust the monthly error budget of a service with a 99.9% availability SLO at different burn rates.</p><p><img src="/googleSre/7.png" alt="image-20230702200907879"></p><p>We set the alerting window to 1h and fire an alert once 5% of the monthly error budget has been consumed.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">- alert: HighErrorRate</span><br><span class="line">  expr: job:slo_errors_per_request:ratio_rate1h{job="myjob"} > 36 * 0.001</span><br></pre></td></tr></table></figure><p>The corresponding burn rate is (30 d × 24 h × 60 min × 5%) / 60 min = 36.</p><p>The time needed to trigger the alert is:</p><p><img src="/googleSre/8.png" alt="image-20230702200907879"></p><p>From the formula:</p><ul><li>When the burn rate is 36, the alert fires after 1 hour.</li><li>When the service is completely unavailable, the burn rate is 1000 and the alert fires after 2 min 10 s.</li></ul><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>High precision: the alert fires only once a significant portion of the error budget has been spent.</td><td>Low recall: if the burn rate always stays below 36, the alert never fires. With a BurnRate of 35, for instance, the entire 30-day error budget is consumed after 20.5 hours (30d / 35 × 24h).</td></tr><tr><td>Good detection time</td><td>When the service is completely unavailable, it still takes 2 min 10 s to trigger the alert</td></tr><tr><td>Shorter reset time than before</td><td>Reset time: 58 minutes is still too long</td></tr></tbody></table><h2 id="Multiple-Burn-Rate-Alerts">Multiple Burn Rate Alerts</h2><p>Use multiple burn rates and time windows, and alert when a burn rate exceeds its threshold. This keeps the advantage of the "Alert on Burn Rate" approach (high precision) while making sure lower (but still significant) error rates are not overlooked, which also <strong>improves recall</strong>.</p><p>Google SRE recommends 2% budget consumption in one hour and 5% budget consumption in six hours as reasonable starting numbers for paging, and 10% budget consumption in three days as a good baseline for ticket alerts. The appropriate numbers depend on the service and on the baseline paging load.</p><ul><li>2% of the monthly budget consumed within 1 hour: page</li><li>5% of the monthly error budget consumed within 6 hours: page</li><li>10% of the monthly budget consumed within 3 days: open a ticket</li></ul><p>The table below shows the percentage of SLO budget consumed together with the corresponding burn rates and time windows.</p><p><img src="/googleSre/9.png" alt="image-20230702200907879"></p><p>The alerting rules become:</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">expr: (</span><br><span class="line"> job:slo_errors_per_request:ratio_rate1h{job="myjob"} > (14.4*0.001)</span><br><span class="line"> or</span><br><span class="line"> job:slo_errors_per_request:ratio_rate6h{job="myjob"} > (6*0.001)</span><br><span class="line"> )</span><br><span class="line">severity: page</span><br><span class="line"></span><br><span class="line">expr: job:slo_errors_per_request:ratio_rate3d{job="myjob"} > 0.001</span><br><span class="line">severity: ticket</span><br></pre></td></tr></table></figure><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>The monitoring configuration can be adapted to many situations according to criticality: alert quickly when the error rate is high, and alert eventually when the error rate is low but sustained.</td><td>More window sizes and thresholds to manage.</td></tr><tr><td>High precision</td><td>Long reset time: because of the rule for 10% of the budget in 3 days, a complete outage triggers the alert after 4.3 minutes, but the alert only recovers 3 days later.</td></tr><tr><td>Recall is also good (the rule for 10% of the budget in 3 days catches most cases where errorRate > (1 - SLO) really holds).</td><td>To avoid firing multiple alerts when all conditions are true, you need to implement alert suppression. For example, spending 10% of the budget in 5 minutes also means that 5% of the budget was spent in 6 hours and 2% of the budget was spent in 1 hour. This scenario triggers three notifications unless the monitoring system is smart enough to prevent it.</td></tr><tr><td>Different alerts can be configured for different severities of error rate</td><td></td></tr></tbody></table><h2 id="Multiwindow-Multi-Burn-Rate-Alerts">Multiwindow, Multi-Burn-Rate Alerts</h2><p>This approach enhances approach 5 (Multiple Burn Rate Alerts) with an additional window. We add one more parameter: a shorter window that checks whether the error budget is still being consumed when the alert fires.</p><p>Google recommends making the short window 1/12 the duration of the long window.</p><p><img src="/googleSre/10.png" alt="image-20230702200907879"></p><p>Understanding the role of the short window:</p><p>Take the one-hour alerting window as an example. If a burst of errors has already triggered the alert at minute 5, the alert keeps firing for the remaining 55 minutes even if the errors have stopped. This strategy therefore adds a second window, so that both firing and recovery check whether the error budget is still being consumed at the threshold burn rate.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">expr: (</span><br><span class="line"> job:slo_errors_per_request:ratio_rate1h{job="myjob"} > (14.4*0.001)</span><br><span class="line"> and</span><br><span class="line"> job:slo_errors_per_request:ratio_rate5m{job="myjob"} > (14.4*0.001)</span><br><span class="line"> )</span><br><span class="line"> or</span><br><span class="line"> (</span><br><span class="line"> job:slo_errors_per_request:ratio_rate6h{job="myjob"} > (6*0.001)</span><br><span class="line"> and</span><br><span class="line"> job:slo_errors_per_request:ratio_rate30m{job="myjob"} > (6*0.001)</span><br><span class="line"> )</span><br><span class="line">severity: 
page</span><br><span class="line"></span><br><span class="line">expr: (</span><br><span class="line"> job:slo_errors_per_request:ratio_rate24h{job="myjob"} > (3*0.001)</span><br><span class="line"> and</span><br><span class="line"> job:slo_errors_per_request:ratio_rate2h{job="myjob"} > (3*0.001)</span><br><span class="line"> )</span><br><span class="line"> or</span><br><span class="line"> (</span><br><span class="line"> job:slo_errors_per_request:ratio_rate3d{job="myjob"} > 0.001</span><br><span class="line"> and</span><br><span class="line"> job:slo_errors_per_request:ratio_rate6h{job="myjob"} > 0.001</span><br><span class="line"> )</span><br><span class="line">severity: ticket</span><br></pre></td></tr></table></figure><p><img src="/googleSre/11.png" alt="image-20230702200907879"></p><p>1. Suppose the error rate is 10%, so the burn rate is 100. This exceeds the burn rate threshold of 14.4 for the 1h window (which corresponds to consuming 2% of the monthly error budget). The short-window condition is already met after 43 seconds (14.4 * 5 / 100 * 60s = 43s). <strong>In short, at a 10% error rate the short-window condition is satisfied after 43s</strong></p><p>2. Once the 10% error rate has lasted 8.64 minutes, the long-window threshold is reached. This follows from budget to consume / actual burn rate (30d * 24h * 60m * 2% / 100 = 8.64m), or equivalently (1 - SLO) / actual errorRate × alert window × burn rate threshold = 8.64 minutes. <strong>In short, with a sustained 10% error rate the alert fires after 8.64m</strong></p><p>3. 5 minutes after the errors stop, the average burn rate over the short window drops below 14.4, the condition no longer holds, and the alert resets</p><p>4. 51.4 minutes after the errors stop, the average burn rate over the long window drops below 14.4 and that condition no longer holds either</p><table><thead><tr><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>Compared with the approaches above, the reset time (time for the alert to recover) is reduced</td><td>There are many parameters to specify, which can make the alerting rules hard to manage</td></tr><tr><td>High precision</td><td></td></tr></tbody></table><h2 id="参考文章">References</h2><p><a href="https://sre.google/workbook/alerting-on-slos/#detection-time">GoogleSRE-Chapter 5 - Alerting on SLOs</a></p><p><a href="https://mp.weixin.qq.com/s/lhyHlZB6fpukV0YzICz3IQ">没有SLO就没有SRE?来看看B站SRE对SLO的实践总结(下)</a></p>]]></content>
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>My reading and analysis of 《GoogleSRE-alerting-on-slos》; essentially a translation plus my own understanding,</summary>
<category term="Grafana" scheme="http://misakiz.github.io/tags/Grafana/"/>
<category term="SRE" scheme="http://misakiz.github.io/tags/SRE/"/>
<category term="Google" scheme="http://misakiz.github.io/tags/Google/"/>
<category term="Alert" scheme="http://misakiz.github.io/tags/Alert/"/>
</entry>
<entry>
<title>Analysis of a Database Deadlock Record</title>
<link href="http://misakiz.github.io/2023/07/04/%E6%95%B0%E6%8D%AE%E5%BA%93%E6%AD%BB%E9%94%81%E8%AE%B0%E5%BD%95%E5%88%86%E6%9E%90/"/>
<id>http://misakiz.github.io/2023/07/04/%E6%95%B0%E6%8D%AE%E5%BA%93%E6%AD%BB%E9%94%81%E8%AE%B0%E5%BD%95%E5%88%86%E6%9E%90/</id>
<published>2023-07-04T13:42:45.000Z</published>
<updated>2023-08-01T10:57:59.288Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Notes on analyzing a MySQL deadlock log</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I haven't yet figured out how to configure comments in hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h2 id="数据库死锁记录分析">Analysis of a Database Deadlock Record</h2><p>Locating the error</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// The backend service's daily scheduled task failed</span></span><br><span class="line"></span><br><span class="line"><span class="number">624</span> <span class="number">2023</span><span class="number">-07</span><span class="number">-04</span> <span class="number">04</span>:<span class="number">30</span>:<span class="number">00.099</span> <span class="number">2023</span><span class="number">-07</span><span class="number">-04</span> <span class="number">04</span>:<span class="number">30</span>:<span class="number">00.099</span> xxxxxx <span class="number">2023</span><span class="number">-07</span><span class="number">-04</span> <span class="number">04</span>:<span class="number">30</span>:<span class="number">00.098</span> <span class="number">0</span></span><br><span class="line"><span class="comment">// Check the corresponding job log</span></span><br><span class="line"><span class="number">2023</span>/<span class="number">07</span>/<span class="number">04</span> - <span class="number">04</span>:<span class="number">30</span>:<span class="number">14.064</span> Error <span class="number">1213</span>: Deadlock found when trying to get lock; try restarting transaction</span><br></pre></td></tr></table></figure><h3 id="show-engine-innodb-status获取INNODB引擎当前信息">Use show engine innodb status to get the current InnoDB engine state</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line">=====================================</span><br><span class="line">2023-07-04 08:20:10 0x7f957c0a1700 INNODB MONITOR 
OUTPUT</span><br><span class="line">=====================================</span><br><span class="line">Per second averages calculated from the last 31 seconds</span><br><span class="line">-----------------</span><br><span class="line">BACKGROUND THREAD</span><br><span class="line">-----------------</span><br><span class="line">srv_master_thread loops: 1122 srv_active, 0 srv_shutdown, 1287352 srv_idle</span><br><span class="line">srv_master_thread log flush and writes: 1288474</span><br><span class="line">----------</span><br><span class="line">SEMAPHORES</span><br><span class="line">----------</span><br><span class="line">OS WAIT ARRAY INFO: reservation count 2203</span><br><span class="line">OS WAIT ARRAY INFO: signal count 2362</span><br><span class="line">RW-shared spins 0, rounds 2596, OS waits 766</span><br><span class="line">RW-excl spins 0, rounds 9755, OS waits 153</span><br><span class="line">RW-sx spins 82, rounds 1774, OS waits 33</span><br><span class="line">Spin rounds per wait: 2596.00 RW-shared, 9755.00 RW-excl, 21.63 RW-sx</span><br><span class="line">------------------------</span><br><span class="line">LATEST DETECTED DEADLOCK</span><br><span class="line">------------------------</span><br><span class="line">2023-07-03 20:30:14 0x7f957c0e3700</span><br><span class="line">*** (1) TRANSACTION:</span><br><span class="line">TRANSACTION 87180, ACTIVE 0 sec starting index read</span><br><span class="line">mysql tables in use 2, locked 2</span><br><span class="line">LOCK WAIT 136 lock struct(s), heap size 24784, 3222 row lock(s)</span><br><span class="line">MySQL thread id 2118, OS thread handle 140280007620352, query id 463874 172.17.0.1 root Sending data</span><br><span class="line">UPDATE daily_report_domain_cluster dc JOIN daily_report dr on dr.domain=dc.domain AND dr.date=dc.date set dc.percent = ROUND((dc.status_total*100)/dr.status_total ,4) where dc.date = ?</span><br><span class="line">*** (1) WAITING FOR THIS LOCK TO BE GRANTED:</span><br><span 
class="line">RECORD LOCKS space id 70 page no 39 n bits 256 index idx_domain_date of table `gva`.`daily_report` trx id 87180 lock mode S waiting</span><br><span class="line">Record lock, heap no 182 PHYSICAL RECORD: n_fields 3; compact format; info bits 0</span><br><span class="line"> 0: len 19; hex 6465706c6f792e7a6875616e696e632e636f6d; asc test.xxx.com;;</span><br><span class="line"> 1: len 10; hex 323032332d30372d3033; asc 2023-07-03;;</span><br><span class="line"> 2: len 8; hex 000000000000322c; asc 2,;;</span><br><span class="line"></span><br><span class="line">*** (2) TRANSACTION:</span><br><span class="line">TRANSACTION 87179, ACTIVE 0 sec inserting</span><br><span class="line">mysql tables in use 1, locked 1</span><br><span class="line">3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 59</span><br><span class="line">MySQL thread id 2120, OS thread handle 140280008161024, query id 464030 172.17.0.1 root update</span><br><span class="line">INSERT INTO `xxx_table` VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)</span><br><span class="line">*** (2) HOLDS THE LOCK(S):</span><br><span class="line">RECORD LOCKS space id 70 page no 39 n bits 256 index idx_domain_date of table `gva`.`daily_report` trx id 87179 lock_mode X locks rec but not gap</span><br><span class="line">Record lock, heap no 182 PHYSICAL RECORD: n_fields 3; compact format; info bits 0</span><br><span class="line"> 0: len 19; hex 6465706c6f792e7a6875616e696e632e636f6d; asc test.xxx.com;;</span><br><span class="line"> 1: len 10; hex 323032332d30372d3033; asc 2023-07-03;;</span><br><span class="line"> 2: len 8; hex 000000000000322c; asc 2,;;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">*** (2) WAITING FOR THIS LOCK TO BE GRANTED:</span><br><span class="line">RECORD LOCKS space id 70 page no 214 n bits 272 index idx_domain_date of table `gva`.`daily_report` trx id 87179 lock_mode X locks gap before rec insert intention 
waiting</span><br><span class="line">Record lock, heap no 64 PHYSICAL RECORD: n_fields 3; compact format; info bits 0</span><br><span class="line"> 0: len 18; hex 63732e7a6875616e7370697269742e636f6d; asc sandbox.xxx.com;;</span><br><span class="line"> 1: len 10; hex 323032332d30362d3038; asc 2023-06-08;;</span><br><span class="line"> 2: len 8; hex 0000000000000155; asc U;;</span><br><span class="line"></span><br><span class="line">*** WE ROLL BACK TRANSACTION (2)</span><br></pre></td></tr></table></figure><p>This output records the most recent deadlock.</p><p>Its timestamp is 2023-07-03 20:30:14. Allowing for the 8-hour offset between the database clock and the UTC+8 time in the job log, this is exactly 2023-07-04 04:30:14, the moment the cronjob failed.</p><p>Let's analyze the log above.</p><p>1. Transaction id 87180, the <strong>first transaction</strong>, is in <strong>WAITING FOR THIS LOCK TO BE GRANTED</strong>: it is <strong>waiting</strong> for the <strong>record lock on the (<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) record</strong> of the daily_report table (lock mode S).</p><p>2. Transaction id 87179, the <strong>second transaction</strong>, is in HOLDS THE LOCK(S): it <strong>holds the (<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) record</strong> (lock mode X).</p><p><strong>index idx_domain_date of table <code>gva</code>.<code>daily_report</code> trx id 87179 lock_mode X locks gap before rec insert intention waiting</strong>: it is also <strong>waiting for an insert intention lock on the gap before another record in the same table</strong> (lock mode X locks gap before rec insert intention waiting) in order to insert a new row.</p><p><strong>It is waiting to acquire the X insert intention lock on the gap before the (<a href="http://sandbox.xxx.com">sandbox.xxx.com</a>,2023-06-08) record</strong></p><h3 id="查看daily-report表的索引">Check the indexes of the daily_report table</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">There is a composite index on (domain,date)</span><br><span class="line">SHOW INDEX FROM daily_report;</span><br><span class="line">daily_report 1 idx_domain_date 1 domain A 733 YES BTREE</span><br><span class="line">daily_report 1 idx_domain_date 2 date A 196686 YES BTREE</span><br></pre></td></tr></table></figure><h3 
id="执行顺序分析">Execution order analysis</h3><table><thead><tr><th>Step</th><th>Transaction 1</th><th>Transaction 2</th><th>Notes</th></tr></thead><tbody><tr><td>1</td><td>begin</td><td></td><td></td></tr><tr><td>2</td><td></td><td>begin</td><td></td></tr><tr><td>3</td><td></td><td>04:30:08.86<br />INSERT INTO <code>daily_report</code> the <strong>(<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) record</strong></td><td>Transaction 2 takes an X record lock on the <strong>(<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) record</strong> of the <strong>daily_report</strong> table</td></tr><tr><td>4</td><td>UPDATE daily_report_domain_cluster dc JOIN daily_report dr on dr.domain=dc.domain AND dr.date=dc.date set dc.percent = ROUND((dc.status_total*100)/dr.status_total ,4) where dc.date = ? This reads the daily_report records <strong>(<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) and (<a href="http://sandbox.xxx.com">sandbox.xxx.com</a>,2023-06-08)</strong></td><td></td><td><br />Because the join condition is dr.domain=dc.domain AND dr.date=dc.date, the statement uses the (domain, date) composite index of daily_report<br /><strong>Transaction 1</strong> wants to take an S record lock on the <strong>(<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03)</strong> record of <strong>daily_report</strong>, but finds that transaction 2 already holds an X lock on it, so it waits for the lock to be released<br />At the same time, <strong>transaction 1</strong> places gap locks on the two gaps domain<'xxx' and domain>'xxxx' to prevent other transactions from inserting new rows into them; that is, it holds gap locks</td></tr><tr><td>5</td><td></td><td>INSERT INTO <code>daily_report</code> (a further row)</td><td><strong>Transaction 2</strong> requests an X insert intention lock on the gap before the <strong>(<a href="http://sandbox.xxx.com">sandbox.xxx.com</a>,2023-06-08) record</strong> of the <strong>daily_report</strong> table, but finds that <strong>transaction 1</strong> has already placed gap locks on domain<'xxx' and domain>'xxxx'</td></tr></tbody></table><p><strong>Transaction 2</strong> holds the X lock on the (<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) record; <strong>transaction 1</strong> requests the S lock on (<a href="http://test.xxx.com">test.xxx.com</a>,2023-07-03) and blocks.<br><strong>Transaction 1</strong> holds gap locks on the two gaps domain<'xxx' and domain>'xxxx'; <strong>transaction 2</strong> requests an insert into the gap before the (<a href="http://sandbox.xxx.com">sandbox.xxx.com</a>,2023-06-08) record, finds the gap locked, and blocks waiting for transaction 1 to release its gap lock. Each transaction waits for the other, which is the deadlock.</p><h3 
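id="解决">Solution</h3><p>Run the update statements of the two scheduled jobs synchronously, one after the other, so that they no longer execute concurrently.</p>]]></content>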
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Notes on analyzing a MySQL deadlock log</p>
</div>
<div class="note red icon-pa</summary>
<category term="Mysql" scheme="http://misakiz.github.io/tags/Mysql/"/>
</entry>
<entry>
<title>Grafana Configuration (Part 1)</title>
<link href="http://misakiz.github.io/2023/07/02/Grafana%E9%85%8D%E7%BD%AE/"/>
<id>http://misakiz.github.io/2023/07/02/Grafana%E9%85%8D%E7%BD%AE/</id>
<published>2023-07-02T11:09:45.000Z</published>
<updated>2023-07-02T12:43:29.709Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Notes on some Grafana usage and configuration tips</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I haven't yet figured out how to configure comments in hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h2 id="lable改名">Renaming labels</h2><p>The original label has the form zqa_xx{domain="domainName"}, which carries redundant information, so it needs to be converted to just domainName.</p><h3 id="Transform">Transform</h3><p>Grafana official documentation: <a href="https://grafana.com/docs/grafana/latest/panels-visualizations/query-transform-data/transform-data/">https://grafana.com/docs/grafana/latest/panels-visualizations/query-transform-data/transform-data/</a></p><h4 id="Rename-by-regex">Rename by regex</h4><p>As shown below, a regular expression can be used to match the label name and rewrite it.</p><p><img src="/grafana/1.png" alt="image-20230702200907879"></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Original label: {domain=<span class="string">"baidu.com"</span>}</span><br><span class="line">New label: baidu.com</span><br></pre></td></tr></table></figure><p>There may be several kinds of label, for example when the result of a PromQL function query is named <strong>(sum((sic_exporter_xxxx{domain=~"(baidu</strong></p><p>In that case you can likewise configure several Rename by regex transformations to match each label that needs changing.</p><p>Here is an example:</p><p><img src="/grafana/2.png" alt="image-20230702200907873"></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Original label: (<span class="built_in">sum</span>((sic_exporter_xxxx{domain=~"(baidu</span><br><span class="line">New label: xxxxxxx</span><br></pre></td></tr></table></figure><h4 id="Labels-to-fields">Labels to fields</h4><p>This transformation converts the series name based on the value of the chosen label.</p><p>As shown below:</p><p><img src="/grafana/3.png" alt="image-202307022009078379"></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td 
class="code"><pre><span class="line">Original label: {domain=<span class="string">"baidu.com"</span>}</span><br><span class="line">New label: baidu.com</span><br></pre></td></tr></table></figure><p>To sum up: Rename by regex offers more freedom, while Labels to fields is easier to configure.</p><h3 id="Overrides">Overrides</h3><p><img src="/grafana/4.png" alt="image-202307022009078373"></p><p>The second approach is to override the field.</p><p>This only suits renaming labels in bulk, for example matching the names that meet a regex and changing them to a target name,</p><p>or changing one specific label. <strong>It extends very poorly</strong>.</p>]]></content>
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Notes on some Grafana usage and configuration tips</p>
</div>
<div class="note red i</summary>
<category term="Grafana" scheme="http://misakiz.github.io/tags/Grafana/"/>
</entry>
<entry>
<title>Analysis of Whether the k8s Virtual Network Conflicts with the Outer Network</title>
<link href="http://misakiz.github.io/2023/06/28/k8s%E8%99%9A%E6%8B%9F%E7%BD%91%E7%BB%9C%E6%98%AF%E5%90%A6%E4%BC%9A%E4%B8%8E%E5%A4%96%E5%B1%82%E7%BD%91%E7%BB%9C%E5%86%B2%E7%AA%81%E7%9A%84%E5%88%86%E6%9E%90/"/>
<id>http://misakiz.github.io/2023/06/28/k8s%E8%99%9A%E6%8B%9F%E7%BD%91%E7%BB%9C%E6%98%AF%E5%90%A6%E4%BC%9A%E4%B8%8E%E5%A4%96%E5%B1%82%E7%BD%91%E7%BB%9C%E5%86%B2%E7%AA%81%E7%9A%84%E5%88%86%E6%9E%90/</id>
<published>2023-06-28T12:16:40.000Z</published>
<updated>2024-01-10T08:42:35.852Z</updated>
<content type="html"><![CDATA[<div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>Analysis of whether the k8s virtual network conflicts with the outer network</p></div><div class="note red icon-padding flat"><i class="note-icon fas fa-fan"></i><p>I haven't yet figured out how to configure comments in hexo. If you need to reach me, please send me a private message on Bilibili. Bilibili ID: <strong>披着双马尾的大叔</strong></p></div><h3 id="查看集群默认启动网络策略">Check the network configuration the cluster starts with by default</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberYQ3 ~]<span class="comment"># k describe cm canal-config</span></span><br><span class="line">Name: canal-config</span><br><span class="line">Namespace: kube-system</span><br><span class="line">Labels: <none></span><br><span class="line">Annotations: <none></span><br><span class="line"></span><br><span class="line">Data</span><br><span class="line">====</span><br><span class="line">canal_iface:</span><br><span class="line">----</span><br><span class="line"></span><br><span class="line">cni_network_config:</span><br><span class="line">----</span><br><span class="line">{</span><br><span class="line"> <span class="string">"name"</span>: <span class="string">"k8s-pod-network"</span>,</span><br><span class="line"> <span class="string">"cniVersion"</span>: <span class="string">"0.3.1"</span>,</span><br><span class="line"> <span class="string">"plugins"</span>: [</span><br><span class="line"> {</span><br><span class="line"> <span class="string">"type"</span>: <span class="string">"calico"</span>,</span><br><span class="line"> <span class="string">"log_level"</span>: <span class="string">"WARNING"</span>,</span><br><span class="line"> <span class="string">"datastore_type"</span>: <span class="string">"kubernetes"</span>,</span><br><span class="line"> <span class="string">"nodename"</span>: <span class="string">"__KUBERNETES_NODE_NAME__"</span>,</span><br><span class="line"> <span class="string">"ipam"</span>: {</span><br><span class="line"> <span class="string">"type"</span>: <span class="string">"host-local"</span>,</span><br><span 
class="line"> <span class="string">"subnet"</span>: <span class="string">"usePodCidr"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="string">"policy"</span>: {</span><br><span class="line"> <span class="string">"type"</span>: <span class="string">"k8s"</span>,</span><br><span class="line"> <span class="string">"k8s_auth_token"</span>: <span class="string">"__SERVICEACCOUNT_TOKEN__"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="string">"kubernetes"</span>: {</span><br><span class="line"> <span class="string">"kubeconfig"</span>: <span class="string">"/etc/kubernetes/ssl/kubecfg-kube-node.yaml"</span></span><br><span class="line"> }</span><br><span class="line"> },</span><br><span class="line"> {</span><br><span class="line"> <span class="string">"type"</span>: <span class="string">"portmap"</span>,</span><br><span class="line"> <span class="string">"snat"</span>: <span class="literal">true</span>,</span><br><span class="line"> <span class="string">"capabilities"</span>: {<span class="string">"portMappings"</span>: <span class="literal">true</span>}</span><br><span class="line"> }</span><br><span class="line"> ]</span><br><span class="line">}</span><br><span class="line">masquerade:</span><br><span class="line">----</span><br><span class="line"><span class="literal">true</span></span><br><span class="line">net-conf.json:</span><br><span class="line">----</span><br><span class="line">{</span><br><span class="line"> <span class="string">"Network"</span>: <span class="string">"10.42.0.0/16"</span>,</span><br><span class="line"> <span class="string">"Backend"</span>: {</span><br><span class="line"> <span class="string">"Type"</span>: <span class="string">"vxlan"</span></span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">typha_service_name:</span><br><span class="line">----</span><br><span class="line">none</span><br><span 
class="line"></span><br><span class="line">BinaryData</span><br><span class="line">====</span><br><span class="line"></span><br><span class="line">Events: <none></span><br></pre></td></tr></table></figure><p>As the config shows, pod-to-pod traffic across nodes uses flannel's vxlan backend.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 ~]<span class="comment"># route -n</span></span><br><span class="line">Kernel IP routing table</span><br><span class="line">Destination Gateway Genmask Flags Metric Ref Use Iface</span><br><span class="line">0.0.0.0 192.168.12.1 0.0.0.0 UG 100 0 0 ens192</span><br><span class="line">10.42.0.0 10.42.0.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.1.0 10.42.1.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.2.0 10.42.2.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.4.0 10.42.4.0 255.255.255.0 UG 0 0 0 flannel.1</span><br></pre></td></tr></table></figure><p>Inside a node, calico appears to be used (this looks odd at first, but it matches the canal-config shown above, where the CNI plugin type is calico and the backend type is vxlan)</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line">10.42.5.6 0.0.0.0 255.255.255.255 UH 0 0 0 caliaa55f522eed</span><br><span class="line">10.42.5.7 0.0.0.0 255.255.255.255 UH 0 0 0 cali25daf88f8f4</span><br><span class="line">10.42.5.9 0.0.0.0 255.255.255.255 UH 0 0 0 cali5b9c9dc412f</span><br><span class="line">10.42.5.11 0.0.0.0 255.255.255.255 UH 0 0 0 cali0a2d09ea729</span><br><span class="line">10.42.5.19 0.0.0.0 255.255.255.255 UH 0 0 0 cali7d6026f7214</span><br><span class="line">10.42.5.20 0.0.0.0 255.255.255.255 UH 0 0 0 cali496a637f2b6</span><br><span class="line">10.42.5.21 0.0.0.0 255.255.255.255 UH 0 0 0 cali094f27868c8</span><br><span class="line">10.42.5.22 0.0.0.0 255.255.255.255 UH 0 0 0 calif660f4b0f0a</span><br><span class="line">10.42.5.23 0.0.0.0 255.255.255.255 UH 0 0 0 cali24c6af87c12</span><br><span class="line">10.42.5.24 0.0.0.0 255.255.255.255 UH 0 0 0 cali83ef23bb08c</span><br><span class="line">10.42.5.25 0.0.0.0 255.255.255.255 UH 0 0 0 califbc174b3e86</span><br><span class="line">10.42.5.28 0.0.0.0 255.255.255.255 UH 0 0 0 cali22ab0aca219</span><br><span class="line">10.42.5.29 0.0.0.0 255.255.255.255 UH 0 0 0 calif651c149e71</span><br><span class="line">10.42.5.30 0.0.0.0 255.255.255.255 UH 0 0 0 calib3afa4b713a</span><br><span class="line">10.42.5.31 0.0.0.0 255.255.255.255 UH 0 0 0 cali131d95a95c5</span><br><span class="line">10.42.5.32 0.0.0.0 255.255.255.255 UH 0 0 0 cali8c4e0fefffb</span><br><span class="line">10.42.5.33 0.0.0.0 255.255.255.255 UH 0 0 0 cali49b08a26992</span><br><span class="line">10.42.5.34 0.0.0.0 255.255.255.255 UH 0 0 0 
cali8b5603fa18d</span><br><span class="line">10.42.5.35 0.0.0.0 255.255.255.255 UH 0 0 0 calif3d7150d86b</span><br><span class="line">10.42.5.36 0.0.0.0 255.255.255.255 UH 0 0 0 cali9f7a2a81999</span><br><span class="line">10.42.5.37 0.0.0.0 255.255.255.255 UH 0 0 0 calic7b99bf611c</span><br><span class="line">10.42.5.38 0.0.0.0 255.255.255.255 UH 0 0 0 cali9cd5eefdbdb</span><br><span class="line">10.42.5.39 0.0.0.0 255.255.255.255 UH 0 0 0 cali60886b4d62a</span><br><span class="line">10.42.5.40 0.0.0.0 255.255.255.255 UH 0 0 0 cali723de22ee93</span><br><span class="line">10.42.5.41 0.0.0.0 255.255.255.255 UH 0 0 0 calic1c3382ae7e</span><br><span class="line">10.42.5.42 0.0.0.0 255.255.255.255 UH 0 0 0 calif204cdda79b</span><br><span class="line">10.42.5.43 0.0.0.0 255.255.255.255 UH 0 0 0 cali4f497f76d9e</span><br><span class="line">10.42.5.44 0.0.0.0 255.255.255.255 UH 0 0 0 caliebf9b64ae51</span><br><span class="line">10.42.5.45 0.0.0.0 255.255.255.255 UH 0 0 0 calie98b8f7dfac</span><br><span class="line">10.42.5.48 0.0.0.0 255.255.255.255 UH 0 0 0 cali522dc0ad58c</span><br><span class="line">10.42.5.49 0.0.0.0 255.255.255.255 UH 0 0 0 calia377a430aec</span><br><span class="line">10.42.5.50 0.0.0.0 255.255.255.255 UH 0 0 0 calid7bffa46af9</span><br></pre></td></tr></table></figure><h3 id="先讲讲nodes内部的pod之间通信,直接brige-走路由表">First, pod-to-pod communication inside a node: bridge plus the routing table</h3><p>[Image: image-20220617160051098]</p><p>Let's trace through it layer by layer, following the figure above.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberYQ3 ~]<span class="comment"># k get po -o wide -A| grep 5.6</span></span><br><span class="line">monitor filebeat-gsjmj 1/1 Running 1 23d 10.42.5.6 kuberjd1 <none> <none></span><br><span class="line">[root@KuberYQ3 ~]<span 
class="comment"># k get po -o wide -A| grep 5.7</span></span><br><span class="line">kube-system coredns-564fc6c9d5-2pqcv 1/1 Running 0 15d 10.42.5.7 kuberjd1 <none> <none></span><br></pre></td></tr></table></figure><p>First, look at the route entries for 10.42.5.6 and 10.42.5.7:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">route -n</span><br><span class="line">10.42.5.6 0.0.0.0 255.255.255.255 UH 0 0 0 caliaa55f522eed</span><br><span class="line">10.42.5.7 0.0.0.0 255.255.255.255 UH 0 0 0 cali25daf88f8f4</span><br></pre></td></tr></table></figure><p>These entries mean:</p><p>Traffic to 10.42.5.6 uses gateway 0.0.0.0 (directly connected, no next hop) and goes out through interface caliaa55f522eed.</p><p>Traffic to 10.42.5.7 uses gateway 0.0.0.0 (directly connected, no next hop) and goes out through interface cali25daf88f8f4.</p><p>Inspect the caliaa55f522eed interface:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 ~]<span class="comment"># ip addr | grep -n3 caliaa55f522eed</span></span><br><span class="line">14- <span class="built_in">link</span>/ether 02:42:c9:bf:4a:fa brd ff:ff:ff:ff:ff:ff</span><br><span class="line">15- inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0</span><br><span class="line">16- valid_lft forever preferred_lft forever</span><br><span class="line">17:4: caliaa55f522eed@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default</span><br><span class="line">18- <span class="built_in">link</span>/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0</span><br><span class="line">19- inet6 fe80::ecee:eeff:feee:eeee/64 scope <span class="built_in">link</span></span><br><span class="line">20- valid_lft forever preferred_lft 
forever</span><br></pre></td></tr></table></figure><p>caliaa55f522eed@if3: the peer of this veth device is interface number 3 inside the peer's network namespace.</p><p>link-netnsid 0: the peer lives in the network namespace whose netnsid is 0.</p><p>Now look up the corresponding network namespace:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 netns]<span class="comment"># ln -s /var/run/docker/netns /var/run/netns</span></span><br><span class="line">[root@KuberJD1 netns]<span class="comment"># ip netns list</span></span><br><span class="line">37cce77c51d4 (<span class="built_in">id</span>: 31)</span><br><span class="line">530dcb70e8e0 (<span 
class="built_in">id</span>: 12)</span><br><span class="line">e5ea8393436e (<span class="built_in">id</span>: 11)</span><br><span class="line">b81e5302147c (<span class="built_in">id</span>: 32)</span><br><span class="line">b683515c66ec (<span class="built_in">id</span>: 40)</span><br><span class="line">de2211fac377 (<span class="built_in">id</span>: 39)</span><br><span class="line">bfe028cd215b (<span class="built_in">id</span>: 38)</span><br><span class="line">acd5fbc4c28a (<span class="built_in">id</span>: 37)</span><br><span class="line">0ea1ab4a67d6 (<span class="built_in">id</span>: 35)</span><br><span class="line">76c9f2b4c621 (<span class="built_in">id</span>: 36)</span><br><span class="line">9a31e12f6cca (<span class="built_in">id</span>: 34)</span><br><span class="line">2671c0ee9a88 (<span class="built_in">id</span>: 33)</span><br><span class="line">84ce5614c6fa (<span class="built_in">id</span>: 30)</span><br><span class="line">9dc991ef1c43 (<span class="built_in">id</span>: 29)</span><br><span class="line">4edeab5b9630 (<span class="built_in">id</span>: 28)</span><br><span class="line">f64f647b6419 (<span class="built_in">id</span>: 27)</span><br><span class="line">6708b6e21557 (<span class="built_in">id</span>: 26)</span><br><span class="line">25bbbcab0d77 (<span class="built_in">id</span>: 25)</span><br><span class="line">43b6d3bd5e36 (<span class="built_in">id</span>: 24)</span><br><span class="line">54bef310fedf (<span class="built_in">id</span>: 23)</span><br><span class="line">c2984e198b4f (<span class="built_in">id</span>: 22)</span><br><span class="line">2a18c17a5943 (<span class="built_in">id</span>: 21)</span><br><span class="line">4168e630ff80 (<span class="built_in">id</span>: 20)</span><br><span class="line">20c0eb4d7538 (<span class="built_in">id</span>: 18)</span><br><span class="line">2f4f9325c803 (<span class="built_in">id</span>: 19)</span><br><span class="line">a9baab3a884b (<span class="built_in">id</span>: 16)</span><br><span 
class="line">2feb0a598645 (<span class="built_in">id</span>: 17)</span><br><span class="line">b5969004f3af (<span class="built_in">id</span>: 15)</span><br><span class="line">1c5ab0ae8bf9 (<span class="built_in">id</span>: 14)</span><br><span class="line">ff3a73a2e3ce (<span class="built_in">id</span>: 13)</span><br><span class="line">4165a6961f4a (<span class="built_in">id</span>: 9)</span><br><span class="line">a64a18caeb4b (<span class="built_in">id</span>: 6)</span><br><span class="line">9facfe36be9b (<span class="built_in">id</span>: 7)</span><br><span class="line">8364a8ac4dbc (<span class="built_in">id</span>: 10)</span><br><span class="line">841bdaad82e3 (<span class="built_in">id</span>: 8)</span><br><span class="line">43f94c2a0d7c (<span class="built_in">id</span>: 5)</span><br><span class="line">734bd753b4a0 (<span class="built_in">id</span>: 4)</span><br><span class="line">a91a35b6aa00 (<span class="built_in">id</span>: 2)</span><br><span class="line">14ea019a86d0 (<span class="built_in">id</span>: 3)</span><br><span class="line">a3dd041b853d (<span class="built_in">id</span>: 1)</span><br><span class="line">f90fd1c46112 (<span class="built_in">id</span>: 0)</span><br><span class="line">default</span><br></pre></td></tr></table></figure><p>The network namespace with id 0 is f90fd1c46112.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 netns]<span class="comment"># ip netns exec f90fd1c46112 ip a</span></span><br><span class="line">1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000</span><br><span class="line"> <span class="built_in">link</span>/loopback 00:00:00:00:00:00 brd 
00:00:00:00:00:00</span><br><span class="line"> inet 127.0.0.1/8 scope host lo</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br><span class="line">3: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default</span><br><span class="line"> <span class="built_in">link</span>/ether 56:70:84:3b:2d:6e brd ff:ff:ff:ff:ff:ff link-netnsid 0</span><br><span class="line"> inet 10.42.5.6/32 scope global eth0</span><br><span class="line"> valid_lft forever preferred_lft forever</span><br></pre></td></tr></table></figure><p>The eth0 interface in this namespace has IP 10.42.5.6.</p><p>Routing table for 10.42.5.6:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 netns]<span class="comment"># ip netns exec 14ea019a86d0 route -n</span></span><br><span class="line">Kernel IP routing table</span><br><span class="line">Destination Gateway Genmask Flags Metric Ref Use Iface</span><br><span class="line">0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0</span><br><span class="line">169.254.1.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0</span><br></pre></td></tr></table></figure><p>When 10.42.5.6 accesses 10.42.5.7:</p><p>The packet leaves through eth0. The other end of eth0's veth pair is caliaa55f522eed, and caliaa55f522eed sits on the host.</p><p>The host looks up its routing table and finds that traffic for 10.42.5.7 must go out through cali25daf88f8f4; the packet then crosses that veth pair into the other network namespace (the target pod).</p><p><strong>From the cluster's perspective: given the result above, if a machine in the customer network has IP 10.42.5.7, a request to 10.42.5.7 from this node (the one hosting 10.42.5.6) follows the routing rules and never reaches the customer's machine.</strong></p><p><strong>From the customer's perspective: when the customer accesses their own host at 10.42.5.7, traffic goes through the layer-3 routers of their LAN, hop by hop, then the MAC address is looked up locally and resolved through ARP if it is not found. This does not interfere with our cluster network, unless a next hop pointing at this cluster node is configured explicitly.</strong></p><p><strong>In summary: Calico (the intra-node networking, which works the same way as the Docker bridge) does not affect the customer network!</strong></p><h3 id="pod与pod之间跨主机通信">Cross-host pod-to-pod communication</h3><p>This uses flannel's VXLAN.</p><p>![image-20220620113506571](data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 2074 
1078"></svg>)</p><p>Pick two pods on different nodes: 10.42.5.165 and 10.42.8.97.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">srm-source-1c79a-84d9cd498f-8pmxb 1/1 Running 1 15h 10.42.5.165 kuberjd1 <none> <none></span><br><span class="line">srm-source-1c79a-84d9cd498f-m5gfr 1/1 Running 0 16d 10.42.8.97 kubejd3 <none> <none></span><br><span class="line"></span><br><span class="line">Corresponding interfaces (used for the packet capture later):</span><br><span class="line">10.42.5.165 0.0.0.0 255.255.255.255 UH 0 0 0 calie4e931da8e0</span><br><span class="line">10.42.8.97 0.0.0.0 255.255.255.255 UH 0 0 0 cali86ad7dcb113</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD1 ~]<span class="comment"># route -n</span></span><br><span class="line">Kernel IP routing table</span><br><span class="line">Destination Gateway Genmask Flags Metric Ref Use Iface</span><br><span class="line">0.0.0.0 192.168.12.1 0.0.0.0 UG 100 0 0 ens192</span><br><span class="line">10.42.0.0 10.42.0.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.1.0 10.42.1.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.2.0 10.42.2.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line">10.42.4.0 10.42.4.0 255.255.255.0 UG 0 0 0 flannel.1</span><br><span class="line"></span><br></pre></td></tr></table></figure><h4 id="vxlan-how-to-work"><strong>vxlan how to 
work</strong></h4><ul><li><p>1. The flannel process encapsulates and decapsulates the packets passed to it.</p></li><li><p>2. pod1 <strong>8.97</strong> on node1 accesses pod2 <strong>(5.165)</strong> on node2</p></li><li><h5 id="pod1(8-97)首先通过veth对将数据包发到node1,在node1根据路由表规则将数据发到flannel1-1这个网卡"><strong>pod1 (8.97) first sends the packet to node1 through its veth pair; node1 then follows its routing table and forwards the packet to the flannel.1 interface</strong></h5></li><li><h5 id="这个flannel是采用vxlan的模式,vxlan需要veth的数据封装与解封装,所以flannel就把这个数据包交给vxlan-而vxlan是一个内核级的驱动程序,由它去封装这个包,因为vxlan本身是工作在二层的,它还需要目的的mac地址">Flannel runs in VXLAN mode here. VXLAN has to encapsulate and decapsulate the data coming from the veth, so flannel hands the packet to VXLAN. VXLAN is a kernel-level driver and performs the encapsulation; because VXLAN itself works at layer 2, it also needs the destination MAC address</h5></li><li><h5 id="在flanneld进程启动后,就会自动添加其他节点ARP记录">Once the flanneld process starts, it automatically adds ARP entries for the other nodes</h5></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">## Run on worker1 to view the interface MAC addresses; worker2's flannel address is 10.42.8.0 </span></span><br><span class="line"><span class="comment">## Once flanneld starts it automatically adds ARP entries for the other nodes; they can be listed with the ip command</span></span><br><span class="line">[root@KuberJD2 ~]<span class="comment"># ip neigh show dev flannel.1</span></span><br><span class="line">10.42.2.0 lladdr ca:6d:ed:bc:07:62 PERMANENT</span><br><span class="line">10.42.1.0 lladdr 72:2b:f3:e1:d0:9c PERMANENT</span><br><span class="line">10.42.8.0 lladdr aa:e1:0d:ae:fa:d5 PERMANENT</span><br><span class="line">10.42.5.0 lladdr 46:af:4f:33:e2:37 PERMANENT</span><br><span class="line">10.42.0.0 lladdr 02:54:c3:85:40:6f PERMANENT</span><br></pre></td></tr></table></figure><h5 
id="知道了目的MAC地址-(pod2)-,封装二层数据帧(容器源IP和目的IP)后,linux内核将这个数据帧进一步封装成宿主机网络的一个普通数据帧,然后通过宿主机的网卡出去。单目前只知道目的pod2的flannel-1网卡的mac地址,而不知道目的宿主机的mac地址,到宿主机网卡的数据包不知道传到哪台宿主主机上。">Once the destination MAC address <strong>(pod2)</strong> is known and the layer-2 frame (carrying the container source and destination IPs) has been built, the Linux kernel wraps that frame inside an ordinary frame of the host network and sends it out through the host NIC. At this point, however, only the MAC address of the flannel.1 interface on pod2's side is known, not the MAC address of the destination host, so it is not yet clear which host the packet leaving the host NIC should go to.</h5><h5 id="通过已经存好的flannel-1网卡的mac地址和目的宿主机之间的ip映射关系表找到目的主机的ip"><strong>(The destination host's IP is found through the stored table that maps flannel.1 MAC addresses to host IPs)</strong></h5><h5 id="flanneld进程也维护着一个叫做FDB的转发数据库,可以通过bridge-fdb命令查看:">The flanneld process also maintains a forwarding database called the FDB, which can be viewed with the bridge fdb command:</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">[root@KuberJD2 ~]<span class="comment"># bridge fdb show dev flannel.1</span></span><br><span class="line">72:2b:f3:e1:d0:9c dst 192.168.12.91 self permanent</span><br><span class="line">aa:e1:0d:ae:fa:d5 dst 192.168.12.213 self permanent</span><br><span class="line">7e:6e:be:f8:3a:2f dst 192.168.12.92 self permanent</span><br><span class="line">02:54:c3:85:40:6f dst 192.168.12.90 self permanent</span><br><span class="line">46:af:4f:33:e2:37 dst 192.168.12.87 self permanent 10.42.5.0 lladdr 46:af:4f:33:e2:37 PERMANENT (so the packet is sent to this host)</span><br><span class="line">ca:6d:ed:bc:07:62 dst 192.168.12.92 self permanent</span><br><span class="line">7e:70:0f:c0:3e:7b dst 192.168.12.89 self permanent</span><br><span class="line"><span class="comment">### This gives the host IP that corresponds to the flannel.1 MAC address kept on pod1's (8.97) side, i.e. the destination of the UDP packet. That destination IP is used for the encapsulation.</span></span><br></pre></td></tr></table></figure><h5 
id="数据包到达目的宿主机:Node1的eth0网卡发出去,发现是VXLAN数据包,把它交给flannel-1设备。flannel-1设备则会进一步拆包,取出原始二层数据帧包,发送ARP请求,转发给container。">The packet reaches the destination host: it is sent out through Node1's eth0, the destination host recognizes it as a VXLAN packet and hands it to its flannel.1 device. The flannel.1 device then decapsulates it, extracts the original layer-2 frame, sends an ARP request, and forwards the frame to the container.</h5><h5 id="抓包看看">Packet capture</h5><h5 id="对pod1上对cali86ad7dcb113网卡和宿主机node1(192-168-12-89)ens192进行抓包,并在pod1向pod2ping">Capture on pod1's cali86ad7dcb113 interface and on ens192 of host node1 (192.168.12.89), while pinging pod2 from pod1</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"> </span><br><span class="line">tcpdump -i cali86ad7dcb113 -vnn dst host 10.42.5.165 -w ./8.97-container-result.cap</span><br><span class="line"></span><br><span class="line">tcpdump -i ens192 dst host 192.168.12.87 -w ./pod1-host-result.cap</span><br></pre></td></tr></table></figure><h3 id="先看cali86ad7dcb113网卡的包信息">First, the capture on the cali86ad7dcb113 interface</h3><p>![image-20220619004717238](data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 2530 1200"></svg>)</p><h3 id="再看ens192的网卡包信息">Next, the capture on the ens192 interface</h3><h4 id="为了方便分析数据包将upd报文解码为vxlan">To make the analysis easier, decode the UDP packets as VXLAN</h4><p>![image-20220619012921188](/Users/zouquanan/Library/Application Support/typora-user-images/image-20220619012921188.png)</p><h4 id="然后过滤一下">Then apply a filter</h4><p>![image-20220619013342986](data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 3022 494"></svg>)</p><p>![image-20220619182224316](data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 2352 1138"></svg>)</p><p>Summary</p><p>Flannel's VXLAN turns what is a layer-3 network from the container (pod) perspective into a layer-2 network: instead of locating the destination IP hop by hop at layer 3, it encapsulates layer-2 frames and relies on a table that maps container-side MAC addresses to host IPs. Packets are encapsulated according to that table and then transmitted.</p><h3 id="集群网络角度当pod的ip和客户ip冲突时候,外层网络是根据的三层ip-src和dst的宿主机ip,最终在客户局域网内发送的数据包都是根据宿主机ip转发的。">From the cluster-network perspective, when a pod IP conflicts with a customer IP, the outer network only sees layer-3 src and dst addresses that are host IPs, so every packet sent on the customer LAN is forwarded according to host IPs.</h3>]]></content>
<summary type="html"><div class="note blue icon-padding modern"><i class="note-icon fas fa-bullhorn"></i><p>An analysis of whether the k8s virtual network conflicts with the outer network</p>
</div>
<div class="note red </summary>
<category term="kubenetes" scheme="http://misakiz.github.io/tags/kubenetes/"/>
</entry>
<entry>
<title>Hello World</title>
<link href="http://misakiz.github.io/2023/06/17/hello-world/"/>
<id>http://misakiz.github.io/2023/06/17/hello-world/</id>
<published>2023-06-17T08:42:51.486Z</published>
<updated>2023-06-17T08:42:51.486Z</updated>
<content type="html"><![CDATA[<p>Welcome to <a href="https://hexo.io/">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/">documentation</a> for more info. If you get any problems when using Hexo, you can find the answer in <a href="https://hexo.io/docs/troubleshooting.html">troubleshooting</a> or you can ask me on <a href="https://github.com/hexojs/hexo/issues">GitHub</a>.</p><h2 id="Quick-Start">Quick Start</h2><h3 id="Create-a-new-post">Create a new post</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo new <span class="string">"My New Post"</span></span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/writing.html">Writing</a></p><h3 id="Run-server">Run server</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo server</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/server.html">Server</a></p><h3 id="Generate-static-files">Generate static files</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo generate</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/generating.html">Generating</a></p><h3 id="Deploy-to-remote-sites">Deploy to remote sites</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo deploy</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/one-command-deployment.html">Deployment</a></p>]]></content>
<summary type="html"><p>Welcome to <a href="https://hexo.io/">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/">documentation</a> for</summary>
</entry>
</feed>