Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make pika compactible with redis-sentinel #2854

Merged

Conversation

cheniujh
Copy link
Collaborator

@cheniujh cheniujh commented Aug 8, 2024

fix issue #2695 (Pika 3.5.x不支持redis-sentinel)
fix issue #2286 (Pika资源释放流程不正确)

本PR让Pika支持使用redis-sentinel来实现failover。

(测试使用20240807时redis的unstable源码编译的redis-sentinel,距离该版本最近的release版本为7.4.0,但经过实测,一些旧版本的redis-sentinel也是兼容的)

自去年350版本以来,Pika无法兼容Redis-Sentinel主要是以下几个原因

  1. 事务冲突:redis-sentinel选举以后,进行切主操作时会给slave发送slaveof no one命令,但是实际上是以事务的形式发送了下列一批命令:

    multi
    salveof no one
    config rewrite
    client kill type pubsub
    client kill type normal
    exec

    但Pika实际上不支持client kill type pubsubclient kill type normal命令(Doinitial都会失败),所以整个事务会abort不执行,实际的slaveof命令也就根本没有执行,所以无法完成切主

    • 针对1的修复:本PR给Pika增加了对client kill type pubsubclient kill type normal的支持,前者会杀掉所有pubsub client连接,后者会杀掉除pubsub外的其他client连接。
  2. 指标的不兼容redis-sentinel实际上使用slave_repl_offset指标来在选主时做判断,这个指标反映的是该slave实例总共接受了多少主从同步的字节数(假如是多DB,这个值就是多DB的汇总),而pika的主从体系是每个DB各自独立的,每个db都有一个binlog偏移量,由两个数组成: (filenum, offset),自然也没有提供slave_repl_offset指标供redis-sentinel做判断(以前336版本也兼容redis-sentinel, 但一样没有提供这个指标,有点怀疑之前的选主其实是不可靠的)。

  • 针对2的修复:本PR将Pika内部的Binlog偏移做了合并转换,具体地,取一个64位数,让binlog的filenum处于其前32位,让offset处于低32位(单个Binlog文件最大2GB,32位可以表达的值接近4GB,所以不会溢出),最终用这个融合的64位数字输出为salve_repl_offset供redis-sentinel进行选举参考。
    多DB场景是如何处理的:首先将每个DB的filenum累加,将offset也累加,最后得到一个total filenum和total offset, 此时的offset可能已经大于单个binlog文件的大小,所以再做下面代码段中的这两个处理来将offset中超出binlog file size的部分转移到filenum中,这样就得到了一个融合的filenum, offset,综合反映多个DB的主从进度,最后再按照前述的位运算处理,将其融合成一个整个slave_repl_offset。

    slave_repl_offset最终的计算方法是:遍历每个DB, 将binlog_filenum * binlog_file_size,然后累加到slave_repl_offset, 并且将binlog offset(最新binlog文件的内部偏移量)也累加到slave_repl_offset,得到的就是整个实例的同步进度,以字节数计。需要说明的是:binlog filenum是从0开始递增,所以这里直接使用的binlog_filenum相当于真实的binlog文件历史计数减1,恰好符合需求,因为binlog offset反映的就是最后一个Binlog文件的大小。
  1. 默认参数的变更:Pika Info中提供的slave-priority默认值改成了0,默认值应当是100,这样redis-sentinel才会将实例列入候选。
  • 针对3的修复:这个默认值已经在之前的PR中修改过来了 。
  1. 发现的coredump问题:在修复了Pika对Redis-sentinel的兼容后,发现sentinel连接上以后正常shutdown也会coredump,追查后发现了以下原因:Pika退出时的资源释放流程中,网络线程关闭的时机不对,关的太晚了,有些重要组件已经析构了,但网络线程还能收发包,这就导致如果shutdown的时候有流量进来,能正常解析成命令执行,会使用到已经析构的组件,进而导致coredump。
  • 针对问题4修复:修改了网络线程Stop的位置,提前停止pika_dispatcher_thread_(也就是停止dispatcher以及workerThreads)。
  1. 附带修复了一个也是资源退出问题的主从相关的问题:pika从节点消费BInlog时,同步先将BInlog落盘,然后提交异步WriteDB任务,之前Pika退出的时候,不会等待这些异步WriteDB任务全部落盘再退出,这就可能导致:某些请求对应的Binlog在从节点写下去了,但是对应的WriteDB任务没有执行完进程就退出了,进而导致丢数据。
  • 针对问题5的修复:在Pika退出时,要等到WriteDBWorker中TaskQueue的size归零才继续往下释放资源,退出进程。

This PR enables Pika to support using redis-sentinel for failover

Since version 3.5.0 last year, Pika has been incompatible with Redis-Sentinel mainly due to the following reasons:

  1. Transaction conflicts: After the redis-sentinel election, the failover operation sends the slaveof no one command to the slave. However, it actually sends a batch of commands in the form of a transaction:

    multi
    slaveof no one
    config rewrite
    client kill type pubsub
    client kill type normal
    exec

    Pika does not support the client kill type pubsub and client kill type normal commands (Doinitial will fail), causing the entire transaction to abort. As a result, the slaveof command is not executed, and the failover cannot be completed.

    • Fix for issue 1: This PR adds support for client kill type pubsub and client kill type normal to Pika. The former will kill all pubsub client connections, and the latter will kill all other client connections except pubsub clients.
  2. Incompatible metrics: redis-sentinel uses the slave_repl_offset metric to determine the master during an election. This metric reflects the total number of bytes received by the slave instance through replication (if multiple databases are involved, this value is the sum for all databases). Pika's replication system is independent for each database, with each DB having its own binlog offset, consisting of two parts: (filenum, offset). Therefore, Pika does not provide a slave_repl_offset metric for redis-sentinel to use (earlier versions like 3.3.6 also supported redis-sentinel, but did not provide this metric, raising doubts about the reliability of previous elections).

    • Fix for issue 2: This PR merges and converts Pika's internal binlog offsets. Specifically, a 64-bit number is used where the binlog filenum occupies the upper 32 bits and the offset occupies the lower 32 bits (a single binlog file can be up to 2GB, and a 32-bit value can represent nearly 4GB, so overflow will not occur). This combined 64-bit number is output as slave_repl_offset for redis-sentinel to use during elections.

    Handling multiple DB scenarios: First, the filenum and offset of each DB are summed to get a total filenum and total offset. The offset may exceed the size of a single binlog file, so the following code transfers the excess part of the offset to the filenum, resulting in a combined filenum and offset that reflects the replication progress of multiple DBs. Finally, the bitwise operation mentioned above is used to merge them into a single slave_repl_offset.

total_file_num += total_offset / g_pika_conf->binlog_file_size();
total_offset = total_offset % g_pika_conf->binlog_file_size();

  1. Change in default parameters: The default value of slave-priority provided by Pika Info was changed to 0. The default should be 100, so redis-sentinel can consider the instance as a candidate.
  • Fix for issue 3: This default value has already been corrected in a previous PR.
  1. Discovered coredump issue: After fixing Pika's compatibility with Redis-sentinel, it was found that a normal shutdown after sentinel connection would cause a coredump. Investigation revealed the following reason: the timing of closing network threads during Pika's resource release process was incorrect, too late. Some important components were already destructed, but network threads could still send and receive packets. If there was traffic during shutdown, commands would be parsed and executed, using already destructed components, leading to a coredump.
  • Fix for issue 4: The position of stopping the network threads was adjusted. Pika's dispatcher_thread_ (including dispatcher and worker threads) is now stopped earlier.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced client management with new commands for terminating client connections (ClientKillPubSub, ClientKillAllNormal).
    • Improved command handling in the ClientCmd class to support different kill types.
    • New method in PikaReplBgWorker class to retrieve the size of the task queue, aiding performance monitoring.
  • Bug Fixes

    • Improved resource management and connection handling in worker threads.
    • Enhanced shutdown sequence for threads, ensuring orderly termination.

2. support client kill type pubsub/normal
3. ensure fd is removed in epoll if server wanna close fd
1. ensure NetWork Thread(Dispacher) can be stopped in time
2. ensure all queued Async WriteDB task can be done before exit
Copy link

coderabbitai bot commented Aug 8, 2024

Walkthrough

The recent updates enhance the Pika server's functionality by introducing new methods for managing client connections and improving thread operations. Key changes include expanded command handling for diverse client termination types, refined connection management with new signaling methods, and enhanced resource cleanup processes. These improvements aim to create a more robust and efficient operational environment, facilitating smoother execution and management of server commands.

Changes

Files Change Summary
include/pika_admin.h, src/pika_admin.cc Enhanced ClientCmd with functionalities for new kill types, added static string members, and incorporated a new variable for kill_type_.
src/pika_server.cc Improved client management and thread operations by adding ClientKillPubSub() and ClientKillAllNormal() methods, enhancing client termination processes.
include/pika_repl_bgworker.h Added TaskQueueSize() method to PikaReplBgWorker class for retrieving the size of the task queue, aiding in performance monitoring.

Sequence Diagram(s)

sequenceDiagram
    participant ClientCmd
    participant Server
    participant PikaDispatchThread
    participant PikaPubSubThread

    ClientCmd->>Server: Issue "kill" command
    Server->>PikaDispatchThread: Process ClientKillAll()
    PikaDispatchThread->>PikaPubSubThread: NotifyCloseAllConns()
    PikaDispatchThread-->>Server: Confirmation of closure
    Note over ClientCmd, Server: Client connections are now managed and terminated gracefully.
Loading

🐰 In the meadow, changes bloom,
New commands chase away the gloom.
Threads now stop with a gentle sigh,
As clients wave their sweet goodbye.
Pika hops with joy to see,
A smoother, brighter harmony! 🌼


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the ☢️ Bug Something isn't working label Aug 8, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 645da7e and 75a7040.

Files selected for processing (11)
  • include/pika_admin.h (2 hunks)
  • include/pika_dispatch_thread.h (1 hunks)
  • include/pika_repl_bgworker.h (1 hunks)
  • include/pika_server.h (1 hunks)
  • src/net/include/net_pubsub.h (1 hunks)
  • src/net/src/net_pubsub.cc (3 hunks)
  • src/net/src/worker_thread.cc (2 hunks)
  • src/pika_admin.cc (5 hunks)
  • src/pika_dispatch_thread.cc (1 hunks)
  • src/pika_repl_client.cc (1 hunks)
  • src/pika_server.cc (2 hunks)
Additional comments not posted (23)
include/pika_repl_bgworker.h (1)

27-31: New method TaskQueueSize looks good!

The method provides a useful utility for retrieving the total size of the task queue by leveraging existing functionality. This addition enhances the class's interface without introducing any breaking changes.

include/pika_dispatch_thread.h (1)

17-17: New method StopThread is a valuable addition!

The addition of the StopThread method enhances the class's thread management capabilities, providing improved control over the thread lifecycle.

src/pika_dispatch_thread.cc (1)

62-64: Implementation of StopThread is well done!

The method correctly invokes StopThread on the thread_rep_ member, enhancing thread management. This aligns with the goal of improving resource handling and operational control.

src/net/include/net_pubsub.h (3)

93-93: Atomic boolean close_all_conn_sig_ is well-suited for signaling.

The use of an atomic boolean for signaling connection closures is appropriate for thread-safe operations. Ensure it is consistently checked and updated across the codebase.


85-85: Ensure proper handling of connection closures in CloseAllConns.

The CloseAllConns method should ensure all connections are safely closed. Verify that it integrates well with the rest of the connection management logic.

Verification successful

CloseAllConns method handles connection closures properly.

The implementation of CloseAllConns in src/net/src/net_pubsub.cc ensures that all connections are safely closed with appropriate synchronization and resource cleanup. This method integrates well with the connection management logic.

  • src/net/src/net_pubsub.cc: Implementation of CloseAllConns with synchronization and cleanup logic.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of `CloseAllConns` to ensure all connections are safely closed.

# Test: Search for the method implementation. Expect: Proper connection closure logic.
ast-grep --lang cpp --pattern $'void CloseAllConns() {
  $$$
}'

Length of output: 64


Script:

#!/bin/bash
# Search for the definition and usage of `CloseAllConns` in the codebase.
rg 'CloseAllConns' -A 10

Length of output: 3111


80-80: Ensure thread-safety when notifying to close all connections.

The method NotifyToCloseAllConns likely sets close_all_conn_sig_ to signal closure. Ensure this method is used in a thread-safe manner, given the concurrent nature of the application.

Verification successful

Thread-safety of NotifyToCloseAllConns is ensured.

The NotifyToCloseAllConns method uses an atomic operation to set close_all_conn_sig_ to true, ensuring thread-safety. No additional synchronization is needed in its current usage.

  • src/net/src/net_pubsub.cc: Implementation of NotifyToCloseAllConns uses close_all_conn_sig_.store(true).
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify usage of `NotifyToCloseAllConns` to ensure thread-safety.

# Test: Search for the method usage. Expect: Proper synchronization mechanisms in place.
rg --type cpp $'NotifyToCloseAllConns'

Length of output: 303


Script:

#!/bin/bash
# Description: Retrieve the implementation of `NotifyToCloseAllConns` to check for thread-safety mechanisms.
ast-grep --lang cpp --pattern 'void PubSubThread::NotifyToCloseAllConns() { $$$ }'

# Description: Retrieve the context around the usage of `NotifyToCloseAllConns` in `pika_server.cc`.
rg --type cpp -A 5 -B 5 $'NotifyToCloseAllConns' src/pika_server.cc

Length of output: 496


Script:

#!/bin/bash
# Description: Search for the implementation of `NotifyToCloseAllConns` in `net_pubsub.cc`.
rg --type cpp -A 10 'void PubSubThread::NotifyToCloseAllConns()' src/net/src/net_pubsub.cc

Length of output: 195

src/net/src/worker_thread.cc (2)

275-277: Ensure proper resource cleanup for closed connections.

The addition of net_multiplexer_->NetDelEvent(conn->fd(), 0); ensures that network events are deregistered for closed connections. This is crucial for preventing resource leaks.


281-283: Ensure timeout handling correctly deregisters events.

The changes ensure that connections timing out have their events deregistered and resources cleaned up. Verify that this logic is correctly integrated with the rest of the timeout handling.

Verification successful

Timeout handling correctly deregisters events.

The NetDelEvent function is consistently used in the timeout handling logic to deregister events, confirming its correct integration with the rest of the timeout handling process.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that timeout handling correctly deregisters events and cleans up resources.

# Test: Search for timeout handling logic. Expect: Consistent use of `NetDelEvent`.
rg --type cpp $'keepalive_timeout_'

Length of output: 1702


Script:

#!/bin/bash
# Search for instances of NetDelEvent to verify its integration with timeout logic.
rg --type cpp 'NetDelEvent'

Length of output: 2160


Script:

#!/bin/bash
# Search for context around NetDelEvent calls in worker_thread.cc to verify association with timeout handling.
rg --type cpp --context 5 'NetDelEvent' src/net/src/worker_thread.cc

Length of output: 1597

src/pika_repl_client.cc (1)

80-95: Ensure all asynchronous tasks are completed before stopping.

The loop in the Stop method ensures that all write tasks are completed before the method exits. This change is crucial for preventing data loss and ensuring data integrity.

include/pika_server.h (2)

226-226: Verify the implementation of ClientKillAllNormal.

Ensure that the ClientKillAllNormal method is correctly implemented and integrated within the codebase.

Verification successful

ClientKillAllNormal is implemented and used correctly.

The method ClientKillAllNormal is implemented in src/pika_server.cc and is used in src/pika_admin.cc. It calls ClientKillAll on pika_dispatch_thread_, which suggests it performs the intended function of terminating all normal client connections.

  • Implementation: src/pika_server.cc
  • Usage: src/pika_admin.cc
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of `ClientKillAllNormal`.

# Test: Search for the method definition. Expect: Function implementation details.
ast-grep --lang cpp --pattern $'void PikaServer::ClientKillAllNormal() { $$$ }'

Length of output: 80


Script:

#!/bin/bash
# Description: Search for references to `ClientKillAllNormal` in the codebase.

# Test: Search for any references to `ClientKillAllNormal` to verify its usage or implementation.
rg 'ClientKillAllNormal' -A 5

Length of output: 888


225-225: Verify the implementation of ClientKillPubSub.

Ensure that the ClientKillPubSub method is correctly implemented and integrated within the codebase.

Verification successful

The ClientKillPubSub method is correctly implemented and integrated.

  • The method is defined in src/pika_server.cc and calls NotifyToCloseAllConns on pika_pubsub_thread_.
  • It is used in src/pika_admin.cc, indicating its role in handling administrative commands.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of `ClientKillPubSub`.

# Test: Search for the method definition. Expect: Function implementation details.
ast-grep --lang cpp --pattern $'void PikaServer::ClientKillPubSub() { $$$ }'

Length of output: 77


Script:

#!/bin/bash
# Description: Search for any occurrences of `ClientKillPubSub` in the codebase.

# Search for the method declaration, definition, and usage.
rg 'ClientKillPubSub' -A 5

Length of output: 853

src/net/src/net_pubsub.cc (3)

162-174: Well-implemented CloseAllConns method.

The method effectively clears connection-related data and closes all connections, ensuring thread safety with appropriate locking mechanisms.


608-610: Efficient signaling in NotifyToCloseAllConns.

The method efficiently uses an atomic boolean to signal the closure of all connections.


431-435: Improved connection management in ThreadMain.

The addition of a signal check for close_all_conn_sig_ enhances the method's responsiveness to shutdown requests, improving connection management.

include/pika_admin.h (1)

239-242: Verify usage of new kill type members in ClientCmd.

Ensure that KILLTYPE_NORMAL, KILLTYPE_PUBSUB, and kill_type_ are correctly utilized in the implementation of the ClientCmd class.

Verification successful

New kill type members are correctly utilized in ClientCmd.

The KILLTYPE_NORMAL, KILLTYPE_PUBSUB, and kill_type_ are properly integrated into the ClientCmd class. They are defined, assigned, and used in conditional logic as expected.

  • Usage in src/pika_admin.cc:
    • strcasecmp checks against KILLTYPE_NORMAL and KILLTYPE_PUBSUB.
    • kill_type_ is assigned from argv_[3] and compared with the constants.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of new kill type members in `ClientCmd`.

# Test: Search for the usage of `kill_type_`, `KILLTYPE_NORMAL`, and `KILLTYPE_PUBSUB`. Expect: Proper integration in the class methods.
rg --type cpp 'kill_type_|KILLTYPE_NORMAL|KILLTYPE_PUBSUB'

Length of output: 733

src/pika_server.cc (4)

111-113: Orderly shutdown improvement.

The addition of pika_dispatch_thread_->StopThread() in the destructor ensures that the dispatch thread is stopped before the worker thread, promoting orderly shutdown.


858-861: Enhanced cleanup process in ClientKillAll.

The addition of pika_pubsub_thread_->NotifyToCloseAllConns() ensures that all publish/subscribe connections are closed when all clients are killed, enhancing the cleanup process.


863-865: New method ClientKillPubSub improves clarity.

The introduction of ClientKillPubSub provides a clear separation of concerns by focusing solely on closing publish/subscribe connections, enhancing code clarity and maintainability.


867-869: New method ClientKillAllNormal enhances clarity.

The introduction of ClientKillAllNormal retains the previous functionality of ClientKillAll but is explicitly named to indicate its focus on normal client termination, enhancing code clarity.

src/pika_admin.cc (4)

687-693: Enhancement in ClientCmd::DoInitial for type-based client kill.

The addition of handling for the type argument in the "kill" command enhances the flexibility of client management. Ensure that the KILLTYPE_NORMAL and KILLTYPE_PUBSUB constants are correctly defined and used throughout the codebase.


808-809: Addition of ClientCmd constants for client types.

The constants KILLTYPE_NORMAL and KILLTYPE_PUBSUB provide clear identifiers for client types in the "kill" command. Ensure these constants are consistently used throughout the codebase.

Verification successful

Verified usage of ClientCmd constants.

The constants KILLTYPE_NORMAL and KILLTYPE_PUBSUB are consistently used in the codebase, specifically in logical conditions to differentiate client types for the "kill" command.

  • Found in src/pika_admin.cc and include/pika_admin.h.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` constants.

# Test: Search for the usage of these constants in the codebase.
rg --type cpp 'KILLTYPE_NORMAL'
rg --type cpp 'KILLTYPE_PUBSUB'

Length of output: 782


743-752: Verify server methods for type-based client kill.

The logic for handling KILLTYPE_NORMAL and KILLTYPE_PUBSUB in the "kill" command is well-structured. Ensure that the g_pika_server->ClientKillAllNormal() and g_pika_server->ClientKillPubSub() methods are correctly implemented and tested.

Verification successful

Server methods for client kill types are implemented.

The methods ClientKillAllNormal and ClientKillPubSub are implemented in src/pika_server.cc and declared in include/pika_server.h. They are correctly invoked in src/pika_admin.cc, confirming the logic for handling different client kill types in the "kill" command. The review comment is verified.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of client kill methods for different types.

# Test: Search for the implementation of `ClientKillAllNormal` and `ClientKillPubSub`.
ast-grep --lang cpp --pattern 'void PikaServer::ClientKillAllNormal() { $$$ }'
ast-grep --lang cpp --pattern 'void PikaServer::ClientKillPubSub() { $$$ }'

Length of output: 158


Script:

#!/bin/bash
# Description: Search for the method names `ClientKillAllNormal` and `ClientKillPubSub` in the codebase using ripgrep.

# Test: Search for `ClientKillAllNormal` method.
rg 'ClientKillAllNormal'

# Test: Search for `ClientKillPubSub` method.
rg 'ClientKillPubSub'

Length of output: 385


Line range hint 1222-1247: Improved tracking of replication offsets in InfoCmd::InfoReplication.

The introduction of total_file_num and total_offset variables, along with the adjustments to slave_repl_offset, enhance the accuracy of replication tracking. Ensure that the bounds handling for slave_repl_offset is correctly implemented and tested.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 75a7040 and 511d7b4.

Files selected for processing (2)
  • include/pika_server.h (1 hunks)
  • src/pika_server.cc (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • include/pika_server.h
  • src/pika_server.cc

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 511d7b4 and 9bcc2cb.

Files selected for processing (1)
  • .github/workflows/pika.yml (4 hunks)
Files skipped from review due to trivial changes (1)
  • .github/workflows/pika.yml

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9bcc2cb and 9acc708.

Files selected for processing (1)
  • .github/workflows/pika.yml (5 hunks)
Files skipped from review due to trivial changes (1)
  • .github/workflows/pika.yml

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9acc708 and 2089ccc.

Files selected for processing (2)
  • .github/workflows/pika.yml (6 hunks)
  • tests/integration/rsync_dynamic_reconfig.go (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/pika.yml
Additional comments not posted (3)
tests/integration/rsync_dynamic_reconfig.go (3)

33-34: Verify the impact of reduced key and value sizes.

Reducing the key and value sizes from 1024 to 16 will speed up test execution but may not adequately simulate real-world scenarios with larger data. Ensure that the test objectives are still met with these smaller sizes.

Verification successful

Consider the impact of reduced key and value sizes on test objectives.

The reduction in key and value sizes from 1024 to 16 in the RefillMaster function is likely intended to improve test execution speed. However, it is important to verify that the test objectives remain aligned with these changes. Consider whether the smaller sizes still adequately simulate the intended scenarios and whether additional tests might be needed to cover cases with larger data sizes.

  • Ensure that the test objectives are still met with reduced sizes.
  • Consider adding tests for scenarios with larger data sizes if needed.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of reduced key and value sizes in RefillMaster.

# Test: Check for any test cases or documentation that mention expected key/value sizes.
rg --type go -A 5 'RefillMaster'

Length of output: 1687


38-39: Consider the implications of reduced data volume.

The reduction in dataVolumeMB from 64 to 1 significantly decreases the volume of data processed during tests. This change might affect the ability to assess performance under load conditions. Ensure that the reduced data volume aligns with the test's goals.

Verification successful

Consider the test's objectives with reduced data volume.

The reduction in dataVolumeMB from 64 to 1 affects only the specific test in tests/integration/rsync_dynamic_reconfig.go. If the test's goal is to evaluate performance under load, this change might not align with that objective. Ensure the test's purpose is clear and adjust dataVolumeMB accordingly if load testing is desired.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implications of reduced data volume in RefillMaster.

# Test: Search for references to dataVolumeMB to understand its role in tests.
rg --type go -A 5 'dataVolumeMB'

Length of output: 1168


Script:

#!/bin/bash
# Search for calls to the RefillMaster function to understand its usage.
rg --type go 'RefillMaster('

Length of output: 107


Script:

#!/bin/bash
# Correct the regex pattern and search for calls to the RefillMaster function.
rg --type go 'RefillMaster\('

Length of output: 234


139-139: Verify the test case effectiveness with reduced data volume.

The test case now uses a significantly smaller data volume. Ensure that the test still effectively evaluates the system's behavior under the intended conditions.

Verification successful

Test case with reduced data volume is appropriate.

The use of a smaller data volume in the RefillMaster function is intentional to prevent long refill times, as indicated by the comment in the code. This ensures the test remains efficient without compromising its effectiveness. No further test cases or documentation were found specifying different data volumes.

  • tests/integration/rsync_dynamic_reconfig.go: Comment suggests avoiding large data volumes to keep refill times short.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the effectiveness of the test case with reduced data volume.

# Test: Check if there are other test cases or documentation that specify the intended behavior for different data volumes.
rg --type go -A 5 'RefillMaster(MASTERADDR, 1, ctx)'

Length of output: 54


Script:

#!/bin/bash
# Broaden the search to find all instances of `RefillMaster` usage in the codebase.
rg --type go 'RefillMaster(' -A 5

Length of output: 112


Script:

#!/bin/bash
# Correct the regex pattern to search for all instances of `RefillMaster` usage in the codebase.
rg --type go 'RefillMaster\(' -A 5

Length of output: 1177

2 revised network kill logic
@cheniujh cheniujh force-pushed the bugfix/compatible_redis_sentinel branch from 2089ccc to 3a62d50 Compare August 9, 2024 08:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2089ccc and 3a62d50.

Files selected for processing (1)
  • .github/workflows/pika.yml (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/pika.yml

src/pika_admin.cc Outdated Show resolved Hide resolved
src/pika_repl_client.cc Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3a62d50 and 1df3351.

Files selected for processing (1)
  • src/pika_admin.cc (5 hunks)
Additional comments not posted (3)
src/pika_admin.cc (3)

687-693: Ensure correct handling of client kill types.

The DoInitial method now supports a new argument for killing clients by type. Ensure that the input validation and error messages are clear and consistent with other command patterns.


Line range hint 1222-1240: Review slave_repl_offset calculation logic.

The calculation of slave_repl_offset now includes additional logic. Ensure that this calculation is accurate and does not exceed 32-bit integer limits if that is a requirement.


743-752: Verify client kill logic for different types.

The Do method now handles killing clients by type using ClientKillAllNormal and ClientKillPubSub. Ensure these methods are correctly implemented and do not inadvertently affect the current connection.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 1df3351 and a2d7949.

Files selected for processing (1)
  • src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/pika_server.cc

2 rename some funs and varibles
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a2d7949 and b45a402.

Files selected for processing (3)
  • src/net/include/net_pubsub.h (1 hunks)
  • src/net/src/net_pubsub.cc (3 hunks)
  • src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (2)
  • src/net/include/net_pubsub.h
  • src/net/src/net_pubsub.cc
Additional comments not posted (3)
src/pika_server.cc (3)

859-862: LGTM: Comprehensive client termination.

The addition of NotifyCloseAllConns() ensures that all pub/sub connections are closed, aligning with the goal of comprehensive client termination.


864-865: LGTM: Focused pub/sub connection termination.

The ClientKillPubSub function provides a targeted approach for closing pub/sub connections, improving code clarity.


867-869: LGTM: Explicit naming for normal client termination.

The ClientKillAllNormal function is clearly named to indicate its purpose, enhancing code readability.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b45a402 and 7ca2b9d.

Files selected for processing (2)
  • src/pika_admin.cc (5 hunks)
  • src/pika_repl_client.cc (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/pika_repl_client.cc
Additional comments not posted (3)
src/pika_admin.cc (3)

685-693: Correct handling of kill command with type argument.

The changes correctly handle the additional "type" argument for the "kill" command, ensuring that only recognized types are processed.


743-752: Correct execution of client kill operations based on type.

The changes correctly implement the logic to execute different client kill operations based on the kill_type_ variable, with appropriate error handling for unknown types.


Line range hint 1222-1240: Accurate calculation of slave_repl_offset.

The changes correctly accumulate binlog offsets to compute the slave_repl_offset, ensuring accurate tracking of replication progress.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7ca2b9d and 0a3a49a.

Files selected for processing (3)
  • include/pika_admin.h (2 hunks)
  • src/pika_admin.cc (5 hunks)
  • src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (3)
  • include/pika_admin.h
  • src/pika_admin.cc
  • src/pika_server.cc

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0a3a49a and 42a0ad0.

Files selected for processing (1)
  • include/pika_repl_bgworker.h (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • include/pika_repl_bgworker.h

@chejinge chejinge merged commit 3f52ad0 into OpenAtomFoundation:unstable Aug 13, 2024
13 checks passed
chejinge added a commit that referenced this pull request Aug 13, 2024
* 1. make pika support redis sentinel
2. support client kill type pubsub/normal
3. ensure fd is removed in epoll if server wanna close fd

* fix exit process:
1. ensure NetWork Thread(Dispacher) can be stopped in time
2. ensure all queued Async WriteDB task can be done before exit

---------

Co-authored-by: chejinge <[email protected]>
cheniujh added a commit to cheniujh/pika that referenced this pull request Sep 24, 2024
* 1. make pika support redis sentinel
2. support client kill type pubsub/normal
3. ensure fd is removed in epoll if server wanna close fd

* fix exit process:
1. ensure NetWork Thread(Dispacher) can be stopped in time
2. ensure all queued Async WriteDB task can be done before exit

---------

Co-authored-by: chejinge <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.5.5 4.0.1 ☢️ Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants