feat: PacificA slave data replication Consistency scheme #2975
Conversation
Walkthrough
This pull request introduces enhancements to Pika's consensus and replication mechanisms, focusing on improving database write synchronization and logging. The changes span multiple files, including consensus coordination, the replication background worker, the client connection, and the protocol definition. The modifications introduce new methods for processing leader logs, managing database write offsets, and handling database write synchronization across different components of the system.
Actionable comments posted: 8
🧹 Nitpick comments (9)
include/pika_consensus.h (3)

127-127: Rename parameter for clarity and consistency
In the ProcessLeaderDB method, consider renaming the parameter binlogoffset to binlog_offset to improve readability and adhere to naming conventions. Apply this diff:

-pstd::Status ProcessLeaderDB(const uint64_t binlogoffset);
+pstd::Status ProcessLeaderDB(const uint64_t binlog_offset);

128-128: Correct method name to follow CamelCase convention
Rename GetwriteDBOffset to GetWriteDBOffset to maintain consistent CamelCase naming throughout the codebase. Apply this diff:

-void GetwriteDBOffset(LogOffset& end_offset,LogOffset& begin_offset);
+void GetWriteDBOffset(LogOffset& end_offset, LogOffset& begin_offset);

213-214: Use uniform initialization for member variables
Consider initializing end_db_offset_ and begin_db_offset_ using uniform initialization for consistency. Apply this diff:

-LogOffset end_db_offset_=LogOffset();
-LogOffset begin_db_offset_=LogOffset();
+LogOffset end_db_offset_{};
+LogOffset begin_db_offset_{};

include/pika_rm.h (1)

73-74: Improve parameter naming and formatting
- In ConsensusProcessLeaderDB, rename the parameter offset to binlog_offset for clarity.
- In ConsensusGetwriteDBOffset, correct the method name to ConsensusGetWriteDBOffset and add a space after the comma.
Apply this diff:

-pstd::Status ConsensusProcessLeaderDB(const uint64_t offset);
+pstd::Status ConsensusProcessLeaderDB(const uint64_t binlog_offset);
-void ConsensusGetwriteDBOffset(LogOffset& end_offset,LogOffset& begin_offset);
+void ConsensusGetWriteDBOffset(LogOffset& end_offset, LogOffset& begin_offset);

src/pika_consensus.cc (1)

377-382: Correct method naming to match convention and add spaces
Rename GetwriteDBOffset to GetWriteDBOffset and add spaces after commas for consistency and readability. Apply this diff:

-void ConsensusCoordinator::GetwriteDBOffset(LogOffset& end_offset,LogOffset& begin_offset)
+void ConsensusCoordinator::GetWriteDBOffset(LogOffset& end_offset, LogOffset& begin_offset)

include/pika_repl_bgworker.h (1)

37-37: Consider adding documentation for the new method.
The new HandleBGWorkerDB method appears to be part of the consistency scheme implementation, but its purpose and usage are not immediately clear from the declaration. Add a documentation comment explaining:
- The purpose of this method
- The expected format/content of the void* arg parameter
- Any assumptions or preconditions

src/pika_inner_message.proto (1)

163-167: Consider adding validation rules for db_write_offset.
The response message structure is good, but consider adding validation rules or documentation for acceptable ranges of db_write_offset.

src/pika_repl_bgworker.cc (1)

113-113: Initialize is_write_db with a named constant.
The boolean flag controls critical write behavior. Consider using a named constant for better maintainability and clarity.

- bool is_write_db=true;
+ static const bool kDefaultWriteDBEnabled = true;
+ bool is_write_db = kDefaultWriteDBEnabled;

include/pika_define.h (1)

219-223: Consider enhancing the hash function to reduce collision probability
The current hash function only uses b_offset.offset, ignoring the filenum and logic offset fields. This could lead to hash collisions when different log entries have the same offset but different file numbers. Consider incorporating more fields in the hash calculation:

struct hash_db_write_info {
  size_t operator()(const LogOffset& n) const {
-   return std::hash<uint64_t>()(n.b_offset.offset);
+   size_t h1 = std::hash<uint32_t>()(n.b_offset.filenum);
+   size_t h2 = std::hash<uint64_t>()(n.b_offset.offset);
+   return h1 ^ (h2 << 1);
  }
};
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (10)
- include/pika_consensus.h (3 hunks)
- include/pika_define.h (3 hunks)
- include/pika_repl_bgworker.h (1 hunk)
- include/pika_repl_client_conn.h (1 hunk)
- include/pika_rm.h (2 hunks)
- src/pika_consensus.cc (1 hunk)
- src/pika_inner_message.proto (5 hunks)
- src/pika_repl_bgworker.cc (2 hunks)
- src/pika_repl_client_conn.cc (1 hunk)
- src/pika_rm.cc (1 hunk)
🔇 Additional comments (11)

include/pika_consensus.h (2)

126-126: Add space after comma for consistent formatting
In the ProcessLeaderLog method declaration, add a space around the default argument's equals sign for better readability and to maintain a consistent coding style. Apply this diff:

-pstd::Status ProcessLeaderLog(const std::shared_ptr<Cmd>& cmd_ptr, const BinlogItem& attribute, bool is_write_db=true);
+pstd::Status ProcessLeaderLog(const std::shared_ptr<Cmd>& cmd_ptr, const BinlogItem& attribute, bool is_write_db = true);

211-212: Fix formatting issues for better readability
- Add spaces after commas and between the type and variable name.
- Correct the indentation and spacing for improved code clarity.
Apply this diff:

-std::unordered_map<uint64_t, std::shared_ptr<Cmd>> binlog_index;
-std::unordered_map<LogOffset,uint64_t,hash_db_write_info>offset_index;
+std::unordered_map<uint64_t, std::shared_ptr<Cmd>> binlog_index;
+std::unordered_map<LogOffset, uint64_t, hash_db_write_info> offset_index;
src/pika_consensus.cc (1)

388-392: 🛠️ Refactor suggestion
Check for existence before accessing map elements
Before accessing binlog_index[iter.second], confirm that the key exists to avoid unintended behavior:

if (iter.first.b_offset.offset > offset || binlog_index.count(iter.second) == 0) {
  continue;
}
auto cmd_ptr = binlog_index[iter.second];

Likely invalid or redundant comment.
include/pika_repl_client_conn.h (1)

27-27: Add new handler method HandleDbWriteResponse
The method HandleDbWriteResponse is appropriately added to handle database write responses. Ensure that it is implemented consistently with the existing handler methods.
src/pika_inner_message.proto (2)

11-11: LGTM: New message type for database write operations.
The addition of the kDbWrite enum value properly extends the protocol to support the new consistency scheme.

90-98: Verify field requirements in the DbWriteSync request message.
The message structure looks good, but consider:
- Should the succ field be optional, since it indicates operation success/failure?
- Consider adding documentation for each field's purpose

✅ Verification successful
The previous command failed because the file type wasn't recognized; searching more broadly to understand the usage of DbWriteSync and find all proto files in the codebase.

The succ field should remain required, and documentation would be beneficial. It should remain required because:
- It's used in a database write synchronization context where the success/failure status is crucial
- The field is part of a response message (evidenced by usage in pika_repl_client_conn.cc) where the operation result must always be communicated
- Making it optional could lead to ambiguous states where the success status is unknown

Suggested improvements — add field documentation to explain:
- node: the node involved in the sync operation
- db_name: target database name
- slot_id: slot identifier for the operation
- ack_range_start and ack_range_end: binlog offset range being acknowledged
- session_id: session identifier for the sync operation
- succ: operation success status

🏁 Scripts executed
The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Check if any other proto files reference these fields
rg "db_write_sync" --type proto

Length of output: 61

Script:
#!/bin/bash
# Search for references to DbWriteSync in all files
rg "DbWriteSync" -A 3
# Search for proto files in the repository
fd "\.proto$"

Length of output: 1514
include/pika_define.h (2)

187-191: LGTM: Clean implementation of the DbWriteChip struct
The struct follows the established pattern of the codebase and properly implements the required constructors.

286-293: LGTM: Clean extension of the WriteTask struct
The modifications properly integrate database write functionality while maintaining backward compatibility.

src/pika_repl_client_conn.cc (1)

85-93: LGTM: Consistent implementation of DB write message handling
The new case follows the established pattern for message handling and includes proper validation.

src/pika_rm.cc (2)

387-391: LGTM: Clean implementation of ConsensusGetwriteDBOffset
The method properly delegates to the coordinator while maintaining consistent parameter passing.

396-399: LGTM: Clean implementation of ConsensusProcessLeaderDB
The method properly delegates to the coordinator while maintaining consistent parameter types.
void PutOffsetIndex(LogOffset win_offset,uint64_t binlog_offset){
  offset_index[win_offset] = binlog_offset;
}
🛠️ Refactor suggestion
Pass LogOffset by const reference to improve performance
The PutOffsetIndex method accepts LogOffset win_offset by value. Passing it by const reference avoids unnecessary copies. Apply this diff:

-void PutOffsetIndex(LogOffset win_offset,uint64_t binlog_offset){
+void PutOffsetIndex(const LogOffset& win_offset, uint64_t binlog_offset){
   offset_index[win_offset] = binlog_offset;
 }
void PutCoordinatorOffsetIndex(LogOffset win_offset,uint64_t binlog_offset){
  coordinator_.PutOffsetIndex(win_offset,binlog_offset);
}
🛠️ Refactor suggestion
Pass LogOffset by const reference and fix formatting
- Pass win_offset by const reference in PutCoordinatorOffsetIndex to avoid unnecessary copying.
- Add spaces after commas for consistency.
Apply this diff:

-void PutCoordinatorOffsetIndex(LogOffset win_offset,uint64_t binlog_offset){
+void PutCoordinatorOffsetIndex(const LogOffset& win_offset, uint64_t binlog_offset){
   coordinator_.PutOffsetIndex(win_offset, binlog_offset);
 }
Status ConsensusCoordinator::ProcessLeaderLog(const std::shared_ptr<Cmd>& cmd_ptr, const BinlogItem& attribute, bool is_write_db) {
  LogOffset last_index = mem_logger_->last_offset();
  if (attribute.logic_id() < last_index.l_offset.index) {
    LOG(WARNING) << DBInfo(db_name_).ToString() << "Drop log from leader logic_id "
                 << attribute.logic_id() << " cur last index " << last_index.l_offset.index;
    return Status::OK();
  }
  if(is_write_db){
    auto opt = cmd_ptr->argv()[0];
    if (pstd::StringToLower(opt) != kCmdNameFlushdb) {
      // apply binlog in sync way
      Status s = InternalAppendLog(cmd_ptr);
      // apply db in async way
      InternalApplyFollower(cmd_ptr);
    } else {
      // this is a flushdb-binlog, both apply binlog and apply db are in sync way
      // ensure all writeDB task that submitted before has finished before we exec this flushdb
      int32_t wait_ms = 250;
      while (g_pika_rm->GetUnfinishedAsyncWriteDBTaskCount(db_name_) > 0) {
        std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
        wait_ms *= 2;
        wait_ms = wait_ms < 3000 ? wait_ms : 3000;
      }
      // apply flushdb-binlog in sync way
      Status s = InternalAppendLog(cmd_ptr);
      // applyDB in sync way
      PikaReplBgWorker::WriteDBInSyncWay(cmd_ptr);
    }
    return Status::OK();
  }
  Status s = InternalAppendLog(cmd_ptr);
  binlog_index[attribute.offset()] = cmd_ptr;
  return Status::OK();
}
🛠️ Refactor suggestion
Refactor the ProcessLeaderLog method for better maintainability
The ProcessLeaderLog method contains deeply nested conditionals and repetitive code segments. Refactor it to improve readability and maintainability: consider extracting the flushdb handling logic into a separate private method and simplifying the nested if-else blocks.
Status ConsensusCoordinator::ProcessLeaderDB(const uint64_t offset) {
  end_db_offset_=LogOffset();
  begin_db_offset_=LogOffset();
  begin_db_offset_.b_offset.offset=UINT64_MAX;
  for (const auto& iter : offset_index) {
    if (iter.first.b_offset.offset > offset || binlog_index.count(iter.second) == 0) {
      continue;
    }
    auto cmd_ptr = binlog_index[iter.second];
    auto opt = cmd_ptr->argv()[0];
    if (pstd::StringToLower(opt) != kCmdNameFlushdb) {
      InternalApplyFollower(cmd_ptr);
    } else {
      // this is a flushdb-binlog, both apply binlog and apply db are in sync way
      // ensure all writeDB task that submitted before has finished before we exec this flushdb
      int32_t wait_ms = 250;
      while (g_pika_rm->GetUnfinishedAsyncWriteDBTaskCount(db_name_) > 0) {
        std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
        wait_ms *= 2;
        wait_ms = wait_ms < 3000 ? wait_ms : 3000;
      }
      // apply flushdb-binlog in sync way
      Status s = InternalAppendLog(cmd_ptr);
      // applyDB in sync way
      PikaReplBgWorker::WriteDBInSyncWay(cmd_ptr);
    }
    if (iter.first.b_offset.offset > end_db_offset_.b_offset.offset) {
Ensure thread safety when accessing shared resources
The method ProcessLeaderDB accesses offset_index and binlog_index without any synchronization mechanism. If these maps are accessed concurrently, it could lead to race conditions. Add appropriate mutex locks to ensure thread safety when accessing these shared data structures.
if(is_write_db){
  auto opt = cmd_ptr->argv()[0];
  if (pstd::StringToLower(opt) != kCmdNameFlushdb) {
    // apply binlog in sync way
    Status s = InternalAppendLog(cmd_ptr);
    // apply db in async way
    InternalApplyFollower(cmd_ptr);
  } else {
    // this is a flushdb-binlog, both apply binlog and apply db are in sync way
    // ensure all writeDB task that submitted before has finished before we exec this flushdb
    int32_t wait_ms = 250;
    while (g_pika_rm->GetUnfinishedAsyncWriteDBTaskCount(db_name_) > 0) {
      std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
      wait_ms *= 2;
      wait_ms = wait_ms < 3000 ? wait_ms : 3000;
    }
    // apply flushdb-binlog in sync way
    Status s = InternalAppendLog(cmd_ptr);
    // applyDB in sync way
    PikaReplBgWorker::WriteDBInSyncWay(cmd_ptr);
  }
  return Status::OK();
}
Status s = InternalAppendLog(cmd_ptr);
binlog_index[attribute.offset()] = cmd_ptr;
return Status::OK();
}
Avoid unbounded waiting in the retry mechanism
The loop waiting for unfinished write-DB tasks doubles wait_ms each time (capped at 3 s per sleep) but has no upper bound on total wait time, so it never terminates if the tasks never drain. Apply this diff to implement a maximum total wait time:

 int32_t wait_ms = 250;
+int32_t total_wait_ms = 0;
 while (g_pika_rm->GetUnfinishedAsyncWriteDBTaskCount(db_name_) > 0) {
   std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
+  total_wait_ms += wait_ms;
   wait_ms *= 2;
   wait_ms = wait_ms < 3000 ? wait_ms : 3000;
+  if (total_wait_ms >= 10000) {  // Maximum total wait time of 10 seconds
+    LOG(WARNING) << "Exceeded maximum wait time for unfinished write tasks.";
+    break;
+  }
 }
@@ -139,6 +139,14 @@ void PikaReplBgWorker::HandleBGWorkerWriteBinlog(void* arg) {
   slave_db->SetReplState(ReplState::kTryConnect);
   return;
 }
🛠️ Refactor suggestion
Add error handling for binlog processing failures.
The code continues processing even if the binlog item decode fails. Consider adding proper error handling and logging:

+  if (worker->binlog_item_.offset().empty()) {
+    LOG(WARNING) << "Invalid binlog offset after decode";
+    slave_db->SetReplState(ReplState::kTryConnect);
+    return;
+  }

Also applies to: 149-149
if(!is_write_db){
  std::shared_ptr<SyncMasterDB> db = g_pika_rm->GetSyncMasterDBByName(DBInfo(db_name));
  LogOffset cur_logoffset;
  ParseBinlogOffset(binlog_res.binlog_offset(),&cur_logoffset);
  db->PutCoordinatorOffsetIndex(cur_logoffset,worker->binlog_item_.offset());
}
Critical: Potential race condition in coordinator offset update.
The coordinator offset index update block has several issues:
- Redundant DB lookup inside the loop
- No synchronization around the offset update
- Inconsistent spacing in the code block
Consider refactoring to:

- if(!is_write_db){
-   std::shared_ptr<SyncMasterDB> db = g_pika_rm->GetSyncMasterDBByName(DBInfo(db_name));
-   LogOffset cur_logoffset;
-   ParseBinlogOffset(binlog_res.binlog_offset(),&cur_logoffset);
-   db->PutCoordinatorOffsetIndex(cur_logoffset,worker->binlog_item_.offset());
- }
+ if (!is_write_db) {
+   LogOffset cur_logoffset;
+   ParseBinlogOffset(binlog_res.binlog_offset(), &cur_logoffset);
+   if (db) {  // Use existing db pointer from outer scope
+     std::lock_guard<std::mutex> lock(db->GetOffsetMutex());
+     db->PutCoordinatorOffsetIndex(cur_logoffset, worker->binlog_item_.offset());
+   }
+ }

Committable suggestion skipped: line range outside the PR's diff.
void PikaReplClientConn::HandleDbWriteResponse(void* arg) {
  std::unique_ptr<ReplClientTaskArg> task_arg(static_cast<ReplClientTaskArg*>(arg));
  std::shared_ptr<net::PbConn> conn = task_arg->conn;
  std::shared_ptr<InnerMessage::InnerResponse> response = task_arg->res;

  const InnerMessage::InnerResponse_DbWriteSync& dbwrite_sync_response = response->db_write_sync();
  int32_t session_id = dbwrite_sync_response.session_id();
  const InnerMessage::Slot& db_response = dbwrite_sync_response.slot();
  const std::string& db_name = db_response.db_name();
  const InnerMessage::BinlogOffset& binlog_offset = dbwrite_sync_response.db_write_offset();

  std::shared_ptr<SyncMasterDB> db = g_pika_rm->GetSyncMasterDBByName(DBInfo(db_name));
  if (!db) {
    LOG(WARNING) << db_name << "Not found.";
  }
  db->ConsensusProcessLeaderDB(binlog_offset.offset());
  LogOffset ack_end;
  LogOffset ack_start;
  db->ConsensusGetwriteDBOffset(ack_end,ack_start);
  g_pika_rm->SendBinlogSyncAckRequest(db_name, ack_start, ack_end);
}
Add proper error handling and validation
The current implementation has several issues:
- No proper error handling when db is null
- Potential null pointer dereference at line 115
- Missing validation of binlog_offset
Consider this safer implementation:

 void PikaReplClientConn::HandleDbWriteResponse(void* arg) {
   std::unique_ptr<ReplClientTaskArg> task_arg(static_cast<ReplClientTaskArg*>(arg));
   std::shared_ptr<net::PbConn> conn = task_arg->conn;
   std::shared_ptr<InnerMessage::InnerResponse> response = task_arg->res;
   const InnerMessage::InnerResponse_DbWriteSync& dbwrite_sync_response = response->db_write_sync();
   int32_t session_id = dbwrite_sync_response.session_id();
   const InnerMessage::Slot& db_response = dbwrite_sync_response.slot();
   const std::string& db_name = db_response.db_name();
   const InnerMessage::BinlogOffset& binlog_offset = dbwrite_sync_response.db_write_offset();
   std::shared_ptr<SyncMasterDB> db = g_pika_rm->GetSyncMasterDBByName(DBInfo(db_name));
   if (!db) {
     LOG(WARNING) << db_name << "Not found.";
+    return;
   }
+
+  if (binlog_offset.offset() == 0) {
+    LOG(WARNING) << "Invalid binlog offset received for " << db_name;
+    return;
+  }
   db->ConsensusProcessLeaderDB(binlog_offset.offset());
   LogOffset ack_end;
   LogOffset ack_start;
   db->ConsensusGetwriteDBOffset(ack_end,ack_start);
   g_pika_rm->SendBinlogSyncAckRequest(db_name, ack_start, ack_end);
 }
The logic on the leader side needs to be reorganized; after that, the code needs to be changed. It may take some time. @hengsheng
Currently, per the leader-side requirements, after receiving a write-binlog request the follower does not write to the DB immediately; it waits until the leader instance has received and replied, and a write-db request is then sent.
My current handling is: after receiving a binlog request, the entry is added to a prepared list; on the next write-db request, all binlogs at or before that request's offset are written to the DB. In other words, once the leader has received "prepared" from all follower instances, it sets the commit state and replies to all followers, telling them they can commit the binlogs in their prepared lists.
For strong master-slave consistency, I added a "switch" in the follower-side handling. Currently, on receiving a binlog request, both the binlog and the DB are written; only when the binlog parses out a request that says not to write the DB (to be negotiated and added once the leader side is done) will the DB write be skipped this time.