-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build(core-clp): Remove boost::iostream
dependency.
#450
build(core-clp): Remove boost::iostream
dependency.
#450
Conversation
Do you want to try merging main to see if Centos 9 resolves the workflow failure? |
WalkthroughThe pull request modifies several Changes
Suggested reviewers
Warning Rate limit exceeded@LinZhihao-723 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 19 minutes and 28 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (5)
components/core/src/clp/streaming_archive/reader/Segment.cpp (2)
Line range hint
46-56
: Enhance error handling with specific failure detailsConsider capturing and logging the specific reason for memory mapping failure to aid in debugging.
m_memory_mapped_segment_file.emplace(segment_path); if (false == m_memory_mapped_segment_file.has_value()) { SPDLOG_ERROR( - "streaming_archive::reader:Segment: Unable to memory map the compressed " - "segment with path: {}", - segment_path.c_str() + "streaming_archive::reader:Segment: Unable to memory map the compressed " + "segment with path: {} (errno: {}, message: {})", + segment_path.c_str(), + errno, + std::strerror(errno) ); return ErrorCode_Failure; }
Replace boost::filesystem with std::filesystem in multiple files
The verification shows extensive usage of
boost::filesystem
across the codebase, particularly in:
- File operations (exists, file_size, remove)
- Path manipulations (path, parent_path, weakly_canonical)
- Directory operations (is_directory, create_directory, directory_iterator)
The migration to
std::filesystem
will require changes in:
- Header includes
- Namespace usage
- Some API adjustments (e.g.,
boost::filesystem::filesystem_error
tostd::filesystem::filesystem_error
)🔗 Analysis chain
Line range hint
8-8
: Replace boost::filesystem with std::filesystemSince the PR's objective is to remove boost dependencies, consider replacing
boost::filesystem
withstd::filesystem
(available since C++17).-#include <boost/filesystem.hpp> +#include <filesystem>Also update any
boost::filesystem
namespace usage tostd::filesystem
.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for remaining boost::filesystem usage rg "boost::filesystem" --type cppLength of output: 14835
components/core/tests/test-StreamingCompression.cpp (3)
4-4
: Consider replacing boost::filesystem with std::filesystemWhile the PR aims to remove boost dependencies,
boost::filesystem
is still used for file cleanup operations. Consider usingstd::filesystem
(C++17) instead:-#include <boost/filesystem.hpp> +#include <filesystem> // ... rest of the file ... - boost::filesystem::remove(compressed_file_path); + std::filesystem::remove(compressed_file_path);Also applies to: 232-232
Line range hint
19-23
: Consider using std::vector for safer memory managementReplace manual memory management with std::vector to prevent potential memory leaks and improve exception safety:
- char* uncompressed_data = new char[uncompressed_data_size]; - for (size_t i = 0; i < uncompressed_data_size; ++i) { - uncompressed_data[i] = (char)('a' + (i % 26)); - } + std::vector<char> uncompressed_data(uncompressed_data_size); + std::generate(uncompressed_data.begin(), uncompressed_data.end(), + [n = 0]() mutable { return (char)('a' + ((n++) % 26)); });
Line range hint
1-233
: Consider using parameterized tests to reduce code duplicationThe test file contains significant repetition in the decompression verification steps. Consider using Catch2's parameterized testing features to reduce duplication:
SCENARIO("StreamingCompression decompression verification") { std::vector<size_t> chunk_sizes = { uncompressed_data_size / 100, uncompressed_data_size / 50, uncompressed_data_size / 25, // ... other sizes }; for (const auto& chunk_size : chunk_sizes) { GIVEN("A chunk of size " + std::to_string(chunk_size)) { REQUIRE(ErrorCode_Success == decompressor.get_decompressed_stream_region( uncompressed_bytes, decompressed_data, chunk_size)); REQUIRE(memcmp(uncompressed_data, decompressed_data, chunk_size) == 0); memset(decompressed_data, 0, uncompressed_data_size); uncompressed_bytes += chunk_size; } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
components/core/src/clp/clg/CMakeLists.txt
(1 hunks)components/core/src/clp/clo/CMakeLists.txt
(1 hunks)components/core/src/clp/clp/CMakeLists.txt
(1 hunks)components/core/src/clp/make_dictionaries_readable/CMakeLists.txt
(1 hunks)components/core/src/clp/streaming_archive/reader/Segment.cpp
(3 hunks)components/core/src/clp/streaming_archive/reader/Segment.hpp
(2 hunks)components/core/tests/test-StreamingCompression.cpp
(2 hunks)
🔇 Additional comments (12)
components/core/src/clp/make_dictionaries_readable/CMakeLists.txt (1)
Line range hint 48-53
: LGTM! Boost::iostreams dependency successfully removed.
The removal of Boost::iostreams while maintaining other necessary Boost components aligns well with the PR objective.
Let's verify that all Boost::iostreams usage has been replaced in the source files:
✅ Verification successful
Boost::iostreams successfully removed, only required Boost dependencies remain
The verification confirms that:
- No traces of Boost::iostreams or mapped_file_source remain in the codebase
- Only necessary Boost components are present:
- boost/filesystem.hpp for path handling
- boost/program_options.hpp for command-line parsing
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining Boost::iostreams usage in source files
# Search for any remaining boost::iostreams includes or usage
echo "Checking for boost::iostreams includes..."
rg -l "boost.*iostreams" components/core/src/clp/make_dictionaries_readable/
# Search for mapped_file_source specifically
echo "Checking for mapped_file_source usage..."
rg -l "mapped_file_source" components/core/src/clp/make_dictionaries_readable/
Length of output: 341
Script:
#!/bin/bash
# Let's try a more comprehensive search for any Boost I/O related code
# Search for any Boost I/O related patterns with context
echo "=== Searching for Boost I/O patterns ==="
rg -i "boost.*(io|stream|mapped)" components/core/src/clp/make_dictionaries_readable/ -C 2
# Check the implementation files for memory mapping
echo -e "\n=== Checking memory mapping implementation ==="
fd -e cpp -e hpp . components/core/src/clp/make_dictionaries_readable/ --exec cat {}
# Look for any remaining Boost includes
echo -e "\n=== Checking all Boost includes ==="
rg "#include.*boost" components/core/src/clp/make_dictionaries_readable/
Length of output: 13969
components/core/src/clp/streaming_archive/reader/Segment.hpp (2)
5-5
: LGTM! Proper header inclusion for the new dependencies.
The new includes are correctly ordered and necessary for the transition from boost::iostreams to the custom memory-mapped file implementation.
Also applies to: 10-10
56-56
: Excellent choice using std::optional for the memory-mapped file.
The transition from boost::iostreams to std::optional is a good modernization that makes the nullable state explicit and provides better ownership semantics.
Let's verify the interface compatibility between the old and new implementations:
✅ Verification successful
Let me gather more information about the ReadOnlyMemoryMappedFile implementation.
Interface compatibility verified successfully
The change from boost::iostreams::mapped_file_source
to std::optional<ReadOnlyMemoryMappedFile>
is fully compatible. The new ReadOnlyMemoryMappedFile
class provides:
- Memory-mapped file functionality through native POSIX
mmap
- A
get_view()
method that returns a view of the mapped memory - Proper RAII management with move/copy operations explicitly deleted
- Proper error handling through
OperationFailed
exception - Special handling for empty files
The std::optional
wrapper appropriately handles the case when the file mapping isn't initialized, which was previously implicit in the boost implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that ReadOnlyMemoryMappedFile provides equivalent functionality to boost::iostreams::mapped_file_source
# Search for the ReadOnlyMemoryMappedFile interface
echo "Searching for ReadOnlyMemoryMappedFile interface..."
rg -t hpp "class.*ReadOnlyMemoryMappedFile"
# Search for usage patterns of the old boost implementation to ensure all cases are covered
echo "Searching for previous boost::iostreams usage patterns..."
rg -B 2 -A 2 "boost::iostreams::mapped_file_source"
Length of output: 2911
Script:
#!/bin/bash
# Search for ReadOnlyMemoryMappedFile implementation and usage
rg -g "*.{cpp,hpp,h}" "ReadOnlyMemoryMappedFile" -B 3 -A 3
# Search for memory mapped file related methods in Segment class
ast-grep --pattern 'class Segment {
$$$
}'
Length of output: 12015
components/core/src/clp/streaming_archive/reader/Segment.cpp (2)
36-44
: LGTM: Proper error handling and resource cleanup
The error handling for previously opened memory mapped files is well implemented with appropriate warning messages and cleanup.
66-68
: LGTM: Proper resource cleanup
The cleanup sequence is correct, ensuring proper release of resources.
components/core/src/clp/clg/CMakeLists.txt (2)
Line range hint 1-132
: Verify memory mapping implementation transition
The source files list includes ReadOnlyMemoryMappedFile, which replaces Boost's memory mapping functionality. Let's ensure all relevant files have been updated.
#!/bin/bash
# Description: Verify memory mapping implementation transition
# Expected: All files should use the new ReadOnlyMemoryMappedFile class
# Find files that might use memory mapping
rg -l "mapped_file|mmap" components/core/src/clp/clg/
# Check usage of ReadOnlyMemoryMappedFile
ast-grep --pattern 'ReadOnlyMemoryMappedFile'
Line range hint 133-143
: LGTM! Boost::iostreams successfully removed.
The removal of Boost::iostreams while retaining other necessary Boost components aligns with the PR objectives.
Let's verify that no source files still include Boost iostreams headers:
✅ Verification successful
Boost::iostreams dependency removal verified successfully
The verification confirms that:
- No remaining Boost::iostreams includes or usage in the clg directory
- No direct usage of the old mapped_file or mmap functionality
- ReadOnlyMemoryMappedFile is properly referenced in CMakeLists.txt
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining boost/iostreams includes
# Expected: No matches as the dependency has been removed
# Search for boost/iostreams includes
rg -l "boost/iostreams" components/core/src/clp/clg/
# Search for boost::iostreams namespace usage
rg -l "boost::iostreams" components/core/src/clp/clg/
Length of output: 803
Script:
#!/bin/bash
# Let's check for any remaining boost/iostreams includes or usage specifically in the clg directory
rg -l "boost/iostreams|boost::iostreams" components/core/src/clp/clg/
# Also check for any direct usage of mapped_file or mmap in clg directory
rg -l "mapped_file|mmap" components/core/src/clp/clg/
# Let's see how the new ReadOnlyMemoryMappedFile is being used in clg
rg -l "ReadOnlyMemoryMappedFile" components/core/src/clp/clg/
Length of output: 232
components/core/src/clp/clo/CMakeLists.txt (1)
161-161
: LGTM! Successfully removed Boost::iostreams dependency.
The removal of Boost::iostreams from the target libraries aligns with the PR objective and follows the implementation of the custom memory-mapped file solution.
components/core/src/clp/clp/CMakeLists.txt (3)
174-174
: LGTM! Boost::iostreams successfully removed.
The removal of Boost::iostreams while retaining other necessary Boost components aligns well with the PR objectives.
Line range hint 1-183
: Consider updating dependency documentation
Since this represents a significant dependency change, consider updating the project's documentation to reflect the removal of Boost::iostreams and the transition to the custom memory-mapped file implementation.
#!/bin/bash
# Description: Check for documentation files that might need updates
# Look for common documentation files
fd -e md -e rst --exec rg -l "boost.*iostreams|dependencies|build requirements"
Line range hint 174-183
: Verify remaining Boost dependencies
Let's verify that the remaining Boost dependencies (filesystem, program_options) aren't indirectly pulling in iostreams.
✅ Verification successful
No indirect Boost::iostreams dependencies found in clp target
The review of the codebase shows that while Boost::iostreams is used in some components, it's only used for its memory-mapped file functionality (mapped_file
) and not for any streaming or filtering capabilities. The CMake dependencies for the clp
target are safe as they don't indirectly pull in any other Boost::iostreams features.
- The only Boost::iostreams usage is in:
- ZstdDecompressor (clp_s component)
- Segment reader (glt component)
- LogtypeTableManager (glt component)
All these uses are isolated to memory mapping functionality and don't affect the clp
target's dependencies.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for indirect Boost::iostreams dependencies
# Search for any remaining boost::iostreams usage
rg -l "boost.*iostreams|boost/iostreams" --type cpp --type hpp
# Check CMake files for indirect Boost dependencies
fd -e cmake -e txt -x rg -l "Boost::iostreams|boost.*iostreams"
Length of output: 992
Script:
#!/bin/bash
# Let's check the content of the CMake files that mention Boost::iostreams
echo "=== components/core/src/glt/glt/CMakeLists.txt ==="
cat components/core/src/glt/glt/CMakeLists.txt
echo -e "\n=== components/core/src/clp_s/CMakeLists.txt ==="
cat components/core/src/clp_s/CMakeLists.txt
echo -e "\n=== components/core/CMakeLists.txt ==="
cat components/core/CMakeLists.txt
# Also check if there are any direct includes of boost/iostreams in the codebase
echo -e "\n=== Checking for direct includes of boost/iostreams ==="
rg "include.*boost/iostreams" --type cpp --type hpp
Length of output: 33947
Script:
#!/bin/bash
# Let's check for any direct includes of boost/iostreams with proper file extensions
rg "include.*boost/iostreams" -t cpp -t h -t hpp -t cc
# Also check for any usage of boost::iostreams namespace
rg "boost::iostreams::" -t cpp -t h -t hpp -t cc
Length of output: 161
Script:
#!/bin/bash
# Search for boost/iostreams includes and usage without file type filters
rg "include.*boost/iostreams"
# Search for boost::iostreams namespace usage
rg "boost::iostreams::"
# Also check if there are any files that might be using iostreams indirectly
rg -l "iostream|istream|ostream|filtering_stream|filtering_streambuf"
Length of output: 4821
components/core/tests/test-StreamingCompression.cpp (1)
153-156
: LGTM! Clean RAII-style memory mapping implementation
The new implementation using ReadOnlyMemoryMappedFile
follows modern C++ practices:
- RAII-style resource management
- Proper const correctness
- Clean view-based access to mapped memory
boost::iostream
dependency.boost::iostream
dependency.
boost::iostream
dependency.boost::iostream
dependency.
@kirkrodrigues Yeah Centos 9 resolves the workflow failure. Can u check the commit message? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
build(core-clp): Remove `boost::iostream` dependency.
boost::iostream
dependency.boost::iostream
dependency.
Description
PR #445 has implemented our own memory-mapped file. With our implementation, we can now remove
boost::iostream
dependency from clp and unit tests.This PR makes necessary changes to replace
boost
memory-mapped files withclp::ReadOnlyMemoryMappedFile
, and remove theboost::iostream
dependency from the cmake files ofclp
,clg
,clo
,make-dictionaries-readable
, andunitTest
. Sinceboost::iostream
is still used inclp-s
andglt
, it is not yet fully removed from our library dependency.Validation performed
Ensure all unit tests passed.
Summary by CodeRabbit
New Features
ReadOnlyMemoryMappedFile
class for improved memory mapping functionality.Bug Fixes
Segment
class.Tests
ReadOnlyMemoryMappedFile
class, enhancing the encapsulation of memory mapping functionality.Documentation