Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganize benchmarks/dbscan #1217

Merged
merged 9 commits into from
Feb 26, 2025

Conversation

aprokop
Copy link
Contributor

@aprokop aprokop commented Feb 25, 2025

The current code mixes different clustering algorithms in a single place. It has many issues:

  • unclear parameters for each algorithm
  • difficulty in adding new algorithms
  • hard to read code

The main motivation for this cleanup is to prepare for adding a distributed dbscan benchmark.

This PR:

  • Splits dbscan benchmark into 3 (dbscan, hdbscan, mst)
  • Refactors code
  • Removes unused data converter

Copy link
Contributor

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we moved the common instantiation things into a library to speed up compilation.

@aprokop
Copy link
Contributor Author

aprokop commented Feb 25, 2025

I would prefer if we moved the common instantiation things into a library to speed up compilation.

Can you please elaborate?

What I'd really like is to be able to specify in CMakeLists.txt which instantiations to do, and not having to edit the source files switch statements. Then, I would just instantiate it for a couple dimensions by default. When needing more, like a paper, I think it would be OK to allow longer compilation times.

@dalg24
Copy link
Contributor

dalg24 commented Feb 25, 2025

We are compiling the same thing multiple times, one for each exe

@aprokop
Copy link
Contributor Author

aprokop commented Feb 25, 2025

We are compiling the same thing multiple times, one for each exe

Which thing?

Copy link
Contributor

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.
Looks good after you fix them.


add_executable(ArborX_Benchmark_DBSCAN.exe dbscan.cpp)
target_include_directories(ArborX_Benchmark_DBSCAN.exe PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
target_link_libraries(ArborX_Benchmark_DBSCAN.exe ArborX::ArborX Boost::program_options cluster_benchmark_helpers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
target_link_libraries(ArborX_Benchmark_DBSCAN.exe ArborX::ArborX Boost::program_options cluster_benchmark_helpers)
target_link_libraries(ArborX_Benchmark_DBSCAN.exe PRIVATE ArborX::ArborX Boost::program_options cluster_benchmark_helpers)

Copy link
Contributor Author

@aprokop aprokop Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't really matter for the executable. We don't do this anywhere else.

@aprokop
Copy link
Contributor Author

aprokop commented Feb 26, 2025

Couple builds did not start (HIP, SYCL) but they passed before clang-tidy check, so that's fine.

@aprokop aprokop merged commit 0ff7c67 into arborx:master Feb 26, 2025
1 of 2 checks passed
@aprokop aprokop deleted the reorganize_dbscan_benchmark branch February 26, 2025 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants