[KAT-1090] Simplify how we track build dependencies #112

arthurp · 2021-03-10T19:57:51Z

Currently build dependencies are tracked in at least the following places:

conda_recipe/meta.yaml
conda_recipe/environment.yml
scripts/setup_ubuntu.sh
scripts/setup_osx.sh
config/conanfile.py
Various CMakeLists.txt
README.md

This is a mess and causes real problems: different dependency lists specify different versions, so people get different build results (this happened with different versions of the C++ date parser library).

It would be great to have a single source of truth that could be maintained, with the information then distributed to all the appropriate parts of the system, either by accessing the single source directly or by regenerating the relevant file from the single source as needed.

This would also make it easier to use pipenv-like tools for those who like them, since there would be a single list of all the dependencies that could be fed into the tool.


arthurp · 2021-03-10T20:25:49Z

Thoughts, because I must always try to solve issues immediately. The dependency format should have the following capabilities:

specify alternative names for a package in different systems (e.g., APT vs Conda).
specify versions in a general format that can be converted to any system (e.g., >1.0)
specify versions in system specific forms as an escape hatch
include comments

The format should be as simple as possible, so that it can be parsed from shell scripts if needed and easily parsed from any language without any other dependencies. This rules out general-purpose formats like JSON (complex and not supported by the C++ standard library) or YAML (complex and not supported by the Python standard library). This means that some custom line-oriented format is probably simplest. Maybe like this: package [system], package [system], generic_package: version [system], generic_version

# pip is called "pip" in all the packaging systems we care about
pip: >20
arrow-cpp [conda], arrow [conan], apache-arrow [brew]: >=2.0<3.0
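To get a feel for how little code the line-oriented proposal above would need, here is a minimal Python sketch of a parser for it. The `Dependency` class, the `parse_dependencies` function, and the `"*"` key for entries without a `[system]` tag are all hypothetical names made up for this illustration. The sketch assumes comments occupy whole lines and that commas only separate entries, so a version range has to be written without a comma, as in the example above.

```python
import re
from dataclasses import dataclass, field

# Hypothetical key used for entries that carry no "[system]" tag.
GENERIC = "*"

# One entry is "value" optionally followed by "[system]".
_ENTRY = re.compile(r"^(?P<value>[^\[\]]+?)\s*(?:\[(?P<system>[^\[\]]+)\])?$")


@dataclass
class Dependency:
    # Maps a system name (or GENERIC) to the package name in that system.
    names: dict = field(default_factory=dict)
    # Maps a system name (or GENERIC) to the version constraint.
    versions: dict = field(default_factory=dict)


def _parse_entries(text):
    """Parse 'a [x], b [y], c' into {'x': 'a', 'y': 'b', GENERIC: 'c'}."""
    entries = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        match = _ENTRY.match(part)
        if match is None:
            raise ValueError(f"cannot parse entry: {part!r}")
        system = match.group("system") or GENERIC
        entries[system.strip()] = match.group("value").strip()
    return entries


def parse_dependencies(text):
    """Parse the whole file, skipping blank lines and whole-line comments."""
    deps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        names, sep, versions = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {line!r}")
        deps.append(Dependency(_parse_entries(names), _parse_entries(versions)))
    return deps
```

Each language that needs the data (shell, CMake, C++) would still need its own equivalent of this, which is the main cost of a custom format.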

aneeshdurg · 2021-03-10T20:28:58Z

Just for context, why do we support multiple dependency systems? Can we avoid this problem by restricting how we gather deps?

arthurp · 2021-03-10T21:03:59Z

We are already building packages for Ubuntu (debs) and Conda. Most people use Conan for build dependencies, but I use Conda (because it's required for the Conda packaging, so I want to test it all the time). The result is that we need dependency lists in the correct formats and names for at least the following, as I see it: Conda, Conan, Debian/Ubuntu, RPM/CentOS, pip (for Python stuff in non-Conda environments). And in fact most environments will MIX these kinds of dependencies. It's gonna be a bit confusing, but you get the idea, and I don't think it's at all impossible to get right.

arthurp · 2021-09-09T16:18:08Z

I have decided to use JSON as the data format for the dependency information. This is supported by:

CMake (via string(JSON ...))
Python standard library
A C++ library we already use
Bash via jq (which we haven't used but could)
and it's probably not hard to find a library for any environment we are working in.

The main reason not to go with something "simpler" and custom is that it would force us to write a parser in each language, including one of the worst: CMake.

Writing JSON manually isn't great for humans, but if we format it in a specific way, it shouldn't be too bad.
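As a rough illustration of the JSON approach, here is a sketch that reads a dependency list with the Python standard library and resolves names and versions for one packaging system. The schema (`version`, `names`, `version_overrides`) and the function `requirements_for` are assumptions made up for this example; the issue does not specify the actual layout.

```python
import json

# Hypothetical schema, formatted one dependency per block to stay readable.
DEPENDENCIES_JSON = """
{
  "arrow": {
    "version": ">=2.0,<3.0",
    "names": {"conda": "arrow-cpp", "conan": "arrow", "brew": "apache-arrow"}
  },
  "pip": {
    "version": ">20"
  }
}
"""


def requirements_for(system, text=DEPENDENCIES_JSON):
    """Return (name, version) pairs for one packaging system.

    Falls back to the generic name and version when the system has no
    specific entry; "version_overrides" is the system-specific escape hatch.
    """
    deps = json.loads(text)
    result = []
    for generic_name, info in sorted(deps.items()):
        name = info.get("names", {}).get(system, generic_name)
        version = info.get("version_overrides", {}).get(
            system, info.get("version", "")
        )
        result.append((name, version))
    return result
```

A generator script could turn these pairs into each system's own file format, so the JSON file stays the only place where versions are edited by hand.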

arthurp mentioned this issue Mar 10, 2021

Add version management script. Consistently use it. #90

Merged

arthurp added the "good first issue" label Mar 10, 2021

arthurp changed the title ~~Simplify how we track build dependencies~~ [KAT-1090] Simplify how we track build dependencies Aug 16, 2021

arthurp self-assigned this Sep 9, 2021
