pip_parse: requirements_by_platform: allow custom defined platforms #2548
Right now there is no way to do that, similar to how it is not possible to have such switches in regular Python tooling. The example in #2449 shows how people are doing it right now.
So the example shown does not have multiple requirements files (I want to support multiple platforms, where torch has a different way of installing in each). The main strategy is to rely on the following:
# file: WORKSPACE
# the base python deps management, which handles the default cases
pip_parse(
name = "python_deps",
python_interpreter_target = interpreter,
requirements_by_platform = {
"//third_party/python/locks:requirements_lock_linux_x86_64.txt": "linux_x86_64",
},
requirements_lock = "//third_party/python:requirements_lock.txt",
)
# individual requirements are also parsed with their own names for `select` for specific packages
# which are not specifically handled by `requirements_by_platform`
pip_parse(
name = "python_deps_linux_x86_64_cu_12_1",
python_interpreter_target = interpreter,
requirements_lock = "//third_party/python/locks:requirements_lock_linux_x86_64_cu_12_1.txt",
)
pip_parse(
name = "python_deps_linux_x86_64",
python_interpreter_target = interpreter,
requirements_lock = "//third_party/python/locks:requirements_lock_linux_x86_64.txt",
)

Now I can create my own requirement macro and use it everywhere to take care of platform selection:

# file: third_party/python/defs.bzl
def requirement(pkg_name):
    # Draft / pseudocode: the exact config_setting label may differ.
    # Normalizing special characters to underscores and lowercasing the name is not handled here.
    return select({
        "//third_party/python:cuda_enabled": "@python_deps_linux_x86_64_cu_12_1_{}//:pkg".format(pkg_name),
        "//conditions:default": "@python_deps_{}//:pkg".format(pkg_name),
    })
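For illustration, a minimal usage sketch (assuming the macro lives at //third_party/python/defs.bzl as in the file comment above; the consuming package and file names are hypothetical):

# file: app/BUILD (hypothetical consumer)
load("@rules_python//python:defs.bzl", "py_library")
load("//third_party/python:defs.bzl", "requirement")
py_library(
    name = "model",
    srcs = ["model.py"],
    # requirement() expands to a select() that picks the CUDA repo or the default repo.
    deps = [requirement("torch")],
)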
Or I could specifically create aliases for each package, which is problematic:

# file: third_party/python/torch/BUILD
alias(
name = "torch",
actual = select({
"cuda_enabled": "@python_deps_linux_cu_12_1_{}//:pkg".format(dep),
"default": "@python_deps_{}//:pkg".format(dep),
}),
)

Can you give me your opinion on these options? I don't have any idea which would be better in the long run, or whether there is some other, better option. Thanks!
Making this work in `requirements_by_platform` directly is not straightforward. What is more, we are moving towards using bzlmod. I think the best approach for now is for the user to define the platform-specific selects/aliases themselves.
OK, for now I've taken the following approach and it's working well enough for my use case. From what I understand, this is what you were talking about; correct me if I'm wrong. I hope it maps easily to bzlmod. Putting it out here in case someone else needs it.
# WORKSPACE
load("@rules_python//python:pip.bzl", "pip_parse")
###################### Main pip parse ####################################################
# this pip-parse is the generic pip-parse whose targets are used as fallbacks
# it relies on the rules to select correct pip package depending on the host platform
##########################################################################################
pip_parse(
name = "python_deps",
python_interpreter_target = interpreter,
requirements_by_platform = {
"//third_party/python/reqs/locks:requirements_lock_linux_x86_64.txt": "linux_x86_64",
"//third_party/python/reqs/locks:requirements_lock_macos_aarch64.txt": "osx_aarch64",
},
requirements_lock = "//third_party/python/reqs/locks:requirements_lock_linux_x86_64_cu_12_1.txt",
)
# Load the starlark macro which will define your dependencies.
load("@python_deps//:requirements.bzl", install_py_deps = "install_deps")
# Call it to define repos for your requirements.
install_py_deps()
######################### Linux Cuda pip #########################################################################
# Cuda dependencies require extra work from our side (which is not covered by stock pip)
# we have rules for specific pip packages which require extra work (example torch)
# these packages would use the platform specific pip targets defined here to install platform specific dependencies
###################################################################################################################
pip_parse(
name = "python_deps_linux_x86_64_cu_12_1",
python_interpreter_target = interpreter,
requirements_lock = "//third_party/python/reqs/locks:requirements_lock_linux_x86_64_cu_12_1.txt",
)
# Load the starlark macro which will define your dependencies.
load("@python_deps_linux_x86_64_cu_12_1//:requirements.bzl", install_py_linux_86_64_cu_12_1_deps = "install_deps")
# Call it to define repos for your requirements.
install_py_linux_86_64_cu_12_1_deps()
# platforms/gpu/BUILD
constraint_setting(
name = "gpu",
default_constraint_value="none",
visibility=["//visibility:public"],
)
constraint_value(
name = "cu_12_1",
constraint_setting = ":gpu",
visibility=["//visibility:public"],
)
constraint_value(
name = "none",
constraint_setting = ":gpu",
visibility=["//visibility:public"],
)
# platforms/BUILD
platform(
name = "linux_x86_64",
constraint_values = [
"@platforms//os:linux",
"@platforms//cpu:x86_64",
"//platforms/gpu:none",
],
visibility=["//visibility:public"],
)
platform(
name = "linux_x86_64_cu_12_1",
constraint_values = [
"@platforms//os:linux",
"@platforms//cpu:x86_64",
"//platforms/gpu:cu_12_1",
],
visibility=["//visibility:public"],
)
platform(
name = "macos_arm_aarch64",
constraint_values = [
"@platforms//os:macos",
"@platforms//cpu:aarch64",
"@platforms//gpu:none",
],
visibility=["//visibility:public"],
)
# third_party/python/lib/torch/BUILD
load("@python_deps//:requirements.bzl", "requirement")

alias(
name = "torch_linux",
actual = select({
"//platforms/gpu:cu_12_1": "@python_deps_linux_x86_64_cu_12_1_torch//:pkg",
"//conditions:default": requirement("torch"),
}),
)
alias(
name = "torch",
actual = select({
"@platforms//os:linux": ":torch_linux",
"//conditions:default": requirement("torch"),
}),
visibility=["//visibility:public"],
)

The team needs to be notified that the desired platform has to be selected explicitly when building (e.g. via `--platforms`); otherwise the default (non-CUDA) dependencies are used.
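For illustration, a hedged sketch of how a target could consume the alias above and how the CUDA platform gets selected; the consuming package and target names are hypothetical:

# file: app/BUILD (hypothetical consumer)
load("@rules_python//python:defs.bzl", "py_binary")
py_binary(
    name = "train",
    srcs = ["train.py"],
    # Resolves to the CUDA wheel only when building with
    #   bazel build //app:train --platforms=//platforms:linux_x86_64_cu_12_1
    # otherwise the default (CPU / macOS) torch is used.
    deps = ["//third_party/python/lib/torch"],
)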
I think it would translate to: special handling of pytorch to somehow map constraints to the backing repos during BUILD-file generation. IMHO, this would be a generally useful feature. We sort of have this already with pip.override for patching e.g. the library's py files, but we're still lacking a decent way to customize how the BUILD-file side of things is handled. Something a bit unique to this case is that it's not just patching a BUILD file -- there is also the separate repo that, ideally, the pypi integration would create/manage.

Another take: how do we represent arbitrary constraints with the pypi integration? Hm, maybe stated another way: generalizing requirements_by_platform to support arbitrary constraints? (This idea sounds familiar.) Ultimately, that is what we want.

A requirements file supports basic stuff, e.g. OS and CPU arch, via environment markers. IIUC, it doesn't support anything custom, though (e.g. CUDA version). I suppose one solution is for our requirements parser to support something custom, e.g. custom environment markers. Lacking this info in requirements.txt, it has to live somewhere else; that translates to something on the extension/attribute side.
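As a concrete illustration of carrying a custom constraint outside requirements.txt, here is a hedged sketch using a user-defined build flag plus a config_setting; the flag name and values are assumptions, not an existing rules_python API:

# file: third_party/python/BUILD (hypothetical)
load("@bazel_skylib//rules:common_settings.bzl", "string_flag")
# A user-defined axis that requirements.txt environment markers cannot express.
string_flag(
    name = "cuda",
    build_setting_default = "none",
    values = ["none", "12.1"],
)
config_setting(
    name = "cuda_12_1",
    flag_values = {":cuda": "12.1"},
)
# Hand-written aliases (or, eventually, generated ones) could then select() on
# :cuda_12_1 to pick the repo that carries the CUDA-specific wheels.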
We do have a …. I guess in this particular case the main issue is that we need to pull two different local versions of pytorch and map them to the same thing. Maybe using something like:

pip.parse(
hub_name = "pip",
config_setting = "my_config_setting_1",
requirements_by_platform = ...
)
pip.parse(
hub_name = "pip",
requirements_by_platform = ...,
)

This could almost work if we: …
If we decided to make this the main implementation, the generated alias would look something like:

alias(
name = "pytorch",
actual = select({
"my_config_setting_1": select({
"is_python_3.13": "@pytorch_1", # the same select would be needed even if we have the filename based config settings.
}),
"default": select({
"is_python_3.13": "@pytorch_2",
}),
}),
)

I wrote a nested select to illustrate the point that we need to do additional selects afterwards. If I am correct, this could even make it possible to further clean up the config settings that I already cleaned up in #2556; right now we have …. The way to test my hypothesis would be to get rid of the …. All of this is now much easier only because bazelbuild/bazel#19301 got resolved; when I was writing the initial …
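Since Bazel does not allow one select() to be nested inside another, here is a hedged sketch of how the two-axis choice above could be flattened with bazel_skylib's selects.config_setting_group; the setting and repo names are taken from the illustration and are not real targets:

load("@bazel_skylib//lib:selects.bzl", "selects")
# True only when both the custom setting and the Python-version setting match.
selects.config_setting_group(
    name = "my_config_setting_1_and_py_3_13",
    match_all = [":my_config_setting_1", ":is_python_3.13"],
)
alias(
    name = "pytorch",
    actual = select({
        ":my_config_setting_1_and_py_3_13": "@pytorch_1",
        # A full flattening would add one group per (setting, Python version) pair.
        "//conditions:default": "@pytorch_2",
    }),
)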
I thought about it a little more and realized what we need. For the legacy, single-requirements-file case:

alias(
name = "pkg",
actual = "@pip_repo//:pkg",
)

For the per-platform case (one repo per Python version and target platform):

alias(
name = "pkg",
actual = select({
"//_config:is_cp39_linux_aarch64": "@pip_cp39_linux_aarch64_repo//:pkg",
# ...
})
)

For the case where config settings are derived from the wheel filenames:

alias(
name = "_pkg_1",
actual = select({
"//_config:<filename_based_config_setting>": "@filename_based_repo//:pkg",
}),
)
alias(
name = "_pkg_2",
actual = select({
"//_config:<filename_based_config_setting>": "@filename_based_repo//:pkg",
}),
)
config_setting_group(
name = "_config_setting_group_1",
match_all = ["@some_pkg//:custom_user_setting", "//_config:is_cp39_linux_aarch64"],
)
alias(
name = "pkg",
actual = select({
":_config_setting_group_1": "_pkg_1",
"//conditions:default": "_pkg_2", # or fail if there is no such _pkg_2 configured.
})
)

That way we: …
I think I first want to refactor our internal code to follow this solution; then the implementation of this feature would definitely be doable. No estimate of when this would be ready, but I thought I'd communicate the direction I intend to take. If anybody wants to pick it up, feel free. :)
Allow custom platforms to be added to pip_parse
We are trying to install PyTorch, and PyTorch has a slightly different way of installing.
The pip command for installing pytorch on Linux with CPU only differs from the command for a specific GPU (CUDA) version; each typically points pip at a different PyTorch index URL.
So we have different requirements files for the Linux CPU and Linux GPU combinations. In the code, I can see that rules_python picks the requirements file depending on the current host platform it is running on; the code is here.
How do we pass custom platforms? Something like this:
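A hedged sketch of what such an API could look like; the `linux_x86_64_cu_12_1` platform key below is made up for illustration and is not something rules_python currently accepts:

# WORKSPACE (hypothetical)
load("@rules_python//python:pip.bzl", "pip_parse")
pip_parse(
    name = "python_deps",
    python_interpreter_target = interpreter,  # as defined elsewhere in the WORKSPACE
    requirements_by_platform = {
        "//third_party/python/locks:requirements_lock_linux_x86_64.txt": "linux_x86_64",
        # Hypothetical: a user-defined platform key mapped to its own lock file.
        "//third_party/python/locks:requirements_lock_linux_x86_64_cu_12_1.txt": "linux_x86_64_cu_12_1",
    },
)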
Versions
Thanks!