AEP1 - Provider Catalogs #45

flisboac · 2022-04-21T02:15:30Z

flisboac
Apr 21, 2022

Migrating the Catalogs feature request from #33 to a new Github discussion. All the relevant chat history is here; if I miss anything, or if a rewrite is needed with a "formal" proposal (i.e. more focused), just tell me!

P.S.: AEP stands for Antidote Enhancement Proposal 😎

Reading the documentation again, and some of the source-code, I now have a better understanding of the overall mechanism used by Antidote.

Not only the factory provider, but also the indirect provider, requires you to specify a factory function of sorts. I believe it must work this way because it's not possible to guarantee that the implementation classes are loaded (as in imported from modules and made available to the injector/app) unless the app import them at some point. Those factories guarantee that, and shifts the selection of implementation candidates to the user. It's a really smart decision, and I'm quite impressed.

Such a design forces your injection site (e.g. your app/lib) to indirectly depend on a predefined set of implementations. In a way, you still have the benefits of dependency injection, but the lack of service discovery makes implementing auto-provided library services a bit harder. I'll try to illustrate what I mean.

About multi injection

Back to the logger domain. As an example, suppose I have a "core" library declaring an interface LoggerSink. Then I may want to offer to users different implementations of a log sink, that will be injected by means of some indirect factory. An implementation would look like this:

Code snippet: On library "my_logging_core" (with basic logging framework definitions)

# 
# On library "my_logging_core" (with basic logging framework definitions)
# 

from abc import ABCMeta, abstractmethod
from typing import Callable, List, Optional, Sequence, Set, Tuple, Type, TypeVar, Union
from typing_extensions import TypeAlias

from antidote import world, Service, Provide


class LogSink(Service, metaclass=ABCMeta):
    """Sends logs somewhere."""
    @abstractmethod
    def log(self, message: str) -> None: ...

class Logger(Service, metaclass=ABCMeta):
    """Orchestrates log sinking."""
    def log(self, message: str) -> None: ...

TIndirect = TypeVar('TIndirect', covariant=True)
TIndirectFactory: TypeAlias = Callable[
    ...,
    Union[
        TIndirect,
        List[TIndirect],
        Tuple[TIndirect],
        Set[TIndirect],
    ],
]
TIndirectFilter: TypeAlias = Callable[[TIndirect, TIndirectFactory], bool]

def multi_inject(
    indirect_type: Type[TIndirect],
    *factories: TIndirectFactory,

    # Not necessary at all, but a possibility
    when: Optional[TIndirectFilter] = lambda: True,
    required: Optional[Union[bool, int]] = False,
) -> Sequence[TIndirect]:
    expected_typename = indirect_type.__name__
    def injector():
        selected: List[TIndirect] = []

        for factory in factories:
            candidates = world.get[indirect_type](indirect_type @ factory)
            if not isinstance(candidate, (list, tuple, set)):
                if not isinstance(candidates, indirect_type):
                    typename = type(candidates).__name__
                    raise ValueError(
                        f'Multi-injection candidate must be an instance of "{expected_typename}", got "{typename}".'
                    )
                candidates = [candidates]

            for candidate in candidates:
                if when(candidate, factory):
                    selected.append(candidate)

        #  The assertions could be better, but should give an idea of what I mean
        if required is True:
            assert len(selected) > 0, \
                'Multi-injection of "{expected_typename}" is required, but no suitable candidate was found'
        elif isinstance(required, int):
            assert len(selected) >= required, \
                'Multi-injection of "{expected_typename}" is required,' \
                ' but not enough suitable candidates were found (requires at least {required})'

        return tuple(selected)

    return injector

Code snippet: On library "my_logging_remote" (with sinks for some external logging service/api)

# 
# On library "my_logging_remote" (with sinks for some external logging service/api)
# 

from antidote import implementation, Service
from my_logging_core import LogSink


class CloudwatchLogSink(Service, LogSink):
    ...

class LogstashLogSink(Service, LogSink):
    ...

# Don't want to complicate the snippet any more than it already is. :x
# Please imagine here some sort of configuration injection, and all that.
@implementation(LogSink, permanent=True)
def remote_log_sink_inject() -> Sequence[LogSink]:
    return (CloudwatchLogSink(), LogstashLogSink())

Code snippet: On library "my_logging_os" (with sinks specialized on services in the local OS)

# 
# On library "my_logging_os" (with sinks specialized on the local OS)
# 

from antidote import implementation, Service
from my_logging_core import LogSink

class SyslogLogSink(Service, LogSink):
    ...

# Don't want to complicate the snippet any more than it already is. :x
# Please imagine here some sort of configuration injection, and all that.
@implementation(LogSink, permanent=True)
def os_log_sink_inject() -> Sequence[LogSink]:
    return (SyslogLogSink(),)

Code snippet: On library "my_logging_fd" (with sinks specialized on file descriptors)

# 
# On library "my_logging_fd" (with sinks specialized on file descriptors)
# 

from antidote import implementation, Service
from my_logging_core import LogSink


class ConsoleLogSink(Service, LogSink):
    ...

class RotatingFileLogSink(Service, LogSink):
    ...

@implementation(LogSink, permanent=True)
def fd_log_sink_inject() -> Sequence[LogSink]:
    return (ConsoleLogSink(), RotatingFileLogSink())

Code snippet: On library "my_logging" (the entrypoint for the logging framework)

# 
# On library "my_logging" (the entrypoint for the logging framework)
# 

from typing_extensions import TypeAlias, Annotated
from antidote import implementation, Service, From

from my_logging_core import LogSink, Logger
from my_logging_remote import remote_log_sink_inject
from my_logging_os import os_log_sink_inject
from my_logging_fd import fd_log_sink_inject

class DefaultLogger(Service, Logger):
    """Orchestrates log sinking."""

    def __init__(self, sinks: Sequence[LogSink]):
        self._sinks = sinks
    
    def log(self, message: str) -> None:
        for sink in self.sinks:
            sink.write(message)


all_log_sinks = multi_inject(
    fd_log_sink_inject,
    os_log_sink_inject,
    remote_log_sink_inject,
)

ProvidedAllLogSinks: TypeAlias = Annotated[Sequence[LogSink], From[all_log_sinks]]

@implementation(Logger, permanent=True)
def default_logger(sinks: ProvidedAllLogSinks) -> Logger:
    return DefaultLogger(sinks)

ProvidedLogger: TypeAlias = Annotated[Logger, From[default_logger]]

Code snippet: On a final app (i.e. actually using the "logging framework")

# 
# On a final app
# 

from antidote import inject
from my_logging import ProvidedLogger

@inject
def main(logger: ProvidedLogger):
    # ... do some work
    logger.log('done!')  # would log to many different sinks (or less)


main()

(I haven't tested this code, but I suppose it should work. Please correct me if I'm wrong!)

That is all completely fine for smaller apps, domains or teams. But as those increase in size or scope, or are further broken down, maintaining such a tight grip of dependencies at the library level (in my example, through all_log_sinks, or similar mechanisms) can become increasingly harder. As a library or framework author, the more freedom you can give the user, the better. I agree we should not suggest bad practices, or overwhelm the user with options, but in this case specifically, a better service discovery may allow those authors to loosen coupling even more, and allow more decentralization whilst maintaining a sane baseline (in the form of interfaces, and similar elements).

Concerning this scenario, one could say that the app developer already have the freedom he needs, but that freedom comes at the cost of unnecessary complexity. For example, he could reuse the multi_inject mechanism to select exactly which sinks he wants to use, and then construct the Logger himself:

Code snippet: On a final app with customized log sink factories

from antidote import inject, world
from my_logging import DefaultLogger

from my_logging_core import LogSink, Logger, multi_inject
from my_logging_os import os_log_sink_inject
from my_logging_fd import fd_log_sink_inject

app_log_sinks = multi_inject(
    fd_log_sink_inject,
    os_log_sink_inject,
)

@inject
def main():
    # Note that I cannot reuse the default logger factory, because I want to use a
    # different set of sinks than the defaults provided by the statically specified
    # indirect provider:
    
    # >>> logger = world.get[Logger](Logger @ default_logger)
    # <<< (Not possible!)

    # Because of that, we need to construct a logger by hand, or make a new factory.
    # This requires us to have knowledge of some implementation, and hurts 
    # testability in a way (because dependency is now fixed).
    # The need for using world inside a function that's in principle auto-wired
    # does not look very good (also, the same can be said about `default_logger`).
    logger_sinks = world.get[LogSink](LogSink @ app_log_sinks)
    logger = DefaultLogger(sinks)

    # Now, the rest of the app follows as before.
    
    # ... do some work

    logger.log('done!')  # would log to 4 different sinks (or less)

main()

This use case could be handled better, and by Antidote itself. What if we had a concept in Antidote for a collection of provider elements (Service, Component, Factories, Indirects, etc), much in the same vein as Modules in NestJS or Modules in AngularJS? The idea is for a library author to provide said collection, and it's up to the user to select which of those collections he wants to include in the auto-injection mechanism.

I would call such a collection a catalog (as in service catalog, etc), so that it does not conflict with Python's concept of a module. Also, I think catalog is a better description of the intention of that functionality.

NOTE: The same argument could be applied to single-element (but likewise indirect) injections too.

... And yet another proposal!

A catalog could be declared as a simple Python module with some well-known exported, non-dunded properties. As of now, from their NestJS and Angular concept counterparts, the only property we should care about is exports and imports.

exports is mandatory, and would specify the injectables that will be considered when no injection is provided by the other providers. Injectable type could be deduced by how the element is decorated (e.g. @implements, inheriting Service; possibly with some helper to determine which kind of element it is.

imports would be used for when a library needs to include the injectables of another from inside this catalog mechanism. It would incur some degree of indirection, but nothing that could be e.g. easily followed in any IDE by simple CTRL+click, etc. Also, imports is entirely optional, and would just add the imported catalogs' exports (transitively) into the set of possible injections. (I guess this would be slower than the current mechanism, but not by much, as long as the chain is not too long, and the candidate set is not too big; perhaps some of the work can be optimized by pre-calculating the entire candidate set on Cython?)

This would allow Antidote to keep the design philosophy of "explicit is better that implicit," because it would be clear for the user (from both library and app perspective) from where the injection candidates are coming from.

Expanding on the previous example (supposing we refactor it appropriately):

Code snippet: On module "my_logging_remote.catalog"

from .impl import remote_log_sink_inject, CloudwatchLogSink, LogstashLogSink

exports = (remote_log_sink_inject, CloudwatchLogSink, LogstashLogSink)

Code snippet: On module "my_logging_os.catalog"

from .impl import SyslogLogSink, os_log_sink_inject

exports = (SyslogLogSink, os_log_sink_inject)

Code snippet: On module "my_logging_fd.catalog"

from .impl import ConsoleLogSink, RotatingFileLogSink, fd_log_sink_inject

exports = (ConsoleLogSink, RotatingFileLogSink, fd_log_sink_inject)

Code snippet: On module "my_logging.catalog"

from .impl import DefaultLogger, default_logger

exports = (
    DefaultLogger,
    default_logger,
)

Code snippet: On the final app (default)

It stays the very same way.

from antidote import inject
from my_logging import ProvidedLogger

@inject
def main(logger: ProvidedLogger):
    # ... do some work
    logger.log('done!')  # would log to many different sinks (or less)

main()

Code snippet: On the final app (with custom log sink providing)

from antidote import inject
from my_logging_fd import catalog as fd_catalog
from my_logging_os import catalog as os_catalog
from my_logging import catalog, ProvidedLogger

@inject(catalogs=(catalog, fd_catalog, os_catalog)
def main(logger: ProvidedLogger):
    # ... do some work
    logger.log('done!')  # would log to many different sinks (or less)

main()

Note that this is all mostly about indirect providing.

By this design, perhaps a new CatalogProvider could be implemented, to cover the case when a provider is not immediately offered by the other providers. This would keep compatibility with the current injection mechanism, and offer an alternative for when a greater degree of control of indirection is needed. Libraries can have as many catalogs as needed, or offer some catalog factory to customize provisioning (much in the same way the very common Module.forRoot pattern in Angular works).

Note that this is on top of the qualifiers for which this issue is for. Catalogs would be there to provide entire sets of injectables, and in some way filter them, at the container level (i.e. coarser control). Qualifiers would be used to slice and/or filter injection at the injection point level (i.e. more granular control). Also, contrary to modules in NestJS or Angular, there would be no isolation between catalogs, in the sense of internal vs external injectables (i.e. no "privates"), as I guess it would be quite hard to implement something like that and it would also not bring as much benefit.

One disadvantage of this approach is that I don't see an obvious way to alter world to make it also work with catalogs when injecting:

Could the concept of sub-container be introduced (i.e. like clone(), but using parent setup to look up injections, hence propagating the catalog set downstream)?
Or should world.get() also accept catalogs, and each injection would be distinct from each other in terms of lookup?
- And in that case, what if I have to programmatically inject something? (NestJS solved this with injections of so-called ModuleRefs, which in this proposal would be something more akin to ContainerRefs, and return a customized "world")

Those questions prevented me from trying to implement a new Provider. I needed to check the feasibility of such an idea, hence why I'm walltexting (I'm sorry) :x

Follows the rest of the discussion history regarding this functionality:

From @flisboac

(...) I'd also add the possibility for a dependency to be resolved through the ModuleRef, with a flag to enable searching the dependency outside the catalog (ie. app and world-level) -- which is inspired by the same functionality in NestJS (in NestJS parlance, that is strict: false), and that I did use in more than one occasion.

The advantage of this approach is that:

Dependencies local to the catalog always have priority.
It makes users less dependent on interfacing with world when searching for a dependency dynamically.
It makes it possible to do an "app-level" dependency search, in case a specific set of catalogs are selected by your app's entrypoint.

In my initial proposal, the idea was for world to have "globals" (global-level dependencies), and nothing changes in regards to how it's implemented today. Whereas catalogs would implement "scoped" dependencies, much in the same way how modules in Python work (ie. you need to import dependencies to use them, and imports are local to the importing catalog; with optional caching, perhaps).

Is it necessary to be able to declare catalogs at @inject-level though ? The catalog idea for me seems to be global definition. You declare, within your current container/catalog, on which sub-catalog/container you want to rely on.

In regards to the @inject decorator, the catalog selection would only apply to the entrypoint (ie. your main function). I don't think injection points in constructor/function parameters need this at all.

This would lessen the amount of dependencies the DI mechanism need to register and traverse, and it makes the whole mechanism a bit more scalable. Tests could be smaller and focused on specific catalogs, for example. Or the app may offer multiple entrypoints, each of them with their own catalog selection.

Otherwise, the only thing you would need, at the app-level, is to import the catalog, and then antidote would automagically include it, which is not very explicit and would raise some warnings (e.g. flake8 complaining the import is not used). Also, all entrypoints would see all catalogs as well, which defeats the purpose (as far as I understand it).

But I admit this could be an immense change to the codebase, in which case opting for a static config approach would be the only way. I also don't know how it'd play out with the Providers mechanism (do we extend the Provider API to allow for the catalog implementation? How to integrate existing providers with this supposed new means of resolution?). But again, this is just an idea; if it's not feasible, I will understand, and it's not a problem at all, antidote is already plenty helpful as it is!

From @Finistere

In my initial proposal, the idea was for world to have "globals" (global-level dependencies), and nothing changes in regards to how it's implemented today. Whereas catalogs would implement "scoped" dependencies, much in the same way how modules in Python work (ie. you need to import dependencies to use them, and imports are local to the importing catalog; with optional caching, perhaps).

I'm not entirely sure to understand your vision. For me, off the top of my head, I would do something like this:
# essentially world
class Container:
    catalogs: list[Catalog]

    def get(self, dependency: object) -> object:
        # provider logic
        # ...
        
        for catalog in self.catalogs:
            try:
                return catalog.get(dependency)
            except DependencyNotFoundError:
                pass

        raise DependencyNotFoundError(dependency)
    
# API

catalog = world.new_catalog()
with catalog:  # replaces world within this context, similar to world.test.new()
    # declare dependencies, eventually through their import
    ...

# add to a mother catalog
mother_catalog.add_catalog(catalog)
# or to world directly
world.add_catalog(catalog)
I also don't know how it'd play out with the Providers mechanism (do we extend the Provider API to allow for the catalog implementation? How to integrate existing providers with this supposed new means of resolution?). But again, this is just an idea; if it's not feasible, I will understand, and it's not a problem at all, antidote is already plenty helpful as it is!

It would be orthogonal to the provider mechanism IMO. Providers define how dependencies are constructed and how they should be cached (scope). A Catalog would have its own Providers. I don't know yet if the global Container should be a Catalog itself or not, but they're definitely similar.

with a flag to enable searching the dependency outside the catalog (ie. app and world-level)

I'm unsure how to make this for now. Getting them through world would work, but specifying at @inject time that you actually want to rely on the world is a bit more tricky. I suppose I could add inject.from_world.me and the FromWorld annotation. FromWorld could also work in @inject(dependencies={'x': FromWorld(...)}).

Otherwise I really like the approach of local-first, global if requested!

From @flisboac

I'm not entirely sure to understand your vision. For me, off the top of my head, I would do something like this:
Ok, that's a perfectly fair API.

I was considering the @inject use case, for when you configure a catalog per app entrypoint (ie. per "main" function), as you would want to have as many as needed. Updating my samples from before:

from antidote import inject, world
from my_logging_fd import catalog as fd_catalog
from my_logging_os import catalog as os_catalog
from my_logging import catalog, ProvidedLogger

aws_ecs_catalog = world.new_catalog()
aws_ecs_catalog.add_catalog(catalog, fd_catalog, os_catalog)

aws_lambda_catalog = world.new_catalog()
aws_lambda_catalog.add_catalog(catalog, fd_catalog, os_catalog)  # could be different

@inject.from_(aws_lambda_catalog)  # could also be `@inject(catalog=aws_lambda_catalog )` ?
def aws_lambda_main(logger: ProvidedLogger): 
    """Specialized for running as a AWS Lambda."""
    # ProvidedLogger comes from main_catalog, or transitively from its (catalog) dependencies.
    # Any dependency neither declared in those catalogs nor in world are not even visible here.
    # ...
    # do some work
    logger.log('done!')  # would log to many different sinks (or less)

@inject.from_(aws_ecs_catalog)
def aws_ecs_main(logger: ProvidedLogger):
    """Specialized for running as a container, eg. ECS/Fargate, etc."""
    # same here
    # ...
    # do some work
    logger.log('done!')  # would log to many different sinks (or less)

# I could have as many "main"s as needed.

# Each should see only the dependencies on the catalog it depends on.
# Dependencies of dependencies would use the catalog's resolution mechanisms.
# ie. an injected service would have its dependencies resolved from the
# scope of the main's catalog, not world; no explicit catalog reference should
# be needed to be specified at the injection point. Antidote knows which catalog
# is the "current", and will use it for resolutions.

# Of course, this is just an alternative to the context manager form,
# for when the needed dependencies are known or can be declared as
# parameter type annotations.

# The sequence format I initially proposed, ie. `catalogs=(...)`
# would do all that catalog creation for you automatically, if all
# you would do is call `add_catalog` (which I think will be a common
# use case for catalogs).
# this catalog would exist only for, or at least for, the duration of the
# function's execution.
# Example:

@inject.from_([catalog, fd_catalog, os_catalog])  # could also be `@inject(catalog=aws_lambda_catalog )` ?
def aws_lambda_main(logger: ProvidedLogger): 
    """Specialized for running as a AWS Lambda."""
    # ...
    # do some work
    logger.log('done!')  # would log to many different sinks (or less)

# All of this would allow me to share entrypoints in a library as well, for example.

A Catalog would have its own Providers.

Perfect! That's how I initially imagined them to work. More specifically, my worries were about duplicating efforts when splitting between world and catalogs' provider implementation. I understood initially that there would be distinct implementations for both, but looking at Antidote's code, I understand better what you mean.

What about this, for example, for the indirect provider (src/antidote/_providers/indirect.py)?

@API.private
class IndirectProvider(Provider[Hashable]):

    # ...

    def maybe_transitive_provide(self, dependency: Hashable, container: Container
                      ) -> Optional[DependencyValue]:
        """Searches for dependencies in imported catalogs"""

    # Is this a sufficient implementation?
    def maybe_provide(self, dependency: Hashable, container: Container
                      ) -> Optional[DependencyValue]:
        if not isinstance(dependency, ImplementationDependency):
            return self.maybe_transitive_provide(dependency, container)  # <-- here

        try:
            target = self.__implementations[dependency]
        except KeyError:
            return self.maybe_transitive_provide(dependency, container)  # <-- here

        if target is not None:
            return container.provide(target)
        else:
            # Mypy treats linker as a method
            target = dependency.implementation()
            if dependency.permanent:
                self.__implementations[dependency] = target
            value = container.provide(target)
            return DependencyValue(
                value.unwrapped,
                scope=value.scope if dependency.permanent else None
            )

    # etc...

@API.private
class ImplementationDependency(FinalImmutable):
    __slots__ = ('interface', 'implementation', 'permanent', '__hash')
    interface: type
    implementation: Callable[[], Hashable]
    permanent: bool

    # If None, comes from world; if not None, comes from some catalog.
    # Either that, or `catalog: Catalog`, and world will be a catalog.
    # Should be explicit in case the dependency comes from somewhere else.
    # And, of course, this is only relevant if caching transitive dependencies
    # locally is deemed important. Otherwise, this is not necessary.
    catalog: Optional[Catalog]
   
    __hash: int

    def __init__(self,
                 interface: Hashable,
                 implementation: Callable[[], Hashable],
                 permanent: bool,
                 catalog: Optional[Catalog] = None):
        super().__init__(interface,
                         implementation,
                         permanent,
                         hash((interface, implementation, catalog)))  # <-- here

    # ...

    def __eq__(self, other: object) -> bool:
        return (isinstance(other, ImplementationDependency)
                and self.catalog is other.catalog  # <-- here
                and self.__hash == other.__hash
                and (self.interface is other.interface
                     or self.interface == other.interface)
                and (self.implementation is other.implementation  # type: ignore
                     or self.implementation == other.implementation)  # type: ignore
                )  # noqa

In any case, each catalog has their own provider instances, as you said. The user can add more providers, if needed (or not; I'm not entirely sure if it's a good idea, though).

From @Finistere

I was considering the @Inject use case, for when you configure a catalog per app entrypoint (ie. per "main" function), as you would want to have as many as needed. Updating my samples from before:

I'm not convinced by your example. With aws_lambda_main and aws_ecs_main, you're distinguishing the code path between lambda and ecs, so hiding this specificity behind a catalog & dependency injection doesn't make sense. IMHO either they are the same function and it's the caller that defines the catalog or the catalog selection is inside their body like:
def aws_lambda_main(): 
    """Specialized for running as a AWS Lambda."""
    with aws_lambda_catalog:
        common()

def aws_ecs_main():
    """Specialized for running as a container, eg. ECS/Fargate, etc."""
    with aws_ecs_catalog:
        common()
    
    
@inject
def common(logger: ProvidedLogger = inject.me()):
    logger.log('done!')  # would log to many different sinks (or less)
But I do see the need to change, at runtime, the catalogs used by world. If I understand your need correctly, it's the possibility to override within a certain context the dependencies exposed by world.

From @flisboac

The context manager syntax makes a lot of sense, and I'm pleased by the API you proposed. The @inject support would be a shortcut (an extra, or syntax sugar) to the user. It's not necessary at all, it's just to avoid the need for writing two functions for each entrypoint, or for when programmatic dependency resolution is not going to be used. It would allow for each entrypoint to be injected with the dependencies they need right away, at annotation level. Also, parameters that are not injected could still be passed along, something that may be useful for standalone functions (like the entrypoints).

The same machinery used to implement the context manager idiom could be used to implement the decorator-based one (unless there's some problem with it, which I'm not seeing).

Regarding my example, the implementations of aws_lambda_main and aws_ecs_main are not the same. Sorry for not making that obvious at first. They have different needs, do different tasks, or inject completely different dependencies. Therefore, their code is what is going to differ, and is what matters, not their injections. Them having the same dependencies is only coincidental. I just didn't bother to put different implementations for them at the time. :(

Folows an updated example:

# The lambda parameters will be passed by AWS Lambda's framework.
# event and context are just just JSON objects.
# With a code this simple, it become very easy to sell the use of a DI library,
# proper IoC and testability.
@inject(source=aws_lambda_catalog)  # copying your proposed new factory syntax here
def aws_lambda_main(
  event,
  context,
  *,
  s3: S3Client= inject.me(),
  sns: SnsClient = inject.me(),
  logger: ProvidedLogger = inject.me(),
) -> None:
  # do some work
  # ie. process incoming event, publish sns message, etc
  # ...
  s3.put_object(event, ...)
  sns.notify({ "type": "...", "message": "work done" })
  logger.info("event processing finished successfully)

# Now, suppose we're going to provide an API as a Fargate container.
# aws_ecs_main is a long-running process, and receives no
# parameters. The aws_ecs_main is called from a script, or from some
# executable module (e.g. `python -m my_lib.main.aws_ecs`)
@inject(source=aws_ecs_catalog)
def aws_ecs_main(
  *,
  cli_args_parser: CliArgsParser = inject.me(),
  api_config: ApiConfig = inject.me(),  # Gets some env-var configs, etc. Is a Constants class
  config_provider: ConfigProvider = inject.me(),
  api_service: ApiService = inject.me(),
) -> None:
  # Also consider that different entrypoints may parse the CLI
  # through different CLI parsers, or may parse the CLI arguments
  # themselves.
  cli_args = cli_args_parser.parse_args()

  config = config_provider.merge_configs(cli_args, api_config)
  api_service.serve(config)

# They both do different things. There may be some overlap in dependencies
# or implementation on them, but that's not what I was arguing for.

As a library author, I'll provide catalogs that can be used by both end-users (lambda and ECS). Catalog selection is primarily the responsibility of the end user (or, of a library provider, transitively).

Another advantage of @inject-decorated functions that they can be called as-is, for simpler use-cases. Now, how this is going to play out when they're called in the context of some other catalog is something I'm not sure myself how to deal with, and may be the reason for some opposition to this idea. When I first wrote the proposal, I was thinking about catalogs overriding whatever was in their context, much in the same way it is done today with world, but without a clone. For example:

# In normal code:

# A new catalog, with some overrides
specialized_aws_ecs_catalog = world.new_catalog()
specialized_aws_ecs_catalog.singleton(...)
# ...

def specialized_aws_ecs():
  with specialized_aws_ecs_catalog:
    # I don't need to specify aws_ecs_catalog here, because
    # the entrypoint would have imported it already.
    # The context manager override would be an exception,
    # not a rule.
    aws_ecs()

#
# Or, in tests:

def test_aws_lambda_main():
  # This context manager would take precedence when providing dependencies.
  with world.test.new_catalog() as test_catalog:
    test_catalog.override.singleton(S3Client, MockedS3Client())  # Or simply `test_catalog.singleton(...)`?
    # ...
    # Would use the MockedS3Client first, and respect
    # dependencies coming from the original aws_lambda_catalog
    # if not present in test_catalog
    aws_lambda_main(test_event, test_ctx)

But your suggested use is perhaps clearer in intent, and better from an Antidote-implementation point of view. Rewriting the previous example, I would have the following:

def aws_lambda_main(
  event,
  context,
) -> None:
  with aws_lambda_catalog:
    aws_lambda_main_impl(event, context)

@inject
def aws_lambda_main_impl(
  event,
  context,
  *,
  s3: S3Client= inject.me(),
  sns: SnsClient = inject.me(),
  logger: ProvidedLogger = inject.me(),
) -> None:
  # do some work
  # ie. process incoming event, publish sns message, etc
  # ...
  s3.put_object(event, ...)
  sns.notify({ "type": "...", "message": "work done" })
  logger.info("event processing finished successfully)

(...)

As a side note...

My specific needs, as of now, are for relatively short executables, parameterized by a combination of CLI arguments, environment variables and configuration values coming from all sorts of external places/services. Each project (a repository; one or more for each team and product) will have an assortment of job implementations of varied types, written as Python functions, executable modules or normal Python scripts. There will be a lot of projects, and a lot of jobs for each project (ie. it's numerous in both aspects). Each job type parses, fetches and merges parameters (CLI, env-vars, etc) in different ways, but some common properties among them are somewhat guaranteed. Each job will execute in a specific environment, or will interface with a specific systems, so a considerable number of services (service classes, etc) will be offered in the form of an internal framework, to standardize those services and some aspects of job implementations.

Catalogs would allow me to extend a baseline with the necessary implementations for each job type (not each job!), without enforcing the entire framework upon all of them (ie. not putting everything in world). Each job would pick the catalogs they need, so that the amount of scanned dependencies could be reduced, and dependencies could be more easily traceable. It's still up to the job implementation to orchestrate how its injected services are used, or implement details not covered entirely by a service (ie. things specific to that job).

I can see most of the smaller or simpler jobs using the short (@inject) form, because some use cases can be abstracted away quite significantly. This is what we were focusing on (ie. simplifying and reducing the amount of code we need to write). But the bigger ones may be complex enough to warrant more code. In both cases, the use of a DI library would do wonders.

A catalog could hold other catalogs in which it'll search for dependencies and use its providers as last resort.

Well, that's the inverse of what I was thinking of, in terms of dependency resolution order.

My catalog proposal was based on my experiences with NestJS modules. I also used it as a base model. Please take a look at this diagram, from their documentation:

My initial idea was for dependencies to be looked up catalog-first. If strict: True, only the catalog and global-level (in our case, world) is looked up. If dependency is not available locally, and strict: False, it would be looked up at application-level (which in our case, would be the application entrypoint's catalog) or at global level (in our case, world), but only if strict: false`. Overrides could then be implemented in terms of replacing the entrypoint's catalog with another, wich in turn imports the initial catalog. World could also be a last-level catalog, added by the resolution algorithm automatically.

With providers last, dependencies would always come from world first, if catalogs would import world in some fashion. Perhaps that's why you favor world dependencies to be explicit, but I don't think it's a good idea to force specific catalogs, not even world, because it would become unwieldly rather fast. Provider-last also means that the context-manager will be the only way for dependency overrides to happen -- which I can deal with, but it's a bit limiting.

In general, apart from factories, I don't think injection points (e.g. a type-annotated parameter or property) should bother with where the dependency comes from, at all. The only point at which it is relevant to specify a catalog, in my opinion, is on your application's entrypoint, or on catalog dependencies. The initial catalog selection is obviously important, but in any other point, neither library nor application authors should bother with where an injected parameter comes from.

Up to now, for factories, that degree of specificity was necessary so that importing the injected (implementation) class could be guaranteed. Now, the catalog is doing pretty much the same (guaranteeing that all the classes it offers are loaded), but in a well defined "DI package" that's explicitly selected by the user. That's why I don't think we need to be this specific.

In my understanding, a world dependency is there just so that it is available to all catalogs, regardless of whether they import each other or not. Such a feature should be used judiciously, though, e.g. a framework based on Antidote can provide some DI-managed registry service, but library authors using said framework should focus on providing their dependencies through catalogs.

Even when disconsidering world, if dependencies come from imported catalogs first, a bit of the benefits of having a catalog "scope" is lost, as local dependencies will be ignored in favor of external ones. For long chain of imports, the more front-facing catalogs lose a bit of relevance as dependency providers themselves, and would be mostly relegated to be simple catalog importers.

Also, by establishing catalogs and its inter-dependencies, we may have multiple injection candidates. The idea was to use catalogs not only to group those dependencies, but to also provide specialized ones. This is especially true for indirect providing (ie. when what's being requested is some ABC/interface). If imported catalogs have the preference, I will need to tell explicitly from which catalog each dependency comes from, either with context managers or some other mechanism, when said specializations are to be preferred. It's a level of specificity that I didn't want to enforce users with. It also makes it harder for library authors to provide said specializations, for the same reason. Qualifiers can alleviate this problem, but because of the resolution order, dependencies nearer the entrypoint catalog will have the least preference by default, which can be unintuitive.

From @flisboac

A catalog could hold other catalogs in which it'll search for dependencies and use its providers as last resort.

Regarding this again, I spent some time thinking about this resolution order specifically, and on second thought, it may make more sense for provider-last to be the default.

It guarantees a more deterministic resolution, because it will always go for the first provided dependency in the catalog hierarchy, much in the same way as class/module loaders are implemented in most languages. Once the dependency is found, that very same dependency is used again on later resolutions, something that won't be guaranteed if providers are checked first. Now, depending on the size of the catalog hierarchy, I'm not sure if it'll be easy for users to find where the dependency comes from, but from a "dependency loader" point of view, this makes more sense.

So, now I think your approach is the right way to go. But how will overriding work in this case (e.g. during tests)?

From @Finistere

Regarding this again, I spent some time thinking about this resolution order specifically, and on second thought, it may make more sense for provider-last to be the default.

To be honest, I don't have any strong opinions on this matter. In my previous comments, I only tried to have an understanding of what you would like and think about how it would impact the current code & API. I wrote it this way because it seemed natural with the idea of having a catalog overriding its parent dependencies. The providers would contain the dependencies of a specific catalog after all as they do for the Container currently. So having catalogs first means children override the parent catalog.

(...)

Intuitively what's important for me is for the API to be as clear as possible. It should be straightforward to understand which catalog is used. It'd be possible to do runtime inspection with world.debug(), but one shouldn't have to use this. It might be better to not provide context managers syntax at all. And instead only add catalog keyword to @inject, @factory, @service, etc... Anyone could then wrap those decorators with their own if you're using specific catalogs at several places. And adding the possibility to register a catalog to world: allowing libraries to expose specific dependencies and application entry points to add globally the catalogs they need.

In tests, it's fine to have context managers like world.test.clone() because they're usually short-lived or with a clearly defined context (function, class, module, global). But for application code, outside of the entry point, it doesn't sound like a good idea at first sight. And I want to avoid anything that could encourage poor code design. As said though, those are only thoughts nothing more yet. Coming back to you on this soon. :)

From @Finistere

But how will overriding work in this case (e.g. during tests)?

For sure, one needs to be able to override dependencies of world or any specific catalog. This shouldn't be hard. Currently, Antidote completely replaces the global container in tests with one that supports overrides. And like catalogs are before providers in my previous example, overrides are executed before any provider. A similar logic would be applied. Catalogs would probably have a "test"-mode in which they would support overrides.

From @flisboac

To me, intuitively child catalogs would be first but parent catalogs would be after providers

Just to be sure we're on the same page regarding the terminology... The only mechanism by which you could be able to establish some kind of catalog hierarchy is via importing, right? In this case, the dependent catalog (child) imports its depending catalog (parent). So, in this case, the dependent catalog (child) would provide first, and only then would it provide from parents.

(You could also invert that logic, but well... Which will it be?)

For sure, one needs to be able to override dependencies of world or any specific catalog. This shouldn't be hard.

That's reassuring. I was a bit unsure as to how this could be implemented.

And instead only add catalog keyword to @Inject, @factory, @service, etc...

Well, I'd add it only to @inject, because that could denote an application entrypoint, for the reasons I exposed in my previous comment.

@factory and @service are just injectables (dependencies), and they will be part of a catalog, not require one. Only catalogs import catalogs. Consider a catalog like you would a Python module, and it'll make sense. Defining the catalog in an @inject is only valid because the decorated element is outside the dependency injection mechanism (i.e. is not an injectable; it only receives injections).

Just as an example, from NestJS's documentation, this is how you create a module:

import { Module } from '@nestjs/common';
import { CatsController } from './cats.controller';
import { CatsService } from './cats.service';

@Module({
  controllers: [CatsController],
  providers: [CatsService],
})
export class CatsModule {}

Note the decoration there. We could even follow the same idea, and type-validate the new module via some Protocol type (instead of e.g. forcing CatsModule to inherit a Catalog class).

In NestJS, providers can be classes, as long as it is decorated with @Injectable.

Now, your application must run in terms of a "root module". Any module can be a root module. For example:

import { Module } from '@nestjs/common';
import { CatsModule } from './cats/cats.module';

@Module({
  imports: [CatsModule],
})
export class AppModule {}

Note the imports there. In this case, AppModule depends on and imports CatsModule. That's what I'm suggesting we follow as well. (You can even export imported modules, which is super useful to create modules that aggregates functionalities from multiple modules!)

To execute the app, you need to create an "entrypoint": it's either a script, or a function, etc, which instantiates a NestApplication from an initial module:

import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  await app.listen(3000);
}
bootstrap();

The app.listen is only relevant if you have controllers in your app, which are REST endpoints NestJS automatically exposes via an internally managed express web server. In this case, CatsController would be served locally, at localhost:3000.

For Antidote, the approach should instead be more similar to standalone apps, because Antidote is not a web framework. That means some component of your app should be delegated as the entrypoint; it would then fetch and execute those managed services. For example:

import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import { SomeTask } from './some/nested/module/some-task.service'

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  const someTask= app.get(SomeTask);
  someTask.run();
}
bootstrap();

bootstrap is the entrypoint here. SomeTask is an injectable class from some module. It's either provided or imported by the "root" (app) module. (It may be hard to find from where SomeTask is being provided, but I think that's for the better, because in most cases this won't be relevant. You should depend on the interface, not the implementation (e.g. in Python terms, SomeTask could be just an ABC). I think just following the trail of module imports (or, in Antidote's case, world.debug) is a good compromise for allowing IoC and improving discoverability.)

What I suggested for @inject was some automation of this "entrypoint" logic. The entrypoint would not be injectable, but it could receive injections. Instead of doing gets, you could just decorate some function's parameters with the types you want to inject, and @inject would do the rest, provided you parameterize it with the catalog you want as a source/root.

Finistere · 2022-04-21T12:38:08Z

Finistere
Apr 21, 2022
Maintainer

Thanks for creating the AEP. :D

Here's my understanding of the Catalog idea :

A catalog is a collection of dependencies, eventually separating public and private ones.

On injection, @inject, the catalog to use, from hereon named application catalog, is:

the one specified explicitly as an argument, if any.

@inject(catalog=X)
def f():  # X catalog is used for any dependencies of f()
    ...

the one defined by an injected parent function in the stack:

@inject(catalog=X)
def f():  # X catalog is used for any dependencies of f()
    g()

def g():
    h()

@inject
def h():  # catalog is not specified explicitly, as such we're using the one specified by f().
    ...

There is only one application catalog, as such neither g() nor h() can redefine the application catalog. An injected function declaring a catalog would be an application entrypoint, hence why re-declaring the application catalog would have little sense at first sight.
If no application catalog is specified, the global one, world, is used.

Dependencies declared inside a catalog make also use of @inject. Those should use the catalog's dependencies. If specified explicitly, @inject could escape the catalog and search in the application catalog.

@interface(catalog=catalog)
class Dummy:
    pass

@service(catalog=catalog)
class StrictService:
    # Dummy is only searched for inside `catalog`.
    def __init__(self, dummy: Dummy = inject.me()) -> None:
        pass

@service(catalog=catalog)
class EscapeService:
    # Dummy searched from the application catalog.
    def __init__(self, dummy: Dummy = inject.me(strict=False)) -> None:
        pass

When using strict=False, the current catalog is not prioritized, we're passing through the application catalog and using its resolution order.
A catalog can be "imported" inside another catalog. The former is a child catalog and the latter the parent catalog. Dependencies declared by the parent catalog take priority over the children one. This allows a parent catalog to override any logic defined in a children catalog when applicable.
The application catalog would actually be the only possible child of the world catalog. As such, global dependencies are always available and override any catalog.
In tests, it must be possible to override dependencies from any specific catalog. So something like:
```
with catalog.test.clone():
    ...
```

Rough implementation :

Currently, Antidote defines a global Container which acts as world. Container would become a Catalog and world the root catalog. Each Catalog would have its own Providers and child Catalogs. Provider would be used before child Catalog during dependency resolution. Catalog would only cache dependencies which their own Provider provided. It shouldn't impact performance as the Cython version should be able to aggressively cache dependencies, remembering which Catalog/Provider was the source.

Uncertainties :

How should dependencies be declared to be part of a catalog? I see two possibilities:
1. Specifying explicitly the catalog: @service(catalog=catalog)
2. Relying a context manager:
```
catalog = ...
with catalog:
    @service
    class Dummy:  # declared inside catalog
        pass
```
At first sight, the first is more explicit.

Let me know if I missed/misunderstood something @flisboac!

0 replies

flisboac · 2022-07-31T22:34:53Z

flisboac
Jul 31, 2022
Author

I must apologize in advance for not commenting earlier, but yes, that's pretty much what I had in mind.

I only want to add one thing, which may very well be a nitpick, or not be justifiable as a functionality. Please do tell me what you think (because I'm on the fence about it, in a way).

What about declaring a provider programmatically, ie.:

# Just an example
from antidote import create_catalog, injectable, Catalog

class SomeService:
  ...

catalog = create_catalog()
catalog.provide(injectable.of(SomeService))

# Or even:
def build_catalog(config_file: str, *, enable_some_service: bool = False) -> Catalog:
  catalog = create_catalog()
  # ...
  if enable_some_service:
    catalog.provide(injectable.of(SomeService))
  return catalog

I suggest this because:

It would be less strict, as SomeService would not be bound by a specific catalog, and therefore could be provided in multiple catalogs if needed. Which instance of it is injected would depend solely on the catalog import hierarchy the user builds up.
The build_catalog idea is based on NestJS's dynamic imports. It's a way to customize catalog generation. (parameterized catalogs is a good name for it, although the catalog itself is not parameterized, but in the context of the technique, it kinda is).

Specifically for (2), I only suggest it because in the past I found it very useful to re-provide dependencies with some small twists, like specifying a different configuration location, or other assorted parameters for catalog/module creation. This was especially handy when you had no option of changing some upstream library (e.g. the class has no injectable configuration, but you must inject that very same class, whilst trying not to break compatibility).

2 replies

flisboac Jul 31, 2022
Author

I have to say, though, that the enable_some_service thing could be implemented in terms of qualifiers, in some cases.

flisboac Jul 31, 2022
Author

Btw, @Finistere, should I put this post in the PR instead?

Finistere · 2022-08-01T08:42:24Z

Finistere
Aug 1, 2022
Maintainer

Btw, @Finistere, should I put this post in the PR instead?

No, it's fine, not everything needs to be in the PR. The V2 has a lot of things in it. ;)

What about declaring a provider programmatically, ie.:

I'm not really in favor of it. I see two issues with it:

Method wiring: @injectable doesn't create a new class, it modifies the existing one by wrapping the methods if there are any injections to be done. So applying multiple times @injectable on a class for different Catalog is not really possible as it is, unless you prevent any wiring with wiring=None for all of them. The __init__() function would be wired for the first @injectable and that's it otherwise.
It hides the definition of the class. Now when you look for SomeService to understand how it behaves you won't know directly. Is it a singleton? What's the wiring? Is it using a factory_method or just plain __init__() ?

Specifically for (2), I only suggest it because in the past I found it very useful to re-provide dependencies with some small twists, like specifying a different configuration location, or other assorted parameters for catalog/module creation. This was especially handy when you had no option of changing some upstream library (e.g. the class has no injectable configuration, but you must inject that very same class, whilst trying not to break compatibility).

To me, it seems like the dependency you had to change should have been defined with the equivalent of @interface / @overridable (the latter also defines the wrapped class as the default implementation). If it wasn't, I suppose there probably was a design issue somewhere. Being able to define the same class in different ways through @injectable isn't what I would naturally expect, it's pretty much @interface goal to me.

1 reply

flisboac Aug 1, 2022
Author

Totally understandable. In retrospect it was an unreasonable request. There're a lot of ways one could achieve the same objectives with Antidote.

Finistere · 2022-08-01T14:52:41Z

Finistere
Aug 1, 2022
Maintainer

Well, now I realize that you cannot override dependencies even in V2. Every mechanism for dependency registration checks for duplicates in the other providers and child catalogs. Outside of @interface/@overridable which is designed for it, IMHO it seems to just hurt maintainability to allow it. And adding an implementation is always done in the catalog in which the interface is registered, so it's not really overriding anything on the catalog level.

Without overrides, specifying @inject(catalog=app_catalog) isn't really useful except for weird cases and for people extending Antidote with their own provider logic. That's probably how it going to be released for now as I don't see any good APIs for now. Overriding constant defined with const could be one use-case though.

1 reply

flisboac Aug 1, 2022
Author

And adding an implementation is always done in the catalog in which the interface is registered

I thought we had the option to provide an implementation outside the catalog too. That would be useful in some scenarios. But it's fine, perhaps that can come in the future, if you think it's worth it. (Or maybe I can introduce a PR, once I become more familiarized with the V2 codebase).

Without overrides, specifying @Inject(catalog=app_catalog) isn't really useful

I agree. But we can always improve later on. The API is plenty useful as it is now.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AEP1 - Provider Catalogs #45

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 4 comments 4 replies

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

AEP1 - Provider Catalogs #45

flisboac Apr 21, 2022

About multi injection

... And yet another proposal!

Replies: 4 comments · 4 replies

Finistere Apr 21, 2022 Maintainer

flisboac Jul 31, 2022 Author

flisboac Jul 31, 2022 Author

flisboac Jul 31, 2022 Author

Finistere Aug 1, 2022 Maintainer

flisboac Aug 1, 2022 Author

Finistere Aug 1, 2022 Maintainer

flisboac Aug 1, 2022 Author

flisboac
Apr 21, 2022

Replies: 4 comments 4 replies

Finistere
Apr 21, 2022
Maintainer

flisboac
Jul 31, 2022
Author

flisboac Jul 31, 2022
Author

flisboac Jul 31, 2022
Author

Finistere
Aug 1, 2022
Maintainer

flisboac Aug 1, 2022
Author

Finistere
Aug 1, 2022
Maintainer

flisboac Aug 1, 2022
Author