Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework tapir integration according to the new API #82

Open
Tracked by #662
catostrophe opened this issue Nov 8, 2021 · 8 comments
Open
Tracked by #662

Rework tapir integration according to the new API #82

catostrophe opened this issue Nov 8, 2021 · 8 comments
Assignees

Comments

@catostrophe
Copy link
Member

With the release of 0.19.0-M14 the ServerEndpoint has changed significantly.

Now it encapsulates a separate piece of security logic in form of A => F[Either[E, U]] which breaks the current approach to trace injection.

@catostrophe catostrophe self-assigned this Nov 8, 2021
@catostrophe catostrophe linked a pull request Nov 8, 2021 that will close this issue
@fiadliel
Copy link
Contributor

I'm not that familiar with the code, but I did a draft PR here: #89
As a draft, it doesn't yet attempt to engage with how to trace authenticated endpoints (you probably have better ideas than me on how that should be done), only so-called public endpoints.

@catostrophe catostrophe linked a pull request Nov 17, 2021 that will close this issue
@catostrophe
Copy link
Member Author

@fiadliel let's continue the discussion here

@fiadliel
Copy link
Contributor

I've been wondering what one actually wants here from tapir (it'd be a shame if something was implemented that wasn't what you want).

One approach I'm thinking of is that the part that calls both security and functional logic is sttp.tapir.server.interpreter.ServerInterpreter. If we had the ability to chain calls to ServerRequest => List[ServerEndpoint[R, F]] => List[ServerEndpoint[R, F]] (changing F on the way), would that be enough to implement? That would, of course, turn the top level of tracing into service-wide tracing instead of per-endpoint. But I don't think it's going to be feasible to do it per-endpoint in future, unless I (or the raw request) is added to the security signature.

This would mean that per-endpoint syntax would no longer be involved with extracting the span, just with creating a child span and decorating it with per-endpoint information.

One potential positive benefit is that tapir interceptors could become part of the trace.

Also if this change to ServerInterpreter can't be done generically, but per interpreter instance, then it gets in danger of duplicating behavior already done by other integrations, and it might make sense to leverage those integrations to lift a service into Trace[F] instead.

Thoughts?

@catostrophe
Copy link
Member Author

catostrophe commented Nov 30, 2021

The main idea of secured endpoints is that tapir doesn't parse inputs (the I parameter) until securityLogic is performed (needs only A for evaluation).

I'm working on a solution for secured endpoints and can't decide on how to act and what to demand from different kinds of inputs.

  • Alternative 1:

    1. require A to contain certain Headers a valid context Ctx can be extracted and built from.
    2. make the root span and context before securityLogic is performed, describe securityLogic in the traced effect G[_], and pass Ctx to it via a Provide[F, G, Ctx] instance.
    3. Embed Ctx into U, pass it to the main serverLogic and then make a child span from it and use it to run the contextual effect G[_] in which serverLogic is described. <-- this won't work! Do the same separately for serverLogic

    This requires the parsing of potentially insecure inputs before authorization, namely those needed for making a proper Ctx instance. We'll need to make all of them a part of the A type.

  • Alternative 2:

    1. Somehow skip tracing in securityLogic. Don't know how to implement and totally dislike this option.

@catostrophe
Copy link
Member Author

cc @fiadliel @janstenpickle
Need your help.

@catostrophe
Copy link
Member Author

N.B. the new server endpoint model is:

securityLogic: A => F[E | U]
serverLogic: U => I => F[E | O]

@fiadliel
Copy link
Contributor

There's a little nuance to "doesn't parse inputs (the I parameter) until securityLogic is performed", which is that inputs except the body are parsed, in https://github.com/softwaremill/tapir/blob/master/core/src/main/scala/sttp/tapir/server/interpreter/ServerInterpreter.scala#L59 - but it looks like a hairy part of the API to interact with. Is that a Vector[Any]?

I have a general issue with any solution where a single request ends up creating two root spans. There should be a 1:1 match between requests and spans (not considering child spans).

There seem to be a few options for creating that root span:

  1. create the span when we have no I value, just Request. Can create a span, but it has minimal endpoint context. However, it might be possible to add attributes to the span later on, when more data is available.
  2. create the span when we have no I value, just a vector of EndpointInput (excluding body). More structured than Request, but I don't know how useable this information is for tracing. But we might at least know which endpoint is being invoked, better than nothing. Also needs some way of injecting a Trace[F].span and then calling auth/service logic, tapir needs an API change for this hook.
  3. convince Adam that it's okay for I to be parsed before security logic is run.

The first option is not much different to using a server-specific integration (e.g. trace4cats + http4s), it uses zero value-add of what tapir provides. The second option might need experimentation, but it looks like the code could need a lot of refactoring to make it useable. Third option sounds like the hardest one.


It doesn't look like there are any easy and nice solutions within the tapir code.

Right now, personally, I'm using a combination of trace4cats-http4s, and a per-endpoint child span which can be annotated with endpoint name and input/output parameters. It works, but it's a bit noisy if you'd prefer a single span per request.

@fiadliel
Copy link
Contributor

fiadliel commented Dec 1, 2021

Some more thoughts:

tapir selects the endpoint that's used by whichever is the first to return a success. So firstly, we can't define the span name based on the endpoint (unless we can make the span name mutable). But that has always been an issue.

tryServerEndpoint returns F[RequestResult[B]], if it were to return F[(RequestResult[B], ServerEndpoint, securityInputs, regularInputs)] then everything can be backfilled afterwards, except for the span name.

Need a hook to create a span wrapping callInterceptors. Need callInterceptors and everything from there on to also return the selected endpoint, and the inputs. And a hook where the attributes can be derived from endpoint and inputs, before the span closes. Lots of hooks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants