OpenTelemetry support #921

Open
deki opened this issue May 3, 2021 · 14 comments

@deki

deki commented May 3, 2021

OpenTelemetry is an observability framework for cloud-native software and comes with a collection of tools, APIs, and SDKs. You can use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis in order to understand your software's performance and behavior.

The OpenTelemetry PHP library is in development and located here: https://github.com/open-telemetry/opentelemetry-php
This example shows how it can be used: https://github.com/open-telemetry/opentelemetry-php/blob/main/examples/AlwaysOnOTLPExample.php

While this is working with bref without any issues, it currently requires a lot of extra work to have the traces from the OTLPExporter in AWS X-Ray. AWS simplifies this for e.g. Java and Python by packaging OpenTelemetry together with an out-of-the-box configuration for AWS Lambda and AWS X-Ray in an easy to setup layer (see https://aws-otel.github.io/docs/getting-started/lambda).

Is it possible to combine this extension layer https://github.com/open-telemetry/opentelemetry-lambda/tree/main/collector with bref? Or how would you advise users to plug it together?

@mnapoli
Member

mnapoli commented May 3, 2021

Hi, I'm not familiar with this tool. How would you integrate that with Bref? More specifically:

  • what would users have to do to use it?
  • what would need to be changed in Bref to provide that to users?

@deki
Author

deki commented May 5, 2021

Thanks @mnapoli for the quick response.

  • what would users have to do to use it?
    Add the opentelemetry-php library via Composer, then configure the endpoint that trace data should be sent to. This can be done with the OTEL_EXPORTER_OTLP_ENDPOINT environment variable.
    Follow https://github.com/open-telemetry/opentelemetry-lambda/blob/main/collector/README.md to publish the OpenTelemetry Collector Lambda layer to the AWS account, then add it to the layers section of the existing serverless.yml.

  • what would need to be changed in Bref to provide that to users?
    Probably documentation only.

I will continue experimenting and let you know my findings.

@deki
Author

deki commented May 31, 2021

So the following needs to be done in addition to what I've already mentioned before:

Setting OTEL_EXPORTER_OTLP_ENDPOINT is not required. However, OPENTELEMETRY_COLLECTOR_CONFIG_FILE can be used to switch the Collector config to a custom collector.yaml.
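
Presumably the variable is optional because an OTLP exporter falls back to a localhost default, which is where the collector layer listens. A minimal PHP sketch of that fallback logic, assuming the OTLP specification defaults (port 4317 for gRPC, 4318 for HTTP). `resolveOtlpEndpoint` is a hypothetical helper written for illustration, not a function from the OpenTelemetry SDK:

```php
<?php
declare(strict_types=1);

/**
 * Pick the OTLP endpoint: the environment variable if it is set,
 * otherwise the specification default for the given protocol.
 * Hypothetical helper, not part of opentelemetry-php.
 *
 * @param array<string, string> $env environment variables, e.g. getenv()
 */
function resolveOtlpEndpoint(array $env, string $protocol = 'grpc'): string
{
    $configured = trim($env['OTEL_EXPORTER_OTLP_ENDPOINT'] ?? '');
    if ($configured !== '') {
        return $configured;
    }

    // Assumed defaults from the OTLP exporter specification.
    return $protocol === 'grpc' ? 'http://localhost:4317' : 'http://localhost:4318';
}

// With the variable unset, this prints the localhost default for gRPC.
echo resolveOtlpEndpoint(getenv()), PHP_EOL;
```

Passing the environment in as an array keeps the function easy to check without touching the real process environment.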

@sc0ttdav3y

Hi @deki, I've come across your posts in various repos about getting XRay + Bref working, and I'm wondering whether you had any success getting it working, and if so, how?

So far I've found a trail of broken AWS blog posts, examples and other docs — the most useful are:

I've tried this:

  • Built and deployed the custom opentelemetry collector as per your advice, and added it to my serverless function's layers section via its ARN
  • pulled in the composer deps for open-telemetry at version 0.0.17 (latest at time of writing)
  • wrangled all my transports, exporters, id generators, propagators, tracers, spans, etc. (gist)

I got to the point where it's almost working.

I'm seeing x-ray logs in my lambda logs, indicating PHP is sending data to the collector:

2023-01-08T11:43:42 {"level":"info","ts":1673178222.7031276,"msg":"TracesExporter","kind":"exporter","data_type":"traces","name":"logging","#spans":1}
2023-01-08T11:43:42 {"level":"info","ts":1673178222.7031856,"msg":"ResourceSpans #0\nResource SchemaURL: \nResource attributes:\n     -> faas.name: Str(XXXXX)\n     -> faas.version: Str($LATEST)\n     -> cloud.region: Str(ap-southeast-2)\n     -> cloud.provider: Str(aws)\nScopeSpans #0\nScopeSpans SchemaURL: \nInstrumentationScope io.opentelemetry.contrib.php \nSpan #0\n    Trace ID       : 63baac6e38c7d4ad3a7b755834919a73\n    Parent ID      : 84bfa37214deb991\n    ID             : d810f21b345baa63\n    Name           : XXXX::YYYY::ZZZZ\n    Kind           : Internal\n    Start time     : 2023-01-08 11:43:42.607256547 +0000 UTC\n    End time       : 2023-01-08 11:43:42.695558894 +0000 UTC\n    Status code    : Unset\n    Status message : \nAttributes:","kind":"exporter","data_type":"traces","name":"logging"}

I know this is sending to the collector because I initially got an error when I got the collector URL wrong.

And I see logs from the collector from time to time when it shuts down, so I know it's running. And I know the x-ray ID is correct, as the ID in the open-telemetry logs matches that on the Lambda invocation.

But despite all this good news, the Lambda function trace in AWS does not include this data.

I feel I'm really close, but the docs are woefully out of date, and PHP isn't exactly a first-class citizen in AWS's eyes. @deki, if you have any insights on this or could share what worked for you, then I'd love to hear it.

I know this isn't really the right place for this question, but I feel observability is the big issue with Bref right now, and if I can get this solved, I'd be happy to share it back to the project.

Thanks, Scott

@deki
Author

deki commented Jan 10, 2023

Hi @sc0ttdav3y,
thanks for raising this. It looks like some things got outdated with the move to 1.0.0 beta. I've informed my colleagues (also about your other issue open-telemetry/opentelemetry-php#906) and they will respond soon.

It's been a while since I tried it, but as I recall I had a similar issue with traces not showing up in X-Ray until I set a proper IdGenerator (X-Ray trace IDs have a different format). See: https://github.com/aws-observability/aws-otel-php/blob/74042ac1992a9d9b6cdf94fecbbfaa90070d107a/SampleApp/src/Controller/AwsSdkInstrumentationController.php#L10
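
To make the format difference concrete: an OpenTelemetry trace ID is 32 hex characters, while X-Ray writes the same 16 bytes as `1-<8 hex>-<24 hex>`, where the first 8 hex digits are the epoch time of the request in seconds. The sketch below is plain PHP with no SDK calls. Both function names are hypothetical, and this is not the AWS IdGenerator linked above, only an illustration of the layout it has to produce. The sample ID is the Trace ID from the collector log earlier in this thread.

```php
<?php
declare(strict_types=1);

/**
 * Rewrite a 32-character hex OpenTelemetry trace ID in X-Ray's
 * "1-<8 hex epoch seconds>-<24 hex>" notation. Hypothetical helper.
 */
function otelToXRayTraceId(string $otelTraceId): string
{
    if (preg_match('/^[0-9a-f]{32}$/', $otelTraceId) !== 1) {
        throw new InvalidArgumentException('Expected 32 lowercase hex characters');
    }

    return sprintf('1-%s-%s', substr($otelTraceId, 0, 8), substr($otelTraceId, 8));
}

/**
 * Generate a trace ID whose first 4 bytes are the current epoch seconds,
 * followed by 12 random bytes, so it stays valid in both notations.
 */
function generateXRayCompatibleTraceId(?int $now = null): string
{
    $seconds = $now ?? time();

    return sprintf('%08x', $seconds) . bin2hex(random_bytes(12));
}

$fromThreadLog = '63baac6e38c7d4ad3a7b755834919a73';

echo otelToXRayTraceId($fromThreadLog), PHP_EOL;     // prints 1-63baac6e-38c7d4ad3a7b755834919a73
echo hexdec(substr($fromThreadLog, 0, 8)), PHP_EOL;  // prints 1673178222
```

The decoded prefix, 1673178222, matches the `ts` value in the same log line, so the ID in that log was already time-prefixed. A purely random 32-character ID would carry an arbitrary timestamp in that position.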

@sc0ttdav3y

Thanks @deki — I am using the IdGenerator provided by AWS, and am trying to propagate the ID from inbound headers — all using their official code. I suspect something with IDs. I'll continue plugging away, and report back here if I get any success.

@sc0ttdav3y

I got this working in the end. It was quite a journey. Thanks to everyone for their help.

Some key insights:

  • Use the "Go" collector Lambda layer, which is supported by AWS and works with PHP. I couldn't get the custom compiled one to work, and AWS looked at me funny when I asked for help.
  • Bref's context contains a getTraceId() method that holds the X-Ray trace for propagating between services.
  • OpenTelemetry is quite young, things are changing, docs are all over the shop right now, and this wasn't easy to get going — pin your composer.json, as minor versions matter.
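
As a rough illustration of the trace propagation point, here is a minimal PHP sketch that splits an X-Ray tracing header into the pieces a span context needs. It assumes the value returned by getTraceId() has the usual `Root=...;Parent=...;Sampled=...` shape. `parseXRayHeader` is a hypothetical helper, not part of Bref or the OpenTelemetry SDK, and the sample header is assembled from the Trace ID and Parent ID in the collector log earlier in this thread.

```php
<?php
declare(strict_types=1);

/**
 * Parse an X-Ray tracing header into a 32-hex trace ID, the parent
 * segment ID (if present) and the sampling decision.
 * Hypothetical helper for illustration only.
 *
 * @return array{traceId: string, parentSpanId: ?string, sampled: bool}
 */
function parseXRayHeader(string $header): array
{
    // Split "Key=Value;Key=Value" pairs into an associative array.
    $fields = [];
    foreach (explode(';', $header) as $pair) {
        $parts = explode('=', trim($pair), 2);
        if (count($parts) === 2) {
            $fields[$parts[0]] = $parts[1];
        }
    }

    $root = $fields['Root'] ?? '';
    if (preg_match('/^1-([0-9a-f]{8})-([0-9a-f]{24})$/', $root, $m) !== 1) {
        throw new InvalidArgumentException('Missing or malformed Root field');
    }

    return [
        'traceId' => $m[1] . $m[2],                   // 32 hex characters
        'parentSpanId' => $fields['Parent'] ?? null,  // 16 hex characters when present
        'sampled' => ($fields['Sampled'] ?? '0') === '1',
    ];
}

$header = 'Root=1-63baac6e-38c7d4ad3a7b755834919a73;Parent=84bfa37214deb991;Sampled=1';
print_r(parseXRayHeader($header));
```

In a Bref handler the `$header` string would come from the context's getTraceId() method rather than a literal. How the parsed values are handed to a tracer depends on the pinned OpenTelemetry version, so that step is left out here.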

I ended up using the "Go" lambda layer along with bref's grpc layer (required for open-telemetry):

layers:
  - ${bref:layer.php-74}
  - ${bref-extra:redis-php-74}
  - ${bref-extra:grpc-php-74}
  - arn:aws:lambda:${aws:region}:901920570463:layer:aws-otel-collector-amd64-ver-0-68-0:1

I then developed a wrapper class to act as a facade to OpenTelemetry:
https://gist.github.com/sc0ttdav3y/0d320d08a726dd0ed204a47bd8ebb78b

Here is more technical background on my journey to get this going:
aws-observability/aws-otel-php#4

I think there's an opportunity in Bref to compile and distribute a custom layer combining the go collector, the grpc extension and a PHP class to help facade all this.

@GrahamCampbell
Contributor

brefphp/extra-php-extensions#501

@johnrobertcobbold

Hello @GrahamCampbell @sc0ttdav3y, does this bref php extension turn this into an "out-of-the-box configuration for AWS Lambda and AWS X-Ray"?

We are interested in using AWS X-Ray with our Bref deployments. Unfortunately, there seem to be no resources on how to use this extension.

@sc0ttdav3y

@johnrobertcobbold I was wondering the same thing :-) Haven’t had a chance yet to look into this work.

I’m still happily running using the results of my work above, but unfortunately OpenTelemetry changed their PHP APIs significantly once I worked out all the magic incantations to get it working, so I’m stuck on an older version of their library right now. That’s what living on the bleeding edge is like, I guess.

I have plans to check this out when I get a chance, but I think it will require some modernisation of my app and that won’t be for a while yet.

@mnapoli
Member

mnapoli commented Jan 23, 2024

Hey everyone! I just released an advanced Sentry integration, especially focused on performance tracing: https://bref.sh/sentry

I'm now looking at a similar package for X-Ray. OpenTelemetry with X-Ray is a mess, so I don't think focusing on OT itself would provide a huge benefit (rather looking at a direct integration with X-Ray). But I thought I'd mention that here, maybe you have an opinion on this.

@sc0ttdav3y

In my use case I use OpenTelemetry's PHP library to manually start and end spans around important parts of our code, and I also record events and metrics such as DB queries, as well as exceptions and errors. This all then flows through OpenTelemetry into X-Ray, where it lines up with my call stack and shows calls to other AWS services, and it allows AWS to drive CloudWatch graphs and alarms from all this data.

We can't use other vendors like NewRelic or DataDog for compliance reasons.

The important part for me is to be able to manually instrument spans, events, metrics and exceptions, and to cascade calls to other AWS services in a way that passes the traceId to build a complete picture.

I agree that OpenTelemetry is a bit of a mess, and for us it's just a necessary pipeline into X-Ray. My focus is on PHP -> X-Ray -> CloudWatch, so if your solution replaces OpenTelemetry with something else, then so long as I can do the things I mention above then great 👍

@mnapoli
Member

mnapoli commented Jan 24, 2024

Very interesting, thanks! Are you currently doing that on Lambda (running the OT -> X-Ray extension)? Or on servers/ECS/etc.?

@sc0ttdav3y

I run on both Lambda and ECS, deployed via serverless framework.
