Set scrape timeout based on Prometheus header #60

Open: wants to merge 4 commits into main
@@ -32,6 +32,7 @@
 import io.rsocket.util.DefaultPayload;
 import org.springframework.stereotype.Controller;
 import org.springframework.web.bind.annotation.GetMapping;
+import org.springframework.web.bind.annotation.RequestHeader;
 import org.springframework.web.bind.annotation.RestController;
 
 import io.rsocket.util.EmptyPayload;
@@ -48,6 +49,7 @@
 import java.nio.channels.ClosedChannelException;
 import java.security.*;
 import java.security.spec.PKCS8EncodedKeySpec;
+import java.time.Duration;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.stream.Collectors;
@@ -157,15 +159,20 @@ public Mono<String> proxyMetrics() {
 	}
 
 	@GetMapping(value = "/metrics/connected", produces = "text/plain")
-	public Mono<String> prometheus() {
+	public Mono<String> prometheus(@RequestHeader(value = "X-Prometheus-Scrape-Timeout-Seconds", required = false) String timeoutHeader) {
+		Duration timeout = determineTimeout(timeoutHeader);
+
 		return Flux
 			.fromIterable(scrapableApps.entrySet())
 			.flatMap(socketAndState -> {
 				ConnectionState connectionState = socketAndState.getValue();
 				RSocket rsocket = socketAndState.getKey();
 				Timer.Sample sample = Timer.start();
-				return rsocket
-					.requestResponse(connectionState.createKeyPayload())
+				Mono<Payload> request = rsocket.requestResponse(connectionState.createKeyPayload());
+				if (timeout != null) {
+					request = request.timeout(timeout);
+				}
+				return request
					.map(payload -> connectionState.receiveScrapePayload(payload, sample))
					.onErrorResume(throwable -> {
						scrapableApps.remove(rsocket);
@@ -184,6 +191,25 @@ public Mono<String> prometheus() {
 			.collect(Collectors.joining("\n"));
 	}
 
+	private Duration determineTimeout(String timeoutHeader) {
+		if (timeoutHeader == null) {
+			return null;
+		}
+
+		try {
+			Duration timeout = Duration
+				.ofMillis((long) (Double.parseDouble(timeoutHeader) * 1_000))
+				.minus(properties.getTimeoutOffset());
+
+			if (timeout.isNegative() || timeout.isZero()) {
+				return null;
+			}
+			return timeout;
+		} catch (NumberFormatException e) {
+			return null;
+		}
+	}
+
 	class ConnectionState {
 		private final KeyPair keyPair;
 
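
To illustrate what the new determineTimeout method does, here is a minimal standalone sketch; the class name and the fixed 500 ms offset are placeholders standing in for the controller and its properties.getTimeoutOffset(). With the default 10s Prometheus scrape timeout it yields a 9.5s per-instance timeout, and it falls back to "no timeout" whenever the header is missing, unparsable, or not larger than the offset:

import java.time.Duration;

// Standalone sketch of the determineTimeout logic above; the fixed
// 500 ms offset stands in for properties.getTimeoutOffset().
class DetermineTimeoutSketch {

	private static final Duration OFFSET = Duration.ofMillis(500);

	static Duration determineTimeout(String timeoutHeader) {
		if (timeoutHeader == null) {
			return null; // no header: apply no per-instance timeout
		}
		try {
			Duration timeout = Duration
				.ofMillis((long) (Double.parseDouble(timeoutHeader) * 1_000))
				.minus(OFFSET);
			if (timeout.isNegative() || timeout.isZero()) {
				return null; // the offset swallows the whole budget
			}
			return timeout;
		} catch (NumberFormatException e) {
			return null; // unparsable header value
		}
	}

	public static void main(String[] args) {
		System.out.println(determineTimeout("10"));   // PT9.5S
		System.out.println(determineTimeout("0.2"));  // null (0.2s minus 0.5s is negative)
		System.out.println(determineTimeout("oops")); // null (not a number)
		System.out.println(determineTimeout(null));   // null (header missing)
	}
}
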
PrometheusControllerProperties.java
@@ -17,12 +17,16 @@
 
 import org.springframework.boot.context.properties.ConfigurationProperties;
 
+import java.time.Duration;
+
 /**
  * @author Christian Tzolov
  */
 @ConfigurationProperties("micrometer.prometheus-proxy")
 public class PrometheusControllerProperties {
 
+	private static final Duration DEFAULT_TIMEOUT_OFFSET = Duration.ZERO;
+
 	/**
 	 * Proxy accept TCP port.
 	 */
@@ -33,6 +37,11 @@ public class PrometheusControllerProperties {
 	 */
 	private int websocketPort = 8081;
 
+	/**
+	 * Scrape timeout offset.
+	 */
+	private Duration timeoutOffset = DEFAULT_TIMEOUT_OFFSET;
+
 	public int getTcpPort() {
 		return tcpPort;
 	}
@@ -48,4 +57,16 @@ public int getWebsocketPort() {
 	public void setWebsocketPort(int websocketPort) {
 		this.websocketPort = websocketPort;
 	}
+
+	public Duration getTimeoutOffset() {

Review thread:

Member:
Why is this needed? Shouldn't we set it to whatever value the client wants?

Author:
If the timeout were set to the exact value of the Prometheus scrape timeout (given by the header), any one app instance that is too slow to respond (and thus times out) would also cause us to hit the Prometheus scrape timeout, since the two are the same in this hypothetical. As a result, a single instance timing out means no metrics are successfully scraped from any of the other connected instances, because Prometheus hits its scrape timeout and closes the connection.

So, a small offset is needed to make the individual instance scrape timeout slightly less than the Prometheus scrape timeout, allowing time for the metrics to be scraped from all connected instances, concatenated and returned via the network before Prometheus itself times out.
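
As a concrete illustration (hypothetical numbers): with the default Prometheus scrape_timeout of 10s and an offset of 500ms, each connected instance gets a 9.5s timeout (10s - 0.5s), so even if one instance stalls, the proxy still has roughly half a second to concatenate the remaining results and return them before Prometheus gives up.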

Member:
If this is needed for the reason you describe, shouldn't we set the default to a small but nonzero value?

Author:
It depends. Defaulting to a zero offset maintains backwards compatibility with the old behavior but doesn't give the benefit of the offset from a timeout perspective.

On the other hand, defaulting to a nonzero offset gives the timeout benefit described above but is technically a breaking change. For example, consider a Prometheus setup with a 10s scrape timeout and assume we chose an offset of 1s. In this case, the RSocket timeout is 9s. Let's say the user running this Prometheus setup previously had an RSocket scrape duration of about 9.5 seconds. Prior to this change, they'd be running their setup without issue, since there was no explicit RSocket timeout and 9.5 seconds is lower than the Prometheus 10s timeout. With the introduction of this change, however, this user's scrapes suddenly start timing out on the RSocket side at 9 seconds, and they no longer get the metrics they were getting previously.

I don't think this is a particularly likely scenario, and the odds of it happening decrease as the default offset decreases, but it's definitely a consideration.

Thoughts?

Member:
I really appreciate you thinking about backward compatibility and the potential negative effects of this. It's important for us to take these things seriously and avoid as much pain for users as possible. That said, as long as we change this behavior in a minor release (as opposed to a patch release), make note of it in the release notes, and keep it configurable so that people negatively affected can effectively undo the behavior change, I think it is worthwhile moving things in a better direction. That's how I feel anyway. @jonatan-ivanov What do you think?
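
For anyone affected, undoing a nonzero default should just be a matter of setting the offset back to zero through the @ConfigurationProperties prefix shown above, for example with the following line in application.properties (property name inferred from the timeoutOffset field, not taken from the PR itself):

micrometer.prometheus-proxy.timeout-offset=0s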

Member:
I looked into this and it makes sense: Prometheus sends the scrape_timeout value in X-Prometheus-Scrape-Timeout-Seconds, and I did not find a way to account for latency on the Prometheus server side. I find this unfortunate since:

  1. If you have multiple Prometheus servers in different network locations with different latencies (let's pretend the difference is significant enough), the scraped app needs to account for the highest network latency. This feels somewhat wrong because the scraped app should not need to be aware of this.
  2. Or if you move a Prometheus server from one network location to another (let's pretend the latency change is significant enough), you need to change the offset for every scraped app.

But if the offset were on the Prometheus side, the issue would be somewhat similar:

  1. Scraped apps can be in multiple network locations, so Prometheus would need to account for the highest network latency, though configuring this per job could solve that.
  2. Or if you move an app that is scraped, you need to change all Prometheus instances, though you usually have fewer Prometheus instances than apps.

So I guess this is what we have; it makes sense to me.

I think I agree with @shakuzen about having a small but non-zero default value in a milestone release. To me the interesting part is what that value should be.
It should be small enough that the negative effect is minimal and big enough that it can account for the network delay, which includes the transfer cost, which in turn depends on the response size. My first guess is around 100-500ms.
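
For reference, for a job with a 10s scrape timeout the scrape request carries a header along the lines of the following (exact formatting may depend on the Prometheus version):

X-Prometheus-Scrape-Timeout-Seconds: 10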

Member:
What do you think @jdlafaye about having a small default offset?

Author:
I think it's reasonable to set a small default offset as it will provide an immediate benefit in a number of setups. I'd probably lean towards 500ms as that meshes well with the default Prometheus scrape timeout of 10s and also happens to be the same as what Blackbox Exporter uses.

Let me know your thoughts and I can make that change.

+		return timeoutOffset;
+	}
+
+	public void setTimeoutOffset(Duration timeoutOffset) {
+		if (timeoutOffset == null || timeoutOffset.isNegative()) {
+			this.timeoutOffset = DEFAULT_TIMEOUT_OFFSET;
+		} else {
+			this.timeoutOffset = timeoutOffset;
+		}
+	}
 }