Some APIs result in 500 error #665
Comments
hi @MadhuNEC :-)
The variable name is actually WORKERS.
Can you post more details about the error you're getting? Any trace of that in the logs? What's the backend you're using? Crate DB or Timescale? What's the load on the DB? We've had similar issues in the past which boiled down to not allocating enough resources to the database backend. So when QL tried to run a query, the DB would just refuse to run it b/c it was overloaded. It could be you're experiencing something similar, but I can't be sure. Like I said, we'd need to know more about your test environment, errors you get in the logs, DB load, etc.
Yes, I am using the WORKERS variable. Sorry, I made a mistake in the comment; I have updated it. I am using the CrateDB backend. For DB load, there are 10 entities and each entity has approx 5000 data points. For example, there are 10 entities like tid1, tid2, tid3, etc., and each entity (tid1, tid2, ...) contains approx 5000 values of pressure.
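As an aside, a quick way to double-check how much data is actually sitting in CrateDB is to query it directly; below is a minimal sketch using the crate Python client. The table name is an assumption: QuantumLeap normally maps an entity type to an et-prefixed table (e.g. doc.ettyp for type typ with no tenant), so check the Crate admin UI for the actual name in your setup.

```python
from crate import client  # pip install crate

# Connect to CrateDB's HTTP endpoint; adjust host/port to your deployment.
conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

# Assumption: entity type "typ" stored with no tenant ends up in table doc.ettyp.
cursor.execute("SELECT entity_id, count(*) FROM doc.ettyp GROUP BY entity_id")
for entity_id, row_count in cursor.fetchall():
    print(entity_id, row_count)
```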
cool, just wanted to double-check w/ you to rule out possible config issues.
Then yes, I agree w/ you QL could be the bottleneck. At a 20 req/sec throughput rate and 40 workers, it looks like each worker should be busy for up to 2 secs per request. That means the producer (jmeter) is faster than the consumer (QL), so eventually QL ends up in a situation where all 40 workers are busy but new requests are still coming in. Keep in mind workers do the work sequentially (pun intended :-), so 40 workers means at most 40 concurrent queries. In this scenario Gunicorn has no worker process to assign incoming requests to, so it just returns a 500.

On the other hand, it could also well be that each request takes up to 2 secs on average not b/c QL is slow, but rather b/c Crate DB can't keep up w/ the query rate. I've seen this in the past and the solution was to give Crate DB enough RAM to perform decently. Crate is a fine piece of software, but you can't expect it to match your workload if you don't give it enough resources; have a look at the manual for the details. Then you could also up the number of QL workers.

On a side note, we never really worked on query optimisation, but we did identify some potential performance hot spots. For a complete analysis you can read the performance model we wrote up; in particular, the issues/performance section could be applicable to your scenario. But again, keep in mind that model is just an abstract model, we've never validated it w/ real measurements. Speaking of which, one way to get to the bottom of this would be to use QL's built-in telemetry to figure out how the QL load varies as a function of the input requests and how much of the processing time is spent waiting for Crate DB to return query result sets. Hope this helps!
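To make that arithmetic concrete, here is a minimal sketch (plain Python, not part of QL) of the back-of-the-envelope worker sizing described above: by Little's law, the average number of busy workers is roughly the request rate times the average request time, so any WORKERS value below that product eventually leaves Gunicorn with no free worker to accept new requests.

```python
import math

def workers_needed(req_per_sec: float, avg_request_secs: float, headroom: float = 1.5) -> int:
    """Estimate a Gunicorn WORKERS value that can absorb a steady request rate.

    req_per_sec      -- load generator throughput (e.g. jmeter's 20 req/s)
    avg_request_secs -- average time QL takes to serve one request
    headroom         -- safety factor for bursts and latency variance
    """
    busy_workers = req_per_sec * avg_request_secs  # average number of in-flight requests
    return math.ceil(busy_workers * headroom)

if __name__ == "__main__":
    print(workers_needed(20, 2.0))  # 20 req/s at ~2 s/request -> 60 with 1.5x headroom
    print(workers_needed(20, 0.5))  # if the DB answered in 0.5 s, 15 workers would do
```

Note this only helps if QL and Crate DB can actually sustain the measured per-request latency; if the database is the bottleneck, adding workers just queues up more slow queries.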
Hi @c0c0n3, I have checked the number of busy workers during the script execution: only 4-5 workers are busy at any point in time.
I have executed the QuantumLeap APIs from jmeter. Below are the details of the script execution:
Hi @MadhuNEC :-) So I've finally found the time to look at this issue. What I did: I followed the steps in https://github.com/orchestracities/ngsi-timeseries-api/wiki/Gauging-Performance to do some load testing. Then, as explained in the wiki article, I used Pandas to do some basic data analysis. It turns out Gunicorn actually distributed the work quite evenly among my 10 workers (I don't have enough horsepower to test w/ 40), so basically I got pretty much the same results as in the wiki article.
How did you do that? Looking at the code in your
and do statistical analysis with Pandas as explained in the article, and let us know if you get different results? Thanks a lot!!
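For reference, here is a minimal sketch of the kind of per-worker analysis the wiki article describes, assuming you export the load-test results to a CSV with one row per request; the file name and column names (worker_pid, duration_ms, status) are placeholders, not QL's actual log format, so adapt them to whatever your test run produces.

```python
import pandas as pd

# Hypothetical load-test export: one row per request, with the Gunicorn worker
# PID that served it, how long it took and the HTTP status code returned.
df = pd.read_csv("load_test_results.csv")

# How evenly did Gunicorn spread the requests across workers?
per_worker = df.groupby("worker_pid").agg(
    requests=("duration_ms", "size"),
    mean_ms=("duration_ms", "mean"),
    p95_ms=("duration_ms", lambda s: s.quantile(0.95)),
)
print(per_worker.sort_values("requests", ascending=False))

# How many requests failed with a 5xx?
print(df["status"].value_counts().sort_index())
```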
Hi @c0c0n3, can you please suggest a way to do this?
Hi @NEC-Vishal, did you run these commands before running the benchmark?
$ cd /path/to/ngsi-timeseries-api
$ source setup_dev_env.sh
$ pipenv install --dev
$ cd src/tests/benchmark
We are trying to execute QuantumLeap APIs via a jmeter script, but some of the APIs result in a 500 error. Below is the configuration for the API execution:
API: GET http://host/v2/entities/tid1/attrs/pressure?type=typ&limit=100&offset=1534&fromDate=2022-06-02T00:00:00Z&toDate=2022-07-04T23:59:59Z
Throughput: 20 req/s
WORKERS: 40
QuantumLeap Version: 0.8.1
We have tried increasing the value of the WORKERS variable; it decreases the probability of a 500 error, but the 500 errors still occur.
Is increasing the WORKERS value the correct solution for this issue? If yes, how can I calculate the correct WORKERS value?
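For what it's worth, here is a minimal sketch of how one could replay the same query outside jmeter to see how often the 500 shows up and what the average latency looks like. The host is a placeholder taken from the URL above, and the Fiware-Service headers are an assumption: include whatever tenant headers your deployment actually uses, or drop them if you don't use multi-tenancy.

```python
import time
import requests

# Same query as in the jmeter test plan above; replace "host" with your QL host.
URL = ("http://host/v2/entities/tid1/attrs/pressure"
       "?type=typ&limit=100&offset=1534"
       "&fromDate=2022-06-02T00:00:00Z&toDate=2022-07-04T23:59:59Z")

# Assumption: if the data was written under a tenant, QL needs the same headers on reads.
HEADERS = {"Fiware-Service": "your-tenant", "Fiware-ServicePath": "/"}

def probe(n: int = 100) -> None:
    """Fire n sequential requests and report 500 counts and average latency."""
    latencies, errors = [], 0
    for _ in range(n):
        start = time.monotonic()
        resp = requests.get(URL, headers=HEADERS, timeout=30)
        latencies.append(time.monotonic() - start)
        if resp.status_code >= 500:
            errors += 1
    print(f"500s: {errors}/{n}, avg latency: {sum(latencies) / len(latencies):.2f}s")

if __name__ == "__main__":
    probe()
```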