In a multi-tenant cloud use case, multiple iModels are served through a single Node.js process. Node.js is typically not designed for multi-threaded workflows. By default, concurrent queries use four worker threads and one monitor thread per iModel. The monitor thread runs every second, checks the quota, and can interrupt a query that exceeds its assigned quota. Worker threads run whenever there are queries in the queue and use a thread-safe callback to post results back to the Node.js process.
Node.js is essentially a single-threaded runtime built around an event loop. To send or receive data, tasks must be queued. When the event loop processes an asynchronous task, it picks one from the queue and runs a callback that transforms data produced by the native layer on another thread into Node.js-compatible data. These asynchronous tasks are queued in the microtask queue, and every time the event loop runs, Node.js processes all tasks in the microtask queue.
In a multi-tenant setup, let's say we have $n$ iModels open in a single Node.js process. Each iModel will have a minimum of 5 threads, resulting in $n \times 5$ threads. However, with only 4 CPUs available, the Kubernetes pod will divide the CPU time among all the threads, causing each of them to take longer to execute.
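To make the contention concrete, here is a rough back-of-the-envelope sketch. The helper `cpuSharePerThread` is hypothetical (it is not part of iTwin.js), and it assumes the worst case where every thread is runnable at the same time; in practice the monitor threads sleep most of the time, so the real share would be somewhat better.

```typescript
// Hypothetical helper for illustration only; not an iTwin.js API.
// Worst-case fraction of one CPU each thread gets if all threads are runnable at once.
function cpuSharePerThread(openIModels: number, threadsPerIModel: number, cpus: number): number {
  const totalThreads = openIModels * threadsPerIModel;
  // A thread can never use more than one full CPU.
  return Math.min(1, cpus / totalThreads);
}

// 10 iModels x 5 threads = 50 threads competing for 4 CPUs.
const share = cpuSharePerThread(10, 5, 4); // 4 / 50 = 0.08
```

With a single iModel the same 4 CPUs would cover 5 threads at 0.8 of a CPU each, which is roughly the situation concurrent queries were tuned for.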
The primary goal of concurrent queries is to maximize CPU time per thread.
If there aren't enough full CPUs available per thread, all queries will run slower and may time out more frequently.
Node.js will have to process a much longer microtask queue under load, making it less responsive. In the worst case, it will be stuck processing all microtasks from $n \times 4$ threads, in addition to queuing new queries.
This setup fails mainly because iTwin.js and concurrent queries were designed for a single backend and single iModel, not for a multi-tenant environment.
All synchronous calls from JavaScript to native code block the thread. More connections and higher load on the element API will make Node.js unresponsive.
When this monolithic server backend fails, all users of those iModels are affected. New backends may be spawned with less contention initially, but they will also become unusable under load over time.
We've overcomplicated the architecture by trying to create a server that manages multiple users inside Kubernetes, which is designed to do the same thing.
The memory required by this number of threads is significant. Each connection takes up 32 MB of SQLite page cache. If $n = 10$, then at maximum load roughly 1 GB is used solely for the per-thread SQLite connections, not including other caches.
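The memory figure can be reproduced with a quick estimate. This is only a sketch: `pageCacheMb` is a hypothetical helper, and it assumes each thread that runs queries holds its own SQLite connection with a full 32 MB page cache, which the issue implies but does not state exactly.

```typescript
// Per-connection SQLite page cache size quoted above, in MB.
const PAGE_CACHE_MB = 32;

// Hypothetical helper: total page cache if every connection fills its cache.
function pageCacheMb(openIModels: number, connectionsPerIModel: number): number {
  return openIModels * connectionsPerIModel * PAGE_CACHE_MB;
}

// Assumption: one connection per worker thread (4 per iModel).
const workersOnly = pageCacheMb(10, 4); // 10 * 4 * 32 = 1280 MB

// Assumption: the monitor thread also holds a connection (5 per iModel).
const allThreads = pageCacheMb(10, 5); // 10 * 5 * 32 = 1600 MB
```

Either way the total lands above 1 GB for ten iModels before any other cache is counted.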
Overall, we should move away from multi-tenant setups when possible and investigate alternatives. For example, we could scale up and down using Kubernetes, where the backend performs a single task, such as running a query, and then terminates. Even though concurrent queries emulate a server, it would be best if they only had two threads: one to compute the query and another to interrupt it for quota purposes. We should leave the scaling to Kubernetes.
In the meantime, while we have this architecture, we can improve it. Here is how.
What needs to change
A single pool of threads should be used by the entire Node.js process, as having more threads than CPUs is not beneficial.
All connections should queue tasks to the same pool of threads, optimizing CPU usage.
If there are a significant number of pending tasks, we should return an error to Kubernetes to scale up, indicating that the current process is busy and has too many tasks in the queue.
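The three changes above can be sketched as a single process-wide pool with a bounded queue. This is a minimal illustration, not the iTwin.js implementation: `SharedTaskPool`, `ServerBusyError`, `maxWorkers`, and `maxPending` are all hypothetical names, and the real work would run on native threads rather than as JavaScript promises. The sketch only shows the admission and queueing logic.

```typescript
type Task<T> = () => Promise<T>;

// Hypothetical error signalling that this process should not accept more work.
class ServerBusyError extends Error {
  constructor(pending: number) {
    super(`Too many pending tasks (${pending}); process is busy`);
    this.name = "ServerBusyError";
  }
}

// One pool shared by every connection in the process (hypothetical sketch).
class SharedTaskPool {
  private readonly pending: Array<() => void> = [];
  private running = 0;

  // maxWorkers would typically match the CPU count; maxPending caps the backlog.
  constructor(private readonly maxWorkers: number, private readonly maxPending: number) {}

  get runningCount(): number { return this.running; }
  get pendingCount(): number { return this.pending.length; }

  // True when all workers are occupied and the backlog is full.
  get isBusy(): boolean {
    return this.running >= this.maxWorkers && this.pending.length >= this.maxPending;
  }

  submit<T>(task: Task<T>): Promise<T> {
    if (this.isBusy) {
      // Reject instead of queueing so the caller can report the overload upstream.
      return Promise.reject(new ServerBusyError(this.pending.length));
    }
    return new Promise<T>((resolve, reject) => {
      this.pending.push(() => {
        this.running++;
        // Wrapping in a Promise also captures synchronous throws from task().
        new Promise<T>((res) => res(task()))
          .then(resolve, reject)
          .then(() => {
            this.running--;
            this.drain();
          });
      });
      this.drain();
    });
  }

  // Start queued tasks while a worker slot is free.
  private drain(): void {
    while (this.running < this.maxWorkers && this.pending.length > 0) {
      const start = this.pending.shift()!;
      start();
    }
  }
}
```

A request handler could, for example, translate `ServerBusyError` into an HTTP 503 response so that the orchestration layer sees the process as overloaded and scales up. How that signal actually reaches Kubernetes is left open here.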