Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Libpitt/enumerate thread #60

Draft
wants to merge 3 commits into
base: dev-integrate
Choose a base branch
from
Draft

Conversation

libpitt
Copy link
Collaborator

@libpitt libpitt commented Jun 28, 2024

Change log:

  1. Use enumerate alive threads and thread naming to check for running jobs

@maxsibilla Does this look okay to you? My commit:
main...libpitt/enumerate-thread

cc @yuanzhou @shirey

yuanzhou and others added 2 commits June 27, 2024 15:51
src/app.py Outdated
logger.info('Started live reindex all')
else:
logger.info('Could not start live reindex all. Thread already running.')
return 'Request of live reindex all documents not accepted', 409
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@libpitt I would change the return to add that the method is already running.

"Request of live reindex all documents not accepted. Reindex all process already running" - something like that

@yuanzhou
Copy link
Member

@libpitt @maxsibilla have you tested in a docker environment? The search-api runs as a multi-processing and multi-threading mode in the docker deployment.

@maxsibilla
Copy link
Contributor

@libpitt @maxsibilla have you tested in a docker environment? The search-api runs as a multi-processing and multi-threading mode in the docker deployment.

Just locally. I will test that now.

@maxsibilla
Copy link
Contributor

@yuanzhou That seems to be working in a docker environment

@yuanzhou
Copy link
Member

I'm converting this PR to DRAFT for now based on the conversation from Slack. If later we are sure we won't need this feature, this PR can be closed by then.

This implementation doesn't work well. I was still able to start the reindex-all process multiple times when the thread didn't exist in one of the workers. In practice, we may have more workers and each worker may have more than 1 threads.

A former developer added this PUT /reindex-all at the early phase of search-api. As we have a lot more data now, it no longer serves the desired purpose using a REST API call to handle the reindex that may take a few hours to complete. The lack of dedicated logging is another major reason we should deprecate it.

We will be working on New reindex-all script feature with dedicated logging in the upcoming sprint.

This blog explained Python (we use 3.9 currently) doesn't have true multithreading, yet. The proper support is coming to Python 3.13.

spawned uWSGI master process (pid: 35)
spawned uWSGI worker 1 (pid: 81, cores: 1)
spawned uWSGI worker 2 (pid: 83, cores: 1)
spawned uWSGI worker 3 (pid: 85, cores: 1)
spawned uWSGI worker 4 (pid: 87, cores: 1)

'PUT /reindex-all' triggered at 1720494538.1338146
Executing reindex_all() in thread id: 46912539168896, native_id: 85
Parent process id: 35
Worker process id: 85
executing _is_job_thread_running(). Thread active_count: 1
Thread name: uWSGIWorker3Core0, thread id: 46912539168896, native_id: 35
Started live reindex all

'PUT /reindex-all' triggered at 1720494565.7901337
Executing reindex_all() in thread id: 46912539168896, native_id: 87
Parent process id: 35
Worker process id: 87
executing _is_job_thread_running(). Thread active_count: 1
Thread name: uWSGIWorker4Core0, thread id: 46912539168896, native_id: 35
Started live reindex all

'PUT /reindex-all' triggered at 1720494590.0925567
Executing reindex_all() in thread id: 46912539168896, native_id: 85
Parent process id: 35
Worker process id: 85
executing _is_job_thread_running(). Thread active_count: 2
Thread name: uWSGIWorker3Core0, thread id: 46912539168896, native_id: 35
Thread name: reindex_all, thread id: 46914001823488, native_id: 100
Could not start live reindex all. Thread id: 46912539168896, native_id: 85 already running.

'PUT /reindex-all' triggered at 1720494619.6040359
Executing reindex_all() in thread id: 46912539168896, native_id: 81
Parent process id: 35
Worker process id: 81
executing _is_job_thread_running(). Thread active_count: 1
Thread name: uWSGIWorker1Core0, thread id: 46912539168896, native_id: 35
Started live reindex all

'PUT /reindex-all' triggered at 1720494658.07922
Executing reindex_all() in thread id: 46912539168896, native_id: 87
Parent process id: 35
Worker process id: 87
executing _is_job_thread_running(). Thread active_count: 2
Thread name: uWSGIWorker4Core0, thread id: 46912539168896, native_id: 35
Thread name: reindex_all, thread id: 46914001823488, native_id: 112
Could not start live reindex all. Thread id: 46912539168896, native_id: 87 already running.

'PUT /reindex-all' triggered at 1720494658.6580398
Executing reindex_all() in thread id: 46912539168896, native_id: 83
Parent process id: 35
Worker process id: 83
executing _is_job_thread_running(). Thread active_count: 1
Thread name: uWSGIWorker2Core0, thread id: 46912539168896, native_id: 35
Started live reindex all

@yuanzhou yuanzhou marked this pull request as draft July 12, 2024 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants