Why do file inventory jobs take so long for some users #1274
Comments
In the QA MediaFlux instance my permissions are very different from Matt's permissions:
You can also get permissions via the
|
We now have a user complaint about this: RT #56623: Issue with manifest
|
@matthewjchandler do you think you can provide some details about the issue, for example the user that had the issue and/or the project that they are trying to download? I tried to view the ticket in RT (guessing this is the right link https://cses.princeton.edu/tickets/Ticket/Display.html?id=56623) but I get "No permission to view ticket" |
@hectorcorrea I don't have the project ID, but the user is mnaydan, who I believe is a data manager on the project in question. |
The job and project in question in Production:
Project |
I re-ran the job in production at 9:46 AM and as of 10:13 AM it has exported 1 million files and it's still going.
|
My file inventory finished in about an hour, which is not great but is better than not finishing. Strangely, the file it produced had only 2,115,517 files, but the TigerData app shows 2,596,619. To make things more interesting, the MediaFlux desktop app also shows 2,115,517 files. I guess the count we show in the app is picking up something else. Not sure what that is. |
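As a rough sanity check on the numbers above, here is a minimal Ruby sketch. It assumes the export kept a steady pace after the first 27 minutes (9:46 AM to 10:13 AM), that "1 million" is an approximate figure, and that the total is the 2,115,517 files written to the export. The variable names are illustrative only.

```ruby
# Figures quoted in this thread; "1 million" is approximate.
exported_so_far = 1_000_000
elapsed_seconds = 27 * 60        # 9:46 AM to 10:13 AM
files_in_export = 2_115_517
files_in_app    = 2_596_619

# Projected duration if the pace stayed constant.
files_per_second  = exported_so_far / elapsed_seconds.to_f
projected_minutes = files_in_export / files_per_second / 60.0

# Gap between the count shown in the app and the exported count.
count_gap     = files_in_app - files_in_export
gap_percent   = 100.0 * count_gap / files_in_app

puts format("pace: %.0f files/s, projected: %.0f minutes", files_per_second, projected_minutes)
puts format("count gap: %d files (%.1f%% of the app count)", count_gap, gap_percent)
```

Under those assumptions the projection lands close to the "about an hour" that the job actually took.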
Aterm also shows 2115517 when we query for a count (the
|
File count issue now tracked here: #1327 |
With the logging information introduced in PR #1328 it's obvious the slowness is in the iterator. Fetching each page of data takes less than one second when running under my user (
Whereas for Matt's user fetching each page of data takes almost 10 seconds:
|
Updating my user It was taking 10 seconds for Matt rather than 15 seconds but the experiment shows that the user permissions are having a huge impact on the iterator performance. |
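The per-page comparison described here can be reproduced with a small timing wrapper. This is only a sketch: `time_pages`, `summarize`, and `fake_fetch` are hypothetical names, and the fake fetch sleeps instead of calling MediaFlux, so it would need to be swapped for the real iterator call.

```ruby
require "benchmark"

# Hypothetical helper: runs the given block once per page and
# returns the elapsed wall-clock seconds for each page.
def time_pages(page_count)
  (1..page_count).map do |page|
    Benchmark.realtime { yield page }
  end
end

# Builds a one-line summary suitable for a log file.
def summarize(label, timings)
  total = timings.sum
  avg = total / timings.size
  format("%s: %d pages, avg %.3fs/page, total %.3fs", label, timings.size, avg, total)
end

# Stand-in for a page fetch (assumption: replace with the real request).
fake_fetch = ->(_page) { sleep(0.01) }

timings = time_pages(3) { |page| fake_fetch.call(page) }
puts summarize("example user", timings)
```

Running the same wrapper once per account, with nothing else changed, keeps the comparison limited to the effect of permissions.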
Unfortunately, running the
The log file shows:
|
Running the File Inventory from the application with the reduced permissions still takes 10-15 seconds per iteration, so this is not a case where the server might have been too busy at one time but not the other. Could the issue be related to how we make the requests from Ruby or how we manage the authentication/session in those requests? |
Re-adding the admin role to my account gets the job to run fast:
I removed the |
With Matt as an administrator he gets the same performance as my user does: each batch of records is fetched in less than a second.
His current permissions (plus the
|
Issue about FileInventoryJob crashing: #1331 |
Deployed version When querying MediaFlux directly my account fetches a smaller number of files (2,115,517) than Chuck's (2,116,016) or the original user, but it is close enough to benchmark as a non-admin. As of 5 PM the File Inventory job has exported half the file inventory and should end around 6:30 PM. Fetching pages is slowing down: it used to take 0.5 seconds per page and now it's hovering around 1 second per page, but hopefully it does not slow down much more. Memory utilization is about 6%, which is good. Update: The job finished in 62 minutes and my account is not an administrator in production. |
Deployed a new version of the code to Staging and QA that lets us run a FileInventory for a specific user without using CAS authentication and noticed that the same problem persists. The good news is that the problem is not CAS; the bad news is that I still don't know what the issue is. Here is what I did:
When running with My user with
|
While fixing issue #1263 we noticed that the file inventory job takes a very long time when Matt (user_id 186) runs it in QA. The file inventory for project_id 44 (which has 189,000 files) takes 65 minutes when Matt runs it, but as shown below it only takes 5 minutes when Hector (user_id 192) runs it.
Below is what the user_request table shows in QA.
We think this could be related to the fact that Hector is an admin on MediaFlux QA but Matt isn't.
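To put the two runs on the same scale, the durations above can be converted to throughput. This is a minimal sketch using only the figures quoted in this description; the helper name `files_per_second` is illustrative.

```ruby
FILE_COUNT = 189_000 # files in project_id 44, as quoted above

# Converts a job duration in minutes to files exported per second.
def files_per_second(files, minutes)
  files / (minutes * 60.0)
end

matt_rate   = files_per_second(FILE_COUNT, 65) # non-admin run
hector_rate = files_per_second(FILE_COUNT, 5)  # admin run
slowdown    = hector_rate / matt_rate

puts format("Matt:   %.1f files/s", matt_rate)   # prints "Matt:   48.5 files/s"
puts format("Hector: %.1f files/s", hector_rate) # prints "Hector: 630.0 files/s"
puts format("Slowdown factor: %.0fx", slowdown)  # prints "Slowdown factor: 13x"
```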