Extremely slow sync #16

Open
desbma opened this issue Dec 14, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@desbma

desbma commented Dec 14, 2024

Hi,

I'm trying this project as a replacement for a previous isync/mbsync-based setup, to back up GMail accounts.

The first-run experience with the wizard is flawless; however, the performance of the following sync operation is abysmal compared to mbsync.

For a GMail account with 7.5 GB of content (according to Google), this is my experience so far:

  1. Shows "listing envelopes" for about 30min, without creating any files
  2. Sync starts, using 100% CPU on one core for a while
  3. Sync continues without using much CPU, but is still extremely slow. After more than 12 hours of runtime, it has downloaded about 5 GB, and global progress shows 29%.

My system has a 32-core CPU and a 1 Gb/s fiber network link. I'm using a version compiled from the master branch, with a configuration identical to the one suggested for Gmail. left.backend.root-dir is set to a tmpfs mount. I tried lowering right.backend.clients-pool-size to 2, but that did not help.

Any suggestion to improve performance is welcome.

Thanks

@soywod
Member

soywod commented Dec 19, 2024

First of all, thank you for trying Neverest.

I'm sorry about your bad experience. I just tried again to sync my Gmail account (~4 GB) using master (71ec0c738a69abf2dddc7ba846ebe1a763aedf48), with a setup similar to yours, but I do not share the same experience. So something must be wrong in your configuration. My guesses are the following:

  • Did you exclude [Gmail]/All Mail from the sync? It should definitely not be included in the sync, because it's not really a mailbox, more like an alias. Including it would double the size of the account. I just realized that it is not mentioned (anymore?!) in the README, I will fix that.

    folder.filters.exclude = ["[Gmail]/All Mail"]
    
    # or if you want to try with the INBOX only:
    folder.filters.include = ["INBOX"]
  • Did you try to increase the size of the clients pool? I used 8 during my tests.

    right.backend.clients-pool-size = 8

soywod added a commit that referenced this issue Dec 19, 2024
@desbma
Author

desbma commented Dec 19, 2024

Did you exclude [Gmail]/All Mail from the sync? It should definitely not be included in the sync, because it's not really a mailbox, more like an alias. Including it would double the size of the account. I just realized that it is not mentioned (anymore?!) in the README, I will fix that.

Thanks, I'm trying again with the filter, and will report back if that improves performance or not.

Did you try to increase the size of the clients pool? I used 8 during my tests.

My understanding from #9 (comment) was that the default was the number of CPUs (32 in my case), so I tried the default and lowering it to 4 in case the server was applying some kind of rate limit, but that did not have any perceptible effect on performance.

@soywod
Member

soywod commented Dec 19, 2024

My understanding from #9 (comment) was that the default was the number of CPUs (32 in my case), so I tried the default and lowering it to 4 in case the server was applying some kind of rate limit, but that did not have any perceptible effect on performance.

The algorithm has been improved since then; it no longer relies on the number of CPUs. This is now managed by the runtime. The option actually controls the number of clients in the pool, and it needs to be adjusted depending on the performance you observe. I tested with 8, and since we have a similar setup I believe it should work for you as well. Let me know.

@WhyNotHugo

WhyNotHugo commented Dec 19, 2024 via email

@desbma
Author

desbma commented Dec 19, 2024

It has currently been running for about 2 hours, and has created about 1.2 GB of email data. Folder progress shows between 4 and 16%.
So I'm not sure if it's faster than before, but I would still consider it abnormally slow.

@soywod
Member

soywod commented Dec 19, 2024

Indeed, something is wrong; it should not take that long. Did you try with only 1 folder, for example the INBOX? How long does an initial sync take with mbsync on the same account?

@desbma
Author

desbma commented Dec 19, 2024

Did you try with only 1 folder, for example the INBOX?

I just tried, and it takes about 5-10 seconds, but I only have 30 mails in it.

How long does an initial sync take with mbsync on the same account?

I did a backup from scratch a long time ago with mbsync, and then I ran a periodic script to only update the maildir. I remember the initial sync was long, but "more than an hour" long, not "several days and still running" long. The weekly update took 2 or 3 minutes to complete, whereas neverest spends a lot more than that in the "Listing envelopes" state before actually writing any emails.
I recently lost the full mbsync backup (including the config file), hence I am looking for alternatives before redoing the same setup.

@soywod
Member

soywod commented Dec 19, 2024

By any chance, do you have Gmail tags or aliases? If so, try to pick only the real mailboxes you are interested in, without them. Otherwise I don't see anything else that could prevent Neverest from going faster for you, especially since we have a similar setup and I cannot reproduce this slowness. What do your mailboxes look like? Do they contain heavy messages (huge attachments)? Or mostly text?

@desbma
Author

desbma commented Dec 19, 2024

By any chance, do you have Gmail tags or aliases?

Not sure what you mean by aliases. GMail has no concept of folders, only tags, and yes I have many of them, on one or two levels, but they don't overlap so they should map 1-1 with IMAP folders.
I have noticed that when I sync a single folder with few emails, or with large ones, it tends to be fast (network bandwidth at ~1 MB/s), but for a single folder with many thousands of small emails, it stays in the 0-100 KB/s range.
Also, the "all messages" IMAP folder is localized, and it seems to be named "[Gmail]/Tous les messages" for me. But that is not the cause of the slowness, because it can be reproduced with a single IMAP folder.

@desbma
Author

desbma commented Dec 19, 2024

Email size seems to be the most decisive factor: I just synced about 6k emails with PNG images at ~600 KB/s, whereas small emails sync at about 1/10 of that speed.
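
For comparison, the average rates implied by the two earlier runs reported in this thread can be worked out with a little arithmetic. This is only a rough sketch: it assumes decimal units (1 GB = 1e9 bytes), treats "about 5 GB in more than 12 hours" and "about 1.2 GB in about 2 hours" as exact figures, and the helper name `avg_kb_per_s` is made up for illustration.

```python
# Back-of-the-envelope averages from the figures quoted in this thread.
# Assumes decimal units (1 GB = 1e9 bytes, 1 KB = 1e3 bytes); the helper
# name avg_kb_per_s is hypothetical and only exists for this sketch.

def avg_kb_per_s(gigabytes: float, hours: float) -> float:
    """Average transfer rate in KB/s for `gigabytes` moved in `hours`."""
    total_kb = gigabytes * 1e9 / 1e3
    return total_kb / (hours * 3600)

first_run = avg_kb_per_s(5, 12)     # about 5 GB after more than 12 hours
second_run = avg_kb_per_s(1.2, 2)   # about 1.2 GB after about 2 hours

print(f"first run:  ~{first_run:.0f} KB/s")
print(f"second run: ~{second_run:.0f} KB/s")
```

Both averages land in the low hundreds of KB/s (roughly 116 KB/s and 167 KB/s), which sits between the 0-100 KB/s seen on folders of small emails and the ~600 KB/s seen on the folder with PNG images.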

@soywod
Member

soywod commented Dec 20, 2024

I do not understand what is going wrong. The initial sync is literally just about downloading messages, nothing fancy. I will investigate further as I get closer to v1.

@soywod soywod added the bug Something isn't working label Dec 20, 2024
@soywod soywod added this to Pimalaya Dec 20, 2024
@soywod soywod moved this to Todo in Pimalaya Dec 20, 2024
@desbma
Author

desbma commented Dec 20, 2024

For what it is worth, I tried again with a fresh mbsync config and I get 1-10 MB/s download speeds, much faster than neverest.

Projects
Status: Todo
Development

No branches or pull requests

3 participants