general performance issue #11

Open
convolvatron opened this issue Dec 8, 2017 · 6 comments

Comments

@convolvatron
Collaborator

optimizations we are considering:

  1. pipeline segmented transfers
  2. connection bonding
  3. iovecs to reduce copying
  4. caching filehandles

note that (2) might obviate (1). (1) may in fact be necessary for correctness if the server evaluates out of order
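To make item (3) concrete, here is a minimal sketch of a gather write, assuming a POSIX system. The function name `send_record` and the header/payload split are hypothetical and only for illustration; this is not the project's code. The idea is that the header and payload buffers are handed to the kernel in one call instead of being copied into a single contiguous buffer first.

```python
import os

def send_record(fd, header: bytes, payload: bytes) -> int:
    """Write header and payload with one gather-write call.

    The two buffers are passed as separate iovec entries, so no
    intermediate header + payload copy is built in user space.
    Returns the number of bytes written, which may be less than the
    total on a short write; a real client would loop on the remainder.
    """
    return os.writev(fd, [header, payload])

# Demo over a pipe, standing in for the client's socket.
read_end, write_end = os.pipe()
written = send_record(write_end, b"HDR:", b"page-bytes")
os.close(write_end)
received = os.read(read_end, 64)
os.close(read_end)
```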

@convolvatron
Collaborator Author

convolvatron commented Dec 8, 2017

  1. inserts seem to involve a lot of small writes, probably btree pointers, lsns, etc. it would probably help to batch these between sync(). that may also lower the chances of corrupting the database on fault (quasi-atomic updates)

to address another comment in the source, this probably means

  • defining a sync mode
  • changing file write/create to take a default mode
  • allowing a per-file write to choose a different mode
  • exposing a flush
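One possible shape for the four points above, as a rough sketch rather than the project's actual API. Every name here (`SyncMode`, `BatchedFile`, `backend_writes`) is hypothetical, and a local file stands in for the remote one. Small writes are queued, contiguous ones are coalesced, and nothing reaches the backend until `flush()` or an immediate-mode write.

```python
import enum
import os
import tempfile

class SyncMode(enum.Enum):
    IMMEDIATE = "immediate"  # push this write (and anything queued) out now
    DEFERRED = "deferred"    # queue the write until flush()

class BatchedFile:
    def __init__(self, path, default_mode=SyncMode.DEFERRED):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        self._default_mode = default_mode
        self._pending = []        # (offset, data) in arrival order
        self.backend_writes = 0   # counts calls that reach the backend

    def write(self, offset, data, mode=None):
        if mode is None:
            mode = self._default_mode
        self._pending.append((offset, bytes(data)))
        if mode is SyncMode.IMMEDIATE:
            self.flush()

    def flush(self):
        # Coalesce writes that are contiguous in arrival order, so that
        # later writes still land after earlier ones.
        merged = []
        for offset, data in self._pending:
            if merged and merged[-1][0] + len(merged[-1][1]) == offset:
                prev_offset, prev_data = merged[-1]
                merged[-1] = (prev_offset, prev_data + data)
            else:
                merged.append((offset, data))
        for offset, data in merged:
            os.pwrite(self._fd, data, offset)
            self.backend_writes += 1
        self._pending.clear()
        os.fsync(self._fd)

    def close(self):
        self.flush()
        os.close(self._fd)

# Three small contiguous writes become a single backend write.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
demo = BatchedFile(demo_path)
for i, chunk in enumerate([b"aa", b"bb", b"cc"]):
    demo.write(2 * i, chunk)
demo.flush()
demo.close()
```

With a remote backend each `os.pwrite` here would be one round trip, so the saving is in the number of requests, not the bytes moved.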

@convolvatron
Collaborator Author

convolvatron commented Dec 8, 2017

  1. up the transfer chunking size

@jssmith
Owner

jssmith commented Dec 8, 2017

Ok, good learnings + thoughts. Comments:

  1. inserts seem to involve a lot of small writes...

Just to confirm that I understand this: you expect we can get efficiencies from batching at the NFS protocol layer. I believe we already have the Nagle algorithm in the TCP layer (though now that I think about it I'm not sure whether we want that - maybe we just do all of the batching ourselves).

  1. up the transfer chunking size

Ah, yes, we should probably experiment with this. We definitely need to renumber these because this seems like low-hanging fruit.

I'll want to make sure that we have all of these changes flagged so that we can measure and compare performance under different approaches.

@convolvatron
Collaborator Author

right. just like nagle. nagle though is probably going to be a little blind. in particular it flushes with a timer and not with an explicit sync. and i think(?) the size threshold is lower than the tcp mss. sure wrt flags. dynamic api is preferable to compile time.
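A dynamic (runtime) flag set for comparing approaches could look roughly like the sketch below. The class `ClientOptions`, its field names, and the default values are all hypothetical and not taken from the project.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ClientOptions:
    # Each optimization is a runtime switch, so one build can be
    # benchmarked under several configurations.
    tcp_nodelay: bool = False
    batch_writes: bool = False
    chunk_size: int = 8192  # bytes per transfer; placeholder value

baseline = ClientOptions()
# Derive a variant without mutating the baseline.
variant = replace(baseline, tcp_nodelay=True, chunk_size=65536)
```

Keeping the options immutable makes it harder for a benchmark run to change its own configuration halfway through.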

@convolvatron
Collaborator Author

oh right, nagle in this case won't help and in fact might be hurting because we block waiting for the remote response on each write

  1. turn on TCP_NODELAY
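The standard socket option for disabling Nagle is `TCP_NODELAY`, set at the `IPPROTO_TCP` level. A minimal sketch, with `make_nodelay_socket` as a hypothetical helper name:

```python
import socket

def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Since the client blocks on each response anyway, this only makes sense if each request is handed to the socket in one write; otherwise every small piece goes out as its own packet.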

@convolvatron
Collaborator Author

note that pipelining segmented transfers might not get us anything, since i'm pretty sure sqlite will max out at a page size. if that's <= the segment size then it kind of doesn't matter (for this application)
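The arithmetic behind this is just a ceiling division: pipelining only helps when one write spans more than one segment. In the sketch below, 4096 is the SQLite default page size; the 8192-byte segment size is an assumed placeholder, not the client's actual value.

```python
def segments_per_write(write_size: int, segment_size: int) -> int:
    """Number of segmented transfers needed for one write (ceiling division)."""
    if write_size <= 0 or segment_size <= 0:
        raise ValueError("sizes must be positive")
    return -(-write_size // segment_size)

SQLITE_DEFAULT_PAGE = 4096   # SQLite default page size in bytes
ASSUMED_SEGMENT = 8192       # hypothetical transfer segment size

one_page = segments_per_write(SQLITE_DEFAULT_PAGE, ASSUMED_SEGMENT)
large_write = segments_per_write(65536, ASSUMED_SEGMENT)
```

Under these assumptions a page-sized write fits in one segment, so there is nothing to pipeline; a 64 KiB write would take eight.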

2 participants