Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nextcloud on appserver.io #14131

Closed
tomasz-grobelny opened this issue Feb 9, 2019 · 26 comments
Closed

Nextcloud on appserver.io #14131

tomasz-grobelny opened this issue Feb 9, 2019 · 26 comments
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap enhancement technical debt

Comments

@tomasz-grobelny
Copy link
Contributor

tomasz-grobelny commented Feb 9, 2019

Is your feature request related to a problem? Please describe.
Nextcloud is slow. Consider for example uploading a large number of small files. Each WebDAV operation, no matter how simple, takes at least 1 second and up to 40-50 seconds even for relatively small files (<1MB) on reasonable good link (100Mbit/s).

Describe the solution you'd like
This is because PHP has to load all nextcloud apps, files, etc. on every request doing the same work all over again on each HTTP request. This is highly inefficient. Applications should be initialized once into memory and then only used on each request. This should be possible using appserver.io and servlets.

Describe alternatives you've considered
Another solution might be based on ReactPHP - idea is the same: do not load/execute the same things over and over again on each request.

Additional context
Questions are the following:

  1. Was it ever considered to use appserver.io (or anything of this kind if exists) for Nextcloud?
  2. What are your thoughts on the idea? Would requiring appserver.io be acceptable? Could it be optional (still work with classic servers, but also support appserver.io)? Would it be possible to have only some part working with appserver.io (eg. WebDAV) while others still work as now?
  3. What would be the best area to start? Two ideas come to my mind: preview generation and WebDAV.
@tomasz-grobelny tomasz-grobelny added enhancement 0. Needs triage Pending check for reproducibility or if it fits our roadmap labels Feb 9, 2019
@tomasz-grobelny
Copy link
Contributor Author

Anyone any thoughts?

@nickvergessen
Copy link
Member

  • Was it ever considered to use appserver.io (or anything of this kind if exists) for Nextcloud?

No

  • What are your thoughts on the idea? Would requiring appserver.io be acceptable? Could it be optional (still work with classic servers, but also support appserver.io)? Would it be possible to have only some part working with appserver.io (eg. WebDAV) while others still work as now?

Highly unlikely. Also we try to make Nextcloud work in as many enviroments as possible. So requiring people to install something else is not possible. Also all the apps would require a major rewrite, to be able to handle the new situation. So I don't think it is an option really.

  • What would be the best area to start? Two ideas come to my mind: preview generation and WebDAV.

Instead we should look at the apps and see what they are loading and why and help them prevent that.

Other opinions? @rullzer @MorrisJobke

@nickvergessen nickvergessen added technical debt and removed 0. Needs triage Pending check for reproducibility or if it fits our roadmap labels Feb 22, 2019
@tomasz-grobelny
Copy link
Contributor Author

  • What are your thoughts on the idea? Would requiring appserver.io be acceptable? Could it be optional (still work with classic servers, but also support appserver.io)? Would it be possible to have only some part working with appserver.io (eg. WebDAV) while others still work as now?

Highly unlikely. Also we try to make Nextcloud work in as many enviroments as possible. So requiring people to install something else is not possible. Also all the apps would require a major rewrite, to be able to handle the new situation. So I don't think it is an option really.

I did some preliminary tests with ReactPHP and I am almost able to get it going for generating previews without any major rewrite. I still have some session related issues (cannot use cookie methods, cannot use $_SESSION), issues with getting response to the client (it is generated on server side though). I should be able to deliver already generated preview in under 100ms, whereas it takes around 2 seconds with normal mechanism (over 10x speed improvement). The only additional requirement is ReactPHP installed through composer: no appserver, no changes to web server config.

Would that be acceptable if I get fully working?

  • What would be the best area to start? Two ideas come to my mind: preview generation and WebDAV.

Instead we should look at the apps and see what they are loading and why and help them prevent that.

This will not help. Just including base.php needs several hundred ms, so I don't see how getting down to 100ms would be possible.

@MorrisJobke
Copy link
Member

This will not help. Just including base.php needs several hundred ms, so I don't see how getting down to 100ms would be possible.

o.O This is not true in many environments. We know that it is big, but on typical systems all requests are served within 50 to 250 ms. And the preview part is not that much limited by base.php but by the actual image manipulation and there we can't do that much or do we?

@tomasz-grobelny
Copy link
Contributor Author

This will not help. Just including base.php needs several hundred ms, so I don't see how getting down to 100ms would be possible.

o.O This is not true in many environments.

Sure, but in some it is. I am a little bit selfish here focusing on my env, but I guess I am not alone with nextcloud performance issues on low-power consumption hardware.

We know that it is big, but on typical systems all requests are served within 50 to 250 ms. And the preview part is not that much limited by base.php

We have two parts here: generation and serving. For the generation part we can use pregeneration, but that still does not solve the problem if serving already existing thumbnails for a typical view takes tens of seconds. Anyway, the same goes for WebDAV requests: each one takes at least 2 seconds on my hardware and most time is consumed by some basic setup.

Also please note that I would not like to focus that much on absolute numbers as these depend on hardware used as on the relative speedup. Which is IMO huge.

but by the actual image manipulation and there we can't do that much or do we?

We can certainly do better. But probably not in PHP. Please have a look here: https://blog.uploadcare.com/the-fastest-production-ready-image-resize-out-there-part-0-7c974d520ad9. That's why I previously (nextcloud/gallery#490) proposed to have some REST service for image manipulation.

@MorrisJobke
Copy link
Member

We have two parts here: generation and serving. For the generation part we can use pregeneration, but that still does not solve the problem if serving already existing thumbnails for a typical view takes tens of seconds. Anyway, the same goes for WebDAV requests: each one takes at least 2 seconds on my hardware and most time is consumed by some basic setup.

Okay - then some profiling would be really nice. Either xdebug or https://blackfire.io should give some valuable insights why it takes so long.

Also please note that I would not like to focus that much on absolute numbers as these depend on hardware used as on the relative speedup. Which is IMO huge.

Indeed - thus profiler often give the option to give relative time some methods calls take - the big ones are the ones we should focus on.

We can certainly do better. But probably not in PHP. Please have a look here: https://blog.uploadcare.com/the-fastest-production-ready-image-resize-out-there-part-0-7c974d520ad9. That's why I previously (nextcloud/gallery#490) proposed to have some REST service for image manipulation.

I remind some ideas also here in the server for previews of anything to be outsourced to another service (optinionally of course).

@rullzer
Copy link
Member

rullzer commented Feb 26, 2019

@tomasz-grobelny while I'm always open for improvements please do keep in mind the following two things

  1. It should still all run out of the box as is
  2. The default preview generation is so slow because we us gd. We do not use imagmagik by default for security reasons
  3. Note that also in our main setup there are a lot of security measurements. For example with preview serving we need to authenticate the current user, check if they even have access to the previews etc tec.

Also. I'm still curious why serving the preview takes upwards of 2 seconds for you. I tried this recently on a raspberry pi and serving of the preview was a lot faster than that (under a second).

@tomasz-grobelny
Copy link
Contributor Author

We have two parts here: generation and serving. For the generation part we can use pregeneration, but that still does not solve the problem if serving already existing thumbnails for a typical view takes tens of seconds. Anyway, the same goes for WebDAV requests: each one takes at least 2 seconds on my hardware and most time is consumed by some basic setup.

Okay - then some profiling would be really nice. Either xdebug or https://blackfire.io should give some valuable insights why it takes so long.

Once I did some profiling using xdebug. My conclusion back then was that we don't do anything that takes majority of time - it is just gazillion of tiny calls that create objects through DI, setup some event handlers and the like.

Also please note that I would not like to focus that much on absolute numbers as these depend on hardware used as on the relative speedup. Which is IMO huge.

Indeed - thus profiler often give the option to give relative time some methods calls take - the big ones are the ones we should focus on.

I tried that approach - unfortunately when I got into some huge-looking functiona it turned out to be a lot of tiny calls and each of them not less important than others.

We can certainly do better. But probably not in PHP. Please have a look here: https://blog.uploadcare.com/the-fastest-production-ready-image-resize-out-there-part-0-7c974d520ad9. That's why I previously (nextcloud/gallery#490) proposed to have some REST service for image manipulation.

I remind some ideas also here in the server for previews of anything to be outsourced to another service (optinionally of course).

Could you provide some more info on that? Do you mean something already existing and external, or something written specifically for Nextcloud?

@tomasz-grobelny
Copy link
Contributor Author

@tomasz-grobelny while I'm always open for improvements please do keep in mind the following two things

1. It should still all run out of the box as is

Should be doable.

2. The default preview generation is so slow because we us gd. We do not use imagmagik by default for security reasons

If gd is even slower than IM then we have even more room for improvement :-)

3. Note that also in our main setup there are a lot of security measurements. For example with preview serving we need to authenticate the current user, check if they even have access to the previews etc tec.

Yes, agreed. I am working on that.

Also. I'm still curious why serving the preview takes upwards of 2 seconds for you. I tried this recently on a raspberry pi and serving of the preview was a lot faster than that (under a second).

One second is still a lot for what is essentially authenticating and serving a static file of several KB. Unfortunately I cannot compare performance of my hardware (pcengines apu2) vs Raspberry Pi.

@tomasz-grobelny
Copy link
Contributor Author

I had a closer look at the middleware/controller/dependency injection code and I would like to implement the following changes:

  1. Neither controller nor middleware classes should take session/request/response/userId objects/values as constructor dependencies.
  2. Middlewares should take HttpContext as parameter of individual method calls (beforeController/etc.).
  3. Controllers should have HttpContext available as property.
  4. HttpContext should give Controller/Middleware access to request/response/session objects.

All in all these changes should not affect how Nextcloud works now (in terms of code complexity or performance), but they should make it possible to later convert to request/response model where DI object graph is built once and reused on subsequent requests.

Please note that similar model of accessing request is used in other frameworks (eg. ASP.NET Core). The ultimate motivation for those changes is performance: I have a PoC that serves previews over 10 times faster (on my hardware, for very small previews) that uses ReactPHP.

@MorrisJobke, @rullzer, @nickvergessen - would you be ok with this approach?

@rullzer
Copy link
Member

rullzer commented Mar 5, 2019

@tomasz-grobelny we can't just change the public interface. This will break all apps.

Can you explain to me what this solves. We already build 1 big DI container. I do not see how not passing the requests to controllers makes the code 10 times faster. And for example the request object have to be instantiated anyways because they are required to do the actual routing.

@tomasz-grobelny
Copy link
Contributor Author

tomasz-grobelny commented Mar 5, 2019

@tomasz-grobelny we can't just change the public interface. This will break all apps.

Can you explain to me what this solves. We already build 1 big DI container. I do not see how not passing the requests to controllers makes the code 10 times faster. And for example the request object have to be instantiated anyways because they are required to do the actual routing.

As I understand this would break public interface of middlewares only - do we have any apps that use this public interface given that middleware registration is hardcoded in DIContainer.php?

Please note that existing controllers in external apps should still work if they specify IRequest as their dependency. They can be migrated, but they are not forced to do so at this step.

As for performance: at this step there is no immediate performance benefit (as mentioned in my previous comment). This is just a step towards being able to have an application server that can service http requests with much smaller footprint. Currently for every single request we register all classes, build whole object graph and only then execute the actual code. We have to do it because the object graph depends on the IRequest (and eg. userId). So currently for every request we do not only instantiate the Request object, we also register and instantiate a whole zoo of other objects.

The ultimate goal is to make the object graph independent of the Request object and therefore reusable across requests. This means that class registration and object graph construction does not need to be done for every request and so every request can be served faster. For trivial requests (like serving already existing preview) the setup time is significant compared to the actual work that needs to be done.

UPDATE: I see that new middlewares can indeed be registered. Maybe we could then provide some compatibility layer to be removed one version later?

@rullzer
Copy link
Member

rullzer commented Mar 5, 2019

We have a deprecation period of 3 years for public interfaces.

I still don't get what you are trying to do. Each request (especially for the previews) needs the request obect, The user id, the session etc. Because we need to do authentication etc.

But having said al that feel free to shoot in a proof of conept because I feel I'm missing something and maybe having code to look at helps.

@tomasz-grobelny
Copy link
Contributor Author

I'll have a look again how much the Middleware interface needs to be changed.

For every request we need to instantiate Request, Response, Session, userId and really not much more (maybe some filesystem view). Currently we instantiate huge amounts of objects, register a lot of events which is simply not needed on a per-request basis.

I'll post here when I have a demoable PoC.

@tomasz-grobelny
Copy link
Contributor Author

tomasz-grobelny commented Mar 6, 2019

Ok, so here it goes:
tomasz-grobelny@962ad56

It contains a lot of hacks (eg. getting access to session), but the main points are:

  1. To test it you need to run "php server.php" from console. Each request will print a line.
  2. You can observe that previews for image files are created using server.php (through proxy.php). For me this is much faster than without the changes.
  3. The Middleware public abstract class contains only minor changes which should not affect compatibility.
  4. Currently only SameSiteCookieMiddleware has been converted not to cache Request object. As you can see the conversion is relatively simple (s/$this->request/$this->context->request/).
  5. The current version is single threaded, but could be extended for multiple worker processes using PHP-PM.

Does this clarify things?

@nickvergessen
Copy link
Member

The problem is that at least this line:
tomasz-grobelny@962ad56#diff-954b90756615239eaac02e54722e1f1aL102

breaks 95% of the existing apps

@tomasz-grobelny
Copy link
Contributor Author

tomasz-grobelny commented Mar 7, 2019

Why would that be given the same line is added here:
tomasz-grobelny@962ad56#diff-0a830b0354b26e84f08f705b5356a05bR190
?
The only point of this change is that we can re-register userId in server.php for every request.

As you can see the PreviewController has not been changed at all - and it depends on 'userId'. This still works because the controller is instantiated on every request - currently this is hard-coded but we could write any logic to determine whether given controller should be instantiated or not (from 'always instantiate' policy, through 'detect not-compatible constructor' policy to 'never instantiate' policy).

Anyway, this should not affect current request per process execution mode.

@tomasz-grobelny
Copy link
Contributor Author

@nickvergessen , do you think this approach would introduce incompatibility? Or maybe you have any idea which checks I should perform to make sure things don't break.

@tomasz-grobelny
Copy link
Contributor Author

@rullzer , @MorrisJobke , @nickvergessen - could you please provide any input on the approach proposed? Would you be ok with the changes in DI container registration (as discussed above) and Middleware abstract class (of course provided I remove the obvious hacks like hard-coding controller name) so that the actual code for running Nextclud on ReactPHP can be added externally?

@tomasz-grobelny
Copy link
Contributor Author

@rullzer , @nickvergessen , @MorrisJobke, @blizzz, @schiessle - please provide feedback in the referenced pull request. Please also refer to nextcloud/gallery#458 - this is just one more voice why performance should be considered with more priority than it currently is.

@stale

This comment has been minimized.

@stale stale bot added the stale Ticket or PR with no recent activity label Jun 5, 2019
@tomasz-grobelny

This comment has been minimized.

@stale stale bot removed the stale Ticket or PR with no recent activity label Jun 5, 2019
@kesselb

This comment has been minimized.

@skjnldsv skjnldsv added the 0. Needs triage Pending check for reproducibility or if it fits our roadmap label Jun 12, 2019
@dgtlmoon
Copy link

dgtlmoon commented Nov 17, 2019

Most of that delay is actually the bootstrap sequence as you correctly say, I dug around with a profiler and I can see that surprisingly the way it handles the scaling of images (I was feeding it 8Mb images) was very fast

That delay in the bootstrap sequence is even in play when you request an image that already had the preview built (that is to say, when you request a preview via core/preview/ URL) you will also encounter the slowness

more here #13709 (comment)

I would add that part of the tests for this project should include a guarantee that fetching existing preview images should not take more than 20ms (as is easily achievable in drupal/wordpress/etc)

@szaimen
Copy link
Contributor

szaimen commented May 27, 2021

There were a lot of speed improvements in the last releases and I would argue that with correct configuration and good hardware, Nextcloud can already be pretty fast.
So I am going to close this since it doesn't seem to be necessary.

@szaimen szaimen closed this as completed May 27, 2021
@pulsejet
Copy link
Member

Hi all. I apologise for commenting on an ancient thread, but I just need some clarification from the devs (more context, waiting on some reply):

Highly unlikely. Also we try to make Nextcloud work in as many enviroments as possible. So requiring people to install something else is not possible. Also all the apps would require a major rewrite, to be able to handle the new situation. So I don't think it is an option really.

Is this still true / a hard requirement?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap enhancement technical debt
Projects
None yet
Development

No branches or pull requests

9 participants