Finding the maximum performance #646

Merged
hatoo merged 26 commits into master from local-future on Jan 9, 2025
Conversation

hatoo (Owner) commented Jan 4, 2025

#617 (comment)

This applies only to runs with a fixed number of requests (-n).

❯ TOKIO_WORKER_THREADS=16 oha -n 4000000 -c 1000 --no-tui http://localhost:3000
Summary:
  Success rate: 100.00%
  Total:        6.2924 secs
  Slowest:      0.0923 secs
  Fastest:      0.0000 secs
  Average:      0.0016 secs
  Requests/sec: 635684.8490

  Total data:   49.59 MiB
  Size/request: 13 B
  Size/sec:     7.88 MiB
❯ TOKIO_WORKER_THREADS=16 cargo run --release -- -n 4000000 -c 1000 --no-tui http://localhost:3000
Summary:
  Success rate: 100.00%
  Total:        5.7571 secs
  Slowest:      0.1873 secs
  Fastest:      0.0000 secs
  Average:      0.0014 secs
  Requests/sec: 694790.7464

  Total data:   49.59 MiB
  Size/request: 13 B
  Size/sec:     8.61 MiB
❯ wrk -t 16 -c 1000  http://localhost:3000
Running 10s test @ http://localhost:3000
  16 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.23ms    1.19ms  21.26ms   90.28%
    Req/Sec    56.31k     9.45k  160.94k    78.17%
  9026938 requests in 10.05s, 1.09GB read
Requests/sec: 898204.31
Transfer/sec:    111.36MB

Note: setting TOKIO_WORKER_THREADS to the number of actual physical CPUs (no hyper-threading) helps performance. That is a separate issue.
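
For reference, a minimal sketch (not oha's code) of pinning the Tokio worker-thread count to the physical core count programmatically rather than via TOKIO_WORKER_THREADS; the num_cpus crate is an assumption here:

// Sketch only: fix the worker-thread count to physical cores (no hyper-threads).
// Assumes the `num_cpus` crate; oha itself honors the TOKIO_WORKER_THREADS env var.
use tokio::runtime::Builder;

fn main() {
    let physical = num_cpus::get_physical();

    let rt = Builder::new_multi_thread()
        .worker_threads(physical)
        .enable_all()
        .build()
        .expect("failed to build Tokio runtime");

    rt.block_on(async {
        // ... drive the load-test futures here ...
    });
}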

@hatoo hatoo force-pushed the local-future branch 2 times, most recently from eb082f7 to 2604c2e on January 5, 2025 09:36
@hatoo hatoo changed the title from "Utilize LocalRuntime" to "Optimizing by using thread local futures" on Jan 5, 2025
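
To illustrate the thread-local-futures approach named in the title, here is a minimal sketch using stable Tokio APIs (one current-thread runtime plus a LocalSet per OS thread); it only approximates the idea and is not the code in this PR. The num_cpus crate is again an assumption:

// Sketch: one single-threaded runtime per OS thread, so per-connection state
// can be !Send (Rc/Cell) and never crosses threads. Not oha's actual code.
use std::cell::Cell;
use std::rc::Rc;
use tokio::runtime::Builder;
use tokio::task::LocalSet;

fn main() {
    let handles: Vec<_> = (0..num_cpus::get_physical())
        .map(|_| {
            std::thread::spawn(|| {
                let rt = Builder::new_current_thread()
                    .enable_all()
                    .build()
                    .unwrap();
                let local = LocalSet::new();
                rt.block_on(local.run_until(async {
                    // !Send counter shared only within this thread.
                    let done = Rc::new(Cell::new(0u64));
                    let mut tasks = Vec::new();
                    for _ in 0..4 {
                        let done = Rc::clone(&done);
                        tasks.push(tokio::task::spawn_local(async move {
                            // ... issue requests on this thread's connections ...
                            done.set(done.get() + 1);
                        }));
                    }
                    for t in tasks {
                        t.await.unwrap();
                    }
                    done.get()
                }))
            })
        })
        .collect();

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    println!("completed tasks: {total}");
}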
hatoo (Owner, Author) commented Jan 6, 2025

Not sending to the channel on every request greatly improves performance (a minimal sketch of this batching idea follows the benchmark output below).

Note: --profile pgo doesn't mean PGO is actually used; it's just the Cargo profile intended for PGO builds (it enables LTO, etc.).

❯ cargo run --profile pgo -- -n 6000000 -c 1000 --no-tui http://localhost:3000
Summary:
  Success rate: 100.00%
  Total:        6.9893 secs
  Slowest:      0.2570 secs
  Fastest:      0.0000 secs
  Average:      0.0011 secs
  Requests/sec: 858459.4633

  Total data:   74.39 MiB
  Size/request: 13 B
  Size/sec:     10.64 MiB
❯ wrk -t16 -c 1000 http://localhost:3000
Running 10s test @ http://localhost:3000
  16 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.19ms    1.15ms  19.68ms   90.30%
    Req/Sec    58.90k     8.17k  177.76k    78.10%
  9435690 requests in 10.06s, 1.14GB read
Requests/sec: 938274.13
Transfer/sec:    116.33MB
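
As referenced above, a minimal sketch of the batching idea, assuming a worker that buffers results locally and flushes them in chunks; the names RequestResult, BATCH, and worker are hypothetical and not oha's API:

// Sketch of batching result sends: one channel send per BATCH completed
// requests instead of one per request. Names are hypothetical, not oha's API.
use std::time::Duration;
use tokio::sync::mpsc;

struct RequestResult {
    status: u16,
    duration: Duration,
}

const BATCH: usize = 256;

async fn worker(tx: mpsc::UnboundedSender<Vec<RequestResult>>, n: usize) {
    let mut buf = Vec::with_capacity(BATCH);
    for _ in 0..n {
        // ... perform one request; a fake result stands in here ...
        buf.push(RequestResult { status: 200, duration: Duration::from_micros(100) });
        if buf.len() == BATCH {
            // Flush a whole batch with a single send.
            tx.send(std::mem::replace(&mut buf, Vec::with_capacity(BATCH))).unwrap();
        }
    }
    if !buf.is_empty() {
        tx.send(buf).unwrap();
    }
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel();
    let job = tokio::spawn(worker(tx, 1_000));
    let mut total = 0usize;
    // The receiver sees far fewer messages than requests.
    while let Some(batch) = rx.recv().await {
        total += batch.len();
    }
    job.await.unwrap();
    assert_eq!(total, 1_000);
}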

@hatoo hatoo changed the title from "Optimizing by using thread local futures" to "Finding the maximum performance" on Jan 6, 2025
@hatoo hatoo marked this pull request as ready for review January 9, 2025 07:39
@hatoo hatoo merged commit 55849fe into master Jan 9, 2025
11 checks passed
@hatoo hatoo mentioned this pull request Jan 11, 2025
@hatoo hatoo deleted the local-future branch January 16, 2025 10:57