stream stringify #709

Closed
wants to merge 11 commits into from

cesco69
Contributor

@cesco69 cesco69 commented Apr 12, 2024

I don't know whether this idea has been discussed before, or whether it would actually improve performance.
The idea is to add a variant of stringify that writes its output to a stream while it serializes, so that the server response can consume the stream during the stringify process.
E.g.:

const fastJson = require('fast-json-stringify')
const Stream = require('stream')

const schema = {
  title: 'Example Schema',
  type: 'object',
  properties: {
    foo: {
      type: 'string'
    }
  }
}
// enableStream is the option proposed in this PR
const stringify = fastJson(schema, { enableStream: true })

// app is assumed to be an Express-style application
app.get('/', function (req, res) {
  const data = { foo: 'bar' }
  res.type('json')
  const s = new Stream.PassThrough()
  s.pipe(res) // pipe the stream into the server response
  stringify(data, s)
})

Feeding the stream into the web server (Fastify) response while stringify is still running might let the two steps overlap in time, which may bring some advantage.

The benchmark only shows that no slowdowns have been introduced in the process. There is no benchmark for the streaming path itself!

> [email protected] bench
> node ./benchmark/bench.js

short string............................................. x 23,264,383 ops/sec ±0.54% (191 runs sampled)
unsafe short string...................................... x 1,047,238,211 ops/sec ±0.38% (191 runs sampled)
short string with double quote........................... x 13,510,414 ops/sec ±0.43% (191 runs sampled)
long string without double quotes........................ x 15,287 ops/sec ±0.35% (191 runs sampled)
unsafe long string without double quotes................. x 1,011,215,935 ops/sec ±0.31% (191 runs sampled)
long string.............................................. x 15,822 ops/sec ±0.23% (192 runs sampled)
unsafe long string....................................... x 1,028,558,169 ops/sec ±0.32% (192 runs sampled)
number................................................... x 1,035,869,184 ops/sec ±0.37% (191 runs sampled)
integer.................................................. x 215,910,712 ops/sec ±0.32% (193 runs sampled)
formatted date-time...................................... x 1,673,295 ops/sec ±0.26% (192 runs sampled)
formatted date........................................... x 1,105,447 ops/sec ±0.56% (189 runs sampled)
formatted time........................................... x 1,114,771 ops/sec ±0.31% (191 runs sampled)
short array of numbers................................... x 75,956 ops/sec ±0.42% (190 runs sampled)
short array of integers.................................. x 64,778 ops/sec ±0.66% (190 runs sampled)
short array of short strings............................. x 20,069 ops/sec ±0.52% (189 runs sampled)
short array of long strings.............................. x 18,395 ops/sec ±1.84% (183 runs sampled)
short array of objects with properties of different types x 9,508 ops/sec ±0.45% (191 runs sampled)
object with number property.............................. x 1,052,341,337 ops/sec ±0.50% (189 runs sampled)
object with integer property............................. x 222,885,201 ops/sec ±0.38% (192 runs sampled)
object with short string property........................ x 23,348,971 ops/sec ±0.63% (190 runs sampled)
object with long string property......................... x 16,086 ops/sec ±0.49% (191 runs sampled)
object with properties of different types................ x 1,904,268 ops/sec ±0.92% (183 runs sampled)
simple object............................................ x 10,046,660 ops/sec ±0.60% (190 runs sampled)
simple object with required fields....................... x 9,920,616 ops/sec ±0.43% (188 runs sampled)
object with const string property........................ x 1,064,703,008 ops/sec ±0.46% (191 runs sampled)
object with const number property........................ x 1,061,670,005 ops/sec ±0.46% (190 runs sampled)
object with const bool property.......................... x 1,052,898,941 ops/sec ±0.45% (190 runs sampled)
object with const object property........................ x 1,054,126,269 ops/sec ±0.48% (188 runs sampled)
object with const null property.......................... x 1,044,532,223 ops/sec ±0.40% (190 runs sampled)

Checking out "poc-stream"
Execute "npm run bench"

> [email protected] bench
> node ./benchmark/bench.js

short string............................................. x 22,786,051 ops/sec ±0.59% (189 runs sampled)
unsafe short string...................................... x 218,964,319 ops/sec ±0.41% (190 runs sampled)
short string with double quote........................... x 13,495,670 ops/sec ±0.59% (191 runs sampled)
long string without double quotes........................ x 15,525 ops/sec ±0.45% (189 runs sampled)
unsafe long string without double quotes................. x 219,528,955 ops/sec ±0.41% (189 runs sampled)
long string.............................................. x 15,479 ops/sec ±0.30% (190 runs sampled)
unsafe long string....................................... x 211,905,094 ops/sec ±0.22% (193 runs sampled)
number................................................... x 211,208,490 ops/sec ±0.16% (192 runs sampled)
integer.................................................. x 210,452,179 ops/sec ±0.13% (194 runs sampled)
formatted date-time...................................... x 1,539,365 ops/sec ±0.19% (194 runs sampled)
formatted date........................................... x 1,045,581 ops/sec ±0.21% (194 runs sampled)
formatted time........................................... x 1,035,530 ops/sec ±0.27% (191 runs sampled)
short array of numbers................................... x 70,876 ops/sec ±0.33% (188 runs sampled)
short array of integers.................................. x 63,599 ops/sec ±0.32% (192 runs sampled)
short array of short strings............................. x 19,384 ops/sec ±0.52% (191 runs sampled)
short array of long strings.............................. x 19,502 ops/sec ±0.27% (192 runs sampled)
short array of objects with properties of different types x 9,419 ops/sec ±0.24% (192 runs sampled)
object with number property.............................. x 155,215,517 ops/sec ±0.29% (193 runs sampled)
object with integer property............................. x 213,214,858 ops/sec ±0.19% (192 runs sampled)
object with short string property........................ x 21,588,018 ops/sec ±0.58% (189 runs sampled)
object with long string property......................... x 15,445 ops/sec ±0.21% (194 runs sampled)
object with properties of different types................ x 1,826,196 ops/sec ±0.63% (187 runs sampled)
simple object............................................ x 9,316,552 ops/sec ±0.28% (193 runs sampled)
simple object with required fields....................... x 9,260,906 ops/sec ±0.44% (190 runs sampled)
object with const string property........................ x 210,324,452 ops/sec ±0.12% (194 runs sampled)
object with const number property........................ x 209,395,498 ops/sec ±0.17% (194 runs sampled)
object with const bool property.......................... x 210,243,675 ops/sec ±0.14% (194 runs sampled)
object with const object property........................ x 209,518,132 ops/sec ±0.16% (194 runs sampled)
object with const null property.......................... x 210,374,093 ops/sec ±0.11% (194 runs sampled)

short string...............................................+2.1%
unsafe short string......................................+378.27%
short string with double quote............................+0.11%
long string without double quotes.........................-1.53%
unsafe long string without double quotes.................+360.63%
long string...............................................+2.22%
unsafe long string.......................................+385.39%
number...................................................+390.45%
integer...................................................+2.59%
formatted date-time........................................+8.7%
formatted date............................................+5.73%
formatted time............................................+7.65%
short array of numbers....................................+7.17%
short array of integers...................................+1.85%
short array of short strings..............................+3.53%
short array of long strings...............................-5.68%
short array of objects with properties of different types.+0.94%
object with number property..............................+577.99%
object with integer property..............................+4.54%
object with short string property.........................+8.16%
object with long string property..........................+4.15%
object with properties of different types.................+4.28%
simple object.............................................+7.84%
simple object with required fields........................+7.12%
object with const string property........................+406.22%
object with const number property........................+407.02%
object with const bool property..........................+400.8%
object with const object property........................+403.12%
object with const null property..........................+396.51%
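
For readers who want to check the comparison figures: they appear consistent with the relative difference of the first run over the second ("poc-stream") run, that is (first - second) / second. The sketch below reproduces three of the rows under that assumption. The helper name `relativeDiff` is made up for illustration; the ops/sec values are copied from the two runs above.

```javascript
// Relative difference of the first run over the second run, in percent.
// Assumption: this is how the comparison list above was computed.
function relativeDiff (firstOpsPerSec, secondOpsPerSec) {
  return ((firstOpsPerSec - secondOpsPerSec) / secondOpsPerSec) * 100
}

// [benchmark name, ops/sec in the first run, ops/sec in the "poc-stream" run]
const rows = [
  ['short string', 23264383, 22786051],
  ['unsafe short string', 1047238211, 218964319],
  ['long string without double quotes', 15287, 15525]
]

for (const [name, first, second] of rows) {
  const diff = relativeDiff(first, second)
  const sign = diff >= 0 ? '+' : ''
  console.log(`${name}: ${sign}${diff.toFixed(2)}%`)
}
```

Under this reading, a positive value means the first run was faster than the "poc-stream" run for that benchmark.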

Signed-off-by: francesco <[email protected]>
@cesco69 cesco69 marked this pull request as draft April 12, 2024 12:55
cesco69 added 4 commits April 12, 2024 15:13
Signed-off-by: francesco <[email protected]>
Signed-off-by: francesco <[email protected]>
Signed-off-by: francesco <[email protected]>
Signed-off-by: francesco <[email protected]>
@mcollina
Member

I don't think this would yield any significant performance improvement.

cesco69 added 2 commits April 12, 2024 16:56
Signed-off-by: francesco <[email protected]>
@cesco69 cesco69 marked this pull request as ready for review April 12, 2024 15:04
@cesco69 cesco69 changed the title stream stringify (DRAFT) stream stringify Apr 12, 2024
@cesco69
Contributor Author

cesco69 commented Apr 12, 2024

I don't think this would yield any significant performance improvement.

@mcollina this will be complicated to test

@cesco69 cesco69 marked this pull request as draft April 12, 2024 15:23
Signed-off-by: francesco <[email protected]>
@ivan-tymoshenko
Member

I think this might give you the perf boost only if you have a really large array in the response. So I would try to make a chunk at least per object and not per object key/value. Making chunks small will give you a big overhead.
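
A minimal sketch of the "one chunk per object" granularity described above, in plain Node.js. This is not the PR's implementation: `arrayChunks` is a hypothetical helper, and `JSON.stringify` stands in for a compiled per-item serializer.

```javascript
const { Readable } = require('node:stream')

// Yields the JSON text of an array in pieces: one chunk per element,
// never one chunk per key/value pair.
function * arrayChunks (items, stringifyItem = JSON.stringify) {
  yield '['
  for (let i = 0; i < items.length; i++) {
    // Prefix every element except the first with a comma.
    yield (i === 0 ? '' : ',') + stringifyItem(items[i])
  }
  yield ']'
}

const items = [{ foo: 'bar' }, { foo: 'baz' }, { foo: 'qux' }]

// Readable.from pulls from the generator on demand, so a slow consumer
// (for example an HTTP response) holds back serialization of later elements.
Readable.from(arrayChunks(items)).pipe(process.stdout)
```

Because the generator is pulled lazily, elements are only serialized when the consumer asks for more, which is where a very large array could benefit. For small payloads the per-chunk overhead is likely to dominate.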

@ivan-tymoshenko
Member

Simple math: if the time required to send one TCP packet from server to client is greater than the time needed to serialize the whole response, then it doesn't make a lot of sense. If I understand the idea correctly.

cesco69 added 3 commits April 15, 2024 12:04
Signed-off-by: francesco <[email protected]>
Signed-off-by: francesco <[email protected]>
Signed-off-by: francesco <[email protected]>
@cesco69 cesco69 marked this pull request as ready for review April 15, 2024 10:42
Member

@mcollina mcollina left a comment

This will be slower than before, because it generates all the data synchronously before beginning to write it. Note that you'll still generate all the content synchronously and queue it in the stream. Let's say this queuing costs 1 ns per chunk. If you enqueue 1 chunk, it takes 1 ns. If you enqueue 1000 chunks, it costs 1 µs. Given that you already have 100% of the data to be sent, it's not worth the complexity.

Streams are great if we could start sending some of the data while waiting for some other I/O to happen. This reduces loading time etc. But not in this case.

@cesco69
Contributor Author

cesco69 commented Apr 15, 2024

@mcollina

Streams are great if we could start sending some of the data while waiting for some other I/O to happen. This reduces loading time etc. But not in this case.

Why not?

 const s = new Stream.PassThrough()
 s.pipe(res) // pipe the stream into the server response
 stringify(data, s)

stringify writes each chunk of JSON to the stream, and s.pipe(res) sends each chunk to the client.

@mcollina
Member

It doesn't. You are generating all the chunks synchronously and enqueuing them. After all of that is completed, the event loop will pick up and do all the work to send them through (I'm simplifying).

@cesco69
Contributor Author

cesco69 commented Apr 15, 2024

It doesn't. You are generating all the chunks synchronously and enqueuing them. After all of that is completed, the event loop will pick up and do all the work to send them through (I'm simplifying).

I have tried it, and it's true! I think I need to study how streams work.

:(

@mcollina mcollina closed this Apr 15, 2024