Enforcing memory limit on update operations #379

Merged
merged 4 commits into master from cap2 on Jul 28, 2017
Conversation

paterczm (Contributor) opened this pull request.

@@ -186,6 +200,16 @@ public void update(CRUDOperationContext ctx,
measure.begin("ctx.addDocument");
DocTranslator.TranslatedDoc translatedDoc=translator.toJson(document);
DocCtx doc=new DocCtx(translatedDoc.doc,translatedDoc.rmd);
if (memoryMonitor != null) {
// if memory threshold is exceeded, this will throw an Error
memoryMonitor.apply(doc);
paterczm (Contributor Author) commented:
Write and read limits are set separately. We can enforce a lower limit on finds than on updates; we can even disable the limit for updates completely.

Inconsistent state as a result of a failed update of a range of documents is not a new problem; it's just that this logic can make it happen more often.
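
To make the mechanism concrete, here is a minimal sketch of the pattern this hunk relies on, using hypothetical names rather than the actual lightblue MemoryMonitor API: a per-operation monitor accumulates an estimated size for each document it sees and throws an Error once a configured threshold is crossed.

import java.util.function.ToIntFunction;

// Hypothetical, simplified monitor; not the actual lightblue class.
class SizeThresholdMonitor<T> {
    private final long maxSizeBytes;        // <= 0 disables the check (e.g. for updates)
    private final ToIntFunction<T> sizeOf;  // approximate per-document size estimate
    private long accumulatedBytes;

    SizeThresholdMonitor(long maxSizeBytes, ToIntFunction<T> sizeOf) {
        this.maxSizeBytes = maxSizeBytes;
        this.sizeOf = sizeOf;
    }

    // Registers the document's estimated size and throws an Error once the
    // running total for this operation crosses the configured threshold.
    T apply(T doc) {
        accumulatedBytes += sizeOf.applyAsInt(doc);
        if (maxSizeBytes > 0 && accumulatedBytes > maxSizeBytes) {
            throw new Error("result size threshold exceeded: "
                    + accumulatedBytes + " > " + maxSizeBytes);
        }
        return doc;
    }
}

With that shape, enforcing a lower limit on finds than on updates, or disabling the update limit entirely, comes down to constructing the two monitors with different thresholds.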

alechenninger (Contributor) commented:

It's not even new to mongo, technically, since the only atomicity guarantee is for single documents; updating a range of documents may fail mongo-side mid-operation as well.

On a separate note, I think there are other places in this algorithm where the document is copied: at least when the update starts we copy the document again, and the projection of the update may copy it as well (I didn't read through that code closely enough to tell whether it's a pass-through view of the original or a proper copy, but there is definitely non-zero memory overhead in either case). Should we monitor those too, or is this approximation enough, given we have a separate threshold for writes?

The main variable is projections, so I guess it ultimately depends on whether the projection makes another copy. If it does not, then the approximation is maybe fine: registering another copy scales linearly, so it makes little difference beyond how closely the threshold matches the actual bytes used in memory. That said, if we tune that threshold based on heap configuration, it may be less surprising if it matches more closely. I suppose we could just multiply the calculated size by ~2 instead of adding the same doc to the monitor twice, i.e. use (doc) -> JsonUtils.size(doc.getRoot()) * 1.5 /* nice comment here explaining this is because we read the doc in memory as the original, then copy it to do the update */.

I used 1.5 in that example because I realized that copying is not exactly *2: primitives are reused since they are immutable (see ValueNode#deepCopy). Ultimately, I'm not sure what is best without looking at real numbers; we know it is more than 1x the size, but I don't know how much greater. Technically this may also affect the hook calculation (double counting "copied-but-not-really" primitives). Maybe it's fine to be off by some, as long as that's documented when tuning thresholds. It's already an approximation to start with, anyway.
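
For illustration only, the multiplier idea could be wired up roughly like this; the lambda shape, the JsonUtils.size signature and the DocCtx accessor are assumptions taken from the comment above, and 1.5 is a guess rather than a measured factor.

// Sketch, not the actual wiring: scale a single size measurement instead of
// registering the same document with the monitor twice.
ToIntFunction<DocCtx> updateSizeEstimator = doc ->
        // we read the doc into memory as the original, then copy it to apply the
        // update; primitives are shared between the two trees, so the real
        // overhead is somewhere between 1x and 2x, approximated here as 1.5x
        (int) (JsonUtils.size(doc.getRoot()) * 1.5);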

paterczm (Contributor Author) commented:

> On a separate note, I think there are other places in this algorithm where the document is copied: at least when the update starts we copy the document again, and the projection of the update may copy it as well (I didn't read through that code closely enough to tell whether it's a pass-through view of the original or a proper copy, but there is definitely non-zero memory overhead in either case). Should we monitor those too, or is this approximation enough, given we have a separate threshold for writes?

Coming up with a DocCtx size is tricky. There are originalDocument, outputDocument (projected) and updatedDocument, all populated and re-populated at different stages of update operation processing. I included the copies created when the update starts (which you pointed out). Projection, as far as I can see, only creates JSON "containers" (ArrayNode/ObjectNode) and keeps references to the actual fields and values. To be honest, after days of looking at this I'm still unsure how many document copies we store in memory at peak. It's not just the update that creates copies; it's also HookManager, and I really don't know why it isn't just using the copies in the DocCtxes.

At this point, the estimate sums up root, originalDocument and updatedDocument from DocCtx, plus the pre and post copies created during hook queuing. What the estimate does not cover is that updatedDocument is modified in place (by the update operation) and that outputDocument is created by the projector.
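
Spelled out as a sketch (accessor names are hypothetical, not checked against the code base), the estimate described here adds up roughly the following, leaving the projected outputDocument and the in-place growth of updatedDocument uncounted.

// Roughly what the current estimate covers, with hypothetical accessors.
static long estimateDocCtxFootprint(DocCtx doc, JsonNode hookPreCopy, JsonNode hookPostCopy) {
    long bytes = JsonUtils.size(doc.getRoot());                          // working document (root)
    if (doc.getOriginalDocument() != null) {
        bytes += JsonUtils.size(doc.getOriginalDocument().getRoot());    // copy made when the update starts
    }
    if (doc.getUpdatedDocument() != null) {
        bytes += JsonUtils.size(doc.getUpdatedDocument().getRoot());     // document after the update is applied
    }
    bytes += JsonUtils.size(hookPreCopy) + JsonUtils.size(hookPostCopy); // pre/post copies queued for hooks
    // Not counted: outputDocument created by the projector, and the in-place
    // modification of updatedDocument by the update operation.
    return bytes;
}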

alechenninger (Contributor) left a review comment:

This looks good, though I think our result size calculation may be off by a factor of roughly 1.5 to 2.5; see comment.

Ultimately, it's not a huge deal if the number is off by a constant multiplier because we can change the threshold appropriately. There isn't exactly a proper "unit" that we can use, so I already see the number as sort of "arbitrary byte-like-thing that correlates closely with actual bytes used" :-).


// no hooks will fire for updated batches
// counts sent to client will be set to zero
// TODO: I perceive this as a problem with updates and hooks impl in general
// we need to run hooks per batch (see https://github.com/lightblue-platform/lightblue-mongo/issues/378)
alechenninger (Contributor) commented:

Agree, the hook design needs work; this is one problem among others.

projection("{'field':'*'}"));

// this is wrong - one batch was updated successfully
// see IterateAndUpdate.java:205 for more info
alechenninger (Contributor) commented:

This is ugly, but I think given our options it makes sense: "updated" probably should mean that hooks fired too, that is, the whole transaction was successful for those N documents. In this case that's not true, so it probably makes sense to consider them not "updated" despite the visible state change.

paterczm (Contributor Author) commented:

> This is ugly, but I think given our options it makes sense

Agree. It's misleading, not necessarily wrong.

paterczm merged commit 493b6e6 into master on Jul 28, 2017
paterczm deleted the cap2 branch on July 28, 2017 at 11:37