Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix partitioned session window bug during checkpoint restore #143

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

arunkm
Copy link
Collaborator

@arunkm arunkm commented Oct 29, 2020

Fix partitioned session window bug, where restoring from checkpoint triggers exception

}
else if (!this.orderedKeysDictionary.ContainsKey(partitionKey))
{
this.orderedKeysDictionary.Add(partitionKey, new LinkedList<TKey>());
}
Copy link
Contributor

@peterfreiling peterfreiling Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comments to explain why this is necessary - e.g., orderedKeysDictionary is not included in checkpoint, so needs to be restored, and every entry in the other collections also needs to be present in orderedKeysDictionary upon restore as that assumption is made throughout this class, etc. #Closed

@arunkm arunkm force-pushed the users/arunkm/fix-PartitionedSessionWindow-bug branch from 4f85e22 to 4d86cf2 Compare October 29, 2020 18:53
namespace SimpleTesting
{
/* This test case verifies fix for
* Bug 944724: Exception in Trill SessionWindow when Partition (substreams) is used and restored from checkpoint
Copy link
Contributor

@peterfreiling peterfreiling Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

944724 [](start = 11, length = 6)

944724 [](start = 11, length = 6)

Since this is GitHub repo please do not reference internal bugs, just describe them if needed #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reworded

…iggers an exception.

Cause of the Issue:
    The PartitionedSessionWindowPipe keeps multiple dictionary states.
    One of the dictionary does not have a [DataMember] attribute, Because the value type is a LinkedList which does not support serialization.
    On checkpoint and then Restore, this dictionary is re-created using other data members in UpdatePointers callback.
    During this, the scenario of empty LinkedList value is missed and not restored.
    When next data event appears for the partition, the partitionKey is indexed on the dictionary resulting in KeyNotFoundException

Regression
     No, this existed for a long time. The customer hit the bug now because of workaround suggested (for another bug) to use 'timestamp by .. over' clause.

ReproSteps:
    Added as a Testcase.

Fix:
    Added code to restore key with empty LinkedList.
    Verified that this was the exact state just before checkpoint.
    Added a testcase that triggers this bug and with the fix the testcase passes.
@arunkm arunkm force-pushed the users/arunkm/fix-PartitionedSessionWindow-bug branch from 4d86cf2 to 58496b0 Compare October 29, 2020 19:06
Copy link
Contributor

@peterfreiling peterfreiling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants