Add annotations #654
The biggest problem I am aware of is dealing with null-intervals. There are a lot of cases which require different behavior from null-intervals:
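As one illustration of why such intervals are awkward, here is a minimal sketch. It assumes that a "null-interval" means a zero-length span (start equals stop) and that spans are half-open `[start, stop)`; both are assumptions, and the helper names `overlaps` and `covers` are hypothetical, not taken from this PR.

```python
def overlaps(a, b):
    """Strict overlap test for half-open intervals [start, stop)."""
    return a[0] < b[1] and b[0] < a[1]


def covers(outer, inner):
    """True if `inner` lies fully inside `outer`, borders included."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


token = (4, 8)           # an ordinary span
null_inside = (6, 6)     # zero-length span strictly inside the token
null_at_edge = (4, 4)    # zero-length span at the token's left border

# The same zero-length span is treated differently depending on the query
# and on where it sits, so each use case has to pick its own rule.
inside_overlaps = overlaps(token, null_inside)
edge_overlaps = overlaps(token, null_at_edge)
edge_covered = covers(token, null_at_edge)
self_overlaps = overlaps(null_at_edge, null_at_edge)
```

With these definitions a zero-length span at a border is covered by the token but does not overlap it, and it does not even overlap itself, which is the kind of case-by-case behavior that has to be decided explicitly.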
Thanks for the first draft @zurk 👍 Some feedback I can give from UIMA experience:
First feedback addressed. The code has become more verbose, but Hugo and I agreed that it is better to be explicit than short.
I was busy with #666, so I will carry out all my plans after a review.
Thanks for the improvements @zurk 👍
From the discussion we had, the last big point left to implement seems to be the UAST to annotations transformation (which will let us drop many code smells from the current codebase).
This is starting to look ready 👍. It would be better to have a parent attribute in UASTAnnotation and remove UASTParentAnnotation though; it will make the code much simpler!
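A minimal sketch of that suggestion, assuming an annotation is a span plus a payload. Only the name `UASTAnnotation` comes from the discussion; the fields, the `annotate_with_parents` helper, and the toy dict nodes are hypothetical and stand in for real UAST nodes.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class UASTAnnotation:
    """Hypothetical layout: one annotation carries both the node and its parent."""
    start: int
    stop: int
    node: Any
    parent: Optional[Any] = None  # None for the root node


def annotate_with_parents(
    root: Any,
    get_children: Callable[[Any], Sequence[Any]],
    get_span: Callable[[Any], Tuple[int, int]],
) -> List[UASTAnnotation]:
    """Walk the tree depth-first and attach the parent while visiting each node."""
    result = []
    stack = [(root, None)]
    while stack:
        node, parent = stack.pop()
        start, stop = get_span(node)
        result.append(UASTAnnotation(start, stop, node, parent))
        # Push children in reverse so they are visited in their original order.
        for child in reversed(get_children(node)):
            stack.append((child, node))
    return result


# Toy tree made of dicts, only to exercise the helper.
leaf = {"span": (0, 1), "children": []}
root = {"span": (0, 5), "children": [leaf]}
annotations = annotate_with_parents(
    root, lambda n: n["children"], lambda n: n["span"])
```

Because the parent is filled in during the same traversal that creates the annotation, a separate parent annotation type is not needed in this sketch.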
I agree. However, when I started implementing it I saw that it is not so easy. The problem is that we add the UAST and parent annotations in different places. It looks like we can move the parent annotation code to the UAST annotation, but we have a I hope that once the v3 Python client is introduced, the parent feature will be included in the UAST itself and we can remove
I think we can consider this PR done and ready to merge, so please review carefully and let's merge if it is fine. I am also running a regression test for this part one more time to be sure that I get exactly the same VNodes in the end (zurk@612887e). Here is the commit to run the test (I run
This option seems perfectly fine to me. I feel we should do it to avoid complicating the rest of the codebase.
Ok, I think I know the answer. My plan is:
The logic is not broken, the code is improved, and everyone seems to be happy. :)
Yes, it seems like a great solution: the code will then be easy to write everywhere else 👍
Signed-off-by: Konstantin Slavnov <[email protected]>
Done.
Since there are a lot of conflicts, I am going to close this one and open a new PR.
Note that this is all work in progress. However, I created this PR to discuss the proposed solution, find all possible drawbacks, and fix them at an early stage. Please note that the code is raw in minor details. Let's discuss high-level questions about the implemented architecture/interface/API decisions.
Let me remind you of the main idea of annotations:
Separate the handling of data (code) from all its annotations (like UAST nodes, Virtual Token, `y`, file path, etc.).
This PR contains:
`lookout-sdk-ml` when I create a ready-to-merge PR. For now, I do not want to split the code because it is harder to review.
`to_vnode()` function.
`FeatureExtractor._parse_vnodes()`, `FeatureExtractor._parse_file()`, and `VirtualNode.from_node()` functions are rewritten using Annotations to give a feeling of what it will look like. To make everything work I use the `to_vnode()` backward compatibility function.
I also checked that I get exactly the same vnodes that we have in current master.
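To make the main idea concrete, here is a minimal sketch of keeping the raw text untouched and storing annotations next to it, grouped by type. It is an illustration under stated assumptions, not the code in this PR: `AnnotatedText`, `TokenAnnotation`, `PathAnnotation`, and all method names are hypothetical, and spans are assumed to be half-open `[start, stop)`.

```python
from dataclasses import dataclass
from typing import Dict, List, Type


@dataclass(frozen=True)
class Annotation:
    """Hypothetical base class: a half-open span [start, stop) over the text."""
    start: int
    stop: int


@dataclass(frozen=True)
class TokenAnnotation(Annotation):
    """Hypothetical annotation that marks one token."""


@dataclass(frozen=True)
class PathAnnotation(Annotation):
    """Hypothetical annotation that attaches a file path to a span."""
    path: str = ""


class AnnotatedText:
    """Holds the text as-is and keeps every annotation outside of it."""

    def __init__(self, text: str):
        self.text = text
        self._annotations: Dict[Type[Annotation], List[Annotation]] = {}

    def add(self, annotation: Annotation) -> None:
        if not 0 <= annotation.start <= annotation.stop <= len(self.text):
            raise ValueError("annotation span is outside of the text")
        self._annotations.setdefault(type(annotation), []).append(annotation)

    def iter_type(self, annotation_type: Type[Annotation]) -> List[Annotation]:
        """Return annotations of one type ordered by position."""
        found = self._annotations.get(annotation_type, [])
        return sorted(found, key=lambda a: (a.start, a.stop))

    def covered_text(self, annotation: Annotation) -> str:
        return self.text[annotation.start:annotation.stop]


data = AnnotatedText("x = 1")
data.add(PathAnnotation(0, 5, path="example.py"))
for start, stop in [(4, 5), (0, 1), (2, 3)]:
    data.add(TokenAnnotation(start, stop))
tokens = [data.covered_text(a) for a in data.iter_type(TokenAnnotation)]
```

In this sketch, consumers such as a feature extractor would query annotations by type instead of carrying positions, paths, and labels inside every node object.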
Related PR: #521
cc @src-d/machine-learning and especially @m09 PTAL
Any comments and feedback appreciated.