Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor feature vector generation #394

Merged
merged 38 commits into from
May 11, 2024
Merged

Conversation

hdefazio
Copy link
Contributor

@hdefazio hdefazio commented Apr 15, 2024

  • time_from_name() now requires the topic of the data it is being provided because the medical and cooking dataset images have different filename structures
    - this also required task specific versions of time_from_name() in their respective load_data files
  • Added the ability to draw keypoints if they are present in visualize_kwcoco_by_label()
  • Wrote hacky function activity_label_fixes() to map the BBN ground truth acitivty labels to our activity labels

MAIN POINT OF MR:

  • Refactored compute_feats() in train_actiivty_classifier.py and obj_det2d_set_to_feature() and obj_det2d_set_to_feature_by_method() in utils.py so the joint distances would be added if the respective flags were turned on in utils.py and not always added in train_activity_classifier.py
    - this also resulted in refactoring the feature vector order to remove redundancies in the data and loops
  • Added plot_feature_vec() to plot the various object/hand/joint points based on the feature vector and the kwcoco hand bounding boxes as a debug and visualization tool
  • Added feature_version_to_options() so we can access the corresponding feature vector flags in new spots - and thus updated obj_det2d_set_to_feature() to use this function

@hdefazio hdefazio marked this pull request as draft April 15, 2024 18:08
@hdefazio hdefazio changed the title Fix pytest failures Refactor feature vector generation May 1, 2024
Extract the float timestamp from the filename.

:param fname: Filename of an image in the format
frame_<frame number>_<seconds>_<nanoseconds>.<extension>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there somewhere else that it would be useful to document this filename requirement?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea: wherever such a filename requirement is assumed, we could add an "assert" checking for the right number of underscores

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^From Hannah:

the id used to be parsed like this:

RE_FILENAME_TIME = re.compile(
    r"frame_(?P<frame>\d+)(?P<ts>\d+(?:|.)\d+).(?P<ext>\w+)"
)

So then you'd get a slightly-opaque error in this code:

    fname = os.path.basename(fname)
    match = RE_FILENAME_TIME.match(fname)
    time = match.group("ts")

Rough proposal: edit the function here and always use it to acquire any arguments from the assumed filename above. Add some assertions (correct number of underscores, etc.) which try to enforce the above naming convention. If those assertions fail, the failure message should contain the fname convention that's documented in-line above.

@Purg , please tell me if you have better ideas or suggestions for the check I'm proposing here!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party. I would probably recommend updating the regex to have more specific captures for the expected pattern instead of capturing a variable pattern and then performing secondary parsing. Unless you're implying with the regex that the _<nanoseconds> portion is optional and sometimes not there. If the latter is the case, I would still recommend being explicit and having two regexs, one with and one without nanoseconds. Try matching against nanoseconds, if fail try parse against only seconds, if fail then raise an exception.

RE_FILENAME_TIME_SEC = re.compile(
    r"frame_(?P<frame>\d+)_(?P<ts_sec>\d+)\.(?P<ext>\w+)"
)
RE_FILENAME_TIME_NANO = re.compile(
    r"frame_(?P<frame>\d+)_(?P<ts_sec>\d+)_(?P<ts_nano>\d+)\.(?P<ext>\w+)"
)

@@ -29,6 +29,14 @@ def sanitize_str(str_: str):
"""
return str_.lower().strip(" .")

def time_from_name(fname, topic="cooking"):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some assertion that the fname matches the RE_FILENAME_TIME parsing that we expect - if it doesn't, print RE_FILENAME_TIME

Comment on lines +33 to +36
if topic == "medical":
from angel_system.data.medical.load_bbn_data import time_from_name as tfn
elif topic == "cooking":
from angel_system.data.cooking.load_kitware_data import time_from_name as tfn
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right way to handle multiple versions of this function? This was attempting to allow other functions to just import from angel_system.data.common.load_data import time_from_name without needing to recreate this if statement each time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK, but it is generally preferred to not perform imports at runtime if you can avoid it (unexpected runtime exception hazard). An alternative to this is to create a global dict (or equivalent structure) that has these keys as its keys, and its values being the function reference, e.g.

...
import angel_system.data.medical.load_bbn_data
import angel_system.data.cooking.load_kitware_data
...


TOPIC_TIME_FN_MAP = {
    "medical": angel_system.data.medical.load_bbn_data.time_from_name,
    "cooking": angel_system.data.cooking.load_kitware_data.time_from_name,
}


def time_from_name(fname, topic="cooking"):
    topic = topic.lower()
    if topic not in TOPIC_TIME_FN_MAP:
        raise ValueError(f"Unknown topic {topic}: Unknown filename time function.")
    return TOPIC_TIME_FN_MAP[topic](fname)

label = "mark-time"
label_id = 8

def activity_label_fixes(task, activity_label, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hacky - the way BBN gave us activity GT, we parse it like this in angel_system/data/medical/load_bbn_data.py:

data = line.split("\t")

            # Find frame filenames
            start_frame = int(data[0])
            end_frame = int(data[1])

# Determine activity
            activity_str = data[2].strip().split(" ")
            hand = activity_str[0]
            activity = activity_str[1]
            target = activity_str[2] if len(activity_str) > 2 else None

Example output:

212     1128    hands      apply_pressure_to        casualty_wound

(Paraphrased)
"start frame"
"end frame"
"which hand is doing the activity" ("both" = "hands")
"activity" (generic, could apply to R18 or M2)
"target"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some text like this to the docstring for this function?

Copy link
Member

@Purg Purg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I am able to grasp what is going on at the larger picture view, but the smaller pieces that I could understand look good to me. See relatively minor comments.

I'll assume various leading/trailing whitespace things will be handled via a code formatting pass with black.

f"{json.dumps([o['name'] for o in dset.categories().objs])}"
)
act_id_to_str = {dset.cats[i]["id"]: dset.cats[i]["name"] for i in dset.cats}
print(f"Object label mapping:\n\t", obj_label_to_ind)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: This is not an f-string anymore so the leading f can be removed.


:param feat_version:
Version of the feature conversion approach.
:param top_k_objects: Number top confidence objects to use per label, defaults to 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:param top_k_objects: Number top confidence objects to use per label, defaults to 1
:param top_k_objects: Number of top confidence objects to use per label, defaults to 1

object_inds = list(
set(list(label_to_ind.values())) - set(hands_inds) - set(non_object_inds)
)
zero_joint_offset = [0 for i in range(22)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we able to get this 22 size from anywhere via introspection, or is the only not-difficult way to get this in here via hard-coding?

X.append(feature_vec.ravel())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing whitespace.

#########################
default_dist = (0, 0) # (1280 * 2, 720 * 2)
default_center_dist = (0, 0) # (1280, 720)
default_bbox = [0, 0, 0, 0] # [0, 0, 1280, 720]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, non-mutable types should be used as defaults (and for function argument defaults that I see later), e.g. tuple. The use of list type variables here means they are subject to modification during runtime which can lead to hard-to-debug logical errors.

if use_center_dist:
image_center = kwimage.Boxes([default_bbox], "xywh").center
image_center = kwimage.Boxes([0, 0, 1280, 720], "xywh").center # Hard coded image size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be a little more comfortable with default values/tuples/etc. being defaulted kwargs to the function (actually modifiable by future/advanced code), but if this works for now, it works. Either way, I would highly recommend documenting hard-coded values and assumptions in the function's doc-string.

@@ -49,7 +49,7 @@ def load_kwcoco(dset):
return dset


def add_activity_gt_to_kwcoco(task, dset):
def add_activity_gt_to_kwcoco(topic, task, dset, activity_config_fn):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I latently understand that "topic" here refers to things like "Medical", "Cooking", etc. It would be good to define jargon like this here so that future readers know what should be going here.

Comment on lines +33 to +36
if topic == "medical":
from angel_system.data.medical.load_bbn_data import time_from_name as tfn
elif topic == "cooking":
from angel_system.data.cooking.load_kitware_data import time_from_name as tfn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK, but it is generally preferred to not perform imports at runtime if you can avoid it (unexpected runtime exception hazard). An alternative to this is to create a global dict (or equivalent structure) that has these keys as its keys, and its values being the function reference, e.g.

...
import angel_system.data.medical.load_bbn_data
import angel_system.data.cooking.load_kitware_data
...


TOPIC_TIME_FN_MAP = {
    "medical": angel_system.data.medical.load_bbn_data.time_from_name,
    "cooking": angel_system.data.cooking.load_kitware_data.time_from_name,
}


def time_from_name(fname, topic="cooking"):
    topic = topic.lower()
    if topic not in TOPIC_TIME_FN_MAP:
        raise ValueError(f"Unknown topic {topic}: Unknown filename time function.")
    return TOPIC_TIME_FN_MAP[topic](fname)

Comment on lines 132 to 135
if topic == "medical":
from angel_system.data.medical.load_bbn_data import time_from_name
elif topic == "cooking":
from angel_system.data.cooking.load_kitware_data import time_from_name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above time_from_name could be used here instead of duplicating its functionality.

Extract the float timestamp from the filename.

:param fname: Filename of an image in the format
frame_<frame number>_<seconds>_<nanoseconds>.<extension>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't match the regex defined above: The regex has a - (dash) after the frame number instead of an underscore.

@hdefazio hdefazio marked this pull request as ready for review May 11, 2024 01:16
@hdefazio hdefazio merged commit cbc2cfd into PTG-Kitware:master May 11, 2024
1 of 3 checks passed
@hdefazio hdefazio deleted the fix/pytest branch May 11, 2024 01:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants