Classified fields in Perceval docs should correspond to the fields actually removed #611

valeriocos · 2020-02-06T11:37:50Z

When executing the GitHub backend with the --filter-classified option, the full list of classified fields is reported in every JSON document (pointer). As a result, it isn't possible to tell which fields were actually removed from a given document when that document didn't contain some of the classified fields. It would be useful to adapt the code so that classified_fields_filtered includes only the fields that have been removed.

sduenas · 2020-02-06T11:55:36Z

The idea of these fields is all or nothing. You don't decide which fields you want to remove and which you don't want.

Is there anything I'm missing?

valeriocos · 2020-02-06T12:17:03Z

Based on the classified fields declared at https://github.com/chaoss/grimoirelab-perceval/blob/master/perceval/backends/core/github.py#L102, for each document only the classified fields actually removed from it should appear in the classified_fields_filtered attribute.

Having a look at the code, I'm basically saying that in case of a KeyError, that classified field shouldn't be added to the classified_fields_filtered attribute. The reason is that with the document obtained, it isn't possible to know if that classified field was removed or didn't exist.
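To make the proposal concrete, here is a minimal, self-contained sketch of the behaviour I have in mind. It is not Perceval's actual code: the helper names (remove_nested_key, filter_classified) and the sample field list are hypothetical, and I'm assuming each classified field is expressed as a list of nested keys.

```python
def remove_nested_key(document, path):
    """Delete the key reached by following `path`; raise KeyError if absent."""
    node = document
    for key in path[:-1]:
        node = node[key]  # raises KeyError when an intermediate key is missing
        if not isinstance(node, dict):
            # The path cannot continue through a non-dict value
            raise KeyError(key)
    del node[path[-1]]  # raises KeyError when the final key is missing


def filter_classified(document, classified_fields):
    """Remove classified fields and return only those actually removed."""
    removed = []
    for path in classified_fields:
        try:
            remove_nested_key(document, path)
        except KeyError:
            # The field was not in this document, so it is not reported
            continue
        removed.append('.'.join(path))
    return removed


# Hypothetical field list and document, for illustration only
CLASSIFIED_FIELDS = [['user_data'], ['merged_by_data'], ['assignee_data']]
issue = {
    'title': 'Bug',
    'user_data': {'login': 'alice'},
    'assignee_data': {'login': 'bob'},
}
issue['classified_fields_filtered'] = filter_classified(issue, CLASSIFIED_FIELDS)
```

With this, merged_by_data is left out of classified_fields_filtered for the issue above, because it was never there to remove.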

sduenas · 2020-02-06T20:09:23Z

Is that really necessary? What would be the difference? In the end, data is not going to be there which is what we really want with that option.

valeriocos · 2020-02-08T11:34:03Z

We can live with it, so feel free to close this issue.

The point is that the classified fields at https://github.com/chaoss/grimoirelab-perceval/blob/master/perceval/backends/core/github.py#L102 include a mix of attributes present in issues and pull requests. It would be better to have classified fields per category; then we can decide whether to include in the classified_fields_filtered of each document only the fields actually removed.
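A rough sketch of what per-category classified fields could look like, in case it helps the discussion. Everything here is an assumption for illustration: the mapping name, the category keys, and the field names are hypothetical rather than what the backend declares, and only top-level fields are handled to keep it short.

```python
# Hypothetical per-category declaration (not the backend's real list)
CLASSIFIED_FIELDS_BY_CATEGORY = {
    'issue': ['user_data', 'assignee_data'],
    'pull_request': ['user_data', 'merged_by_data'],
}


def filter_by_category(document, category):
    """Drop the classified fields declared for `category` and record them."""
    removed = []
    for field in CLASSIFIED_FIELDS_BY_CATEGORY.get(category, []):
        if field in document:
            del document[field]
            removed.append(field)
    # Only fields that were present, and therefore removed, are listed
    document['classified_fields_filtered'] = removed
    return document


pull_request = {
    'title': 'Fix typo',
    'user_data': {'login': 'alice'},
    'merged_by_data': {'login': 'bob'},
}
filtered = filter_by_category(pull_request, 'pull_request')
```

Each category then reports only its own fields, so an issue document would never list a pull request attribute such as merged_by_data.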
