Adding sourcedata filename to a column in the scans.tsv file #905

adam2392 · 2021-10-20T19:49:12Z

Problem

I've been working with conversion of sourcedata files over to BIDS for the sake of i) speeding up my analysis work streams and ii) speeding up sharing of datasets. However, many times you'll have new datasets coming in, or maybe you want to determine if the file you uploaded with some filename (e.g. subject001_eeg_001.edf) was converted or not.

Moreover, many of my collaborators (i.e. clinicians) only remember their original file naming scheme, not the organized BIDS files. Unfortunately, then there's a lot of back and forth about which file is which unless there is a backwards trace of which BIDS file corresponds to which source file. There is no easy way to check this right now.

Suggestion

My proposal is to add a SHOULD-level recommendation for scans.tsv: users add an original_filename column that holds the filename of the source file. This way, one can easily trace back what was converted. To be honest, I think it should be a MUST, unless there is some sort of PHI embedded in the source filename?
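To make the proposal concrete, here is a minimal sketch of what adding such a column could look like. The `original_filename` column is the name proposed above and is not part of the BIDS specification; the helper name `add_original_filename` and the example filenames are made up for illustration.

```python
import csv
import io


def add_original_filename(scans_tsv_text, source_by_bids_name):
    """Return scans.tsv text with an extra `original_filename` column.

    `original_filename` is the column proposed in this issue, not an
    official BIDS column. `source_by_bids_name` maps the value in the
    `filename` column to the name of the source file.
    """
    reader = csv.DictReader(io.StringIO(scans_tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + ["original_filename"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # "n/a" is the BIDS convention for missing values in TSV files.
        row["original_filename"] = source_by_bids_name.get(row["filename"], "n/a")
        writer.writerow(row)
    return out.getvalue()


# Hypothetical example content.
scans = "filename\tacq_time\neeg/sub-001_task-rest_eeg.edf\t2021-01-01T10:00:00\n"
mapping = {"eeg/sub-001_task-rest_eeg.edf": "subject001_eeg_001.edf"}
updated = add_original_filename(scans, mapping)
print(updated)
```

Rows without a known source file get `n/a`, so the column can be filled in incrementally as new data comes in.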

effigies · 2021-10-20T19:55:37Z

Definitely can't do a MUST, and for PHI reasons (and the fact that scans.tsv is optional) I think SHOULD is too strong. This does seem okay to put in as MAY. An alternative could be promoting the derivatives Sources metadata to raw files as well.

adam2392 · 2021-10-20T20:13:51Z

You mean https://bids-specification.readthedocs.io/en/stable/05-derivatives/02-common-data-types.html here?

I suppose that serves the same purpose as adding original_filename in scans.tsv. However, where would these go? Would they go in the sidecar JSON?

I'm okay w/ either option as long as it's specified in BIDS, then we can support it in mne-bids.

tsalo · 2021-10-20T20:16:15Z

+1 to using Sources in raw datasets. It fits in with other applications of derivatives rules to raw datasets, like #440.

effigies · 2021-10-20T20:17:41Z

> You mean https://bids-specification.readthedocs.io/en/stable/05-derivatives/02-common-data-types.html here?

Yes.

> I suppose that serves the same purpose as adding original_filename in scans.tsv. However, where would these go? Would they go in the sidecar JSON?

Yes.

Another approach to this could just be a table in sourcedata/ or code/ with source/destination columns. It could serve as a log or an input to a tool that performs the conversions.
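A rough sketch of such a table, assuming a two-column TSV with `source` and `destination` headers as described above. The helper names and the example paths are hypothetical; where the file lives (for example somewhere under sourcedata/) is left to the user.

```python
import csv
import io


def write_conversion_log(pairs):
    """Serialize (source, destination) pairs as TSV text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "destination"])
    writer.writerows(pairs)
    return out.getvalue()


def find_destination(log_text, source_name):
    """Return the BIDS path a source file was converted to, or None."""
    reader = csv.DictReader(io.StringIO(log_text), delimiter="\t")
    for row in reader:
        if row["source"] == source_name:
            return row["destination"]
    return None


# Hypothetical example: one converted file.
log_text = write_conversion_log(
    [("subject001_eeg_001.edf", "sub-001/eeg/sub-001_task-rest_eeg.edf")]
)
print(find_destination(log_text, "subject001_eeg_001.edf"))
```

Because the mapping lives in a single file outside the BIDS metadata, it can be dropped before sharing without touching any sidecar JSON or scans.tsv.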

sappelhoff · 2021-10-20T20:25:26Z

Also +1 to make Sources available for Raw.
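For illustration, a sketch of what a Sources entry in a raw sidecar JSON might look like. It assumes Sources would be a list of file paths, as it is for derivatives; the helper `with_sources`, the sidecar fields shown, and the path are hypothetical examples, and nothing here is specified for raw data by BIDS at the time of this thread.

```python
import json


def with_sources(sidecar, source_paths):
    """Return a copy of a sidecar dict with a `Sources` list added.

    Assumes `Sources` is a list of paths, mirroring the derivatives
    metadata field; its use in raw sidecars is only a proposal here.
    """
    updated = dict(sidecar)
    updated["Sources"] = list(source_paths)
    return updated


# Hypothetical sidecar content and source path.
sidecar = {"TaskName": "rest", "SamplingFrequency": 256}
updated = with_sources(sidecar, ["sourcedata/subject001_eeg_001.edf"])
print(json.dumps(updated, indent=2))
```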

adam2392 · 2021-10-21T00:56:31Z

Should this just go in the sidecar JSON for each of MEG, EEG and iEEG?

Remi-Gau · 2021-10-21T05:49:42Z

> Also +1 to make Sources available for Raw.

Same for me

guiomar · 2021-10-21T23:22:17Z

I would be careful about including original filenames inside the BIDS dataset, since they often contain sensitive data (e.g. surnames, real dates, diseases, etc.). This information is not essential for understanding the dataset itself; it is lab management logistics. So I would lean more towards some log outside the BIDS metadata (e.g. in /sourcedata) that one can easily delete before the dataset is shared. A field inside a JSON or TSV file might be more difficult to delete.

Remi-Gau · 2021-10-22T06:46:09Z

yup we raised that concern in the PR: #906 (comment)

Though technically nothing in BIDS prevents you from naming a file sub-JohnDoe_T1w.nii, but I see your point.

sappelhoff · 2021-11-29T09:15:19Z

closed, because we discussed in #906 that implementing this on the tooling side of things would suffice ... see mne-tools/mne-bids#890

adam2392 mentioned this issue Oct 21, 2021

[ENH] Add Sources link to all raw datatypes sidecar JSON #906

Closed

sappelhoff closed this as completed Nov 29, 2021
