Missing arcs when plotting BEDPE file format #73

Open
eafyounian opened this issue Jul 15, 2022 · 1 comment
@eafyounian
Contributor

Hi! When plotting arcs from a BEDPE file, the arc is not drawn if the first coordinate comes after the second coordinate (i.e. start_x > start_y). I have fixed this on my side by swapping the first coordinate with the second one whenever that happens. In the figure, the first track after the gene track shows some features, the second track shows the arcs after my fix, and the third track shows the arcs before the fix, where some are missing.
[Figure: gene track, followed by the feature track, the arcs after the fix, and the arcs before the fix]
Is there a more elegant fix?

By the way, I am using coolbox version 0.3.3. I quickly checked the source of the latest version (0.3.8) and did not find any indication that this has been addressed there (but of course I may have missed it 🙂).

Thank you in advance for your consideration and response! 🙂

Nanguage added the bug (Something isn't working) label Jul 15, 2022
@Nanguage
Collaborator

Nanguage commented Jul 15, 2022

I have reproduced this behavior: the Arc track fetches the non-upper-triangular records (start_x > start_y), but they are not drawn.

After checking, I found that this bug is caused by the following line:

diameter = (end - start)

When drawing, the diameter of an arc is computed as pos2 - pos1, so it becomes negative when plotting a non-upper-triangular record (start_x > start_y).
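
To make the arithmetic concrete, here is a minimal sketch with made-up coordinates. It only mirrors the subtraction quoted above and is not the actual coolbox drawing code; the `records` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical arc anchors; the second record has pos1 > pos2.
records = pd.DataFrame({"pos1": [100, 500], "pos2": [300, 200]})

# Same arithmetic as the quoted line: diameter = (end - start)
diameter = records["pos2"] - records["pos1"]

# Flag the records whose diameter comes out negative.
negative = diameter < 0

print(diameter.tolist())  # [200, -300]
print(negative.tolist())  # [False, True]
```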

A more elegant solution is to add a check to fetch_plot_data that converts all records to the upper-triangular form (start_x < start_y). fetch_plot_data is this function:

def fetch_plot_data(self, gr: GenomeRange, **kwargs) -> pd.DataFrame:
    df = self.fetch_data(gr, **kwargs)
    if len(df) == 0:
        return df
    style = self.properties['style']
    if style == self.STYLE_ARCS:
        pos_at = self.properties['pos']
        if pos_at == 'start':
            pos1, pos2 = df['start1'], df['start2']
        elif pos_at == 'end':
            pos1, pos2 = df['end1'], df['end2']
        else:
            pos1, pos2 = (df['end1'] + df['start1']) / 2, (df['end2'] + df['start2']) / 2
        return pd.DataFrame({'pos1': pos1, 'pos2': pos2, 'score': df['score']})
    elif style == self.STYLE_HICPEAKS:
        gr2 = kwargs.get('gr2')
        if gr2 and gr2 != gr:
            mask = (df['start2'] >= gr2.start) & (df['end2'] <= gr2.end)
            df = df[mask]
        return df
    else:
        raise ValueError("The supported style for bedpe data are ['arcs', 'hicpeaks']")
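
One way the conversion could look is sketched below. This is only an illustration of the idea, not the change that was merged into coolbox: `to_upper_triangular` is a hypothetical helper, and the example frame uses made-up values. It takes the element-wise minimum and maximum of the two positions so that pos1 <= pos2 for every record, which keeps the diameter non-negative.

```python
import numpy as np
import pandas as pd


def to_upper_triangular(pos1: pd.Series, pos2: pd.Series, score: pd.Series) -> pd.DataFrame:
    """Hypothetical helper (not part of coolbox): order each pair so pos1 <= pos2."""
    a = pos1.to_numpy()
    b = pos2.to_numpy()
    return pd.DataFrame(
        {
            "pos1": np.minimum(a, b),  # smaller anchor of each pair
            "pos2": np.maximum(a, b),  # larger anchor of each pair
            "score": score.to_numpy(),
        },
        index=pos1.index,
    )


# Made-up BEDPE-like records; the second one is non-upper-triangular.
df = pd.DataFrame({"start1": [100, 500], "start2": [300, 200], "score": [1.0, 2.0]})
out = to_upper_triangular(df["start1"], df["start2"], df["score"])
print(out)
```

In the arcs branch of the function above, such a helper would replace the final `pd.DataFrame({...})` construction, so the `start`, `end`, and midpoint cases are all normalized in one place.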

@Nanguage Nanguage self-assigned this Jul 15, 2022