Setting the "references" via the GFA `RS:Z:` tag in the header or using `vg gbwt --set-reference` for a given panSN sample name works fine. However, for triobinned assemblies (particularly in agriculture), one haplotype might be taken as the reference while the other is not. Setting the header with `RS:Z:sample#hap` results in HAPLOTYPE-sense, while trying `vg gbwt --set-reference "sample#hap"` fails due to a prohibited character (`#`).
Perhaps the easiest solution is to promote the entire sample to reference and just ignore that the non-reference haplotype is REFERENCE-sense, but do you have any other ideas? I guess handling a half-reference half-haplotype sample might complicate a lot of the internal workings for diploid sampling, but parsing the haplotype field of panSN for setting reference paths seems doable.
This is using v1.63.1.
Best,
Alex
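To make the request concrete, here is a minimal sketch of the haplotype-level selection described above. It is plain Python and not something vg does today: `parse_pansn` and `select_reference_paths` are hypothetical helpers, and the path names are assumed to follow the three-field panSN layout `sample#haplotype#contig`.

```python
def parse_pansn(name, delim="#"):
    """Split a panSN name 'sample#haplotype#contig' into its three fields.

    Assumption: the name has at least three delimiter-separated fields;
    anything after the second delimiter is treated as the contig name.
    """
    sample, haplotype, contig = name.split(delim, 2)
    return sample, haplotype, contig


def select_reference_paths(path_names, reference_haplotypes):
    """Return the paths whose 'sample#haplotype' prefix was requested as reference.

    Hypothetical helper: `reference_haplotypes` holds strings such as 'sample#1'.
    """
    wanted = set(reference_haplotypes)
    selected = []
    for name in path_names:
        sample, haplotype, _contig = parse_pansn(name)
        if f"{sample}#{haplotype}" in wanted:
            selected.append(name)
    return selected


# Made-up path names for a trio-binned sample plus one other sample.
paths = ["cow#1#chr1", "cow#2#chr1", "bull#1#chr1"]
reference_paths = select_reference_paths(paths, ["cow#1"])
```

With these inputs only `cow#1#chr1` is selected, so `cow#2#chr1` would keep its haplotype sense.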
GBZ could in principle specify references at the level of individual paths, but there are no interfaces for passing that information to the graph. Not on the command line, not in the API, and not in the file formats. Some other implementations determine the sense by parsing the path name, and renaming the paths is the only way to switch between haplotype and reference senses. I'm not exactly sure about the rules. `sample#sequence` is clearly a reference and `sample#haplotype#sequence#something` is clearly a haplotype, but I'm not sure how PanSN names will be interpreted.
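As a rough illustration of name-based sense detection, the sketch below classifies a path name purely by counting its `#`-separated fields. It covers only the two cases stated above and deliberately reports three-field PanSN names as unclear. `guess_sense` is a hypothetical function, not part of vg or any other implementation, and the real rules may differ.

```python
def guess_sense(path_name):
    """Guess a path's sense from the number of '#'-separated fields.

    Assumption: only the two unambiguous layouts are decided here;
    the real parsing rules used by any given tool may differ.
    """
    fields = path_name.split("#")
    if len(fields) == 2:
        # sample#sequence
        return "reference"
    if len(fields) == 4:
        # sample#haplotype#sequence#something
        return "haplotype"
    if len(fields) == 3:
        # PanSN: sample#haplotype#sequence; interpretation not settled here
        return "unclear"
    return "unknown"


# Made-up names covering each case.
senses = {
    name: guess_sense(name)
    for name in ["ref#chr1", "sample#1#chr1#0", "sample#1#chr1", "chr1"]
}
```

Under such a scheme, switching a path between senses means renaming it so that it falls into the other layout.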