Adding to SVG output #3086

Open
hyanwong opened this issue Feb 12, 2025 · 9 comments

@hyanwong
Member

hyanwong commented Feb 12, 2025

There are a number of use-cases for adding material to tskit tree drawings. In particular I'm thinking about legends, but this would also be useful for annotations for tutorials etc. I am unsure if adding the extra code to the tskit drawing module would be worth the maintenance burden, but it seems pretty nice to be able to do something like the following in a notebook:

tree.draw_svg().embed(my_legend_text_in_SVG_format)

This would also allow multiple SVG trees to be shown side-by-side, which can be helpful:

tree1.draw_svg().embed(tree2.draw_svg(root_svg_attributes={'x': 200}))

Note that this isn't done by changing the SVG drawing routine (which is hard, and already fairly overloaded with functionality); instead it can be implemented by adding a method to the tskit.drawing.SVGString class that inserts arbitrary markup immediately after the opening <svg> tag. The code below does so relatively efficiently using Python's built-in HTML parser:

from html.parser import HTMLParser

class StopParsing(Exception):
    "Raised to halt parsing once the opening <svg> tag has been seen"


class SVGParser(HTMLParser):
    "Records the text of the first <svg ...> start tag, then stops parsing"

    def __init__(self):
        super().__init__()
        self.svg_tag = None

    def handle_starttag(self, tag, attrs):
        if tag == 'svg':
            self.svg_tag = self.get_starttag_text()
            raise StopParsing


class SVGString(str):
    "A string containing an SVG representation"
    ...

    def embed(self, svg_to_embed):
        """
        Embed additional svg markup at the start of the svg string (immediately after the initial
        <svg ...> tag). This could include standard tags, or even a nested <svg> component.
        """
        # The helper classes live at module level: names defined in a class body
        # are not visible from inside that class's methods or nested classes.
        parser = SVGParser()
        start = 0
        while start < len(self):
            end = self.find('>', start)  # feed up to potential end-of-tag into parser
            if end == -1:
                break
            try:
                parser.feed(self[start:(end + 1)])
                start = end + 1
            except StopParsing:
                # Assume we don't need anything before the first <svg ...> tag
                return SVGString(parser.svg_tag + svg_to_embed + self[end + 1:])
        raise ValueError("No starting <svg> tag found in SVGString")

Is this too much to add into the tskit code base? If so, we can simply close this issue.

@hyanwong
Member Author

hyanwong commented Feb 12, 2025

Here's a nice example of adding a legend and outputting immediately to the notebook window, which would be much more difficult to achieve if the user had to deconstruct the SVG by hand and reassemble with a legend:

Image
import msprime
import stdpopsim
species = stdpopsim.get_species("HomSap")
model = species.get_demographic_model("AncientEurasia_9K19")
subpop_size = 50
samples = {pop.name: subpop_size for pop in model.populations if pop.default_sampling_time is not None}
contig = species.get_contig(length=1e4)
engine = stdpopsim.get_engine("msprime")
ts = engine.simulate(model, contig, samples, seed=1)

colours = ['#CC6677', '#332288', '#DDCC77', '#117733', '#88CCEE', '#882255', '#44AA99', '#999933', '#AA4499']
styles = [f".node.p{p.id} > .sym {{fill: {colours[p.id]}}}" for p in model.populations]
params = dict(size=(1000, 600), time_scale="log_time", node_labels={}, style=" ".join(styles))

svg = ts.first().draw_svg(**params, title=f"AncientEurasia_9K19, n = {subpop_size} x {len(samples)} populations")

legend = "".join([
    f'<g transform="translate(40, {20 + 15*p.id})" class="node p{p.id}"><rect width="6" height="6" class="sym" /><text x="10" y="7">{p.name} (id={p.id})</text></g>'
    for p in model.populations
])

svg.embed(legend)

Once we have functionality like this, it would then be reasonably easy to provide a helper function that e.g. generated a legend for different population colours, as in my example above.
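As a rough illustration of what such a helper might look like, here is a minimal sketch. The function name `population_legend` and its parameters are hypothetical (nothing like it exists in tskit as far as this thread shows); it simply generalises the list comprehension from the example above and escapes the population names so that characters such as `&` do not break the markup.

```python
from xml.sax.saxutils import escape


def population_legend(names, x=40, y=20, line_height=15):
    """
    Return SVG markup for a legend with one row per population.

    `names` maps population id -> display name (hypothetical interface).
    Each row carries the classes "node p<ID>", so the `.node.p<ID> > .sym`
    style rules used for the tree also colour the legend symbol.
    """
    rows = []
    for i, (pop_id, name) in enumerate(names.items()):
        rows.append(
            f'<g transform="translate({x}, {y + line_height * i})" class="node p{pop_id}">'
            '<rect width="6" height="6" class="sym" />'
            f'<text x="10" y="7">{escape(str(name))} (id={pop_id})</text>'
            "</g>"
        )
    return "".join(rows)


# Placeholder names, not taken from any real demographic model
legend = population_legend({0: "popA", 1: "popB"})
```

With the proposed method in place, the result could then be passed straight to `svg.embed(legend)`.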

@jeromekelleher
Member

If someone were to do this themselves, parsing the SVG as XML or whatever and updating using upstream libraries, what would it look like?

@hyanwong
Member Author

hyanwong commented Feb 13, 2025

Good question. I guess it might involve either knowing the structure of the SVG, or doing what I have previously done and embedding both SVGs in another SVG. The problem with this is that it doesn't automatically work in a notebook. So e.g.:

from IPython.display import SVG, display

svg = tree.draw_svg(**params)

legend = "".join([  # as before
    f'<g transform="translate(40, {20 + 15*p.id})" class="node p{p.id}"><rect width="6" height="6" class="sym" /><text x="10" y="7">{p.name} (id={p.id})</text></g>'
    for p in model.populations
])

## New stuff (untested)

# Somehow extract the height and width from the current SVG
h, w = some_function_to_get_h_w(svg)
# make an SVG tag: I always have to look up the syntax for this
svg_start = f'<svg baseProfile="full" height="{h}" version="1.1" width="{w}" xmlns="http://www.w3.org/2000/svg">'

display(SVG(svg_start + svg + legend + "</svg>"))
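One way to fill in the `some_function_to_get_h_w` placeholder is to read the `width` and `height` attributes from the root element with the standard library. The sketch below is an assumption-laden illustration, not tested against tskit output: the name `wrap_svg` is made up, and it assumes the inner SVG has explicit `width` and `height` attributes and no leading `<?xml ...?>` declaration (which would be invalid once nested).

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def wrap_svg(svg, extra):
    """
    Place `svg` (a complete SVG document) and `extra` (SVG markup) inside a
    new outer <svg> element that copies the original width and height.
    """
    root = ET.fromstring(svg)  # note: this parses the whole document
    w, h = root.get("width"), root.get("height")
    if w is None or h is None:
        raise ValueError("Root <svg> element has no width/height attributes")
    outer = (
        f'<svg xmlns="{SVG_NS}" version="1.1" baseProfile="full" '
        f'width="{w}" height="{h}">'
    )
    return outer + svg + extra + "</svg>"
```

In a notebook, the result could be shown with `display(SVG(wrap_svg(svg, legend)))`.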

@jeromekelleher
Member

Right, but using a proper html/svg/xml library what does it look like? My issue here is that you need to know a lot about SVG anyway to usefully tweak these things, so why don't we figure out how to get people using more powerful upstream tools for this rather than rolling our own?

@hyanwong
Member Author

hyanwong commented Feb 13, 2025

Well, I was hoping that this might open the way to allow this sort of functionality for people who don't know about SVG. In particular, you can embed one SVG within another, so

my_legend = some_library.generate_svg_legend(**params)
tree.draw_svg().embed(my_legend)

Otherwise, as you say, you would need to use a specific SVG library (I've never really used any, so the below is a rough guess). This certainly would require you to know how SVG works.

svg = tree.draw_svg()
obj = some_svg_lib.parse(svg)
my_legend = some_library.generate_svg_legend(**params)
obj.add_to_root(my_legend)  # or maybe you would embed both the `svg` and `my_legend` components inside another svg object?
display(SVG(obj.as_string))
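For comparison, the same idea can be sketched with the standard library's `xml.etree.ElementTree` rather than a dedicated SVG package. The function name `embed_with_elementtree` is hypothetical, and the sketch assumes the extra markup is a well-formed fragment. It parses the entire document, and any other namespaces used in the drawing (e.g. xlink) would need registering too if their prefixes are to be preserved.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialise SVG elements with a default namespace rather than "ns0:" prefixes
ET.register_namespace("", SVG_NS)


def embed_with_elementtree(svg, extra):
    """
    Insert `extra` (SVG markup) as the first child of the root <svg> element.
    """
    root = ET.fromstring(svg)
    # Wrap the fragment in a namespaced <g> so it parses as a single element
    group = ET.fromstring(f'<g xmlns="{SVG_NS}">{extra}</g>')
    root.insert(0, group)
    return ET.tostring(root, encoding="unicode")
```

The output is a plain string, so it would still need wrapping (e.g. in `IPython.display.SVG`) to render in a notebook.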

@hyanwong
Member Author

I mean, this isn't super important, but it would mean that (a) making nice docs is easier (especially as we wouldn't need to document how to import a specific external SVG library, which might become obsolete etc) and (b) I suspect that it would make it quite a lot easier to produce readable one-off plots in notebooks, e.g. like the sc2ts notebooks we are trying to make public.

It's a tiny bit like asking "should we have a title parameter when drawing an SVG tree, or simply say that people can add their own title by editing the SVG?" The latter is possible, but inconvenient enough that people don't actually bother to do so.

@jeromekelleher
Member

Sure, sounds good, just checking.

Would probably be easier to use the minidom parser than to be faddling around with event driven stuff, though.
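For reference, a minimal sketch of what the minidom route could look like, using only `xml.dom.minidom` from the standard library. The function name `embed_with_minidom` is hypothetical, and the sketch assumes an unprefixed `<svg>` root and a well-formed fragment. Unlike an event-driven approach, it parses the entire document.

```python
from xml.dom import minidom


def embed_with_minidom(svg, extra):
    """
    Insert the nodes in `extra` (SVG markup) at the start of the root <svg>.
    """
    doc = minidom.parseString(svg)
    root = doc.documentElement
    if root.tagName != "svg":
        raise ValueError("Root element is not <svg>")
    # Parse the fragment inside a temporary wrapper so it may hold several nodes
    wrapper = minidom.parseString(f"<g>{extra}</g>").documentElement
    # Insert in reverse so the fragment's nodes keep their original order
    for node in reversed(list(wrapper.childNodes)):
        root.insertBefore(doc.importNode(node, True), root.firstChild)
    return root.toxml()
```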

@hyanwong
Member Author

hyanwong commented Feb 13, 2025

> Would probably be easier to use the minidom parser than to be faddling around with event driven stuff, though.

Ah neat, I didn't know about that. But the html.parser approach means you don't need to parse the entire SVG first (you just want to grab the very first <svg> tag anyway). For really big drawings, it seems a bad idea to parse everything just to grab what is usually the first ~40 characters of the string. I can't immediately see how to do that with the minidom parser (maybe there is a way, though). However, I see you can use parser = XMLPullParser(['start']) to do this with the XML parser, which is much neater:

from xml.etree.ElementTree import XMLPullParser

class SVGString(str):
    ...
    def embed(self, svg_to_embed):
        """
        Embed additional svg markup at the start of the svg string (immediately after the initial
        <svg ...> tag). This could include standard tags, or even a nested <svg> component.
        """
        parser = XMLPullParser(['start'])
        start = 0
        while start < len(self):
            end = self.find('>', start)  # feed up to potential end-of-tag into parser
            if end == -1:
                break
            parser.feed(self[start:(end + 1)])
            start = end + 1
            parser.flush()
            for event, elem in parser.read_events():
                if elem.tag == "svg" or elem.tag.endswith("}svg"):
                    return SVGString(self[:start] + svg_to_embed + self[start:])
        raise ValueError("No starting <svg> tag found in SVGString")

I'll also check with Ben whether adding a method to the tskit SVGString class is the best approach here. Another possibility would be to add (yet another) parameter to draw_svg, applying it right at the end of the draw_svg() methods in trees.py:

class Tree:
    ...
    def draw_svg(
        ...
        output = draw.drawing.tostring()
        if extra_svg is not None:
            first_tag = get_first_tag(output)
            output = first_tag + extra_svg + output[len(first_tag):]
        ...
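The `get_first_tag` helper in the snippet above is not defined anywhere in this thread. One possible implementation, reusing the pull-parser idea from earlier, is sketched below. It is an assumption rather than tskit code: it returns everything up to and including the opening `<svg ...>` tag, so that `output[len(first_tag):]` is the remainder of the document, and it guards the `flush()` call because that method is only present in newer Python releases.

```python
from xml.etree.ElementTree import XMLPullParser


def get_first_tag(svg):
    """
    Return the text of `svg` up to and including the opening <svg ...> tag,
    without parsing the rest of the document.
    """
    parser = XMLPullParser(["start"])
    start = 0
    while start < len(svg):
        end = svg.find(">", start)  # feed up to a potential end-of-tag
        if end == -1:
            break
        parser.feed(svg[start:end + 1])
        start = end + 1
        if hasattr(parser, "flush"):  # not available on older Python versions
            parser.flush()
        for _event, elem in parser.read_events():
            if elem.tag == "svg" or elem.tag.endswith("}svg"):
                return svg[:start]
    raise ValueError("No starting <svg> tag found")
```

Because the returned text is always a prefix of the input, the slicing in the `draw_svg` sketch above works even if something (such as an XML declaration) precedes the `<svg>` tag.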

@jeromekelleher
Member

I don't see anything wrong with adding another parameter here to draw_svg like preamble or something. ("extra" is very vague)
