[Feature Request] TQDM Integration #190

iwr-redmond · 2024-12-07T03:43:32Z

Description

Long-running processes such as file downloads and generations are common in ML applications. They are often logged to the console rather than the UI, which does not help when the end user is in app mode or accessing a web app remotely.

Rio provides a ProgressBar but not a connection between this UI component and potential workloads.

I suggest integrating Rio with TQDM.

Alternatives

No response

Additional Context

Leveraging the Plot component might be easier, but risks leaving ProgressBar warming up leftover snow.
Graphical progress meters are under-represented in existing frameworks (although they are used), with CLI counters being much more typical. Notably, Gradio's tqdm support is comparatively recent.

Related Issues/Pull Requests

No response

mad-moo · 2024-12-07T07:11:03Z

What would the API for this look like for the user? Can you post a minimal example for what you have in mind?

And yes, definitely integrate the progressbar rather than drawing one in a plot

iwr-redmond · 2024-12-07T09:22:54Z

A few options, in pseudocode because I'm more hat than cattle today.

Rely on the user to get most of it right:

downloader, output = async MyTqdmAsyncDownloader("https://huggingface.co/repository/large_file.safetensors")
rio.ProgressBar(progress=output)

Assume the user might need some guidance:

class MonitoredProgress(ProgressBar):
     key = "preloader"
     async_function = MyDownloadFunction("https://huggingface.co/repository/large_file.safetensors")
     progress_type = percentage # or range, or steps, etc. - predefined for tqdm use scenarios

Provide some helper functions - might be going over the top a bit:

async def my_function_steps(rio.Queue) = (
    rio.Helpers.Download(https://huggingface.co/repository/large_file.zip, local_file)
    my_hash_check_function(local_file)
    unzip_to_someplace() 
) # rio.Queue = basic wrapper around tqdm.asyncio

class MonitoredOverall(rio.ProgressBar):
     key = "provisioning_steps_counter"
     type = QueueMonitor(my_function_steps)

class MonitoredSteps(rio.ProgressBar):
    key = "provisioning_steps_progress"
    type = StepMonitor(my_function_steps)

Aran-Fey · 2024-12-07T11:43:58Z

I don't really understand your examples, I'm afraid. They're a bit too pseudo for me, and they look quite different from how rio currently works.

This is how ProgressBars are currently used:

import asyncio
import rio

class ProgressDemo(rio.Component):
    progress: float = 0
    
    async def start_working(self):
        for p in range(10):
            await asyncio.sleep(1)
            self.progress = (p + 1) / 10
            await self.force_refresh()
    
    def build(self):
        return rio.Column(
            rio.Button('start working', on_press=self.start_working),
            rio.ProgressBar(self.progress),
        )

How would you write this code if tqdm integration existed? What do you want to see changed?

iwr-redmond · 2024-12-07T16:08:11Z

Please add that example to the documentation page. It's helpful. Also, my code is at best good for government work (c.f. warming leftover snow).

What I think is needed is two things:

1. Easier ways to transfer progress information from the backend to the frontend. Looking at the landscape of ML apps and research demo web interfaces, this is mostly not done well and often just plain ignored. If Rio is to have a well-functioning desktop component, the CLI with its basic feedback mechanisms will not be present once the app enters into production.
2. Basic protection against a non-async function taking over the application and causing it to freeze during a long process. This may exist already, but if not it would be good to take the opportunity to create a helper function that invites users to take advantage of useful features (easy progress bar) in order to reduce this risk and encourage best practices.
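
Both points can be sketched with the standard library alone. In the sketch below, a blocking (non-async) workload runs in a worker thread via asyncio.to_thread so the event loop stays responsive, and each progress value is handed back to the loop with call_soon_threadsafe. The names blocking_workload, run_with_progress and on_progress are hypothetical and are not part of Rio or tqdm.

```python
import asyncio
import time
from typing import Callable


def blocking_workload(steps: int, report: Callable[[float], None]) -> str:
    """Stand-in for a synchronous, long-running job (hypothetical)."""
    for i in range(steps):
        time.sleep(0.01)  # pretend to do heavy, blocking work
        report((i + 1) / steps)  # fraction complete, 0.0 to 1.0
    return "done"


async def run_with_progress(steps: int, on_progress: Callable[[float], None]) -> str:
    """Run the blocking job in a thread and deliver progress on the event loop."""
    loop = asyncio.get_running_loop()

    def report(fraction: float) -> None:
        # Called from the worker thread: hand the value over to the loop thread.
        loop.call_soon_threadsafe(on_progress, fraction)

    result = await asyncio.to_thread(blocking_workload, steps, report)
    # Defensive: yield once so any progress callbacks still queued get to run.
    await asyncio.sleep(0)
    return result
```

In a component like the ProgressDemo example above, on_progress would be the place to assign the new value to the progress attribute.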

To answer your question then.

First the original function that would have existed anyway. I'll pick on OmniGen's scheduler class:

    def __call__(self, z, func, model_kwargs, use_kv_cache: bool=True, offload_kv_cache: bool=True):
        num_tokens_for_img = z.size(-1)*z.size(-2) // 4
        if isinstance(model_kwargs['input_ids'], list):
            cache = [OmniGenCache(num_tokens_for_img, offload_kv_cache) for _ in range(len(model_kwargs['input_ids']))] if use_kv_cache else None
        else:
            cache = OmniGenCache(num_tokens_for_img, offload_kv_cache) if use_kv_cache else None
        results = {}
        for i in tqdm(range(self.num_steps)):
            timesteps = torch.zeros(size=(len(z), )).to(z.device) + self.sigma[i]
            pred, cache = func(z, timesteps, past_key_values=cache, **model_kwargs)
            sigma_next = self.sigma[i+1]
            sigma = self.sigma[i]
            z = z + (sigma_next - sigma) * pred
            if i == 0 and use_kv_cache:
                num_tokens_for_img = z.size(-1)*z.size(-2) // 4
                if isinstance(cache, list):
                    model_kwargs['input_ids'] = [None] * len(cache)
                else:
                    model_kwargs['input_ids'] = None

                model_kwargs['position_ids'] = self.crop_position_ids_for_cache(model_kwargs['position_ids'], num_tokens_for_img)
                model_kwargs['attention_mask'] = self.crop_attention_mask_for_cache(model_kwargs['attention_mask'], num_tokens_for_img)

        del cache
        torch.cuda.empty_cache()  
        gc.collect()
        return z

This does not use tqdm.asyncio, and asyncio itself isn't used anywhere in the code (nor in Rypo's excellent quantization and speed PRs). The authors have rightly focused on making their core code work, and the UI is a secondary if not tertiary affair. Because tqdm is called exactly once as a convenience, everything is outsourced to that package, which brings both risks (what if future changes don't work well?) and opportunities (rio can take the process over instead).

To help with these matters, I suggest a rio wrapper function to make tqdm more accessible and reproducible. It would include a bunch of sane defaults, like using tqdm.asyncio, and probably some exploration of the logging helper function. Critically, it would also provide a few basic options as to what type of counting is going to occur:

Percent
Range
Steps

These would allow the correct tqdm formatting to be applied and then transferred to the frontend. Gradio's version of this sort of integration function is here.

An end user taking advantage of this integration would then switch out L162 of the scheduler (the tqdm loop line above):

for i in rio.ProgressMonitor(range(self.num_steps), key="omni_inference", counter="steps"):

Or (if a percentage is wanted in the UI):

for i in rio.ProgressMonitor(range(self.num_steps), key="omni_inference", counter="percent"):
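
As a rough illustration of what such a wrapper might do internally, here is a standalone sketch that does not depend on Rio or tqdm. It wraps an iterable, passes the items through unchanged, and publishes a formatted counter under a key. The names progress_monitor, format_counter and PROGRESS are hypothetical, and the exact meaning given to "percent", "range" and "steps" is an assumption about the three counter types listed above.

```python
from typing import Dict, Iterable, Iterator, TypeVar

T = TypeVar("T")

# Hypothetical registry: monitor key -> latest human-readable counter text.
PROGRESS: Dict[str, str] = {}


def format_counter(done: int, total: int, counter: str) -> str:
    """Render progress as 'percent', 'range' or 'steps' text (assumed meanings)."""
    if counter == "percent":
        percent = 100 if total == 0 else 100 * done // total
        return f"{percent}%"
    if counter == "range":
        return f"{done}/{total}"
    if counter == "steps":
        return f"step {done} of {total}"
    raise ValueError(f"unknown counter type: {counter!r}")


def progress_monitor(items: Iterable[T], key: str, counter: str = "percent") -> Iterator[T]:
    """Yield items unchanged while publishing progress under `key`."""
    items = list(items)  # materialise so the total is known up front
    total = len(items)
    PROGRESS[key] = format_counter(0, total, counter)
    for done, item in enumerate(items, start=1):
        yield item
        # Runs when the caller asks for the next item, i.e. after its loop body.
        PROGRESS[key] = format_counter(done, total, counter)
```

A real integration would push each update to the frontend instead of writing to a module-level dictionary.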

Then finally on the frontend, something like:

from omnigen import scheduler
class ProgressDemo(rio.Component):
    
    async def start_working = OmniGenScheduler(my_ui_options_not_coded_here)
    
    def build(self):
        return rio.Column(
            rio.Button('start working', on_press=self.start_working),
            rio.ProgressBar(monitor_key="omni_inference"), # attach the particular progress bar to the particular rio.ProgressMonitor key established above
        )

I presumed above that rio.ProgressBar would remain as-is, and therefore invented subclasses for that purpose (the fictional "rio.MonitorXYZ"). An alternate approach might be to leave the original code unmodified and initiate the wrapper within the ProgressDemo component, along the lines of:

from omnigen import scheduler
class ProgressDemo(rio.Component):
    
    async def start_working = rio.ProgressMonitor(OmniGenScheduler(my_ui_options_not_coded_here),
           key = "omni_inference",
           counter = "percent",
    )
    
    def build(self):
        return rio.Column(
            rio.Button('start working', on_press=self.start_working),
            rio.ProgressBar(monitor_key="omni_inference"),
        )

On the 'monitor_key' property, bear in mind that an app could have multiple async processes needing to be monitored, such as an external download of a large file and local inference, or a series of inferences/downloads and a counter of the total progress.
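
To make the multiple-process case concrete, the following sketch runs two simulated workloads concurrently, each publishing under its own key, and derives a combined figure for an overall bar. Everything here (progress_by_key, monitored_job, overall_progress, and the key names) is hypothetical. Averaging the jobs equally is just one possible way to define total progress.

```python
import asyncio
from typing import Dict

# Hypothetical shared store: monitor key -> fraction complete (0.0 to 1.0).
progress_by_key: Dict[str, float] = {}


async def monitored_job(key: str, steps: int, delay: float) -> None:
    """Simulated async workload that publishes progress under its own key."""
    progress_by_key[key] = 0.0
    for i in range(steps):
        await asyncio.sleep(delay)  # stand-in for a download chunk or inference step
        progress_by_key[key] = (i + 1) / steps


def overall_progress() -> float:
    """Mean of all monitored jobs, usable for a 'total progress' bar."""
    if not progress_by_key:
        return 0.0
    return sum(progress_by_key.values()) / len(progress_by_key)


async def main() -> None:
    # Two independent workloads run concurrently without sharing a counter.
    await asyncio.gather(
        monitored_job("checkpoint_download", steps=5, delay=0.01),
        monitored_job("omni_inference", steps=3, delay=0.01),
    )
```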

As an aside, I accidentally suggested a tqdm.asyncio wrapper for external file downloads as I was thinking through the pseudo example. This would also be a worthwhile helper if there is going to be a progress monitor function of some kind, as it lends itself nicely to common use cases (user downloads the software, which then downloads a bunch of large checkpoints).
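
A download helper of this kind could start from the reporthook mechanism of the standard library's urllib.request.urlretrieve, which calls the hook with (block_num, block_size, total_size). The sketch below only converts those arguments into a fraction. make_reporthook and on_progress are hypothetical names, and passing None for an unknown size is an assumed convention, not a Rio API.

```python
from typing import Callable, Optional


def make_reporthook(on_progress: Callable[[Optional[float]], None]):
    """Build a urllib.request.urlretrieve-style reporthook.

    urlretrieve calls the hook as hook(block_num, block_size, total_size);
    total_size can be -1 when the size of the file is not reported.
    """

    def hook(block_num: int, block_size: int, total_size: int) -> None:
        if total_size <= 0:
            on_progress(None)  # unknown size: the UI could show an indeterminate bar
            return
        # The last block can overshoot the total, so clamp to 1.0.
        fraction = min(block_num * block_size / total_size, 1.0)
        on_progress(fraction)

    return hook
```

It would be passed as urllib.request.urlretrieve(url, filename, reporthook=make_reporthook(callback)). Because urlretrieve blocks, an async app would need to run it off the event loop, for example in a worker thread.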

Aran-Fey · 2024-12-07T18:40:32Z

Thanks for the detailed response. I'm still struggling to wrap my head around this (particularly since I'm unfamiliar with both tqdm and gradio), but it's becoming clearer.

Considering that I forgot to guard against the user pressing the "start working" button twice (which would lead to two async functions fighting over the same progress bar), it's fair to say that we should try to make ProgressBars more intuitive to use.
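
For reference, the double-press problem can be guarded against by hand with a simple flag. The sketch below is a plain-Python stand-in rather than a real rio.Component. GuardedWorker, runs_started and press_twice are hypothetical names used only for illustration.

```python
import asyncio


class GuardedWorker:
    """Hypothetical stand-in for a component with a 'start working' handler."""

    def __init__(self) -> None:
        self.progress = 0.0
        self.is_running = False
        self.runs_started = 0

    async def start_working(self) -> None:
        if self.is_running:
            return  # a second press while busy is ignored
        self.is_running = True
        self.runs_started += 1
        try:
            for p in range(5):
                await asyncio.sleep(0.01)  # simulate work
                self.progress = (p + 1) / 5
        finally:
            self.is_running = False  # re-enable even if the work raised


async def press_twice() -> GuardedWorker:
    worker = GuardedWorker()
    # Simulate two near-simultaneous button presses.
    await asyncio.gather(worker.start_working(), worker.start_working())
    return worker
```

This works, but it is exactly the kind of bookkeeping that is easy to forget, which is the argument for a more intuitive built-in mechanism.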

Not sure how tqdm would be integrated with this, but here are some ideas I had:

(1) A dedicated task class:

class ExampleTask(rio.Task):
    async def run(self):
        # Simulate some work
        for p in range(10):
            await asyncio.sleep(1)
            self.progress = (p + 1) / 10


class ProgressDemo(rio.Component):
    current_task: ExampleTask | None = None

    async def start_task(self):
        self.current_task = ExampleTask()

    def build(self):
        return rio.Column(
            rio.Button(
                'start task',
                on_press=self.start_task,
                is_sensitive=self.current_task is None,
            ),
            rio.ProgressBar(self.current_task),
        )

(2) A decorator on the component method:

class ProgressDemo(rio.Component):
    @rio.task
    async def do_stuff(self):
        # Simulate some work
        for p in range(10):
            await asyncio.sleep(1)
            yield (p + 1) / 10

    def build(self):
        return rio.Column(
            rio.Button(
                'start task',
                on_press=self.do_stuff.start,
                is_sensitive=not self.do_stuff.is_running,
            ),
            rio.ProgressBar(self.do_stuff),
        )
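
To show that the generator-based idea is workable in plain asyncio, here is a minimal sketch of a helper that drives an async generator yielding progress fractions and exposes start, is_running and progress. GeneratorTask is a hypothetical name. It is not how rio.task or rio.Task would necessarily be implemented, since neither exists in Rio at the time of this discussion.

```python
import asyncio
from typing import AsyncIterator, Callable, Optional


class GeneratorTask:
    """Hypothetical helper: drives an async generator that yields progress fractions."""

    def __init__(self, make_gen: Callable[[], AsyncIterator[float]]) -> None:
        self._make_gen = make_gen
        self._task: Optional[asyncio.Task] = None
        self.progress = 0.0

    @property
    def is_running(self) -> bool:
        return self._task is not None and not self._task.done()

    def start(self) -> asyncio.Task:
        """Start the work unless it is already running (needs a running event loop)."""
        if not self.is_running:
            self.progress = 0.0
            self._task = asyncio.create_task(self._consume())
        return self._task

    async def _consume(self) -> None:
        async for fraction in self._make_gen():
            self.progress = fraction  # a UI layer would refresh the bar here


async def do_stuff() -> AsyncIterator[float]:
    for p in range(4):
        await asyncio.sleep(0.01)  # simulate some work
        yield (p + 1) / 4


async def demo() -> GeneratorTask:
    task = GeneratorTask(do_stuff)
    first = task.start()
    second = task.start()  # ignored: the already-running asyncio.Task is returned
    assert first is second
    await first
    return task
```

Because start refuses to launch a second run while one is active, the double-press collision is handled in one place instead of in every component.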

iwr-redmond · 2024-12-07T19:04:39Z

I think (1) is closer to my layman's understanding of how Rio is structured. A rio.Task fundamental component to manage whatever comes up (sending progress to the UI, avoiding collisions, etc.) also seems highly valuable.

The Gradio reference is primarily to demonstrate that CLI -> UI transfer without matplotlib is plausible. The internals of tqdm always make me feel about as sharp as a mashed potato, so you're in good company there. If I may suggest, try experimenting with the wget sample because it shows two different ways of formatting the wget function: the sample TqdmUpTo class for a progress bar, which actually is not needed for Rio, and wrapattr, which shows only some numbers during file transfer (too little but a useful comparison). I suggested specific text deliverables (percent, range, steps) because tqdm.asyncio can be configured to provide these during function execution, and that significantly narrows down the amount of customization/integration work that needs to be done. More counting/progress reporting methods can always be added later.

iwr-redmond · 2024-12-17T02:20:03Z

There is a useful sample in issue 660, which converts the TQDM stats to strings that can be parsed.

That would provide something like:

stats = str(nested)  # from the sample code

percent = float(stats.split("%")[0])  # assuming the standard tqdm display format, with no description prefix
self.progress = percent / 100  # send the new count to the frontend as a fraction between 0 and 1
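
A slightly more defensive version of the same string-parsing idea, using only the standard library, might look like the sketch below. It assumes tqdm's default bar layout (percentage, bar, then done/total with plain integer counts). parse_tqdm_line is a hypothetical helper, and the sample strings in the comments are illustrative rather than captured output.

```python
import re
from typing import Optional, Tuple

# Matches the default tqdm bar text, e.g. " 40%|####      | 4/10 [00:02<00:03,  2.00it/s]".
# An optional "description: " prefix before the percentage is tolerated.
_BAR = re.compile(r"(?P<percent>\d+)%\|.*?\|\s*(?P<done>\d+)/(?P<total>\d+)")


def parse_tqdm_line(line: str) -> Optional[Tuple[float, int, int]]:
    """Return (fraction, done, total), or None if the line is not a progress bar."""
    match = _BAR.search(line)
    if match is None:
        return None
    done, total = int(match["done"]), int(match["total"])
    return int(match["percent"]) / 100, done, total
```

Where the tqdm object itself is accessible, reading its n and total attributes directly avoids string parsing altogether.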

mad-moo · 2024-12-17T06:11:40Z

I don't think the implementation is the problem. After looking at Aran-Fey's code, the integration just doesn't help a whole lot. It saves just a handful of lines of code, but in return makes everything more complex.

I still like the idea, but IMHO this only makes sense if we can come up with an easier way for users to actually access it.

iwr-redmond · 2024-12-18T08:52:34Z

this only makes sense if we can come up with an easier way for users to actually access it

Hallelujah to that!

iwr-redmond · 2024-12-27T02:09:28Z

Wolfgang Fahl's NiceGUI wrapper adds TQDM to NiceGUI in a simpler manner than above. Example usage here.

iwr-redmond added the enhancement Improves on existing functionality - NOT a new feature label Dec 7, 2024

iwr-redmond mentioned this issue Feb 2, 2025

Forms Generation #211

Open
