[Feature request] Allow the use of logging instead of print #3729

Closed
christophertubbs opened this issue May 10, 2024 · 4 comments
Labels
feature request feature requests for making TTS better. wontfix This will not be worked on but feel free to help.

Comments

@christophertubbs

🚀 Feature Description

The print function is used in several places, most noticeably (to me) in utils.synthesizer.Synthesizer.tts, with lines like:

        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This is great when messing around, but it'd be nice to have the option to use different types of loggers (or even just the root logger). For instance, if I have a distributed application, I could have this write to something that sends the messages through a pubsub setup so that another application can read and interpret the output in real time.

Solution

utils.synthesizer.Synthesizer's signature can be changed to look like:

    def __init__(
        self,
        tts_checkpoint: str = "",
        tts_config_path: str = "",
        tts_speakers_file: str = "",
        tts_languages_file: str = "",
        vocoder_checkpoint: str = "",
        vocoder_config: str = "",
        encoder_checkpoint: str = "",
        encoder_config: str = "",
        vc_checkpoint: str = "",
        vc_config: str = "",
        model_dir: str = "",
        voice_dir: str = None,
        use_cuda: bool = False,
        logger: logging.Logger = None
    ) -> None:

and the tts function can look like:

    if self.__logger:
        self.__logger.info(f" > Processing time: {process_time}")
        self.__logger.info(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

A Protocol for the logger might work better than just the hint of logging.Logger - it'd allow programmers to put in some wackier functionality, such as writing non-loggers that just so happen to have a similar signature.
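To make the Protocol idea concrete, here is a minimal sketch. The names SupportsInfo, ListLogger and report_timing are hypothetical and exist only for illustration; report_timing stands in for the end of Synthesizer.tts and is not the library's code.

```python
import logging
from typing import Any, List, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsInfo(Protocol):
    """Hypothetical protocol: anything exposing a logger-like ``info`` method."""

    def info(self, msg: str, *args: Any, **kwargs: Any) -> None: ...


class ListLogger:
    """A non-logger that happens to match the protocol; it collects messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def info(self, msg: str, *args: Any, **kwargs: Any) -> None:
        # Mirror logging's %-style formatting when positional args are supplied.
        self.messages.append(msg % args if args else msg)


def report_timing(
    process_time: float,
    audio_time: float,
    logger: Optional[SupportsInfo] = None,
) -> None:
    """Stand-in for the end of Synthesizer.tts; prints when no logger is given."""
    lines = (
        f" > Processing time: {process_time}",
        f" > Real-time factor: {process_time / audio_time}",
    )
    for line in lines:
        if logger is not None:
            logger.info(line)
        else:
            print(line)


# Both a real logging.Logger and the collector satisfy the protocol.
collector = ListLogger()
report_timing(2.0, 4.0, logger=collector)
report_timing(2.0, 4.0, logger=logging.getLogger("tts-sketch"))
```

With a structural type like this, callers are not forced to subclass logging.Logger, and omitting the argument keeps the current print behaviour.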

Alternative Solutions

An alternative solution would be to pass the writing function to tts itself, something like:

    def tts(
        self,
        text: str = "",
        speaker_name: str = "",
        language_name: str = "",
        speaker_wav=None,
        style_wav=None,
        style_text=None,
        reference_wav=None,
        reference_speaker_name=None,
        split_sentences: bool = True,
        logging_function: typing.Callable[[str], typing.Any] = None,
        **kwargs,
    ) -> List[int]:

    ...
   
    if logging_function:
        logging_function(f" > Processing time: {process_time}")
        logging_function(f" > Real-time factor: {process_time / audio_time}")
    else:
        print(f" > Processing time: {process_time}")
        print(f" > Real-time factor: {process_time / audio_time}")

This will enable code like:

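    def output_sound(text: str, output_path: pathlib.Path, connection: Redis):
        from TTS.api import TTS
        speech_model = TTS(DEFAULT_MODEL).to("cpu")
        # logging_function is the proposed keyword argument. Note that Redis.publish takes
        # (channel, message), so in practice it would need a wrapper that fixes the channel.
        speech_model.tts_to_file(text=text, speaker="p244", file_path=str(output_path), logging_function=connection.publish)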

Additional context

I don't believe that utils.synthesizer.Synthesizer.tts is the only place that uses the standard print function. A consistent solution should be applied wherever it appears.

The parameter for the logging functionality will need to be passed through the objects and functions that lead to the current print statements. For instance, TTS.api.TTS.tts_to_file would require a logging_function parameter if it were to pass the function to self.synthesizer.tts within the tts function.
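A rough sketch of that pass-through, under the assumption that each layer simply forwards the callable. SketchSynthesizer and SketchTTS are hypothetical stand-ins for utils.synthesizer.Synthesizer and TTS.api.TTS, and the "synthesis" is a placeholder, so this shows only the plumbing.

```python
import time
from typing import Any, Callable, List, Optional

LogFunction = Callable[[str], Any]


class SketchSynthesizer:
    """Hypothetical stand-in for utils.synthesizer.Synthesizer."""

    def tts(self, text: str = "", logging_function: Optional[LogFunction] = None) -> List[int]:
        # Fall back to print so callers that pass nothing see no change.
        emit = logging_function if logging_function is not None else print
        start = time.perf_counter()
        wav = [0] * len(text)  # placeholder for real synthesis output
        process_time = time.perf_counter() - start
        emit(f" > Processing time: {process_time}")
        return wav


class SketchTTS:
    """Hypothetical stand-in for TTS.api.TTS."""

    def __init__(self) -> None:
        self.synthesizer = SketchSynthesizer()

    def tts(self, text: str, logging_function: Optional[LogFunction] = None) -> List[int]:
        # Forward the callable unchanged to the synthesizer.
        return self.synthesizer.tts(text=text, logging_function=logging_function)

    def tts_to_file(
        self,
        text: str,
        file_path: str,
        logging_function: Optional[LogFunction] = None,
    ) -> str:
        self.tts(text, logging_function=logging_function)
        # A real implementation would write the audio to file_path here.
        return file_path


captured: List[str] = []
SketchTTS().tts_to_file("hello", "out.wav", logging_function=captured.append)
```

Every public entry point gains one optional keyword argument, which is the cost of this approach compared with storing a logger once on the object.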

The general vibe of the solutions I've provided will make sure that pre-existing code behaves no differently, making the new functionality purely opt-in.

I haven't written anything using a progress bar like the one that this uses, so I can't speak to that, aside from the fact that it might need to be excluded.

@christophertubbs christophertubbs added the feature request feature requests for making TTS better. label May 10, 2024
@eginhard
Contributor

In our fork (pip install coqui-tts) all prints have been switched to Python logging. Feel free to try it out and let us know if it works for you. (also a duplicate of #1691)
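Once output goes through Python logging, the pubsub use case from the original request can be covered with a standard logging.Handler. The following is a minimal sketch using only the standard library; CallbackHandler and attach_callback are hypothetical names, and the logger name "TTS" is an assumption about how the fork names its loggers, not something confirmed in this thread.

```python
import logging
from typing import Any, Callable, List


class CallbackHandler(logging.Handler):
    """Forward each formatted log record to an arbitrary callable."""

    def __init__(self, callback: Callable[[str], Any], level: int = logging.NOTSET) -> None:
        super().__init__(level)
        self.callback = callback

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self.callback(self.format(record))
        except Exception:
            # Let logging's standard error handling deal with callback failures.
            self.handleError(record)


def attach_callback(logger_name: str, callback: Callable[[str], Any]) -> CallbackHandler:
    """Attach a CallbackHandler to the named logger and enable INFO messages."""
    logger = logging.getLogger(logger_name)
    handler = CallbackHandler(callback)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return handler


received: List[str] = []
# "TTS" is an assumed logger name; adjust it to whatever the package actually uses.
attach_callback("TTS", received.append)
# Simulate a message from a child logger; it propagates up to the "TTS" handler.
logging.getLogger("TTS.utils.synthesizer").info(" > Processing time: %s", 1.5)
```

For Redis, the callback could be something like `lambda message: connection.publish("tts-log", message)`, where the channel name is a placeholder.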

@christophertubbs
Author

Thanks for your work there! I wasn't aware that TTS was essentially dead here when I posted. Is there anything I need to know when migrating over to your version?

@eginhard
Contributor

No, there aren't any major changes and you can use it in the same way.

stale bot commented Jun 19, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Jun 19, 2024
@stale stale bot closed this as completed Jul 18, 2024

2 participants