Compute-efficient Real-time Voice Cloning

By

Matthew Raffel and Micah Janzen

A voice cloning machine learning (ML) model takes a speech sample and a text input and produces new speech that reads the text in the voice of the speaker. Our project aims to both speed up processing and reduce the computational resources needed to run a voice cloning ML model, so that the model can be deployed on a low-end system. The project builds on a pre-existing machine learning toolkit repository designed to improve the productivity of machine learning engineers. By integrating a modified ML model with updated sub-models into the existing toolkit, we gain access to an improved training and evaluation environment that is accessible to a broader audience. Once the updated model is complete, it can be deployed on a low-end system for user interaction. The user peripherals consist of a miniature button keyboard with an attachable display and a microphone for user input, and a volume-adjusted amplifier for the cloned voice output.


Artifacts

Name	Description	Access
Executive Summary	Project summary and motivation for the design, including the project timeline.	Download
Project Document	Documentation of the entire Compute-efficient Real-time Voice Cloning project design process.	Download
Project Summary Video	A summary video of the Compute-efficient Real-time Voice Cloning project.	Link
GitHub Repository	GitHub repository for the project code.	Link
