So for context, I have an accent and speech recognition can't understand me very well. This is a challenge in Zoom meetings, where Otter.ai misses half of what I'm saying, and even with something as simple as voice commands for my TV. I want to, without reinventing the wheel, take an existing model similar to what Otter uses and give it extra training with my own labeled voice data (i.e., I speak into my microphone and supply the "answer" for what I said).
How would you go about doing this? I'm thinking I might start looking on Hugging Face, but the options are endless and I don't even know where to begin. Help please?
(This could go on my portfolio if successful, or I could even put it into production for others to train the model on their own voices/accents.)
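For reference, here's roughly how I'm collecting the labeled data: each recording saved as a .wav with a .txt transcript of the same name next to it, then bundled into a JSONL manifest, which is the kind of audio/text pairing most fine-tuning scripts consume. This is just a minimal sketch; the directory layout and field names ("audio", "text") are my own assumptions, not any particular toolkit's requirement:

```python
import json
from pathlib import Path

def build_manifest(recordings_dir, manifest_path):
    """Pair each .wav recording with its same-named .txt transcript and
    write one JSON object per line (a JSONL manifest).

    Assumed layout (hypothetical):
        recordings/clip001.wav
        recordings/clip001.txt   <- what I actually said
    """
    recordings_dir = Path(recordings_dir)
    entries = []
    for wav in sorted(recordings_dir.glob("*.wav")):
        txt = wav.with_suffix(".txt")
        if not txt.exists():
            continue  # skip recordings I haven't transcribed yet
        entries.append({
            "audio": str(wav),
            "text": txt.read_text(encoding="utf-8").strip(),
        })
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")
    return entries
```

Once the manifest exists, pointing a fine-tuning script at it is a separate step that depends on whichever model you pick.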
EDIT: I opted to try using Azure's built-in model and retrain it with my voice data instead of building something from scratch.
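For anyone following the same route: to my understanding, Azure's Custom Speech takes human-labeled training data as a zip of .wav files plus a trans.txt mapping each filename to its transcript (tab-separated, one line per file). Here's a small sketch that packages my recordings that way; the trans.txt layout is my reading of the docs, so double-check it against Azure's current data requirements before uploading:

```python
import zipfile
from pathlib import Path

def package_for_custom_speech(recordings_dir, zip_path):
    """Bundle .wav recordings and a trans.txt into one zip.

    Assumed trans.txt format (verify against Azure's docs):
        clip001.wav<TAB>turn on the tv
    Recordings without a matching .txt transcript are skipped.
    """
    recordings_dir = Path(recordings_dir)
    lines = []
    with zipfile.ZipFile(zip_path, "w") as zf:
        for wav in sorted(recordings_dir.glob("*.wav")):
            txt = wav.with_suffix(".txt")
            if not txt.exists():
                continue
            zf.write(wav, arcname=wav.name)  # flat layout inside the zip
            lines.append(f"{wav.name}\t{txt.read_text(encoding='utf-8').strip()}")
        zf.writestr("trans.txt", "\n".join(lines) + "\n")
    return lines
```

The resulting zip is what gets uploaded as a training dataset in the Speech Studio portal (or via its API).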