The Problem Statement
The client wanted a reliable audio tool to store, organize, and label audio samples obtained from various projects. The system also needed to prepare datasets for speech-to-text models and to add metadata such as flags, tags, and word mappings to the audio. Based on previous experience with KritiKal Solutions, the client engaged us again for this project.
The Solution
KritiKal Solutions leveraged its software expertise to build an audio tool that lets developers tag and manage audio files and supports their training and regression work.
The frontend is built in the Dart programming language and includes navigation controls for editing any specific part of the waveform. The backend is written in Golang. The tool fetches sound data and converts it into a waveform for visualization. Using speech-to-text algorithms, words are mapped to the audio waveform, which is in turn mapped to the audio itself. These mappings are used for further training and regression activities.
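To make the waveform and word-mapping steps concrete, here is a minimal Go sketch. It is not the tool's actual code: the names `WaveformPeaks`, `Peak`, `WordSpan`, and `SampleRange`, the min/max bucketing approach, and the assumption that samples are already decoded to floats in [-1, 1] are all illustrative choices.

```go
package main

import (
	"fmt"
	"math"
)

// Peak holds the lowest and highest sample value inside one display bucket.
// (Hypothetical type, introduced only for this sketch.)
type Peak struct {
	Min, Max float64
}

// WaveformPeaks reduces decoded samples to at most `buckets` min/max pairs,
// which is typically enough to draw a waveform overview.
func WaveformPeaks(samples []float64, buckets int) []Peak {
	if len(samples) == 0 || buckets <= 0 {
		return nil
	}
	if buckets > len(samples) {
		buckets = len(samples) // never create empty buckets
	}
	peaks := make([]Peak, buckets)
	for b := 0; b < buckets; b++ {
		start := b * len(samples) / buckets
		end := (b + 1) * len(samples) / buckets
		p := Peak{Min: math.Inf(1), Max: math.Inf(-1)}
		for _, s := range samples[start:end] {
			p.Min = math.Min(p.Min, s)
			p.Max = math.Max(p.Max, s)
		}
		peaks[b] = p
	}
	return peaks
}

// WordSpan is a recognized word with its start and end time in seconds,
// as a speech-to-text step might report it (assumed shape).
type WordSpan struct {
	Word       string
	Start, End float64
}

// SampleRange maps a word's time span to the half-open sample index range
// [first, last), clamped to the length of the recording.
func SampleRange(w WordSpan, sampleRate, totalSamples int) (first, last int) {
	clamp := func(i int) int {
		if i < 0 {
			return 0
		}
		if i > totalSamples {
			return totalSamples
		}
		return i
	}
	first = clamp(int(math.Round(w.Start * float64(sampleRate))))
	last = clamp(int(math.Round(w.End * float64(sampleRate))))
	return first, last
}

func main() {
	samples := []float64{0, 0.5, -0.5, 1, -1, 0.25, 0, 0}
	fmt.Println(WaveformPeaks(samples, 4))

	first, last := SampleRange(WordSpan{Word: "hello", Start: 0.5, End: 1.25}, 16000, 32000)
	fmt.Println(first, last)
}
```

Keeping the word spans in seconds and converting to sample indices on demand means the same mapping can drive both the waveform view and playback of the underlying audio.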
The audio analytics tool also acts as a central repository for storing and managing audio files. The broad idea behind the tool was to preserve the original information in the audio, add metadata useful for training and regression, and segregate data into meaningful projects without tampering with the data itself.
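One way to keep metadata separate from the audio, sketched here in Go, is to store annotations in their own record that refers to the audio by a checksum. This is an assumption for illustration, not a description of the tool's real schema: the `Annotation` and `WordMapping` types, their fields, and the use of SHA-256 are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// WordMapping ties one word to a time range in the audio (hypothetical shape).
type WordMapping struct {
	Word           string
	StartMs, EndMs int
}

// Annotation holds all metadata for one audio file. The audio bytes are never
// stored or modified here; they are referenced only by their checksum.
type Annotation struct {
	AudioSHA256 string
	Project     string
	Flags       []string
	Tags        []string
	Words       []WordMapping
}

// Checksum returns the hex-encoded SHA-256 digest of the raw audio bytes.
func Checksum(audio []byte) string {
	sum := sha256.Sum256(audio)
	return hex.EncodeToString(sum[:])
}

// NewAnnotation creates an empty annotation bound to the given audio bytes.
func NewAnnotation(audio []byte, project string) Annotation {
	return Annotation{AudioSHA256: Checksum(audio), Project: project}
}

// Matches reports whether the audio bytes are still identical to the ones
// the annotation was created for.
func (a Annotation) Matches(audio []byte) bool {
	return a.AudioSHA256 == Checksum(audio)
}

func main() {
	audio := []byte{0x52, 0x49, 0x46, 0x46} // placeholder bytes, not a real recording

	ann := NewAnnotation(audio, "demo-project")
	ann.Tags = append(ann.Tags, "noisy")
	ann.Words = append(ann.Words, WordMapping{Word: "hello", StartMs: 500, EndMs: 1250})

	fmt.Println("audio unchanged:", ann.Matches(audio))
}
```

Because flags, tags, and word mappings live outside the audio file, they can be edited or regrouped by project freely, and the checksum gives a simple check that the source audio was not altered.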
Benefits Delivered
The audio tool designed by KritiKal Solutions automatically extracts meaningful information from raw audio data. It helped the client efficiently compare and validate spoken words against predefined expected words and provided them with the most relevant information. The tool also serves as a central repository for audio data.
Technology Used
- Dart
- Golang
- Google App Engine
- Google Cloud Tasks