CCExtractor Development

CCAligner: Allow passing raw text transcript

CCAligner uses subtitles to perform a "guided search" for speech in the input audio file. Sometimes, however, only a raw text transcript is available: plain text containing the spoken words, with no timing information or formatting. CCAligner should be able to handle such input, at least for the -transcribe parameter.

Allow a text transcript to be passed directly instead of subtitles. The grammar files should be generated from this file. This task requires at least some understanding of how CCAligner works, so it is recommended to study the program before attempting it.
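As a rough illustration of the first step such a feature might need, the sketch below splits a raw transcript into normalised words that could later feed grammar generation. The function name tokenizeTranscript is hypothetical and not part of CCAligner; the actual grammar-generation code in CCAligner is not shown here and may expect a different input form.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not CCAligner's API): split raw transcript text
// into lowercase words. Letters, digits and apostrophes are kept;
// every other character is treated as a word separator.
std::vector<std::string> tokenizeTranscript(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch) || ch == '\'') {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            // A separator ends the word collected so far.
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);  // flush the final word
    }
    return words;
}
```

This sketch assumes ASCII input in the default "C" locale; multi-byte UTF-8 characters would be treated as separators, so a real implementation would likely need proper Unicode handling.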

For this task, add a new parameter, -txt, through which the user passes a raw text file to CCAligner. When this mode is chosen, do not allow normal word-level synchronisation; allow only complete timed transcription (the -transcribe parameter).
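The restriction described above could be enforced while parsing the command line. The following is a minimal sketch, not CCAligner's actual argument parser: the AlignerOptions struct and parseOptions function are hypothetical, and it assumes that -txt is followed by a file path and that -transcribe is detected by its presence alone. Any other tokens are ignored here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical option holder; the names are illustrative only.
struct AlignerOptions {
    std::string transcriptPath;  // set by -txt <path> (assumed form)
    bool transcribe = false;     // set when -transcribe is present
};

// Parse the arguments and reject -txt unless -transcribe is also given.
AlignerOptions parseOptions(const std::vector<std::string>& args) {
    AlignerOptions opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-txt") {
            if (i + 1 >= args.size()) {
                throw std::invalid_argument("-txt requires a file path");
            }
            opts.transcriptPath = args[++i];  // consume the path
        } else if (args[i] == "-transcribe") {
            opts.transcribe = true;
        }
        // All other tokens are ignored in this sketch.
    }
    if (!opts.transcriptPath.empty() && !opts.transcribe) {
        throw std::invalid_argument(
            "-txt is only supported together with -transcribe");
    }
    return opts;
}
```

Failing early with a clear message keeps the raw-text mode from silently falling back to word-level synchronisation, which needs the timing information that only subtitles provide.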

Task tags

  • refactoring
  • parameter
  • ccaligner
  • c++

Students who completed this task

Hemang Rajvanshy

Task type

  • Code

2017