CCExtractor Development

Any language: Write a program that produces a histogram of subtitle frames per minute from an .srt file

This is simpler than it sounds :-)

One user mentioned that he had written a small tool for himself, with this description:

make a histogram in one-minute increments of the number of lines in a subtitle. The main purpose was to easily see if there were long gaps that might indicate missing captions. It is programmed in Excel VBA, so it isn't widely usable. The functionality may exist in subtitle software already but I haven't found it.

As you can see, he wrote it in Excel VBA (obviously he's not a professional programmer), so it can't really be used from scripts, for example, or on any system where Excel is not available.

But it is a good idea, so let's implement it as a command line tool that can be run on any system and from scripts.

You can use any language you want, but Python seems ideal because there are plenty of libraries for .srt and everything else, and it's available on all systems now. But if you are more comfortable with PHP, C++, etc., that's fine too.

The output must be a simple table with 3 columns: The minute, the count of subtitle frames, and a graphical representation of the count (using a + for each occurrence).

For example:

Minute 1    3    +++
Minute 2    1    +
Minute 3 
Minute 4    5    +++++

There must also be a header with the name of the input file, and a footer with the total number of frames.

Remember that because it has to work from scripts, the name of the input file must be a command line parameter. Don't prompt for it. The output goes to the terminal.
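
To make the requirements concrete, here is one possible sketch in Python that uses only the standard library. It is not a reference solution, and several details are assumptions the task does not specify: the function names (`count_per_minute`, `render_report`, `main`), the wording of the header and footer, tab-separated columns, printing 0 for minutes with no frames, and treating "Minute 1" as the span from 00:00:00 to 00:00:59. A frame is counted in the minute in which it starts.

```python
import re
import sys
from collections import Counter

# Start time of an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
# Accepting "." as the millisecond separator is an assumption for leniency.
CUE_START = re.compile(
    r"^[ \t]*(\d+):(\d{2}):(\d{2})[,.]\d{1,3}\s*-->", re.MULTILINE
)


def count_per_minute(srt_text):
    """Return a list where item i is the number of frames starting in minute i + 1."""
    counts = Counter()
    for hours, minutes, _seconds in CUE_START.findall(srt_text):
        counts[int(hours) * 60 + int(minutes)] += 1
    if not counts:
        return []
    # Fill the gaps with zeros so that empty minutes still get a row.
    return [counts[m] for m in range(max(counts) + 1)]


def render_report(filename, counts):
    """Build the header, one row per minute, and the footer as a single string."""
    lines = [f"File: {filename}"]  # header wording is an assumption
    for minute, n in enumerate(counts, start=1):
        lines.append(f"Minute {minute}\t{n}\t{'+' * n}")
    lines.append(f"Total frames: {sum(counts)}")  # footer wording is an assumption
    return "\n".join(lines)


def main(argv):
    """Take the input file from the command line; never prompt for it."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} FILE.srt", file=sys.stderr)
        return 2
    # utf-8-sig drops a leading byte order mark if the file has one.
    with open(argv[1], encoding="utf-8-sig", errors="replace") as handle:
        text = handle.read()
    print(render_report(argv[1], count_per_minute(text)))
    return 0
```

To run this as a script, call `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard and invoke it as, for example, `python srt_histogram.py movie.srt` (both file names are hypothetical). Keeping the counting and the rendering in separate functions makes it easy to swap the regular expression for a dedicated .srt library later.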

Task tags

  • python
  • php
  • math
  • srt
  • histogram

Students who completed this task

Parth Pratim, Aadi Bajpai, saltyJeff, Al Mao, Harry Yu

Task type

  • Code