FS (2018-01-25): Here is how the new flow of data from China works.
A. Recording station
News shows are recorded on sands, a Raspberry Pi with a Joker-TV tuner and an 8TB hard drive in a server room at Hunan Normal University. The large hard drive gives it room to record and store video for a long time: at the current rate of 20GB a day, it fills in roughly 400 days, a little over a year.
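The retention estimate can be checked with a few lines of shell. This is only a rough sketch: it assumes decimal units (1TB = 1000GB) and ignores filesystem overhead, and the function name retention_days is made up for illustration.

```shell
#!/bin/sh
# Rough retention estimate: how many whole days of recordings fit on the drive.
# Assumption: decimal units (1 TB = 1000 GB), no filesystem overhead.
retention_days() {
    capacity_gb=$1   # drive capacity in GB
    daily_gb=$2      # recording volume in GB per day
    echo $(( capacity_gb / daily_gb ))
}

retention_days 8000 20   # prints 400
```

If the daily volume grows, say to 30GB, the same call shows the retention dropping to 266 days.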
Jacek provides the broadcast schedule on vila in Poland; it's transferred to cartago, and sands picks it up there once a day with xmltv-download. The schedule script then reprograms the crontabs automatically between 7am and 8am every day.
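As an illustration of what reprogramming the crontabs amounts to, the sketch below turns one schedule entry into one crontab line. It is not the actual schedule script: the helper name cron_line and the arguments passed to channel (a channel name and a duration in minutes) are hypothetical, and the real script's interface may differ.

```shell
#!/bin/sh
# cron_line START COMMAND...
# Print a crontab entry that runs COMMAND every day at START (HH:MM).
cron_line() {
    start=$1; shift
    hour=${start%%:*}      # text before the colon
    minute=${start##*:}    # text after the colon
    # Drop one leading zero so cron gets plain decimal fields (07 -> 7, 00 -> 0).
    hour=${hour#0}
    minute=${minute#0}
    printf '%s %s * * * %s\n' "$minute" "$hour" "$*"
}

# Hypothetical example: record a 30-minute show starting at 19:00.
cron_line 19:00 channel CCTV-1 30   # prints: 0 19 * * * channel CCTV-1 30
```

Lines generated this way, one per show in the day's schedule, could then be installed with crontab.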
We currently record local, regional, and national news, typically 5-10 shows a day, with the script channel (currently channel_joker_2018-01-11.sh). The script also extracts the particular stream we want from the multi-stream file that Joker records.
Finally it calls the script check-cc-joker, which adds header information and copies the file to xingfu.
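For the stream extraction step, one plausible approach is to let ffmpeg select a single program from the multi-program transport stream and copy it without re-encoding. The sketch below only prints such a command. Whether the channel script does it this way is an assumption, and the file names, the program number 101, and the helper name extract_cmd are placeholders.

```shell
#!/bin/sh
# extract_cmd INPUT PROGRAM_ID OUTPUT
# Print an ffmpeg command that copies one program out of a multi-program
# transport stream. "-map 0:p:ID" selects all streams of program ID in the
# first input; "-c copy" copies them without re-encoding.
# Note: file names are not quoted, so this sketch assumes names without spaces.
extract_cmd() {
    printf 'ffmpeg -i %s -map 0:p:%s -c copy %s\n' "$1" "$2" "$3"
}

# Placeholder names and program number, for illustration only.
extract_cmd multi.ts 101 news.ts
```

The program numbers present in a recording can be listed with ffprobe before choosing one.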
B. File processing
The virtual machine xingfu, provided to us for exclusive use and with root access, attempts to extract captions, currently to no effect.
It then rescales the h264 transport stream to simplify a complex mix of display aspect ratio and sample aspect ratio; the resulting files play fine in any player.
The audio is transcoded to AAC. All of this is done by the script mpg2mp4-bulk, which cron starts a few times each day.
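The rescaling idea can be sketched as follows: compute the width the picture would have with square pixels, then ask ffmpeg to scale to that size, reset the sample aspect ratio to 1, and encode the audio as AAC. This is not the content of mpg2mp4-bulk. The helper names, the 720x576 input with a 16:15 sample aspect ratio, and the choice of libx264 for the video are assumptions made for the example.

```shell
#!/bin/sh
# square_width WIDTH SAR_NUM SAR_DEN
# Width in square pixels, rounded down to an even number (H.264 encoders
# generally expect even dimensions).
square_width() {
    w=$(( $1 * $2 / $3 ))
    echo $(( w / 2 * 2 ))
}

# transcode_cmd INPUT WIDTH HEIGHT OUTPUT
# Print an ffmpeg command that rescales the video, sets SAR to 1:1,
# re-encodes the video with libx264 and the audio with AAC.
transcode_cmd() {
    printf 'ffmpeg -i %s -vf scale=%s:%s,setsar=1 -c:v libx264 -c:a aac %s\n' \
        "$1" "$2" "$3" "$4"
}

# Assumed example: 720x576 storage size with a 16:15 sample aspect ratio.
transcode_cmd in.ts "$(square_width 720 16 15)" 576 out.mp4
# prints: ffmpeg -i in.ts -vf scale=768:576,setsar=1 -c:v libx264 -c:a aac out.mp4
```

With square pixels, the display aspect ratio is simply width over height, which is why the resulting files are easier for players to handle.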
For these operations, we use a custom-built ffmpeg: