As we discussed earlier, data delivered by the EUMETCast service is compressed. Before it can be read by NWCSAF-GEO, a decompression step needs to be performed.
We will now write a script and include it in our crontab so that new files entering our compressed_Sat_data directory are regularly decompressed and moved into the Sat_data directory for NWCSAF-GEO to process.
First, let's navigate to our scripts directory and create a new script called decompress.sh
cd $SAFNWC/scripts
vim decompress.sh
Now add the following lines to create the decompression script. This script assumes that you have downloaded and successfully built the xRITDecompress executable in your safnwc directory, as discussed in the data preparation section.
#!/bin/bash

# Source and destination directories, plus the status files used to
# compare the available prologue and epilogue time slots
src=$SAFNWC/import/compressed_Sat_data
dest=$SAFNWC/import/Sat_data
EPI_status_file=$SAFNWC/scripts/EPI_status_file.txt
PRO_status_file=$SAFNWC/scripts/PRO_status_file.txt

cd ${src}

# Only act if both prologue and epilogue files are present
if [[ -n "$(find ${src}/. -type f -name "*PRO*")" && -n "$(find ${src}/. -type f -name "*EPI*")" ]];
then
    # Extract the time slot (seventh dash-separated field) from each file name
    EPI_list=$(find ${src}/. -type f -name "*EPI*" -exec basename {} \; | awk -F'-' '{print $7}')
    PRO_list=$(find ${src}/. -type f -name "*PRO*" -exec basename {} \; | awk -F'-' '{print $7}')

    # comm requires sorted input, so sort the lists before writing them out
    printf "%s\n" $EPI_list | sort > ${EPI_status_file}
    printf "%s\n" $PRO_list | sort > ${PRO_status_file}

    # Time slots present in both lists are complete and safe to decompress
    decomp_list=$( comm -1 -2 ${EPI_status_file} ${PRO_status_file} )

    for tim in ${decomp_list};
    do
        # Decompress every segment for this time slot, move the decompressed
        # output (ending in "-__") to the destination, then remove the originals
        find ${src}/. -type f -name "*${tim}*" -exec $SAFNWC/PublicDecompWT-master/xRITDecompress/xRITDecompress {} \;
        mv *${tim}*-__ ${dest}
        rm *${tim}*
    done

    rm ${EPI_status_file}
    rm ${PRO_status_file}
fi
This script looks in the compressed_Sat_data directory that we created earlier and decompresses the files it contains using xRITDecompress. It is slightly more involved than a simple loop, however, which helps to avoid some problems later on.
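The time slot is pulled out of each file name with awk: the names are dash-separated, with the time slot in the seventh field. A minimal sketch, using a made-up HRIT-style file name for illustration:

```shell
# Made-up HRIT-style file name; real names follow the same dash-separated
# pattern, with the time slot as the seventh field.
name="H-000-MSG3__-MSG3________-_________-EPI______-201701011200-__"
echo "$name" | awk -F'-' '{print $7}'
# prints 201701011200
```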
As a first step, the source and destination directories are defined, along with two status files (one for epilogue files and one for prologue files); the status files are named at this point but not yet created. The script then changes to the source directory.
The script then checks whether there are any prologue and epilogue files in the source directory; if not, no action is taken. If files of this type are present, a list of the time slots associated with each file type is generated. These lists are written to the two status files so that they can be compared using the comm command, which produces a list of the time slots present in both lists. This matters because we do not want to decompress files that are still in the process of being moved (or downloaded, in the case of an operational setup). The time slots that have both a prologue and an epilogue file are then decompressed and moved to the destination directory, and the original compressed files are deleted.
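To see why comm gives exactly the intersection of the two lists, here is a self-contained sketch using made-up time slots (comm expects its inputs to be sorted):

```shell
# -1 suppresses lines unique to the first file, -2 lines unique to the
# second, leaving only the lines common to both: the time slots that
# have both a prologue and an epilogue file.
EPI_tmp=$(mktemp)
PRO_tmp=$(mktemp)
printf "%s\n" 201701011200 201701011215 201701011230 | sort > "$EPI_tmp"
printf "%s\n" 201701011215 201701011230 201701011245 | sort > "$PRO_tmp"
comm -1 -2 "$EPI_tmp" "$PRO_tmp"   # prints 201701011215 and 201701011230
rm -f "$EPI_tmp" "$PRO_tmp"
```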
Remember to make this script executable before trying to run it.
chmod +x decompress.sh
To automate this process, we have to add a line to our crontab. Open the crontab using
crontab -e
and add the following line, making sure that the path to the script is correct (we need to give the full path here because cron does not use the same environment variables as those specified for you as a user).
* * * * * /bin/bash -c "/full/path/to/scripts/decompress.sh"
As you might now be able to tell, this crontab entry executes the decompression script every minute, and it will keep doing so until you remove the entry from your crontab. Don't worry that the script is already running even though we are not yet ready to run NWCSAF-GEO: it only does anything when the appropriate files are present in the source directory, so for the moment it is called and exits almost immediately.
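If you want to keep an eye on what the script is doing, one option (not required for the tutorial) is to redirect its output to a log file in the crontab entry itself; the log path below is just an example and should be adjusted to your setup:

```shell
* * * * * /bin/bash -c "/full/path/to/scripts/decompress.sh" >> /full/path/to/scripts/decompress.log 2>&1
```

The `2>&1` sends error messages to the same log file, which makes it much easier to spot problems such as a wrong path to xRITDecompress.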