Text makes up a large part of the files that users of digital technology create: Word or PDF documents, emails, mobile phone text messages (SMS) and web pages, for example. Being able to compress text for storage or transmission is therefore extremely important. Fortunately, files containing mainly text can be compressed significantly.
As with image compression, many algorithms and methods have been devised to do this. One important point about text compression is that it must use a lossless method. This means the method must not discard any data when it compresses the data. If it did, the data would be incomplete when it was uncompressed. To quote an article in Wikipedia to do with data compression:
Points to note from this quote:
The images below show compression settings for the 7z program. The program took the XHTML mark-up that made up the Wikipedia data compression page and compressed it from 116KB, when it was a text file, to a 19KB compressed 7z file. That is a compression ratio of about 6.11 (116/19 ≈ 6.11); put another way, the compressed version is about 16.4% (19/116 x 100 ≈ 16.4) of the size of the original. See below.
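The arithmetic behind these two figures can be checked with a few lines of Python. This is only a sketch: the function name compression_stats is made up for illustration, and the sizes are the ones quoted above.

```python
def compression_stats(original_size, compressed_size):
    """Return (ratio, percent_of_original) for two sizes given in the same unit."""
    ratio = original_size / compressed_size            # how many times smaller
    percent_of_original = compressed_size / original_size * 100
    return ratio, percent_of_original


# Sizes in KB, taken from the 7z example in the text.
ratio, percent = compression_stats(116, 19)
print(f"ratio: {ratio:.2f}, compressed size: {percent:.1f}% of original")
# prints: ratio: 6.11, compressed size: 16.4% of original
```

The two numbers describe the same result from opposite directions: the ratio is the original size divided by the compressed size, and the percentage is the compressed size divided by the original size.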
Note that 7z uses the LZMA compression method by default, but others can be used. Files can be saved in various formats, such as the popular zip format, and the compression level can be specified. Lastly, an important point: zip programs can be used to 'zip up' any type of data, not just text. For example, the XHTML of the Wikipedia page mentioned above, copied and saved as a Word document, measures 221KB (due to the extra formatting information added by Word); compressed, it measures 31.2KB. The images below show some of the compression settings that the 7z program offers. Note that, at bottom right, the compressed file can be password protected using AES-256 encryption. With a strong password, this is virtually impossible to break.
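To get a feel for lossless compression and for the choice of method and level, the following sketch uses Python's standard lzma and zlib modules. It rests on some assumptions: the sample text is invented for illustration, Python's lzma module writes the .xz container rather than a 7z archive, and zlib implements DEFLATE, the method most commonly used inside zip files. The sizes it prints will not match the 7z figures quoted above.

```python
import lzma
import zlib

# Hypothetical sample: repetitive mark-up standing in for a saved web page.
sample = ("<p>Text is a very big part of most files.</p>\n" * 500).encode("utf-8")

# Compress the same bytes with two methods, both at their highest level.
lzma_packed = lzma.compress(sample, preset=9)
deflate_packed = zlib.compress(sample, level=9)

# Lossless means the round trip returns exactly the original bytes.
assert lzma.decompress(lzma_packed) == sample
assert zlib.decompress(deflate_packed) == sample

for name, packed in (("LZMA", lzma_packed), ("DEFLATE", deflate_packed)):
    percent = len(packed) / len(sample) * 100
    print(f"{name}: {len(sample)} -> {len(packed)} bytes ({percent:.1f}% of original)")
```

Both functions accept any bytes, which reflects the point that a zip program is not limited to text. Highly repetitive input like this sample compresses far better than typical documents, so the percentages printed here should not be read as representative.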