#CS 2336 Project: Text File Compression
This project is submitted by Eric Dilmore, Ian Brown, and Mahder Negash as the final project for CS2336 section 001. This file can be found in a typeset form at this location: https://github.com/CS2336-Group/2336-project/blob/master/README.md
Our group implemented three text file compression algorithms and made a GUI wrapper for them. Arithmetic Encoding, Huffman Coding, and Run Length Encoding were all used in this project. In addition, the Burrows-Wheeler Transform (while not explicitly a compression algorithm) was implemented behind the scenes to assist in Run Length Encoding. Text files will be compressed to binary files with the extension .emi.
Type in the name of the file you want to process, either to encode or to decode. The filename can be either relative to the current directory or an absolute path. Use a plain text file for compression or a compressed file for decompression. Select the desired encoding algorithm from the dropdown menu and then click "Encode/Decode". The program will next ask you whether you want to encode or decode the file. Once you select, it will process the file and display a summary of the process.
Note that the Burrows-Wheeler Transform takes quite a long time, making Run Length Encoding take a long time. Each encoding and decoding takes place in its own thread, so the rest of the program shouldn't appear to lag. You will know that the processing is complete when the summary appears. Do not assume that there is an infinite loop if the code does not finish in a short time for long files.
Note also that the compression algorithms will filter out any characters that are not valid UTF-8 characters. Unfortunately, the Java Character class can only handle valid UTF characters. If File Filtering is specified, the program will filter out all non-alphanumeric characters and replaces generic whitespace with single spaces. If File Filtering is not specified, then it will only filter out invalid characters.
If you attempt to decode a file where an unencrypted file with the same base name exists, you will be prompted whether you want to overwrite or write to a modified filename. The program will then number your unencrypted file when it is output.
If you have ant installed, just run:
ant
in the root directory. The program should compile and run.
If you would like to simply build the program without running it, run:
ant build
If you would like to simple run the program without rebuilding it, run: ant run
Build the project with:
dir /s /B src/*.java > sources.txt
javac -d target @sources.txt
Then, to run, change directories to the target directory. Then run:
java compression.CompressionGUI
Build the project with:
find -name "src/*.java" >sources.txt
javac -d target @sources.txt
Then, to run, change directories to the target directory. Then run:
java compression.CompressionGUI
to run it.
- System Architect
- Arithmetic Coding
- GUI
- Master of Regular Expressions
- Burrows Wheeler Transform
- Run Length Encoding
- Panicked Commit Messages
- Being Awesome
- Huffman Coding