Huffman codes in java
Learn Latest Tutorials
Preparation
Trending Technologies
B.Tech / MCA
Javatpoint Services
JavaTpoint offers too many high quality services. Mail us on h[email protected], to get more information about given services.
- Website Designing
- Website Development
- Java Development
- PHP Development
- WordPress
- Graphic Designing
- Logo
- Digital Marketing
- On Page and Off Page SEO
- PPC
- Content Development
- Corporate Training
- Classroom and Online Training
- Data Entry
Training For College Campus
JavaTpoint offers college campus training on Core Java, Advance Java, .Net, Android, Hadoop, PHP, Web Technology and Python. Please mail your requirement at [email protected].
Duration: 1 week to 2 week
Like/Subscribe us for latest updates or newsletter
Huffman.java
Below is the syntax highlighted version of Huffman.java from §5.5 Data Compression.
/****************************************************************************** * Compilation: javac Huffman.java * Execution: java Huffman - < input.txt (compress) * Execution: java Huffman + < input.txt (expand) * Dependencies: BinaryIn.java BinaryOut.java * Data files: https://algs4.cs.princeton.edu/55compression/abra.txt * https://algs4.cs.princeton.edu/55compression/tinytinyTale.txt * https://algs4.cs.princeton.edu/55compression/medTale.txt * https://algs4.cs.princeton.edu/55compression/tale.txt * * Compress or expand a binary input stream using the Huffman algorithm. * * % java Huffman - < abra.txt | java BinaryDump 60 * 010100000100101000100010010000110100001101010100101010000100 * 000000000000000000000000000110001111100101101000111110010100 * 120 bits * * % java Huffman - < abra.txt | java Huffman + * ABRACADABRA! * ******************************************************************************/ /** * The @code Huffman> class provides static methods for compressing * and expanding a binary input using Huffman codes over the 8-bit extended * ASCII alphabet. * * For additional documentation, * see href="https://algs4.cs.princeton.edu/55compression">Section 5.5 of * Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne. * * @author Robert Sedgewick * @author Kevin Wayne */ public class Huffman // alphabet size of extended ASCII private static final int R = 256; // Do not instantiate. private Huffman() > // Huffman trie node private static class Node implements ComparableNode> private final char ch; private final int freq; private final Node left, right; Node(char ch, int freq, Node left, Node right) this.ch = ch; this.freq = freq; this.left = left; this.right = right; > // is the node a leaf node? private boolean isLeaf() assert ((left == null) && (right == null)) || ((left != null) && (right != null)); return (left == null) && (right == null); > // compare, based on frequency public int compareTo(Node that) return this.freq - that.freq; > > /** * Reads a sequence of 8-bit bytes from standard input; compresses them * using Huffman codes with an 8-bit alphabet; and writes the results * to standard output. */ public static void compress() // read the input String s = BinaryStdIn.readString(); char[] input = s.toCharArray(); // tabulate frequency counts int[] freq = new int[R]; for (int i = 0; i input.length; i++) freq[input[i]]++; // build Huffman trie Node root = buildTrie(freq); // build code table String[] st = new String[R]; buildCode(st, root, ""); // print trie for decoder writeTrie(root); // print number of bytes in original uncompressed message BinaryStdOut.write(input.length); // use Huffman code to encode input for (int i = 0; i input.length; i++) String code = st[input[i]]; for (int j = 0; j code.length(); j++) if (code.charAt(j) == '0') BinaryStdOut.write(false); > else if (code.charAt(j) == '1') BinaryStdOut.write(true); > else throw new IllegalStateException("Illegal state"); > > // close output stream BinaryStdOut.close(); > // build the Huffman trie given frequencies private static Node buildTrie(int[] freq) // initialize priority queue with singleton trees MinPQ pq = new MinPQNode>(); for (char c = 0; c R; c++) if (freq[c] > 0) pq.insert(new Node(c, freq[c], null, null)); // merge two smallest trees while (pq.size() > 1) Node left = pq.delMin(); Node right = pq.delMin(); Node parent = new Node('\0', left.freq + right.freq, left, right); pq.insert(parent); > return pq.delMin(); > // write bitstring-encoded trie to standard output private static void writeTrie(Node x) if (x.isLeaf()) BinaryStdOut.write(true); BinaryStdOut.write(x.ch, 8); return; > BinaryStdOut.write(false); writeTrie(x.left); writeTrie(x.right); > // make a lookup table from symbols and their encodings private static void buildCode(String[] st, Node x, String s) if (!x.isLeaf()) buildCode(st, x.left, s + '0'); buildCode(st, x.right, s + '1'); > else st[x.ch] = s; > > /** * Reads a sequence of bits that represents a Huffman-compressed message from * standard input; expands them; and writes the results to standard output. */ public static void expand() // read in Huffman trie from input stream Node root = readTrie(); // number of bytes to write int length = BinaryStdIn.readInt(); // decode using the Huffman trie for (int i = 0; i length; i++) Node x = root; while (!x.isLeaf()) boolean bit = BinaryStdIn.readBoolean(); if (bit) x = x.right; else x = x.left; > BinaryStdOut.write(x.ch, 8); > BinaryStdOut.close(); > private static Node readTrie() boolean isLeaf = BinaryStdIn.readBoolean(); if (isLeaf) return new Node(BinaryStdIn.readChar(), -1, null, null); > else return new Node('\0', -1, readTrie(), readTrie()); > > /** * Sample client that calls @code compress()> if the command-line * argument is "-" an @code expand()> if it is "+". * * @param args the command-line arguments */ public static void main(String[] args) if (args[0].equals("-")) compress(); else if (args[0].equals("+")) expand(); else throw new IllegalArgumentException("Illegal command line argument"); > >
Huffman Code in Java
The Huffman coding is a data compression algorithm that creates a binary tree of nodes. The node can be either internal nodes or leaf nodes.
This tutorial describes and demonstrates the Huffman code with Java in detail.
Demonstrate the Use of Huffman Coding Algorithm in Java
The idea of the Huffman coding algorithm is to assign variable-length codes to input characters based on the frequencies of corresponding characters.
These codes are called the Prefix codes since the code given to each character is unique, which helps Huffman coding with decoding without any ambiguity.
We can build a Huffman tree using a priority queue in Java, where the node with the highest priority has the lowest frequency. We will follow the steps given below.
- First, create a leaf node for each character of the given text and add the nodes to the priority queue.
- If there is more than one node in a queue, remove two nodes with the lowest frequency and highest priority from that queue.
- Now, create a new node with two children nodes that were removed before, the frequency of the new node will be equal to the sum of both nodes’ frequencies. And then add that node to the priority queue.
- Finally, the remaining node will be the root node, and the tree will be completed.
Let’s see an example in Java to convert a text into Huffman coding.
The main class Huffman.java :
import java.util.Comparator; import java.util.HashMap; import java.util.Map; import java.util.PriorityQueue; class Huffman // Huffman Tree Traversing and storing the Huffman Codes in a dictionary. public static void encode_huffman(Huffman_Node root_node, String str, MapCharacter, String> huffman_Code) if (root_node == null) return; > // if the root node is a leaf node if (is_Leaf(root_node)) huffman_Code.put(root_node.charac, str.length() > 0 ? str : "1"); > encode_huffman(root_node.left, str + '0', huffman_Code); encode_huffman(root_node.right, str + '1', huffman_Code); > // Huffman Tree Traversing and decoding the encoded string public static int decode_huffman(Huffman_Node root_node, int index, StringBuilder sb) if (root_node == null) return index; > // if the root node is a leaf node if (is_Leaf(root_node)) System.out.print(root_node.charac); return index; > index++; root_node = (sb.charAt(index) == '0') ? root_node.left : root_node.right; index = decode_huffman(root_node, index, sb); return index; > // This function checks if Huffman Tree contains only one single node public static boolean is_Leaf(Huffman_Node root_node) return root_node.left == null && root_node.right == null; > // Main Huffman tree build function public static void Main_Build_HuffmanTree(String text) // Base case: empty string if (text == null || text.length() == 0) return; > // Calculate the frequency of each character and store it in a map of dict MapCharacter, Integer> frequency = new HashMap<>(); for (char c: text.toCharArray()) frequency.put(c, frequency.getOrDefault(c, 0) + 1); > // priority queue to store nodes of the Huffman tree // the highest priority item has the lowest frequency PriorityQueueHuffman_Node> prio_queue; prio_queue = new PriorityQueue<>(Comparator.comparingInt(l -> l.frequency)); // leaf node for each character, adding it to the priority queue. for (var entry: frequency.entrySet()) prio_queue.add(new Huffman_Node(entry.getKey(), entry.getValue())); > //repeat the process till there is more than one node in the queue while (prio_queue.size() != 1) // Then remove the two nodes with the highest priority and lowest frequency Huffman_Node left = prio_queue.poll(); Huffman_Node right = prio_queue.poll(); // Now create a new internal node with two children nodes, and the frequency will be the some of both nodes; add the new node to the priority queue. int sum = left.frequency + right.frequency; prio_queue.add(new Huffman_Node(null, sum, left, right)); > Huffman_Node root_node = prio_queue.peek(); // Huffman tree Traversing and storing the Huffman codes in a dict or map MapCharacter, String> huffmanCode = new HashMap<>(); encode_huffman(root_node, "", huffmanCode); // Display the Huffman codes System.out.println("The Huffman Codes for the given text are: " + huffmanCode); System.out.println("The original text is: " + text); // display the encoded string StringBuilder sb = new StringBuilder(); for (char c: text.toCharArray()) sb.append(huffmanCode.get(c)); > System.out.println("The encoded text is: " + sb); System.out.print("The decoded text is: "); if (is_Leaf(root_node)) // For input like a, aa, aaa, etc. while (root_node.frequency-- > 0) System.out.print(root_node.charac); > > else // Huffman Tree traversing with decoding the encoded string int index = -1; while (index sb.length() - 1) index = decode_huffman(root_node, index, sb); > > > // Call the Huffman code public static void main(String[] args) String text = "This is delftstack"; Main_Build_HuffmanTree(text); > >
The node class Huffman_Node.java :
import java.util.Comparator; import java.util.HashMap; import java.util.Map; import java.util.PriorityQueue; // A Tree node class Huffman_Node Character charac; Integer frequency; Huffman_Node left = null, right = null; Huffman_Node(Character charac, Integer frequency) this.charac = charac; this.frequency = frequency; > public Huffman_Node(Character charac, Integer frequency, Huffman_Node left, Huffman_Node right) this.charac = charac; this.frequency = frequency; this.left = left; this.right = right; > >
The first class is the main class that performs the operations of the Huffman coding algorithm, and the second class is for creating the nodes. The code will generate Huffman codes for a given text, the encoded text, and decode it.
The Huffman Codes for the given text are: The original text is: Hello This is delftstack.com The encoded text is: 011010000000001100010100111001011110010101111001010111011000000110111011001101111100101011010011111010110001110 The decoded text is: Hello This is delftstack.com
As we can see, the given text contains 25 characters which are 25×8 = 200 bits, and the encoded string is only 111 bits, almost 45% data compression. This data compression is the primary purpose of Huffman coding.
Sheeraz is a Doctorate fellow in Computer Science at Northwestern Polytechnical University, Xian, China. He has 7 years of Software Development experience in AI, Web, Database, and Desktop technologies. He writes tutorials in Java, PHP, Python, GoLang, R, etc., to help beginners learn the field of Computer Science.