使用哈夫曼演算法對檔案進行解壓和壓縮
前言:筆者是一名在讀大二的學生,用哈夫曼演算法對檔案進行壓縮和解壓是老師布置的實驗課作業,碼這個程式花了很多時間,但是我也在其中學到了很多東西,寫這種類似于工程類和寫演算法的程式感覺有很大的不一樣,
目錄
- 使用哈夫曼演算法對檔案進行解壓和壓縮
- part1 原理
- part2 壓縮代碼實作
- 1.統計檔案字符出現次數
- 2.建立哈夫曼樹
- 3.對出現的字符編碼
- 4.按照對檔案編碼
- 5.把字串里的01串轉換
- 6.存入檔案前的準備作業
- 7.寫入檔案
- part3.解壓代碼實作
- 1.讀檔案
- 2.還原哈夫曼樹
- 3.還原檔案
- part4.全部代碼
- part5.寫在最后
part1 原理
對檔案內出現的字符進行統計,以出現的次數為關鍵字,建立哈夫曼樹,實作對字符的編碼,將字符轉換成一個01序列,因為一個字符占用一個位元組,8個bit, 而一個位元組可以存八位的01編碼,所以用位編碼替換檔案內的字符,既完成了檔案的壓縮,為了能夠解碼,我們要把哈夫曼樹和01序列一起存進檔案中,
👆👆哈夫曼樹
解碼時,我們將存進壓縮檔案里的哈夫曼樹還原,通過檔案里的01序列對哈夫曼樹進行查找,即可還原出原檔案,
part2 壓縮代碼實作
1.統計檔案字符出現次數
👇統計文本中字符出現頻率
int Count(string op, string path1, string path2, int tong[]) {
// TODO:統計檔案1字符出現頻率
// path1是源檔案,path2是目標檔案
ifstream instr(path1, ios::in | ios::binary);
unsigned int bytebuff = 0;
char ch;
while (instr.get(ch)) {
bytebuff =(int)(unsigned char)ch;
tong[bytebuff]++;
} //統計weight完成!
int LeafNumber = 0;
for (int i = 0; i <= 256; i++) {
if (tong[i] != 0)
LeafNumber++;
}
instr.close();
cerr << "Count Completed" << endl;
return LeafNumber;
}
👆使用get()一次從檔案輸入流里讀取一個位元組,相當于一個char型別,使用桶計數法統計字符出現次數,這里要用unsigned char,因為有些字符的編碼是大于127的,符號位為1,不適用unsigned讀出來會是負數,
2.建立哈夫曼樹
struct HuffmanNode {
int info; //存
int index;
int weight;
int parent; //存
int left;
int right;
char side; //存
string BinaryCode;
friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
return f1.weight > f2.weight;
}
};
👆哈夫曼樹的節點資訊
👇個人習慣喜歡把節點名和結構體指標換名字
typedef HuffmanNode Node;
typedef HuffmanNode *Tree;
創建哈夫曼樹:
int CreatHuffmanTree(int tong[], int LeafNumber, Node HuffmanTree[]) {
// TODO:創建哈夫曼樹
int k = 0;
priority_queue<Node, vector<Node>, greater<Node>> pq;
// Tree HuffmanTree = new Node[2 * LeafNumber - 1];
// 0--->LeafNumber - 1 是葉節點
// leafNumber--->2*LeafNumber - 2 是根節點
for (int i = 0; i <= 256; i++) {
if (tong[i]) {
HuffmanTree[k].info = i;
HuffmanTree[k].index = k;
HuffmanTree[k].left = HuffmanTree[k].right =
HuffmanTree[k].parent = -1;
HuffmanTree[k].weight = tong[i];
pq.push(HuffmanTree[k]);
k++;
}
}
int j = LeafNumber;
//通過優先佇列構建哈夫曼樹
while (pq.size() > 1) {
Node t1 = pq.top();
pq.pop();
Node t2 = pq.top();
pq.pop();
HuffmanTree[t1.index].parent = j;
HuffmanTree[t2.index].parent = j;
HuffmanTree[j].index = j;
HuffmanTree[j].parent = -1;
// cout<<HuffmanTree[j].index;
HuffmanTree[j].left = t1.index;
// cout<<HuffmanTree[j].left;
HuffmanTree[j].right = t2.index;
// cout<<HuffmanTree[j].right;
HuffmanTree[j].weight = t1.weight + t2.weight;
HuffmanTree[j].info = -127;
pq.push(HuffmanTree[j]);
j++;
}
j--;
Node HuffmanTreeHead = pq.top();
HuffmanTreeHead.parent = -127;
HuffmanTree[j] = HuffmanTreeHead;
HuffmanTree[j].info = -127;
pq.pop();
cerr << "Creat Huffman Tree Completed" << endl;
return 0;
}
這里使用了stl容器優先佇列priority_queue,優先佇列,其底層是用堆來實作的,隊首一定是當前佇列中優先級最高的那一個,
因為哈夫曼樹的節點不只有一個資訊, 所以要使用優先佇列對出現次數weight排序,要在結構體里多載運算子
friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
return f1.weight > f2.weight;
}
3.對出現的字符編碼
int GetCodeNode(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (HuffmanTree[i].info == -127)
continue;
int IndexForSearching = i;
HuffmanTree[i].BinaryCode = "";
int j = 0;
while (HuffmanTree[IndexForSearching].parent != -127) {
j = HuffmanTree[IndexForSearching].parent;
if (HuffmanTree[j].left == IndexForSearching)
HuffmanTree[i].BinaryCode += '0';
if (HuffmanTree[j].right == IndexForSearching)
HuffmanTree[i].BinaryCode += '1';
IndexForSearching = j;
}
reverse(HuffmanTree[i].BinaryCode.begin(),
HuffmanTree[i].BinaryCode.end());
}
cerr << "Get Node Code Completed" << endl;
return 0;
}
👆目的是為了在下一步對檔案進行編碼的時候比較簡便,但是這樣會比較慢
4.按照對檔案編碼
string Encode(string path1, Node HuffmanTree[], int LeafNumber) {
ifstream instr(path1, ios::in | ios::binary);
char ch;
unsigned int bytebuff = 0;
int bitmask = 0x80;
string HuffmanPath = "";
while (instr.get(ch)) {
bytebuff = (int)(unsigned char)ch;
int value = bytebuff;
for (int i = 0; i < LeafNumber; i++) {
if (HuffmanTree[i].info == value) {
HuffmanPath += HuffmanTree[i].BinaryCode;
break;
}
}
}
cerr << "Encode Completed" << endl;
return HuffmanPath;
}
👆按照檔案中的順序,把檔案的中的字符轉換為01序列保存到字串中,
5.把字串里的01串轉換
字串里的0和1是以字符來存盤的,一個位元組存一個0或1,通過位運算,把位元組里的八個位都存0和1,這樣一個位元組就可以存八個0和1
string SwitchStringToBinary(string HuffmanPath, int &Sign) {
// TODO:將字串里的01序列修改為bit
// 最后一個位元組要處理多余的0-->把0放后面
string BinaryPath = "";
int bytebuff = 0;
int shiftcount = 0;
for (int i = 0; i < HuffmanPath.size(); i++) {
bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
bytebuff <<= 1;
shiftcount++;
if (shiftcount == 8) {
bytebuff >>= 1;
BinaryPath += (char)bytebuff;
bytebuff = 0;
shiftcount = 0;
if (i == HuffmanPath.size() - 1)
break;
if (i + 8 > HuffmanPath.size()) {
i++;
while (i <= HuffmanPath.size() - 1) {
bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
bytebuff <<= 1;
shiftcount++;
i++;
}
bytebuff <<= 7 - shiftcount;
BinaryPath += (char)bytebuff;
}
}
}
Sign = 8 - shiftcount;
cerr << "Switch String To Binary Completed" << endl;
return BinaryPath;
}
👆這里寫的時候要注意,因為我們把字串的時候是八個八個的去讀,如果字串的長度 % 8 != 0,那么字串的末尾是湊不齊八位的,要特殊處理,這里我把有效的01串后面全填滿0,補齊8位,然后用一個Sign來標記末尾多添加了幾個0,把Sign一并存入到檔案當中,這樣解碼的時候就可以把最后多余的0給處理掉了,
6.存入檔案前的準備作業
這里我遇到的問題比較多,也是程式bug出現的主要原因,我們要盡可能的存入少的資訊,占用少的空間,把我們的哈夫曼樹給存入檔案,要存哪些資訊以便解碼的時候能夠完整的還原出來,在嘗試了許多種演算法以后(文末會介紹),寫出了無數多個bug,我使用了比較笨比的方法~:
存節點的 parent(父節點/母節點),side(子節點是父節點的左子節點還是右子節點),info(葉節點對應的ASCII碼)
👇獲取節點的side
int GetSide(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (i == HuffmanTree[HuffmanTree[i].parent].left)
HuffmanTree[i].side = 'l';
if (i == HuffmanTree[HuffmanTree[i].parent].right)
HuffmanTree[i].side = 'r';
}
return 0;
}
7.寫入檔案
int WriteToFile(string path2, Node HuffmanTree[], int LeafNumber,
string BinaryPath, int Sign) {
// path2是要寫的目標檔案
//打開二進制檔案輸出流
// ShowTable(HuffmanTree, LeafNumber);
LeafNumber -= 1;
ofstream outstr(path2, ios::binary);
outstr.write(reinterpret_cast<char *>(&Sign), sizeof(char));
outstr.write(reinterpret_cast<char *>(&LeafNumber), sizeof(char));
LeafNumber += 1;
for (int i = 0; i < LeafNumber; i++) {
//獲取要寫的內容的地址,轉換為char*
HuffmanTree[i].parent -= LeafNumber;
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].info),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
sizeof(char));
}
for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
HuffmanTree[i].parent -= LeafNumber;
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
sizeof(char));
}
for (int i = 0; i < BinaryPath.size(); i++) {
outstr.put(BinaryPath[i]);
}
outstr.close();
cerr << "Write To File Completed" << endl;
return 0;
}
把要寫的樹資訊和編碼一并寫入檔案當中,我在編碼頭寫入了我哈夫曼樹的葉節點個數LeafNumber,還有上文中提到的Sign,接著就是哈夫曼樹和檔案的編碼,
part3.解壓代碼實作
寫完壓縮以后,解壓基本上是一馬平川~,但是解壓會遇到很多問題,要去壓縮的代碼里面去改,
1.讀檔案
Tree ReadFile(string path1, unsigned int &LeafNumber, string &SearchPath,
int &Sign) {
// Sign是要刪去末尾的幾個0
ifstream instr(path1, ios::in | ios::binary);
SearchPath = "";
char ch;
instr.get(ch);
Sign = ch;
instr.get(ch);
LeafNumber = (int)(unsigned char)ch;
LeafNumber += 1;
Tree HuffmanTree = new Node[2 * LeafNumber - 1];
unsigned int num;
for (int i = 0; i < LeafNumber; i++) {
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].info = num;
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].parent = num + LeafNumber;
instr.get(ch);
HuffmanTree[i].side = ch;
//初始化其他資訊
HuffmanTree[i].BinaryCode = "";
HuffmanTree[i].index = i;
HuffmanTree[i].left = -1;
HuffmanTree[i].right = -1;
HuffmanTree[i].weight = -1;
}
for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].parent = num + LeafNumber;
instr.get(ch);
HuffmanTree[i].side = ch;
//初始化其他資訊
HuffmanTree[i].info = -10000;
HuffmanTree[i].BinaryCode = "";
HuffmanTree[i].index = i;
HuffmanTree[i].left = -1;
HuffmanTree[i].right = -1;
HuffmanTree[i].weight = -1;
}
int bitmask = 0x80;
while (instr.get(ch)) {
num = (int)(unsigned char)ch;
while (bitmask != 0) {
if ((bitmask & num) != 0) {
SearchPath += '1';
}
if ((bitmask & num) == 0) {
SearchPath += '0';
}
bitmask >>= 1;
}
bitmask = 0x80;
}
SearchPath.erase(SearchPath.end() - Sign, SearchPath.end());
instr.close();
cerr << "Read File Completed" << endl;
return HuffmanTree;
}
👆把整個壓縮檔案都讀完,開辟空間保存樹節點空間,把檔案的編碼部分保存到string方便使用,記得使用Sign把編碼末尾多余的0去掉,
2.還原哈夫曼樹
把剛剛存入線性表的節點建立父子(母子/父女/母女)關系
int BuiltHuffmanTree(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (HuffmanTree[i].side == 'l') {
HuffmanTree[HuffmanTree[i].parent].left = i;
}
if (HuffmanTree[i].side == 'r') {
HuffmanTree[HuffmanTree[i].parent].right = i;
}
}
cerr << "Built Huffman Tree Completed" << endl;
return 0;
}
3.還原檔案
根據字串中的01序列,在哈夫曼樹中查找,把找到的字符寫入檔案中就好啦!
int RestoreFile(string SearchPath, Node HuffmanTree[], string path2,
int LeafNumber) {
ofstream outstr(path2, ios::out | ios::binary);
int head = 2 * LeafNumber - 2, now = head;
for (int i = 0; i < SearchPath.size(); i++) {
if (SearchPath[i] == '0') {
now = HuffmanTree[now].left;
}
if (SearchPath[i] == '1') {
now = HuffmanTree[now].right;
}
if (HuffmanTree[now].left == -1 && HuffmanTree[now].right == -1) {
char res = HuffmanTree[now].info;
outstr.put(res);
now = head;
}
}
outstr.close();
cerr << "Restore File Completed" << endl;
return 0;
}
part4.全部代碼
// writen by spln
// spln@foxmail.com
#include <bits/stdc++.h>
using namespace std;
struct HuffmanNode {
int info; //存
int index;
int weight;
int parent; //存
int left;
int right;
char side; //存
string BinaryCode;
friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
return f1.weight > f2.weight;
}
};
typedef HuffmanNode Node;
typedef HuffmanNode *Tree;
int ShowHelp() {
cerr << "輸入錯誤,請按要求進行輸入:" << endl;
cerr << "-z/-x 檔案名1 檔案名2" << endl;
return 0;
}
class Compression {
public:
int CreatHuffmanTree(int tong[], int LeafNumber, Node HuffmanTree[]) {
// TODO:創建哈夫曼樹
int k = 0;
priority_queue<Node, vector<Node>, greater<Node>> pq;
// Tree HuffmanTree = new Node[2 * LeafNumber - 1];
// 0--->LeafNumber - 1 是葉節點
// laefNumber--->2*LeafNumber - 2 是根節點
for (int i = 0; i <= 256; i++) {
if (tong[i]) {
HuffmanTree[k].info = i;
HuffmanTree[k].index = k;
HuffmanTree[k].left = HuffmanTree[k].right =
HuffmanTree[k].parent = -1;
HuffmanTree[k].weight = tong[i];
pq.push(HuffmanTree[k]);
k++;
}
}
int j = LeafNumber;
//通過優先佇列構建哈夫曼樹
while (pq.size() > 1) {
Node t1 = pq.top();
pq.pop();
Node t2 = pq.top();
pq.pop();
HuffmanTree[t1.index].parent = j;
HuffmanTree[t2.index].parent = j;
HuffmanTree[j].index = j;
HuffmanTree[j].parent = -1;
// cout<<HuffmanTree[j].index;
HuffmanTree[j].left = t1.index;
// cout<<HuffmanTree[j].left;
HuffmanTree[j].right = t2.index;
// cout<<HuffmanTree[j].right;
HuffmanTree[j].weight = t1.weight + t2.weight;
HuffmanTree[j].info = -127;
pq.push(HuffmanTree[j]);
j++;
}
j--;
Node HuffmanTreeHead = pq.top();
HuffmanTreeHead.parent = -127;
HuffmanTree[j] = HuffmanTreeHead;
HuffmanTree[j].info = -127;
pq.pop();
cerr << "Creat Huffman Tree Completed" << endl;
return 0;
}
int GetCodeNode(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (HuffmanTree[i].info == -127)
continue;
int IndexForSearching = i;
HuffmanTree[i].BinaryCode = "";
int j = 0;
while (HuffmanTree[IndexForSearching].parent != -127) {
j = HuffmanTree[IndexForSearching].parent;
if (HuffmanTree[j].left == IndexForSearching)
HuffmanTree[i].BinaryCode += '0';
if (HuffmanTree[j].right == IndexForSearching)
HuffmanTree[i].BinaryCode += '1';
IndexForSearching = j;
}
reverse(HuffmanTree[i].BinaryCode.begin(),
HuffmanTree[i].BinaryCode.end());
}
cerr << "Get Node Code Completed" << endl;
return 0;
}
string Encode(string path1, Node HuffmanTree[], int LeafNumber) {
ifstream instr(path1, ios::in | ios::binary);
char ch;
unsigned int bytebuff = 0;
int bitmask = 0x80;
string HuffmanPath = "";
while (instr.get(ch)) {
bytebuff = (int)(unsigned char)ch;
int value = bytebuff;
for (int i = 0; i < LeafNumber; i++) {
if (HuffmanTree[i].info == value) {
HuffmanPath += HuffmanTree[i].BinaryCode;
break;
}
}
}
cerr << "Encode Completed" << endl;
return HuffmanPath;
}
int ShowTable(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 1; i++) {
cout << "i:" << i << endl;
cout << HuffmanTree[i].index << "<-index" << endl;
cout << HuffmanTree[i].info << "<-info" << endl;
cout << HuffmanTree[i].side << "<-side" << endl;
cout << HuffmanTree[i].left << "<-left" << endl;
cout << HuffmanTree[i].right << "<-right" << endl;
cout << HuffmanTree[i].parent << "<-parent" << endl;
cout << HuffmanTree[i].weight << "<-weight" << endl;
cout << HuffmanTree[i].BinaryCode << "<-code" << endl;
}
return 0;
}
int Count(string op, string path1, string path2, int tong[]) {
// TODO:統計檔案1字符出現頻率
// path1是源檔案,path2是目標檔案
ifstream instr(path1, ios::in | ios::binary);
unsigned int bytebuff = 0;
char ch;
while (instr.get(ch)) {
bytebuff =(int)(unsigned char)ch;
tong[bytebuff]++;
} //統計weight完成!
int LeafNumber = 0;
for (int i = 0; i <= 256; i++) {
if (tong[i] != 0)
LeafNumber++;
}
instr.close();
cerr << "Count Completed" << endl;
return LeafNumber;
}
//把哈夫曼樹存到FinalOutputString
string SwitchStringToBinary(string HuffmanPath, int &Sign) {
// TODO:將字串里的01序列修改為bit
// 最后一個位元組要處理多余的0-->把0放后面
string BinaryPath = "";
int bytebuff = 0;
int shiftcount = 0;
for (int i = 0; i < HuffmanPath.size(); i++) {
bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
bytebuff <<= 1;
shiftcount++;
if (shiftcount == 8) {
bytebuff >>= 1;
BinaryPath += (char)bytebuff;
bytebuff = 0;
shiftcount = 0;
if (i == HuffmanPath.size() - 1)
break;
if (i + 8 > HuffmanPath.size()) {
i++;
while (i <= HuffmanPath.size() - 1) {
bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
bytebuff <<= 1;
shiftcount++;
i++;
}
bytebuff <<= 7 - shiftcount;
BinaryPath += (char)bytebuff;
}
}
}
Sign = 8 - shiftcount;
cerr << "Switch String To Binary Completed" << endl;
return BinaryPath;
}
int GetSide(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (i == HuffmanTree[HuffmanTree[i].parent].left)
HuffmanTree[i].side = 'l';
if (i == HuffmanTree[HuffmanTree[i].parent].right)
HuffmanTree[i].side = 'r';
}
return 0;
}
int WriteToFile(string path2, Node HuffmanTree[], int LeafNumber,
string BinaryPath, int Sign) {
// path2是要寫的目標檔案
//打開二進制檔案輸出流
// ShowTable(HuffmanTree, LeafNumber);
LeafNumber -= 1;
ofstream outstr(path2, ios::binary);
outstr.write(reinterpret_cast<char *>(&Sign), sizeof(char));
outstr.write(reinterpret_cast<char *>(&LeafNumber), sizeof(char));
LeafNumber += 1;
for (int i = 0; i < LeafNumber; i++) {
//獲取要寫的內容的地址,轉換為char*
HuffmanTree[i].parent -= LeafNumber;
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].info),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
sizeof(char));
}
for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
HuffmanTree[i].parent -= LeafNumber;
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
sizeof(char));
outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
sizeof(char));
}
for (int i = 0; i < BinaryPath.size(); i++) {
outstr.put(BinaryPath[i]);
}
outstr.close();
cerr << "Write To File Completed" << endl;
return 0;
}
};
class Decompression {
public:
int ShowTable(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 1; i++) {
cout << "i:" << i << endl;
cout << HuffmanTree[i].index << "<-index" << endl;
cout << HuffmanTree[i].info << "<-info" << endl;
cout << HuffmanTree[i].side << "<-side" << endl;
cout << HuffmanTree[i].left << "<-left" << endl;
cout << HuffmanTree[i].right << "<-right" << endl;
cout << HuffmanTree[i].parent << "<-parent" << endl;
}
return 0;
}
Tree ReadFile(string path1, unsigned int &LeafNumber, string &SearchPath,
int &Sign) {
// Sign是要刪去末尾的幾個0
ifstream instr(path1, ios::in | ios::binary);
SearchPath = "";
char ch;
instr.get(ch);
Sign = ch;
instr.get(ch);
LeafNumber = (int)(unsigned char)ch;
LeafNumber += 1;
Tree HuffmanTree = new Node[2 * LeafNumber - 1];
unsigned int num;
for (int i = 0; i < LeafNumber; i++) {
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].info = num;
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].parent = num + LeafNumber;
instr.get(ch);
HuffmanTree[i].side = ch;
//初始化其他資訊
HuffmanTree[i].BinaryCode = "";
HuffmanTree[i].index = i;
HuffmanTree[i].left = -1;
HuffmanTree[i].right = -1;
HuffmanTree[i].weight = -1;
}
for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
instr.get(ch);
num = (int)(unsigned char)ch;
HuffmanTree[i].parent = num + LeafNumber;
instr.get(ch);
HuffmanTree[i].side = ch;
//初始化其他資訊
HuffmanTree[i].info = -10000;
HuffmanTree[i].BinaryCode = "";
HuffmanTree[i].index = i;
HuffmanTree[i].left = -1;
HuffmanTree[i].right = -1;
HuffmanTree[i].weight = -1;
}
int bitmask = 0x80;
while (instr.get(ch)) {
num = (int)(unsigned char)ch;
while (bitmask != 0) {
if ((bitmask & num) != 0) {
SearchPath += '1';
}
if ((bitmask & num) == 0) {
SearchPath += '0';
}
bitmask >>= 1;
}
bitmask = 0x80;
}
SearchPath.erase(SearchPath.end() - Sign, SearchPath.end());
instr.close();
cerr << "Read File Completed" << endl;
return HuffmanTree;
}
int BuiltHuffmanTree(Node HuffmanTree[], int LeafNumber) {
for (int i = 0; i < 2 * LeafNumber - 2; i++) {
if (HuffmanTree[i].side == 'l') {
HuffmanTree[HuffmanTree[i].parent].left = i;
}
if (HuffmanTree[i].side == 'r') {
HuffmanTree[HuffmanTree[i].parent].right = i;
}
}
cerr << "Built Huffman Tree Completed" << endl;
return 0;
}
int RestoreFile(string SearchPath, Node HuffmanTree[], string path2,
int LeafNumber) {
ofstream outstr(path2, ios::out | ios::binary);
int head = 2 * LeafNumber - 2, now = head;
for (int i = 0; i < SearchPath.size(); i++) {
if (SearchPath[i] == '0') {
now = HuffmanTree[now].left;
}
if (SearchPath[i] == '1') {
now = HuffmanTree[now].right;
}
if (HuffmanTree[now].left == -1 && HuffmanTree[now].right == -1) {
char res = HuffmanTree[now].info;
outstr.put(res);
now = head;
}
}
outstr.close();
cerr << "Restore File Completed" << endl;
return 0;
}
};
//決議命令列:
int main(int argc, char *argv[]) {
// //定義類
Compression Compress;
Decompression Decompress;
if (argc != 4)
ShowHelp();
else if (stricmp(argv[1], "-z") == 0)
cerr << "Zip " << argv[2] << " to " << argv[3] << " ..." << endl;
else if (stricmp(argv[1], "-x") == 0)
cerr << "Extract " << argv[2] << " to " << argv[3] << " ..." << endl;
else {
ShowHelp();
return 0;
}
//把路徑賦值給字串
const string op = argv[1];
const string path1 = argv[2];
const string path2 = argv[3];
ifstream instr(path1, ios::in | ios::binary);
if (!instr) {
cerr << "Open File failed" << endl;
return 0;
}
//路徑分配
if (op == "-z") {
// cerr << "Ziping..." << endl;
int tong[257] = {0};
int LeafNumber =
Compress.Count(op, path1, path2, tong); //統計字符出出現頻率
Tree HuffmanTree = new Node[2 * LeafNumber - 1]; //初始化樹
Compress.CreatHuffmanTree(tong, LeafNumber, HuffmanTree); //建樹
Compress.GetSide(HuffmanTree, LeafNumber);
Compress.GetCodeNode(HuffmanTree, LeafNumber); //獲得葉節點的編碼
// Compress.ShowTable(HuffmanTree, LeafNumber);
string HuffmanPath =
Compress.Encode(path1, HuffmanTree, LeafNumber); //檔案進行編碼
int Sign;
string BinaryPath =
Compress.SwitchStringToBinary(HuffmanPath, Sign); //獲得二進制串
Compress.WriteToFile(path2, HuffmanTree, LeafNumber, BinaryPath,
Sign); //把哈夫曼樹寫入檔案
// Compress.ShowTable(HuffmanTree, LeafNumber);
cerr << "Compression Completed" << endl;
return 0;
}
else if (op == "-x") {
unsigned int LeafNumber;
int Sign;
string SearchPath;
Tree HuffmanTree =
Decompress.ReadFile(path1, LeafNumber, SearchPath, Sign);
Decompress.BuiltHuffmanTree(HuffmanTree, LeafNumber);
Decompress.RestoreFile(SearchPath, HuffmanTree, path2, LeafNumber);
// Decompress.ShowTable(HuffmanTree, LeafNumber);
cerr << "Decompress Completed" << endl;
} else
return 0;
return 0;
}
part5.寫在最后
其實我寫完這些代碼以后我感覺我使用的一些演算法并不是很好,但是如果要改的話基本上就是重構代碼了,工程量巨大,
給大家介紹一個大手子同學使用的演算法:不把哈夫曼樹寫入檔案,而是把一個類似于python里字典的東西存進檔案:存入每個字符對應的編碼長度和編碼,這樣在解壓的時候可以構建一個map,就可以把檔案還原,
老師的演算法是:直接開辟長度為512的線性表,每個節點對應的下標就是對應的ASCII碼,在檔案中只存入父節點,經過處理,以下標大小來區分同一個父節點的左右子節點,
歡迎技術交流!!!
如果有更好的演算法歡迎討論!
最后,放張圖大家笑一笑

轉載請註明出處,本文鏈接:https://www.uj5u.com/ruanti/239091.html
標籤:其他
