主頁 > 軟體設計 > 使用哈夫曼演算法對檔案進行壓縮和解壓

使用哈夫曼演算法對檔案進行壓縮和解壓

2020-12-23 11:02:56 軟體設計

使用哈夫曼演算法對檔案進行解壓和壓縮

前言:筆者是一名在讀大二的學生,用哈夫曼演算法對檔案進行壓縮和解壓是老師布置的實驗課作業,碼這個程式花了很多時間,但是我也在其中學到了很多東西,寫這種類似于工程類和寫演算法的程式感覺有很大的不一樣,

目錄

  • 使用哈夫曼演算法對檔案進行解壓和壓縮
    • part1 原理
    • part2 壓縮代碼實作
      • 1.統計檔案字符出現次數
      • 2.建立哈夫曼樹
      • 3.對出現的字符編碼
      • 4.按照對檔案編碼
      • 5.把字串里的01串轉換
      • 6.存入檔案前的準備作業
      • 7.寫入檔案
    • part3.解壓代碼實作
      • 1.讀檔案
      • 2.還原哈夫曼樹
      • 3.還原檔案
    • part4.全部代碼
    • part5.寫在最后

part1 原理

對檔案內出現的字符進行統計,以出現的次數為關鍵字,建立哈夫曼樹,實作對字符的編碼,將字符轉換成一個01序列,因為一個字符占用一個位元組,8個bit, 而一個位元組可以存八位的01編碼,所以用位編碼替換檔案內的字符,既完成了檔案的壓縮,為了能夠解碼,我們要把哈夫曼樹和01序列一起存進檔案中,在這里插入圖片描述
👆👆哈夫曼樹
解碼時,我們將存進壓縮檔案里的哈夫曼樹還原,通過檔案里的01序列對哈夫曼樹進行查找,即可還原出原檔案,

part2 壓縮代碼實作

1.統計檔案字符出現次數

👇統計文本中字符出現頻率

int Count(string op, string path1, string path2, int tong[]) {
        // TODO:統計檔案1字符出現頻率
        // path1是源檔案,path2是目標檔案
        ifstream instr(path1, ios::in | ios::binary);
        unsigned int bytebuff = 0;
        char ch;
        while (instr.get(ch)) {
            bytebuff =(int)(unsigned char)ch;
            tong[bytebuff]++;
        } //統計weight完成!
        int LeafNumber = 0;
        for (int i = 0; i <= 256; i++) {
            if (tong[i] != 0)
                LeafNumber++;
        }
        instr.close();
        cerr << "Count Completed" << endl;
        return LeafNumber;
    }

👆使用get()一次從檔案輸入流里讀取一個位元組,相當于一個char型別,使用桶計數法統計字符出現次數,這里要用unsigned char,因為有些字符的編碼是大于127的,符號位為1,不適用unsigned讀出來會是負數,

2.建立哈夫曼樹

struct HuffmanNode {
    int info; //存
    int index;
    int weight;
    int parent; //存
    int left;
    int right;
    char side; //存
    string BinaryCode;
    friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
        return f1.weight > f2.weight;
    }
};

👆哈夫曼樹的節點資訊
👇個人習慣喜歡把節點名和結構體指標換名字

typedef HuffmanNode Node;
typedef HuffmanNode *Tree;

創建哈夫曼樹:

int CreatHuffmanTree(int tong[], int LeafNumber, Node HuffmanTree[]) {
        // TODO:創建哈夫曼樹
        int k = 0;
        priority_queue<Node, vector<Node>, greater<Node>> pq;
        // Tree HuffmanTree = new Node[2 * LeafNumber - 1];
        // 0--->LeafNumber - 1            是葉節點
        // leafNumber--->2*LeafNumber - 2 是根節點
        for (int i = 0; i <= 256; i++) {
            if (tong[i]) {
                HuffmanTree[k].info = i;
                HuffmanTree[k].index = k;
                HuffmanTree[k].left = HuffmanTree[k].right =
                    HuffmanTree[k].parent = -1;
                HuffmanTree[k].weight = tong[i];
                pq.push(HuffmanTree[k]);
                k++;
            }
        }
        int j = LeafNumber;
        //通過優先佇列構建哈夫曼樹
        while (pq.size() > 1) {
            Node t1 = pq.top();
            pq.pop();
            Node t2 = pq.top();
            pq.pop();
            HuffmanTree[t1.index].parent = j;
            HuffmanTree[t2.index].parent = j;
            HuffmanTree[j].index = j;
            HuffmanTree[j].parent = -1;
            // cout<<HuffmanTree[j].index;
            HuffmanTree[j].left = t1.index;
            // cout<<HuffmanTree[j].left;
            HuffmanTree[j].right = t2.index;
            // cout<<HuffmanTree[j].right;
            HuffmanTree[j].weight = t1.weight + t2.weight;
            HuffmanTree[j].info = -127;
            pq.push(HuffmanTree[j]);
            j++;
        }
        j--;
        Node HuffmanTreeHead = pq.top();
        HuffmanTreeHead.parent = -127;
        HuffmanTree[j] = HuffmanTreeHead;
        HuffmanTree[j].info = -127;
        pq.pop();
        cerr << "Creat Huffman Tree Completed" << endl;
        return 0;
    }

這里使用了stl容器優先佇列priority_queue,優先佇列,其底層是用堆來實作的,隊首一定是當前佇列中優先級最高的那一個,
因為哈夫曼樹的節點不只有一個資訊, 所以要使用優先佇列對出現次數weight排序,要在結構體里多載運算子

friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
        return f1.weight > f2.weight;
    }

3.對出現的字符編碼

int GetCodeNode(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (HuffmanTree[i].info == -127)
                continue;
            int IndexForSearching = i;
            HuffmanTree[i].BinaryCode = "";
            int j = 0;
            while (HuffmanTree[IndexForSearching].parent != -127) {
                j = HuffmanTree[IndexForSearching].parent;
                if (HuffmanTree[j].left == IndexForSearching)
                    HuffmanTree[i].BinaryCode += '0';
                if (HuffmanTree[j].right == IndexForSearching)
                    HuffmanTree[i].BinaryCode += '1';
                IndexForSearching = j;
            }
            reverse(HuffmanTree[i].BinaryCode.begin(),
                    HuffmanTree[i].BinaryCode.end());
        }
        cerr << "Get Node Code Completed" << endl;
        return 0;
    }

👆目的是為了在下一步對檔案進行編碼的時候比較簡便,但是這樣會比較慢

4.按照對檔案編碼

string Encode(string path1, Node HuffmanTree[], int LeafNumber) {
        ifstream instr(path1, ios::in | ios::binary);
        char ch;
        unsigned int bytebuff = 0;
        int bitmask = 0x80;
        string HuffmanPath = "";
        while (instr.get(ch)) {
            bytebuff = (int)(unsigned char)ch;
            int value = bytebuff;
            for (int i = 0; i < LeafNumber; i++) {
                if (HuffmanTree[i].info == value) {
                    HuffmanPath += HuffmanTree[i].BinaryCode;
                    break;
                }
            }
        }
        cerr << "Encode Completed" << endl;
        return HuffmanPath;
    }

👆按照檔案中的順序,把檔案的中的字符轉換為01序列保存到字串中,

5.把字串里的01串轉換

字串里的0和1是以字符來存盤的,一個位元組存一個0或1,通過位運算,把位元組里的八個位都存0和1,這樣一個位元組就可以存八個0和1

string SwitchStringToBinary(string HuffmanPath, int &Sign) {
        // TODO:將字串里的01序列修改為bit
        // 最后一個位元組要處理多余的0-->把0放后面
        string BinaryPath = "";
        int bytebuff = 0;
        int shiftcount = 0;
        for (int i = 0; i < HuffmanPath.size(); i++) {
            bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
            bytebuff <<= 1;
            shiftcount++;
            if (shiftcount == 8) {
                bytebuff >>= 1;
                BinaryPath += (char)bytebuff;
                bytebuff = 0;
                shiftcount = 0;
                if (i == HuffmanPath.size() - 1)
                    break;
                if (i + 8 > HuffmanPath.size()) {
                    i++;
                    while (i <= HuffmanPath.size() - 1) {
                        bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
                        bytebuff <<= 1;
                        shiftcount++;
                        i++;
                    }
                    bytebuff <<= 7 - shiftcount;
                    BinaryPath += (char)bytebuff;
                }
            }
        }
        Sign = 8 - shiftcount;
        cerr << "Switch String To Binary Completed" << endl;
        return BinaryPath;
    }

👆這里寫的時候要注意,因為我們把字串的時候是八個八個的去讀,如果字串的長度 % 8 != 0,那么字串的末尾是湊不齊八位的,要特殊處理,這里我把有效的01串后面全填滿0,補齊8位,然后用一個Sign來標記末尾多添加了幾個0,把Sign一并存入到檔案當中,這樣解碼的時候就可以把最后多余的0給處理掉了,

6.存入檔案前的準備作業

這里我遇到的問題比較多,也是程式bug出現的主要原因,我們要盡可能的存入少的資訊,占用少的空間,把我們的哈夫曼樹給存入檔案,要存哪些資訊以便解碼的時候能夠完整的還原出來,在嘗試了許多種演算法以后(文末會介紹),寫出了無數多個bug,我使用了比較笨比的方法~:
存節點的 parent(父節點/母節點),side(子節點是父節點的左子節點還是右子節點),info(葉節點對應的ASCII碼)
👇獲取節點的side

int GetSide(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (i == HuffmanTree[HuffmanTree[i].parent].left)
                HuffmanTree[i].side = 'l';
            if (i == HuffmanTree[HuffmanTree[i].parent].right)
                HuffmanTree[i].side = 'r';
        }
        return 0;
    }

7.寫入檔案

int WriteToFile(string path2, Node HuffmanTree[], int LeafNumber,
                    string BinaryPath, int Sign) {
        // path2是要寫的目標檔案
        //打開二進制檔案輸出流
        // ShowTable(HuffmanTree, LeafNumber);
        LeafNumber -= 1;
        ofstream outstr(path2, ios::binary);
        outstr.write(reinterpret_cast<char *>(&Sign), sizeof(char));
        outstr.write(reinterpret_cast<char *>(&LeafNumber), sizeof(char));
        LeafNumber += 1;
        for (int i = 0; i < LeafNumber; i++) {
            //獲取要寫的內容的地址,轉換為char*
            HuffmanTree[i].parent -= LeafNumber;
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].info),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
                         sizeof(char));
        }
        for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
            HuffmanTree[i].parent -= LeafNumber;
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
                         sizeof(char));
        }
        for (int i = 0; i < BinaryPath.size(); i++) {
            outstr.put(BinaryPath[i]);
        }
        outstr.close();
        cerr << "Write To File Completed" << endl;
        return 0;
    }

把要寫的樹資訊和編碼一并寫入檔案當中,我在編碼頭寫入了我哈夫曼樹的葉節點個數LeafNumber,還有上文中提到的Sign,接著就是哈夫曼樹和檔案的編碼,

part3.解壓代碼實作

寫完壓縮以后,解壓基本上是一馬平川~,但是解壓會遇到很多問題,要去壓縮的代碼里面去改,

1.讀檔案

Tree ReadFile(string path1, unsigned int &LeafNumber, string &SearchPath,
                  int &Sign) {
        // Sign是要刪去末尾的幾個0
        ifstream instr(path1, ios::in | ios::binary);
        SearchPath = "";
        char ch;
        instr.get(ch);
        Sign = ch;
        instr.get(ch);
        LeafNumber = (int)(unsigned char)ch;
        LeafNumber += 1;
        Tree HuffmanTree = new Node[2 * LeafNumber - 1];
        unsigned int num;
        for (int i = 0; i < LeafNumber; i++) {
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].info = num;
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].parent = num + LeafNumber;
            instr.get(ch);
            HuffmanTree[i].side = ch;
            //初始化其他資訊
            HuffmanTree[i].BinaryCode = "";
            HuffmanTree[i].index = i;
            HuffmanTree[i].left = -1;
            HuffmanTree[i].right = -1;
            HuffmanTree[i].weight = -1;
        }
        for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].parent = num + LeafNumber;
            instr.get(ch);
            HuffmanTree[i].side = ch;
            //初始化其他資訊
            HuffmanTree[i].info = -10000;
            HuffmanTree[i].BinaryCode = "";
            HuffmanTree[i].index = i;
            HuffmanTree[i].left = -1;
            HuffmanTree[i].right = -1;
            HuffmanTree[i].weight = -1;
        }
        int bitmask = 0x80;
        while (instr.get(ch)) {
            num = (int)(unsigned char)ch;
            while (bitmask != 0) {
                if ((bitmask & num) != 0) {
                    SearchPath += '1';
                }
                if ((bitmask & num) == 0) {
                    SearchPath += '0';
                }
                bitmask >>= 1;
            }
            bitmask = 0x80;
        }
        SearchPath.erase(SearchPath.end() - Sign, SearchPath.end());
        instr.close();
        cerr << "Read File Completed" << endl;
        return HuffmanTree;
    }

👆把整個壓縮檔案都讀完,開辟空間保存樹節點空間,把檔案的編碼部分保存到string方便使用,記得使用Sign把編碼末尾多余的0去掉,

2.還原哈夫曼樹

把剛剛存入線性表的節點建立父子(母子/父女/母女)關系

int BuiltHuffmanTree(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (HuffmanTree[i].side == 'l') {
                HuffmanTree[HuffmanTree[i].parent].left = i;
            }
            if (HuffmanTree[i].side == 'r') {
                HuffmanTree[HuffmanTree[i].parent].right = i;
            }
        }
        cerr << "Built Huffman Tree Completed" << endl;
        return 0;
    }

3.還原檔案

根據字串中的01序列,在哈夫曼樹中查找,把找到的字符寫入檔案中就好啦!

int RestoreFile(string SearchPath, Node HuffmanTree[], string path2,
                    int LeafNumber) {
        ofstream outstr(path2, ios::out | ios::binary);
        int head = 2 * LeafNumber - 2, now = head;
        for (int i = 0; i < SearchPath.size(); i++) {
            if (SearchPath[i] == '0') {
                now = HuffmanTree[now].left;
            }
            if (SearchPath[i] == '1') {
                now = HuffmanTree[now].right;
            }
            if (HuffmanTree[now].left == -1 && HuffmanTree[now].right == -1) {
                char res = HuffmanTree[now].info;
                outstr.put(res);
                now = head;
            }
        }
        outstr.close();
        cerr << "Restore File Completed" << endl;
        return 0;
    }

part4.全部代碼

// writen by spln
// spln@foxmail.com
#include <bits/stdc++.h>
using namespace std;
struct HuffmanNode {
    int info; //存
    int index;
    int weight;
    int parent; //存
    int left;
    int right;
    char side; //存
    string BinaryCode;
    friend bool operator>(HuffmanNode f1, HuffmanNode f2) {
        return f1.weight > f2.weight;
    }
};
typedef HuffmanNode Node;
typedef HuffmanNode *Tree;
int ShowHelp() {
    cerr << "輸入錯誤,請按要求進行輸入:" << endl;
    cerr << "-z/-x 檔案名1 檔案名2" << endl;
    return 0;
}
class Compression {
  public:
    int CreatHuffmanTree(int tong[], int LeafNumber, Node HuffmanTree[]) {
        // TODO:創建哈夫曼樹
        int k = 0;
        priority_queue<Node, vector<Node>, greater<Node>> pq;
        // Tree HuffmanTree = new Node[2 * LeafNumber - 1];
        // 0--->LeafNumber - 1            是葉節點
        // laefNumber--->2*LeafNumber - 2 是根節點
        for (int i = 0; i <= 256; i++) {
            if (tong[i]) {
                HuffmanTree[k].info = i;
                HuffmanTree[k].index = k;
                HuffmanTree[k].left = HuffmanTree[k].right =
                    HuffmanTree[k].parent = -1;
                HuffmanTree[k].weight = tong[i];
                pq.push(HuffmanTree[k]);
                k++;
            }
        }
        int j = LeafNumber;
        //通過優先佇列構建哈夫曼樹
        while (pq.size() > 1) {
            Node t1 = pq.top();
            pq.pop();
            Node t2 = pq.top();
            pq.pop();
            HuffmanTree[t1.index].parent = j;
            HuffmanTree[t2.index].parent = j;
            HuffmanTree[j].index = j;
            HuffmanTree[j].parent = -1;
            // cout<<HuffmanTree[j].index;
            HuffmanTree[j].left = t1.index;
            // cout<<HuffmanTree[j].left;
            HuffmanTree[j].right = t2.index;
            // cout<<HuffmanTree[j].right;
            HuffmanTree[j].weight = t1.weight + t2.weight;
            HuffmanTree[j].info = -127;
            pq.push(HuffmanTree[j]);
            j++;
        }
        j--;
        Node HuffmanTreeHead = pq.top();
        HuffmanTreeHead.parent = -127;
        HuffmanTree[j] = HuffmanTreeHead;
        HuffmanTree[j].info = -127;
        pq.pop();
        cerr << "Creat Huffman Tree Completed" << endl;
        return 0;
    }
    int GetCodeNode(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (HuffmanTree[i].info == -127)
                continue;
            int IndexForSearching = i;
            HuffmanTree[i].BinaryCode = "";
            int j = 0;
            while (HuffmanTree[IndexForSearching].parent != -127) {
                j = HuffmanTree[IndexForSearching].parent;
                if (HuffmanTree[j].left == IndexForSearching)
                    HuffmanTree[i].BinaryCode += '0';
                if (HuffmanTree[j].right == IndexForSearching)
                    HuffmanTree[i].BinaryCode += '1';
                IndexForSearching = j;
            }
            reverse(HuffmanTree[i].BinaryCode.begin(),
                    HuffmanTree[i].BinaryCode.end());
        }
        cerr << "Get Node Code Completed" << endl;
        return 0;
    }
    string Encode(string path1, Node HuffmanTree[], int LeafNumber) {
        ifstream instr(path1, ios::in | ios::binary);
        char ch;
        unsigned int bytebuff = 0;
        int bitmask = 0x80;
        string HuffmanPath = "";
        while (instr.get(ch)) {
            bytebuff = (int)(unsigned char)ch;
            int value = bytebuff;
            for (int i = 0; i < LeafNumber; i++) {
                if (HuffmanTree[i].info == value) {
                    HuffmanPath += HuffmanTree[i].BinaryCode;
                    break;
                }
            }
        }
        cerr << "Encode Completed" << endl;
        return HuffmanPath;
    }
    int ShowTable(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 1; i++) {
            cout << "i:" << i << endl;
            cout << HuffmanTree[i].index << "<-index" << endl;
            cout << HuffmanTree[i].info << "<-info" << endl;
            cout << HuffmanTree[i].side << "<-side" << endl;
            cout << HuffmanTree[i].left << "<-left" << endl;
            cout << HuffmanTree[i].right << "<-right" << endl;
            cout << HuffmanTree[i].parent << "<-parent" << endl;
            cout << HuffmanTree[i].weight << "<-weight" << endl;
            cout << HuffmanTree[i].BinaryCode << "<-code" << endl;
        }
        return 0;
    }
    int Count(string op, string path1, string path2, int tong[]) {
        // TODO:統計檔案1字符出現頻率
        // path1是源檔案,path2是目標檔案
        ifstream instr(path1, ios::in | ios::binary);
        unsigned int bytebuff = 0;
        char ch;
        while (instr.get(ch)) {
            bytebuff =(int)(unsigned char)ch;
            tong[bytebuff]++;
        } //統計weight完成!
        int LeafNumber = 0;
        for (int i = 0; i <= 256; i++) {
            if (tong[i] != 0)
                LeafNumber++;
        }
        instr.close();
        cerr << "Count Completed" << endl;
        return LeafNumber;
    }
    //把哈夫曼樹存到FinalOutputString
    string SwitchStringToBinary(string HuffmanPath, int &Sign) {
        // TODO:將字串里的01序列修改為bit
        // 最后一個位元組要處理多余的0-->把0放后面
        string BinaryPath = "";
        int bytebuff = 0;
        int shiftcount = 0;
        for (int i = 0; i < HuffmanPath.size(); i++) {
            bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
            bytebuff <<= 1;
            shiftcount++;
            if (shiftcount == 8) {
                bytebuff >>= 1;
                BinaryPath += (char)bytebuff;
                bytebuff = 0;
                shiftcount = 0;
                if (i == HuffmanPath.size() - 1)
                    break;
                if (i + 8 > HuffmanPath.size()) {
                    i++;
                    while (i <= HuffmanPath.size() - 1) {
                        bytebuff += (HuffmanPath[i] == '1' ? 1 : 0);
                        bytebuff <<= 1;
                        shiftcount++;
                        i++;
                    }
                    bytebuff <<= 7 - shiftcount;
                    BinaryPath += (char)bytebuff;
                }
            }
        }
        Sign = 8 - shiftcount;
        cerr << "Switch String To Binary Completed" << endl;
        return BinaryPath;
    }
    int GetSide(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (i == HuffmanTree[HuffmanTree[i].parent].left)
                HuffmanTree[i].side = 'l';
            if (i == HuffmanTree[HuffmanTree[i].parent].right)
                HuffmanTree[i].side = 'r';
        }
        return 0;
    }
    int WriteToFile(string path2, Node HuffmanTree[], int LeafNumber,
                    string BinaryPath, int Sign) {
        // path2是要寫的目標檔案
        //打開二進制檔案輸出流
        // ShowTable(HuffmanTree, LeafNumber);
        LeafNumber -= 1;
        ofstream outstr(path2, ios::binary);
        outstr.write(reinterpret_cast<char *>(&Sign), sizeof(char));
        outstr.write(reinterpret_cast<char *>(&LeafNumber), sizeof(char));
        LeafNumber += 1;
        for (int i = 0; i < LeafNumber; i++) {
            //獲取要寫的內容的地址,轉換為char*
            HuffmanTree[i].parent -= LeafNumber;
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].info),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
                         sizeof(char));
        }
        for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
            HuffmanTree[i].parent -= LeafNumber;
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].parent),
                         sizeof(char));
            outstr.write(reinterpret_cast<char *>(&HuffmanTree[i].side),
                         sizeof(char));
        }
        for (int i = 0; i < BinaryPath.size(); i++) {
            outstr.put(BinaryPath[i]);
        }
        outstr.close();
        cerr << "Write To File Completed" << endl;
        return 0;
    }
};
class Decompression {
  public:
    int ShowTable(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 1; i++) {
            cout << "i:" << i << endl;
            cout << HuffmanTree[i].index << "<-index" << endl;
            cout << HuffmanTree[i].info << "<-info" << endl;
            cout << HuffmanTree[i].side << "<-side" << endl;
            cout << HuffmanTree[i].left << "<-left" << endl;
            cout << HuffmanTree[i].right << "<-right" << endl;
            cout << HuffmanTree[i].parent << "<-parent" << endl;
        }
        return 0;
    }
    Tree ReadFile(string path1, unsigned int &LeafNumber, string &SearchPath,
                  int &Sign) {
        // Sign是要刪去末尾的幾個0
        ifstream instr(path1, ios::in | ios::binary);
        SearchPath = "";
        char ch;
        instr.get(ch);
        Sign = ch;
        instr.get(ch);
        LeafNumber = (int)(unsigned char)ch;
        LeafNumber += 1;
        Tree HuffmanTree = new Node[2 * LeafNumber - 1];
        unsigned int num;
        for (int i = 0; i < LeafNumber; i++) {
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].info = num;
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].parent = num + LeafNumber;
            instr.get(ch);
            HuffmanTree[i].side = ch;
            //初始化其他資訊
            HuffmanTree[i].BinaryCode = "";
            HuffmanTree[i].index = i;
            HuffmanTree[i].left = -1;
            HuffmanTree[i].right = -1;
            HuffmanTree[i].weight = -1;
        }
        for (int i = LeafNumber; i < 2 * LeafNumber - 1; i++) {
            instr.get(ch);
            num = (int)(unsigned char)ch;
            HuffmanTree[i].parent = num + LeafNumber;
            instr.get(ch);
            HuffmanTree[i].side = ch;
            //初始化其他資訊
            HuffmanTree[i].info = -10000;
            HuffmanTree[i].BinaryCode = "";
            HuffmanTree[i].index = i;
            HuffmanTree[i].left = -1;
            HuffmanTree[i].right = -1;
            HuffmanTree[i].weight = -1;
        }
        int bitmask = 0x80;
        while (instr.get(ch)) {
            num = (int)(unsigned char)ch;
            while (bitmask != 0) {
                if ((bitmask & num) != 0) {
                    SearchPath += '1';
                }
                if ((bitmask & num) == 0) {
                    SearchPath += '0';
                }
                bitmask >>= 1;
            }
            bitmask = 0x80;
        }
        SearchPath.erase(SearchPath.end() - Sign, SearchPath.end());
        instr.close();
        cerr << "Read File Completed" << endl;
        return HuffmanTree;
    }
    int BuiltHuffmanTree(Node HuffmanTree[], int LeafNumber) {
        for (int i = 0; i < 2 * LeafNumber - 2; i++) {
            if (HuffmanTree[i].side == 'l') {
                HuffmanTree[HuffmanTree[i].parent].left = i;
            }
            if (HuffmanTree[i].side == 'r') {
                HuffmanTree[HuffmanTree[i].parent].right = i;
            }
        }
        cerr << "Built Huffman Tree Completed" << endl;
        return 0;
    }
    int RestoreFile(string SearchPath, Node HuffmanTree[], string path2,
                    int LeafNumber) {
        ofstream outstr(path2, ios::out | ios::binary);
        int head = 2 * LeafNumber - 2, now = head;
        for (int i = 0; i < SearchPath.size(); i++) {
            if (SearchPath[i] == '0') {
                now = HuffmanTree[now].left;
            }
            if (SearchPath[i] == '1') {
                now = HuffmanTree[now].right;
            }
            if (HuffmanTree[now].left == -1 && HuffmanTree[now].right == -1) {
                char res = HuffmanTree[now].info;
                outstr.put(res);
                now = head;
            }
        }
        outstr.close();
        cerr << "Restore File Completed" << endl;
        return 0;
    }
};
//決議命令列:
int main(int argc, char *argv[]) {
    // //定義類
    Compression Compress;
    Decompression Decompress;
    if (argc != 4)
        ShowHelp();
    else if (stricmp(argv[1], "-z") == 0)
        cerr << "Zip " << argv[2] << " to " << argv[3] << " ..." << endl;
    else if (stricmp(argv[1], "-x") == 0)
        cerr << "Extract " << argv[2] << " to " << argv[3] << " ..." << endl;
    else {
        ShowHelp();
        return 0;
    }
    //把路徑賦值給字串
    const string op = argv[1];
    const string path1 = argv[2];
    const string path2 = argv[3];
    ifstream instr(path1, ios::in | ios::binary);
    if (!instr) {
        cerr << "Open File failed" << endl;
        return 0;
    }
    //路徑分配
    if (op == "-z") {
        // cerr << "Ziping..." << endl;
        int tong[257] = {0};
        int LeafNumber =
            Compress.Count(op, path1, path2, tong); //統計字符出出現頻率
        Tree HuffmanTree = new Node[2 * LeafNumber - 1];          //初始化樹
        Compress.CreatHuffmanTree(tong, LeafNumber, HuffmanTree); //建樹
        Compress.GetSide(HuffmanTree, LeafNumber);
        Compress.GetCodeNode(HuffmanTree, LeafNumber); //獲得葉節點的編碼
        // Compress.ShowTable(HuffmanTree, LeafNumber);
        string HuffmanPath =
            Compress.Encode(path1, HuffmanTree, LeafNumber); //檔案進行編碼
        int Sign;
        string BinaryPath =
            Compress.SwitchStringToBinary(HuffmanPath, Sign); //獲得二進制串
        Compress.WriteToFile(path2, HuffmanTree, LeafNumber, BinaryPath,
                             Sign); //把哈夫曼樹寫入檔案
        // Compress.ShowTable(HuffmanTree, LeafNumber);
        cerr << "Compression Completed" << endl;
        return 0;
    } 
    else if (op == "-x") {
        unsigned int LeafNumber;
        int Sign;
        string SearchPath;
        Tree HuffmanTree =
            Decompress.ReadFile(path1, LeafNumber, SearchPath, Sign);
        Decompress.BuiltHuffmanTree(HuffmanTree, LeafNumber);
        Decompress.RestoreFile(SearchPath, HuffmanTree, path2, LeafNumber);
        // Decompress.ShowTable(HuffmanTree, LeafNumber);
        cerr << "Decompress Completed" << endl;
    } else
        return 0;
    return 0;
}

part5.寫在最后

其實我寫完這些代碼以后我感覺我使用的一些演算法并不是很好,但是如果要改的話基本上就是重構代碼了,工程量巨大,
給大家介紹一個大手子同學使用的演算法:不把哈夫曼樹寫入檔案,而是把一個類似于python里字典的東西存進檔案:存入每個字符對應的編碼長度和編碼,這樣在解壓的時候可以構建一個map,就可以把檔案還原,
老師的演算法是:直接開辟長度為512的線性表,每個節點對應的下標就是對應的ASCII碼,在檔案中只存入父節點,經過處理,以下標大小來區分同一個父節點的左右子節點,

歡迎技術交流!!!
如果有更好的演算法歡迎討論!

最后,放張圖大家笑一笑
在這里插入圖片描述

轉載請註明出處,本文鏈接:https://www.uj5u.com/ruanti/239091.html

標籤:其他

上一篇:南信2018-2020年822真題答案(隨手筆記)

下一篇:編譯原理-正則文法與正則運算式的相互轉化

標籤雲
其他(157675) Python(38076) JavaScript(25376) Java(17977) C(15215) 區塊鏈(8255) C#(7972) AI(7469) 爪哇(7425) MySQL(7132) html(6777) 基礎類(6313) sql(6102) 熊猫(6058) PHP(5869) 数组(5741) R(5409) Linux(5327) 反应(5209) 腳本語言(PerlPython)(5129) 非技術區(4971) Android(4554) 数据框(4311) css(4259) 节点.js(4032) C語言(3288) json(3245) 列表(3129) 扑(3119) C++語言(3117) 安卓(2998) 打字稿(2995) VBA(2789) Java相關(2746) 疑難問題(2699) 细绳(2522) 單片機工控(2479) iOS(2429) ASP.NET(2402) MongoDB(2323) 麻木的(2285) 正则表达式(2254) 字典(2211) 循环(2198) 迅速(2185) 擅长(2169) 镖(2155) 功能(1967) .NET技术(1958) Web開發(1951) python-3.x(1918) HtmlCss(1915) 弹簧靴(1913) C++(1909) xml(1889) PostgreSQL(1872) .NETCore(1853) 谷歌表格(1846) Unity3D(1843) for循环(1842)

熱門瀏覽
  • 面試突擊第一季,第二季,第三季

    第一季必考 https://www.bilibili.com/video/BV1FE411y79Y?from=search&seid=15921726601957489746 第二季分布式 https://www.bilibili.com/video/BV13f4y127ee/?spm_id_fro ......

    uj5u.com 2020-09-10 05:35:24 more
  • 第三單元作業總結

    1.前言 這應該是本學期最后一次寫作業總結了吧。總體來說,對作業的節奏也差不多掌握了,作業做起來的效率也更高了。雖然和之前的作業一樣,作業中都要用到新的知識,但是相比之前,更加懂得了如何利用工具以及資料。雖然之間卡過殼,但總體而言,這幾次作業還算完成的比較好。 2.作業程序總結 相比前兩個單元,此單 ......

    uj5u.com 2020-09-10 05:35:41 more
  • 北航OO(2020)第四單元博客作業暨課程總結博客

    北航OO(2020)第四單元博客作業暨課程總結博客 本單元作業的架構設計 在本單元中,由于UML圖具有比較清晰的樹形結構,因此我對其中需要進行查詢操作的元素進行了包裝,在樹的父節點中存盤所有孩子的參考。考慮到性能問題,我采用了快取機制,一次查詢后盡可能快取已經遍歷過的資訊,以減少遍歷次數。 本單元我 ......

    uj5u.com 2020-09-10 05:35:48 more
  • BUAA_OO_第四單元

    一、UML決議器設計 ? 先看下題目:第四單元實作一個基于JDK 8帶有效性檢查的UML(Unified Modeling Language)類圖,順序圖,狀態圖分析器 MyUmlInteraction,實際上我們要建立一個有向圖模型,UML中的物件(元素)可能與同級元素連接,也可與低級元素相連形成 ......

    uj5u.com 2020-09-10 05:35:54 more
  • 6.1邏輯運算子

    邏輯運算子 1. && 短路與 運算式1 && 運算式2 01.運算式1為true并且運算式2也為true 整體回傳為true 02.運算式1為false,將不會執行運算式2 整體回傳為false 03.只要有一個運算式為false 整體回傳為false 2. || 短路或 運算式1 || 運算式2 ......

    uj5u.com 2020-09-10 05:35:56 more
  • BUAAOO 第四單元 & 課程總結

    1. 第四單元:StarUml檔案決議 本單元采用了圖模型決議UML。 UML檔案可以抽象為圖、子圖、邊的邏輯結構。 在實作中,圖的節點包括類、介面、屬性,子圖包括狀態圖、順序圖等。 采用了三次遍歷UML元素的方法建圖,第一遍遍歷建點,第二、三次遍歷設定屬性、連邊,實作圖物件的初始化。這里借鑒了一些 ......

    uj5u.com 2020-09-10 05:36:06 more
  • 談談我對C# 多型的理解

    面向物件三要素:封裝、繼承、多型。 封裝和繼承,這兩個比較好理解,但要理解多型的話,可就稍微有點難度了。今天,我們就來講講多型的理解。 我們應該經常會看到面試題目:請談談對多型的理解。 其實呢,多型非常簡單,就一句話:呼叫同一種方法產生了不同的結果。 具體實作方式有三種。 一、多載 多載很簡單。 p ......

    uj5u.com 2020-09-10 05:36:09 more
  • Python 資料驅動工具:DDT

    背景 python 的unittest 沒有自帶資料驅動功能。 所以如果使用unittest,同時又想使用資料驅動,那么就可以使用DDT來完成。 DDT是 “Data-Driven Tests”的縮寫。 資料:http://ddt.readthedocs.io/en/latest/ 使用方法 dd. ......

    uj5u.com 2020-09-10 05:36:13 more
  • Python里面的xlrd模塊詳解

    那我就一下面積個問題對xlrd模塊進行學習一下: 1.什么是xlrd模塊? 2.為什么使用xlrd模塊? 3.怎樣使用xlrd模塊? 1.什么是xlrd模塊? ?python操作excel主要用到xlrd和xlwt這兩個庫,即xlrd是讀excel,xlwt是寫excel的庫。 今天就先來說一下xl ......

    uj5u.com 2020-09-10 05:36:28 more
  • 當我們創建HashMap時,底層到底做了什么?

    jdk1.7中的底層實作程序(底層基于陣列+鏈表) 在我們new HashMap()時,底層創建了默認長度為16的一維陣列Entry[ ] table。當我們呼叫map.put(key1,value1)方法向HashMap里添加資料的時候: 首先,呼叫key1所在類的hashCode()計算key1 ......

    uj5u.com 2020-09-10 05:36:38 more
最新发布
  • 【中介者設計模式詳解】C/Java/JS/Go/Python/TS不同語言實作

    * 中介者模式是一種行為型設計模式,它可以用來減少類之間的直接依賴關系,
    * 將物件之間的通信封裝到一個中介者物件中,從而使得各個物件之間的關系更加松散。
    * 在中介者模式中,物件之間不再直接相互互動,而是通過中介者來中轉訊息。 ......

    uj5u.com 2023-04-20 08:20:47 more
  • 露天煤礦現場調研和交流案例分享

    他們集團的資訊化公司及研究院在一個礦區正在做智能礦山的統一平臺的 試點,專案投資大概1億,包括了礦山的各方面的內容,顯示得我們這次交流有點多余。他們2年前開始做智能礦山的規劃,有很多煤礦行業專家的加持,他們的描述是非常完美,但是去年底應該上線的平臺,現在還沒有看到影子。他們確實有很多場景需求,但是被... ......

    uj5u.com 2023-04-20 08:20:25 more
  • 《社區人員管理》實戰案例設計&個人案例分享

    設計是一個讓人夢想成真程序,開始編碼、測驗、除錯之前進行需求分析和架構設計,才能保證關鍵方面都做正確 ......

    uj5u.com 2023-04-20 08:20:17 more
  • 軟體架構生態化-多角色交付的探索實踐

    作為一個技術架構師,不僅僅要緊跟行業技術趨勢,還要結合研發團隊現狀及痛點,探索新的交付方案。在日常中,你是否遇到如下問題 “ 業務需求排期長研發是瓶頸;非研發角色感受不到研發技改提效的變化;引入ISV 團隊又擔心質量和安全,培訓周期長“等等,基于此我們探索了一種新的技術體系及交付方案來解決如上問題。 ......

    uj5u.com 2023-04-20 08:20:10 more
  • 【中介者設計模式詳解】C/Java/JS/Go/Python/TS不同語言實作

    * 中介者模式是一種行為型設計模式,它可以用來減少類之間的直接依賴關系,
    * 將物件之間的通信封裝到一個中介者物件中,從而使得各個物件之間的關系更加松散。
    * 在中介者模式中,物件之間不再直接相互互動,而是通過中介者來中轉訊息。 ......

    uj5u.com 2023-04-20 08:19:44 more
  • 露天煤礦現場調研和交流案例分享

    他們集團的資訊化公司及研究院在一個礦區正在做智能礦山的統一平臺的 試點,專案投資大概1億,包括了礦山的各方面的內容,顯示得我們這次交流有點多余。他們2年前開始做智能礦山的規劃,有很多煤礦行業專家的加持,他們的描述是非常完美,但是去年底應該上線的平臺,現在還沒有看到影子。他們確實有很多場景需求,但是被... ......

    uj5u.com 2023-04-20 08:19:07 more
  • 《社區人員管理》實戰案例設計&個人案例分享

    設計是一個讓人夢想成真程序,開始編碼、測驗、除錯之前進行需求分析和架構設計,才能保證關鍵方面都做正確 ......

    uj5u.com 2023-04-20 08:18:57 more
  • 軟體架構生態化-多角色交付的探索實踐

    作為一個技術架構師,不僅僅要緊跟行業技術趨勢,還要結合研發團隊現狀及痛點,探索新的交付方案。在日常中,你是否遇到如下問題 “ 業務需求排期長研發是瓶頸;非研發角色感受不到研發技改提效的變化;引入ISV 團隊又擔心質量和安全,培訓周期長“等等,基于此我們探索了一種新的技術體系及交付方案來解決如上問題。 ......

    uj5u.com 2023-04-20 08:18:49 more
  • 05單件模式

    #經典的單件模式 public class Singleton { private static Singleton uniqueInstance; //一個靜態變數持有Singleton類的唯一實體。 // 其他有用的實體變數寫在這里 //構造器宣告為私有,只有Singleton可以實體化這個類! ......

    uj5u.com 2023-04-19 08:42:51 more
  • 【架構與設計】常見微服務分層架構的區別和落地實踐

    軟體工程的方方面面都遵循一個最基本的道理:沒有銀彈,架構分層模型更是如此,每一種都有各自優缺點,所以請根據不同的業務場景,并遵循簡單、可演進這兩個重要的架構原則選擇合適的架構分層模型即可。 ......

    uj5u.com 2023-04-19 08:42:41 more