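On the optimization problem in "Unsupervised Deep Embedding for Clustering Analysis"
Author: 凱魯嘎吉 - cnblogs http://www.cnblogs.com/kailugaji/
Deep Embedded Clustering (DEC) and Improved Deep Embedded Clustering (IDEC) were proposed one after the other, but neither paper spells out how the parameters are optimized. I worked through the derivation myself and found that my result disagrees with the one given in the two papers, and I cannot tell where the problem lies. Below, the question is posed as a math exercise: find the partial derivative of the objective function with respect to one of the parameters, taking the cluster centers as the example.
Problem statement
Given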
\[L=\sum\limits_{i}^{N}{\sum\limits_{j}^{c}{{{p}_{ij}}\log \frac{{{p}_{ij}}}{{{q}_{ij}}}}}\]
\[{{q}_{ij}}=\frac{{{(1+{{\left\| {{z}_{i}}-{{\mu }_{j}} \right\|}^{2}})}^{-1}}}{\sum\nolimits_{j}{{{(1+{{\left\| {{z}_{i}}-{{\mu }_{j}} \right\|}^{2}})}^{-1}}}}\]
\[{{p}_{ij}}=\frac{q_{ij}^{2}/\sum\nolimits_{i}{{{q}_{ij}}}}{\sum\nolimits_{j'}{(q_{ij'}^{2}/\sum\nolimits_{i}{{{q}_{ij'}}})}}\]
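To make the three definitions above concrete, here is a minimal sketch in plain Python that evaluates $q_{ij}$, $p_{ij}$ and $L$ on a tiny data set. The function names, the toy embeddings `z` and the centers `mu` are hypothetical, chosen only for illustration. The inner normalizer of $p_{ij}$ is taken to be the soft cluster frequency $\sum_i q_{ij}$, which is an assumption based on how the target distribution is defined in [2].

```python
import math

def soft_assignments(z, mu):
    """q[i][j]: kernel (1 + ||z_i - mu_j||^2)^(-1), normalized over clusters j."""
    q = []
    for zi in z:
        # Unnormalized kernel value for every cluster center
        w = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, mj))) for mj in mu]
        total = sum(w)
        q.append([wj / total for wj in w])
    return q

def target_distribution(q):
    """p[i][j]: q_ij^2 / f_j renormalized over j, with f_j = sum_i q_ij (assumed, see lead-in)."""
    n_clusters = len(q[0])
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        u = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        total = sum(u)
        p.append([uj / total for uj in u])
    return p

def kl_loss(p, q):
    """L = sum_i sum_j p_ij * log(p_ij / q_ij)."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row))

# Hypothetical toy data: four 2-D embeddings and two cluster centers
z = [[0.0, 0.0], [1.0, 0.5], [3.0, 2.0], [2.5, 2.5]]
mu = [[0.5, 0.0], [2.5, 2.0]]

q = soft_assignments(z, mu)
p = target_distribution(q)
loss = kl_loss(p, q)
print(q, p, loss)
```

Each row of `q` and of `p` is a probability distribution over the clusters, so both sum to 1 row by row.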
With ${p}_{ij}$ held fixed, find
\[\frac{\partial L}{\partial {{\mu }_{j}}}\]
Solution
By the chain rule
\[\frac{\partial L}{\partial {{\mu }_{j}}}=\frac{\partial L}{\partial {{q}_{ij}}}\frac{\partial {{q}_{ij}}}{\partial {{\mu }_{j}}}\]
\[\frac{\partial L}{\partial {{q}_{ij}}}=\frac{\partial \left( {{p}_{ij}}\log \frac{{{p}_{ij}}}{{{q}_{ij}}} \right)}{\partial {{q}_{ij}}}=\frac{\partial \left( {{p}_{ij}}\log {{p}_{ij}}-{{p}_{ij}}\log {{q}_{ij}} \right)}{\partial {{q}_{ij}}}=-\frac{{{p}_{ij}}}{{{q}_{ij}}}\]
\[\frac{{\partial {q_{ij}}}}{{\partial {\mu _j}}} = \sum\limits_i^N {\frac{{\partial \frac{{{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}}}{{\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} }}}}{{\partial {\mu _j}}}} = \sum\limits_i^N {\left( {\frac{{\partial {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}}}{{\partial {\mu _j}}}\frac{1}{{\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} }} + {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}\frac{{\partial \frac{1}{{\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} }}}}{{\partial {\mu _j}}}} \right)} \]
where
\[\frac{{\partial {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}}}{{\partial {\mu _j}}} = - {(1 + {\left\| {{z_i} - {\mu _j}} \right\|^2})^{ - 2}} \cdot \left( { - 2({z_i} - {\mu _j})} \right) = 2({z_i} - {\mu _j}) \cdot {(1 + {\left\| {{z_i} - {\mu _j}} \right\|^2})^{ - 2}}\]
\[\frac{{\partial \frac{1}{{\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} }}}}{{\partial {\mu _j}}} = - \frac{{2({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 2}}}}{{{{\left( {\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} } \right)}^2}}}\]
Therefore
\[\frac{{\partial {q_{ij}}}}{{\partial {\mu _j}}} = \sum\limits_i^N {(\frac{{2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 2}}}}{{\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} }} - \frac{{2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 2}} \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}}}{{{{\left( {\sum\nolimits_j {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}}} } \right)}^2}}})} = \sum\limits_i^N {\left( {2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot {q_{ij}} - 2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot q_{ij}^2} \right)} {\rm{ = }}\sum\limits_i^N {\left( {2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot {q_{ij}} \cdot (1 - {q_{ij}})} \right)} \]
Result of the derivation
\[\frac{{\partial L}}{{\partial {\mu _j}}} = \frac{{\partial L}}{{\partial {q_{ij}}}}\frac{{\partial {q_{ij}}}}{{\partial {\mu _j}}} = \sum\limits_i^N {\left( { - \frac{{{p_{ij}}}}{{{q_{ij}}}} \cdot 2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot {q_{ij}} \cdot (1 - {q_{ij}})} \right)} = \sum\limits_i^N {\left( {2 \cdot ({z_i} - {\mu _j}) \cdot {{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot {p_{ij}} \cdot ({q_{ij}} - 1)} \right)} \]
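A numerical check is one way to narrow down where two analytic results diverge. The sketch below, in plain Python on hypothetical 1-D toy data, compares a central finite-difference estimate of $\partial L/\partial \mu_j$ (with $p_{ij}$ held fixed) against two candidates: the expression derived above, and the form $-2\sum_i (1+\|z_i-\mu_j\|^2)^{-1}(p_{ij}-q_{ij})(z_i-\mu_j)$. The second is the gradient as reported in [2] when its parameter $\alpha$ is set to 1, to the best of my reading. All function names and data values are assumptions made for illustration.

```python
import math

def kernel(zi, mj):
    """Unnormalized kernel (1 + ||z_i - mu_j||^2)^(-1)."""
    return 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, mj)))

def soft_assignments(z, mu):
    q = []
    for zi in z:
        w = [kernel(zi, mj) for mj in mu]
        total = sum(w)
        q.append([wj / total for wj in w])
    return q

def target_distribution(q):
    c = len(q[0])
    f = [sum(row[j] for row in q) for j in range(c)]  # f_j = sum_i q_ij (assumed, as in [2])
    p = []
    for row in q:
        u = [row[j] ** 2 / f[j] for j in range(c)]
        s = sum(u)
        p.append([uj / s for uj in u])
    return p

def kl_loss(p, z, mu):
    """L as a function of mu, with p treated as a constant."""
    q = soft_assignments(z, mu)
    return sum(p[i][j] * math.log(p[i][j] / q[i][j])
               for i in range(len(z)) for j in range(len(mu)))

def analytic_grad(p, z, mu, j, weight):
    """sum_i 2 * (z_i - mu_j) * w_ij * weight(p_ij, q_ij), one entry per coordinate."""
    q = soft_assignments(z, mu)
    grad = [0.0] * len(mu[j])
    for i, zi in enumerate(z):
        coeff = 2.0 * kernel(zi, mu[j]) * weight(p[i][j], q[i][j])
        for d in range(len(grad)):
            grad[d] += coeff * (zi[d] - mu[j][d])
    return grad

def numeric_grad(p, z, mu, j, eps=1e-6):
    """Central finite differences of L with respect to mu_j."""
    grad = []
    for d in range(len(mu[j])):
        plus = [list(m) for m in mu]
        minus = [list(m) for m in mu]
        plus[j][d] += eps
        minus[j][d] -= eps
        grad.append((kl_loss(p, z, plus) - kl_loss(p, z, minus)) / (2.0 * eps))
    return grad

z = [[0.0], [1.0], [3.0]]   # hypothetical 1-D embeddings
mu = [[0.5], [2.5]]         # hypothetical cluster centers
p = target_distribution(soft_assignments(z, mu))  # computed once, then held fixed

j = 0
fd = numeric_grad(p, z, mu, j)
# Expression derived above: sum_i 2 (z_i - mu_j) w_ij p_ij (q_ij - 1)
derived = analytic_grad(p, z, mu, j, lambda pij, qij: pij * (qij - 1.0))
# Candidate form: -2 sum_i w_ij (p_ij - q_ij) (z_i - mu_j)
candidate = analytic_grad(p, z, mu, j, lambda pij, qij: qij - pij)

err_derived = max(abs(a - b) for a, b in zip(fd, derived))
err_candidate = max(abs(a - b) for a, b in zip(fd, candidate))
print("finite difference:", fd)
print("error of derived expression:", err_derived)
print("error of (p - q) form:", err_candidate)
```

On this toy data the finite-difference estimate agrees with the $(p_{ij}-q_{ij})$ form to within numerical error and not with the expression derived above. This suggests that the derivation is dropping a term somewhere, rather than the papers being wrong.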
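Result in the original paper
The gradient with respect to the cluster centers as given in [2], with the paper's parameter $\alpha$ set to 1 so that the kernel matches the definition of ${q}_{ij}$ above:
\[\frac{{\partial L}}{{\partial {\mu _j}}} = - 2\sum\limits_i^N {{{(1 + {{\left\| {{z_i} - {\mu _j}} \right\|}^2})}^{ - 1}} \cdot ({p_{ij}} - {q_{ij}}) \cdot ({z_i} - {\mu _j})} \]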
I cannot tell where the discrepancy comes from. Corrections from readers are very welcome.
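One possible explanation, offered as a sketch rather than a definitive answer: ${\mu}_{j}$ appears not only in ${q}_{ij}$ but also in the normalizer of every ${q}_{ij'}$ with $j' \ne j$. The chain rule would then need a sum over all clusters $j'$ (and all samples $i$), not only the single term with $j' = j$. Writing ${w}_{ij} = {(1 + {\left\| {z_i} - {\mu _j} \right\|^2})^{-1}}$ as a shorthand introduced here, the full chain rule reads
\[\frac{{\partial L}}{{\partial {\mu _j}}} = \sum\limits_i^N {\sum\limits_{j'}^c {\frac{{\partial L}}{{\partial {q_{ij'}}}}\frac{{\partial {q_{ij'}}}}{{\partial {\mu _j}}}} } \]
For $j' = j$, the summand is the one already obtained above:
\[ - \frac{{{p_{ij}}}}{{{q_{ij}}}} \cdot 2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot {q_{ij}} \cdot (1 - {q_{ij}}) = 2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot {p_{ij}} \cdot ({q_{ij}} - 1)\]
For $j' \ne j$, only the normalizer of ${q}_{ij'}$ depends on ${\mu}_{j}$, which gives
\[\frac{{\partial {q_{ij'}}}}{{\partial {\mu _j}}} = - 2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot {q_{ij}} \cdot {q_{ij'}}\]
Since $\sum\nolimits_{j'} {p_{ij'}} = 1$, these cross terms add up to
\[\sum\limits_{j' \ne j} {\left( { - \frac{{{p_{ij'}}}}{{{q_{ij'}}}}} \right)\left( { - 2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot {q_{ij}} \cdot {q_{ij'}}} \right)} = 2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot {q_{ij}} \cdot (1 - {p_{ij}})\]
Adding the two contributions and summing over $i$, the ${p}_{ij}{q}_{ij}$ products cancel:
\[\frac{{\partial L}}{{\partial {\mu _j}}} = \sum\limits_i^N {2({z_i} - {\mu _j}) \cdot {w_{ij}} \cdot ({q_{ij}} - {p_{ij}})} = - 2\sum\limits_i^N {{w_{ij}} \cdot ({p_{ij}} - {q_{ij}}) \cdot ({z_i} - {\mu _j})} \]
If this reasoning holds, the step to revisit is the first chain-rule line, which keeps only the $j' = j$ term. The final expression has the $({p}_{ij} - {q}_{ij})$ structure that [2] reports.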
References
[1] 凱魯嘎吉. Deep Clustering Algorithms. cnblogs.
[2] Xie J, Girshick R, Farhadi A. Unsupervised deep embedding for clustering analysis. International Conference on Machine Learning (ICML), 2016: 478-487.
[3] Guo X, Gao L, Liu X, et al. Improved deep embedded clustering with local structure preservation. IJCAI, 2017: 1753-1759.
