Thursday, December 27, 2018

TensorFlow training tricks

Why do the loss or the learned parameters show up as NaN in the summary?
This is often caused by exploding gradients. Possible fixes: 1. gradient clipping; 2. batch normalization; 3. lowering the learning rate; 4. adding regularization.
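Gradient clipping by global norm (what TensorFlow's `tf.clip_by_global_norm` computes) can be sketched in plain NumPy as follows; the function names and the clip threshold below are illustrative, not from the original post:

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    """Scale a list of gradient arrays so their combined L2 norm
    does not exceed clip_norm (a NumPy sketch of the TF op)."""
    # Global norm: L2 norm over all gradient tensors taken together.
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Scale down only when the global norm exceeds the threshold;
    # otherwise the gradients pass through unchanged.
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm
```

In TF 1.x (the version this post dates from) the usual pattern is to call `tf.clip_by_global_norm` on the result of `tf.gradients(loss, variables)` before handing the clipped gradients to `optimizer.apply_gradients`.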
References:
https://blog.csdn.net/qq_33485434/article/details/80733251
https://blog.csdn.net/qq_25737169/article/details/78847691
https://cloud.tencent.com/developer/article/1057071
https://blog.csdn.net/yinxingtianxia/article/details/78121037
https://www.jianshu.com/p/cc42a9a45a71
https://www.zhihu.com/question/49346370

How can you inspect the network structure and parameters stored in a checkpoint (ckpt)? Two methods are given here. The first is to use inspect_checkpoint.py, which ships with the official TensorFlow source. For example:

python /usr/local/lib/python2.7/dist-packages/tensorflow/python/tools/inspect_checkpoint.py --file_name=model.ckpt-158940 --tensor_name=unit_1_1/conv1/Weights

If you pass only the file_name argument, you see the overall network structure; if you pass both arguments, you see the values of that specific layer.


Tuesday, December 25, 2018

LSTM and attention

LSTM:
References:
http://blog.gdf.name/lstm-with-tensorflow/
https://blog.csdn.net/Jason160918/article/details/78295423
https://blog.csdn.net/xuanyuansen/article/details/61913886
https://www.jianshu.com/p/b6130685d855

Formula:
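The original note leaves this blank; the standard LSTM cell equations (as covered in the references above, writing $x_t$ for the input, $h_t$ for the hidden state, $c_t$ for the cell state, $\sigma$ for the sigmoid, and $\odot$ for elementwise product) are:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```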


Attention formula:
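This section is also blank in the original; one common formulation is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. A minimal NumPy sketch (the function names are my own, for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v).
    Returns the attended values and the attention weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V, weights
```

Each output row is a weighted average of the value vectors, with weights given by how closely the query matches each key.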