[Original] Erlang: Analyzing the erl_crash.dump File

      The previous post described how the erl_crash.dump file is generated; this post explains how to analyze it.

-=-=-=- This is the 88th Academy Awards divider line -=-=-=-

      First, a look at what 坚强兄 covers in the post "[Erlang 0057] Erlang 排错利器: Erlang Crash Dump Viewer":

1. Analyzing erl_crash.dump through the crashdump_viewer web pages;

crashdump_viewer:start().

Note:

The Crashdump Viewer is an HTML based tool for browsing Erlang crashdumps. Crashdump Viewer runs under the WebTool application.

2. Analyzing erl_crash.dump with the erl_crashdump_analyzer.sh script from recon.


-=-=-=- This is the 88th Academy Awards divider line -=-=-=-


That post covers two tools; each is described below.

【crashdump_viewer】
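
A minimal sketch of launching the viewer on a specific dump from the shell. It assumes `erl` is on the PATH and that the installed OTP release provides `crashdump_viewer:start/1` (which takes the dump file name); `view_dump` and the `DRY_RUN` variable are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: open a crash dump in the Crashdump Viewer.
# Assumption: crashdump_viewer:start/1 exists in the installed OTP release.
view_dump() {
    local dump="${1:-erl_crash.dump}"
    # Build the command as an array so the dump path stays one argument.
    local cmd=(erl -noshell -eval "crashdump_viewer:start(\"$dump\")")
    if [ -n "$DRY_RUN" ]; then
        # Only print the command; useful for checking it before running.
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Running `DRY_RUN=1 view_dump erl_crash.dump` prints the command without starting the Erlang VM. Without `DRY_RUN`, the Erlang node will probably keep running after the viewer is closed and may need to be stopped by hand.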












【erl_crashdump_analyzer.sh】

      This script is part of the recon tool suite; see the description on GitHub for details.
      The erl_crashdump_analyzer.sh script is annotated below.

[root@YOYO Erlang]# vi erl_crashdump_analyzer.sh 

#!/usr/bin/env bash
DUMP=$1

echo -e "analyzing $DUMP, generated on: " `head -2 $DUMP | tail -1` "\n"   # echo -e enables interpretation of backslash escapes

### SLOGAN ###
grep Slogan: $DUMP -m 1         # print only the first line containing "Slogan:"

### MEMORY ###                  # memory usage statistics
echo -e "\nMemory:\n==="
M=`grep -m 1 'processes' $DUMP | sed "s/processes: //"`    # take the value (in bytes) after "processes: "
let "m=$M/(1024*1024)"
echo "  processes: $m Mb"
M=`grep -m 1 'processes_used' $DUMP | sed "s/processes_used: //"`
let "m=$M/(1024*1024)"
echo "  processes_used: $m Mb"
M=`grep -m 1 'system' $DUMP | sed "s/system: //"`
let "m=$M/(1024*1024)"
echo "  system: $m Mb"
M=`grep -m 1 'atom' $DUMP | sed "s/atom: //"`
let "m=$M/(1024*1024)"
echo "  atom: $m Mb"
M=`grep -m 1 'atom_used' $DUMP | sed "s/atom_used: //"`
let "m=$M/(1024*1024)"
echo "  atom_used: $m Mb"
M=`grep -m 1 'binary' $DUMP | sed "s/binary: //"`
let "m=$M/(1024*1024)"
echo "  binary: $m Mb"
M=`grep -m 1 'code' $DUMP | sed "s/code: //"`
let "m=$M/(1024*1024)"
echo "  code: $m Mb"
M=`grep -m 1 'ets' $DUMP | sed "s/ets: //"`
let "m=$M/(1024*1024)"
echo "  ets: $m Mb"
M=`grep -m 1 'total' $DUMP | sed "s/total: //"`
let "m=$M/(1024*1024)"
echo -e "  ---\n  total: $m Mb"

### PROCESS MESSAGE QUEUES LENGTHS ###
echo -e "\nDifferent message queue lengths (5 largest different):\n==="
grep 'Message queue len' $DUMP | sed 's/Message queue length: //g' | sort -n -r | uniq -c | head -5   # sort the message queue lengths of all processes in descending order, count duplicates, and show the 5 largest distinct lengths

### ERROR LOGGER QUEUE LENGTH ###
echo -e "\nError logger queue length:\n==="
grep -C 10 'Name: error_logger' $DUMP -m 1| grep 'Message queue length' | sed 's/Message queue length: //g'  # match the first "Name: error_logger" line plus 10 lines of context on each side, then extract the 'Message queue length: ' value from those lines

### PORT/FILE DESCRIPTOR INFO ###
echo -e "\nFile descriptors open:\n==="
echo -e "  UDP: "   `grep 'Port controls linked-in driver:' $DUMP | grep 'udp_inet' | wc -l`  # udp_inet lines correspond to UDP ports in use
echo -e "  TCP: "   `grep 'Port controls linked-in driver:' $DUMP | grep 'tcp_inet' | wc -l`  # tcp_inet lines correspond to TCP ports in use
echo -e "  Files: " `grep 'Port controls linked-in driver:' $DUMP | grep -vi 'udp_inet' | grep -vi 'tcp_inet' | wc -l`  # drop the udp_inet and tcp_inet lines; what remains is file-related usage (efile, tty_sl -c -e, etc.)
echo -e "  ---\n  Total: " `grep 'Port controls linked-in driver:' $DUMP | wc -l`

### NUMBER OF PROCESSES ###
echo -e "\nNumber of processes:\n==="
grep '=proc:' $DUMP | wc -l     # count processes (each process section starts with "=proc:")

### PROC HEAPS+STACK ###
echo -e "\nProcesses Heap+Stack memory sizes (words) used in the VM (5 largest different):\n==="
grep 'Stack+heap' $DUMP | sed "s/Stack+heap: //g" | sort -n -r | uniq -c | head -5              # take the values after "Stack+heap: ", sort them in descending order, and show the 5 largest distinct sizes

### PROC OLDHEAP ###
echo -e "\nProcesses OldHeap memory sizes (words) used in the VM (5 largest different):\n==="
grep 'OldHeap' $DUMP | sed "s/OldHeap: //g" | sort -n -r | uniq -c | head -5

### PROC STATES ###
echo -e "\nProcess States when crashing (sum): \n==="
grep 'State: ' $DUMP | sed "s/State: //g" | sort | uniq -c               # tally the state of every process at crash time (e.g. Waiting or Running)
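
The memory section of the script repeats the same grep/sed/let pattern for every field, and its patterns are plain substrings (for example, 'processes' also matches 'processes_used'). As a hedged alternative sketch, not part of recon, the lookup can be restricted to the `=memory` section of the dump and matched on the exact field name. `mem_mb` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# mem_mb DUMP FIELD: print FIELD from the =memory section, in whole megabytes.
# Hypothetical helper; assumes the dump has a "=memory" section whose lines
# look like "processes: 13631488" (values in bytes).
mem_mb() {
    local dump="$1" field="$2"
    awk -v key="$field:" '
        # Every section starts with a line beginning with "=".
        /^=/ { in_mem = ($0 == "=memory"); next }
        # Inside =memory, match the field name exactly and convert to MB.
        in_mem && $1 == key { printf "%d\n", $2 / (1024 * 1024); exit }
    ' "$dump"
}
```

Usage: `echo "  processes: $(mem_mb erl_crash.dump processes) Mb"`. The exact-match comparison means `processes` and `processes_used` can no longer be confused, whatever order they appear in.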

Analysis example

[root@YOYO Erlang]# ./erl_crashdump_analyzer.sh erl_crash.dump 
analyzing erl_crash.dump, generated on:  Thu Oct 8 09:51:53 2015 

Slogan: init terminating in do_boot ()

Memory:
===
  processes: 13 Mb
  processes_used: 13 Mb
  system: 24 Mb
  atom: 0 Mb
  atom_used: 0 Mb
  binary: 0 Mb
  code: 18 Mb
  ets: 0 Mb
  ---
  total: 38 Mb

Different message queue lengths (5 largest different):
===
     43 0

Error logger queue length:
===
0

File descriptors open:
===
  UDP:  0
  TCP:  2
  Files:  4
  ---
  Total:  6

Number of processes:
===
43

Processes Heap+Stack memory sizes (words) used in the VM (5 largest different):
===
      4 6772
      3 4185
      2 2586
      3 1598
      5 987

Processes OldHeap memory sizes (words) used in the VM (5 largest different):
===
      1 46422
      1 17731
      2 10958
      3 4185
      1 2586

Process States when crashing (sum): 
===
      1 Running
     42 Waiting
[root@YOYO Erlang]#

From the output above, we can see:

  • Memory allocated and memory in use
  • Message queue information
  • File descriptor usage
  • Heap + stack memory usage
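
The heap figures in the output are in words, not bytes. Below is a small sketch of the conversion, assuming a 64-bit VM where one word is 8 bytes (pass 4 as the second argument for a 32-bit VM). `words_to_kib` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# words_to_kib WORDS [WORD_SIZE]: convert a size in machine words to whole KiB.
# Assumption: WORD_SIZE defaults to 8 bytes (64-bit VM).
words_to_kib() {
    local words="$1" word_size="${2:-8}"
    # Integer arithmetic; the result is truncated to whole KiB.
    echo $(( words * word_size / 1024 ))
}
```

For example, `words_to_kib 46422` (the largest OldHeap in the output above) gives 362, that is, roughly 362 KiB on a 64-bit VM.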


Note:
      Under lib/observer-2.0/priv/bin there is also a cdv script that wraps the call to crashdump_viewer.
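
The observer version in that path changes with the OTP release, so here is a hedged sketch for locating the cdv script under a given Erlang root directory. `find_cdv` is a hypothetical helper name, and the layout lib/observer-*/priv/bin/cdv is assumed from the path above.

```shell
#!/usr/bin/env bash
# find_cdv ERL_ROOT: print the path of the first cdv script found under
# ERL_ROOT/lib/observer-*/priv/bin; return 1 if none exists.
find_cdv() {
    local root="$1" candidate
    for candidate in "$root"/lib/observer-*/priv/bin/cdv; do
        # If the glob matched nothing, candidate is the literal pattern
        # and the -f test below simply fails.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

The root directory can be obtained with `erl -noshell -eval 'io:format("~s", [code:root_dir()]), halt().'`; after that, `"$(find_cdv "$ROOT")" erl_crash.dump` should open the dump in the viewer.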