4.7. Troubleshooting CouchDB 3 with WeatherReport
4.7.1. Overview
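WeatherReport is an OTP application and set of tools that diagnoses common problems which could affect a CouchDB version 3 node or cluster (version 4 or later is not supported). It is accessed through the weatherreport command-line escript.
Here is a basic example of using weatherreport, followed immediately by the command's output: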
$ weatherreport --etc /path/to/etc
[warning] Cluster member [email protected] is not connected to this node. Please check whether it is down.
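4.7.2. Usage
In most cases, you can simply run the weatherreport command as shown above. Sometimes, however, you may want extra detail or want to run only specific checks. Command-line options cover both cases. Run weatherreport --help to learn more about these options: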
$ weatherreport --help
Usage: weatherreport [-c <path>] [-d <level>] [-e] [-h] [-l] [check_name ...]
-c, --etc Path to the CouchDB configuration directory
-d, --level Minimum message severity level (default: notice)
-l, --list Describe available diagnostic tasks
-e, --expert Perform more detailed diagnostics
-h, --help Display help/usage
check_name A specific check to run
To see which checks will be run, use the --list option:
$ weatherreport --list
Available diagnostic checks:
custodian Shard safety/liveness checks
disk Data directory permissions and atime
internal_replication Check the number of pending internal replication jobs
ioq Check the total number of active IOQ requests
mem3_sync Check there is a registered mem3_sync process
membership Cluster membership validity
memory_use Measure memory usage
message_queues Check for processes with large mailboxes
node_stats Check useful erlang statistics for diagnostics
nodes_connected Cluster node liveness
process_calls Check for large numbers of processes with the same current/initial call
process_memory Check for processes with high memory usage
safe_to_rebuild Check whether the node can safely be taken out of service
search Check the local search node is responsive
tcp_queues Measure the length of tcp queues in the kernel
If you want to see every detail of what WeatherReport is doing, run the checks at a more verbose logging level with the --level option:
$ weatherreport --etc /path/to/etc --level debug
[debug] Not connected to the local cluster node, trying to connect. alive:false connect_failed:undefined
[debug] Starting distributed Erlang.
[debug] Connected to local cluster node '[email protected]'.
[debug] Local RPC: mem3:nodes([]) [5000]
[debug] Local RPC: os:getpid([]) [5000]
[debug] Running shell command: ps -o pmem,rss -p 73905
[debug] Shell command output:
%MEM RSS
0.3 25116
[debug] Local RPC: erlang:nodes([]) [5000]
[debug] Local RPC: mem3:nodes([]) [5000]
[warning] Cluster member [email protected] is not connected to this node. Please check whether it is down.
[info] Process is using 0.3% of available RAM, totalling 25116 KB of real memory.
Most of the time you will want the default level, but any syslog severity name will do (from most to least verbose): debug, info, notice, warning, error, critical, alert, emergency.
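The effect of --level can be pictured as a threshold over that ordered list of severities. The Python snippet below is only an illustrative sketch, not WeatherReport's own implementation: the names SEVERITIES and is_shown are hypothetical, and it assumes a message is printed when its severity is at or above the chosen minimum, which is consistent with the sample outputs in this section.

```python
# Severity names in the order listed above, most verbose first.
SEVERITIES = ["debug", "info", "notice", "warning",
              "error", "critical", "alert", "emergency"]


def is_shown(message_level, minimum="notice"):
    """Return True if a message at message_level passes the minimum level.

    Assumption: a message is shown when its severity is at or above the
    minimum. "notice" is the default stated in the --help output.
    """
    return SEVERITIES.index(message_level) >= SEVERITIES.index(minimum)


print(is_shown("warning"))        # True: warning is above the default notice
print(is_shown("info"))           # False: info is below notice
print(is_shown("info", "debug"))  # True: --level debug lets info through
```

This matches what the examples show: the default run prints only the [warning] line, while the --level debug run also prints the [debug] and [info] lines.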
Finally, if you want to run just a single diagnostic or a specific list of diagnostics, you can pass their names:
$ weatherreport --etc /path/to/etc nodes_connected
[warning] Cluster member [email protected] is not connected to this node. Please check whether it is down.
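When WeatherReport is run from a script, for example as part of a periodic health check, the options described above can be assembled programmatically and the bracketed severity prefix split off each output line. The following Python sketch shows one possible way to do this. The helper names build_command, parse_output, and run_checks are hypothetical and not part of WeatherReport, and the sketch assumes that weatherreport is on the PATH and that every message line starts with a prefix of the form [level], as in the sample outputs.

```python
import re
import subprocess

# Matches lines such as "[warning] Cluster member ... is not connected".
LINE_RE = re.compile(r"^\[(\w+)\]\s+(.*)$")


def build_command(etc_dir, checks=(), level=None):
    """Assemble a weatherreport command line from the documented options."""
    cmd = ["weatherreport", "--etc", etc_dir]
    if level is not None:
        cmd += ["--level", level]
    cmd += list(checks)  # check names are plain positional arguments
    return cmd


def parse_output(text):
    """Split output into (level, message) pairs.

    Lines without a [level] prefix (for example raw shell command
    output printed at debug level) are skipped.
    """
    pairs = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def run_checks(etc_dir, checks=(), level=None):
    """Run weatherreport and return its parsed messages.

    stderr is merged into stdout so the sketch does not depend on
    which stream the tool writes to.
    """
    result = subprocess.run(build_command(etc_dir, checks, level),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            text=True)
    return parse_output(result.stdout)


print(build_command("/path/to/etc", ["nodes_connected"]))
# ['weatherreport', '--etc', '/path/to/etc', 'nodes_connected']
```

A caller could then invoke run_checks("/path/to/etc", ["nodes_connected", "memory_use"]) and alert on any returned pair whose level is warning or higher.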