Important Tools
Important Linux commands
Knowing the following commands will help you find issues faster. Describing each command in detail is out of scope; refer to the man pages or online resources for more information and examples.
- For log parsing: grep, sed, awk, cut, tail, head
- For network checks: nc, netstat, traceroute/traceroute6, mtr, ping/ping6, route, tcpdump, ss, ip
- For DNS: dig, host, nslookup
- For tracing system calls: strace
- For parallel execution over ssh: GNU parallel, xargs + ssh
- For HTTP/HTTPS checks: curl, wget
- For listing open files: lsof
- For modifying kernel parameters at runtime: sysctl
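As a rough illustration of how the log parsing commands above combine, the sketch below counts ERROR lines per component and pulls the timestamps of the most recent entries. The log format, the sample lines, and the function names (`sample_log`, `errors_by_component`, `last_timestamps`) are assumptions made up for this example; `sort` and `uniq` are standard utilities not listed above.

```shell
#!/usr/bin/env bash
# Hypothetical sample log. The layout (timestamp, level, component, message)
# is an assumption for illustration only.
sample_log() {
cat <<'EOF'
2024-01-01T10:00:01 INFO  api  request served in 12ms
2024-01-01T10:00:02 ERROR db   connection timed out
2024-01-01T10:00:03 WARN  api  slow response 950ms
2024-01-01T10:00:04 ERROR api  upstream returned 502
2024-01-01T10:00:05 ERROR db   connection timed out
EOF
}

# Read log lines on stdin and print "<component> <count>" for ERROR lines,
# most frequent component first.
errors_by_component() {
  grep ' ERROR ' \
    | awk '{ print $3 }' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $2, $1 }'
}

# Read log lines on stdin and print the timestamp (first space-separated
# field) of the last two entries.
last_timestamps() {
  tail -n 2 | cut -d' ' -f1
}

sample_log | errors_by_component
sample_log | last_timestamps
```

On a real system the same pipelines would typically read from a log file (for example `errors_by_component < /path/to/app.log`) instead of the embedded sample.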
In distributed systems, third-party tools can help execute commands or instructions on many hosts at once, for example:
- SSH-based tools
  - ClusterSSH: lets you run a command in parallel on many hosts at once.
  - Ansible: lets you write playbooks that can run on hundreds or thousands of hosts at the same time.
- Agent-based tools
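For a handful of hosts, the plain `xargs` + `ssh` combination mentioned earlier is often enough before reaching for a dedicated tool. The following is a minimal sketch, not a replacement for the tools above: the function name `run_on_hosts`, the `REMOTE_SHELL` override, the host names, and the parallelism of 4 are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Read host names on stdin (one per line) and run the given command on each,
# at most 4 hosts at a time.
# REMOTE_SHELL is a hypothetical override: it defaults to ssh, and can be set
# to another command (such as echo) for a dry run.
run_on_hosts() {
  local cmd="$1"
  # -P 4  : run up to 4 invocations in parallel
  # -I{}  : substitute each input line (a host name) for {}
  # $REMOTE_SHELL is intentionally unquoted so it may carry extra options.
  xargs -P 4 -I{} ${REMOTE_SHELL:-ssh} {} "$cmd"
}

# Dry run with made-up host names: prints what would be executed on each
# host instead of connecting to it (line order may vary between runs).
printf 'web1\nweb2\n' | REMOTE_SHELL=echo run_on_hosts uptime
```

Dropping the `REMOTE_SHELL=echo` prefix makes the same call go through `ssh`, which assumes key-based authentication is already set up for the listed hosts.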
Log analysis tools
These tools let you write SQL-like queries to parse and analyse logs, and provide a UI for building dashboards that render various types of charts from those queries.
- ELK: Elasticsearch, Logstash and Kibana together provide a set of tools and services to parse, index, and analyse logs easily and quickly. Once logs are parsed and filtered through Logstash and indexed in Elasticsearch, you can create dynamic dashboards in Kibana in a matter of minutes. This makes it easy to analyse and correlate application errors, exceptions, and warnings.
- Azure Kusto: Azure Kusto is a cloud-based service similar to Elasticsearch and Kibana. It allows easy indexing of large volumes of logs, provides a SQL-like interface for writing queries, and offers an interface for creating dynamic dashboards.