add tt command scenario. #847 (#1381)

pull/1367/head
Hollow Man 5 years ago committed by GitHub
parent 47c2da4924
commit 19ed2103bb
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -1,6 +1,8 @@
tt
==
[`tt` online tutorial](https://alibaba.github.io/arthas/arthas-tutorials?language=en&id=command-tt)
Check the `parameters`, `return values` and `exceptions` of the methods at different times.
`watch` is a powerful command but due to its feasibility and complexity, it's quite hard to locate the issue effectively.

@ -1,6 +1,8 @@
tt
===
[`tt`在线教程](https://alibaba.github.io/arthas/arthas-tutorials?language=cn&id=command-tt)
> 方法执行数据的时空隧道,记录下指定方法每次调用的入参和返回信息,并能对这些不同的时间下调用进行观测
`watch` 虽然很方便和灵活,但需要提前想清楚观察表达式的拼写,这对排查问题而言要求太高,因为很多时候我们并不清楚问题出自于何方,只能靠蛛丝马迹进行猜测。

@ -0,0 +1,16 @@
在新的`Terminal 2`里,下载`arthas-boot.jar`,再用`java -jar`命令启动:
`wget https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar`{{execute T2}}
`arthas-boot`是`Arthas`的启动程序它启动后会列出所有的Java进程用户可以选择需要诊断的目标进程。
选择第一个进程,输入 `1`{{execute T2}} ,再`Enter/回车`
Attach成功之后会打印Arthas LOGO。输入 `help`{{execute T2}} 可以获取到更多的帮助信息。
![Arthas Boot](/arthas/scenarios/common-resources/assets/arthas-boot.png)

@ -0,0 +1,10 @@
下载`arthas-demo.jar`,再用`java -jar`命令启动:
`wget https://alibaba.github.io/arthas/arthas-demo.jar
java -jar arthas-demo.jar`{{execute T1}}
`arthas-demo`是一个很简单的程序,它随机生成整数,再执行因式分解,把结果打印出来。如果生成的随机数是负数,则会打印提示信息。

@ -0,0 +1,12 @@
在“tt”中我们演示了了Arthas的tt命令。如果有更多的技巧或者使用疑问欢迎在Issue里提出。
* Issues: https://github.com/alibaba/arthas/issues
* 文档: https://alibaba.github.io/arthas
如果您在使用Arthas请让我们知道您的使用对我们非常重要[查看](https://github.com/alibaba/arthas/issues/111)
欢迎关注公众号获取Arthas项目的信息源码分析案例实践。
![Arthas公众号](/arthas/scenarios/common-resources/assets/qrcode_gongzhonghao.jpg)

@ -0,0 +1,35 @@
{
"title": "Arthas tt命令",
"description": "Arthas tt命令",
"difficulty": "精通者",
"time": "10-20 分钟",
"details": {
"steps": [
{
"title": "启动arthas demo",
"text": "arthas-demo.md"
},
{
"title": "启动arthas-boot",
"text": "arthas-boot.md"
},
{
"title": "tt命令",
"text": "tt.md"
}
],
"intro": {
"text": "intro.md"
},
"finish": {
"text": "finish.md"
}
},
"environment": {
"uilayout": "terminal"
},
"backend": {
"imageid": "java",
"environmentsprotocol": "http"
}
}

@ -0,0 +1,23 @@
![Arthas](https://alibaba.github.io/arthas/_images/arthas.png)
`Arthas` 是Alibaba开源的Java诊断工具深受开发者喜爱。在线排查问题无需重启动态跟踪Java代码实时监控JVM状态。
`Arthas` 支持JDK 6+支持Linux/Mac/Windows采用命令行交互模式同时提供丰富的 `Tab` 自动补全功能,进一步方便进行问题的定位和诊断。
当你遇到以下类似问题而束手无策时Arthas可以帮助你解决
- 这个类从哪个 jar 包加载的?为什么会报各种类相关的 Exception
- 我改的代码为什么没有执行到?难道是我没 commit分支搞错了
- 遇到问题无法在线上 debug难道只能通过加日志再重新发布吗
- 线上遇到某个用户的数据处理有问题,但线上同样无法 debug线下无法重现
- 是否有一个全局视角来查看系统的运行状况?
- 有什么办法可以监控到JVM的实时运行状态
- 怎么快速定位应用的热点,生成火焰图?
本教程会以一个简单的应用为例演示tt命令。
* Github: https://github.com/alibaba/arthas
* 文档: https://alibaba.github.io/arthas/

@ -0,0 +1,189 @@
> 方法执行数据的时空隧道,记录下指定方法每次调用的入参和返回信息,并能对这些不同的时间下调用进行观测
`watch` 虽然很方便和灵活,但需要提前想清楚观察表达式的拼写,这对排查问题而言要求太高,因为很多时候我们并不清楚问题出自于何方,只能靠蛛丝马迹进行猜测。
这个时候如果能记录下当时方法调用的所有入参和返回值、抛出的异常会对整个问题的思考与判断非常有帮助。
于是乎TimeTunnel 命令就诞生了。
### 使用参考
#### 记录调用
对于一个最基本的使用来说,就是记录下当前方法的每次调用环境现场。
`tt -t demo.MathGame primeFactors`{{execute T2}}
```bash
$ tt -t demo.MathGame primeFactors
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 66 ms.
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
```
- 命令参数解析
- `-t`
tt 命令有很多个主参数,`-t` 就是其中之一。这个参数的表明希望记录下类 `*Test``print` 方法的每次执行情况。
- `-n 3`
当你执行一个调用量不高的方法时可能你还能有足够的时间用 `CTRL+C` 中断 tt 命令记录的过程,但如果遇到调用量非常大的方法,瞬间就能将你的 JVM 内存撑爆。
此时你可以通过 `-n` 参数指定你需要记录的次数,当达到记录次数时 Arthas 会主动中断tt命令的记录过程避免人工操作无法停止的情况。
- 表格字段说明
|表格字段|字段解释|
|---|---|
|INDEX|时间片段记录编号每一个编号代表着一次调用后续tt还有很多命令都是基于此编号指定记录操作非常重要。|
|TIMESTAMP|方法执行的本机时间,记录了这个时间片段所发生的本机时间|
|COST(ms)|方法执行的耗时|
|IS-RET|方法是否以正常返回的形式结束|
|IS-EXP|方法是否以抛异常的形式结束|
|OBJECT|执行对象的`hashCode()`注意曾经有人误认为是对象在JVM中的内存地址但很遗憾他不是。但他能帮助你简单的标记当前执行方法的类实体|
|CLASS|执行的类名|
|METHOD|执行的方法名|
- 条件表达式
不知道大家是否有在使用过程中遇到以下困惑
* Arthas 似乎很难区分出重载的方法
* 我只需要观察特定参数,但是 tt 却全部都给我记录了下来
条件表达式也是用 `OGNL` 来编写,核心的判断对象依然是 `Advice` 对象。除了 `tt` 命令之外,`watch`、`trace`、`stack` 命令也都支持条件表达式。
- 解决方法重载
`tt -t *Test print params.length==1`
通过制定参数个数的形式解决不同的方法签名,如果参数个数一样,你还可以这样写
`tt -t *Test print 'params[1] instanceof Integer'`
- 解决指定参数
`tt -t *Test print params[0].mobile=="13989838402"`
- 构成条件表达式的 `Advice` 对象
前边看到了很多条件表达式中,都使用了 `params[0]`,有关这个变量的介绍,请参考[表达式核心变量](https://alibaba.github.io/arthas/advice-class.html)
#### 检索调用记录
当你用 `tt` 记录了一大片的时间片段之后,你希望能从中筛选出自己需要的时间片段,这个时候你就需要对现有记录进行检索。
假设我们有这些记录
`tt -l`{{execute T2}}
```bash
$ tt -l
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
9
1005 2018-12-04 11:15:43 0.4776 false true 0x4b67cf4d MathGame primeFactors
Affect(row-cnt:6) cost in 4 ms.
```
我需要筛选出 `primeFactors` 方法的调用信息
`tt -s 'method.name=="primeFactors"'`{{execute T2}}
```bash
$ tt -s 'method.name=="primeFactors"'
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
9
1005 2018-12-04 11:15:43 0.4776 false true 0x4b67cf4d MathGame primeFactors
Affect(row-cnt:6) cost in 607 ms.
```
你需要一个 `-s` 参数。<span style="color:red;">同样的,搜索表达式的核心对象依旧是 `Advice` 对象。</span>
#### 查看调用信息
对于具体一个时间片的信息而言,你可以通过 `-i` 参数后边跟着对应的 `INDEX` 编号查看到他的详细信息。
`tt -i 1003`{{execute T2}}
```bash
$ tt -i 1003
INDEX 1003
GMT-CREATE 2018-12-04 11:15:41
COST(ms) 0.186073
OBJECT 0x4b67cf4d
CLASS demo.MathGame
METHOD primeFactors
IS-RETURN false
IS-EXCEPTION true
PARAMETERS[0] @Integer[-564322413]
THROW-EXCEPTION java.lang.IllegalArgumentException: number is: -564322413, need >= 2
at demo.MathGame.primeFactors(MathGame.java:46)
at demo.MathGame.run(MathGame.java:24)
at demo.MathGame.main(MathGame.java:16)
Affect(row-cnt:1) cost in 11 ms.
```
#### 重做一次调用
当你稍稍做了一些调整之后,你可能需要前端系统重新触发一次你的调用,此时得求爷爷告奶奶的需要前端配合联调的同学再次发起一次调用。而有些场景下,这个调用不是这么好触发的。
`tt` 命令由于保存了当时调用的所有现场信息,所以我们可以自己主动对一个 `INDEX` 编号的时间片自主发起一次调用,从而解放你的沟通成本。此时你需要 `-p` 参数。通过 `--replay-times` 指定
调用次数,通过 `--replay-interval` 指定多次调用间隔(单位ms, 默认1000ms)
`tt -i 1004 -p`{{execute T2}}
```bash
$ tt -i 1004 -p
RE-INDEX 1004
GMT-REPLAY 2018-12-04 11:26:00
OBJECT 0x4b67cf4d
CLASS demo.MathGame
METHOD primeFactors
PARAMETERS[0] @Integer[946738738]
IS-RETURN true
IS-EXCEPTION false
COST(ms) 0.186073
RETURN-OBJ @ArrayList[
@Integer[2],
@Integer[11],
@Integer[17],
@Integer[2531387],
]
Time fragment[1004] successfully replayed.
Affect(row-cnt:1) cost in 14 ms.
```
你会发现结果虽然一样,但调用的路径发生了变化,有原来的程序发起变成了 Arthas 自己的内部线程发起的调用了。
- 需要强调的点
1. **ThreadLocal 信息丢失**
很多框架偷偷的将一些环境变量信息塞到了发起调用线程的 ThreadLocal 中,由于调用线程发生了变化,这些 ThreadLocal 线程信息无法通过 Arthas 保存,所以这些信息将会丢失。
一些常见的 CASE 比如:鹰眼的 TraceId 等。
2. **引用的对象**
需要强调的是,`tt` 命令是将当前环境的对象引用保存起来,但仅仅也只能保存一个引用而已。如果方法内部对入参进行了变更,或者返回的对象经过了后续的处理,那么在 `tt` 查看的时候将无法看到当时最准确的值。这也是为什么 `watch` 命令存在的意义。

@ -0,0 +1,16 @@
In the new `Terminal 2`, download `arthas-boot.jar` and start with the `java -jar` command:
`wget https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar`{{execute T2}}
`arthas-boot` is the launcher for `Arthas`. It lists all the Java processes, and the user can select the target process to be diagnosed.
Select the first process, type `1`{{execute T2}} then type `Enter`
After the Attach is successful, Arthas LOGO is printed. Enter `help`{{execute T2}} for more help.
![Arthas Boot](/arthas/scenarios/common-resources/assets/arthas-boot.png)

@ -0,0 +1,11 @@
Download `arthas-demo.jar` and start with the `java -jar` command:
`wget https://alibaba.github.io/arthas/arthas-demo.jar
java -jar arthas-demo.jar`{{execute T1}}
`arthas-demo` is a very simple program that randomly generates integers, performs factorization, and prints the results.
If the generated random number is negative, a error message will be printed.

@ -0,0 +1,8 @@
The `tt Tutorial` demonstrates the usage of tt. If you have more tips or questions, please feel free to ask in Issue.
* Issues: https://github.com/alibaba/arthas/issues
* Documentation: https://alibaba.github.io/arthas
If you are using Arthas, please let us know. Your use is very important to us: [View](https://github.com/alibaba/arthas/issues/111)

@ -0,0 +1,35 @@
{
"title": "Arthas tt Command",
"description": "Arthas tt Command",
"difficulty": "master",
"time": "10-20 minutes",
"details": {
"steps": [
{
"title": "Arthas demo",
"text": "arthas-demo.md"
},
{
"title": "Start arthas-boot",
"text": "arthas-boot.md"
},
{
"title": "tt Command",
"text": "tt.md"
}
],
"intro": {
"text": "intro.md"
},
"finish": {
"text": "finish.md"
}
},
"environment": {
"uilayout": "terminal"
},
"backend": {
"imageid": "java",
"environmentsprotocol": "http"
}
}

@ -0,0 +1,40 @@
![Arthas](https://alibaba.github.io/arthas/_images/arthas.png)
`Arthas` is a Java diagnostic tool open-sourced by Alibaba middleware team. Arthas helps developers in trouble-shooting issues in production environment for Java based applications without modifying code or restarting servers.
`Arthas` supports JDK 6+, supports Linux/Mac/Windows.
## Background
Oftentimes the production system network is inaccessible from local development environment. If issues are encountered in production systems, it is impossible to use IDE to debug the application remotely. Whats even worse, debugging in production environment is unacceptable, as it will suspend all the threads, leading to services downtime.
Developers could always try to reproduce the same issue on the test/staging environment. However, this is tricky as some issues cannot be reproduced easily in a different environment, or even disappear once restarted.
And if youre thinking of adding some logs to your code to help trouble-shoot the issue, you will have to go through the following lifecycle: test, staging, and then to production. Time is money! This approach is inefficient! Worse still, the issue may not be fixed since it might be irreproducible once the JVM is restarted, as described above.
Arthas is built to solve these issues. A developer can troubleshoot production issues on the fly. No JVM restart, no additional code changes. Arthas works as an observer, that is, it will never suspend your running threads.
## Key features
- Check whether a class is loaded? Or where the class is loaded from? (Useful for trouble-shooting jar file conflicts)
- Decompile a class to ensure the code is running as expected.
- Check classloader statistics, e.g. the number of classloaders, the number of classes loaded per classloader, the classloader hierarchy, possible classloader leaks, etc.
- Check the method invocation details, e.g. method parameter, returned values, exceptions and etc.
- Check the stack trace of specified method invocation. This is useful when a developer wants to know the caller of the method.
- Trace the method invocation to find slow sub-invocations.
- Monitor method invocation statistics, e.g. QPS (Query Per Second), RT (Return Time), success rate and etc.
- Monitor system metrics, thread states and CPU usage, GC statistics and etc.
- Supports command line interactive mode, with auto-complete feature enabled.
- Supports telnet and WebSocket, which enables both local and remote diagnostics with command line and browsers.
- Supports profiler/Flame Graph
- Supports JDK 6+
- Supports Linux/Mac/Windows
This tutorial takes a simple application as an example to demonstrate the the usage of tt.
* Github: https://github.com/alibaba/arthas
* Docs: https://alibaba.github.io/arthas/

@ -0,0 +1,163 @@
Check the `parameters`, `return values` and `exceptions` of the methods at different times.
`watch` is a powerful command but due to its feasibility and complexity, it's quite hard to locate the issue effectively.
In such difficulties, `tt` comes into play.
With the help of `tt` (*TimeTunnel*), you can check the contexts of the methods at different times in execution history.
### Usage
#### Record method calls
`tt -t demo.MathGame primeFactors`{{execute T2}}
```bash
$ tt -t demo.MathGame primeFactors
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 66 ms.
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
```
* `-t`
record the calling context of the method `demo.MathGame primeFactors`
* `-n 3`
limit the number of the records (avoid overflow for too many records; with `-n` option, Arthas can automatically stop recording once the records reach the specified limit)
* Property
|Name|Specification|
|---|---|
|INDEX|the index for each call based on time|
|TIMESTAMP|time to invoke the method|
|COST(ms)|time cost of the method call|
|IS-RET|whether method exits with normal return|
|IS-EXP|whether method failed with exceptions|
|OBJECT|`hashCode()` of the object invoking the method|
|CLASS|class name of the object invoking the method|
|METHOD|method being invoked|
* Condition expression
Tips:
1. `tt -t *Test print params.length==1` with different amounts of parameters;
2. `tt -t *Test print 'params[1] instanceof Integer'` with different types of parameters;
3. `tt -t *Test print params[0].mobile=="13989838402"` with specified parameter.
Advanced:
* [Critical fields in expression](https://alibaba.github.io/arthas/en/advice-class.html)
* [Special usage](https://github.com/alibaba/arthas/issues/71)
* [OGNL official guide](https://commons.apache.org/proper/commons-ognl/language-guide.html)
#### List all records
`tt -l`{{execute T2}}
```bash
$ tt -l
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
9
1005 2018-12-04 11:15:43 0.4776 false true 0x4b67cf4d MathGame primeFactors
Affect(row-cnt:6) cost in 4 ms.
```
#### Searching for records
`tt -s 'method.name=="primeFactors"'`{{execute T2}}
```bash
$ tt -s 'method.name=="primeFactors"'
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-12-04 11:15:38 1.096236 false true 0x4b67cf4d MathGame primeFactors
1001 2018-12-04 11:15:39 0.191848 false true 0x4b67cf4d MathGame primeFactors
1002 2018-12-04 11:15:40 0.069523 false true 0x4b67cf4d MathGame primeFactors
1003 2018-12-04 11:15:41 0.186073 false true 0x4b67cf4d MathGame primeFactors
1004 2018-12-04 11:15:42 17.76437 true false 0x4b67cf4d MathGame primeFactors
9
1005 2018-12-04 11:15:43 0.4776 false true 0x4b67cf4d MathGame primeFactors
Affect(row-cnt:6) cost in 607 ms.
```
Advanced:
* [Critical fields in expression](advice-class.md)
#### Check context of the call
`tt -i 1003`{{execute T2}}
Using `tt -i <index>` to check a specific calling details.
```bash
$ tt -i 1003
INDEX 1003
GMT-CREATE 2018-12-04 11:15:41
COST(ms) 0.186073
OBJECT 0x4b67cf4d
CLASS demo.MathGame
METHOD primeFactors
IS-RETURN false
IS-EXCEPTION true
PARAMETERS[0] @Integer[-564322413]
THROW-EXCEPTION java.lang.IllegalArgumentException: number is: -564322413, need >= 2
at demo.MathGame.primeFactors(MathGame.java:46)
at demo.MathGame.run(MathGame.java:24)
at demo.MathGame.main(MathGame.java:16)
Affect(row-cnt:1) cost in 11 ms.
```
### Replay record
Since Arthas stores the context of the call, you can even *replay* the method calling afterwards with extra option `-p` to replay the issue for advanced troubleshooting, option `--replay-times`
define the replay execution times, option `--replay-interval` define the interval(unit in ms,with default value 1000) of replays
`tt -i 1004 -p`{{execute T2}}
```bash
$ tt -i 1004 -p
RE-INDEX 1004
GMT-REPLAY 2018-12-04 11:26:00
OBJECT 0x4b67cf4d
CLASS demo.MathGame
METHOD primeFactors
PARAMETERS[0] @Integer[946738738]
IS-RETURN true
IS-EXCEPTION false
RETURN-OBJ @ArrayList[
@Integer[2],
@Integer[11],
@Integer[17],
@Integer[2531387],
]
Time fragment[1004] successfully replayed.
Affect(row-cnt:1) cost in 14 ms.
```
F.Y.I
1. **Loss** of the `ThreadLocal`
Arthas save params into an array, then invoke the method with the params again. The method execute in another thread, so the `ThreadLocal` **lost**.
1. params may be modified
Arthas save params into an array, they are object references. The Objects may be modified by other code.
Loading…
Cancel
Save