269 Commits

Author SHA1 Message Date
hengyoush
b4e131fefa fix: kyanos exited after BPF loading finished when the json-output option was specified 2025-04-22 00:29:05 +08:00
烈香
de11f7a3f4 refactor: replace tp with fentry if possible (#311) 2025-04-19 09:02:43 +08:00
烈香
60e053cf1f perf: skip unnecessary protocol infer when specify in cmd line (#310) 2025-04-15 00:22:58 +08:00
hengyoush
9ce4c6620e feat(openssl): support openssl 3.5.0 2025-04-10 10:30:04 +08:00
hengyoush
a57788406f feat(protocol): support dns protocol 2025-04-10 10:30:04 +08:00
烈香
e88a230b49 fix: when tracking ssl events there is a concurrency problem in the global map (#305)
* fix: concurrent read/write problem

* fix: workflow error
2025-03-19 21:02:36 +08:00
烈香
8ccbc4544d fix: #300 (#302) 2025-03-13 19:35:52 +08:00
AvaIon
8b8fe640b7 feat: Support MongoDB protocol (#275)
* The preliminary parsing of the MongoDB protocol has been completed, but there are still issues such as incorrect end times.

* MongoDB unit test done

* feat: support mongo

* fix: install mongosh failed

* fix: install mongodb shell

---------

Signed-off-by: 烈香 <hengyoush1@163.com>
Co-authored-by: xiaoweihao <xiaoweihao@tp-link.com.hk>
Co-authored-by: 烈香 <hengyoush1@163.com>
2025-02-27 22:29:55 +08:00
hengyoush
eb3958576b fix: dead loop bug
2025-02-16 19:16:55 +08:00
Laitron
da149ed0c9 feat: check pid when attaching uprobe. (#284) 2025-02-14 03:08:12 +08:00
烈香
1886dedb8e feat(openssl): support openssl 3.4.1 (#292) 2025-02-13 20:17:49 +08:00
hengyoush
e1aa7f0c6a fix: use typeSize instead of hard coded '4' 2025-02-12 21:41:58 +08:00
hengyoush
8ca51dbc38 feat: add "max-allow-stuck-time-mills" option 2025-01-31 14:04:18 +08:00
烈香
237c9e377f fix: fix memory leak (#281)
* fix: fix memory leak

fix: fix test_filter_by_remote_port test

* fix: fix index out of range error

---------

Signed-off-by: 烈香 <hengyoush1@163.com>
2025-01-28 17:10:12 +08:00
烈香
4532e6bd42 refactor: change tcp seq type to uint32 (#280)
delete ringbuffer.go

Signed-off-by: hengyoush <hengyoush1@163.com>
2025-01-28 15:05:51 +08:00
烈香
9a8c64da4e feat: support trace socket event for ipv6 (#278) 2025-01-26 23:12:23 +08:00
烈香
1bf214922d feat: add options to control whether trace dev/socket/ssl events (#277)
* feat: add options to control whether trace dev/socket/ssl events

* refactor: adjust watch render
2025-01-26 17:49:20 +08:00
烈香
782e138667 feat: add an option to control whether to start gops for purpose of debugging (#276) 2025-01-24 11:17:48 +08:00
烈香
42267e4ed9 fix: big syscall data truncated may lead to failing to parse HTTP message (#274)
fix: handle big syscall data (truncated) properly

When we fail to read the body, it might be due to the response being too large, causing syscall data
to be missing when transferred to user space. Here, we attempt to find a boundary. If found, that's
ideal and we return immediately. Otherwise, we try to locate a Fake Data Mark (FDM). When user space
detects missing data from the kernel (possibly due to exceeding MAX_MSG_SIZE or situations like
readv/writev where a buffer array is read/written at once), it supplements with fake data in user
space. At the beginning of this fake data, an FDM is set, which is a special string. Following the
FDM, the length of the supplemental fake data (minus the length of the FDM) is written.
2025-01-24 03:08:42 +08:00
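The Fake Data Mark scheme described in commit 42267e4ed9 can be sketched in Go roughly as follows. This is a minimal illustration, not the project's code: the marker value `__FAKE_DATA__`, the 4-byte big-endian length field, and the function names `makeFakeData` and `skipFakeData` are all assumptions, since the commit message does not give them.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// fdm is a hypothetical marker value; the real marker string is not given
// in the commit message.
var fdm = []byte("__FAKE_DATA__")

// lenFieldSize is an assumed encoding: a 4-byte big-endian length.
const lenFieldSize = 4

// makeFakeData builds a filler of exactly `missing` bytes: the marker,
// then the filler length minus the marker length, then zero padding.
func makeFakeData(missing int) ([]byte, error) {
	if missing < len(fdm)+lenFieldSize {
		return nil, fmt.Errorf("gap of %d bytes is too small to hold the marker", missing)
	}
	buf := make([]byte, missing)
	copy(buf, fdm)
	binary.BigEndian.PutUint32(buf[len(fdm):], uint32(missing-len(fdm)))
	return buf, nil
}

// skipFakeData returns the offset just past the first fake-data region in
// data, or -1 if no complete region is found.
func skipFakeData(data []byte) int {
	i := bytes.Index(data, fdm)
	if i < 0 || len(data) < i+len(fdm)+lenFieldSize {
		return -1
	}
	n := int(binary.BigEndian.Uint32(data[i+len(fdm):]))
	end := i + len(fdm) + n
	if n < lenFieldSize || end > len(data) {
		return -1
	}
	return end
}

func main() {
	fake, err := makeFakeData(32)
	if err != nil {
		fmt.Println(err)
		return
	}
	stream := append([]byte("HTTP body start"), fake...)
	stream = append(stream, []byte("tail")...)
	// Prints the tail that follows the fake-data region.
	fmt.Printf("%q\n", stream[skipFakeData(stream):])
}
```

A parser that fails to find a message boundary could call something like `skipFakeData` to jump over the filler and resume parsing at the returned offset.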
hengyoush
927d0dce0f refactor: optimize event process logic 2025-01-21 01:14:21 +08:00
烈香
06c9b2cfa4 fix: add check for AttrDataMemberLoc when val is []uint8 (#273) 2025-01-20 19:19:28 +08:00
hengyoush
06c7267c61 feat(protocol): support filter by apikeys and topic
fix: fix test
2025-01-19 21:34:11 +08:00
hengyoush
50878fb96a test(e2e): add kafka e2e test
test(e2e): remove -it options from docker command
2025-01-19 21:34:11 +08:00
hengyoush
4643212a85 test(unittest): add unittest for kafka protocol parsing 2025-01-19 21:34:11 +08:00
hengyoush
31b3410598 feat: support kafka protocol
fix(bpf): fix stack size limit exceeded
2025-01-19 21:34:11 +08:00
烈香
57c75421f4 refactor: optimize logs (#268) 2025-01-13 01:00:21 +08:00
烈香
3d3e1e4065 test(e2e): add e2e test for sendfile & server side https (#264) 2025-01-09 20:48:32 +08:00
烈香
05b2c4075b feat(protocol): introduce the concept of streams to prepare for future support of HTTP2 and Mongo (#258)
Signed-off-by: 烈香 <hengyoush1@163.com>
2025-01-09 13:51:01 +08:00
烈香
6d507da90e fix(bpf/ssl): first HTTPS request on the server side might not be captured (#259) 2025-01-09 03:00:09 +08:00
Laitron
6653fef907 feat: new version detection (#256) 2025-01-08 14:43:29 +08:00
烈香
3f6a44c753 feat: support for parsing ipip packet (#257)
* feat: support for parsing ipip packet

    This PR introduces a new feature for parsing IPIP packets and correctly associating them.

    Additionally, this PR improves the current logic in processor.go to prevent the incorrect association of syscall and kernel events. When new events arrive, they are first enqueued and then processed only if they have been in the queue longer than a specified time limit. This is necessary because when many short connections use the same tgid-fd, syscall and kernel events may arrive asynchronously in user space. As a result, events from a new connection might reach user space before the connection event itself, causing the new connection's events to be incorrectly associated with the old connection and leading to erroneous time calculations.

    To ensure that the total time calculation is never negative, the syscall event reports the syscall start time and the syscall duration. Adding the duration to the start time gives the end time. This way, when calculating the client's elapsed time, we can subtract the start time of the write syscall from the end time of the read syscall.

    Additionally, to ensure that DEV_IN and TCP_IN events are present when the server receives the first request, the concept of a first packet event is introduced. Even if the kernel does not find conn_info or other information when reporting the event, as long as its seq=1, it is considered a first packet and is reported directly to user space. In user space, the connection is found based on its sock key, and the event is then converted into a kernevent for processing. This way, even for the server's first request, we can see the total time and read from socket time.

* fix: remove bpf_printk statements

* feat: add first-packet-event-map-page-num option

* refactor: translate comments to english
2025-01-08 12:38:35 +08:00
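Two ideas from commit 3f6a44c753 can be illustrated with a small Go sketch: holding events in a queue until they have aged past a time limit, and deriving a syscall's end time from its start time plus duration. The `event` and `delayQueue` types, their field names, and the nanosecond units are hypothetical and are not taken from processor.go.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a syscall or kernel event.
type event struct {
	name       string
	arrivedNs  uint64 // when user space received the event
	startNs    uint64 // syscall start time reported with the event
	durationNs uint64 // syscall duration reported with the event
}

// endNs derives the end time from the start time plus the duration.
func (e event) endNs() uint64 { return e.startNs + e.durationNs }

// delayQueue holds events in arrival order until they are old enough.
type delayQueue struct {
	minAgeNs uint64
	items    []event
}

func (q *delayQueue) push(e event) { q.items = append(q.items, e) }

// drain removes and returns the events that have waited at least minAgeNs.
func (q *delayQueue) drain(nowNs uint64) []event {
	n := 0
	for n < len(q.items) && nowNs >= q.items[n].arrivedNs+q.minAgeNs {
		n++
	}
	ready := append([]event(nil), q.items[:n]...)
	q.items = q.items[n:]
	return ready
}

func main() {
	q := &delayQueue{minAgeNs: 100}
	write := event{name: "write", arrivedNs: 0, startNs: 1000, durationNs: 20}
	read := event{name: "read", arrivedNs: 80, startNs: 1500, durationNs: 30}
	q.push(write)
	q.push(read)

	fmt.Println(len(q.drain(120))) // only the write event is old enough
	fmt.Println(len(q.drain(200))) // the read event is released now

	// Client-side elapsed time: end of the read minus start of the write.
	fmt.Println(read.endNs() - write.startNs)
}
```

Delaying processing gives a late connection event time to arrive before the events that depend on it are associated with a connection.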
xmchx
8ff2696e1d feat: support rocketMQ (#231)
feat: support rocketMQ

---------

Signed-off-by: spencercjh <spencercjh@gmail.com>
Co-authored-by: Spencer Cai <spencercjh@gmail.com>
Co-authored-by: 烈香 <hengyoush1@163.com>
2025-01-07 19:42:33 +08:00
mannkafai
141c810edd user: add command-line options to set perf event buffer size (#247)
* user: add command-line options to set perf event buffer size

add `syscall-mapsize`, `ssl-mapsize`, `conn-mapsize`, `kern-mapsize` command-line options to set `pageNum` of `PullSyscallDataEvents`, `PullSslDataEvents`, `PullConnDataEvents` and `PullKernEvents`.

* user: add command-line options to set pageNum of perf event buffer

add `syscall-perf-event-map-page-num`, `ssl-perf-event-map-page-num`, `conn-perf-event-map-page-num`, `kern-perf-event-map-page-num` command-line options to set pageNum of `SyscallDataEvents`, `SslDataEvents`, `ConnDataEvents` and `KernEvents`.

* mark `*-perf-event-map-page-num` options hidden
2025-01-05 18:22:23 +08:00
烈香
cb48df0480 fix: fix mysql protocol parser array index out of range issue and gotls load failed issue (#246)
* fix: fix gotls load failed

* fix: crash issue and gotls load failed issue
2025-01-04 00:58:31 +08:00
Spencer Cai
6d0b142054 feat: check cap privileges instead of Geteuid during starting the agent (#242)
* feat: Introduce github.com/containerd/containerd/pkg/cap to check whether process has CAP_BPF privilege

Signed-off-by: spencercjh <spencercjh@gmail.com>

* fix: better logs

* fix: adapt to e2e test env

* style: go mod tidy

* fix: make tests pass

* fix: DO NOT use containerd cap package

* test: introduce tests to verify agent/common/permission.go

* fix: correct the implementation, referring to https://man7.org/linux/man-pages/man2/capset.2.html

* test: test test_add_cap_bpf first

* test: cap-add different capabilities for different kernels

* test: load btf file to container and run kyanos with --btf flag

* test: add missing capability CAP_SYS_RESOURCE

* test: try to use --privileged instead of cap-add

---------

Signed-off-by: spencercjh <spencercjh@gmail.com>
2025-01-03 21:05:54 +08:00
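As a rough illustration of checking capabilities rather than the effective UID (commit 6d0b142054), the Go sketch below reads the effective capability mask from `/proc/self/status`. This is an alternative technique chosen to stay within the standard library; the commit itself refers to the capset(2) interface. The helper names and the policy of accepting CAP_SYS_ADMIN as a fallback for CAP_BPF are assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers from the Linux headers.
const (
	capSysAdmin = 21
	capBPF      = 39 // available since Linux 5.8
)

// parseCapEff extracts the effective capability mask from the text of
// /proc/<pid>/status.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(hex, 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

// hasCap reports whether bit c is set in the capability mask.
func hasCap(mask uint64, c uint) bool { return mask&(uint64(1)<<c) != 0 }

// canLoadBPF is an assumed policy: CAP_BPF, or CAP_SYS_ADMIN as a fallback
// for kernels that predate CAP_BPF.
func canLoadBPF(mask uint64) bool {
	return hasCap(mask, capBPF) || hasCap(mask, capSysAdmin)
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read process status:", err)
		return
	}
	mask, err := parseCapEff(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("CapEff=%#x canLoadBPF=%v\n", mask, canLoadBPF(mask))
}
```

Unlike a `Geteuid() == 0` check, a capability check also accepts a non-root process that was granted the needed capabilities, and rejects a root process inside a container that had them dropped.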
烈香
ca70f6db07 fix: use tracepoint instead of kprobe to trace skb_copy_datagram_iovec (#243) 2025-01-03 19:43:22 +08:00
烈香
a60ee5ef53 fix(test): fix filter-by-comm.test occasionally failed (#244) 2025-01-03 01:56:31 +08:00
AS!
1c0dd7288a feat: Add json-output params to watch command (#235)
* feat: Add json-output params to watch command

* docs: modify some field explanations

* docs: modify column name
2025-01-02 18:46:46 +08:00
烈香
08feac8ceb fix: handle null (\xfb) field correctly (#239) 2025-01-01 15:40:08 +08:00
烈香
e604005391 test: wait a longer time to ensure process exec event can be handled (#238) 2025-01-01 12:30:38 +08:00
烈香
9567bbe9af fix: server side ssl event can't be captured correctly (#236)
1. collect sendfile syscall events (nginx may send static files to the client via the sendfile syscall)
2. when a conntrack is created, transfer the old connection's temp events to the new conn, because some events may arrive before the conn is created in user space.
3. ignore recvmsg and recvfrom syscalls with the flags MSG_OOB or MSG_PEEK.
2025-01-01 04:36:21 +08:00
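Point 3 of commit 9567bbe9af can be sketched as a simple flag test. The Go snippet below is only an illustration of the idea; the function name `shouldTraceRecv` is hypothetical, and the real check presumably lives in the BPF program rather than in Go. The flag values are the Linux definitions of MSG_OOB and MSG_PEEK.

```go
package main

import "fmt"

// Linux values of the recv flags named in the commit message.
const (
	msgOOB  = 0x1 // MSG_OOB: out-of-band data
	msgPeek = 0x2 // MSG_PEEK: read without consuming
)

// shouldTraceRecv reports whether a recvmsg/recvfrom call with the given
// flags should contribute data to the traced stream.
func shouldTraceRecv(flags int) bool {
	return flags&(msgOOB|msgPeek) == 0
}

func main() {
	fmt.Println(shouldTraceRecv(0))             // ordinary read
	fmt.Println(shouldTraceRecv(msgPeek))       // peek only
	fmt.Println(shouldTraceRecv(msgOOB | 0x40)) // out-of-band plus another flag
}
```

A call with MSG_PEEK leaves the data in the socket buffer, so the same bytes are returned again by the next ordinary read; counting both would duplicate them in the reassembled stream.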
烈香
78d1c633ad fix: add fallback logic to calculate totaltime when nicin event missed in server side (#232)
2024-12-31 01:02:55 +08:00
烈香
1d9f0135e9 docs: add how to add a new protocol docs (#223)
* docs: add how to add a new protocol docs

* docs: add missing cn docs
2024-12-29 19:15:43 +08:00
Spencer Cai
d22c466db2 feat: Introduce more flags to filter HTTP records (#220)
* feat: introduce path-regex and path-prefix to sub cmd http

Signed-off-by: spencercjh <spencercjh@gmail.com>

* style: reformat with goimports

Signed-off-by: spencercjh <spencercjh@gmail.com>

* fix: save FilterByRequest's result as HttpFilter's field

Signed-off-by: spencercjh <spencercjh@gmail.com>

* docs: update docs about HttpFilter

Signed-off-by: spencercjh <spencercjh@gmail.com>

---------

Signed-off-by: spencercjh <spencercjh@gmail.com>
2024-12-27 11:34:55 +08:00
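The `path-regex` and `path-prefix` flags from commit d22c466db2 suggest a filter along the following lines. This Go sketch is not the project's `HttpFilter`; the `httpFilter` type, its fields, and the rule that a path must satisfy every filter that is set are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// httpFilter is a hypothetical filter over request paths.
type httpFilter struct {
	pathPrefix string         // empty means "not set"
	pathRegex  *regexp.Regexp // nil means "not set"
}

// newHTTPFilter compiles the regex once so that matching stays cheap.
func newHTTPFilter(prefix, pattern string) (httpFilter, error) {
	f := httpFilter{pathPrefix: prefix}
	if pattern != "" {
		re, err := regexp.Compile(pattern)
		if err != nil {
			return httpFilter{}, fmt.Errorf("invalid path regex: %w", err)
		}
		f.pathRegex = re
	}
	return f, nil
}

// match reports whether path satisfies every filter that is set.
func (f httpFilter) match(path string) bool {
	if f.pathPrefix != "" && !strings.HasPrefix(path, f.pathPrefix) {
		return false
	}
	if f.pathRegex != nil && !f.pathRegex.MatchString(path) {
		return false
	}
	return true
}

func main() {
	f, err := newHTTPFilter("/api/", `/users/\d+$`)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, p := range []string{"/api/users/42", "/api/users/me", "/static/users/42"} {
		fmt.Println(p, f.match(p))
	}
}
```

Rejecting an invalid pattern at construction time surfaces a bad command-line value immediately instead of on the first captured request.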
烈香
7a4c410d28 fix(stat): elapsed time is negative (#213)
* fix(stat): elapsed time is negative

introduce a new option `conntrack-close-wait-time-mills` which controls how long a connection waits
before it turns into the `closed` state. If the wait is too long, data from a new connection with the
same tgidfd may end up in the old connection's event stream or syscall data buffer. Setting it to a
relatively small value prevents this situation.

* fix: add missing argument
2024-12-25 03:11:42 +08:00
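The effect of the close-wait time described in commit 7a4c410d28 can be shown with a toy model in Go. Everything here is hypothetical (the `tracker` and `conn` types, the integer key standing in for tgid-fd, the time units); it only demonstrates why a long wait lets a reused tgid-fd feed events into the old connection.

```go
package main

import "fmt"

// conn is a hypothetical tracked connection.
type conn struct {
	id        int
	closing   bool   // set once a close event has been seen
	closeAtNs uint64 // time at which the connection counts as closed
}

// tracker routes events to connections keyed by tgid-fd.
type tracker struct {
	closeWaitNs uint64
	nextID      int
	conns       map[uint64]*conn
}

func newTracker(closeWaitNs uint64) *tracker {
	return &tracker{closeWaitNs: closeWaitNs, conns: make(map[uint64]*conn)}
}

// onClose starts the close-wait period for the connection under key.
func (t *tracker) onClose(key, nowNs uint64) {
	if c, ok := t.conns[key]; ok && !c.closing {
		c.closing = true
		c.closeAtNs = nowNs + t.closeWaitNs
	}
}

// route returns the connection an event arriving at nowNs is attached to,
// creating a new one if none exists or the old one is fully closed.
func (t *tracker) route(key, nowNs uint64) *conn {
	c, ok := t.conns[key]
	if !ok || (c.closing && nowNs >= c.closeAtNs) {
		t.nextID++
		c = &conn{id: t.nextID}
		t.conns[key] = c
	}
	return c
}

func main() {
	for _, wait := range []uint64{100, 10} {
		t := newTracker(wait)
		t.route(7, 0)     // first connection on tgid-fd 7
		t.onClose(7, 1000) // it closes at t=1000
		// An event from a reused tgid-fd arrives shortly afterwards.
		fmt.Printf("wait=%d -> event lands on conn %d\n", wait, t.route(7, 1050).id)
	}
}
```

With the longer wait, the event at t=1050 is still attached to connection 1; with the shorter wait it starts connection 2, which is the behavior the option is meant to make achievable.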
xmchx
cad39e194a fix: supports container-id prefix matching with 12 or more characters (#210) 2024-12-23 13:46:10 +08:00
烈香
4be02272ec refactor: remove fatal log (#199) 2024-12-20 02:30:50 +08:00
烈香
f384ad8821 fix: add rw lock to prevent concurrent map read write (#198)
* fix(kern_event_handler): add rw lock to prevent concurrent map read write

* fix: separate ssl in/out locks

* fix: remove lock to prevent reentrant issue
2024-12-20 00:45:54 +08:00
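The locking approach named in commit f384ad8821 (a read/write lock around a shared map, with separate locks for SSL inbound and outbound events) might look like the Go sketch below. The `eventTable` and `sslEvents` types are hypothetical and do not mirror kern_event_handler.

```go
package main

import (
	"fmt"
	"sync"
)

// eventTable is a hypothetical map of buffered events guarded by an RWMutex.
type eventTable struct {
	mu sync.RWMutex
	m  map[uint64][]string
}

func newEventTable() *eventTable {
	return &eventTable{m: make(map[uint64][]string)}
}

// add takes the write lock because it mutates the map.
func (t *eventTable) add(key uint64, ev string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.m[key] = append(t.m[key], ev)
}

// count takes only the read lock, so concurrent readers do not block
// each other.
func (t *eventTable) count(key uint64) int {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return len(t.m[key])
}

// sslEvents keeps inbound and outbound tables apart so that each direction
// has its own lock.
type sslEvents struct {
	in, out *eventTable
}

func main() {
	s := sslEvents{in: newEventTable(), out: newEventTable()}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.in.add(1, "read")
			s.out.add(1, "write")
			_ = s.in.count(1)
		}()
	}
	wg.Wait()
	fmt.Println(s.in.count(1), s.out.count(1))
}
```

Go's `sync.RWMutex` is not reentrant: a method that already holds the lock must not call another method that acquires it, or the goroutine deadlocks. Keeping each locked section small and free of calls to other locking methods avoids that.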
烈香
f84fd438f2 feat: Print osinfo when start failed (#191)
* feat: print os info when start failed

* feat: add system info logging for crash reports

* refactor: remove unused log

* refactor: add faq url
2024-12-19 01:54:08 +08:00
hengyoush
cfac4b2c80 refactor: improve terminal color check 2024-12-17 23:15:14 +08:00