[Bug]: [0105 main standalone regression]tpcc 100-1000 test oom. #21083

Closed
1 task done
Ariznawlll opened this issue Jan 5, 2025 · 11 comments
Assignees
Labels
kind/bug Something isn't working phase/testing severity/s0 Extreme impact: Cause the application to break down and seriously affect the use
Milestone

Comments

@Ariznawlll
Contributor

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

main

Commit ID

04a3ebb

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

job url: https://github.com/matrixorigin/mo-auto-test/actions/runs/12609544410/job/35143260248

image image

heap_profile:
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_019431f2-2043-77fd-bfd0-03be3115f4df.gz
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_019431f2-9574-73ff-8b03-40e71dc5889b.gz
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_019431f8-b3f1-7bc9-ad78-45ba48c19568.gz

malloc_profile:
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_malloc_019431f9-297b-72af-9bc5-1ef4fcd0369c.gz
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_malloc_019431f8-b457-717d-8245-26d09d0565d2.gz
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_malloc_019431f2-2953-7ee2-ab50-7f0f09a60641.gz

Expected Behavior

No response

Steps to Reproduce

trigger workflow on 129.

Additional information

No response

@Ariznawlll Ariznawlll added kind/bug Something isn't working needs-triage severity/s0 Extreme impact: Cause the application to break down and seriously affect the use labels Jan 5, 2025
@Ariznawlll Ariznawlll added this to the 2.1.0 milestone Jan 5, 2025
@daviszhen
Contributor

image

@daviszhen
Contributor

image

@daviszhen
Contributor

image

@gouhongshen gouhongshen changed the title from "[Bug]: [0105 main standalone regression]tpcc 100-100 test oom." to "[Bug]: [0105 main standalone regression]tpcc 100-1000 test oom." Jan 6, 2025
@reusee
Contributor

reusee commented Jan 9, 2025

Looking at the Go heap profile, in the in-use view logtail accounts for about 9 GB and table scan for about 3.6 GB. These two should be optimized first.

Screenshot From 2025-01-09 10-59-05
Screenshot From 2025-01-09 10-58-40
Screenshot From 2025-01-09 11-01-26

@reusee
Contributor

reusee commented Jan 9, 2025

Comparing two allocs profiles taken within the few minutes before the OOM gives the allocations made during that interval:

LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_0194468a-d809-7aeb-8c55-6634c67da734.gz
LOG_7c4dccb4-4d3c-41f8-b482-5251dc7a41bf_heap_0194468c-379c-7854-a26b-b6336bf4158b.gz

A total of 368 GB was allocated:
Screenshot From 2025-01-09 11-38-45

Breakdown:
Screenshot From 2025-01-09 11-39-50

No single object type accounts for a very high share of the allocations, but together they add up to a lot.

@reusee
Contributor

reusee commented Jan 9, 2025

Comparing in-use between the same two profiles as well:

Screenshot From 2025-01-09 11-44-23

The space held by logtail increased considerably. A DoMergeAndWrite portion also newly appears.

@XuPeng-SH
Contributor

fixed

@Ariznawlll
Contributor Author


6 participants