
OOM while generating tiles #42

Open
pnorman opened this issue Apr 19, 2024 · 1 comment


pnorman commented Apr 19, 2024

On the test server, tilekiln was killed while generating z13. It could have been something else using the memory, but that seems unlikely, since it wasn't getting killed earlier in the run.

```
Rendering 67108864 tiles over 8 threads
 36%|███████████████████████████████████████████████████████                                                                                                    | 23862146/67108864 [1:48:20<18:52:14, 636.59it/s]
Process ForkPoolWorker-2:
Process ForkPoolWorker-7:
Process ForkPoolWorker-4:
Process ForkPoolWorker-3:
Process ForkPoolWorker-6:
Process ForkPoolWorker-1:
Killed
Process ForkPoolWorker-8:
Traceback (most recent call last):
```
dmesg reports:

```
[4927139.468828] postgres invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
[4927139.469013] CPU: 1 PID: 3157903 Comm: postgres Not tainted 6.1.0-18-amd64 #1 Debian 6.1.76-1
[4927139.469188] Hardware name: Gigabyte Technology Co., Ltd. B360 HD3P-LM/B360HD3PLM-CF, BIOS F7b HZ 07/29/2021
[4927139.469365] Call Trace:
[4927139.469518]
[4927139.469669] dump_stack_lvl+0x44/0x5c
[4927139.469823] dump_header+0x4a/0x211
[4927139.469976] oom_kill_process.cold+0xb/0x10
[4927139.470130] out_of_memory+0x1fd/0x4c0
[4927139.470284] __alloc_pages_slowpath.constprop.0+0xc9e/0xdf0
[4927139.470443] __alloc_pages+0x305/0x330
[4927139.470595] folio_alloc+0x17/0x50
[4927139.470748] __filemap_get_folio+0x155/0x340
[4927139.470904] filemap_fault+0x139/0x910
[4927139.471056] ? filemap_map_pages+0x153/0x700
[4927139.471211] __do_fault+0x30/0x110
[4927139.471364] do_fault+0x1b9/0x410
[4927139.471517] __handle_mm_fault+0x660/0xfa0
[4927139.471673] handle_mm_fault+0xdb/0x2d0
[4927139.471826] do_user_addr_fault+0x191/0x580
[4927139.471980] exc_page_fault+0x70/0x170
[4927139.472134] asm_exc_page_fault+0x22/0x30
[4927139.472287] RIP: 0033:0x7fa3a0ccdded
[4927139.472442] Code: Unable to access opcode bytes at 0x7fa3a0ccddc3.
[4927139.472598] RSP: 002b:00007ffdfb6bf488 EFLAGS: 00010293
[4927139.472755] RAX: 0000000000000004 RBX: 0000000000000040 RCX: 00007fa3a71a5628
[4927139.472927] RDX: ff51afd7ed558ccd RSI: 000000000000000d RDI: 00007fa3a3e1b2e2
[4927139.473099] RBP: 00005561b4cdd430 R08: 0000000000000001 R09: 0000000000000001
[4927139.473271] R10: 00005561b4b9ce00 R11: 0000000000000008 R12: 00005561b4bf1940
[4927139.473443] R13: 00005561b4bf1940 R14: 00007ffdfb6bf528 R15: 00005561b4baab18
[4927139.473617]
[4927139.473807] Mem-Info:
[4927139.473958] active_anon:5908285 inactive_anon:10004114 isolated_anon:0 active_file:132 inactive_file:274 isolated_file:0 unevictable:20547 dirty:7 writeback:0 slab_reclaimable:219261 slab_unreclaimable:32109 mapped:54811 shmem:34238 pagetables:100654 sec_pagetables:0 bounce:0 kernel_misc_reclaimable:0 free:120213 free_pcp:131 free_cma:0
[4927139.474265] Node 0 active_anon:23633140kB inactive_anon:40016456kB active_file:840kB inactive_file:2692kB unevictable:82188kB isolated(anon):0kB isolated(file):0kB mapped:219244kB dirty:28kB writeback:0kB shmem:136952kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 176128kB writeback_tmp:0kB kernel_stack:8416kB pagetables:402616kB sec_pagetables:0kB all_unreclaimable? no
[4927139.474522] Node 0 DMA free:11264kB boost:0kB min:12kB low:24kB high:36kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[4927139.474755] lowmem_reserve[]: 0 813 64141 64141 64141
[4927139.474911] Node 0 DMA32 free:253988kB boost:0kB min:856kB low:1688kB high:2520kB reserved_highatomic:0KB active_anon:427340kB inactive_anon:186980kB active_file:16kB inactive_file:0kB unevictable:0kB writepending:0kB present:947456kB managed:881920kB mlocked:0kB bounce:0kB free_pcp:712kB local_pcp:0kB free_cma:0kB
[4927139.475157] lowmem_reserve[]: 0 0 63328 63328 63328
[4927139.475321] Node 0 Normal free:207612kB boost:0kB min:66708kB low:131552kB high:196396kB reserved_highatomic:0KB active_anon:23205800kB inactive_anon:39829488kB active_file:3000kB inactive_file:3332kB unevictable:82188kB writepending:28kB present:66052096kB managed:64856000kB mlocked:82188kB bounce:0kB free_pcp:1552kB local_pcp:0kB free_cma:0kB
[4927139.475581] lowmem_reserve[]: 0 0 0 0 0
[4927139.475733] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
[4927139.475918] Node 0 DMA32: 113*4kB (UE) 100*8kB (UME) 252*16kB (UME) 156*32kB (UME) 116*64kB (UME) 326*128kB (UME) 88*256kB (UME) 24*512kB (UME) 10*1024kB (UME) 7*2048kB (ME) 33*4096kB (M) = 253988kB
[4927139.477914] Node 0 Normal: 5359*4kB (UE) 3530*8kB (UME) 6697*16kB (UE) 1353*32kB (UE) 9*64kB (ME) 3*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 201084kB
[4927139.478119] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[4927139.478299] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[4927139.478473] 870183 total pagecache pages
[4927139.478625] 811126 pages in swap cache
[4927139.478777] Free swap = 0kB
[4927139.478928] Total swap = 33519612kB
[4927139.479083] 16753885 pages RAM
[4927139.479233] 0 pages HighMem/MovableOnly
[4927139.479387] 315565 pages reserved
[4927139.479540] 0 pages hwpoisoned
[4927139.479693] Tasks state (memory values in pages):
[4927139.479847] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[4927139.480030] [ 449] 0 449 28881 153 212992 202 -250 systemd-journal
[4927139.480233] [ 466] 0 466 6579 19 69632 362 -1000 systemd-udevd
[4927139.480406] [ 540] 0 540 907 44 45056 38 0 mdadm
[4927139.480582] [ 549] 104 549 22514 55 73728 181 0 systemd-timesyn
[4927139.480764] [ 642] 0 642 1652 23 53248 38 0 cron
[4927139.480940] [ 643] 103 643 2313 117 53248 69 -900 dbus-daemon
[4927139.481116] [ 645] 0 645 55444 223 73728 189 0 rsyslogd
[4927139.481305] [ 646] 0 646 2830 73 61440 230 0 smartd
[4927139.481484] [ 647] 0 647 4293 130 73728 167 0 systemd-logind
[4927139.481658] [ 654] 0 654 895 18 45056 17 0 atd
[4927139.481832] [ 952] 109 952 21538 20461 208896 14 0 varnishncsa
[4927139.482007] [ 1149] 0 1149 1468 2 49152 19 0 agetty
[4927139.482201] [ 1152] 0 1152 3853 52 69632 279 -1000 sshd
[4927139.482377] [ 1157] 106 1157 58095 390 188416 543 -900 postgres
[4927139.482555] [ 1158] 106 1158 58322 33297 417792 742 0 postgres
[4927139.482733] [ 1159] 106 1159 58211 33478 413696 660 0 postgres
[4927139.482909] [ 1161] 106 1161 58095 229 131072 599 0 postgres
[4927139.483092] [ 1162] 106 1162 58547 20706 417792 641 0 postgres
[4927139.483281] [ 1163] 106 1163 58510 281 143360 630 0 postgres
[4927139.483454] [ 1221] 1000 1221 4715 141 73728 272 100 systemd
[4927139.483629] [ 1222] 1000 1222 42191 76 90112 719 100 (sd-pam)
[4927139.483802] [ 6938] 0 6938 2097 0 53248 253 0 hitch
[4927139.483974] [ 6939] 110 6939 7916 718 102400 5373 0 hitch
[4927139.484149] [ 6940] 110 6940 2097 12 53248 246 0 hitch
[4927139.484323] [ 6952] 1000 6952 3777 1076 69632 629 0 tmux: server
[4927139.484498] [ 6953] 1000 6953 2736 1 65536 770 0 bash
[4927139.484672] [ 7198] 1000 7198 2451 1 65536 468 0 bash
[4927139.484844] [ 27393] 1000 27393 3119 0 65536 1203 0 bash
[4927139.485018] [ 152151] 1000 152151 2066 1 57344 418 0 bash
[4927139.485210] [ 152652] 1000 152652 2558 1 57344 566 0 bash
[4927139.485391] [ 161850] 1000 161850 4202 1 69632 2107 0 bash
[4927139.485567] [ 173159] 1000 173159 180952 1663 155648 1904 0 postgres_export
[4927139.485745] [ 173169] 106 173169 59407 25060 421888 805 0 postgres
[4927139.485922] [ 173238] 1000 173238 401622 8991 724992 5551 0 prometheus
[4927139.486098] [ 755803] 1000 755803 2132 1 57344 462 0 bash
[4927139.486293] [3072865] 0 3072865 4426 0 77824 404 0 sshd
[4927139.486467] [3072877] 1000 3072877 4491 62 77824 412 0 sshd
[4927139.486639] [3072882] 1000 3072882 2064 32 53248 370 0 bash
[4927139.486812] [3072886] 1000 3072886 2001 42 57344 42 0 tmux: client
[4927139.486988] [ 806018] 0 806018 4707 28 65536 1594 0 varnishd
[4927139.487161] [ 806033] 108 806033 304691 18032 2228224 159446 0 cache-main
[4927139.487337] [ 873701] 1000 873701 2066 1 57344 410 0 bash
[4927139.487510] [ 873870] 1000 873870 5619 0 86016 392 0 psql
[4927139.487685] [ 873873] 106 873873 58860 1859 212992 905 0 postgres
[4927139.487861] [3157693] 1000 3157693 20720 0 204800 12202 0 tilekiln
[4927139.488036] [3157696] 1000 3157696 4606 0 69632 1410 0 python3
[4927139.488228] [3157697] 1000 3157697 20830 4326 204800 8248 0 python3
[4927139.488402] [3157698] 1000 3157698 20857 2806 200704 9786 0 python3
[4927139.488577] [3157699] 1000 3157699 20862 1783 204800 10794 0 python3
[4927139.488752] [3157700] 1000 3157700 20832 4811 200704 8331 0 python3
[4927139.488928] [3157701] 1000 3157701 20830 2301 200704 10267 0 python3
[4927139.489123] [3157702] 1000 3157702 20862 2819 204800 9664 0 python3
[4927139.489298] [3157703] 1000 3157703 20830 1793 200704 10783 0 python3
[4927139.489475] [3157704] 1000 3157704 20862 1784 208896 10826 0 python3
[4927139.489649] [3157705] 1000 3157705 20830 2778 208896 9673 0 python3
[4927139.489822] [3157706] 1000 3157706 20856 326 208896 12104 0 python3
[4927139.489997] [3157707] 1000 3157707 20832 5207 204800 7859 0 python3
[4927139.490169] [3157708] 1000 3157708 20860 322 200704 12103 0 python3
[4927139.490342] [3157733] 106 3157733 58651 895 159744 795 0 postgres
[4927139.490516] [3157734] 106 3157734 58651 877 159744 793 0 postgres
[4927139.490689] [3157735] 106 3157735 58651 923 159744 791 0 postgres
[4927139.490864] [3157736] 106 3157736 58651 884 159744 790 0 postgres
[4927139.491039] [3157737] 106 3157737 58651 905 159744 794 0 postgres
[4927139.491214] [3157738] 106 3157738 58651 932 163840 787 0 postgres
[4927139.491388] [3157739] 106 3157739 58651 938 159744 784 0 postgres
[4927139.491563] [3157740] 106 3157740 58651 926 159744 792 0 postgres
[4927139.491738] [3157741] 106 3157741 58651 941 163840 771 0 postgres
[4927139.491914] [3157742] 106 3157742 58651 956 163840 778 0 postgres
[4927139.492088] [3157743] 106 3157743 58651 929 159744 761 0 postgres
[4927139.492264] [3157744] 106 3157744 58651 942 163840 781 0 postgres
[4927139.492441] [3157745] 106 3157745 58651 949 163840 790 0 postgres
[4927139.492617] [3157746] 106 3157746 58651 936 159744 770 0 postgres
[4927139.492793] [3157747] 106 3157747 58651 937 163840 782 0 postgres
[4927139.492966] [3157748] 106 3157748 58651 899 163840 795 0 postgres
[4927139.493156] [3157749] 106 3157749 58651 922 163840 793 0 postgres
[4927139.493332] [3157750] 106 3157750 58651 897 163840 785 0 postgres
[4927139.493513] [3157751] 106 3157751 58651 884 159744 799 0 postgres
[4927139.493687] [3157752] 106 3157752 58651 939 163840 795 0 postgres
[4927139.493862] [3157753] 106 3157753 58651 911 159744 793 0 postgres
[4927139.494037] [3157754] 106 3157754 58651 934 163840 765 0 postgres
[4927139.494229] [3157755] 106 3157755 58651 906 159744 789 0 postgres
[4927139.496168] [3157756] 106 3157756 58651 925 163840 763 0 postgres
[4927139.496340] [3157900] 1000 3157900 57997 3278 221184 9966 0 tilekiln
[4927139.496515] [3157903] 106 3157903 104642 42037 716800 2864 0 postgres
[4927139.496687] [3174356] 1000 3174356 2066 1 53248 411 0 bash
[4927139.496860] [3184700] 1000 3184700 21725225 15071057 173625344 6533578 0 tilekiln
[4927139.497034] [3193199] 1000 3193199 3465914 3756 27660288 3396249 0 tilekiln
[4927139.497224] [3193200] 106 3193200 58685 1510 176128 689 0 postgres
[4927139.497403] [3193201] 106 3193201 58651 1026 172032 775 0 postgres
[4927139.497576] [3193202] 1000 3193202 3465914 3641 27660288 3395741 0 tilekiln
[4927139.497751] [3193203] 106 3193203 58662 1365 176128 690 0 postgres
[4927139.497927] [3193204] 106 3193204 58651 980 172032 775 0 postgres
[4927139.498101] [3193205] 1000 3193205 3465914 3985 27660288 3395754 0 tilekiln
[4927139.498294] [3193206] 106 3193206 58662 1810 176128 690 0 postgres
[4927139.498469] [3193207] 106 3193207 58651 999 172032 775 0 postgres
[4927139.498643] [3193208] 1000 3193208 3465914 4164 27660288 3395744 0 tilekiln
[4927139.498817] [3193209] 106 3193209 58662 1939 176128 691 0 postgres
[4927139.498991] [3193210] 106 3193210 58651 1010 172032 775 0 postgres
[4927139.499165] [3193211] 1000 3193211 3465914 3672 27660288 3395740 0 tilekiln
[4927139.499340] [3193212] 106 3193212 58662 1911 176128 691 0 postgres
[4927139.499515] [3193213] 106 3193213 58651 1002 172032 775 0 postgres
[4927139.499690] [3193214] 1000 3193214 3465914 3886 27660288 3395745 0 tilekiln
[4927139.499865] [3193215] 106 3193215 58662 1547 176128 691 0 postgres
[4927139.500038] [3193216] 106 3193216 58651 984 172032 775 0 postgres
[4927139.500213] [3193217] 1000 3193217 3465914 4071 27660288 3395748 0 tilekiln
[4927139.500389] [3193218] 106 3193218 58662 1239 176128 691 0 postgres
[4927139.500562] [3193219] 106 3193219 58651 1020 172032 775 0 postgres
[4927139.500736] [3193220] 1000 3193220 3465914 3821 27660288 3395740 0 tilekiln
[4927139.500909] [3193224] 106 3193224 58662 2064 176128 691 0 postgres
[4927139.501098] [3193225] 106 3193225 58651 1051 172032 775 0 postgres
[4927139.501280] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-2.scope,task=tilekiln,pid=3184700,uid=1000
[4927139.501512] Out of memory: Killed process 3184700 (tilekiln) total-vm:86900900kB, anon-rss:60284212kB, file-rss:0kB, shmem-rss:4kB, UID:1000 pgtables:169556kB oom_score_adj:0
[4927144.014349] oom_reaper: reaped process 3184700 (tilekiln), now anon-rss:0kB, file-rss:0kB, shmem-rss:4kB
```

From the log, the tilekiln and postgres processes at the bottom were the generation run.

```
[4927139.479847] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[4927139.496860] [3184700]  1000 3184700 21725225 15071057 173625344  6533578             0 tilekiln
[4927139.497034] [3193199]  1000 3193199  3465914     3756 27660288  3396249             0 tilekiln
[4927139.497224] [3193200]   106 3193200    58685     1510   176128      689             0 postgres
[4927139.497403] [3193201]   106 3193201    58651     1026   172032      775             0 postgres
[4927139.497576] [3193202]  1000 3193202  3465914     3641 27660288  3395741             0 tilekiln
[4927139.497751] [3193203]   106 3193203    58662     1365   176128      690             0 postgres
[4927139.497927] [3193204]   106 3193204    58651      980   172032      775             0 postgres
...
```

Converting units (total_vm, rss, and swapents are in 4 KiB pages; pgtables_bytes is in bytes): the top tilekiln process has total_vm ≈ 83 GB and rss ≈ 57 GB, matching the kernel's kill line (total-vm:86900900kB, anon-rss:60284212kB), with ~166 MB of page tables. The other tilekiln processes each have total_vm ≈ 13 GB, a tiny rss, and ~3.4M swapents (~13 GB swapped out each), consistent with free swap being exhausted.
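The conversions above can be sanity-checked with a few lines of Python (just arithmetic on the task-table values from the log, not anything tilekiln-specific):

```python
PAGE = 4096   # x86-64 page size in bytes; total_vm/rss/swapents are page counts
GiB = 1024 ** 3

def task_mem(total_vm_pages, rss_pages, pgtables_bytes, swapents_pages):
    """Convert one row of the oom-killer task table to GiB."""
    return {
        "total_vm_gib": total_vm_pages * PAGE / GiB,
        "rss_gib": rss_pages * PAGE / GiB,
        "pgtables_gib": pgtables_bytes / GiB,
        "swap_gib": swapents_pages * PAGE / GiB,
    }

# Killed process 3184700 (the parent tilekiln)
parent = task_mem(21725225, 15071057, 173625344, 6533578)
# One of the worker tilekiln processes, pid 3193199
worker = task_mem(3465914, 3756, 27660288, 3396249)

# Cross-check against the kernel's kill line: total-vm:86900900kB
assert 21725225 * PAGE // 1024 == 86900900
```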


pnorman commented Apr 19, 2024

Note: this was done by feeding the list of all z13 tiles into tilekiln, as the whole-zoom generation function is not yet coded. I should check whether this is still an issue once that exists, because that's the correct way to generate a high zoom level.
