Oracle memory troubleshooting, Part 1: Heapdump Analyzer
by Tanel Poder, posted on January 2, 2009
When troubleshooting Oracle process memory issues such as ORA-4030 errors, or just excessive memory usage, you may want a detailed breakdown of the PGA, UGA and call heaps to see which component in them is the largest.
The same goes for shared pool memory issues and ORA-4031 errors: sometimes you need to dump the shared pool heap metadata to understand what kind of allocations take up most of the space in there.
Heaps can be dumped using the HEAPDUMP event; see
http://www.juliandyke.com/Diagnostics/Dumps/Dumps.html for the syntax.
NB! When dumping SGA heaps (such as the shared, large, java and streams pools), your process holds shared pool latches for the entire duration of the dump, so on busy production instances this should be used only as a last resort. Dumping a big shared pool could hang your instance for quite some time. Dumping private process heaps is safer, as only the target process is affected.
The heapdump output file structure is actually very simple: all you need to look at is the HEAP DUMP header to see which heap the following chunks of memory belong to (there may be multiple heaps dumped into a single tracefile).
HEAP DUMP heap name="sga heap(1,1)" desc=04EA22D0
 extent sz=0xfc4 alt=108 het=32767 rec=9 flg=-125 opc=0
 parent=00000000 owner=00000000 nex=00000000 xsz=0x400000
EXTENT 0 addr=20800000
  Chunk 20800038 sz=   374904    free      " "
  Chunk 2085b8b0 sz=      540    recreate  "KGL handles "  latch=00000000
  Chunk 2085bacc sz=      540    recreate  "KGL handles "  latch=00000000
  Chunk 2085bce8 sz=     1036    freeable  "parameter table"
  Chunk 2085c0f4 sz=     1036    freeable  "parameter table"
  Chunk 2085c500 sz=     1036    freeable  "parameter table"
  Chunk 2085c90c sz=     1036    freeable  "parameter table"
  Chunk 2085cd18 sz=     1036    freeable  "parameter table"
  Chunk 2085d124 sz=      228    recreate  "KGL handles "  latch=00000000
  Chunk 2085d208 sz=      228    recreate  "KGL handles "  latch=00000000
  Chunk 2085d2ec sz=      228    recreate  "KGL handles "  latch=00000000
  Chunk 2085d3d0 sz=      228    recreate  "KGL handles "  latch=00000000
  Chunk 2085d4b4 sz=      228    recreate  "KGL handles "  latch=00000000
  Chunk 2085d598 sz=      540    recreate  "KQR PO "  latch=2734AA00
  Chunk 2085d7b4 sz=      540    recreate  "KQR PO "  latch=2734AA00
  Chunk 2085d9d0 sz=      228    recreate  "KGL handles "  latch=00000000
...
The first list of chunks after HEAP DUMP (the list above) is the list of all chunks in the heap. There are more lists, such as freelists and LRU lists, in a regular heap, but let’s ignore those for now; I’ll write more about heaps in an upcoming post.
After identifying the heap name from the HEAP DUMP line, you can see all the individual chunks in the “Chunk” lines. The second column (after Chunk) shows the start address of the chunk, sz= is the chunk size, and the next column shows the chunk type (free, freeable, recreate, perm, R-free, R-freeable).
The next column is the important one for troubleshooting: it shows the reason why a chunk was allocated (such as KGL handles for library cache handles, KQR PO for dictionary cache parent objects etc). Every chunk in a heap has a fixed 16-byte area in the chunk header which stores the allocation reason (comment) of the chunk. Whenever a client layer allocates heap memory (by calling a kghal* chunk allocation function), it needs to pass in a comment of up to 16 bytes, which is stored in the newly allocated chunk’s header.
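To make the column layout concrete, here is a minimal Python sketch that splits one “Chunk” line into its fields. It is only an illustration: the regular expression and the `parse_chunk` helper are assumptions derived from the sample dump lines in this post, and the exact tracefile layout may differ between Oracle versions.

```python
import re

# Assumed layout, based on the sample lines in this post:
#   Chunk <hex address> sz= <size> <type> "<comment>" [latch=...]
CHUNK_RE = re.compile(
    r'^\s*Chunk\s+(?P<addr>[0-9a-fA-F]+)\s+sz=\s*(?P<size>\d+)\s+'
    r'(?P<type>[A-Za-z-]+)(?:\s+"(?P<comment>[^"]*)")?'
)

def parse_chunk(line):
    """Return the chunk fields as a dict, or None if this is not a Chunk line."""
    m = CHUNK_RE.match(line)
    if m is None:
        return None
    return {
        "addr": m.group("addr"),                        # start address of the chunk
        "size": int(m.group("size")),                   # chunk size in bytes
        "type": m.group("type"),                        # free, freeable, recreate, perm, ...
        "comment": (m.group("comment") or "").strip(),  # allocation reason
    }

sample = 'Chunk 2085d598 sz=      540    recreate  "KQR PO "  latch=2734AA00'
print(parse_chunk(sample))
```

Free chunks carry a blank comment, so their `comment` field comes back as an empty string.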
This gives a simple technique for troubleshooting memory leaks and other memory allocation problems. When you have memory issues, you can just dump all of the heap’s chunk sizes and aggregate them by allocation reason/comment. That shows you the biggest heap occupier and gives further hints about where to look next.
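The aggregation idea can be sketched in a few lines of Python. This is an illustration only, not the heapdump_analyzer script: the `summarize` function and both regular expressions are assumptions based on the sample dump in this post. The sketch also assumes that a “Total heap size” line ends the main chunk list of each heap, so that chunks repeated in later lists (freelists, LRU lists) are not counted twice.

```python
import re
from collections import defaultdict

HEAP_RE = re.compile(r'HEAP DUMP heap name="([^"]*)"')
CHUNK_RE = re.compile(
    r'^\s*Chunk\s+[0-9a-fA-F]+\s+sz=\s*(\d+)\s+([A-Za-z-]+)(?:\s+"([^"]*)")?'
)

def summarize(lines):
    """Sum chunk sizes per (heap, chunk type, comment), largest total first."""
    totals = defaultdict(lambda: [0, 0])   # key -> [total bytes, chunk count]
    heap = None
    in_chunk_list = False
    for line in lines:
        h = HEAP_RE.search(line)
        if h:
            heap, in_chunk_list = h.group(1), True
            continue
        # Assumption: this line closes the "all chunks" list of the current heap.
        if line.lstrip().startswith("Total heap size"):
            in_chunk_list = False
            continue
        c = CHUNK_RE.match(line)
        if c and in_chunk_list:
            key = (heap, c.group(2), (c.group(3) or "").strip())
            totals[key][0] += int(c.group(1))
            totals[key][1] += 1
    rows = [(total, count, key) for key, (total, count) in totals.items()]
    return sorted(rows, key=lambda r: r[0], reverse=True)

# A few lines taken from the sample dump above.
sample = [
    'HEAP DUMP heap name="sga heap(1,1)" desc=04EA22D0',
    'EXTENT 0 addr=20800000',
    '  Chunk 20800038 sz=   374904    free      " "',
    '  Chunk 2085b8b0 sz=      540    recreate  "KGL handles "  latch=00000000',
    '  Chunk 2085bacc sz=      540    recreate  "KGL handles "  latch=00000000',
    '  Chunk 2085bce8 sz=     1036    freeable  "parameter table"',
]
rows = summarize(sample)
for total, count, (heap, ctype, reason) in rows:
    print(f"{total:>10} {count:>7}  {heap}, {ctype}, {reason}")
```

On a real tracefile you would pass the open file object instead of the `sample` list, since `summarize` only needs an iterable of lines.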
As there can be lots of chunks in large heaps, aggregating the data manually would be time consuming (and boring). Here’s a little shell script which can summarize Oracle heapdump output tracefile contents for you:
http://blog.tanelpoder.com/files/scripts/tools/unix/heapdump_analyzer
After taking a heapdump, you just run the script against the tracefile to get a heap summary: total allocation sizes grouped by parent heap, chunk comment and chunk size.
heapdump_analyzer tracefile.trc
Here’s an example of a shared pool dump analysis (heapdump at level 2):
SQL> alter session set events 'immediate trace name heapdump level 2';

Session altered.

SQL> exit
...
$ heapdump_analyzer lin10g_ora_7145.trc

-- Heapdump Analyzer v1.00 by Tanel Poder ( http://www.tanelpoder.com )

Total_size #Chunks  Chunk_size,        From_heap,       Chunk_type, Alloc_reason
---------- ------- ------------ ----------------- ----------------- -----------------
  11943936       3    3981312 ,    sga heap(1,3),             free,
   3981244       1    3981244 ,    sga heap(1,0),             perm, perm
   3980656       1    3980656 ,    sga heap(1,0),             perm, perm
   3980116       1    3980116 ,    sga heap(1,0),             perm, perm
   3978136       1    3978136 ,    sga heap(1,0),             perm, perm
   3977156       1    3977156 ,    sga heap(1,1),         recreate, KSFD SGA I/O b
   3800712       1    3800712 ,    sga heap(1,0),             perm, perm
   3680560       1    3680560 ,    sga heap(1,0),             perm, perm
   3518780       1    3518780 ,    sga heap(1,0),             perm, perm
   3409016       1    3409016 ,    sga heap(1,0),             perm, perm
   3394124       1    3394124 ,    sga heap(1,0),             perm, perm
   2475420       1    2475420 ,    sga heap(1,1),             free,
   2319892       1    2319892 ,    sga heap(1,3),             free,
   2084864     509       4096 ,    sga heap(1,3),         freeable, sql area
...
It shows that the biggest component in the shared pool is 11943936 bytes; it consists of 3 free chunks (3981312 bytes each), which reside in shared pool subpool 1 and sub-sub-pool 3 (see sga heap(1,3) in the From_heap column).
Note that my script is very trivial as of now: it reports different-sized chunks on different lines, so you may still need to do some manual aggregation if there’s no obvious troublemaker at the top of the list.
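That manual aggregation step can also be scripted. The following Python sketch is not part of heapdump_analyzer; the `rollup` function is a hypothetical helper which assumes the comma-separated report layout shown in this post. It sums the report rows per heap, chunk type and allocation reason, ignoring chunk size.

```python
from collections import defaultdict

def rollup(report_lines):
    """Sum Total_size and #Chunks per (heap, chunk type, reason), largest first."""
    totals = defaultdict(lambda: [0, 0])   # key -> [total bytes, chunk count]
    for line in report_lines:
        parts = line.split(",", 3)
        if len(parts) != 4:
            continue                       # banner, separator or blank line
        nums = parts[0].split()
        if len(nums) != 3 or not all(n.isdigit() for n in nums):
            continue                       # column header line
        key = tuple(p.strip() for p in parts[1:])
        totals[key][0] += int(nums[0])     # Total_size column
        totals[key][1] += int(nums[1])     # #Chunks column
    rows = [(total, count, key) for key, (total, count) in totals.items()]
    return sorted(rows, key=lambda r: r[0], reverse=True)

# Rows copied from a heapdump_analyzer report of a level 29 dump.
report = """\
   7595216     116      65476 ,     top uga heap,         freeable, session heap
   6779640     105      64568 ,     session heap,         freeable, kxs-heap-w
    987712       8     123464 ,     top uga heap,         freeable, session heap
    987552       8     123444 ,     session heap,         freeable, kxs-heap-w
    196260       3      65420 ,     session heap,         freeable, kxs-heap-w
""".splitlines()

rolled = rollup(report)
for total, count, (heap, ctype, reason) in rolled:
    print(f"{total:>10} {count:>7}  {heap}, {ctype}, {reason}")
```

Keeping the heap name in the grouping key matters, because the same allocation reason can appear in more than one heap.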
Here’s an example of a summarized heapdump level 29 ( PGA + UGA + call heaps ):
$ heapdump_analyzer lin10g_ora_7145_0002.trc

-- Heapdump Analyzer v1.00 by Tanel Poder ( http://www.tanelpoder.com )

Total_size #Chunks  Chunk_size,        From_heap,       Chunk_type, Alloc_reason
---------- ------- ------------ ----------------- ----------------- -----------------
   7595216     116      65476 ,     top uga heap,         freeable, session heap
   6779640     105      64568 ,     session heap,         freeable, kxs-heap-w
   2035808       8     254476 ,         callheap,         freeable, kllcqas:kllsltb
   1017984       4     254496 ,    top call heap,         freeable, callheap
    987712       8     123464 ,     top uga heap,         freeable, session heap
    987552       8     123444 ,     session heap,         freeable, kxs-heap-w
    196260       3      65420 ,     session heap,         freeable, kxs-heap-w
    159000       5      31800 ,     session heap,         freeable, kxs-heap-w
    112320      52       2160 ,         callheap,             free,
     93240     105        888 ,     session heap,             free,
     82200       5      16440 ,     session heap,         freeable, kxs-heap-w
     65476       1      65476 ,     top uga heap,         recreate, session heap
     65244       1      65244 ,    top call heap,             free,
     56680      26       2180 ,    top call heap,         freeable, callheap
     55936       1      55936 ,     session heap,         freeable, kxs-heap-w
...
You can also use the -t option to show total heap sizes in the output (this total is not computed by my script; I just take the “Total” lines from the heapdump tracefile):
$ heapdump_analyzer -t lin10g_ora_7145_0002.trc | grep Total
Total_size #Chunks  Chunk_size,        From_heap,       Chunk_type, Alloc_reason
   8714788       1    8714788 ,     top uga heap,            TOTAL, Total heap size
   8653464       1    8653464 ,     session heap,            TOTAL, Total heap size
   2169328       2    1084664 ,         callheap,            TOTAL, Total heap size
   1179576       1    1179576 ,    top call heap,            TOTAL, Total heap size
    191892       1     191892 ,         pga heap,            TOTAL, Total heap size
References:
Metalink note 396940.1 – Troubleshooting and Diagnosing ORA-4031 Error
Heapdump syntax – http://www.juliandyke.com/Diagnostics/Dumps/Dumps.html
Heapdump analyzer – http://blog.tanelpoder.com/files/scripts/tools/unix/heapdump_analyzer