These are the tools that I use to figure out what's going on with a program's heap.
time utility
The first step is to run the program with the time tool.
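/usr/bin/time -f "%MKB" ./my_program
This prints the maximum resident set size of the process during its lifetime, in kilobytes. The full path matters: the -f option belongs to GNU time, and the time keyword built into bash does not accept it.
This is just a warm-up: it shows the peak memory use of the whole process, not where the memory goes.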
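If you want the same kind of number from inside the program, here is a minimal sketch. It assumes Linux, where getrusage reports ru_maxrss in kilobytes (other systems may use different units), and peak_rss_kb is a name I made up for illustration.

```c
#include <sys/resource.h>

/* Hypothetical helper: peak resident set size of the calling process.
 * Assumes Linux, where ru_maxrss is in kilobytes. Returns -1 on failure. */
long peak_rss_kb(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}
```

Printing peak_rss_kb() just before the program exits should give a value close to what the time tool reports for the same run.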
The second step is to profile the program with the gperftools heap profiler.
To install it, follow the instructions from my previous post about the gperftools CPU profiler.
The simplest way to use the heap profiler, which doesn't involve recompiling the program, is:
LD_PRELOAD=libtcmalloc.so HEAPPROFILE=mem.prof ./my_program
Then you can view the profile with pprof:
pprof -http=: mem.prof
With the gperftools heap profiler you get a better overview of what is on your program's heap and where the allocations happen.
This tells you much more than the single peak number from time, but the valgrind tools below give even more detail.
Example program:
#include <stdlib.h>

int main() {
    for (int i = 0; i < 20; i++) {
        char* b = malloc(2 << i);  /* 2, 4, 8, ... up to 1 MiB */
        free(b);
    }
    return 0;
}
memcheck
Useful for seeing the heap summary:
valgrind --tool=memcheck ./my_program
==21134== HEAP SUMMARY:
==21134== in use at exit: 0 bytes in 0 blocks
==21134== total heap usage: 20 allocs, 20 frees, 2,097,150 bytes allocated
This tells us that the program made 20 allocations, all of them freed, totaling 2,097,150 bytes (about 2048 KB, 2 bytes short of 2 MiB).
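The byte total can be checked by hand: the loop requests 2 << i bytes for i from 0 to 19. A small sketch that adds these up (total_requested is a hypothetical helper, not part of the example program):

```c
#include <stddef.h>

/* Hypothetical helper: sum of the sizes requested by the example loop,
 * that is 2 << i for every i in [0, rounds). */
size_t total_requested(int rounds) {
    size_t total = 0;
    for (int i = 0; i < rounds; i++) {
        total += (size_t)2 << i;
    }
    return total;
}
```

total_requested(20) returns 2097150, which matches the "bytes allocated" figure in the summary.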
massif
massif is a heap profiler.
It records how many allocations your program makes, how large they are, and where in the program they happen.
It can be run like this:
valgrind --tool=massif --time-unit=B --massif-out-file=massif.data ./my_program
This will collect the data.
Then you can run ms_print to view the data:
ms_print massif.data
This will give you a graph:
[Graph: heap size (y-axis, 0 to 1.000 MB) over time measured in bytes allocated and freed (x-axis, 0 to 4.000 MB), rising in progressively taller steps to the peak]
Number of snapshots: 58
Detailed snapshots: [2, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53, 56 (peak)]
And detailed snapshots of your program's allocations:
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1             24               24                2            22            0
  2             24               24                2            22            0
08.33% (2B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->08.33% (2B) 0x10916D: main (mem.c:5)
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  3             48                0                0             0            0
  4             72               24                4            20            0
  5             96                0                0             0            0
  6            120               24                8            16            0
  7            144                0                0             0            0
  8            168               24               16             8            0
  9            192                0                0             0            0
 10            232               40               32             8            0
 11            232               40               32             8            0
80.00% (32B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->80.00% (32B) 0x10916D: main (mem.c:5)
[...]
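The extra-heap column in these snapshots is consistent with a simple model: each request is rounded up to a 16-byte boundary and charged 8 bytes of bookkeeping, which I believe are massif's defaults for --alignment and --heap-admin. A sketch of that model, with block_cost as a hypothetical helper and both numbers passed in as assumptions:

```c
#include <stddef.h>

/* Hypothetical model of what massif charges for one heap block:
 * the request rounded up to a multiple of `alignment`, plus `admin`
 * bytes of per-block bookkeeping. Both values are assumptions supplied
 * by the caller; 16 and 8 reproduce the snapshot totals shown above. */
size_t block_cost(size_t request, size_t alignment, size_t admin) {
    size_t rounded = (request + alignment - 1) / alignment * alignment;
    return rounded + admin;
}
```

block_cost(2, 16, 8) gives 24 and block_cost(32, 16, 8) gives 40, the total(B) values of snapshots 1 and 10. The difference between that total and the requested size is what shows up as extra-heap(B).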
With the --pages-as-heap=yes option, massif can also track mmap'ed memory, not just memory allocated with malloc, realloc, etc.
dhat
dhat tracks the allocated blocks and inspects every memory access to determine which block, if any, it touches.
It presents information about these blocks such as sizes, lifetimes, numbers of reads and writes, and read and write patterns.
You can run it like this:
valgrind --tool=dhat --dhat-out-file=dhat.data ./my_program
Then open dh_view.html (found in /usr/libexec/valgrind on Debian) in your browser and load the dhat.data file to view it.
It will show something like this:
Invocation {
  Mode:    heap
  Command: ./a.out
  PID:     35568
}
Times {
  t-gmax: 150,676 instrs (97.36% of program duration)
  t-end:  154,764 instrs
}
─ PP 1/1 {
    Total:     2,097,150 bytes (100%, 13,550,631.93/Minstr) in 20 blocks (100%, 129.23/Minstr), avg size 104,857.5 bytes, avg lifetime 72 instrs (0.05% of program duration)
    At t-gmax: 1,048,576 bytes (100%) in 1 blocks (100%), avg size 1,048,576 bytes
    At t-end:  0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
    Reads:     0 bytes (0%, 0/Minstr), 0/byte
    Writes:    0 bytes (0%, 0/Minstr), 0/byte
    Allocated at {
      #0: [root]
      #1: 0x10916D: main (mem.c:5)
    }
  }
PP significance threshold: total >= 0.2 blocks (1%)
This shows that there are 2,097,150 bytes allocated in 20 blocks from mem.c:5 that are never read or written.
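For comparison, here is a sketch of a variant that does touch its memory. fill_and_sum is a hypothetical helper; for blocks allocated through it, I would expect dhat's Reads and Writes lines to be nonzero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `size` bytes, write every byte,
 * read every byte back, then free the block. Returns the sum of
 * the bytes, or 0 if the allocation fails. */
size_t fill_and_sum(size_t size) {
    unsigned char *b = malloc(size);
    if (b == NULL) {
        return 0;
    }
    memset(b, 1, size);              /* writes `size` bytes */
    size_t sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum += b[i];                 /* reads `size` bytes */
    }
    free(b);
    return sum;
}
```

Calling fill_and_sum(2 << i) in the loop instead of the bare malloc and free keeps the allocation pattern the same while making every block both written and read. Compile without optimization so the compiler has no chance to drop the accesses.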