Lecture File
slides/04_dynamic_mem.pdf
Data and Memory
Heap allocation, malloc/free, lifetimes, dangling pointers, leaks, growing arrays, and doubling capacity.
Dynamic memory is the tool C gives you when the amount of data, or the lifetime of that data, cannot be handled cleanly by the stack. A stack array is useful when the size is fixed and the array only needs to live inside the current function call. The heap is useful when the size is discovered at runtime, when the data must survive after a function returns, or when the program needs to grow storage as input arrives.
The tradeoff is responsibility. Stack memory is automatically reclaimed when a function returns. Heap memory is controlled by the programmer: allocate it with malloc, use it through a pointer, and release it with free exactly once when the program is finished with it.
This lecture builds two major habits:
free.The slide memory diagram divides program memory into text, data, heap, unallocated space, and stack. The important distinction for this course is lifetime:
free.Example of invalid lifetime:
int *multiples(int n) {
int arr[4];
for (int i = 0; i < 4; ++i) {
arr[i] = (i + 1) * n;
}
return arr; // wrong: arr dies when multiples returns
}
arr is a local stack array. Returning arr returns a pointer to the first element, but those elements stop being valid when the function returns. The pointer value may still look like an address, but dereferencing it is undefined behavior.
The lecture code lecture_code/cpl/dynamic/lifetimes/badMultiples.c makes this visible by calling the bad function more than once. The later function call reuses stack space, so earlier pointers may appear to change underneath you. With one compiler you may see a crash; with another you may see overwritten values. Both outcomes come from the same bug: the program used a pointer after the pointed-to object died.
malloc and freemalloc asks the heap for a number of contiguous bytes:
int *p = malloc(sizeof(int) * 1);
int *arr = malloc(sizeof(int) * 10);
Important rules:
malloc returns a pointer to the first byte of the allocation.malloc can fail and return NULL.free.The lecture code lecture_code/cpl/dynamic/growingArray/simpleMalloc.c shows all of these points:
int *p = malloc(sizeof(int) * 1);
if (!p) {
fprintf(stderr, "Malloc failed... I simply can't continue");
return 1;
}
int *arr = malloc(sizeof(int) * 10);
if (!arr) {
fprintf(stderr, "Malloc failed... I simply can't continue");
}
After the program writes valid values, it frees both allocations:
free(p);
p = NULL;
free(arr);
arr = NULL;
Setting a pointer to NULL after free is defensive. It does not fix aliases. If two pointers point to the same heap block and one is set to NULL, the other still dangles after free.
Suppose this code runs:
int *p = malloc(sizeof(int));
int *arr = malloc(sizeof(int) * 3);
*p = 13;
arr[0] = -1;
arr[1] = -2;
arr[2] = -3;
Trace the memory:
p is a local variable on the stack. Its value is a heap address.arr is also a local stack variable. Its value is another heap address.*p = 13 writes into the heap block that p points at.arr[1] = -2 means *(arr + 1) = -2, so it writes into the second int slot of the heap block.free(p) releases the one-int block. After that, p should not be dereferenced.free(arr) releases the three-int block. After that, none of arr[0], arr[1], or arr[2] is valid.The pointer variables themselves are still ordinary stack variables. free does not delete the pointer variable; it releases the heap block whose address was stored in that pointer.
The motivating example is a program that reads all integers from standard input and prints them in sorted order. A stack array such as:
const int size = 1000;
int arr[size];
is not enough because the program does not know how many integers the user will type. Moving the same fixed capacity to the heap:
int cap = 16;
int *arr = malloc(sizeof(int) * cap);
does not solve the whole problem either. It only moves the fixed-size array from stack to heap. The real benefit of the heap is that the program can allocate a new, larger block when the old block fills up.
The resize pattern is:
From the lecture code:
if (cap == len) {
int *newArr = malloc(sizeof(int) * (cap * 2));
for (int i = 0; i < len; ++i) {
newArr[i] = arr[i];
}
free(arr);
arr = newArr;
cap = cap * 2;
}
The order matters:
arr before copying, the old values are gone.newArr but forget arr = newArr, the new block is leaked.arr before saving the old pointer or freeing it, the old block may be leaked.malloc fails and you immediately overwrite arr, you lose the only pointer to the old valid data.A more defensive resize in real code often uses a temporary pointer and checks for failure before changing the active state.
The lecture insert function assumes there is already enough capacity:
int insert(int *arr, const int len, const int cap, const int nv) {
if (cap <= len) return 0;
int i = 0;
for (; i < len && arr[i] < nv; ++i);
for (int j = len; j > i; --j) {
arr[j] = arr[j - 1];
}
arr[i] = nv;
return 1;
}
Trace arr = [2, 5, 9, _, _], len = 3, cap = 5, nv = 6:
2 < 6, 5 < 6, 9 < 6 is false, so i = 2.j = 3, assign arr[3] = arr[2], producing [2, 5, 9, 9, _].j > i is false after decrement.arr[2] = 6, producing [2, 5, 6, 9, _].len.Notice that insert cannot know whether the caller increased len unless the interface is designed to mutate len. That issue becomes central in the next lecture.
Growing by one element each time is memory efficient but time inefficient. If the array starts with capacity 1 and eventually stores n integers, the copies are:
1 + 2 + 3 + ... + (n - 1) = n(n - 1) / 2
That grows on the order of n^2.
Example for 6 inserted values:
capacity 1 -> 2: copy 1
capacity 2 -> 3: copy 2
capacity 3 -> 4: copy 3
capacity 4 -> 5: copy 4
capacity 5 -> 6: copy 5
total copies = 15
For 100000 values, growing by one would require billions of copied integers.
Doubling capacity trades a bounded amount of extra memory for much better time behavior.
Example starting capacity 4:
append values 1-4: no resize
append value 5: copy 4, capacity becomes 8
append values 6-8: no resize
append value 9: copy 8, capacity becomes 16
append values 10-16: no resize
append value 17: copy 16, capacity becomes 32
Most appends cost one write. Only occasional appends pay for copying. Spread across all appends, the average cost stays constant, so append is amortized O(1).
The unused capacity is bounded. Right after doubling, the capacity is at most about twice the number of stored items. That is usually a good tradeoff unless memory is extremely constrained.
A function can return a pointer only if the pointed-to data outlives the function call. Common valid cases:
Example returning caller-owned data:
char *my_strchr(char *s, int c) {
for (; *s != '\0'; ++s) {
if (*s == c) return s;
}
return NULL;
}
This is valid because the returned pointer points inside the string passed by the caller. The function did not create that storage, so it also should not free it.
Example returning heap data:
int *multiples(int m, size_t size) {
int *ret = malloc(sizeof(int) * size);
if (!ret) return NULL;
for (size_t i = 0; i < size; ++i) {
ret[i] = (int)i * m;
}
return ret;
}
The caller owns the returned array:
int *fives = multiples(5, 10);
if (!fives) {
return 1;
}
for (size_t i = 0; i < 10; ++i) {
printf("%d ", fives[i]);
}
free(fives);
digitslecture_code/cpl/dynamic/digits.c returns a heap array:
int *digits(int x) {
int digits = flooredLog(x, 10);
int *ret = malloc(sizeof(int) * digits);
for (unsigned int i = digits - 1; i > 0; --i, x = x / 10) {
ret[i] = x % 10;
}
ret[0] = x;
return ret;
}
Learning points:
ret is heap allocated, so returning it is valid.flooredLog(x, 10).digs.Trace digits(72519):
flooredLog(72519, 10) returns 5.i = 4: store 72519 % 10 = 9; update x = 7251.i = 3: store 1; update x = 725.i = 2: store 5; update x = 72.i = 1: store 2; update x = 7.ret[0] = 7.[7, 2, 5, 1, 9].malloc does not fill memory with zero.malloc: dereferencing NULL crashes or invokes undefined behavior.sizeof(pointer) when you meant sizeof(*pointer) or sizeof(type).arr[len] when len == cap.Use a systematic checklist:
malloc, identify the matching free.arr[i], verify 0 <= i < cap.arr[i], verify 0 <= i < len.free, suspect double free, heap corruption, or a pointer that did not come from malloc.Compiler and tooling habits:
gcc -Wall -Wextra -g program.c
valgrind ./a.out
AddressSanitizer is also useful when available:
gcc -Wall -Wextra -g -fsanitize=address program.c
./a.out
When asked about heap code, do not only describe syntax. Track ownership and lifetime:
When asked to analyze resizing:
malloc(sizeof(T) * n) allocates room for n objects of type T.free(ptr) releases a heap block allocated by malloc.ptr = NULL after free(ptr) helps catch accidental reuse of that one pointer.size_t preferred for array sizes and capacities?NULL after free?malloc succeeds but the pointer is overwritten before free, what bug occurred?int *, how can you decide whether the caller must free the returned pointer?Built from summaries/04_dynamic_mem.md and reviewed against slides/04_dynamic_mem.pdf plus matching files in lecture_code/.