Lecture File
slides/07_structs.pdf
Data and Memory
C structs, member access, struct pointers, arrow operator, linked lists, ADT design, and encapsulation motivation.
Structs let C group related values into one named aggregate type. That matters because real program data is usually not a single scalar. A rectangle has position and size. A dynamic array has a pointer, length, and capacity. A linked list has a head pointer and a length.
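This lecture uses structs in two stages: first as plain aggregates that group related fields, such as a rectangle, and then as the building blocks of abstract data types, such as a dynamic array and a linked list.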
Structs do not make C object-oriented. They do not have constructors or methods. They are a named memory layout plus field access syntax. The design discipline comes from how you use them.
General form:
struct Rect {
    int x, y, w, h;
};
Important syntax:
- Write struct Rect, not just Rect, unless you use typedef.
- Use . when you have a struct value.
- Use -> when you have a pointer to a struct.
Example:
struct Rect r;
r.x = 5;
r.y = 10;
r.w = 3;
r.h = 2;
The lecture begins with rectangle collision. Without structs, a collision function might take many integer parameters:
int collides(int h1, int w1, int x1, int y1,
             int h2, int w2, int x2, int y2);
This is fragile because many parameters have the same type. Passing width where height belongs still compiles.
With structs:
int collides(struct Rect r1, struct Rect r2);
The interface now says what the data means. A rectangle travels as one value instead of four unrelated integers.
From lecture_code/cpl/aggregate/rect.c:
struct Rect {
    int x, y, w, h;
    char c;
};
int collides(struct Rect r1, struct Rect r2) {
    return (r1.x < r2.x + r2.w &&
            r1.x + r1.w > r2.x &&
            r1.y < r2.y + r2.h &&
            r1.y + r1.h > r2.y);
}
The collision test says the rectangles overlap when neither rectangle is completely to one side of the other.
Let:
r1: x=1, y=2, w=3, h=2
r2: x=3, y=3, w=2, h=2
Then:
r1.x < r2.x + r2.w 1 < 5 true
r1.x + r1.w > r2.x 4 > 3 true
r1.y < r2.y + r2.h 2 < 5 true
r1.y + r1.h > r2.y 4 > 3 true
All four conditions are true, so the rectangles collide.
If one condition is false, there is separation on that axis and the rectangles do not overlap.
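The worked example above can be checked mechanically. The sketch below reuses the Rect fields and the collides function from the lecture code; the helper names example_overlaps and example_separated, and the second pair of rectangles, are assumptions added only for illustration.

```c
struct Rect {
    int x, y, w, h;
};

/* Overlap test from the lecture code: true only when the rectangles
   overlap on both the x axis and the y axis. */
int collides(struct Rect r1, struct Rect r2) {
    return (r1.x < r2.x + r2.w &&
            r1.x + r1.w > r2.x &&
            r1.y < r2.y + r2.h &&
            r1.y + r1.h > r2.y);
}

/* Hypothetical helper: rebuilds the two rectangles from the worked
   example and reports whether they collide. */
int example_overlaps(void) {
    struct Rect r1 = {.x = 1, .y = 2, .w = 3, .h = 2};
    struct Rect r2 = {.x = 3, .y = 3, .w = 2, .h = 2};
    return collides(r1, r2);
}

/* Hypothetical helper: moves r2 so its left edge (x = 4) sits exactly on
   r1's right edge (1 + 3 = 4). The second condition, 4 > 4, is false. */
int example_separated(void) {
    struct Rect r1 = {.x = 1, .y = 2, .w = 3, .h = 2};
    struct Rect r2 = {.x = 4, .y = 3, .w = 2, .h = 2};
    return collides(r1, r2);
}
```

Because the comparisons are strict, rectangles that only touch along an edge count as separated in this test.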
Unlike arrays, structs are copied when passed by value.
This function does not mutate the caller's rectangle:
void translate(struct Rect r, int x, int y) {
    r.x = r.x + x;
    r.y = r.y + y;
}
Caller:
struct Rect r1 = {.x = 1, .y = 5, .h = 1, .w = 1};
translate(r1, 3, -2);
printf("Rect at (%d, %d)\n", r1.x, r1.y);
Trace:
1. main has r1.
2. translate receives a copy named r.
3. translate changes the copy's fields.
4. translate returns and its stack frame disappears.
5. main's r1 is unchanged.
To mutate the caller's rectangle, pass a pointer:
void translate(struct Rect *r, int xt, int yt) {
    r->x += xt;
    r->y += yt;
}
Caller:
translate(&r1, 3, -2);
Now r stores the address of r1, and r->x writes into the caller's object.
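To see both behaviors side by side, here is a minimal sketch that keeps the two versions under different names. The name translate_copy is an assumption made so that both functions can live in one file; the lecture calls both versions translate.

```c
struct Rect {
    int x, y, w, h;
};

/* Receives a copy: the writes land in the callee's local r only. */
void translate_copy(struct Rect r, int dx, int dy) {
    r.x += dx;
    r.y += dy;
}

/* Receives an address: the writes land in the caller's object. */
void translate(struct Rect *r, int dx, int dy) {
    r->x += dx;
    r->y += dy;
}
```

Starting from a rectangle at (1, 5), calling translate_copy(r1, 3, -2) leaves r1 at (1, 5), while translate(&r1, 3, -2) moves it to (4, 3).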
Use dot for a struct value:
struct Rect r;
r.x = 10;
Use arrow for a pointer to a struct:
struct Rect *p = &r;
p->x = 10;
The arrow operator is shorthand:
p->x
means:
(*p).x
Parentheses are required in (*p).x because . binds more tightly than *. Writing *p.x means *(p.x), which is wrong when p is a pointer.
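As a small illustration of that equivalence, the two accessors below read the same field. Both function names are hypothetical.

```c
struct Rect {
    int x, y, w, h;
};

/* Arrow form: dereference p and select the field in one step. */
int get_x_arrow(const struct Rect *p) {
    return p->x;
}

/* Spelled-out form: the parentheses make the dereference happen
   before the field selection. */
int get_x_deref(const struct Rect *p) {
    return (*p).x;
}
```

Dropping the parentheses in the second function would give *p.x, which a C compiler rejects here because p is a pointer and has no member x.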
Choose the parameter style based on intent:
int collides(struct Rect r1, struct Rect r2);
Good when the struct is small and the function only needs copies.
void translate(struct Rect *r, int dx, int dy);
Good when the function should mutate the original object.
int area(const struct Rect *r);
Good when the struct may be large and the function should inspect without mutating.
The const tells both the compiler and reader that the function should not change the object through that pointer.
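The source gives only the prototype for area. One possible body, assuming area means width times height, might look like this:

```c
struct Rect {
    int x, y, w, h;
};

/* Reads through a pointer to const: no copy of the struct is made, and
   an accidental write such as r->w = 0 would be rejected by the compiler.
   The body is an assumption; the lecture shows only the prototype. */
int area(const struct Rect *r) {
    return r->w * r->h;
}
```

A caller passes the address, as in area(&r1), and can rely on r1 being unchanged afterward.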
Struct fields are laid out in declaration order, but the compiler may insert padding bytes for alignment.
From lecture_code/cpl/aggregate/structSizes.c:
struct T {
    int x, y;
    char a;
    int z;
    char b, c, d;
};
struct Q {
    int x, y;
    char a, b, c, d;
    int z;
};
Both structs contain similar fields, but their sizes can differ because field order changes where padding is needed.
Reasoning example:
- int often wants 4-byte alignment.
- char uses 1 byte.
- If a char is followed by an int, the compiler may add padding before the int.
- Grouping char fields together can reduce padding.
Do not write code that assumes a struct's size is the sum of field sizes. Use sizeof.
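The amount of padding can be measured rather than guessed. The sketch below repeats the two structs and compares sizeof against the raw sum of field sizes; the helper names raw_field_bytes, padding_in_T, and padding_in_Q are assumptions for illustration.

```c
#include <stddef.h>

struct T {
    int x, y;
    char a;
    int z;
    char b, c, d;
};

struct Q {
    int x, y;
    char a, b, c, d;
    int z;
};

/* Sum of the field sizes with no padding. Both structs hold the same
   fields: three ints and four chars. */
size_t raw_field_bytes(void) {
    return 3 * sizeof(int) + 4 * sizeof(char);
}

/* Bytes of padding the compiler added on the current platform. */
size_t padding_in_T(void) {
    return sizeof(struct T) - raw_field_bytes();
}

size_t padding_in_Q(void) {
    return sizeof(struct Q) - raw_field_bytes();
}
```

On a common platform with 4-byte, 4-byte-aligned int, struct T is often 20 bytes and struct Q 16 bytes, but the C standard does not fix these numbers, so treat them as typical rather than guaranteed.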
The previous lecture's dynamic array required several parameters:
append(&arr, &len, &cap, value);
Those variables are not independent. They are all parts of one dynamic array.
A struct can bundle them:
struct IntArray {
    int *data;
    size_t len;
    size_t cap;
};
Then functions can take one object:
void append(struct IntArray *a, int value) {
    if (a->len == a->cap) {
        /* Double the capacity; start at 1 so a zero-capacity array can grow. */
        size_t newCap = a->cap ? a->cap * 2 : 1;
        int *newData = malloc(sizeof(int) * newCap);
        if (!newData) abort(); /* out of memory */
        for (size_t i = 0; i < a->len; ++i) {
            newData[i] = a->data[i];
        }
        free(a->data);
        a->data = newData;
        a->cap = newCap;
    }
    a->data[a->len] = value;
    ++a->len;
}
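A complete lifecycle for the bundled array might look like the sketch below. It is a variation, not the lecture's code: initArray and freeArray are hypothetical names, append returns a status code instead of void, and it grows the buffer with realloc rather than malloc plus a manual copy.

```c
#include <stdlib.h>

struct IntArray {
    int *data;
    size_t len;
    size_t cap;
};

/* Hypothetical constructor: starts with room for one element.
   Returns 0 on success and -1 if the allocation fails. */
int initArray(struct IntArray *a) {
    a->data = malloc(sizeof(int));
    a->len = 0;
    a->cap = a->data ? 1 : 0;
    return a->data ? 0 : -1;
}

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success and -1 if memory runs out, in which case the
   array is left unchanged. */
int append(struct IntArray *a, int value) {
    if (a->len == a->cap) {
        size_t newCap = a->cap ? a->cap * 2 : 1;
        int *newData = realloc(a->data, sizeof(int) * newCap);
        if (!newData) return -1;
        a->data = newData;
        a->cap = newCap;
    }
    a->data[a->len++] = value;
    return 0;
}

/* Hypothetical destructor: releases the buffer and resets the fields. */
void freeArray(struct IntArray *a) {
    free(a->data);
    a->data = NULL;
    a->len = 0;
    a->cap = 0;
}
```

Appending five values to a freshly initialized array grows the capacity 1, 2, 4, 8, so the array ends with len 5 and cap 8.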
Benefits:
- A caller cannot pass one array's len and forget which cap it belongs to.
An abstract data type is defined by behavior, not by representation.
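A list user should know the operations, such as creating a list, adding an element, reading or replacing the element at an index, removing an element, and deleting the list.
The user should not need to know whether the list is implemented with an array or with linked nodes.
The representation should be hidden behind a stable interface. This makes code safer and more reusable.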
The lecture implements a list with nodes:
struct Node {
    int data;
    struct Node *next;
};
struct List {
    struct Node *head;
    size_t len;
};
Invariants:
- head points to the first node, or is NULL for an empty list.
- next points to the next valid node, or NULL at the end.
- len equals the number of nodes reachable from head.
These invariants are what the ADT functions must preserve.
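The third invariant can be tested directly. The function below is a hypothetical debugging aid, not part of the lecture code: it walks at most len steps, so it reports a mismatch instead of looping forever if the chain is cyclic.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
    size_t len;
};

/* Hypothetical checker: returns 1 when exactly l->len nodes are reachable
   from l->head before NULL, and 0 otherwise. */
int lenMatchesChain(const struct List *l) {
    const struct Node *cur = l->head;
    for (size_t i = 0; i < l->len; ++i) {
        if (!cur) return 0;   /* chain is shorter than len */
        cur = cur->next;
    }
    return cur == NULL;       /* a leftover node means the chain is longer */
}
```

A call such as assert(lenMatchesChain(l)) at the end of each list operation is one way to catch a broken invariant close to where it was introduced.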
From the slides:
struct List *createList() {
    struct List *ret = malloc(sizeof(struct List));
    ret->head = NULL;
    ret->len = 0;
    return ret;
}
From the ADT lecture code, the add-to-front operation is called cons:
void cons(int elem, struct List *l) {
    struct Node *nn = malloc(sizeof(struct Node));
    nn->next = l->head;
    nn->data = elem;
    l->head = nn;
    l->len++;
}
Trace starting from empty list:
head = NULL, len = 0
Call cons(3, l):
new node: data=3, next=NULL
head -> 3
len = 1
Call cons(2, l):
new node: data=2, next=old head
head -> 2 -> 3
len = 2
Call cons(1, l):
head -> 1 -> 2 -> 3
len = 3
Because insertion is at the front, values appear in reverse order of insertion.
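The trace can be reproduced in code. This sketch follows createList and cons from the lecture, with malloc checks added; toArray is a hypothetical helper that copies the values out front to back so the order can be inspected.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
    size_t len;
};

/* As in the slides, plus a check that malloc succeeded. */
struct List *createList(void) {
    struct List *ret = malloc(sizeof(struct List));
    if (!ret) return NULL;
    ret->head = NULL;
    ret->len = 0;
    return ret;
}

/* Adds elem at the front. In this sketch the list is left unchanged
   if malloc fails. */
void cons(int elem, struct List *l) {
    struct Node *nn = malloc(sizeof(struct Node));
    if (!nn) return;
    nn->next = l->head;
    nn->data = elem;
    l->head = nn;
    l->len++;
}

/* Hypothetical helper: copies up to max values, front to back, into out
   and returns how many were copied. */
size_t toArray(const struct List *l, int *out, size_t max) {
    size_t n = 0;
    for (const struct Node *cur = l->head; cur && n < max; cur = cur->next) {
        out[n++] = cur->data;
    }
    return n;
}
```

After cons(3, l), cons(2, l), and cons(1, l), toArray fills the output with 1, 2, 3, matching head -> 1 -> 2 -> 3.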
Linked lists do not support constant-time indexing. To read index i, the program walks from the head:
int ith(struct List *l, int i) {
    assert(i < length(l));
    struct Node *cur = l->head;
    for (int j = 0; cur && j < i; ++j, cur = cur->next);
    return cur->data;
}
Trace ith(l, 2) for head -> 17 -> 32 -> 7:
1. cur starts at 17, j = 0.
2. cur points to 32, j = 1.
3. cur points to 7, j = 2.
The loop then stops because j < i is false, and ith returns 7.
The work grows with the index, so ith is O(n) in the worst case.
Mutation is the same traversal plus an assignment:
void setIth(struct List *l, int i, int elem) {
    assert(i < length(l));
    struct Node *cur = l->head;
    for (int j = 0; cur && j < i; ++j, cur = cur->next);
    cur->data = elem;
}
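Both functions call length, which the excerpts above do not show. The sketch below assumes length simply returns the len field, and it uses size_t for the index so that the comparison with the length involves a single unsigned type; both choices are assumptions, not the lecture's code.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
    size_t len;
};

/* Assumed definition: the source calls length() without showing it. */
size_t length(const struct List *l) {
    return l->len;
}

/* Walks i links from the head. The assert plus the len invariant
   guarantee that cur is never NULL inside the loop. */
int ith(const struct List *l, size_t i) {
    assert(i < length(l));
    const struct Node *cur = l->head;
    for (size_t j = 0; j < i; ++j) {
        cur = cur->next;
    }
    return cur->data;
}

/* Same traversal, followed by a write instead of a read. */
void setIth(struct List *l, size_t i, int elem) {
    assert(i < length(l));
    struct Node *cur = l->head;
    for (size_t j = 0; j < i; ++j) {
        cur = cur->next;
    }
    cur->data = elem;
}
```

For head -> 17 -> 32 -> 7, ith(l, 2) returns 7, and setIth(l, 1, 99) turns the list into head -> 17 -> 99 -> 7.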
To remove a node, the list must be rewired before the removed node is freed.
For:
head -> 17 -> 32 -> 7 -> NULL
Removing index 1:
1. prev points to 17.
2. cur points to 32.
3. Set prev->next = cur->next, so 17 points to 7.
4. Free cur.
5. Decrement len.
Result:
head -> 17 -> 7 -> NULL
Special case: removing index 0 must update head.
struct Node *tmp = l->head;
l->head = l->head->next;
free(tmp);
--l->len;
If you free the node before saving or using its next, you cannot safely access node->next afterward.
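Putting the general case and the head case together, a removal function could be sketched as follows. The name removeIth and its signature are assumptions; the sketch also assumes every node was allocated with malloc.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
    size_t len;
};

/* Hypothetical removeIth: unlinks and frees the node at index i. */
void removeIth(struct List *l, size_t i) {
    assert(i < l->len);
    struct Node *cur = l->head;
    if (i == 0) {
        l->head = cur->next;          /* special case: head moves */
    } else {
        struct Node *prev = l->head;
        for (size_t j = 0; j + 1 < i; ++j) {
            prev = prev->next;        /* stop at the node before index i */
        }
        cur = prev->next;
        prev->next = cur->next;       /* rewire around cur first */
    }
    free(cur);                        /* nothing points at cur any more */
    --l->len;
}
```

In both branches the link that bypasses cur is written before free(cur) runs, so cur->next is never read after the node is freed.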
A list has multiple heap allocations:
- one struct List;
- one struct Node per element.
Every allocation must be freed. From the slides:
void deleteNode(struct Node *n) {
    if (!n) return;
    deleteNode(n->next);
    free(n);
}
void deleteList(struct List *l) {
    if (!l) return;
    deleteNode(l->head);
    free(l);
}
The recursive version frees the rest of the list before freeing the current node. That is safe because it reads n->next before n is freed.
Iterative version:
void deleteNodes(struct Node *cur) {
    while (cur) {
        struct Node *next = cur->next;
        free(cur);
        cur = next;
    }
}
The key step is saving next before freeing cur.
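The iterative loop can replace the recursive helper inside deleteList. In the sketch below both functions return how many allocations they released; that return value is an addition for illustration, so that "every allocation must be freed" can be checked, and is not in the slides.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
    size_t len;
};

/* Iterative counterpart of deleteNode. Returns the number of nodes freed. */
size_t deleteNodes(struct Node *cur) {
    size_t freed = 0;
    while (cur) {
        struct Node *next = cur->next;  /* save before cur becomes invalid */
        free(cur);
        cur = next;
        ++freed;
    }
    return freed;
}

/* Frees every node, then the List itself. Accepts NULL, like free does.
   Returns the total number of allocations released. */
size_t deleteList(struct List *l) {
    if (!l) return 0;
    size_t freed = deleteNodes(l->head);
    free(l);
    return freed + 1;
}
```

One practical difference between the two versions: the recursive one uses a stack frame per node, so a very long list can exhaust the call stack, while the loop uses constant extra space.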
The slides show that if clients can directly create struct Node and struct List, they can build invalid structures. For example, client code can create a cycle or point list nodes at stack variables.
Broken example from the idea in the slides:
struct List l;
struct Node n1;
struct Node n2;
n1.next = &n2;
n2.next = &n1;
l.head = &n1;
Problems:
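- The nodes form a cycle, so a traversal that waits for NULL never terminates.
- The nodes are stack variables, so deleteList would call free on them.
- The len field may not match the actual structure.
This is why a real ADT hides its representation and exposes only functions.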
The next course topic, separate compilation and headers, gives C tools for this: put declarations in a header and implementation details in a .c file.
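Common mistakes:
- Writing Rect r; instead of struct Rect r; without a typedef.
- Using . on a pointer or -> on a value.
- Forgetting that p->x means (*p).x.
- Letting len disagree with the actual number of linked-list nodes.
- Freeing a node before reading node->next.
For struct bugs: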
- Use . for values and -> for pointers.
- Use sizeof instead of guessing layout.
For linked-list bugs:
- Check head, every next, and len.
- Trace prev, cur, and cur->next.
- Save next before freeing a node.
When asked whether a function mutates a struct:
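- If the parameter is struct Rect r, it mutates only a copy.
- If the parameter is struct Rect *r, it can mutate the caller's object.
- If the parameter is const struct Rect *r, it should inspect only.
When asked about ->:
- Rewrite p->field as (*p).field.
When asked about linked-list operations:
- Draw head, prev, cur, and next.
- Update len.
When asked about ADTs:
- Describe the operations and the invariants they preserve, not the representation.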
Key facts:
- struct Name { ... }; defines a struct type.
- struct Name x; declares a struct value.
- x.field accesses a field on a value.
- p->field accesses a field through a pointer.
- p->field is shorthand for (*p).field.
- Padding can make sizeof(struct X) larger than the sum of fields.
- Linked-list indexing is O(n).
Review questions:
- What does r->x mean in terms of (*r).x?
- Why did translate fail before it took a pointer?
- What is the difference between . and ->?
- Why is the ith operation O(n)?
- Why should clients not manipulate Node and List internals directly?
- Which invariant ties a list's len field to its node chain?
Built from summaries/07_structs.md and reviewed against slides/07_structs.pdf plus matching files in lecture_code/.