A tree-structured parallel external sort in C: a coordinator (mySort) forks splitter children, each splitter forks sorter leaves. Splitters and the coordinator merge their children's outputs back together via pipes and select. Two sorter binaries — quicksort and heapsort — are interchanged at the leaves so each chunk lands on a deliberately different algorithm.
Class project for the Operating Systems course at the University of Athens (Department of Informatics & Telecommunications). Solo project. Earlier-year attempt — see the Class history note at the end.
┌──────────────────┐
│ mySort │ (coordinator)
│ fork × k │
│ pipe per child │
│ select() merge │
└──┬────────┬──────┘
│ │
┌──────────┘ └──────────┐
▼ ▼
┌───────────┐ ┌───────────┐
│ splitter │ (k-1 of them) │ splitter │
│ k=k-i │ … │ k=1 │
│ fork × k │ │ fork × 1 │
│ pipe+merge│ │ pipe+merge│
└─┬───┬─────┘ └─┬─────────┘
│ │ │
▼ ▼ ▼
┌────┐ ┌────┐ ┌────┐
│qsrt│ │hsrt│ (alternating │qsrt│
│ │ │ │ process1/ │ │
│ │ │ │ process2 at │ │
└────┘ └────┘ each index) └────┘
Run:
./build/mySort -k 5 -e1 ./build/quicksort -e2 ./build/heapsort -i ./testing_folder/voters50.bin-k 5 means 5 splitter children at the top. Splitter i is then started with -k (k-i) so the tree fans out unevenly but predictably (5+4+3+2+1 = 15 total sorter leaves for k=5). The input file is a binary Record stream produced by the course generator (voters50.bin, voters5000.bin, etc.).
mySort (the root) and each splitter both need to:
- Allocate
kpipes - Decide how many records each child should read (
getRecsToRead) select()over allkread ends and drain whichever is ready- Merge each finished child's already-sorted output into a running merge buffer
That's identical logic between the two programs — only the kind of child differs (exec splitter vs exec quicksort/heapsort). Pulling it into splitHandler.{c,h} removed a large chunk of duplication and made the merge logic one place to debug instead of two.
Within a splitter, even-indexed children get process1 (e.g. quicksort) and odd-indexed get process2 (e.g. heapsort). The two algorithms have identical asymptotic complexity (O(n log n)) but very different constants and pathological inputs — running both in parallel against the same workload was the assignment's way of comparing them on a real distribution. Mergesort was deliberately not chosen for the leaves because the coordinator's pipe-merge is already a mergesort step.
The root mySort installs handlers for SIGUSR1 and SIGUSR2. Splitters send SIGUSR1 to the root when they finish, and SIGUSR2 is the per-child timing checkpoint. Real time vs CPU time is reported using times(2) + _SC_CLK_TCK, computed inside the signal handler against snapshots taken at startup.
| Path | Purpose |
|---|---|
mySort.c |
Root coordinator. Forks k splitters, installs USR1/USR2 handlers, prints final merged output. |
splitter.c |
Mid-tier. Forks k sorter children alternating process1/process2, merges their pipe output, signals root when done. |
sort_programs/quicksort.c |
Quicksort leaf — reads recsToRead records from offset recsToSkip, writes sorted output to stdout. |
sort_programs/heapsort.c |
Heapsort leaf — 0-indexed binary heap; same I/O contract as quicksort.c. |
handlers/splitHandler.c |
The shared fork+pipe+select+merge module described above. |
handlers/ioHandler.c |
Argument parsing (-k, -e1, -e2, -i, -s, -r, -pid), binary record read/write. |
handlers/recordHandler.c |
Record comparison + key/value helpers. |
testing_folder/voters{50,500,5000,50000,100000}.bin |
Binary Record test inputs at multiple sizes. |
make # produces build/{mySort,splitter,quicksort,heapsort}
make clean # removes objects + binaries
make clean-objs # objects onlyThis project is from the earlier year I took the Operating Systems course. The polished follow-up — a different homework from the same class, redone the next year — is at operating-systems-hw1. Both are kept public deliberately; this one is the older fork/pipe/IPC piece, the other is the shared-memory/semaphore piece.
MIT — applies to my own code in this repo. Assignment-distributed materials retain their original course copyright.