Skip to content

AlexTuring010/operating-systems-parallel-sort

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

operating-systems-parallel-sort

A tree-structured parallel external sort in C: a coordinator (mySort) forks splitter children, each splitter forks sorter leaves. Splitters and the coordinator merge their children's outputs back together via pipes and select. Two sorter binaries — quicksort and heapsort — are interchanged at the leaves so each chunk lands on a deliberately different algorithm.

Class project for the Operating Systems course at the University of Athens (Department of Informatics & Telecommunications). Solo project. Earlier-year attempt — see the Class history note at the end.

What it does

                ┌──────────────────┐
                │      mySort      │  (coordinator)
                │  fork × k        │
                │  pipe per child  │
                │  select() merge  │
                └──┬────────┬──────┘
                   │        │
        ┌──────────┘        └──────────┐
        ▼                              ▼
  ┌───────────┐                  ┌───────────┐
  │ splitter  │   (k-1 of them)  │ splitter  │
  │  k=k-i    │  …               │  k=1      │
  │ fork × k  │                  │ fork × 1  │
  │ pipe+merge│                  │ pipe+merge│
  └─┬───┬─────┘                  └─┬─────────┘
    │   │                          │
    ▼   ▼                          ▼
 ┌────┐ ┌────┐                  ┌────┐
 │qsrt│ │hsrt│  (alternating    │qsrt│
 │    │ │    │   process1/      │    │
 │    │ │    │   process2 at    │    │
 └────┘ └────┘   each index)    └────┘

Run:

./build/mySort -k 5 -e1 ./build/quicksort -e2 ./build/heapsort -i ./testing_folder/voters50.bin

-k 5 means 5 splitter children at the top. Splitter i is then started with -k (k-i) so the tree fans out unevenly but predictably (5+4+3+2+1 = 15 total sorter leaves for k=5). The input file is a binary Record stream produced by the course generator (voters50.bin, voters5000.bin, etc.).

Architecture highlights

splitHandler — shared abstraction across two binaries

mySort (the root) and each splitter both need to:

  1. Allocate k pipes
  2. Decide how many records each child should read (getRecsToRead)
  3. select() over all k read ends and drain whichever is ready
  4. Merge each finished child's already-sorted output into a running merge buffer

That's identical logic between the two programs — only the kind of child differs (exec splitter vs exec quicksort/heapsort). Pulling it into splitHandler.{c,h} removed a large chunk of duplication and made the merge logic one place to debug instead of two.

Alternating sorters at the leaves

Within a splitter, even-indexed children get process1 (e.g. quicksort) and odd-indexed get process2 (e.g. heapsort). The two algorithms have identical asymptotic complexity (O(n log n)) but very different constants and pathological inputs — running both in parallel against the same workload was the assignment's way of comparing them on a real distribution. Mergesort was deliberately not chosen for the leaves because the coordinator's pipe-merge is already a mergesort step.

Coordinator-only timing via signals

The root mySort installs handlers for SIGUSR1 and SIGUSR2. Splitters send SIGUSR1 to the root when they finish, and SIGUSR2 is the per-child timing checkpoint. Real time vs CPU time is reported using times(2) + _SC_CLK_TCK, computed inside the signal handler against snapshots taken at startup.

Files

Path Purpose
mySort.c Root coordinator. Forks k splitters, installs USR1/USR2 handlers, prints final merged output.
splitter.c Mid-tier. Forks k sorter children alternating process1/process2, merges their pipe output, signals root when done.
sort_programs/quicksort.c Quicksort leaf — reads recsToRead records from offset recsToSkip, writes sorted output to stdout.
sort_programs/heapsort.c Heapsort leaf — 0-indexed binary heap; same I/O contract as quicksort.c.
handlers/splitHandler.c The shared fork+pipe+select+merge module described above.
handlers/ioHandler.c Argument parsing (-k, -e1, -e2, -i, -s, -r, -pid), binary record read/write.
handlers/recordHandler.c Record comparison + key/value helpers.
testing_folder/voters{50,500,5000,50000,100000}.bin Binary Record test inputs at multiple sizes.

Build

make            # produces build/{mySort,splitter,quicksort,heapsort}
make clean      # removes objects + binaries
make clean-objs # objects only

Class history

This project is from the earlier year I took the Operating Systems course. The polished follow-up — a different homework from the same class, redone the next year — is at operating-systems-hw1. Both are kept public deliberately; this one is the older fork/pipe/IPC piece, the other is the shared-memory/semaphore piece.

License

MIT — applies to my own code in this repo. Assignment-distributed materials retain their original course copyright.

About

Tree-structured parallel external sort in C: coordinator forks splitter children, splitters fork sorter leaves (quicksort/heapsort alternated). Pipes + select() drive the merge-back; SIGUSR1/SIGUSR2 for finish + timing. Operating Systems class (UoA DIT, earlier-year attempt).

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors