Optimize/optimize cfbox #8

Merged
Charliechen114514 merged 8 commits into main from optimize/optimize_cfbox on May 4, 2026
@Charliechen114514
Member

No description provided.

Add a FileCloser functor and a unique_file type alias (unique_ptr<FILE,
FileCloser>) to io.hpp, along with an open_file() helper. Refactor
read_all() and write_all() to use RAII, eliminating manual fclose() calls.
Update tee.cpp to hold a vector of unique_file instead of raw FILE*
pointers with a manual cleanup loop.
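
The RAII pattern described in this commit might look roughly like the sketch below. Only the names FileCloser, unique_file, open_file(), read_all() and write_all() come from the commit message; the signatures, return types and bodies are assumptions for illustration and may not match io.hpp.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <optional>
#include <string>

// Deleter that closes a C stream. Assumed shape, not the PR's actual code.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f) std::fclose(f);
    }
};

using unique_file = std::unique_ptr<std::FILE, FileCloser>;

// Returns an empty unique_file when fopen() fails.
[[nodiscard]] inline unique_file open_file(const char* path, const char* mode) noexcept {
    assert(path != nullptr && mode != nullptr);
    return unique_file{std::fopen(path, mode)};
}

// Reads a whole file. Every return path closes the stream automatically.
[[nodiscard]] inline std::optional<std::string> read_all(const char* path) {
    unique_file f = open_file(path, "rb");
    if (!f) return std::nullopt;
    std::string out;
    char buf[4096];
    std::size_t n = 0;
    while ((n = std::fread(buf, 1, sizeof buf, f.get())) > 0) out.append(buf, n);
    if (std::ferror(f.get())) return std::nullopt;  // early return, still closed
    return out;
}

// Writes a whole buffer. The stream is flushed and closed when f goes out of scope.
[[nodiscard]] inline bool write_all(const char* path, const std::string& data) {
    unique_file f = open_file(path, "wb");
    if (!f) return false;
    return std::fwrite(data.data(), 1, data.size(), f.get()) == data.size();
}
```

One limitation of this sketch: an error reported by fclose() in the deleter (for example a failed final flush) is silently dropped, so a production write_all() may want to release and close the stream explicitly on the success path.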

Add a scoped_regex class in regex.hpp that automatically calls regfree()
on destruction. Replace all manual regcomp/regfree pairs in
awk_executor.cpp (4 sites) and expr.cpp (1 site), eliminating
potential resource leaks on early return paths.
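
A wrapper of this kind could be sketched as follows. The class name scoped_regex is from the commit message; the constructor arguments and the ok() and matches() members are hypothetical and only show how regfree() ends up tied to the object's lifetime.

```cpp
#include <cassert>
#include <regex.h>  // POSIX regcomp / regexec / regfree
#include <string>

// Owns a compiled POSIX regex and frees it on destruction (assumed interface).
class scoped_regex {
public:
    scoped_regex(const char* pattern, int cflags) noexcept {
        assert(pattern != nullptr);
        ok_ = regcomp(&re_, pattern, cflags) == 0;
    }
    ~scoped_regex() {
        if (ok_) regfree(&re_);  // only free a successfully compiled pattern
    }
    // regex_t must not be copied bitwise, so the wrapper is non-copyable.
    scoped_regex(const scoped_regex&) = delete;
    scoped_regex& operator=(const scoped_regex&) = delete;

    [[nodiscard]] bool ok() const noexcept { return ok_; }

    // True if the pattern compiled and matches anywhere in s.
    [[nodiscard]] bool matches(const std::string& s) const noexcept {
        return ok_ && regexec(&re_, s.c_str(), 0, nullptr, 0) == 0;
    }

private:
    regex_t re_{};
    bool ok_ = false;
};
```

With this shape, a function can return early after a failed match without a matching regfree() call at every exit.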

Add a unique_pipe type with PipeCloser to ensure pclose() is always
called, even on early returns. Eliminates a potential pipe leak in
command substitution $(...) handling.
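
As a rough illustration, the same unique_ptr technique applies to popen(). PipeCloser and unique_pipe are named in the commit message; capture_output() is a hypothetical helper standing in for the shell's $(...) handling, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <optional>
#include <stdio.h>  // popen / pclose (POSIX)
#include <string>

// Deleter that reaps the child process via pclose(). Assumed shape.
struct PipeCloser {
    void operator()(FILE* p) const noexcept {
        if (p) pclose(p);
    }
};

using unique_pipe = std::unique_ptr<FILE, PipeCloser>;

// Hypothetical helper: runs cmd through the system shell and returns its stdout,
// with trailing newlines removed as command substitution does.
[[nodiscard]] inline std::optional<std::string> capture_output(const char* cmd) {
    assert(cmd != nullptr);
    unique_pipe pipe{popen(cmd, "r")};
    if (!pipe) return std::nullopt;
    std::string out;
    char buf[256];
    while (std::fgets(buf, sizeof buf, pipe.get())) out += buf;
    if (std::ferror(pipe.get())) return std::nullopt;  // pclose() still runs
    while (!out.empty() && out.back() == '\n') out.pop_back();
    return out;
}
```

The sketch discards the exit status returned by pclose(); code that needs the status would have to release the pipe and close it by hand on the success path.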
…tion

Remove zlib dependency entirely. Implement deflate compression (fixed
Huffman + LZ77 with hash chain matching) and inflate decompression
(fixed/dynamic/stored block support) in ~600 lines of header-only C++23.

Changes:
- New deflate.hpp: BitWriter with MSB-first Huffman encoding, LZ77 matcher
- New inflate.hpp: BitReader with accumulator-based peek/read, Huffman
  table builder supporting fixed and dynamic tables
- Rewrite compress.hpp: gzip format (RFC 1952) using our own deflate
- Update unzip.cpp: use raw_inflate instead of zlib
- Remove zlib from CMakeLists.txt (no more CPM fetch)
- Add 15 compression tests (round-trip, edge cases, corruption)

Verified: output compatible with system gunzip, all 331 tests pass.
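
For readers unfamiliar with the fixed Huffman mode mentioned above, the following sketch derives the fixed literal/length codes from their code lengths using the canonical assignment described in RFC 1951 (sections 3.2.2 and 3.2.6). It is not taken from deflate.hpp; all function names here are hypothetical. The reverse_bits() helper reflects one common way to emit a Huffman code starting from its most significant bit through a writer that otherwise packs bits starting from the least significant bit; whether the PR's BitWriter works this way is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLitLenSymbols = 288;
constexpr int kMaxBits = 15;

// Code lengths of the fixed literal/length alphabet (RFC 1951, section 3.2.6).
constexpr std::array<std::uint8_t, kLitLenSymbols> fixed_litlen_lengths() {
    std::array<std::uint8_t, kLitLenSymbols> len{};
    for (std::size_t s = 0; s < kLitLenSymbols; ++s)
        len[s] = static_cast<std::uint8_t>(s < 144 ? 8 : s < 256 ? 9 : s < 280 ? 7 : 8);
    return len;
}

// Canonical Huffman code assignment from code lengths (RFC 1951, section 3.2.2).
template <std::size_t N>
constexpr std::array<std::uint16_t, N> canonical_codes(const std::array<std::uint8_t, N>& len) {
    std::array<std::uint16_t, kMaxBits + 1> count{};
    for (auto l : len) ++count[l];
    count[0] = 0;  // unused symbols get no code
    std::array<std::uint16_t, kMaxBits + 1> next{};
    std::uint16_t code = 0;
    for (int bits = 1; bits <= kMaxBits; ++bits) {
        code = static_cast<std::uint16_t>((code + count[bits - 1]) << 1);
        next[bits] = code;  // first code of this length
    }
    std::array<std::uint16_t, N> codes{};
    for (std::size_t s = 0; s < N; ++s)
        if (len[s] != 0) codes[s] = next[len[s]]++;
    return codes;
}

// Reverses the low nbits bits of code.
constexpr std::uint16_t reverse_bits(std::uint16_t code, int nbits) {
    assert(nbits >= 0 && nbits <= 16);
    std::uint16_t r = 0;
    for (int i = 0; i < nbits; ++i) {
        r = static_cast<std::uint16_t>((r << 1) | (code & 1));
        code = static_cast<std::uint16_t>(code >> 1);
    }
    return r;
}
```

The resulting codes agree with the table in RFC 1951: literals 0 to 143 map to the 8-bit codes 00110000 through 10111111, and the end-of-block symbol 256 maps to the 7-bit code 0000000.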

Add -fvisibility=hidden, -ffunction-sections, -fdata-sections,
-Wl,--gc-sections, -Wl,--strip-all, -Wl,--hash-style=gnu, and
-Wl,--build-id=none to Release builds. Debug builds are unchanged.
The size-optimized Release binary is 523 KB.

Replace the quadratic LCS DP table with Myers' shortest edit script
algorithm, reducing memory from O(mn) to O(n+m) and time from O(mn)
to O((n+m)d), where d is the edit distance (the number of inserted plus
deleted lines). Add proper multi-hunk support for unified diff output
with configurable context lines. This also eliminates the code
duplication between lcs_diff and unified_diff.
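
The core of Myers' greedy search can be sketched as below. This version only computes the length d of the shortest edit script, which is enough to show where the O(n+m) working memory comes from: a single array indexed by diagonal. Recovering the actual script for hunk output needs either the saved per-step arrays or the linear-space refinement, and the sketch does not claim to match the PR's implementation. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Length of the shortest edit script (insertions + deletions) turning a into b,
// using the greedy forward search from Myers' O(ND) difference algorithm.
template <typename Seq>
std::size_t myers_edit_distance(const Seq& a, const Seq& b) {
    const long n = static_cast<long>(a.size());
    const long m = static_cast<long>(b.size());
    const long max = n + m;
    if (max == 0) return 0;
    // v[k + max] is the furthest x reached so far on diagonal k = x - y.
    std::vector<long> v(static_cast<std::size_t>(2 * max + 1), 0);
    for (long d = 0; d <= max; ++d) {
        for (long k = -d; k <= d; k += 2) {
            long x = 0;
            if (k == -d || (k != d && v[k - 1 + max] < v[k + 1 + max]))
                x = v[k + 1 + max];      // step down: take an element of b (insertion)
            else
                x = v[k - 1 + max] + 1;  // step right: drop an element of a (deletion)
            long y = x - k;
            // Follow the diagonal "snake" of equal elements at no cost.
            while (x < n && y < m && a[x] == b[y]) { ++x; ++y; }
            v[k + max] = x;
            if (x >= n && y >= m) return static_cast<std::size_t>(d);
        }
    }
    assert(false && "d = n + m always suffices");
    return static_cast<std::size_t>(max);
}
```

For a line-based diff, Seq would typically be a std::vector<std::string> of lines; for example, the sequences ABCABBA and CBABAC from Myers' paper give d = 5.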

std::regex is known to be 5-10x slower than POSIX regex. Replace it with
regex_t via the existing scoped_regex RAII wrapper. This also removes the
<regex> header dependency, reducing compile time and binary size.
…d/constexpr annotations

- Add streaming for_each_line() to io.hpp, refactor stream.hpp to delegate
- Stream grep/cat/wc: no full-file load, handles unbounded input
- Optimize sort with precomputed keys, add reserve() across applets
- Mark 72 functions noexcept/[[nodiscard]]/constexpr across 6 headers
- Update README/architecture/Roadmap with current stats and benchmarks
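
A streaming line iterator in the spirit of the first two bullets might look like the sketch below. The name for_each_line() is from the commit message; the signature, the callback type and the buffer size are assumptions. Memory use is bounded by the longest single line rather than by the file size, which is what lets applets such as grep or wc consume unbounded input.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <string_view>

// Calls fn(line) for every line read from in, without the trailing '\n'.
// Returns false if a read error occurred. Hypothetical signature.
template <typename Fn>
bool for_each_line(std::FILE* in, Fn&& fn) {
    assert(in != nullptr);
    std::string line;  // reused across lines, grows only for long lines
    char buf[4096];
    while (std::fgets(buf, sizeof buf, in)) {
        line += buf;
        if (!line.empty() && line.back() == '\n') {
            line.pop_back();
            fn(std::string_view{line});
            line.clear();
        }
        // Otherwise the line is longer than buf: keep accumulating.
    }
    if (!line.empty()) fn(std::string_view{line});  // last line without newline
    return std::ferror(in) == 0;
}
```

Because fgets() treats its buffer as a C string, this sketch truncates chunks at embedded NUL bytes; an implementation that must handle binary input would more likely build on fread() or getline().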

Charliechen114514 merged commit c3e9513 into main on May 4, 2026
8 checks passed
Charliechen114514 deleted the optimize/optimize_cfbox branch on May 4, 2026 07:19