Add built-in Resource Visibility Virtualization #77
Open
maazm7d wants to merge 5 commits into ravindu644:main from
…4#73)

* Implement Cgroup-based Resource Limits (Memory, CPU, PIDs)

This patch introduces comprehensive resource management capabilities to Droidspaces, allowing users to restrict container consumption of Memory, CPU, and PIDs.

Key changes:
- Added --memory, --cpus, and --pids-limit CLI options with validation.
- Implemented Cgroup V1 and V2 support for resource governance.
- Automated Cgroup V2 controller activation via subtree_control.
- Added configuration persistence for resource limits in container.config.
- Enhanced 'info' command with real-time resource usage statistics.
- Improved failure handling and logging for cgroup operations.

Verified with integration and stress testing on Linux x86_64.

* cgroup: implement robust partial controller support

Enhance Cgroup handling to gracefully manage environments with partial controller support (common on Android).

Key changes:
- Implement `ds_cgroup_is_supported` to explicitly detect controller availability in both V1 and V2 hierarchies.
- Refine V2 bootstrap to only attempt enabling supported controllers in `cgroup.subtree_control`.
- Update `ds_cgroup_apply_limits` to skip unsupported controllers with a clear warning instead of failing or silently succeeding.
- Add `ds_cgroup_get_limits` to read actual enforced limits from the filesystem.
- Update `info` command to display both requested and enforced limits, marking unsupported ones as "(not enforced)".

This ensures predictable and transparent behavior across diverse kernel configurations.

---------

Signed-off-by: maazm7d <maazm7d@gmail.com>
…ive) Add built-in resource visibility virtualization

Implement a zero-dependency virtualization layer for /proc/meminfo, /proc/cpuinfo, /proc/stat, /proc/uptime, and /proc/loadavg. This ensures that containerized applications see the resource limits (RAM, CPUs) and load metrics enforced by Cgroups instead of the host's total resources.

Key features:
- src/virtualize.c and src/virtualize.h for scaled resource data generation.
- Consistent /proc/meminfo via component scaling and Cgroup v2 integration.
- Accurate CPU virtualization with aggregate recomputation in /proc/stat.
- Scaled /proc/loadavg with host PID masking and scaled runnable/total.
- True per-container uptime and scaled idle time in /proc/uptime.
- In-place file updates preserving bind-mount inodes.
- Robust PID recycling protection via PID namespace inode verification.
- High-performance monitor loop using signalfd and poll (500ms heartbeat).
- Safe dynamic memory allocation for variable-sized system files.
- Resilient cgroup path discovery supporting various distributions.
- --virtualization flag integrated into CLI and configuration.
- Container boot sets up tmpfs and bind-mounts virtual proc files.
- Functional verification script in tests/verify_virtualization.sh.

This provides LXCFS-like functionality in a <260KB static binary, significantly improving compatibility for Java, Go, and Node.js.
- Enhanced container discovery logic to scan the 'Containers/' directory, allowing the runtime to detect and show stopped containers.
- Improved auto-resolution of container names to prioritize running systems but fall back to installed ones.
- Fixed a monitor warning by skipping the network handshake in host mode.
- Robustified memory virtualization by adding fallbacks for missing cgroup limits and preventing division by zero during ratio calculation.
- Capped virtualized CPU counts at the actual host online processor count.
- Standardized /proc/meminfo formatting for better compatibility with strict parsers.
- Modified ds_config_load_by_name to correctly return an error if the metadata file is missing.
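For reviewers, the CPU-count capping and the missing-limit fallback described in this commit could look roughly like the sketch below. This is not taken from the PR's source: the function name `virt_cpu_count`, its signature, and the round-up policy are assumptions made for illustration only.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the PR): derive the number of CPUs a
 * container should see from a CFS quota/period pair, in microseconds.
 *
 * Assumptions for this sketch:
 *  - quota_us <= 0 means "no limit" (e.g. cgroup v2 "max" parsed as -1);
 *  - a fractional quota is rounded up, so 1.5 CPUs is shown as 2.
 */
int virt_cpu_count(long quota_us, long period_us, int host_cpus)
{
    assert(host_cpus > 0);

    /* Missing limit or invalid period: fall back to the host count and
     * avoid dividing by zero. */
    if (quota_us <= 0 || period_us <= 0)
        return host_cpus;

    /* Ceiling division: ceil(quota / period). */
    long n = (quota_us + period_us - 1) / period_us;

    if (n < 1)
        n = 1;
    /* Never report more CPUs than the host has online. */
    if (n > host_cpus)
        n = host_cpus;
    return (int)n;
}
```

With a quota of 150000 and a period of 100000 on a 16-core host, this sketch reports 2 processors; an oversized quota is clamped to 16.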
This commit addresses several critical issues:
- Fixed `nproc` core count reporting by virtualizing CPU sysfs entries:
`/sys/devices/system/cpu/{online,possible,present}`.
- Refined `/proc/meminfo` virtualization by capping all reported memory
fields at the container's MemTotal, fixing usage detection in fastfetch.
- Modified `sanitize_container_name` to allow the dot `.` character,
preserving OS version numbers (e.g., "Ubuntu 24.04" -> "Ubuntu-24.04").
- Enforced consistent container name sanitization early in the lifecycle
to fix "dead" internal logging and metadata path inconsistencies.
- Resolved a compiler error in `src/container.c` regarding unused
return value of `read` from signalfd.
a3a3800 to d70d165
Add built-in Resource Visibility Virtualization
Summary
Implements a zero-dependency virtualization layer that overrides `/proc/meminfo`, `/proc/cpuinfo`, `/proc/stat`, `/proc/uptime`, and `/proc/loadavg` inside containers. Applications now see the resource limits enforced by Cgroups instead of the host's totals, providing LXCFS-like functionality without the FUSE dependency.
Motivation
Runtimes like Java, Go, and Node.js read `/proc` directly to size thread pools, heap limits, and worker counts. Without virtualization, a container restricted to 512MB and 1 CPU would still see the host's 32GB and 16 cores, leading to OOM kills, CPU throttling, and degraded performance.
Changes
New files
- `src/virtualize.c`: Core virtualization logic for all five `/proc` files
- `src/virtualize.h`: Public API declarations
- `tests/verify_virtualization.sh`: Functional verification script
Modified files
- `src/boot.c`: Calls `ds_virtualize_init()` after pivot_root, guarded by a `/proc` mountpoint check
- `src/container.c`: Wires up the monitor loop with `signalfd`/`poll` for periodic updates; records `start_time` before fork; stores the PID namespace inode for recycling protection
- `src/droidspace.h`: Adds `virtualization`, `start_time`, and `ns_inode` fields to `ds_config`
- `src/main.c`: Adds the `--virtualization` CLI flag (`OPT_VIRTUALIZATION = 268`)
- `src/config.c`: Persists the `virtualization` field in the container config
- `Makefile`: Adds `virtualize.c` to the source list
Implementation Details
- `/proc/meminfo`: `MemTotal/Free/Available` from cgroup limits; scales other components by ratio; zeroes swap; integrates cgroup v2 `memory.stat` for `AnonPages`, `Cached`, `Slab`
- `/proc/cpuinfo`: processor count from `cpu_quota / cpu_period`
- `/proc/stat`: recomputed aggregate `cpu` line
- `/proc/uptime`: per-container uptime from `cfg->start_time` (`CLOCK_MONOTONIC`); scales idle time by CPU ratio
- `/proc/loadavg`: scaled by `container_cpus / host_cpus`; masks host last-PID
Reliability
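- In-place file updates (`O_WRONLY` + `ftruncate`) to preserve bind-mount inodes; rename-based atomic updates would silently break the bind mounts
- PID recycling protection: verifies the PID namespace inode (`/proc/<pid>/ns/pid`) before every update
- Monitor loop uses `signalfd` + `poll` (500ms) instead of `sleep` for clean signal handling
Opt-in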
The feature is disabled by default. Enable per-container via CLI or config:
--virtualization # or in container config: virtualization=1
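The in-place update strategy listed under Reliability can be sketched as follows. The helper names `virt_write_in_place` and `virt_inode_of` are hypothetical and not taken from `src/virtualize.c`, and the write-then-truncate ordering is one possible choice rather than the PR's confirmed behavior. The point is that opening the existing file with `O_WRONLY` and trimming it with `ftruncate` keeps the same inode, whereas writing a temporary file and calling `rename` would replace the inode that the bind mount refers to.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the inode number of path, or 0 if stat() fails. */
ino_t virt_inode_of(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_ino : 0;
}

/*
 * Overwrite an existing file without replacing its inode.
 * No O_CREAT and no rename(): the file must already exist, and any bind
 * mount that points at it keeps pointing at the same inode.
 * Returns 0 on success, -1 on error.
 */
int virt_write_in_place(const char *path, const char *data, size_t len)
{
    assert(path != NULL && data != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* Write the new content from offset 0, retrying on EINTR and
     * handling short writes. */
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }

    /* Trim whatever is left over from a longer previous version. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The trade-off is that the update is not atomic: a reader that races with the writer can briefly observe a mix of old and new content, which a rename-based update would avoid at the cost of breaking the bind mount.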