Low Latency
The Significance of Low Latency for HiFi Systems
Low latency is a critical benchmark for digital playback performance. In CelAudio's experience, every incremental reduction has yielded clearly audible improvements in sonic fidelity.
Most Linux-based open-source playback systems emphasize their low-latency design achievements. CelAudio's view on metrics has always been: poor metrics guarantee poor sound; good metrics do not guarantee good sound. For latency in digital playback systems, there is currently no industry measurement standard, making it impossible to definitively classify metrics as good or bad. However, throughout CelAudio's five-year R&D journey developing CelWare and CelPlayer, every latency reduction has yielded clearly audible improvements.
System latency is determined by the following key factors:
CPU Performance: Higher CPU performance delivers faster system response and lower latency.
Bus Speed: Higher bus speed enables faster data transfer and lower latency.
Audio Decoding: Audio decoding is the most time-consuming stage in digital playback. Higher decoding performance directly reduces system latency.
Audio Stream Write Method: The method used to write audio streams to hardware determines overall write speed. More efficient write methods reduce latency significantly.
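The source gives no latency figures, so the following is only a rough illustration of one contribution that the write stage makes: the time represented by audio frames waiting in a buffer before they reach the hardware. The function name and the example sizes are assumptions chosen for illustration, not CelAudio measurements.

```python
def buffer_latency_ms(frames, sample_rate_hz):
    """Time in milliseconds that `frames` audio frames represent at a given sample rate."""
    if frames < 0 or sample_rate_hz <= 0:
        raise ValueError("frames must be >= 0 and sample_rate_hz must be > 0")
    return 1000.0 * frames / sample_rate_hz


# Hypothetical buffer sizes, for illustration only.
small = buffer_latency_ms(441, 44100)    # 441 frames queued at 44.1 kHz
large = buffer_latency_ms(4410, 44100)   # ten times as many frames queued
```

Under this simple model, a smaller queue between software and hardware translates directly into less delay, which is one reason the write method matters.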
CelMusperOS
CelMusper Music Operating System
At the audio operating system level, the three main choices are Windows, Linux, and Android — with Linux being the mainstream approach for dedicated audio systems. The reasons are as follows:
Real-Time Operating System Capability: Among the three mainstream operating systems, Linux offers a real-time kernel branch. It adds real-time capabilities — such as real-time scheduling and inter-process communication — on top of the standard Linux kernel. A well-designed RT Linux system can guarantee real-time performance, effectively reducing latency and improving playback quality.
Precision Software Curation: In Linux systems, software deployment can be tailored to actual requirements. Through careful curation, only essential functions are included — yielding a streamlined system free of unnecessary applications. This prevents CPU cycles from being wasted on non-essential tasks, maximizing the proportion of time dedicated to audio processing, improving efficiency, and reducing latency.
Targeted Hardware Minimization: Linux kernel trimming involves retaining only the necessary hardware drivers, causing non-essential hardware to remain inactive or in its lowest power state. This reduces CPU time spent processing hardware interrupts, lowers system latency, and minimizes interference from unnecessary hardware. For example, in the latest NS series, CelHeart-G1 retains the HDMI interface for maintenance purposes, but CelWare disables the GPU — rendering the interface non-functional. This both reduces GPU interference and minimizes CPU overhead from HDMI interrupts, thereby lowering overall system latency.
For these reasons, CelAudio chose Arch Linux as the foundation for the CelWare music operating system. By deeply customizing Arch Linux, adopting the RT Linux kernel, and trimming unneeded functions and hardware support, CelWare achieves extremely low latency.
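As a minimal sketch of what real-time scheduling looks like from an application's point of view, the snippet below asks Linux to run the calling process under the SCHED_FIFO policy using Python's standard `os` module. It is not CelWare code: the function name and the priority value are assumptions, and the request normally needs elevated privileges, so the sketch reports failure rather than raising.

```python
import os


def try_enable_realtime(priority=10):
    """Request the SCHED_FIFO real-time policy for the caller; return True on success."""
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_FIFO"):
        return False  # scheduling-policy control is not exposed on this platform
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    # Clamp the requested (hypothetical) priority into the range the kernel accepts.
    param = os.sched_param(max(lowest, min(highest, priority)))
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, param)  # pid 0 means "the caller"
    except OSError:
        return False  # typically a permission error without root or CAP_SYS_NICE
    return True
```

A player process would call `try_enable_realtime()` once at startup and fall back to normal scheduling if it returns `False`. How tightly the kernel then bounds wake-up delays depends on the kernel build, which is where a real-time kernel matters.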
CelPlayer
CelPlayer Audio Player
CelPlayer employs a unique two-stage playback architecture. In the first stage, music data from the hard drive is converted to the target decoder's native hardware format and stored in memory. In the second stage, this hardware-format data is written directly to the decoder via memory-mapped I/O — bypassing the Linux ALSA mixer layer entirely and eliminating format conversion overhead. By replacing the traditional path of transcoding → ALSA processing → hardware write with a single direct memory-to-hardware transfer, CelPlayer dramatically reduces playback latency.
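The two stages can be sketched as follows. This is an illustration under stated assumptions, not CelPlayer's implementation: the target format is assumed to be signed 32-bit little-endian PCM, the function names are hypothetical, and an anonymous memory mapping stands in for the decoder's memory-mapped buffer.

```python
import mmap
import struct


def to_s32le(samples):
    """Stage 1: convert floats in [-1.0, 1.0] to packed signed 32-bit little-endian PCM."""
    scale = 2**31 - 1
    ints = [int(max(-1.0, min(1.0, s)) * scale) for s in samples]
    return struct.pack("<%di" % len(ints), *ints)


def write_to_device(device, offset, pcm):
    """Stage 2: copy already-converted bytes into the mapped buffer; return the next offset."""
    end = offset + len(pcm)
    if end > len(device):
        raise ValueError("PCM block does not fit in the mapped buffer")
    device[offset:end] = pcm  # plain memory copy, no format conversion at write time
    return end


# Stage 1 runs ahead of playback and keeps its result in memory.
prepared = to_s32le([0.0, 1.0, -1.0, 0.5])

# Stand-in for the hardware buffer (assumption: a real system would map the device instead).
device = mmap.mmap(-1, 4096)
next_offset = write_to_device(device, 0, prepared)
```

The design point is that all conversion work happens in stage 1, so the time-critical stage 2 reduces to a memory copy.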
CPU & Cache
High-Performance CPU and Cache Utilization
CelHeart-G1 employs an AMD Ryzen 5000 series CPU, delivering high performance with low latency. The Ryzen 7 5800 series features eight cores within a single CCX (Core Complex), sharing a unified L3 cache, while each core's dual threads share the same L2 cache. Through strategic allocation of player and I/O threads to specific cores, along with optimized software processing, the shared L2/L3 cache mechanism enables faster data delivery to I/O interfaces — further reducing processing latency.
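Allocating work to specific cores can be sketched with Linux CPU affinity, here through Python's standard `os` module. The function name and the CPU numbers in the usage comment are hypothetical. Which logical CPUs are the two threads of one physical core varies by system, so real values would have to be read from the machine's CPU topology.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the caller to the given logical CPUs; return the resulting CPU set."""
    if not hasattr(os, "sched_setaffinity"):
        return set()  # affinity control is not available on this platform
    allowed = os.sched_getaffinity(0)
    target = set(cpus) & allowed
    if not target:
        return allowed  # nothing usable was requested: leave affinity unchanged
    os.sched_setaffinity(0, target)  # pid 0 means "the caller"
    return os.sched_getaffinity(0)


# Hypothetical usage: keep the player on logical CPUs 2 and 3.
# pin_to_cpus({2, 3})
```

Keeping the player and I/O work on cores that share cache means data written by one is more likely to still be in cache when the other reads it.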
PCIe Architecture
Full PCIe Architecture Reduces Latency and Noise
The latest NS series music servers adopt a high-speed PCIe architecture, keeping all interfaces in a sustained high-throughput state — significantly reducing latency. The independent PCIe design also enables each interface chip to employ its own linear power supply, effectively lowering noise levels.
* On the NS6 and NS6U, the lower USB interfaces connect directly to the southbridge; on the NS6U, the upper USB Audio interface is a PCIe interface.