Low Latency For HiFi Systems

Low latency is one of the keys to achieving high sound quality in digital playback systems. Several factors determine it:

  • CPU performance: CPU performance determines how quickly the system responds. The higher the CPU performance, the faster the system responds and the lower the latency.
  • Bus rate: The bus rate determines how quickly the system transfers data. The higher the bus rate, the faster the transfer and the lower the latency.
  • Audio decoding: Audio decoding is the most time-consuming part of a digital audio player system. The faster the decoding, the lower the latency.
  • Algorithms and implementation methods for writing the audio stream to hardware: These determine the write speed of the system and have a large effect on latency.
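
As a rough illustration of how the write path shows up in numbers, the latency contributed by an audio buffer is its length in frames divided by the sample rate. The sketch below assumes illustrative buffer sizes and sample rates; none of them are CelAudio specifications, and `buffer_latency_us` is a hypothetical helper name.

```python
def buffer_latency_us(frames: int, sample_rate_hz: int) -> float:
    """Time in microseconds needed to play `frames` frames at `sample_rate_hz`."""
    return frames / sample_rate_hz * 1_000_000


if __name__ == "__main__":
    # Illustrative (assumed) buffer sizes and sample rates, not product figures.
    for frames, rate in [(64, 44_100), (256, 96_000), (1024, 192_000)]:
        latency = buffer_latency_us(frames, rate)
        print(f"{frames:>5} frames @ {rate} Hz -> {latency:.1f} us")
```

Smaller buffers lower this figure, but they also leave less time for the CPU to refill them, which is why the other factors in the list matter.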

CelWare Music Operating System

At the operating system level, CelWare, CelAudio's operating system for music playback, is based on Arch Linux. Arch Linux is a lightweight, highly customizable Linux distribution focused on simplicity and user control. It follows a "do-it-yourself" philosophy, providing a minimal base system that users configure to their needs. Its core principle is KISS ("Keep It Simple, Stupid"): a minimal default configuration. CelAudio customizes Arch Linux extensively:

RT Linux: The real-time kernel is designed to achieve low-latency responsiveness by restructuring the core kernel architecture, making Linux suitable for soft real-time applications. The maximum kernel-mode non-preemption latency is below 50 μs.
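
One way to get a feel for scheduling latency is to ask for a short sleep and record how late the wake-up is. The sketch below is only a user-space approximation: it does not measure kernel-mode non-preemption latency and will not reproduce the 50 μs figure above. Dedicated tools such as cyclictest are the usual choice for that. The function name and parameters are hypothetical.

```python
import time


def measure_wakeup_overshoot_us(interval_s: float = 0.001, iterations: int = 200) -> list:
    """Sleep `interval_s` repeatedly and return how late each wake-up was, in microseconds."""
    overshoots = []
    for _ in range(iterations):
        start = time.monotonic_ns()
        time.sleep(interval_s)
        elapsed_ns = time.monotonic_ns() - start
        # Overshoot = actual sleep time minus requested sleep time.
        overshoots.append((elapsed_ns - interval_s * 1e9) / 1000.0)
    return overshoots


if __name__ == "__main__":
    samples = measure_wakeup_overshoot_us()
    print(f"min {min(samples):.1f} us, max {max(samples):.1f} us, "
          f"mean {sum(samples) / len(samples):.1f} us")
```

On a general-purpose kernel under load the maximum tends to vary widely between runs; a real-time kernel aims to keep that worst case bounded.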

Highly customized system: By careful design, CelWare includes only the functions it needs, which makes it a pure operating system for audio applications. No unnecessary applications are present, so no CPU time is spent on useless work. A larger share of CPU time goes to audio processing, which improves processing efficiency and reduces audio processing latency.

Pruning unused hardware: Only the necessary hardware support is retained. This prevents undesired hardware interrupts from consuming CPU resources.

CelPlayer Audio Player

CelPlayer uses a unique technical approach, dividing the music playback process into two independent stages:

The first stage: Convert the music file to the hardware format and store it in memory.

The second stage: Write the data from memory to the hardware via memory-mapped I/O (MMAP).

The second stage is the actual playback process. It achieves μs-level latency by bypassing kernel context switches. CelAudio specifically leverages CPU cache lines and SIMD instructions for bulk transfers. This implementation can achieve more than 10x the throughput of syscall-driven I/O in NVMe SSD control. This makes the first stage very fast, so there is no long wait before the music starts.
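
The two-stage split can be sketched as follows. This is a minimal illustration, not CelPlayer's implementation: the hardware format (signed 32-bit little-endian), the period size, and the function names are assumptions, and an anonymous memory map stands in for the memory-mapped hardware buffer. A real player would also wrap around a ring buffer and pace its writes against the hardware read pointer.

```python
import mmap
import struct


def stage1_convert(samples_16bit: list) -> bytes:
    """Stage 1: convert decoded 16-bit samples to the assumed hardware format
    (signed 32-bit little-endian, sample placed in the upper 16 bits)."""
    return struct.pack(f"<{len(samples_16bit)}i", *(s << 16 for s in samples_16bit))


def stage2_write(pcm: bytes, mapped: mmap.mmap, period_bytes: int) -> int:
    """Stage 2: copy prepared data into the mapped region one period at a time.
    Returns the number of bytes written (no wrap-around in this sketch)."""
    view = memoryview(pcm)  # slicing a memoryview avoids intermediate copies
    capacity = len(mapped)
    written = 0
    while written < len(pcm) and written < capacity:
        chunk = view[written:written + min(period_bytes, capacity - written)]
        mapped[written:written + len(chunk)] = chunk
        written += len(chunk)
    return written


if __name__ == "__main__":
    decoded = [0, 1000, -1000, 32767, -32768] * 100  # stand-in for decoder output
    pcm = stage1_convert(decoded)
    region = mmap.mmap(-1, len(pcm))  # anonymous map standing in for the device buffer
    count = stage2_write(pcm, region, period_bytes=256)
    print(f"prepared {len(pcm)} bytes, wrote {count} bytes")
    region.close()
```

Because all format conversion happens in stage 1, the stage 2 loop does nothing except copy memory, which is the property the text relies on for low latency.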

ZEN 3 CPU Cache

The CPU cache is a small, ultra-fast memory layer integrated into the processor. It stores frequently accessed data and instructions so that they can be reached with much lower latency than main memory (RAM). CelHeart-G1 uses an AMD Zen 3 CPU. Zen 3, AMD's CPU microarchitecture, features a redesigned cache hierarchy optimized for latency reduction and for gaming and application performance. CelPlayer is optimized to leverage Zen 3's cache line size and SIMD instructions for efficient data processing.
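
Zen 3 uses 64-byte cache lines, so one simple way to take the cache line size into account is to size transfer chunks so that they cover whole cache lines and whole audio frames at the same time. The sketch below shows only that sizing arithmetic. The helper name and the frame sizes are hypothetical, Python itself does not issue SIMD instructions, and this is not a description of CelPlayer's actual optimization.

```python
from math import gcd

CACHE_LINE_BYTES = 64  # cache line size on Zen 3 (and most current x86 CPUs)


def aligned_chunk_frames(requested_frames: int, bytes_per_frame: int,
                         line: int = CACHE_LINE_BYTES) -> int:
    """Round `requested_frames` up so the chunk is a whole number of cache lines."""
    # Smallest byte count that is a multiple of both the frame size and the line size.
    unit_bytes = bytes_per_frame * line // gcd(bytes_per_frame, line)
    unit_frames = unit_bytes // bytes_per_frame
    # Round up to the next multiple of unit_frames.
    return -(-requested_frames // unit_frames) * unit_frames


if __name__ == "__main__":
    for bpf in (4, 6, 8):  # assumed frame sizes: stereo 16-bit, packed 24-bit, 32-bit
        frames = aligned_chunk_frames(100, bpf)
        print(f"{bpf} bytes/frame: {frames} frames = {frames * bpf // 64} cache lines")
```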

PCIe-Based I/O

All modules of the new-generation NS music server are built on independent PCIe buses. With this design the modules do not share hardware resources, which means lower latency in the transmission phase. In addition, each PCIe card is equipped with a dedicated linear power supply designed specifically for that card. The linear power subsystem has lower noise and is isolated to eliminate switching-noise crosstalk.