Enabling 8K+AI Audiovisual Experience

As humans, our interactions are naturally based on audio and visual communication. It is not surprising then that the most popular forms of entertainment are audiovisual in nature. These experiences are more immersive and exciting when presented in vivid high definition, so consumers are constantly demanding more from their visual technology. Resolution and image quality are continuously improving, with 4K fast becoming mainstream and 8K UHD not far behind. Meanwhile, audiovisuals have been upgraded from 2D to 3D, and all-new standards such as AR/VR and panoramic sound have been developed. HiSilicon's industry-leading smart vision capabilities meet all of these requirements. At the production end, we enable our partners to produce diverse UHD media production tools, from the consumer level to the professional level, such as 4K/8K professional cameras, action cameras, and drones. In addition, thanks to years of experience producing industry-leading audio and visual technologies, we offer a wide range of SoC solutions for smart large screens, STBs, and AR/VR.

High-Quality UHD Smart TV

Television sets have served as the centerpiece of household entertainment for decades. They are capable of receiving broadcast signals from wired and wireless TV stations, and displaying audio and video signals transmitted by external devices (such as STBs and DVD players) through various analog and digital interfaces. In 2007, the first generation of networked smart TVs came into being, representing a new, innovative paradigm for TV products. Smart TVs could run a wide variety of apps, and enabled consumers to watch live and on-demand video directly from the Internet. They also allowed TV content to be actively selected, a big departure from the passively received nature of traditional TV content. In addition, smart TVs supported signal reception (wireless and wired), as well as a wide range of digital/analog interfaces, network interfaces (wireless and wired), audio/video signal processing methods, and image displays.
Just as smartphones took mobile phones far beyond basic communications, smart TVs have grown beyond their role as a media and entertainment center to also serve as an information sharing center, a household control hub, and a medium for human-machine interaction. The latest generation of smart TVs offers superb audiovisuals, supporting 4K/8K content, premium audio, and high dynamic range (HDR) content, while also integrating AI cameras, neural-network processing units (NPUs), and diverse network access capabilities such as WiFi and Bluetooth. Their main processors now integrate a high-performance CPU, GPU, NPU, 4K/8K codec, and image signal processor (ISP), as well as a premium audio digital signal processor (DSP). Notably, the HiSilicon smart TV solution supports our proprietary HiStreaming technology, which enables smart TVs to quickly identify any peripheral devices that also support HiStreaming, connect to them seamlessly with lower latency and improved security, and support structured transmission of AI data. This allows peripheral devices to be operated and controlled as if they were local devices, delivering the best possible experience to every member of the household.
HiSilicon has been researching and manufacturing smart TV SoCs for almost a decade, and millions of TVs around the world already use HiSilicon technologies to provide consumers with optimal audiovisual quality. As the global leader in the smart TV SoC sector, HiSilicon offers a full range of smart TV SoCs that support HDR, AI picture quality (AI-PQ), motion estimation/motion compensation (MEMC), and a wide range of different resolutions, from full high definition (FHD) to 4K/8K. All HiSilicon solutions support HarmonyOS, Linux, Android, and other mainstream operating systems. In addition, HiSilicon SoCs come equipped with the following peripheral chips:

  • Image enhancement: further enhances the image quality for high-end products
  • Display driver: features the large-screen timing controller (TCON) and screen source driver chip, ensuring clearer images and higher reliability
  • Visual perception: provides consumers with a smoother and smarter experience via HiStreaming with the full series of HiSilicon AI camera chips
  • Wireless connection: WiFi, 4G/5G, etc.

Optimal Audiovisual STB

A set-top box (STB) is a media device connected to a TV. It decodes audio and video signals and sends them to the TV, compensating for the TV's shortcomings in data processing and app extension. There are currently two types of STBs on the market: Internet Protocol television (IPTV) STBs and over-the-top (OTT) STBs. IPTV STBs provide live streaming services through a carrier-controlled private network, while OTT STBs do so over the public Internet. Owing to regulatory restrictions, OTT STBs can only provide live streaming services after a third-party app is installed. All types of STBs consist of logical modules such as those for network connection, digital signal processing, and video signal output.

As artificial intelligence (AI) and the Internet of Things (IoT) continue to rapidly advance, STB functions have evolved from simply decoding audio and video to gradually encompassing all major smart home features, such as camera capabilities, ultra-high-definition (UHD) encoding/decoding, access technologies such as WiFi and Bluetooth, high-quality audio playback, and local video display. Modern STBs are now capable of the following functionality:

  • Interconnecting with smart home devices, integrating far-field and near-field voice input, and utilizing cameras and sensors to detect motion and identify postures.
  • Integrating audio, video, and news content from the Internet, identifying users, and intelligently recommending and displaying content.

No longer a mere decoding chipset, the STB SoC can now integrate a high-performance CPU, NPU, UHD codec, ISP, and premium audio DSP.

Recent years have seen online videos become a major cultural phenomenon, with a diverse range of attractive content (including short videos and live streaming) more accessible than ever before. With such a wide variety of content and varying daily routines, it becomes difficult to satisfy the entertainment needs of an entire household with just a single TV screen. The intelligent STB has been developed as a remedy: it integrates high-quality audio and video output devices to provide more convenient and flexible video playback modes for consumers. In addition, with its high-performance CPU, GPU, NPU, ISP, and DSP, the intelligent STB is a suitable entry point for implementing smart home controls.

HiSilicon launched its first STB SoC in 2007. Over the past decade, HiSilicon and its partners have blazed a trail in such fields as HD, intelligence and 4K/8K, and hundreds of millions of STBs based on HiSilicon SoCs have now been brought online. In addition to providing high-quality audio and video services for consumers, they also represent highly reliable and easy-to-manage solutions, capable of serving multiple carriers. HiSilicon STB SoCs feature industry-leading image quality enhancements, alongside robust system performance, while certain mid- and high-end products integrate the Da Vinci AI NPU module and support in-depth integration into the HarmonyOS ecosystem. By offering diverse external interfaces and working with camera and WiFi chips, our SoCs provide consumers with seamless home connections and smart home services.

Capturing Best Moments with Smart AI Cameras

DV cameras and professional unmanned aerial vehicle (UAV) cameras are fantastic for documenting our lives or producing stunning videos. The content can then be used for vlogs, short videos, or live streaming. But conventional professional cameras are often quite large and cumbersome to use, and not really suitable when users just want to spontaneously capture photos and videos. What users need are high-quality and easy-to-use cameras which produce great videos anywhere and at any time. Fortunately, thanks to the latest advances in image processing, artificial intelligence (AI), and Internet of Things (IoT), more affordable "casual" cameras are increasingly receiving advanced features usually found on professional alternatives. To create a chipset solution for these devices, the following requirements must be taken into account:
  • High image quality: 4K/8K UHD image quality with digital image stabilization (DIS) algorithms.
  • Intelligence: Support for object identification and tracking, simultaneous localization and mapping (SLAM) for UAV, and intelligent obstacle avoidance.
  • Compact design: Must satisfy the high requirements of portable and UAV cameras in terms of size and power consumption.
  • Wireless connection: High-speed and low-latency connection for wireless sharing and live streaming.

At HiSilicon, we provide industry-leading Smart Vision and Smart IoT products. Our professional solutions are tailored for action DVs, smart cameras, and commercial UAVs, and we make it easy to create high-quality movies and capture unmissable moments.

Long battery life and smart HD recording
  • Our solution supports 8K UHD, ISP, and H.265 video for better image quality, and 6-DoF DIS that ensures stability and clarity when the camera is moving.
  • Accelerated AI computing powered by an efficient NPU enables AI scene identification and allows the camera to automatically center on the tracked object. A pan-tilt-zoom (PTZ) camera can easily produce a smart tracking effect which is difficult for conventional cameras to achieve. This means users no longer need to be photography experts to create jaw-dropping blockbusters.
  • The depth processing unit (DPU) supports hardware acceleration for binocular ranging and 3D reconstruction, allowing UAVs to avoid obstacles during flight.
  • An intelligent algorithm enables ultra-low power consumption to prolong battery life. When paired with the unique turbo start function, your camera is always ready for action.

Instant sharing without boundaries
Our 5G Pre-Module enables the transmission of professional HD videos in real time with low latency, while its advanced anti-interference capabilities ensure that transmissions are secure and smooth, even in challenging environments. The multi-screen instant sharing function enables images and videos to be transmitted to mobile phones and smart TVs with a single click, making it easier than ever to share incredible experiences.

XR: Virtual Has Never Been So Real

XR refers to technologies such as VR, AR, and MR. Encompassing next-generation computing platforms and involving the advanced convergence of digital and physical worlds, these cutting-edge technologies also introduce dramatic improvements to computing and connectivity. XR revolutionizes content display and human-machine interactions, and represents a flourishing market worth tens of billions of dollars.

Virtual reality (VR) technology generates an advanced, computer-driven simulation of the real world, allowing consumers to interact with objects in a three-dimensional space. From a technological perspective, VR exhibits the following three basic attributes: immersion, interaction, and imagination. Such is the level of excitement surrounding VR solutions that they could potentially serve as a next-generation universal computing platform. A human's central role is consistently emphasized in virtual systems. Whereas in the past, users could only observe content from an external position and react to one-dimensional displays using a keyboard and mouse, modern consumers have the option of being completely immersed in an environment created by the computer, enabling them to interact with multi-dimensional information through a variety of sensors. Such a high level of immersion gives rise to an exciting range of possible applications and use cases. For example, by utilizing VR, event attendees can benefit from a full in-person experience without actually being on-site. This was put into practice during the 70th anniversary celebration of the founding of the People's Republic of China, when local media placed VR stands on both sides of Chang'an Avenue to capture the event, allowing audiences at home to experience the parade "firsthand" through VR. The technology is also being applied to great effect within the education field, where students in remote rural areas can leverage VR to experience the geographical features of far-off lands, or observe the inner workings of the human body. Just as VR greatly facilitates student education, it has also been applied to engineering training and other suitable fields.
Augmented reality (AR) technology overlays digital information onto the physical world, with AR glasses integrating display, interaction, sensing, and multimedia capabilities. AR represents a more lucrative field than VR, and enormous progress has been made in a number of entertainment and B2B domains, including security, industry, tourism, and healthcare. AR smart glasses can also boost work efficiency by streamlining operating procedures and enabling remote assistance.
Mixed reality (MR) technology combines the real and virtual worlds to create an all-new visual environment where physical and digital objects coexist and interact with each other in real time. Representing the combination of VR and AR, MR technology introduces real scenario information into the virtual scenario and establishes a loop of interaction and feedback between the virtual and real worlds to provide a heightened sense of reality. The emergence of MR has helped break through the inherent limitations of VR resulting from its entirely virtual nature, enabling an expanded range of human-machine interactions with correspondingly broad commercial applications.

However, current XR devices still face the following challenges:
1. Low-resolution devices: In the case of VR, the ideal resolution for the human eye is approximately 60 pixels per degree (PPD). Generally, the single-eye resolution supported by 4Kp60 chips is 21.3 PPD, which is equivalent to viewing a 480p TV. An obvious screen-door effect is often present in this case, which greatly impacts both comfort and immersion. If the chip's capability can be upgraded to 8Kp120 hardware decoding, it will be able to support twice the resolution of current mainstream VR systems (single-eye 42.7 PPD), significantly improving the visual experience and offering greater comfort.
2. Dizziness: 25% of consumers rank dizziness as the number one obstacle preventing them from purchasing XR devices. Latency caused by insufficient CPU processing capabilities is one of the main causes of dizziness, as visual information fails to be rendered in sync with a user's head movements, resulting in disorientation.
3. NPU-based scene identification speed: Scene identification involves image capture, analysis, upload, and display. The faster a scene is identified, the better the resulting experience. However, to identify and analyze hundreds of faces per minute, not to mention license plates within milliseconds, chipsets must incorporate powerful NPU computing capabilities.
4. Transmission bandwidth and latency: According to HUAWEI X-Labs white papers, bandwidth of up to several Gbit/s and millisecond-level latency are required to ensure a premium experience. As traditional 4G and WiFi are unable to meet these requirements, WiFi 6 and 5G are the preferred technologies.
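
The PPD figures quoted in the first challenge can be reproduced with a simple calculation. The sketch below is illustrative only: it assumes that PPD is the horizontal pixel count divided by the horizontal field of view, and that 3840 (4K) or 7680 (8K) horizontal pixels are spread across a 180-degree field of view for one eye. Neither the formula nor the 180-degree value is stated in the text above; they are assumptions chosen because they match the quoted 21.3 and 42.7 PPD. The function and constant names are hypothetical.

```python
# Rough pixels-per-degree (PPD) estimate for a single eye.
# ASSUMPTION: PPD = horizontal pixels / horizontal field of view (degrees).
# ASSUMPTION: a 180-degree field of view; this value is not given in the
# article and is used only because it reproduces the quoted figures.

ASSUMED_FOV_DEGREES = 180.0
IDEAL_PPD = 60.0  # approximate ideal for the human eye, per the text


def pixels_per_degree(horizontal_pixels: int, fov_degrees: float) -> float:
    """Return the average number of pixels covering one degree of view."""
    if fov_degrees <= 0:
        raise ValueError("field of view must be positive")
    return horizontal_pixels / fov_degrees


ppd_4k = pixels_per_degree(3840, ASSUMED_FOV_DEGREES)
ppd_8k = pixels_per_degree(7680, ASSUMED_FOV_DEGREES)

for label, ppd in (("4K", ppd_4k), ("8K", ppd_8k)):
    share = ppd / IDEAL_PPD
    print(f"{label}: {ppd:.1f} PPD ({share:.0%} of the ~60 PPD ideal)")
```

Under these assumptions, doubling the horizontal pixel count doubles the PPD, yet an 8K source still falls short of the roughly 60 PPD ideal.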

HiSilicon offers a comprehensive XR (VR/AR/MR) solution, including such components as application processing (AP), display, vision, and connection.
High-performance computing:
  • Independent NPU with HiSilicon-exclusive architecture, providing up to 9 TOPS of AI compute power
  • Octa-core Cortex-A73 processor, delivering high-speed computing with reduced latency
UHD display:
  • 8K UHD decoding, supporting 8K 360-degree immersive XR panoramic video
  • 4K UHD low-latency encoding, supporting XR remote immersive video conferences and remote assistance applications
  • AI PQ enhancement
  • XR audio and audio effect processing
Low-power smart perception:
  • Low-power AI visual solution, human/object detection and tracking, and AI visual enhancements
  • Intelligent voice solution
  • Simultaneous localization and mapping (SLAM) and six degrees of freedom (6DoF) application
High-speed and low-latency connections:
  • Low-latency image transmission supports both cloud VR applications and short-range wireless image transmission applications
  • High-speed and low-latency connection solution (5G and WiFi 6)