Enabling 8K+AI Audiovisual Experience

As human beings, our interactions are based on visual and audio communication. It is not surprising, then, that the most popular forms of entertainment are audiovisual ones. These experiences are more immersive and exciting in vivid high definition, so consumers are demanding more from their visual technology. Resolution and image quality are continuously improving, and 4K is becoming mainstream. The 8K UHD industry chain is also maturing. Meanwhile, audiovisuals have been upgraded from 2D to 3D, and new standards, such as AR/VR and panoramic sound, have been developed. Our industry-leading smart vision capabilities meet all of these requirements. At the production end, we enable our partners to provide diverse UHD media production tools, from the consumer level to the professional level, such as 4K/8K professional cameras, action cameras, and drones. In addition, thanks to years of experience in producing industry-leading audio and visual technologies, we are able to offer wide-ranging SoC solutions for smart large screens, STBs, and AR/VR.

High-Quality UHD Smart TV

TVs have played an important role in providing entertainment and diverse content to households for decades. They can receive broadcast signals from wired and wireless TV stations, and display audio and video signals transmitted by external devices (such as STBs and DVD players) through various analog and digital interfaces. In 2007, the first generation of networked smart TVs came into being, representing a new, innovative paradigm for TV products. These smart TVs supported the installation of wide-ranging apps, and enabled consumers to watch live and on-demand video from the Internet. They also allowed TV content to be actively selected, rather than simply passively received. In terms of hardware, they supported signal reception (wireless and wired), diverse digital/analog interfaces, network interfaces (wireless and wired), audio/video signal processing, and image display.
With the emergence of smartphones, mobile phone functions now far transcend basic communications capabilities. Smart TVs have developed in a similar manner, now serving not just as the media and entertainment center of the household, but also as an information sharing center, the household control hub, and a medium for human-machine interaction. The latest generation of smart TVs offers superb audiovisuals, supporting 4K/8K content, premium audio, and high dynamic range (HDR) content. These TVs also integrate AI cameras and neural-network processing units (NPUs), and provide diverse network access capabilities such as Wi-Fi and Bluetooth. Their main processors now integrate a high-performance CPU, GPU, NPU, 4K/8K codec, image signal processor (ISP), and premium audio digital signal processor (DSP). Notably, the HiSilicon smart TV solution supports the proprietary HiStreaming technology. With HiStreaming, smart TVs can quickly identify peripheral devices that also support HiStreaming, connect to them seamlessly with lower latency and higher security, and support structured transmission of AI data. This enables peripheral devices to be operated and controlled virtually as if they were local devices, providing every member of the household with the best possible experience.
HiSilicon has nearly a decade of experience in researching and manufacturing smart TV SoCs. Tens of millions of TVs around the world already use HiSilicon technologies to provide consumers with optimal audiovisual quality. As the global leader in the smart TV SoC sector, HiSilicon offers a full range of smart TV SoCs that support HDR, AI picture quality (AI-PQ), motion estimation/motion compensation (MEMC), and a wide range of resolutions, from full high definition (FHD) to 4K/8K. All HiSilicon solutions support HarmonyOS, Linux, Android, and other mainstream operating systems. In addition, HiSilicon SoCs are complemented by the following peripheral chips:

  • Image enhancement: further enhances the image quality for high-end products
  • Display driver: features the large-screen timing controller (TCON) and screen source driver chip, ensuring clearer images and higher reliability
  • Visual perception: provides consumers with a smoother and smarter experience via HiStreaming with the full series of HiSilicon AI camera chips
  • Wireless connection: supports Wi-Fi, 4G/5G, and other access technologies

Optimal Audiovisual STB

A set-top box (STB) is a media device connected to a TV. It decodes audio and video signals and sends them to the TV, compensating for the TV's shortcomings in data processing and app extension. There are two types of STBs currently on the market: Internet Protocol television (IPTV) STBs and over-the-top (OTT) STBs. IPTV STBs provide live streaming services through a carrier-controlled private network, while OTT STBs do so through the public Internet. Owing to policy restrictions, OTT STBs can only provide live streaming services after a third-party app is installed. All types of STBs consist of logical modules for network connection, digital signal processing, and video signal output.

With rapid progress in artificial intelligence (AI) and the Internet of Things (IoT), STB functions have evolved from simply decoding audio and video to gradually encompassing all major smart home features, such as camera capabilities, ultra-high-definition (UHD) encoding/decoding, access technologies such as Wi-Fi and Bluetooth, high-quality audio playback, and local video display. The intelligent STB:

  • Interconnects with smart home devices, integrates far-field and near-field voice interaction, and uses cameras and sensors to detect motion and identify postures.
  • Integrates audio, video, and news content from the Internet, identifies users, and intelligently recommends and displays content.

The STB SoC is no longer a mere decoding chip: it now integrates a high-performance CPU, NPU, UHD codec, ISP, and premium audio DSP.

In recent years, online video content has become a major cultural phenomenon, with easily accessible TV and variety shows. New short video and live streaming paradigms have also emerged. With such a diverse range of content and varying daily routines, it is difficult to meet the entertainment needs of the entire household with just a single TV screen. The intelligent STB has been developed as a remedy. It integrates high-quality audio and video output devices to provide more convenient and flexible video playback modes for consumers. In addition, with its high-performance CPU, GPU, NPU, ISP, and DSP, the intelligent STB is a suitable entry point for implementing smart home controls.

HiSilicon launched its first STB SoC in 2007. Since then, HiSilicon and its partners have blazed a trail in such fields as HD, intelligence, and 4K/8K. Hundreds of millions of STBs based on HiSilicon SoCs have been brought online thus far. In addition to providing high-quality audio and video services for consumers, they also represent highly reliable and easy-to-manage solutions, capable of serving multiple carriers. HiSilicon STB SoCs feature industry-leading image quality enhancement, alongside robust system performance. Some mid-range and high-end products will integrate the Da Vinci AI NPU module and support deep integration into the HarmonyOS ecosystem. By offering diverse external interfaces and working with camera and Wi-Fi chips, such SoCs provide consumers with seamless home connections and smart home services.

Capturing Best Moments with Smart AI Cameras

Using DV cameras or professional unmanned aerial vehicle (UAV) cameras is a great way to document our lives or make stunning videos. The content can then be used for vlogs, short videos, or live streaming. But conventional professional cameras are often quite large, making them cumbersome to use. They are not really suitable when users just want to capture photos and videos spontaneously. What users need are high-quality, easy-to-use cameras that produce great videos anywhere and at any time. Fortunately, thanks to the latest advances in image processing, artificial intelligence (AI), and the Internet of Things (IoT), "casual" cameras are gaining more and more professional features. They can do many of the tasks that were traditionally limited to specialized equipment like mobile cameras, pan-tilt-zoom (PTZ) cameras, and UAV cameras. To create a chipset solution for these devices, we need to take the following requirements into account:
  • High image quality: Needs to provide 4K/8K UHD image quality with digital image stabilization (DIS) algorithms.
  • Intelligence: Needs to support object identification and tracking, simultaneous localization and mapping (SLAM) for UAV, and intelligent obstacle avoidance.
  • Compact design: Needs to meet the high requirements of portable and UAV cameras when it comes to size and power consumption.
  • Wireless connection: Needs to allow high-speed and low-latency connection for wireless sharing and live streaming.

At HiSilicon, we provide industry-leading Smart Vision and Smart IoT products. Our professional solutions are tailored for action DVs, smart cameras, and commercial UAVs. We make it easy to create high-quality movies, and capture unmissable moments.

Long battery life and smart HD shooting
  • Our solution supports video at up to 8K UHD, with ISP processing and H.265 encoding for better image quality, and six degrees of freedom (6DoF) DIS that ensures stability and clarity when the camera is moving.
  • Accelerated AI computing powered by an efficient NPU enables AI scene identification and allows the camera to automatically center on the tracked object. Combined with a PTZ unit, a camera can easily produce a smart tracking effect that is difficult for conventional cameras to achieve. This means users no longer need to be photography experts to create jaw-dropping blockbusters.
  • The depth processing unit (DPU) supports hardware acceleration for binocular ranging and 3D reconstruction, so UAVs can avoid obstacles when flying.
  • An intelligent algorithm enables ultra-low power consumption to prolong battery life. Combined with the unique turbo start function, this means cameras are always ready to go.

Instant sharing without boundaries
Our 5G Pre-module enables the transmission of professional HD videos in real time with low latency. Its high anti-interference capability ensures that transmissions are secure and smooth, even in challenging environments. The multi-screen instant sharing function enables images and videos to be transmitted to mobile phones and smart TVs with one click, making it easier than ever to share incredible experiences.

XR: Virtual Has Never Been So Real

XR is an umbrella term for technologies such as VR, AR, and MR. These technologies form next-generation computing platforms that bring about the advanced convergence of the digital and physical worlds, along with dramatic improvements in computing, connectivity, and content display. XR revolutionizes content display and human-machine interaction, and is a flourishing market worth tens of billions of dollars.

Virtual reality (VR) technology generates a simulation of the real world through advanced, computer-driven technology, allowing consumers to interact with objects in three-dimensional spaces in real time, free of restrictions. On the technology side, VR has three basic attributes: immersion, interaction, and imagination. Together, these give VR the potential to serve as the next-generation universal computing platform. The central role of human beings is consistently emphasized in virtual systems. In the past, consumers could only observe the processed results from the outside of a computer, and react to the one-dimensional digital information displayed on the computer using the keyboard and mouse. Now consumers can be immersed in an environment created by the computer, and interact with multi-dimensional information through multiple sensors. Thanks to VR, people do not have to be onsite at an event to benefit from a full in-person experience. For instance, during the 70th anniversary celebration of the founding of the People's Republic of China, the media set up VR stands on both sides of Chang'an Avenue to capture the event, and audiences could watch the parade through VR for the first time. VR is also being applied to great effect in education. With VR, students in remote rural areas can wear VR glasses to learn about the geographical features of Australia, as if they were actually there, and can also learn about the circulation of blood in the human body by "going into" the body. As VR greatly increases students' interest in learning, it has been applied to engineering training and other fields as well.
Augmented reality (AR) technology overlays digital information on the physical world. AR glasses integrate numerous technologies such as display, interaction, sensing, and multimedia capabilities. Based on first-person-view interactions, they provide consumers with AR sensory experiences by displaying virtual content on the lenses. AR is a more lucrative field than VR. Enormous progress has now been made in entertainment and wide-ranging B2B fields, such as security, industry, tourism, and healthcare. AR smart glasses can also boost work efficiency by streamlining operating processes and offering remote assistance.
Mixed reality (MR) technology combines the real world with the virtual world to create a new visual environment, in which physical and digital objects coexist and interact with each other in real time. MR technology represents the combination of VR and AR. It introduces real scenario information into the virtual scenario and sets up a loop of interaction and feedback between the virtual and real worlds to provide a heightened sense of reality. The emergence of MR has helped break through the inherent limitations of VR resulting from its entirely virtual nature, and has further expanded the range of human-machine interactions, with correspondingly broad commercial applications.

However, XR devices on the market still face the following challenges:
1. Low resolution on devices. Take VR as an example. The ideal resolution for human eyes is about 60 pixels per degree (PPD). Generally, the single-eye resolution supported by 4Kp60 chips is 21.3 PPD, which is equivalent to the effect of a 480p TV. There is often an obvious screen-door effect, which prevents consumers from feeling completely immersed or comfortable. If the chip's capability is upgraded to 8Kp120 hardware decoding, it will be able to support twice the resolution of current mainstream VR, that is, 42.7 PPD per eye, which greatly improves the visual experience and offers greater comfort.
2. Dizziness. Some 25% of consumers cite dizziness as the main obstacle preventing them from purchasing devices. The latency caused by the CPU's insufficient processing capabilities is one of the main causes of dizziness. Latency makes visual information asynchronous with bodily perception. As a result, the consumer's visual system conflicts with other sensory channels, creating a sense of disorder and, in turn, dizziness.
3. NPU-based scene identification speed. Scene identification involves image capture, analysis, upload, and display. The faster it identifies the scene, the better the resulting experience. The analysis phase is the most time-consuming process. To identify and analyze hundreds of faces per minute and license plates within milliseconds, chipsets must incorporate powerful NPU computing capabilities.
4. Transmission bandwidth and latency. According to Huawei X-Labs white papers, to ensure a premium experience, hundreds of Mbit/s to several Gbit/s bandwidth and millisecond-level latency are required. Traditional 4G and Wi-Fi are unable to meet these requirements, so Wi-Fi 6 and 5G are the preferred technologies.
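
The PPD figures in item 1 can be reproduced with simple arithmetic. The sketch below is illustrative only. It assumes that the decoded frame is split side by side between the two eyes and that each eye covers a 90-degree horizontal field of view. Neither assumption is stated above, but together they yield the quoted values. The function name and its parameters are hypothetical.

```python
def pixels_per_degree(frame_width_px: int,
                      fov_deg: float = 90.0,
                      eyes: int = 2) -> float:
    """Estimate single-eye angular resolution in pixels per degree (PPD).

    Assumptions (not from the source text): the frame width is shared
    equally between the eyes, and each eye sees a horizontal field of
    view of `fov_deg` degrees.
    """
    per_eye_width_px = frame_width_px / eyes
    return per_eye_width_px / fov_deg


# 4K frame (3840 pixels wide) versus 8K frame (7680 pixels wide)
ppd_4k = pixels_per_degree(3840)
ppd_8k = pixels_per_degree(7680)

print(f"4K: {ppd_4k:.1f} PPD")  # 4K: 21.3 PPD
print(f"8K: {ppd_8k:.1f} PPD")  # 8K: 42.7 PPD
```

Under the same assumptions, doubling the frame width doubles the PPD, which is why 8K decoding is described as supporting twice the resolution of 4K. Even so, 42.7 PPD remains below the roughly 60 PPD cited as ideal for human eyes.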

HiSilicon offers a comprehensive XR (VR/AR/MR) solution, which includes components for application processing (AP), display, vision, and connection, as well as general solutions.
High-performance computing:
  • Independent NPU with a HiSilicon-exclusive architecture, providing up to 9 TOPS of AI compute power
  • Octa-core Cortex-A73 processor, delivering high-speed computing with reduced latency
UHD display:
  • 8K UHD decoding, supporting 8K 360-degree immersive XR panoramic video
  • 4K UHD low-latency encoding, supporting XR remote immersive video conferences and remote assistance applications
  • AI PQ enhancement
  • XR audio and audio effect processing
Low-power smart perception:
  • Low-power AI visual solution, human/object detection and tracking, with AI visual enhancements
  • Intelligent voice solution
  • Simultaneous localization and mapping (SLAM) and six degrees of freedom (6DoF) application
High-speed and low-latency connections:
  • Low-latency image transmission, supporting cloud VR applications and short-distance wireless image transmission applications
  • High-speed and low-latency connection solution (5G and Wi-Fi 6)