Analysis of intelligent audio solutions
1. Overview of the smart audio industry
A smart speaker is an intelligent terminal product that combines a voice interaction system with Internet service content and can be extended to connect more devices and content. It adds intelligent functions to a traditional speaker: on the one hand, Wi-Fi connectivity and voice interaction; on the other, content services such as music and audiobooks and Internet services such as apps, together with scenario-based smart home control. As a new category of smart hardware, the smart speaker has a very large potential market. It lets home users go online by voice, for example to request songs, shop online, or check the weather, and it can also control smart home devices, such as opening curtains, setting the refrigerator temperature, or heating up the water heater in advance.
Around 2013, smart speakers began to enter the domestic (Chinese) market, although at the time they were called Wi-Fi speakers rather than smart speakers. They were more complex to connect than Bluetooth speakers: they had to be joined to the network with the help of a smartphone, after which they could play streaming music and other content. Wi-Fi speakers did not develop smoothly. From 2015 they faded from the market as manufacturers stopped producing them, leaving only Bluetooth speakers to keep selling well. Most of the domestic Wi-Fi speaker companies from this wave have since closed or changed direction.
After Wi-Fi speakers faded out in China around 2016, smart speakers gradually came into public view. There were two key milestones: the opening of Amazon's Alexa platform, and CES (the International Consumer Electronics Show) held in Las Vegas in January 2017. With Alexa opened up and the Amazon Echo selling strongly, two groups emerged in China: one began studying smart speakers in order to build domestic voice interaction systems; the other relied on Alexa to produce smart speakers for export overseas.
A traditional speaker connects to a mobile phone, tablet, computer, or similar device over Bluetooth. The device downloads the music, encodes it for Bluetooth, and transmits it to the speaker for playback, so the traditional speaker is little more than a loudspeaker. Smart speakers have better hardware performance and network connectivity, which gives them advantages in music playback. First, smart speakers have higher bandwidth. They connect over Wi-Fi with a bandwidth of more than 150 Mbps, which is convenient for lossless music transmission, whereas the bandwidth of traditional Bluetooth audio is generally 24 Mbps, which is not enough to transmit high-quality music. Second, smart speakers can work together: several of them can be networked for synchronized playback across different rooms, which improves the listening experience. Third, smart speakers are connected to cloud music libraries with rich resources. The speaker itself can connect to various music apps for playback, so the vast music resources on the Internet are available to it directly.
Smart speakers have greatly expanded what traditional speakers can do, and as far as music is concerned they have not only retained the function but developed it further. According to a survey of why users buy smart speakers, listening to music is still the primary purpose, cited by up to 90% of buyers, followed by controlling smart home devices (48%) and replacing old speakers (36%).
Source: Public information
According to the results of the 2017 Smart Audio Survey released by NPR (National Public Radio) and Edison Research, nearly one in six people in the United States (16%) are expected to own a smart speaker. Taking the country's population as about 326 million (US Census Bureau, March 2, 2017), this implies roughly 51.2 million people in the United States with smart speakers, and what is even more striking is that this number is up 128% from a year earlier. Gartner, another authoritative US consulting company, has likewise forecast that smart speakers will keep spreading through US households by 2020.
According to the "China Smart Audio Market Analysis" report released by GfK (one of the world's top five market research companies) in October 2017, retail sales of smart speakers in China were only 10,000 units in 2015, rose to 60,000 units in 2016, and exceeded 100,000 units from January to August 2017. With the launch of many new products in the third quarter of 2017, sales made a significant leap, and in August 2017 alone the smart audio market grew 178% year on year. For 2017 as a whole, however, the two more important points in time were November (Double 11) and December (Double 12). On Double 11 in 2017, Alibaba's Tmall Genie smart speaker, priced at only 99 yuan after coupons, sold more than 1 million units; it should be emphasized that the low-price promotion strategy was the key factor behind its million-unit sales. The Dingdong TOP smart speaker also passed the million-unit mark on the day of Double 11, and the high sales of this speaker, priced at only 99 yuan, were likewise due to a low-price promotion strategy. Other products such as the Xiaomi AI speaker and the Kugou smart speaker also achieved rapid sales growth in November and December.
2. Analysis of the development of the smart audio industry
Voice interaction technology needs to be improved. In smart speaker interaction, voice technology is the core strength: the more devices are woken and activated and the more often users use them, the better the speech recognition, response speed, and learning ability become through training, the more users accept the product, and the more market share it can win from competitors. Voice interaction involves a very complex technical chain, including core technologies such as acoustic processing, speech recognition, semantic understanding, and speech synthesis, as well as supporting technologies needed for a good interactive experience, such as algorithmic noise reduction, sound source localization, and voiceprint recognition. Although Chinese companies such as Baidu and iFLYTEK have done well in patent applications and technological breakthroughs in recent years, the overall gap with the Silicon Valley giants is still obvious. The real problem is that domestic manufacturers pay more attention to "business model" innovation, such as content resource integration and sales channels, than to creating new business models or building core technical barriers through technological breakthroughs.
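The technical chain described above can be pictured as a sequence of stages, each consuming the output of the previous one. The following is only a toy sketch under stated assumptions: the stage names follow the text, but every function body is a hypothetical placeholder (plain string handling standing in for real signal processing and models), and the `Turn` type and `run_pipeline` helper are invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Turn:
    """State passed along the chain for one user utterance."""
    audio: str            # stand-in for captured microphone audio
    transcript: str = ""  # filled in by speech recognition
    intent: str = ""      # filled in by semantic understanding
    reply: str = ""       # filled in by speech synthesis


def acoustic_processing(turn: Turn) -> Turn:
    # Placeholder for noise reduction: drop a marker that simulates noise.
    return replace(turn, audio=turn.audio.replace("[noise]", "").strip())


def speech_recognition(turn: Turn) -> Turn:
    # Placeholder for ASR: the "audio" is already text in this toy model.
    return replace(turn, transcript=turn.audio.lower())


def semantic_understanding(turn: Turn) -> Turn:
    # Placeholder for NLU: keyword matching instead of a real model.
    keywords = {"weather": "get_weather", "play": "play_music"}
    for word, intent in keywords.items():
        if word in turn.transcript:
            return replace(turn, intent=intent)
    return replace(turn, intent="unknown")


def speech_synthesis(turn: Turn) -> Turn:
    # Placeholder for TTS: produce the text that would be spoken.
    return replace(turn, reply=f"Handling intent: {turn.intent}")


# The order mirrors the chain named in the text.
PIPELINE: List[Callable[[Turn], Turn]] = [
    acoustic_processing,
    speech_recognition,
    semantic_understanding,
    speech_synthesis,
]


def run_pipeline(audio: str) -> Turn:
    """Pass one utterance through every stage in order."""
    turn = Turn(audio=audio)
    for stage in PIPELINE:
        turn = stage(turn)
    return turn


result = run_pipeline("[noise] Play some jazz")
print(result.intent, "|", result.reply)
```

Because each stage depends on the one before it, a weakness early in the chain (for example poor noise reduction) degrades everything downstream, which is one way to read the point that the whole chain has to be strong.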
The domestic market is immature. Overseas, there are nearly 100 smart audio (product/brand) companies, and across the entire smart audio industry chain (chips, software, testing, solutions, contract manufacturing, and so on) there are thousands of participating companies. The domestic smart audio market is still at an early stage and is not mature, with only a few dozen domestic smart audio (product/brand) companies, such as Alibaba, JD.com, Xiaomi, Mobvoi, and Himalaya. On the one hand, domestic voice interaction research and development still needs to improve, and it is by no means something small and medium-sized companies can afford. This sets a technical threshold that leaves many small and medium-sized enterprises willing to develop products but unable to move forward. On the other hand, the domestic market is far from taking off, and Xiaomi, Alibaba, and JD.com have fought price wars, so many companies are shut out of the market by costs and prices.
Poor user interaction experience. Users currently report that smart speakers generally have poor far-field recognition, a high false wake-up rate, unstable continuous dialogue, poor semantic understanding, and poor sound quality, so improving the interactive experience has become an important part of smart audio development. To provide convenient use and rich functions, the vast majority of smart speakers on the market use an integrated design. However, because implementing many applications requires heavy investment and the space inside a smart speaker is limited, the sound quality of most smart speakers has improved slowly and struggles to meet users' demand for high-quality sound. Sound quality has become a major focus of user complaints. Lower-priced products on the market have poor sound quality, while smart speakers with excellent sound quality start at 1,000 yuan, a price that ordinary users are unwilling to pay.
User habits need to be cultivated. American kitchens and living rooms are open-plan, and listening to music while cooking is a real need and a real usage scenario for American users. In China this is often not the case, and most family members are used to watching TV and playing on their mobile phones. Chinese and American families therefore differ considerably in how they would use smart speakers, which leads to different product demand and, at this stage, to a very large gap in product adoption between the two markets. In general, though, potential demand in China's smart audio market has always existed, just as radio became widespread in China. At this stage, although smart speakers have many functions and rich content resources, these do not address users' pain points or lead most users to form lasting habits, and the lack of user habits directly leads to low user stickiness and product recognition.
Progress towards the smart home control hub is slow. Many large and medium-sized Chinese enterprises have entered smart audio not for the profits the hardware itself can bring, but for the broad market value that follows from building a smart home "control hub" through voice interaction. China has not yet established a complete smart home ecosystem, so problems such as fragmented usage scenarios and complex hardware operation remain unsolved. At the same time, domestic smart homes lack supporting regulations and unified standards, product quality is uneven, and the consumer experience is poor. Both problems stem from the state of the smart home ecosystem and leave smart audio without fertile soil in which to develop. At present, many consumers try the "new product" only out of curiosity and treat it more as a decoration than as a household essential, let alone as a smart home control hub.
3. Future development trends
Profit from software. In earlier years smart speakers were expensive and hardware was still the main source of profit. With the development of smart audio products and the growing popularity of smart homes in recent years, hardware profits have tended to decline, while profits from the content and application services built around smart speakers are increasing. Smart speakers offer rich content resources, such as music and audiobooks. For users, the basic music function is the foundation that attracts a purchase, and the extended content applications based on voice interaction are the deciding factor, which has also made them the main point of profit growth. In the future, smart speakers will continue to enrich their content applications to meet users' varied content needs. Each participating company will also follow the logic of using hardware to carry its own software and content services, vigorously developing its own distinctive services, such as music and related content resources, through smart audio hardware in order to gain more profit and market space.
Speech combined with vision. At present, combination products that can be controlled independently or jointly are the more mainstream form in China's smart audio market, for example pairing a smart speaker with a tablet or wearable device to achieve dual interaction through voice and screen. Breakthroughs and innovation in smart audio can start from two aspects, increasing user interaction and enriching functional applications, to help users do more and to reach more application scenarios. Compared with voice-only smart speakers, smart speakers with screens can enhance the human-computer interaction experience. On the one hand, a touch screen can raise the wake-up rate of the smart speaker, which addresses the problem of low utilization. On the other hand, multi-dimensional output combining image and voice can both enrich the output content and provide more intuitive interactive information, which addresses the limited usage scenarios of voice-only interaction. For example, it can open up scenarios such as shopping, watching movies, and video calls.
Improved sound quality. Sound quality is an important criterion for judging a smart speaker and should always be treated as a core concern: even a feature-rich smart speaker will be rejected by users if its sound quality is not up to standard. Public data shows that since 2014, poor sound quality has been the problem users criticize most in smart speakers. For a smart audio product, sound quality is a key indicator of whether it can achieve breakthrough innovation and lead change in the industry. For companies in the smart audio industry, intelligence and content resources only matter once sound quality has been done well. Smart audio products are therefore bound to return to sound quality while enriching their functions, raising it to a level that better matches those functions.
Pay more attention to the intelligent experience. Smart speakers were created to bring users rich and thoughtful Internet services, so that users can put down their mobile phones and better enjoy high-quality music and life. In the future, more services that could previously be used only through mobile apps will be ported to smart speakers, and by connecting to rich Internet services smart speakers will become a distribution platform for them. At the same time, smart speakers will pay more attention to personalization and emotional interaction, with features such as customized wake words, personalized speech synthesis, voiceprint plus face recognition, and AR/VR. With the continuous advancement of artificial intelligence technology and AI chips, intelligent voice technology will penetrate further into production and daily life, and voice will become a new form of human-computer interaction. Smart audio will take a more universal product form, with its ability to "think" embedded in any product.
Become the control hub of the smart home. Among the many products identified as possible entry points to the smart home, smart TVs, smartphones, and smart speakers are the three most highly anticipated. With the continuous development of artificial intelligence technology, the call for smart speakers to become the mainstream entry point is growing louder. The reason is that as voice interaction technology matures, smart speakers, as a carrier of voice interaction, will gradually surpass smartphones and smart TVs in the convenience and experience of controlling the smart home. In the future, smart speakers are expected to become the control hub of the smart home: an open platform that connects the smart TVs, lights, air conditioners, and other devices in the home and controls them through voice interaction.
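As a rough illustration of the "control hub" idea, the sketch below routes a recognized voice command to a registered device action. It is a minimal toy under stated assumptions: the `ControlHub` class, its `register` and `handle` methods, and the example devices are all hypothetical names invented here, and word matching stands in for real semantic understanding and real device protocols.

```python
from typing import Callable, Dict, Tuple


class ControlHub:
    """Hypothetical hub that maps (action, device) pairs to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, str], Callable[[], str]] = {}

    def register(self, action: str, device: str,
                 handler: Callable[[], str]) -> None:
        # Any device can plug in by registering its own handler,
        # which is what makes the hub an open platform in this sketch.
        self._handlers[(action, device)] = handler

    def handle(self, utterance: str) -> str:
        # Toy matching: look for the action and device words in the text.
        words = utterance.lower().split()
        for (action, device), handler in self._handlers.items():
            if action in words and device in words:
                return handler()
        return "Sorry, no matching device command."


hub = ControlHub()
hub.register("open", "curtains", lambda: "curtains opened")
hub.register("on", "lights", lambda: "lights on")

print(hub.handle("Please open the curtains"))
print(hub.handle("Turn on the lights"))
```

The point of the design is that the speaker only needs to understand the command; each connected device supplies its own action, so new devices can be added without changing the hub itself.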