ACTS Advanced Speech Quality Analysis and Testing System
- Commodity name: ACTS Advanced Speech Quality Analysis and Testing System
- Commodity ID: customized
ACTS is a fully automated, objective testing and analysis system adapted to the latest audio technology for a wide range of intelligent devices. The ACTS system consists of two main modules, ACTS-MM and ACTS-AQA, and many of its techniques have been patented. Customized test items can also be provided according to customers' actual needs.
ACTS main testing framework:
ACTS objective audio test system for mobile phones, etc.
ACTS objective audio test system for headsets and other wearables
ACTS objective audio test system for smart speakers, etc.
ACTS long-distance objective test scheme for large-scale equipment
ACTS was developed in-house for the following testing purposes:
1. Objective laboratory tests that are consistent with the actual user experience in real-world use
a) Correctly evaluate performance and increase user satisfaction
e) Only strictly repeatable tests allow the performance of individual phones to be strictly compared
3. Benefits of automated testing
f) Ensure strict reproducibility, accuracy, and objectivity with minimal manual intervention
g) Save testing manpower rather than wasting valuable talent on repetitive, error-prone work
h) Improve efficiency, avoid repeated testing, avoid human error
i) ACTS-AQA: Analyzes the general performance of communication audio to meet the test requirements of operators and others. The main standards are:
YD/T 1884-2013 (corresponding to international standards EN50332-1, EN50332-2)
CMCC mobile phone test standard
3GPP2 (3GPP standard)
3GPP TS 26.131/132
3GPP TS 26.131/132 (SWB/FB)
3GPP TS 51.010-1
ITU-T G.165 (ITU standard)
ITU-T G.160 Amendment 1 (2012)
ITU-T P.862 (PESQ)
ITU-T P.863 POLQA
ii) ACTS-MM: It is an objective test system specially designed for voice calls, human-computer interaction, and music/games of intelligent terminals.
ACTS-MM provides customers with the following value:
Provides objective standards for quality assurance and acceptance, ensuring that an intelligent terminal that passes laboratory testing also performs well in practical use, thereby improving user satisfaction;
Provides objective evaluation criteria for the performance of various noise cancellation and pickup schemes, and a basis for judging and improving audio performance at all stages of design and manufacturing;
Ensures the consistency of audio and voice interaction across smart terminals;
Provides an effective tool for audio performance tuning.
1) Accurate noise field reproduction
This is the foundation and key to multi-microphone system testing;
Using 8 speakers, after automatic calibration, noise is played from eight directions to produce a test environment consistent with various real-world noise fields, which is better suited to evaluating the performance of multi-microphone solutions;
It reproduces more than 100 kinds of noise for common mobile phone scenarios, covering almost all practical user scenarios, such as subways, restaurants, roadsides, and home TV disturbances, including motion noise;
Users can also customize new noise use cases.
Key point: the endless variety of real-world noise is represented by a limited set of lab playback noises
2) Adjustable indoor reverberation field construction
Reverberation degrades the performance of mobile phones and smart speakers, mainly because it blurs and muddles the near-end speech, especially in hands-free or long-distance pickup, rather than because of its effect on ambient noise.
Reverberation is caused by repeated reflections of sound from reflecting surfaces such as the walls of the room (as shown below); that is, the microphone receives not only the direct sound from the source but also various reflected sounds.
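Long-distance pickup is a major difficulty for speech recognition. Its impact comes mainly from two factors: low signal-to-noise ratio and reverberation of the near-end speech. 1. Low signal-to-noise ratio affects the recognition rate mainly because noise interferes with the speech and causes recognition errors (see the left figure below, word error vs SNR); 2. Speech in a reverberant environment strongly affects the performance of speech recognition and voice wake-up (see the right figure below, word error vs source-to-microphone distance; the greater the distance, the greater the effect of reverberation).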
In a reverberant environment, the speech is reverberated (late reverberation in particular), so later syllables are smeared by earlier ones. The speech becomes confused and blurred, which significantly reduces the speech recognition rate and intelligibility, especially when the distance exceeds the critical distance.
From the perspective of creating a test environment in the laboratory, a low signal-to-noise ratio environment is relatively easy to generate, while generating reverberation for near-end speech is much more difficult. What matters most in a reverberant environment is not the noise field but the near-end voice: the performance of speech recognition, voice wake-up, and voice quality for long-distance pickup after reverberation is the more critical measure.
ACTS-SRvB is the world's first system that generates a reverberation environment for the voice under test in the laboratory. It can adjust room size, RT60, and wall reflection coefficients to generate a variety of reverberation environments in the anechoic chamber, and is suitable for speech recognition, voice wake-up, and voice call testing in a variety of real-world reverberation environments.
ACTS-SRvB needs to be built on top of the ACTS test system.
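To illustrate how room size, wall reflection, RT60, and critical distance relate, the following minimal sketch uses Sabine's formula and the common critical-distance approximation. It is not the ACTS-SRvB algorithm; the function names are hypothetical, and it assumes a rectangular room whose walls all share one pressure reflection coefficient r, so that the energy absorption coefficient is 1 - r².

```python
import math


def sabine_rt60(length_m, width_m, height_m, reflection_coeff):
    """Estimate RT60 in seconds with Sabine's formula for a rectangular room.

    Assumption: every surface has the same pressure reflection coefficient,
    so the energy absorption coefficient is alpha = 1 - r**2.
    """
    if not 0.0 <= reflection_coeff < 1.0:
        raise ValueError("reflection_coeff must be in [0, 1)")
    volume = length_m * width_m * height_m
    surface = 2.0 * (length_m * width_m + length_m * height_m + width_m * height_m)
    alpha = 1.0 - reflection_coeff ** 2
    return 0.161 * volume / (surface * alpha)


def critical_distance(volume_m3, rt60_s, directivity=1.0):
    """Approximate distance (m) at which direct and reverberant energy are equal."""
    return 0.057 * math.sqrt(directivity * volume_m3 / rt60_s)


# Hypothetical example: a 5 m x 4 m x 3 m room with fairly reflective walls.
rt60 = sabine_rt60(5.0, 4.0, 3.0, reflection_coeff=0.9)
d_c = critical_distance(5.0 * 4.0 * 3.0, rt60)
print(f"RT60 = {rt60:.2f} s, critical distance = {d_c:.2f} m")
```

Beyond the critical distance the reverberant sound dominates the direct sound, which is the regime in which the text above says recognition rate and intelligibility fall off sharply.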
3) ACTS-MM test items
As explained earlier, ACTS has two modules. ACTS-AQA mainly covers testing against international, domestic, and operator standards, while ACTS-MM, in addition to the basic standard tests, provides objective tests that help improve the quality of smart terminals and are consistent with the user experience, specifically:
ACTS-MM Uplink Test Items:
8-speaker background noise playback, covering the various noise environments users encounter every day, plus adjustable reverberation scenes;
Basic test items:
Voice Level: the near-end voice level after noise cancellation. Voice Level is judged mainly against the NE Only (near-end speech only) case as a reference. If it is too high, the noise cancellation is not clean and too much residual noise remains; if it is too low, the noise cancellation is also damaging the speech
Noise suppression level after noise cancellation
ITU-T P.863 POLQA
The ITU-T standard for objective testing of voice quality; an extension and update of PESQ that adds super-wideband testing
ETSI EG 202 396-3 S-MOS/N-MOS
ETSI's objective test standard for speech and residual noise after noise cancellation processing
SNRi: ITU-T G.160 Appendix II Amendment 2, used to measure the signal-to-noise ratio improvement, noise reduction strength, and noise cancellation balance of noise cancellation processing under various noise conditions. If the voice before noise cancellation cannot be obtained, measure it with a standard microphone placed close to the main microphone of the phone. Specific parameters:
SNRi: Measures the improvement in signal-to-noise ratio after noise cancellation in the mobile phone. Note that the noise power estimate here uses the noise in the short gaps between speech
TNLR: Measures the overall noise reduction, in dB, after noise cancellation in the mobile phone. The noise power estimate here uses both the long noise-only segments and the short gaps between speech
DSN: Measures, in dB, how much the near-end voice level is suppressed along with the noise during noise cancellation. Ideally, only the noise is removed and the near-end speech is not suppressed
DNRSL: Measures, in dB, the level difference between the residual noise after noise cancellation in noise-only segments and the residual noise in segments where near-end speech and noise are mixed. Ideally, the residual noise level is the same in both kinds of segment
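The idea behind these parameters can be sketched with a deliberately simplified calculation. This is not the ITU-T G.160 procedure, which classifies frames by level and weights them; it only assumes that a per-sample speech activity mask is already known, and all function and key names are hypothetical. The third value is a plain speech level change in the spirit of the DSN description above, not the standard's formula.

```python
import math


def mean_power(samples):
    """Mean square value of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def to_db(power_ratio):
    return 10.0 * math.log10(power_ratio)


def split_by_activity(signal, speech_active):
    """Return (samples where speech is active, samples with noise only)."""
    active = [s for s, a in zip(signal, speech_active) if a]
    noise_only = [s for s, a in zip(signal, speech_active) if not a]
    return active, noise_only


def noise_reduction_metrics(before, after, speech_active):
    """Compare a signal before and after noise cancellation.

    Speech power is estimated as (power in active segments) minus
    (power in noise-only segments), assuming speech and noise are uncorrelated.
    """
    eps = 1e-12  # guards against log of zero or negative estimates
    act_in, noise_in = split_by_activity(before, speech_active)
    act_out, noise_out = split_by_activity(after, speech_active)
    p_noise_in, p_noise_out = mean_power(noise_in), mean_power(noise_out)
    p_speech_in = max(mean_power(act_in) - p_noise_in, eps)
    p_speech_out = max(mean_power(act_out) - p_noise_out, eps)
    snr_in = to_db(p_speech_in / max(p_noise_in, eps))
    snr_out = to_db(p_speech_out / max(p_noise_out, eps))
    return {
        "snr_improvement_db": snr_out - snr_in,
        "noise_reduction_db": to_db(max(p_noise_in, eps) / max(p_noise_out, eps)),
        "speech_level_change_db": to_db(p_speech_out / p_speech_in),
    }
```

With an ideal noise canceller, `noise_reduction_db` and `snr_improvement_db` are equal and `speech_level_change_db` is 0; a negative speech level change indicates that near-end speech is being suppressed together with the noise.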
ACTS-PPA: Purified POLQA, i.e. objective speech quality that ignores the effects of residual noise. When people subjectively judge speech quality after noise reduction, they disregard the residual noise, whereas existing objective methods are strongly affected by it, which makes their results inconsistent with human subjective evaluation. Eptec's Purified POLQA reduces the effects of residual noise, bringing the results closer to a subjective human evaluation of speech quality after noise cancellation under ambient noise.
ACTS-VQi: Improvement or degradation in voice quality before and after the phone's noise cancellation. Although noise cancellation can reduce ambient noise, noise cancellation algorithms do not always improve speech quality. VQi tests whether the objective quality of speech after noise cancellation is better or worse than before noise cancellation. To focus on voice quality, VQi, like Purified POLQA, is an objective voice quality measure that ignores the effects of residual noise. If the voice before noise cancellation cannot be obtained, measure it with a standard microphone placed close to the phone's main microphone.
ACTS-MESTOI: Speech intelligibility assessment. Speech intelligibility is directly related to the effectiveness of voice communication between people and to the recognition rate in communication between humans and machines. In general, reverberation and environmental noise can seriously reduce the intelligibility of speech. A good noise cancellation algorithm retains not only good speech quality but also excellent intelligibility. This test item effectively measures the actual speech intelligibility after noise cancellation, and can be used to evaluate both intelligibility in person-to-person communication and the speech recognition rate in human-computer interaction.
ACTS-RNA: Objective quality assessment of residual noise. It considers the factors affecting residual noise quality more comprehensively, such as musical noise (commonly described as a running-water sound), and can effectively evaluate the residual noise left after noise cancellation of strongly non-stationary noise, such as other people's speech, giving results closer to human subjective evaluation.
ACTS-CT: Measurement of noise reduction convergence time. The convergence time for stationary noise is relatively easy to measure, but the convergence time for non-stationary noise is difficult to measure because the noise itself fluctuates. ACTS-CT overcomes this challenge and accurately measures noise cancellation convergence time for both stationary and non-stationary noise.
ACTS-TA: Fully automatic measurement at different mobile phone grip angles. Users hold their phones in different ways, and multi-microphone noise cancellation is very sensitive to the grip angle. Poor processing may suppress the near-end voice so that the other party cannot hear the user, which seriously disrupts the call. For multi-microphone phones, call performance must therefore be measured at the commonly encountered extreme angle positions. ACTS-TA combines measurement, analysis, and control software with electro-mechanical hardware into an automatic phone angle measurement system that rotates precisely through the ITU-T P.64 A and B angles and automatically applies adjustable pressure against the artificial ear. Rotation is performed off the ear, so it does not wear the artificial ear, and the system includes a protection device for abnormal states.
ACTS-WTH: Automatic measurement at different watch angles. The height and angle at which users raise their hand when using a watch vary greatly. ACTS-WTH is an automatic measurement system, mounted on the artificial head, that simulates the height and angle of the user's watch. The acoustic reflection of the arm surface is consistent with that of a real user
ACTS-HF: Third-generation fully automatic hands-free (speakerphone mode) test module and control software, covering distance, position, angle, screen operation, and phone switch control, designed to account for the impact of acoustic reflections as far as possible
It provides 3D translation and 2D deflection, specifically:
X axis: 10cm ~ 130cm
Z axis: -30cm ~ +15cm (to reach the top of the artificial head)
Horizontal deflection: 0° ~ 180°
Vertical deflection: 0°~ -90°
The gripper that holds the mobile phone mainly includes six interfaces: 3 key presses, 2 screen touches, and 1 RGB sensor
ACTS-SSK: Automatic test fixture for smart speakers; it can automatically adjust distance, height, and placement (against a wall, in a corner, etc.)
ACTS-STV: Automatic test fixture for smart large screens; it can automatically adjust distance, height, placement inside or outside the wall, etc.
The device under test is sometimes occluded from the voice source, or even covered by objects (such as quilts, pillows, clothes, or drawers), for example when voice wake-up is used to find a mobile phone, headphones, or a watch. In these cases the microphone of the device under test receives a voice signal that is very different from the one it receives under normal conditions. ACTS-Dif is the world's first test system that simulates an occluded or covered device under test. The material and size of the occluding or covering object can be adjusted, with materials divided into hard and soft
ACTS-VR: Speech recognition rate test module for various real-world usage environments. It is an objective, automated test system for speech recognition performance in practical usage scenarios, and can test server speech recognition as well as local and cloud speech recognition. Recognition results can be read through an internal speech recognition interface or from an external screen.
It is used for automatic speech recognition rate testing and analysis under various noises, reporting the correct recognition rate and recognition errors such as insertion errors and deletion errors.
It also tests whether noise cancellation technology improves or degrades speech recognition. The complex, noisy environments in which intelligent voice is used are destructive to speech recognition. Many noise cancellation technologies have emerged, but some do not really improve the speech recognition rate and some even reduce it, so objective and comprehensive test methods are required.
a) Test connection diagram
b) Test items
i. Whole sentence recognition rate at different distances and directions under quiet, reverberation and various noises
ii. Word recognition rate at different distances and directions under quiet, reverberation and various noises
iii. Word insertion rate at different distances and directions under quiet, reverberation and various noises
iv. Word deletion rate at different distances and directions under quiet, reverberation and various noises
v. Word substitution rates at different distances and directions under quiet, reverberation and various noises
c) Test database
i. Voice banks of different speakers (Mandarin or English or with some accents)
ii. Various noises
iii. Different reverberation fields
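The sentence and word-level rates listed under the test items above are conventionally derived from a minimum-edit-distance alignment between the reference transcript and the recognizer output. The sketch below shows that conventional calculation; it is an assumption that ACTS-VR scores this way, and the function names are hypothetical.

```python
def count_word_errors(reference, hypothesis):
    """Align two transcripts by minimum edit distance and count error types."""
    ref, hyp = reference.split(), hypothesis.split()
    # Each cell holds (total_edits, substitutions, deletions, insertions).
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, ref_word in enumerate(ref, start=1):
        curr = [(i, 0, i, 0)]
        for j, hyp_word in enumerate(hyp, start=1):
            e, s, d, n = prev[j - 1]
            diagonal = (e, s, d, n) if ref_word == hyp_word else (e + 1, s + 1, d, n)
            e, s, d, n = prev[j]
            deletion = (e + 1, s, d + 1, n)
            e, s, d, n = curr[j - 1]
            insertion = (e + 1, s, d, n + 1)
            # Tuples compare by total edits first, so the cheapest path wins.
            curr.append(min(diagonal, deletion, insertion))
        prev = curr
    _, subs, dels, ins = prev[-1]
    return {"words": len(ref), "substitutions": subs, "deletions": dels, "insertions": ins}


def recognition_rates(pairs):
    """pairs: list of (reference, hypothesis) transcripts for one test condition."""
    totals = {"words": 0, "substitutions": 0, "deletions": 0, "insertions": 0}
    exact_sentences = 0
    for reference, hypothesis in pairs:
        for key, value in count_word_errors(reference, hypothesis).items():
            totals[key] += value
        exact_sentences += reference.split() == hypothesis.split()
    words = totals["words"]
    return {
        "sentence_recognition_rate": exact_sentences / len(pairs),
        "word_recognition_rate": (words - totals["substitutions"] - totals["deletions"]) / words,
        "insertion_rate": totals["insertions"] / words,
        "deletion_rate": totals["deletions"] / words,
        "substitution_rate": totals["substitutions"] / words,
    }
```

Running `recognition_rates` once per combination of distance, direction, noise, and reverberation condition yields the table of rates described in the test items.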
ACTS-VW: Voice wake-up
a) Test connection diagram
Basic connections for the voice wake-up test
b) Test items
i. Wake-up rate at different distances for quiet, reverberation and various noises
ii. False alarm rates at different distances under quiet, reverberation and various noises
iii. Wake-up failure rate at different distances under quiet, reverberation and various noises
c) Test database
i. Wake words for different speakers
ii. Words similar to the wake word, spoken by different speakers
iii. Words very different from the wake word, spoken by different speakers
iv. Various noises
v. Different reverberation fields
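As a minimal sketch of how such rates are typically tallied from a trial log, the code below counts triggers per utterance category. The category labels, the function name, and the definition of false alarm rate as triggers on non-wake utterances divided by the number of non-wake utterances are assumptions for illustration, not the ACTS-VW definitions.

```python
from collections import Counter


def wake_metrics(trials):
    """Tally wake-up statistics.

    trials: iterable of (kind, triggered) pairs, where kind is one of
    "wake" (the wake word), "similar" (similar-sounding words) or
    "other" (clearly different words), and triggered is True if the
    device woke up.
    """
    counts = Counter((kind, bool(triggered)) for kind, triggered in trials)
    wake_total = counts[("wake", True)] + counts[("wake", False)]
    non_wake_total = sum(
        counts[(kind, flag)] for kind in ("similar", "other") for flag in (True, False)
    )
    false_alarms = counts[("similar", True)] + counts[("other", True)]
    return {
        "wake_up_rate": counts[("wake", True)] / wake_total if wake_total else 0.0,
        "wake_up_failure_rate": counts[("wake", False)] / wake_total if wake_total else 0.0,
        "false_alarm_rate": false_alarms / non_wake_total if non_wake_total else 0.0,
    }


# Hypothetical log for one distance and noise condition.
log = [("wake", True)] * 9 + [("wake", False)] + [("similar", True)] + [("similar", False)] * 9 + [("other", False)] * 10
print(wake_metrics(log))
```

Keeping the "similar" and "other" categories separate in the log makes it possible to report false alarms for confusable words on their own if needed.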
ACTS-MD: Tests noise cancellation vs differences between multiple MICs
With the same noise cancellation algorithm and parameters, large differences between the MICs greatly affect noise cancellation performance. If the sensitivity of the main MIC is higher than that of the reference MIC, noise cancellation may be insufficient; conversely, speech is easily suppressed
When selecting noise cancellation algorithms and tuning parameters, ACTS-MD can detect and verify the differences in noise cancellation performance caused by differences between the MICs
ACTS-Lin: Echo vs overall linearity
With the same echo cancellation algorithm and parameters, differences in the linearity of the whole device greatly affect echo cancellation and duplex performance. ACTS-Lin can detect and verify the linearity of the whole device
ACTS-RE: Earpiece/speaker distortion test based on reverse engineering of the human ear. Nonlinear fractal processing theory is applied to reverse engineering of the human ear to establish a new listening model, so that objective audio measurement agrees with subjective listening by the human ear.
ACTS-AIM: Test the spatial perception of the target audio device/system, including the spatial position of each special sound and the ambient spatial sense of the overall sound.
ACTS-MM Downlink Test Items:
Downlink VQ POLQA: Objective voice quality
ANC: Receiver active noise cancellation performance test. Active noise cancellation is another way to deal with ambient noise so that users can hear the other party clearly. ACTS-ANC is used to objectively test the performance of active noise reduction at the receiving end.
ACTS-ANC is a module of the ACTS test system for testing the active noise cancellation function of mobile phones under various noise conditions. It measures the noise frequency response at various grip positions, pressures, and phone states (call/standby) to verify the actual effect of active noise cancellation
The noise that actually reaches the ear changes with the pressure of the phone against the ear, so it must be measured:
The real-time anti-noise signal of ANC is played by the earpiece, which acts as a point sound source. Its distance from the eardrum differs under different pressures, resulting in differences in signal energy and noise cancellation
Under different pressures, the ear as a cavity is sealed to different degrees, so its sound insulation against noise also differs
The position of the microphones that collect the noise signal in real time changes with pressure. Near the human head, reflections from the head and face make the sound field non-uniform, so this movement changes the acquired signal
ANC vs Talking/Idle
Frequency response curves are measured at a specified pressure and away from the head, with the phone in the call state and in standby, to obtain a complete ANC spectral signature
ANC vs Pressure
In the call state, the noise frequency response curve is measured at several successive pressure values and away from the human head to obtain the differences and trend of the ANC effect under different pressure conditions
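The ANC vs Pressure comparison can be sketched as a per-band difference between the curve measured away from the head and the curve measured at each pressure. The data layout (dictionaries keyed by band centre frequency in Hz and by pressure in newtons), the function names, and all numeric values are hypothetical and serve only to show the arithmetic.

```python
def attenuation_db(reference_db, measured_db):
    """Per-band attenuation: reference level minus measured level, in dB."""
    return {band: reference_db[band] - measured_db[band] for band in reference_db}


def anc_vs_pressure(away_from_head_db, call_curves_by_pressure):
    """Map each pressure to its per-band attenuation relative to the away-from-head curve."""
    return {
        pressure: attenuation_db(away_from_head_db, curve)
        for pressure, curve in sorted(call_curves_by_pressure.items())
    }


def mean_attenuation(attenuation):
    """Average attenuation over all measured bands, in dB."""
    return sum(attenuation.values()) / len(attenuation)


# Hypothetical noise levels (dB SPL) at the artificial ear.
away = {100: 70.0, 200: 70.0, 400: 70.0}
in_call = {
    2.0: {100: 60.0, 200: 58.0, 400: 62.0},
    8.0: {100: 55.0, 200: 52.0, 400: 58.0},
}
for pressure, att in anc_vs_pressure(away, in_call).items():
    print(f"{pressure} N: mean attenuation {mean_attenuation(att):.1f} dB")
```

The same `attenuation_db` helper applied to the standby and call curves at one pressure would separate the active contribution from the passive sealing effect, which corresponds to the ANC vs Talking/Idle comparison.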
Easy to use
Simple to use, with no special training required, which reduces labor costs
Simple configuration: configuration options are visible at a glance
Easy to add new test cases
Automatic and effective
Simple calibration process with minimal manual intervention
Fully automated testing process: one-click and overnight testing
Automated test equipment inspection: avoids test failures caused by misconnected or damaged speakers or microphones
Components of the ACTS-MM system
ACTS front end, system control, acquisition and communication;
ACTS-MM software modules
ACTS automatic test systems for mobile phones, watches, speakers, large screens, etc. (optional); custom automated test systems can also be built as needed
PC with Windows 10 preinstalled
ACTS HATS or B&K HATS (head and torso simulator)
Standard microphones and their accessories
8 monitor-level speakers that simulate a noise field
Multi-channel sound card
Radio communication tester (R&S CMW500 or CMU200)