Peer-Review Record

Packet Loss Concealment Based on Phase Correction and Deep Neural Network

Appl. Sci. 2022, 12(19), 9721; https://doi.org/10.3390/app12199721

by Qiang Ji, Changchun Bao * and Zihao Cui

Reviewer 1: Ammar Odeh

Reviewer 2: Anonymous

Reviewer 3: Giuseppe Ciaburro

Reviewer 4: Anonymous

Submission received: 27 July 2022 / Revised: 19 September 2022 / Accepted: 21 September 2022 / Published: 27 September 2022

(This article belongs to the Special Issue Advances in Speech and Language Processing)

Round 1

Reviewer 1 Report

This is a valuable paper that introduces a new model for packet loss concealment based on phase correction and a deep neural network.

Author Response

Thanks for the reviewer's work.

Reviewer 2 Report

It is a good paper.

Author Response

Thanks a lot for the reviewer's work.

Reviewer 3 Report

Section 1 must be improved. The authors should emphasize the contribution and novelty of the work; the introduction needs to clarify the motivation, challenges, contribution, objectives, and significance. You must properly introduce your work and state clearly what goals you set and how you approached the problem.

Sections 2 and 3 can be improved. You must properly introduce each equation and list in detail the variables it contains, with a concise description of the meaning of each. To make them more readable, show them in a bulleted list. In this way the reader will be able to understand the contribution of each variable.

Section 4 must be improved. Add the hardware and software resources used to simulate your methods. List the libraries used to run the algorithms. Extract this data from the datasheet. To make the specifications easier to read, you can insert them in a table that lists the resources used and the specific characteristics of each.

I could not find a detailed description of the evaluation metrics you have adopted. How will you measure your model's performance? This section is essential in order to demonstrate the effectiveness of your methodology.

At the end of the section there is no adequate discussion of the results obtained. You should summarize the results and compare them with those of other researchers. Then you should discuss those results by highlighting the strengths and weaknesses of your methodology.

Section 5 must be improved. The section is very concise. Paragraphs reporting the possible practical applications of the results of this study are missing. It is necessary to explain how these results can be useful to people and to describe possible uses of this study that justify its publication. The possible future goals of this work are also missing. Do the authors plan to continue their research on this topic?

49) Do not use abbreviations such as "i.e.". I have seen that you often use this abbreviation, so I will not repeat this advice; it also applies to the other occurrences.

134) Replace [31] [32] with [31,32]. I have seen that you often use this format, so I will not repeat this advice; it also applies to the other occurrences.

363) Add a link to the Librispeech ASR corpus so that the reader will be able to replicate your work.

370) Move the Section 4.2 title to the next page.

Author Response

Section 1 must be improved. The authors should emphasize the contribution and novelty of the work; the introduction needs to clarify the motivation, challenges, contribution, objectives, and significance. You must properly introduce your work and state clearly what goals you set and how you approached the problem.

A: Thank you for your comments. Regarding your remark that the authors need to explain clearly the goals and the way the problem is solved, we believe the manuscript already does so, and we summarize it again as follows:

To address the problem of packet loss in speech signal transmission, this paper proposes a packet loss concealment method that operates at the receiver. Based on phase correction and a deep neural network, the proposed method is divided into concealment of the amplitude information and concealment of the phase information.

A deep neural network uses the information in previously received packets to predict the amplitude information of the lost packets. Because the deep neural network estimates the phase spectrum inaccurately, two phase correction methods are proposed. The first uses the waveform similarity overlap-add (WSOLA) method to extend the received speech in the time domain and takes the phase spectrum of the extended segment that covers the lost packet, which ensures the continuity of the speech. The second uses the amplitude spectrum to modify the phase spectrum iteratively in the frequency domain, which yields a phase spectrum that matches the amplitude spectrum. The proposed method makes effective use of the information in historical packets and improves packet loss concealment performance at high packet loss rates.
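The iterative frequency-domain correction described above can be illustrated with a short sketch. This is not the authors' implementation: it assumes the iteration resembles the Griffin-Lim algorithm (alternating between resynthesis and re-analysis while the amplitude spectrum is held fixed), and the function names, frame length (64), hop size (16), and iteration count are hypothetical choices made only for illustration. In the paper the amplitude and initial phase would come from the deep neural network; here they are plain arrays.

```python
import numpy as np

def stft(x, n_fft=64, hop=16):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=64, hop=16):
    """Weighted overlap-add inverse of stft() (least-squares resynthesis)."""
    win = np.hanning(n_fft + 1)[:-1]
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)  # avoid division by zero at the edges

def correct_phase(magnitude, init_phase, n_iter=50, n_fft=64, hop=16):
    """Refine init_phase so that it becomes more consistent with magnitude."""
    spec = magnitude * np.exp(1j * init_phase)
    for _ in range(n_iter):
        signal = istft(spec, n_fft, hop)            # back to the time domain
        phase = np.angle(stft(signal, n_fft, hop))  # phase of a valid STFT
        spec = magnitude * np.exp(1j * phase)       # keep the given amplitude
    return np.angle(spec)

def inconsistency(magnitude, phase, n_fft=64, hop=16):
    """Mismatch between the given amplitude and the resynthesized amplitude."""
    spec = magnitude * np.exp(1j * phase)
    resynth = stft(istft(spec, n_fft, hop), n_fft, hop)
    return float(np.linalg.norm(np.abs(resynth) - magnitude))
```

A typical check is to start from a random phase and confirm that `inconsistency` is smaller after `correct_phase` than before it, which is the sense in which the corrected phase "matches" the amplitude spectrum.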

Sections 2 and 3 can be improved. You must properly introduce each equation and list in detail the variables it contains, with a concise description of the meaning of each. To make them more readable, show them in a bulleted list. In this way the reader will be able to understand the contribution of each variable.

A: Thank you for your comments. We have added the meanings of G, B, P_G and P_B in Section 2. In addition, all variables in Section 3 are accompanied by descriptions of their meanings.

Section 4 must be improved. Add the hardware and software resources used to simulate your methods. List the libraries used to run the algorithms. Extract this data from the datasheet. To make the specifications easier to read, you can insert them in a table that lists the resources used and the specific characteristics of each.

A: Thank you for your comments. The evaluation metrics used in this paper are given in the second paragraph of Section 4:

“To evaluate speech quality and intelligibility of the proposed methods, Perceptual Evaluation of Speech Quality (PESQ) [28], Short-Time Objective Intelligibility (STOI) [29] and Log-Spectral Distortion (LSD) [30] are used to test the speech processed by the PLC, respectively.”

Additionally, a description of the software required to implement the program has been added.
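Of the three metrics quoted above, LSD is simple enough to sketch directly. The following is a minimal illustration that assumes one common definition (the root-mean-square difference, in dB, between the log power spectra of the reference and processed speech, averaged over frames); the exact definition in reference [30] may differ, and the function name and the small constant `eps` are hypothetical.

```python
import math

def log_spectral_distortion(ref_mag, est_mag, eps=1e-10):
    """Mean per-frame log-spectral distortion in dB.

    ref_mag and est_mag are lists of frames; each frame is a list of
    magnitude-spectrum values with the same number of frequency bins.
    """
    if len(ref_mag) != len(est_mag):
        raise ValueError("reference and estimate must have the same number of frames")
    per_frame = []
    for ref_frame, est_frame in zip(ref_mag, est_mag):
        # Squared dB difference between the two power spectra, bin by bin.
        diffs = [
            (10.0 * math.log10((r * r + eps) / (e * e + eps))) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        per_frame.append(math.sqrt(sum(diffs) / len(diffs)))
    return sum(per_frame) / len(per_frame)
```

Under this definition, identical spectra give a distortion of 0 dB, and halving every magnitude gives 20·log10(2) dB, about 6.02 dB; lower values indicate a concealed signal closer to the original.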

Section 5 must be improved. The section is very concise. Paragraphs reporting the possible practical applications of the results of this study are missing. It is necessary to explain how these results can be useful to people and to describe possible uses of this study that justify its publication. The possible future goals of this work are also missing. Do the authors plan to continue their research on this topic?

A: Thank you for the comments. Future work and possible applications of the proposed method have been added to Section 5.

49) Do not use abbreviations such as "i.e.". I have seen that you often use this abbreviation, so I will not repeat this advice; it also applies to the other occurrences.

A: Done. All occurrences of the abbreviation "i.e." have been replaced by “that is”.

134) Replace [31] [32] with [31,32]. I have seen that you often use this format, so I will not repeat this advice; it also applies to the other occurrences.

A: Done

363) Add a link to the Librispeech ASR corpus so that the reader will be able to replicate your work.

A: Done

370) Move the Section 4.2 title to the next page.

A: Done

Reviewer 4 Report

The paper proposes two packet loss concealment (PLC) methods: a time-domain phase correction method and a frequency-domain phase correction method. The first method combines WSOLA and a DNN to fill in the information missing from lost data packets under poor channel conditions. The second method uses GLA and a DNN to realize the PLC. The DNN estimates the amplitude and phase spectra of the speech signal in the lost packet, while GLA corrects the phase spectrum in the frequency domain using the amplitude spectrum, so that the phase spectrum matches the amplitude spectrum.

The paper uses a discrete Gilbert-Elliott channel model to simulate burst packet loss. The proposed methods are validated on the Librispeech ASR corpus using three metrics, namely Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), and Log-Spectral Distortion (LSD). Evaluations show that the proposed methods perform better than other methods reported in the literature. Minimal proofreading is required.
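For readers unfamiliar with the channel model mentioned above, a minimal sketch of a two-state Gilbert-Elliott loss simulator follows. It is an illustration under stated assumptions, not the configuration used in the paper: the parameter names (`p_gb`, `p_bg`, `loss_good`, `loss_bad`) and the example values are hypothetical and are not claimed to correspond to the paper's P_G and P_B.

```python
import random

def simulate_gilbert_elliott(n_packets, p_gb, p_bg,
                             loss_good=0.0, loss_bad=1.0, seed=None):
    """Return a list of booleans, True where a packet is lost.

    p_gb: probability of moving from the good state to the bad state.
    p_bg: probability of moving from the bad state to the good state.
    loss_good, loss_bad: per-packet loss probability in each state.
    """
    rng = random.Random(seed)
    in_bad = False  # the chain starts in the good state
    lost = []
    for _ in range(n_packets):
        loss_prob = loss_bad if in_bad else loss_good
        lost.append(rng.random() < loss_prob)
        # State transition for the next packet.
        if in_bad:
            if rng.random() < p_bg:
                in_bad = False
        elif rng.random() < p_gb:
            in_bad = True
    return lost

def stationary_loss_rate(p_gb, p_bg, loss_good=0.0, loss_bad=1.0):
    """Long-run packet loss rate implied by the chain parameters."""
    pi_bad = p_gb / (p_gb + p_bg)  # stationary probability of the bad state
    return (1.0 - pi_bad) * loss_good + pi_bad * loss_bad
```

Because the bad state persists for 1 / p_bg packets on average, losses arrive in bursts rather than independently, which is the situation in which a concealment method must rely on older history. With p_gb = 0.1 and p_bg = 0.3, for example, the long-run loss rate is 0.1 / (0.1 + 0.3) = 0.25.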
