S. V. Andersen
A. Duric
R. Hagen
W. B. Kleijn
J. Linden
M. N. Murthi
J. Skoglund
J. Spittka

Internet Draft
Document: draft-andersen-ilbc-00.txt
Global IP Sound
Category: Experimental
Feb. 20th 2002
Expires: Aug. 20th 2002

Internet Low Bit Rate Codec

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document specifies a speech codec suitable for robust voice communication over IP. The codec is designed for narrow band speech and results in a payload bit rate of 13.967 kbit/s with an encoding frame length of 30 ms. The codec enables graceful speech quality degradation in the case of lost frames, which occurs in connection with lost or delayed IP packets.

Table of Contents

Status of this Memo
Abstract
Table of Contents
1. INTRODUCTION
2. OUTLINE OF THE CODEC
   2.1 Encoder
   2.2 Decoder
3. ENCODER PRINCIPLES
   3.1 LPC Analysis and Quantization
      3.1.1 Computation of Autocorrelation Coefficients
      3.1.2 Computation of LPC Coefficients
      3.1.3 Computation of LSF Coefficients from LPC Coefficients
      3.1.4 Quantization of LSF Coefficients
      3.1.5 Stability Check of LSF Coefficients
      3.1.6 Interpolation of LSF Coefficients
   3.2 Calculation of the Residual
   3.3 Perceptual Weighting Filter
   3.4 Start State Encoder
      3.4.1 Start State Estimation
      3.4.1 All-Pass Filtering and Scale Quantization
      3.4.2 Scalar Quantization
   3.5 Codebook Encoding
      3.5.1 Perceptual Weighting of Codebook Memory and Target
      3.5.2 Codebook Creation
         3.5.2.1 Creation of a Base Codebook
         3.5.2.2 Codebook Augmentation
         3.5.2.3 Codebook Expansion
      3.5.3 Codebook Search
         3.5.3.1 The Codebook Search at Each Stage
         3.5.3.2 The Gain Quantization at Each Stage
         3.5.3.3 Preparation of Target for Next Stage
   3.6 Gain Correction Encoding
   3.7 Bitstream Definition
4. DECODER PRINCIPLES
   4.1 LPC Filter Reconstruction
   4.2 Start State Reconstruction
   4.3 Excitation Decoding Loop
   4.4 Multistage Adaptive Codebook Decoding
      4.4.1 Construction of the Decoded Excitation Signal
   4.5 Packet Loss Concealment
      4.5.1 Block Received Correctly and Previous Block also Received
      4.5.2 Block Not Received
      4.5.3 Block Received Correctly When Previous Block Not Received
   4.6 Enhancement
      4.6.1 Outline of the Enhancement Unit
      4.6.2 Determination of the Pitch-Synchronous Sequences
      4.6.3 Re-estimation of the Current Sample-Sequence
   4.7 Synthesis Filtering
5. SECURITY CONSIDERATIONS
6. REFERENCES
7. ACKNOWLEDGEMENTS
8. AUTHOR'S ADDRESSES
APPENDIX A REFERENCE IMPLEMENTATION
   A.1 iLBC_test.c
   A.2 iLBC_encode.h
   A.3 iLBC_encode.c
   A.4 iLBC_decode.h
   A.5 iLBC_decode.c
   A.6 iLBC_define.h
   A.7 constants.h
   A.8 constants.c
   A.9 anaFilter.h
   A.10 anaFilter.c
   A.11 createCB.h
   A.12 createCB.c
   A.13 doCPLC.h
   A.14 doCPLC.c
   A.15 enhancer.h
   A.16 enhancer.c
   A.17 filter.h
   A.18 filter.c
   A.19 FrameClassify.h
   A.20 FrameClassify.c
   A.21 gaincorr_Encode.h
   A.22 gaincorr_Encode.c
   A.23 gainquant.h
   A.24 gainquant.c
   A.25 getCBvec.h
   A.26 getCBvec.c
   A.27 helpfun.h
   A.28 helpfun.c
   A.29 hpInput.h
   A.30 hpInput.c
   A.31 hpOutput.h
   A.32 hpOutput.c
   A.33 iCBConstruct.h
   A.34 iCBConstruct.c
   A.35 iCBSearch.h
   A.36 iCBSearch.c
   A.37 LPCdecode.h
   A.38 LPCdecode.c
   A.39 LPCencode.h
   A.40 LPCencode.c
   A.41 lsf.h
   A.42 lsf.c
   A.43 packing.h
   A.44 packing.c
   A.45 StateConstructW.h
   A.46 StateConstructW.c
   A.47 StateSearchW.h
   A.48 StateSearchW.c
   A.49 syntFilter.h
   A.50 syntFilter.c

1. INTRODUCTION

This document contains the description of an algorithm for the coding of speech signals sampled at 8 kHz.
The iLBC codec has a bit rate of 13.967 kbit/s using a block-independent linear-predictive coding (LPC) algorithm. The codec operates at block lengths of 30 ms and produces 419 bits per block, which can be packetized in 53 bytes. The described algorithm results in a speech coding system with a controlled response to packet losses similar to what is known from pulse code modulation (PCM) with packet loss concealment (PLC), such as the ITU-T G.711 standard [3], which operates at a fixed bit rate of 64 kbit/s. At the same time, the described algorithm enables fixed bit rate coding with a quality-versus-bit-rate tradeoff close to what is known from code-excited linear prediction (CELP). A suitable RTP payload format for this codec is specified in [1].

Some of the applications for which this coder is suitable are: real-time communications such as videoconferencing and telephony, streaming audio, and archival and messaging.

This document is organized as follows. In Section 2 a brief outline of the codec is given. The specific encoder and decoder algorithms are explained in Sections 3 and 4, respectively. A C code reference implementation is provided in Appendix A.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [2].

2. OUTLINE OF THE CODEC

The codec consists of an encoder and a decoder described in Sections 2.1 and 2.2, respectively.

The essence of the codec is LPC and block based coding of the LPC residual signal. For each 240 sample block, the following major steps are done. An LPC filter is computed to produce the residual signal. The codec uses DPCM coding of the dominant part, in terms of energy, of the residual signal for the block. The dominant state is of length 58 samples and forms a start state for dynamic codebooks constructed from the already coded parts of the residual signal.
These dynamic codebooks are used to code the remaining parts of the residual signal. By this method, coding independence between blocks is achieved, which eliminates the propagation of perceptual degradations due to packet loss. The method facilitates high-quality packet loss concealment (PLC).

2.1 Encoder

The input to the encoder is 16 bit uniform PCM sampled at 8 kHz. The input is partitioned into blocks of BLOCKL=240 samples. Each block is divided into NSUB=6 consecutive sub-blocks of SUBL=40 samples each. For each input block, the encoder performs two FILTERORDER=10 linear-predictive coding (LPC) analyses. The first analysis applies a smooth window centered over the 2nd sub-block and extending to the middle of the 5th sub-block. The second LPC analysis applies a smooth window centered over the 5th sub-block and extending to the end of the 6th sub-block. For both LPC analyses, sets of line spectral frequencies (LSFs) are obtained, quantized and interpolated to obtain LSF coefficients for each sub-block. Subsequently, the LPC residual is computed using the quantized and interpolated LPC analysis filters.

The two consecutive sub-blocks of residual exhibiting the maximal energy are identified. Within these 2 sub-blocks, the start state (segment) is selected from two choices: the first 58 samples or the last 58 samples of the 2 consecutive sub-blocks. The selected segment is the one of higher energy. The start state is encoded with a DPCM method.

For encoding of the remaining 22 samples of the 2 sub-blocks containing the start state and the remaining four sub-blocks, gain-shape coding is performed using a codebook generated from the available already coded samples. The codebook is used in NSTAGES=3 stages in a successive refinement approach. The resulting 3 gain factors are encoded with 4, 3, and 3 bit scalar quantization, respectively.
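
The start state selection described above can be sketched in C as follows. This is a minimal illustration, not the reference implementation of Appendix A: the function name find_start_state, the helper energy, and the tie-breaking rule (the earlier candidate wins on equal energy) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCKL 240          /* samples per block */
#define NSUB 6              /* sub-blocks per block */
#define SUBL 40             /* samples per sub-block */
#define STATE_SHORT_LEN 58  /* start state length in samples */

/* Sum of squares of n samples. */
static double energy(const double *x, size_t n)
{
    double e = 0.0;
    for (size_t i = 0; i < n; i++)
        e += x[i] * x[i];
    return e;
}

/* Returns the offset within residual[] of the first start state sample.
 * Ties are resolved in favour of the earlier candidate (an assumption). */
size_t find_start_state(const double residual[BLOCKL])
{
    assert(residual != NULL);

    /* Step 1: pair of consecutive sub-blocks with maximal energy. */
    size_t best_pair = 0;
    double best_e = -1.0;
    for (size_t k = 0; k + 1 < NSUB; k++) {
        double e = energy(residual + k * SUBL, 2 * SUBL);
        if (e > best_e) {
            best_e = e;
            best_pair = k;
        }
    }

    /* Step 2: first 58 or last 58 samples of that pair, by energy. */
    size_t first = best_pair * SUBL;
    size_t last = first + 2 * SUBL - STATE_SHORT_LEN;
    double e_first = energy(residual + first, STATE_SHORT_LEN);
    double e_last = energy(residual + last, STATE_SHORT_LEN);
    return (e_last > e_first) ? last : first;
}
```

The 22 samples of the sub-block pair that are not covered by the returned segment are the ones left for the first codebook encoding step.
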
The codebook search method employs noise shaping derived from the LPC filters and minimization of the squared error between the target vector and the code vectors. Each code vector in this codebook comes from one of NSECTION=4 codebook sections. The first section is filled with delayed, already encoded residual vectors. The code vectors of the remaining 3 codebook sections are constructed by predefined linear combinations of vectors in the first section of the codebook. The linear combination coefficients differ from one section to the next.

The codebook encoding is done in 3 steps:

1. The remaining 22 samples of the 2 sub-blocks containing the start state are encoded using a codebook of size 256 constructed from the 58 samples of the encoded start state.

2. If the block contains sub-blocks later in time than the ones containing the start state, each of these sub-blocks is subsequently encoded using a codebook encoding method. A new codebook is constructed for each sub-block since the number of available encoded residual signal samples increases.

3. If the block contains sub-blocks earlier in time than the ones encoded for the start state, then a procedure equal to the one applied for sub-blocks later in time is applied, but with time-reversed signals since encoding is now performed backwards in time.

Within steps 2. and 3. above, 4 sub-blocks are encoded.
The codebooks have code vector dimension 40 and construction of the codebook for the 4 coding instances is performed as follows:

1) A codebook of size 256 is created from the 80 samples of the already coded residual signal;

2) A codebook of size 512 is created from the 120 samples of the already coded residual signal;

3) A codebook of size 512 is created from the 147 samples of the already coded residual signal closest to the sub-blocks to be encoded;

4) A codebook of size 512 is created from the 147 samples of the already coded residual signal closest to the sub-blocks to be encoded.

The only difference between steps 2. and 3. is that the 40 sample signal to be encoded, as well as the already coded signal for codebook construction, is time-reversed before encoding in step 3.

Since codebook encoding with squared-error matching is known to produce a coded signal of less power than the scalar DPCM coded signal, a gain correction factor is calculated by comparing the power loss in the codebook encoding to the power loss in the scalar DPCM coding. The gain correction factor is quantized with 4 bits and is used to scale down the start state to produce a signal with a smooth power contour over the block.

2.2 Decoder

For packet communications, typically a jitter buffer placed at the receiving end decides whether a packet containing an encoded signal block has been received or lost. This logic is not part of the codec described here. For each received encoded signal block the decoder performs a decoding. For each lost signal block the decoder performs a PLC operation.

The decoding for each block starts by a decoding and interpolation of the LPC coefficients. Subsequently the start state is decoded.

For codebook encoded segments, each segment is decoded by constructing the 3 code vectors given by the received codebook indices in the same way as the code vectors were constructed in the encoder.
The 3 gain factors are also decoded and the resulting decoded signal is given by the sum of the 3 codebook vectors, each scaled by its respective gain.

An enhancement algorithm is applied on the reconstructed excitation signal. This enhancement augments the periodicity of voiced speech regions. The enhancement is optimized under the constraint that the enhancement signal (defined as the difference between the enhanced excitation and the excitation signal prior to enhancement) has a short-time energy that does not exceed a preset fraction of the short-time energy of the speech signal.

A packet loss concealment (PLC) operation is easily embedded in the decoder. The PLC operation can, e.g., be based on repetition of LPC filters and obtaining the LPC residual signal using a long term prediction estimate from previous residual blocks.

3. ENCODER PRINCIPLES

This section describes the principles of each component of the encoder algorithm.

3.1 LPC Analysis and Quantization

The input to the LPC analysis module is a high-pass filtered speech buffer, speech_hp, that contains 300 (LOOKBACK + BLOCKL = 60 + 240 = 300) speech samples, where samples 0 through 59 are from the previous block and samples 60 through 299 are from the current block. No look-ahead into the next block is used. For the very first block processed, the look-back samples are assumed to be zeros.

For each input block, the LPC analysis calculates two sets of FILTERORDER=10 LPC filter coefficients using the autocorrelation method and the Levinson-Durbin recursion. The first set, lsf1, represents the spectral properties of the input signal at the center of the second sub-block while the other set, lsf2, represents the spectral characteristics as measured at the center of the fifth sub-block. The details of the computation shall be executed as described in Sections 3.1.1 through 3.1.6.

3.1.1 Computation of Autocorrelation Coefficients

The first step in the LPC analysis procedure is to calculate autocorrelation coefficients using windowed speech samples. This windowing is the only difference in the LPC analysis procedure for the two sets of coefficients. For the first set, a 240 sample long standard symmetric Hanning window is applied to samples 0 through 239 of the input data. In c-like pseudo code, the first window, win1, is hence calculated as:

   win1[i] = 0.5 * (1.0 - cos((2 * PI * (i + 1))/(BLOCKL + 1)));
             i=0,...,119
   win1[BLOCKL - i - 1] = win1[i]; i=0,...,119

The windowed speech speech_hp_win1 is then obtained by multiplying the 240 first samples of the input speech buffer with the window coefficients:

   speech_hp_win1[i] = speech_hp[i] * win1[i]; i=0,...,BLOCKL-1

From these 240 windowed speech samples, 11 (FILTERORDER + 1) autocorrelation coefficients, acf1, are calculated:

   acf1[lag] += speech_hp_win1[n] * speech_hp_win1[n + lag];
             lag=0,...,FILTERORDER; n=0,...,BLOCKL-lag-1

In order to make the analysis more robust against numerical precision problems, a spectral smoothing procedure is applied by windowing the autocorrelation coefficients with a Gaussian window before the LPC coefficients are computed. Also, a white noise floor is added to the autocorrelation function by multiplying coefficient zero by 1.0001 (40 dB below the energy of the windowed speech signal). These two steps are implemented by multiplying the autocorrelation coefficients with the following window:

   win3[0] = 1.0001;
   win3[i] = exp(-0.5 * ((2 * PI * 60.0 * i) /FS)^2);
             i=1,...,FILTERORDER

Then, the windowed autocorrelation function acf1_win is obtained by:

   acf1_win[i] = acf1[i] * win3[i]; i=0,...,FILTERORDER

The second set of autocorrelation coefficients, acf2_win, is obtained in a similar manner. The window, win2, is applied to samples 60 through 299, i.e., the entire current block.
The window consists of two segments: the first (samples 0 to 219) being half a Hanning window with length 440 and the second (samples 220 to 239) being a quarter of a cycle of a cosine wave. By using this asymmetric window, an LPC analysis centered in the fifth sub-block is obtained without the need for any look-ahead, which would have added delay. The asymmetric window is defined as:

   win2[i] = (sin(PI * (i + 1) / 441))^2; i=0,...,219
   win2[i] = cos((i - 220) * PI / 40); i=220,...,239

and the windowed speech is computed by:

   speech_hp_win2[i] = speech_hp[i + LOOKBACK] * win2[i];
             i=0,...,BLOCKL-1

The windowed autocorrelation coefficients are then obtained in exactly the same way as for the first analysis instance.

The generation of the windows win1, win2, and win3 is typically done in advance and the arrays are stored in ROM rather than repeating the calculation for every block.

3.1.2 Computation of LPC Coefficients

From the 11 smoothed autocorrelation coefficients, acf1_win and acf2_win, the 11 LPC coefficients, lp1 and lp2, are calculated in the same way for both analysis locations using the well known Levinson-Durbin recursion. The first LPC coefficient is always 1.0, resulting in 10 unique coefficients.

After determining the LPC coefficients, a bandwidth expansion procedure is applied in order to smooth the spectral peaks in the short-term spectrum. The bandwidth addition is obtained by the following modification of the LPC coefficients:

   lp1_bw[i] = lp1[i] * chirp^i; i=0,...,FILTERORDER
   lp2_bw[i] = lp2[i] * chirp^i; i=0,...,FILTERORDER

where "chirp" is a real number between 0 and 1 that typically has a value of around 0.8.

3.1.3 Computation of LSF Coefficients from LPC Coefficients

Thus far, two sets of LPC coefficients that represent the short-term spectral characteristics of the speech signal for two different time locations within the current block have been determined.
These coefficients should be quantized and interpolated. Before doing so, it is advantageous to convert the LPC parameters into another type of representation called the Line Spectral Frequencies (LSF). The LSF parameters are used because they are better suited for quantization and interpolation than the regular LPC coefficients. Many computationally efficient methods for calculating the LSFs from the LPC coefficients have been proposed in the literature. The detailed implementation of one applicable method can be found in Appendix A.42. The two arrays of LSF coefficients obtained, lsf1 and lsf2, are of dimension 10 (FILTERORDER).

3.1.4 Quantization of LSF Coefficients

Since the LPC filters defined by the two sets of LSFs are also needed in the decoder, the LSF parameters need to be quantized and transmitted as side information. The total number of bits required to represent the quantization of the two LSF representations for one block of speech is 52, with 24 and 28 bits for lsf1 and lsf2, respectively.

For computational reasons, both LSF vectors are quantized using 3-split vector quantization (VQ). That is, the LSF vectors are split into three subvectors which are each quantized with a regular VQ. First, the quantized version of lsf2, qlsf2, is obtained by memoryless split VQ. Then qlsf1 is obtained by predictive split VQ of lsf1. The prediction of the (mean-removed) lsf1 is calculated by multiplying the mean-removed qlsf2 by a set of predictor coefficients (one for each of the 10 components of the vector). After subtracting the resulting prediction from lsf1, the resulting prediction error is quantized with a second 3-split VQ.
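
The memoryless split VQ described above amounts to an independent nearest-neighbour search, under the squared error criterion, for each subvector. The following C sketch illustrates this under stated assumptions: the function names vq_search and split_vq are hypothetical, the split layout and codebooks are passed in by the caller, and the trained codebooks of the codec are not reproduced here.

```c
#include <assert.h>
#include <float.h>

#define FILTERORDER 10
#define LSF_NSPLIT 3

/* Returns the index of the codebook vector closest to x in squared
 * error. cb holds cb_size vectors of dimension dim, stored row by row. */
int vq_search(const double *x, int dim, const double *cb, int cb_size)
{
    int best = 0;
    double best_d = DBL_MAX;
    assert(dim > 0 && cb_size > 0);

    for (int k = 0; k < cb_size; k++) {
        double d = 0.0;
        for (int j = 0; j < dim; j++) {
            double diff = x[j] - cb[k * dim + j];
            d += diff * diff;
        }
        if (d < best_d) {
            best_d = d;
            best = k;
        }
    }
    return best;
}

/* Quantizes x split by split. first[s] and dim[s] give the position and
 * dimension of split s; index[s] receives the selected codebook index
 * and xq the quantized vector. */
void split_vq(const double x[FILTERORDER],
              const int first[LSF_NSPLIT], const int dim[LSF_NSPLIT],
              const double *cb[LSF_NSPLIT], const int cb_size[LSF_NSPLIT],
              int index[LSF_NSPLIT], double xq[FILTERORDER])
{
    for (int s = 0; s < LSF_NSPLIT; s++) {
        index[s] = vq_search(x + first[s], dim[s], cb[s], cb_size[s]);
        for (int j = 0; j < dim[s]; j++)
            xq[first[s] + j] = cb[s][index[s] * dim[s] + j];
    }
}
```

For the predictive quantizer the same routine would be applied to the prediction error instead of the LSF vector itself.
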

The following c-like definitions explain how each LSF vector (lsf1 and lsf2) is split by defining the position of the first coefficient for each split vector (for added clarity, the corresponding dimension for each split vector is also provided):

   int lsf1_splitfirst[LSF_NSPLIT] = {0, 3, 6};
   int lsf1_splitdim[LSF_NSPLIT] = {3, 3, 4};
   int lsf2_splitfirst[LSF_NSPLIT] = {0, 4, 7};
   int lsf2_splitdim[LSF_NSPLIT] = {4, 3, 3};

For each of the split vectors, a separate codebook of quantized values has been designed using a standard VQ training method for a large database containing speech from a large number of speakers recorded under various conditions. The size of each of the six codebooks associated with the split definitions above is:

   int lsf1_cbsize[LSF_NSPLIT] = {256, 256, 256};
   int lsf2_cbsize[LSF_NSPLIT] = {512, 512, 1024};

The second set of LSF coefficients, lsf2, is quantized with a standard memoryless split vector quantization (VQ) structure using the squared error criterion in the LSF domain. The split VQ quantization consists of the following steps:

1) Quantize the first 3 LSF coefficients with a VQ codebook of size 512.

2) Quantize LSF coefficients 4, 5, and 6 with a VQ codebook of size 512.

3) Quantize the last 4 LSF coefficients with a VQ codebook of size 1024.

This procedure gives 3 quantization indices and the quantized second set of LSF coefficients qlsf2.

The quantization of the first set of LSF coefficients is done on the prediction error obtained by predicting the first set of LSF coefficients from the quantized second set of LSF coefficients. The prediction error, e, is obtained by:

   lsfhat[i] = lsfpred[i] * (qlsf2[i] - lsfmean[i]);
             i=0,...,FILTERORDER-1
   e[i] = lsf1[i] - lsfmean[i] - lsfhat[i];
             i=0,...,FILTERORDER-1

where lsfhat is the predicted, mean-removed first set of LSF coefficients.
The prediction coefficients, lsfpred, and the mean vector, lsfmean, are pre-computed and stored values.

The prediction error e is quantized with a standard memoryless split vector quantization (VQ) structure using the squared error criterion in the LSF domain. The split VQ quantization consists of the following steps:

1) Quantize the first 3 prediction error values with a VQ codebook of size 256.

2) Quantize prediction error values 4, 5, and 6 with a VQ codebook of size 256.

3) Quantize the last 4 prediction error values with a VQ codebook of size 256.

This procedure gives 3 quantization indices and the quantized prediction error values qe. The first set of LSF coefficients qlsf1 is given by:

   qlsf1[i] = qe[i] + lsfmean[i] + lsfhat[i];
             i=0,...,FILTERORDER-1

The result of the quantization of each of the two LSF coefficient sets is a set of 3 indices. A first set of indices represents qlsf1 and is encoded with 8+8+8=24 bits. A second set of indices represents qlsf2 and is encoded with 9+9+10=28 bits. The total number of bits used for LSF quantization in a block is thus 52 bits.

3.1.5 Stability Check of LSF Coefficients

The LSF representation of the LPC filter has the nice property that the coefficients are ordered by increasing value, i.e., lsf(n) > lsf(n-1), 0 < n < 10, if the corresponding synthesis filter is stable. Since a split VQ scheme is employed, it is possible that at the split boundaries the LSF coefficients are not ordered correctly and hence the corresponding LP filter is unstable. To ensure that the filter used is stable, a stability check is performed for the quantized LSF vectors. If it turns out that the coefficients are not ordered appropriately (with a safety margin of 50 Hz to ensure that formant peaks are not too narrow), they will be moved apart. The detailed method for this can be found in Appendix A.42.
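
As an illustration only, ordering with a minimum spacing can be enforced in a single forward pass over the quantized vector. The sketch below assumes that the LSFs are expressed in Hz and simply pushes the upper coefficient of an offending pair upwards; both the unit and this repair strategy are assumptions of the example, and the method actually used by the codec is the one in Appendix A.42. The name lsf_enforce_spacing is hypothetical.

```c
#include <assert.h>

#define FILTERORDER 10
#define MIN_GAP_HZ 50.0   /* safety margin from the text, assumed in Hz */

/* Forces lsf[n] >= lsf[n-1] + MIN_GAP_HZ for 0 < n < FILTERORDER by
 * moving the upper coefficient of each offending pair upwards.
 * Returns the number of coefficients that were moved. The sketch does
 * not guard against the last coefficient exceeding the Nyquist limit. */
int lsf_enforce_spacing(double lsf[FILTERORDER])
{
    int moved = 0;
    assert(lsf != 0);

    for (int n = 1; n < FILTERORDER; n++) {
        if (lsf[n] - lsf[n - 1] < MIN_GAP_HZ) {
            lsf[n] = lsf[n - 1] + MIN_GAP_HZ;
            moved++;
        }
    }
    return moved;
}
```

Because each coefficient is compared with its already corrected predecessor, one pass is sufficient to obtain a strictly increasing vector.
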
The same procedure is performed in the decoder. This ensures that exactly the same LSF representations are used in both encoder and decoder.

3.1.6 Interpolation of LSF Coefficients

From the two sets of LSF coefficients that are computed for each block of speech, different LSFs are obtained for each sub-block by means of interpolation. This procedure is performed for the original LSFs, lsf1 and lsf2, as well as the quantized versions qlsf1 and qlsf2, since both versions are used in the encoder. Here follows a brief summary of the interpolation scheme, while the details are found in the C code of Appendix A. In the first sub-block, the average of the second LSF vector from the previous block and the first LSF vector in the current block is used. For sub-blocks two through five the LSFs used are obtained by linear interpolation from lsf1 (and qlsf1) to lsf2 (and qlsf2), with lsf1 used in sub-block two and lsf2 in sub-block five. In the last sub-block, lsf2 is used. For the very first block it is assumed that the previous block has the same LSF vectors as the current one.

The interpolation method is standard linear interpolation in the LSF domain. The interpolated LSF values are converted to LPC coefficients for each sub-block.

A reference implementation of the LSF encoding is given in Appendix A.40. A reference implementation of the corresponding decoding can be found in Appendix A.38.

3.2 Calculation of the Residual

The block of speech samples is filtered by the quantized and interpolated LPC filters to yield the residual signal. In particular, the corresponding LPC analysis filter for each sub-block is used to filter the speech samples for the same sub-block. The filter memory at the end of each sub-block is carried over to the LPC filter of the next sub-block. The signal at the output of each LP analysis filter constitutes the residual signal for the corresponding sub-block.
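
The residual calculation can be sketched in C as below. The example assumes the analysis filter has the form A(z) = sum over k of a[k] z^-k with a[0] = 1.0, stores the NSUB coefficient sets in one flat array, and keeps the last FILTERORDER input samples as filter memory. The function name lpc_residual and this memory layout are assumptions made for the illustration; the implementation in Appendix A.10 may be organized differently.

```c
#include <assert.h>

#define BLOCKL 240
#define NSUB 6
#define SUBL 40
#define FILTERORDER 10

/* Filters one block of speech with a per-sub-block analysis filter.
 * a holds NSUB sets of FILTERORDER + 1 coefficients (set s starts at
 * a[s * (FILTERORDER + 1)]). mem holds the last FILTERORDER input
 * samples before the block, mem[0] being the most recent, and is
 * updated on return so that it can be passed to the next call. */
void lpc_residual(const double speech[BLOCKL], const double *a,
                  double mem[FILTERORDER], double residual[BLOCKL])
{
    assert(speech != 0 && a != 0 && mem != 0 && residual != 0);

    for (int n = 0; n < BLOCKL; n++) {
        const double *ak = a + (n / SUBL) * (FILTERORDER + 1);
        double acc = ak[0] * speech[n];
        for (int k = 1; k <= FILTERORDER; k++) {
            /* Past input x[n-k]: from this block or from the memory. */
            double past = (n - k >= 0) ? speech[n - k] : mem[k - n - 1];
            acc += ak[k] * past;
        }
        residual[n] = acc;
    }

    /* Carry the last FILTERORDER input samples over to the next block. */
    for (int k = 0; k < FILTERORDER; k++)
        mem[k] = speech[BLOCKL - 1 - k];
}
```

Since the filter is a pure FIR filter on the input, switching coefficient sets at sub-block boundaries needs no other state than the past input samples, which is how the memory is carried from one sub-block to the next here.
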
A reference implementation of the residual calculating filter is found in Appendix A.10.

3.3 Perceptual Weighting Filter

In principle any good design of perceptual weighting filter can be applied in the encoder without compromising this codec definition. A simplified design with low complexity is to apply the filter 1/A(z/0.4) in the LPC residual domain. Here A(z) is the filter obtained from unquantized but interpolated LSF coefficients.

3.4 Start State Encoder

The start state containing STATE_SHORT_LEN=58 maximum energy residual samples is quantized using a common 6-bit scale quantizer for the block and a 3-bit scalar quantizer operating on the scaled samples in the weighted speech domain. The state encoding is described in greater detail below.

3.4.1 Start State Estimation

The two sub-blocks containing the start state are determined by finding the two consecutive sub-blocks in the block having the highest power, i.e., the following measure is computed:

   nsub=1,...,NSUB-1
   ssqn[nsub] = 0.0;
   for (i=(nsub-1)*SUBL; i...

   ...max_measure){
      best_index = cb_index;
      max_measure = measure;
      gain = crossDot*invDot;
   }

After the search in the base codebook, the iterative search loop is continued into the 3 expanded sections of the adaptive codebook. This can be done as a full search. However, to save computations this part of the search can be constrained to indexes in a restricted range RESRANGE around the best_index identified in the base codebook.
This is obtained by identifying a start index sInd and an end index eInd as:

   base_index = best_index;
   sInd=base_index-RESRANGE/2;
   if (sInd < 0) sInd=0;
   eInd = sInd+RESRANGE;
   if (eInd>=base_size) {
      eInd=base_size-1;
      sInd=eInd-RESRANGE;
   }

With these definitions, the iterative search can be continued over the following 3 intervals:

1) cb_index=sInd+base_size to cb_index=eInd+base_size;

2) cb_index=sInd+2*base_size to cb_index=eInd+2*base_size;

3) cb_index=sInd+3*base_size to cb_index=eInd+3*base_size;

After these iterations the best codebook index, best_index, has been found. A good compromise between computational complexity and speech quality is obtained by choosing RESRANGE=33.

3.5.3.2 The Gain Quantization at Each Stage

The gain follows as a result of the registration

   gain = crossDot*invDot;

each time the max_measure is surpassed in the search procedure outlined in Section 3.5.3.1. In the first stage, the gain is limited to the range 0.0 to 1.0:

   if (gain<0.0) gain = 0.0;
   if (gain>1.0) gain = 1.0;

Subsequently this gain is quantized by finding the nearest representation value in the quantization table gain_sq4. The resulting gain index is the index to this representation value in the quantization table.

The gains of subsequent stages are quantized using a quantization table which is obtained by multiplication of the values in the table gain_sq3 with a scale value. This scale value is 0.1 or the absolute value of the quantized gain representation value obtained in the previous stage, whichever is larger. Again, the resulting gain index is the index to the nearest representation value in the quantization table.

3.5.3.3 Preparation of Target for Next Stage

Before redoing the search for the next stage, the target vector is updated by subtracting from it the selected shape vector times the corresponding quantized gain.
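
The per-stage gain quantization and the target update can be sketched as follows. The tables TOY_GAIN_SQ4 and TOY_GAIN_SQ3 are placeholder values invented for this example; they only have the table sizes implied by the 4 bit and 3 bit quantizers and are not the gain_sq4 and gain_sq3 tables of the codec. The function names quantize_stage_gain and update_target are likewise hypothetical.

```c
#include <assert.h>

/* Placeholder tables, NOT the codec's gain_sq4 / gain_sq3 values. */
static const double TOY_GAIN_SQ4[16] = {
    0.0625, 0.125, 0.1875, 0.25, 0.3125, 0.375, 0.4375, 0.5,
    0.5625, 0.625, 0.6875, 0.75, 0.8125, 0.875, 0.9375, 1.0
};
static const double TOY_GAIN_SQ3[8] = {
    -1.0, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 1.0
};

static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Quantizes the gain of one stage (stage 0 is the first stage).
 * prev_gain_q is the quantized gain of the previous stage and is
 * ignored for stage 0. Stores the table index in *index and returns
 * the quantized gain. */
double quantize_stage_gain(int stage, double gain, double prev_gain_q,
                           int *index)
{
    const double *table;
    int size;
    double scale;
    assert(stage >= 0 && index != 0);

    if (stage == 0) {
        if (gain < 0.0) gain = 0.0;     /* first stage: limit to [0, 1] */
        if (gain > 1.0) gain = 1.0;
        table = TOY_GAIN_SQ4; size = 16; scale = 1.0;
    } else {
        scale = abs_d(prev_gain_q);     /* max(0.1, |previous gain|) */
        if (scale < 0.1) scale = 0.1;
        table = TOY_GAIN_SQ3; size = 8;
    }

    int best = 0;
    double best_err = abs_d(gain - scale * table[0]);
    for (int i = 1; i < size; i++) {
        double err = abs_d(gain - scale * table[i]);
        if (err < best_err) { best_err = err; best = i; }
    }
    *index = best;
    return scale * table[best];
}

/* Removes the contribution of the selected code vector from the target. */
void update_target(double *target, const double *shape, double gain_q,
                   int len)
{
    for (int i = 0; i < len; i++)
        target[i] -= gain_q * shape[i];
}
```

Scaling the later tables by the previous quantized gain keeps the quantizer range proportional to the size of the remaining error, with 0.1 acting as a lower bound on that range.
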
   A reference implementation of the codebook encoding is found in
   Appendix A.36.

3.6 Gain Correction Encoding

   The start state is quantized in a relatively model-independent
   manner using 3 bits per sample. In contrast, the remaining parts of
   the block are encoded using an adaptive codebook. This codebook
   produces high matching accuracy whenever there is a high
   correlation between the target and a segment found in the buf
   variable. For unvoiced speech segments, this is not necessarily so.
   The result is a signal block for which the start state is encoded
   with much higher accuracy than the rest of the block. Perceptually,
   the main problem with this is that the time envelope of the signal
   energy becomes unsteady. To overcome this problem, the start state
   is scaled down by a factor that approximates the energy loss in the
   remaining parts of the signal block. The determination of this
   scale factor is the last step in the encoding process.

   First, the energy per sample is determined in the start state
   target, Esst; in the decoded start state, Ess; in the remaining
   parts of the excitation signal target, Eet; and in the remaining
   part of the decoded excitation signal, Ee. If the ratio
   sqrt(Eet/Esst) is larger than or equal to 0.25, a correction factor
   is determined as

      correction_factor = sqrt( (Ee/Eet) / (Ess/Esst) );

   The correction factor is uniformly quantized in the range from 0.0
   to 1.0 using 4-bit quantization as obtained, e.g., by the following
   lines of C code:

      if(correction_factor > 1) correction_factor = 1;
      index=(int)(correction_factor*16)-1;
      if (index<0) index=0;

   However, if the ratio sqrt(Eet/Esst) is less than 0.25, it is taken
   as an indication that even the original signal energy did not have
   a steady time envelope. In this case, the correction factor is
   forced to 1.0 by selecting the index equal to 15.

   A reference implementation of the gain correction encoding is
   listed in Appendix A.22.
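   The decision logic of Section 3.6 can be sketched as a single
   function. This is an illustrative sketch under stated assumptions:
   the function name is hypothetical, all four energies are assumed to
   be strictly positive, and the square roots are replaced by
   algebraically equivalent comparisons on squared quantities so that
   the sketch needs no math library. It is not the code of Appendix
   A.22.

```c
/* Gain correction index from the four per-sample energies of Section 3.6.
   Hypothetical helper; assumes Esst, Ess, Eet and Ee are all > 0. */
int gain_correction_index(float Esst, float Ess, float Eet, float Ee)
{
    float r;   /* squared correction factor (Ee/Eet)/(Ess/Esst) */
    int k;

    /* sqrt(Eet/Esst) < 0.25 is the same test as Eet/Esst < 0.0625:
       the envelope was unsteady already, so force the factor to 1.0. */
    if (Eet < 0.0625f * Esst)
        return 15;

    r = (Ee / Eet) / (Ess / Esst);
    if (r > 1.0f) r = 1.0f;        /* same effect as clamping the factor */

    /* (int)(sqrt(r)*16) is the largest k in 0..16 with k*k <= 256*r */
    k = 0;
    while (k < 16 && (float)((k + 1) * (k + 1)) <= 256.0f * r)
        k++;

    k -= 1;                        /* index = (int)(factor*16) - 1 */
    if (k < 0) k = 0;
    return k;
}
```

   For example, equal energy loss in both parts gives a factor of 1.0
   and index 15, while a factor of 0.5 gives index 7.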
3.7 Bitstream Definition

   The total number of bits used to describe one block of 30 ms speech
   is 419 bits, giving a bit rate of 13.967 kbit/s. The detailed bit
   allocation is shown in the table below. When one block is
   represented in the payload of a single packet, 53 bytes are needed
   for the 419 bits; 5 bits are unused in the last byte.

   Bitstream structure:

   Parameter                                         Bits
   ------------------------------------------------------
                                        Split 1         8
                   LSF 1                Split 2         8
   LSF                                  Split 3         8
                   --------------------------------------
                                        Split 1         9
                   LSF 2                Split 2         9
                                        Split 3        10
                   --------------------------------------
                                        Sum            52
   ------------------------------------------------------
   Block Class.                                         3
   ------------------------------------------------------
   Scale Factor State Coder                             6
   ------------------------------------------------------
                                        Sample 0        3
   Quantized                            Sample 1        3
   Residual                             :               :
   State                                :               :
   Samples                              :               :
                                        Sample 56       3
                                        Sample 57       3
                   --------------------------------------
                                        Sum           174
   ------------------------------------------------------
                                        Stage 1         8
                   Indices sub-block 1  Stage 2         8
                                        Stage 3         8
                   --------------------------------------
                                        Stage 1         9
                   Indices sub-block 2  Stage 2         9
                                        Stage 3         9
   CB sub-blocks   --------------------------------------
                                        Stage 1         9
                   Indices sub-block 3  Stage 2         9
                                        Stage 3         9
                   --------------------------------------
                                        Stage 1         9
                   Indices sub-block 4  Stage 2         9
                                        Stage 3         9
                   --------------------------------------
                                        Sum           105
   ------------------------------------------------------
                                        Stage 1         4
                   Gains sub-block 1    Stage 2         3
                                        Stage 3         3
                   --------------------------------------
                                        Stage 1         4
                   Gains sub-block 2    Stage 2         3
                                        Stage 3         3
   Gain sub-blocks --------------------------------------
                                        Stage 1         4
                   Gains sub-block 3    Stage 2         3
                                        Stage 3         3
                   --------------------------------------
                                        Stage 1         4
                   Gains sub-block 4    Stage 2         3
                                        Stage 3         3
                   --------------------------------------
                                        Sum            40
   ------------------------------------------------------
                                        Stage 1         8
   CB for 22 samples in start state     Stage 2         8
                                        Stage 3         8
                   --------------------------------------
                                        Sum            24
   ------------------------------------------------------
                                        Stage 1         4
   Gain for 22 samples in start state   Stage 2         3
                                        Stage 3         3
                   --------------------------------------
                                        Sum            10
   ------------------------------------------------------
   Position 22 sample segment                           1
   ------------------------------------------------------
   Gain correction factor                               4
   ------------------------------------------------------
   SUM                                                419

4. DECODER PRINCIPLES

   This section describes the principles of each component of the
   decoder algorithm.

4.1 LPC Filter Reconstruction

   The decoding of the LP filter parameters is straightforward. For a
   set of six indices, the corresponding LSF vectors are found by
   simple table lookup. The three split vectors are concatenated to
   obtain qlsf2 and the quantized prediction error vector for the
   first LSF. The prediction vector is calculated from qlsf2 and added
   together with the LSF mean vector to the decoded prediction error
   vector to obtain qlsf1, in the same way as was described for the
   encoder in Section 3.1.4.
   The next step is the stability check described in Section 3.1.5,
   followed by the interpolation scheme described in Section 3.1.6.
   The only difference is that only the quantized LSFs are known at
   the decoder, and hence the unquantized LSFs are not processed.

   A reference implementation of the LPC filter reconstruction is
   given in Appendix A.38.

4.2 Start State Reconstruction

   The scalar encoded SCLEN state samples are reconstructed by first
   forming a set of samples from the index stream SINDEX[n],
   multiplying the set with 1/SCAL=10^QMAX/4.5, and then filtering the
   block with the inverse dispersion (all-pass) filter used in the
   encoder (as described in Section 3.4).

   The remaining STATE_ACBLEN samples in the state are reconstructed
   by the same adaptive codebook technique as described in Section
   4.3. The location bit determines whether these are the first or the
   last STATE_ACBLEN samples of the state vector. If the remaining
   STATE_ACBLEN samples are the first samples of the state vector,
   then the scalar encoded SCLEN state samples are time-reversed
   before initialization of the adaptive codebook memory vector.

   A reference implementation of the start state reconstruction is
   given in Appendix A.46.

4.3 Excitation Decoding Loop

   The decoding of the LPC excitation vector proceeds in the same
   order in which the residual was encoded at the encoder. That is,
   after the decoding of the entire state vector, the forward
   subblocks (corresponding to samples occurring after the state
   vector samples) are decoded, and then the backward subblocks
   (corresponding to samples occurring before the state vector) are
   decoded, resulting in a fully decoded block of excitation signal
   samples.

   In particular, each subblock is decoded using the multistage
   adaptive codebook decoding module, which is described in Section
   4.4. This module relies upon an adaptive codebook memory that is
   constructed before each run of the adaptive codebook decoding.
   The construction of the adaptive codebook memory in the decoder is
   identical to the method outlined in Section 3.5.2. Therefore, for
   the initial forward subblock, the last STATLEN=80 samples of the
   length LMEM=147 adaptive codebook memory are filled with the
   samples of the state vector. For subsequent forward subblocks, the
   first SUBL=40 samples of the adaptive codebook memory are
   discarded, the remaining samples are shifted by SUBL samples
   towards the beginning of the vector, and the newly decoded SUBL=40
   samples are placed at the end of the adaptive codebook memory. For
   backward subblocks, the construction is similar except that every
   vector of samples involved is first time-reversed.

   A reference implementation of the excitation decoding loop is found
   in Appendix A.5.

4.4 Multistage Adaptive Codebook Decoding

   The Multistage Adaptive Codebook Decoding module is used at both
   the sender (encoder) and the receiver (decoder) ends to produce a
   synthetic signal in the residual domain that is eventually used to
   produce synthetic speech. The module takes the index values and
   uses them to construct vectors that are scaled and summed together
   to produce the synthetic signal that is the output of the module.

4.4.1 Construction of the Decoded Excitation Signal

   The unpacked index values provided at the input to the module are
   references to extended codebooks, which are constructed as
   described in Section 3.5.2, with the only difference that they are
   based on the codebook memory without the perceptual weighting. The
   3 unpacked indexes are used to look up 3 codebook vectors. The 3
   unpacked gain indexes are used to decode the corresponding 3 gains.
   In this decoding, the successive rescaling described in Section
   3.5.3.2 is applied.

   A reference implementation of the adaptive codebook decoding is
   listed in Appendix A.34.
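   The memory update for forward subblocks (Section 4.3) and the
   gain-scaled summation of codebook vectors (Section 4.4) can be
   sketched as follows. The function names and the flattened layout of
   the codebook vectors are assumptions made for this illustration;
   the codebook lookup and gain table decoding themselves are left
   out.

```c
#include <string.h>

/* Forward-subblock update of the adaptive codebook memory: drop the
   first subl samples, shift the rest towards the beginning, and place
   the newly decoded subblock at the end (LMEM=147, SUBL=40 in the
   text). Function name is hypothetical. */
void update_cb_memory(float *mem, int lmem, const float *newsub, int subl)
{
    memmove(mem, mem + subl, (size_t)(lmem - subl) * sizeof(float));
    memcpy(mem + lmem - subl, newsub, (size_t)subl * sizeof(float));
}

/* Decoded excitation for one subblock: the sum over all stages of the
   decoded gain times the codebook vector of that stage.
   cbvecs holds nstages vectors of length len, stored back to back
   (an assumed layout for this sketch). */
void construct_excitation(float *decoded, const float *cbvecs,
                          const float *gains, int nstages, int len)
{
    int i, k;

    for (i = 0; i < len; i++) {
        decoded[i] = 0.0f;
        for (k = 0; k < nstages; k++)
            decoded[i] += gains[k] * cbvecs[k * len + i];
    }
}
```

   For backward subblocks, the same two routines could be applied to
   time-reversed vectors, following the description in Section 4.3.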
4.5 Packet Loss Concealment

   If packet loss occurs, the decoder receives a signal saying that
   information regarding a block is lost. For such blocks, a Packet
   Loss Concealment (PLC) unit can be used to create a decoded signal
   that masks the effect of that packet loss. In the following, we
   describe an example of a PLC unit that can be used with the iLBC
   codec. As the PLC unit is used only at the decoder, it does not
   affect interoperability between implementations. Other PLC
   implementations can therefore be used.

   The example PLC described here operates on the LP filters and the
   excitation signals and is based on the following principles:

4.5.1 Block Received Correctly and Previous Block also Received

   If the block is received correctly, the PLC only records state
   information of the current block that can be used in case the next
   block is lost. The LP filters for each subblock, the first-stage
   adaptive codebook lag (which can be construed as pitch) for each
   subblock that runs the adaptive codebook decoding, and the entire
   decoded excitation signal are all saved in the PLCState structure.
   All this information will be needed if the following block is lost.

4.5.2 Block Not Received

   If the block is not received, the block substitution is based on a
   pitch-synchronous repetition of the excitation signal, which is
   filtered by modified versions of the previous block's LP filters.
   The previous block's information is stored in the structure
   PLCState.

   First, the previous block's LP filters are bandwidth expanded (the
   effect of which is to pull the roots away from the unit circle to
   mute the resonance of the filters) to produce the LP filters that
   are used in the synthesis of the substituted block. A correlation
   analysis is performed on the previous block's excitation signal in
   order to detect the amount of pitch periodicity and a pitch value.
   The correlation measure is also used to decide on the voicing level
   (the degree to which the previous block's excitation was a voiced
   or roughly periodic signal). The excitation in the previous block
   is used to create an excitation for the block to be substituted
   such that the pitch of the previous block is maintained. Therefore,
   the new excitation is constructed in a pitch-synchronous manner. In
   order to avoid a buzzy-sounding substituted block, a random
   excitation is mixed with the new pitch-periodic excitation, and the
   relative use of the two components is computed from the correlation
   measure (voicing level).

   For the block to be substituted, the newly constructed excitation
   signal is then passed through the newly constructed LP filters to
   produce the speech that will be substituted for the lost block.

   For several consecutive lost blocks, the packet loss concealment
   continues in a similar manner. The correlation measure of the last
   received block is still used along with the same pitch value. The
   LP filters of the last received block are also used again, but the
   bandwidth expansion is increased for consecutive lost blocks (as
   the length in time from the last received block increases). This
   increases the muting of the resonance of the spectral envelope. The
   energy of the substituted excitation for consecutive lost blocks is
   decreased, leading to a dampened excitation, and therefore dampened
   speech.

4.5.3 Block Received Correctly When Previous Block Not Received

   For the case in which a block is received correctly when the
   previous block was not received, the correctly received block's
   directly decoded speech (based solely on the received block) is not
   used as the actual output. The reason for this is that the directly
   decoded speech does not necessarily smoothly merge into the
   synthetic speech generated for the previous lost block. If the two
   signals are not smoothly merged, an audible discontinuity is
   accidentally produced.
   Therefore, a correlation analysis between the two blocks of
   excitation signal (the excitation of the previous concealed block
   and the excitation of the current received block) is performed to
   find the best phase match. Then a simple overlap-add procedure is
   performed to smoothly merge the previous excitation into the
   current block's excitation.

   The exact implementation of the packet loss concealment does not
   influence interoperability of the codec.

   A reference implementation of the packet loss concealment is
   suggested in Appendix A.34. Exact compliance with this suggested
   algorithm is not needed for a reference implementation to be fully
   compatible with the overall codec specification.

4.6 Enhancement

   The decoder contains an enhancement unit that operates on the
   reconstructed excitation signal. The enhancement unit increases the
   perceptual quality of the reconstructed signal by reducing the
   speech-correlated noise (more accurately: speech-dependent noise)
   in the voiced speech segments. The enhancement unit has advantages
   over the postfilters that are conventionally used for a similar
   purpose.

   To understand the motivation for the enhancement unit, it is useful
   to define an enhancement signal as the enhanced output signal minus
   the distorted input signal. In conventional postfiltering
   operators, the relative power of the enhancement signal varies
   strongly as a function of time. In certain time intervals the
   enhancement signal has (too) much energy, and in others it has
   (too) little. The enhancement operation settings usually form a
   heuristic compromise between such time regions. The need for a
   compromise results from the postfiltering operation being based on
   the input signal only, except for signal power conservation. In
   other words, the conventionally used postfilters operate in
   open-loop fashion.
4.6.1 Outline of the Enhancement Unit

   The enhancement unit of iLBC introduces a second constraint on the
   enhanced signal, in addition to the first constraint that conserves
   the short-term power between the input and output of the enhancer.
   The second constraint is that the enhancement signal (which is
   defined as the difference signal resulting from subtracting the
   distorted signal from the enhanced signal) is constrained to have a
   power that is less than or equal to a certain fraction of the power
   of the distorted speech signal. The second constraint prevents the
   artifacts resulting from "over-enhancement" during some time
   intervals that are common to conventional postfilters. Yet, the
   second constraint does not significantly affect the effectiveness
   of the enhancement in sustained voiced regions, where enhancement
   of speech signals corrupted by speech-correlated noise is typically
   most needed.

   The speech enhancement unit includes two basic steps, each
   performed for each current time sample of the signal. The pitch
   track or delay track that is determined and used in the iLBC coder
   is an input to the first step.

   The first step consists of refining the pitch track so as to allow
   a sampling of the distorted input signal using sampling intervals
   of precisely one pitch period, starting from the current sample, to
   obtain a pitch-period-synchronous sequence. Thus, the procedure
   creates such a pitch-period-synchronous sequence for each sample of
   the coded excitation (the sample of the distorted speech signal
   being also a sample of the corresponding pitch-period-synchronous
   sequence). To simplify processing, the pitch-period-synchronous
   sequence is determined simultaneously for a set of consecutive
   samples of the distorted input signal (i.e., for a block of that
   signal).
   We refer to such a set of consecutive excitation-signal samples
   (block) as a sample-sequence. Our simultaneous determination of
   pitch-period-synchronous sequences for an entire sample-sequence
   results in a pitch-period-synchronous sequence of sample-sequences.

   The second step of our enhancement operator includes re-estimating
   each sample based on the corresponding pitch-period-synchronous
   sequence, the first signal-power constraint, and the second
   constraint operating on the enhancement signal. The sequence of
   re-estimated samples (the re-estimated signal block) forms the
   enhanced excitation signal. The enhanced speech signal is more
   periodic than the distorted speech signal when the signal is voiced
   (and the pitch-period-synchronous sequence corresponds to a nearly
   periodic sampling of the distorted signal). To simplify the
   processing, the re-estimation procedure is also performed
   simultaneously for a sample-sequence, rather than for each sample
   individually. Concatenation of the re-estimated sample-sequences
   (excitation signal blocks) results in the reconstructed excitation
   signal.

   It is noted that in regions where the speech signal is not nearly
   periodic, the speech enhancement system does not change the
   distorted signal significantly because of the second constraint.
   However, whenever the distorted speech signal is nearly periodic,
   the speech enhancement system effectively removes or reduces the
   audible distortion. It is also noted that the second constraint not
   only results in a reduction of artifacts, but also results in an
   insensitivity to lack of robustness in the determination of
   pitch-period-synchronous sequences.

   In the following two subsections, we first discuss the
   determination of the pitch-synchronous sequence of sample-sequences
   for the current sample-sequence and then the re-estimation of the
   sample-sequence. Concatenation of the re-estimated sample-sequences
   forms the reconstructed excitation signal of the iLBC coder.
4.6.2 Determination of the Pitch-Synchronous Sequences

   Upon receiving the pitch track, the enhancer refines it for a
   particular block (sample-sequence) to obtain a
   pitch-period-synchronous sequence of sample-sequences. Such a
   pitch-period-synchronous sequence of sample-sequences is determined
   for each consecutive block of samples (each block forms a
   sample-sequence). The pitch-period-synchronous sequence of
   sample-sequences is determined recursively, both forward and
   backward in time.

   We describe the determination of the pitch-synchronous sequence in
   more detail for the backward iterative procedure. The forward
   iterative procedure is analogous. The sequence of sample-sequences
   is determined in a computationally efficient, recursive manner. The
   reference sample-sequence of an iteration step is initially (i.e.,
   for the first iteration step) defined as the current block of
   samples. Each subsequent reference sample-sequence is found
   recursively in the following steps. In a first step, a signal
   segment is up-sampled to create a set of polyphase signals that
   have the same sampling rate as the original signal. Each polyphase
   signal is offset by a different fractional sampling interval. In a
   second step, a subset of sample-sequences of the various polyphase
   signals is identified as candidate sample-sequences. This subset of
   sample-sequences falls within a certain range of time delays that
   is close to the pitch period obtained from the iLBC decoder. In a
   third step, one sample-sequence is selected from the set of
   candidate sample-sequences. The selected sample-sequence is the one
   that has the highest correlation coefficient with the reference
   sample-sequence. In the final step of each iteration, the selected
   sample-sequence replaces the reference sample-sequence to prepare
   for the next iteration.
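   The selection in the third step can be illustrated with a
   simplified sketch. It makes two loudly stated assumptions that are
   not in the text above: only integer lags are searched (the
   polyphase up-sampling of the first step is omitted), and the
   function name and argument layout are hypothetical. Because the
   reference energy is the same for every candidate, ranking by the
   correlation coefficient is equivalent to ranking by the signed
   squared cross-correlation divided by the candidate energy, which
   avoids a square root.

```c
/* Backward search for the candidate sample-sequence that best matches
   the reference s[pos .. pos+len-1]. Candidates start lag samples
   earlier, for lag_min <= lag <= lag_max. Integer lags only: the
   fractional (polyphase) offsets of the text are not modeled here. */
int best_backward_lag(const float *s, int pos, int len,
                      int lag_min, int lag_max)
{
    int lag, i, best = lag_min;
    double best_m = -1.0e30;

    for (lag = lag_min; lag <= lag_max; lag++) {
        const float *ref = s + pos;
        const float *cand = s + pos - lag;
        double rx = 0.0, xx = 0.0, m;

        if (pos - lag < 0)
            break;                  /* candidate would start before s[0] */

        for (i = 0; i < len; i++) {
            rx += (double)ref[i] * cand[i];
            xx += (double)cand[i] * cand[i];
        }

        /* sign(rx)*rx^2/xx has the same ordering as rx/sqrt(rr*xx) */
        m = (xx > 0.0) ? rx * (rx < 0.0 ? -rx : rx) / xx : 0.0;
        if (m > best_m) {
            best_m = m;
            best = lag;
        }
    }
    return best;
}
```

   In an iterative use, the selected candidate would become the
   reference for the next step, so the next call would be made with
   pos reduced by the returned lag.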
   The procedure is repeated until the required number of
   sample-sequences backward in time is found, which depends on the
   parameter settings used for the iLBC coder.

   The forward-in-time part of the pitch-period-synchronous sequence
   is determined in a manner analogous to the backward-in-time part.
   The number of sample-sequences forward in time and the number of
   sample-sequences backward in time can be varied individually to
   obtain the desired delay and performance characteristics.

4.6.3 Re-estimation of the Current Sample-Sequence

   For each successive sample-sequence (i.e., each successive block of
   the excitation signal), a re-estimation of the sample-sequence is
   performed. This re-estimation is determined from the current
   pitch-synchronous sequence of sample-sequences through a
   constrained optimization procedure.

   Let x_m be a vector representing a sample-sequence with index m
   within the current pitch-synchronous sequence of sample-sequences.
   The determination of this pitch-synchronous sequence was described
   in Section 4.6.2. Furthermore, let z be the re-estimated current
   sample-sequence. We then define the following cross-correlation
   based periodicity criterion that defines a measure of periodicity
   for the pitch-period-synchronous sequence:

      e = sum_{m=-W, m!=0}^{m=W} a_m z^T x_m,                     (1)

   where T indicates conjugate transpose, != indicates not equal, and
   the set of coefficients a_m forms a weighting window that specifies
   the weightings of the respective inner products between the
   re-estimated sample-sequence and the sample-sequences. We use a
   centered Hanning weighting modified so as to set a_0 to 0.

   The objective of the re-estimation procedure is to find the
   modified current sample-sequence z that maximizes the periodicity
   criterion (1) under two constraints.
   The first constraint is that the modified vector have the same
   energy as the original vector:

      z^T z = x_0^T x_0.                                          (2)

   The second constraint is that the difference vector, i.e., the
   modification, have relatively low energy:

      (z-x_0)^T (z-x_0) <= b x_0^T x_0,                           (3)

   where the value selected for b is positive and less than unity,
   with a larger value generally resulting in stronger enhancement of
   the signal periodicity. It is clear that, for small b, non-periodic
   signals cannot generally be converted into nearly periodic signals.
   The purpose of the second constraint is to prevent production of an
   enhanced signal that is significantly different from the original
   signal. This also means that the second constraint limits the
   numerical size of the errors that the enhancement procedure can
   make.

   To achieve constrained optimization, the Lagrange multiplier
   technique can be used. We distinguish two solution regions for the
   optimization: 1) the region where the second constraint is not
   activated (in this solution region, inequality (3) is a true
   inequality) and 2) the region where the second constraint is
   activated (in this solution region, (3) is an equality constraint).

   Let us define

      y = sum_{m=-W, m!=0}^{m=W} a_m x_m.                         (4)

   Then, in the first case, where the second constraint is not
   activated, the optimized re-estimated vector is simply a scaled
   version of y:

      z = y sqrt( x_0^T x_0 / (y^T y) ).                          (5)

   In the second case, where the second constraint is activated and
   becomes an equality constraint, we have that

      z = A y + B x_0,                                            (6)

   where

      A = sqrt( (b-b^2/4) x_0^T x_0 /
                (y^T y - (y^T x_0)^2/(x_0^T x_0)) )               (7)

   and

      B = 1 - b/2 - A (y^T x_0)/(x_0^T x_0).                      (8)

   It is now seen that the entire re-estimation of the current
   sample-sequence, from a given pitch-synchronous sequence of
   sample-sequences, can be performed in three simple steps.
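   As an illustration, equations (5) through (8) can be evaluated as
   in the following sketch, which takes x_0 and the weighted sum y of
   equation (4) as inputs. This is not the code of the reference
   enhancer: the function names are hypothetical, real-valued vectors
   are assumed, the fallbacks for degenerate inputs (zero-energy y, or
   y collinear with x_0) are assumptions of this sketch, and a small
   local Newton iteration stands in for the library square root so
   that the sketch has no external dependencies.

```c
/* Inner product of two length-n vectors. */
static double dot(const float *a, const float *b, int n)
{
    double s = 0.0;
    int i;
    for (i = 0; i < n; i++) s += (double)a[i] * b[i];
    return s;
}

/* Square root by Newton iteration (stand-in for sqrt() of <math.h>). */
static double root(double v)
{
    double r;
    int i;
    if (v <= 0.0) return 0.0;
    r = (v > 1.0) ? v : 1.0;
    for (i = 0; i < 80; i++) r = 0.5 * (r + v / r);
    return r;
}

/* Re-estimate z from x0 and y under constraints (2) and (3). */
void reestimate(const float *x0, const float *y, int n, float b, float *z)
{
    double xx = dot(x0, x0, n), yy = dot(y, y, n), yx = dot(y, x0, n);
    double A, B, d = 0.0, denom;
    int i;

    if (yy <= 0.0) {                /* assumption: nothing to align to */
        for (i = 0; i < n; i++) z[i] = x0[i];
        return;
    }

    /* Trial solution (5): y scaled to the energy of x0. */
    A = root(xx / yy);
    for (i = 0; i < n; i++) z[i] = (float)(A * y[i]);

    /* Check inequality (3) for the trial solution. */
    for (i = 0; i < n; i++)
        d += ((double)z[i] - x0[i]) * ((double)z[i] - x0[i]);
    if (d <= b * xx) return;

    /* Solution (6) with coefficients (7) and (8). */
    denom = yy - yx * yx / xx;
    if (denom <= 0.0) {             /* assumption: y collinear with x0 */
        for (i = 0; i < n; i++) z[i] = x0[i];
        return;
    }
    A = root((b - b * b / 4.0) * xx / denom);
    B = 1.0 - b / 2.0 - A * yx / xx;
    for (i = 0; i < n; i++) z[i] = (float)(A * y[i] + B * x0[i]);
}
```

   When the second branch is taken, the result has the energy of x_0
   and differs from x_0 by exactly the fraction b of that energy, up
   to rounding.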
   In a first step, we determine the z that maximizes the periodicity
   criterion with only the first constraint activated. The resulting
   trial solution is given by equation (5). In a second step, we check
   whether this trial solution satisfies the second constraint given
   by inequality (3). If it does, this trial solution for z is used
   and the third step is omitted. If it does not, then in a third step
   we determine solution (6) of the optimization, where both the first
   and the second constraint are considered as equality constraints.

   As was mentioned before, the reconstructed excitation signal
   consists of the concatenation of the re-estimated current
   sample-sequences.

   Appendix A.16 contains a listing of a reference implementation for
   the enhancement method.

4.7 Synthesis Filtering

   Upon decoding or PLC of the LP excitation block, the decoded speech
   block is obtained by running the decoded LP synthesis filter over
   the block. For decoded signal blocks, the LP coefficients are
   changed at the first sample of every sub-block. For PLC blocks, one
   solution is to apply the last LP coefficients of the last decoded
   speech block for all sub-blocks.

   The reference implementation for the synthesis filtering can be
   found in Appendix A.50.

5. SECURITY CONSIDERATIONS

   This algorithm for the coding of speech signals is not subject to
   any known security consideration; however, its RTP payload format
   [1] is subject to several considerations, which are addressed
   there.

6. REFERENCES

   [1] A. Duric and S. V. Andersen, "RTP Payload Format for iLBC
       Speech", draft-duric-avt-gips-ilbc-00.txt, February 2002.

   [2] S. Bradner, "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [3] ITU-T Recommendation G.711, available online from the ITU
       bookstore at http://www.itu.int.

7. ACKNOWLEDGEMENTS

   The authors wish to thank Henry Sinnreich for great support of this
   initiative and also wish to thank à. for their valuable feedback
   and comments.

8. AUTHORS' ADDRESSES

   Soren Vang Andersen
   Global IP Sound AB
   Rosenlundsgatan 54
   Stockholm, S-11863
   Sweden
   Phone: +46 8 54553040
   Email: soren.andersen@globalipsound.com

   Alan Duric
   Global IP Sound AB
   Rosenlundsgatan 54
   Stockholm, S-11863
   Sweden
   Phone: +46 8 54553040
   Email: alan.duric@globalipsound.com

   Roar Hagen
   Global IP Sound AB
   Rosenlundsgatan 54
   Stockholm, S-11863
   Sweden
   Phone: +46 8 54553040
   Email: roar.hagen@globalipsound.com

   W. Bastiaan Kleijn
   Global IP Sound AB
   Rosenlundsgatan 54
   Stockholm, S-11863
   Sweden
   Phone: +46 8 54553040
   Email: bastiaan.kleijn@globalipsound.com

   Jan Linden
   Global IP Sound Inc.
   900 Kearny Street, suite 500
   San Francisco, CA-94133
   USA
   Phone: +1 415 397 2555
   Email: jan.linden@globalipsound.com

   Manohar N. Murthi
   1630 Eagle Dr.
   Sunnyvale, CA-94087
   USA
   Phone: +1 408 749 8160
   Email: mnmurthi@yahoo.com

   Jan Skoglund
   Global IP Sound Inc.
   900 Kearny Street, suite 500
   San Francisco, CA-94133
   USA
   Phone: +1 415 397 2555
   Email: jan.skoglund@globalipsound.com

   Julian Spittka
   Global IP Sound Inc.
   900 Kearny Street, suite 500
   San Francisco, CA-94133
   USA
   Phone: +1 415 397 2555
   Email: julian.spittka@globalipsound.com

APPENDIX A REFERENCE IMPLEMENTATION

   This appendix contains the complete C code for a reference
   implementation of the encoder and decoder for the specified codec.
   The c-code consists of the following files with highest level
   functions:

      iLBC_test.c: main function for evaluation purpose
      iLBC_encode.h: encoder header
      iLBC_encode.c: encoder function
      iLBC_decode.h: decoder header
      iLBC_decode.c: decoder function

   the following files containing global defines and constants:

      iLBC_define.h: global defines
      constants.h: global constants header
      constants.c: global constants memory allocations

   and the following files containing subroutines:

      anaFilter.h: lpc analysis filter header
      anaFilter.c: lpc analysis filter function
      createCB.h: codebook construction header
      createCB.c: codebook construction function
      doCPLC.h: packet loss concealment header
      doCPLC.c: packet loss concealment function
      enhancer.h: signal enhancement header
      enhancer.c: signal enhancement function
      filter.h: general filter header
      filter.c: general filter functions
      FrameClassify.h: start state classification header
      FrameClassify.c: start state classification function
      gaincorr_Encode.h: gain correction encoder header
      gaincorr_Encode.c: gain correction encoder function
      gainquant.h: gain quantization header
      gainquant.c: gain quantization function
      getCBvec.h: codebook vector construction header
      getCBvec.c: codebook vector construction function
      helpfun.h: general purpose header
      helpfun.c: general purpose functions
      hpInput.h: input high pass filter header
      hpInput.c: input high pass filter function
      hpOutput.h: output high pass filter header
      hpOutput.c: output high pass filter function
      iCBConstruct.h: excitation decoding header
      iCBConstruct.c: excitation decoding function
      iCBSearch.h: excitation encoding header
      iCBSearch.c: excitation encoding function
      LPCdecode.h: lpc decoding header
      LPCdecode.c: lpc decoding function
      LPCencode.h: lpc encoding header
      LPCencode.c: lpc encoding function
      lsf.h: line spectral frequencies header
      lsf.c: line spectral frequencies functions
      packing.h: bitstream packetization header
      packing.c: bitstream packetization functions
      StateConstructW.h: state decoding header
      StateConstructW.c: state decoding functions
      StateSearchW.h: state encoding header
      StateSearchW.c: state encoding function
      syntFilter.h: lpc synthesis filter header
      syntFilter.c: lpc synthesis filter function

   The implementation is portable and should work on many different
   platforms. However, it is not difficult to optimize the
   implementation on particular platforms, an exercise left to the
   reader.

A.1 iLBC_test.c

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       iLBC_test.c

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

   ******************************************************************/

   #include
   #include
   #include
   #include
   #include "iLBC_define.h"
   #include "iLBC_encode.h"
   #include "iLBC_decode.h"
   #include "constants.h"
   //#include "iLBCInterface.h"

   #define ILBCNOOFWORDS ILBCFLOAT_GIPS_NOOFBYTES/2

   /* Runtime statistics */
   #include
   #define CLOCKS_PER_SEC 1000
   #define TIME_PER_FRAME 30

   /*----------------------------------------------------------------*
    * Initiation of encoder instance.
    *---------------------------------------------------------------*/

   short initEncode(                  /* (o) Number of bytes encoded */
      iLBC_Enc_Inst_t *iLBCenc_inst   /* (i/o) Encoder instance */
   ){
      int i;

      memset((*iLBCenc_inst).anaMem, 0,
         ILBCFLOAT_GIPS_FILTERORDER*sizeof(float));
      for (i=0; i1) {
         printf("\nERROR - Wrong mode - 0, 1 allowed\n"); exit(3);}

      /* do actual decoding of block */

      iLBC_decode(decblock, (unsigned char *)encoded_data,
         iLBCdec_inst, mode);

      /* convert to short */

      for(k=0;kMAX_SAMPLE) dtmp=MAX_SAMPLE;
         decoded_data[k] = (short) dtmp;
      }

      return (short)ILBCFLOAT_GIPS_BLOCKL;
   }

   /*----------------------------------------------------------------*
    * Main program to test iLBC encoding and decoding
    *
    * Usage:
    * exefile_name.exe
    *
    *---------------------------------------------------------------*/

   int main(int argc, char* argv[])
   {

      /* Runtime statistics */

      float starttime;
      float runtime;
      float outtime;

      FILE *ifileid,*efileid,*ofileid;
      short encoded_data[ILBCNOOFWORDS], data[ILBCFLOAT_GIPS_BLOCKL];
      int blockcount = 0;

      iLBC_Enc_Inst_t Enc_Inst;
      iLBC_Dec_Inst_t Dec_Inst;

      /* get arguments and open files */

      if(argc != 4 ){
         fprintf(stderr, "%s inputfile channelfile outputfile\n",
            argv[0]);
         exit(1);}
      if( (ifileid=fopen(argv[1],"rb")) == NULL){
         fprintf(stderr,"Cannot open input file %s\n", argv[1]);
         exit(2);}
       if( (efileid=fopen(argv[2],"wb")) == NULL){
           fprintf(stderr, "Cannot open channel file %s\n",
               argv[2]);
           exit(3);}
       if( (ofileid=fopen(argv[3],"wb")) == NULL){
           fprintf(stderr, "Cannot open output file %s\n",
               argv[3]);
           exit(3);}

       /* print info */

       fprintf(stderr, "\n");
       fprintf(stderr,
           "*---------------------------------------------------*\n");
       fprintf(stderr, "* *\n");
       fprintf(stderr, "* ilbclibtest *\n");
       fprintf(stderr, "* *\n");
       fprintf(stderr, "* *\n");
       fprintf(stderr,
           "*---------------------------------------------------*\n");
       fprintf(stderr, "\nInput file : %s\n", argv[1]);
       fprintf(stderr,"Channel file : %s\n", argv[2]);
       fprintf(stderr,"Output file : %s\n\n", argv[3]);

       /* Initialization */

       initEncode(&Enc_Inst);
       initDecode(&Dec_Inst);

       /* Runtime statistics */

       starttime=clock()/(float)CLOCKS_PER_SEC;

       /* loop over input blocks */

       while( fread(data,sizeof(short),
           ILBCFLOAT_GIPS_BLOCKL,ifileid)==ILBCFLOAT_GIPS_BLOCKL){

           blockcount++;

           /* encoding */

           fprintf(stderr, "--- Encoding block %i --- ",blockcount);
           encode(&Enc_Inst, encoded_data, data);
           fprintf(stderr, "\r");

           /* write byte file */

           fwrite(encoded_data,sizeof(short),ILBCNOOFWORDS,efileid);

           /* decoding */
           fprintf(stderr, "--- Decoding block %i --- ",blockcount);
           decode(&Dec_Inst, data, encoded_data, 1);
           fprintf(stderr, "\r");

           /* write output file */

           fwrite(data,sizeof(short),ILBCFLOAT_GIPS_BLOCKL,ofileid);
       }

       /* Runtime statistics */

       runtime = (float)(clock()/(float)CLOCKS_PER_SEC-starttime);
       outtime = (float)((float)blockcount*
           (float)TIME_PER_FRAME/1000.0);
       printf("\nLength of speech file: %.1f s\n", outtime);
       printf("Time to run iLBC_encode+iLBC_decode:");
       printf(" %.1f s (%.1f %% of realtime)\n", runtime,
           (100*runtime/outtime));

       /* close files */

       fclose(ifileid);
       fclose(efileid);
       fclose(ofileid);
   }

A.2 iLBC_encode.h

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       iLBC_encode.h

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

   ******************************************************************/

   #ifndef __iLBC_ILBCENCODE_H
   #define __iLBC_ILBCENCODE_H

   #include "iLBC_define.h"

   void iLBC_encode(
       unsigned char *bytes,   /* (o) encoded data bits iLBC */
       float *block,           /* (i) speech vector to encode */
       iLBC_Enc_Inst_t *iLBCenc_inst
                               /* (i/o) the general encoder state */
   );

   #endif

A.3 iLBC_encode.c

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       iLBC_encode.c

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

******************************************************************/ #include #include #include "iLBC_define.h" #include "LPCencode.h" #include "FrameClassify.h" #include "StateSearchW.h" #include "StateConstructW.h" #include "helpfun.h" #include "constants.h" #include "packing.h" #include "iCBSearch.h" #include "iCBConstruct.h" #include "hpInput.h" #include "anaFilter.h" #include "syntFilter.h" #include "gaincorr_Encode.h" #include /*----------------------------------------------------------------* * main encoder function *---------------------------------------------------------------*/ void iLBC_encode( unsigned char *bytes, /* (o) encoded data bits iLBC */ float *block, /* (o) speech vector to encode */ iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the general encoder state */ ){ float data[BLOCKL]; float residual[BLOCKL], reverseResidual[BLOCKL]; int start, idxForMax, idxVec[STATE_LEN]; float reverseDecresidual[BLOCKL], mem[MEML]; int n, k, kk, meml_gotten, Nfor, Nback, i; Andersen et. al. 
Experimental - Expires August 20th, 2002 40 Internet Low Bit Rate Codec February 2002 int dummy=0; int gain_index[NSTAGES*NASUB], extra_gain_index[NSTAGES]; int cb_index[NSTAGES*NASUB],extra_cb_index[NSTAGES] ; int lsf_i[LSF_NSPLIT*LPC_N]; unsigned char *pbytes; int diff, start_pos, state_first; float en1, en2; int index, gc_index; int subcount, subframe; float gainadjusttarget[BLOCKL]; float weightState[FILTERORDER]; float syntdenum[NSUB*(FILTERORDER+1)]; float weightnum[NSUB*(FILTERORDER+1)]; float weightdenum[NSUB*(FILTERORDER+1)]; float decresidual[BLOCKL]; /* high pass filtering of input signal if such is not done prior to calling this function */ //hpInput(block, BLOCKL, data, (*iLBCenc_inst).hpimem); /* otherwise simply copy */ memcpy(data,block,BLOCKL*sizeof(float)); /* LPC of hp filtered input data */ LPCencode(syntdenum, weightnum, weightdenum, lsf_i, data, iLBCenc_inst); /* inverse filter to get residual */ for (n=0; n en2) { state_first = 1; start_pos = (start-1)*SUBL; } else { state_first = 0; start_pos = (start-1)*SUBL + diff; } /* scalar quantization of state */ StateSearchW(&residual[start_pos], &syntdenum[(start-1)*(FILTERORDER+1)], &weightnum[(start-1)*(FILTERORDER+1)], &weightdenum[(start-1)*(FILTERORDER+1)], &idxForMax, idxVec, STATE_SHORT_LEN); StateConstructW(idxForMax, idxVec, &syntdenum[(start-1)*(FILTERORDER+1)], &decresidual[start_pos], STATE_SHORT_LEN); /* predictive quantization in state */ if (state_first) { /* put adaptive part in the end */ /* setup memory */ memset(mem, 0, (MEML-STATE_SHORT_LEN)*sizeof(float)); memcpy(mem+MEML-STATE_SHORT_LEN, decresidual+start_pos, STATE_SHORT_LEN*sizeof(float)); memset(weightState, 0, FILTERORDER*sizeof(float)); /* encode subframes */ iCBSearch(extra_cb_index, extra_gain_index, &residual[start_pos+STATE_SHORT_LEN], mem+MEML-stMemL, stMemL, diff, NSTAGES, &weightdenum[(start-1)*(FILTERORDER+1)], weightState); /* construct decoded vector */ iCBConstruct(&decresidual[start_pos+STATE_SHORT_LEN], 
extra_cb_index, extra_gain_index, mem+MEML-stMemL, stMemL, diff, NSTAGES); } else { /* put adaptive part in the beginning */ /* create reversed vectors for prediction */ for(k=0; k 0 ){ /* setup memory */ memset(mem, 0, (MEML-STATE_LEN)*sizeof(float)); memcpy(mem+MEML-STATE_LEN, decresidual+(start-1)*SUBL, STATE_LEN*sizeof(float)); memset(weightState, 0, FILTERORDER*sizeof(float)); /* loop over subframes to encode */ for (subframe=0; subframe 0 ){ /* create reverse order vectors */ for( n=0; n MEML ){ meml_gotten=MEML; } for( k=0; k #include #include #include "iLBC_define.h" #include "StateConstructW.h" #include "LPCdecode.h" #include "iCBConstruct.h" #include "doCPLC.h" #include "helpfun.h" #include "constants.h" #include "packing.h" #include "string.h" #include "enhancer.h" #include "hpOutput.h" #include "syntFilter.h" /*----------------------------------------------------------------* * frame residual decoder function (subrutine to iLBC_decode) *---------------------------------------------------------------*/ void Decode( float *decresidual, /* (o) decoded residual frame */ int start, /* (i) location of start state */ int idxForMax, /* (i) codebook index for the maximum value */ int *idxVec, /* (i) codebook indexes for the samples in the start state*/ float *syntdenum, /* (i) the decoded synthesis filter coefficients */ int *cb_index, /* (i) the indexes for the adaptive codebook */ int *gain_index, /* (i) the indexes for the corresponding Andersen et. al. 
Experimental - Expires August 20th, 2002 47 Internet Low Bit Rate Codec February 2002 gains */ int *extra_cb_index, /* (i) the indexes for the adaptive codebook part of start state */ int *extra_gain_index, /* (i) the indexes for the corresponding gains */ int state_first, /* (i) 1 if non adaptive part of start state comes first 0 if that part comes last */ int gc_index /* (i) the index for the gain correction factor */ ){ float reverseDecresidual[BLOCKL], mem[MEML]; int n, k, meml_gotten, Nfor, Nback, i; int diff, start_pos; int subcount, subframe; float factor; float std_decresidual, one_minus_factor_scaled; int gaussstart; diff = STATE_LEN - STATE_SHORT_LEN; if(state_first == 1) start_pos = (start-1)*SUBL; else start_pos = (start-1)*SUBL + diff; /* decode scalar part of start state */ StateConstructW(idxForMax, idxVec, &syntdenum[(start-1)*(FILTERORDER+1)], &decresidual[start_pos], STATE_SHORT_LEN); if (state_first) { /* put adaptive part in the end */ /* setup memory */ memset(mem, 0, (MEML-STATE_SHORT_LEN)*sizeof(float)); memcpy(mem+MEML-STATE_SHORT_LEN, decresidual+start_pos, STATE_SHORT_LEN*sizeof(float)); /* construct decoded vector */ iCBConstruct(&decresidual[start_pos+STATE_SHORT_LEN], extra_cb_index, extra_gain_index, mem+MEML-stMemL, stMemL, diff, NSTAGES); } else {/* put adaptive part in the beginning */ /* create reversed vectors for prediction */ for(k=0; k 0 ){ /* setup memory */ memset(mem, 0, (MEML-STATE_LEN)*sizeof(float)); memcpy(mem+MEML-STATE_LEN, decresidual+(start-1)*SUBL, STATE_LEN*sizeof(float)); /* loop over subframes to encode */ for (subframe=0; subframe 0 ){ /* create reverse order vectors */ for( n=0; n MEML ){ meml_gotten=MEML; } for( k=0; k0) { /* the data are good */ /* decode data */ pbytes=bytes; unpack( &pbytes,lsf_i+0,lsf_bits[0]); unpack( &pbytes,lsf_i+1,lsf_bits[1]); unpack( &pbytes,lsf_i+2,lsf_bits[2]); unpack( &pbytes,lsf_i+3,lsf_bits[3]); unpack( &pbytes,lsf_i+4,lsf_bits[4]); unpack( &pbytes,lsf_i+5,lsf_bits[5]); unpack( 
&pbytes,&start,start_bits); unpack( &pbytes,&idxForMax,scale_bits); for(k=0;k0; i--) mem[i] = mem[i-1]; mem[0] = *pi; po++; pi++; } } A.11 createCB.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code createCB.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. Andersen et. al. Experimental - Expires August 20th, 2002 131 Internet Low Bit Rate Codec February 2002 ******************************************************************/ #ifndef __iLBC_CREATECB_H #define __iLBC_CREATECB_H void createCB( float *cb, /* (o) Codebook */ float *invenergy, /* (o) Energy of codebook vectors inverted */ float *mem, /* (i) Buffer to create codebook from */ int lMem, /* (i) Length of buffer */ int cbveclen /* (i) Length of codevector */ ); #endif A.12 createCB.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code createCB.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include "iLBC_define.h" #include "constants.h" #include /*----------------------------------------------------------------* * Construct a codebook section and calculate inverted energy of * each codevector. *---------------------------------------------------------------*/ int createSection( /* (o) Number of vectors constructed */ float *cb, /* (o) Codebook */ float *energy, /* (o) Energy of codebook vectors */ float *mem, /* (i) Buffer to create codebook from */ int lMem, /* (i) Length of buffer */ int cbveclen /* (i) Length of codevector */ ){ int j, k, cb_index; int ilow, ihigh, ilen; float alfa, alfa1; float *pp, *ppe, *ppo, *ppi; /* index counter */ Andersen et. al. 
Experimental - Expires August 20th, 2002 132 Internet Low Bit Rate Codec February 2002 cb_index=0; ppe=energy; /* first non-interpolated vector */ k=cbveclen; *ppe=0.0; pp=mem+lMem-k; memcpy(cb+cb_index*cbveclen, pp, cbveclen*sizeof(float)); for (j=0; j 5) {ilen=5; ilow=ihigh+1-ilen;} /* no interpolation */ *ppe=0.0; pp=mem+lMem-k/2; memcpy(cb+cb_index*cbveclen, pp, ilow*sizeof(float)); for (j=0; j lMem-1) eInd=lMem-memInd; pp=mem+sInd+memInd; pp1=&cbfilters[filtno-1][sInd]; for (j=sInd;j0.0) invenergy[j]=(float)1.0/(invenergy[j]+EPS); } } A.13 doCPLC.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code doCPLC.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #ifndef __iLBC_DOLPC_H #define __iLBC_DOLPC_H void doThePLC( float *PLCresidual, /* (o) concealed residual */ float *PLClpc, /* (o) concealed LP parameters */ int PLI, /* (i) packet loss indicator 0 - no PL, 1 = PL */ float *decresidual, /* (i) decoded residual */ float *lpc, /* (i) decoded LPC (only used for no PL) */ int inlag, /* (i) pitch lag */ iLBC_Dec_Inst_t *iLBCdec_inst /* (i/o) decoder instance */ ); Andersen et. al. Experimental - Expires August 20th, 2002 135 Internet Low Bit Rate Codec February 2002 #endif A.14 doCPLC.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code doCPLC.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include #include #include "iLBC_define.h" /*----------------------------------------------------------------* * Compute cross correlation and pitch gain for pitch prediction * of last subframe at given lag. 
    *---------------------------------------------------------------*/

   void compCorr(
       float *cc,      /* (o) cross correlation coefficient */
       float *gc,      /* (o) gain */
       float *buffer,  /* (i) signal buffer */
       int lag,        /* (i) pitch lag */
       int nsub,       /* (i) number of subframes */
       int subl        /* (i) subframe length */
   ){
       int i;
       float ftmp1, ftmp2;

       ftmp1 = 0.0;
       ftmp2 = 0.0;
       for (i=0; i 0.0) {
           *cc = ftmp1*ftmp1/ftmp2;
           *gc = (float)fabs(ftmp1/ftmp2);
       }
       else {
           *cc = 0.0;
           *gc = 0.0;
       }
   }

   /*----------------------------------------------------------------*
    *  Packet loss concealment routine. Conceals a residual signal
    *  and LP parameters. If no packet loss, update state.
    *---------------------------------------------------------------*/

   void doThePLC(
       float *PLCresidual, /* (o) concealed residual */
       float *PLClpc,      /* (o) concealed LP parameters */
       int PLI,            /* (i) packet loss indicator
                                  0 - no PL, 1 = PL */
       float *decresidual, /* (i) decoded residual */
       float *lpc,         /* (i) decoded LPC (only used for no PL) */
       int inlag,          /* (i) pitch lag */
       iLBC_Dec_Inst_t *iLBCdec_inst
                           /* (i/o) decoder instance */
   ){
       int lag, randlag;
       float gain, maxcc;
       int i, pick, offset;
       float ftmp, ftmp1, randvec[BLOCKL], pitchfact;

       /* Packet Loss */

       if (PLI == 1) {

           (*iLBCdec_inst).consPLICount += 1;

           /* if previous frame not lost,
              determine pitch pred. gain */

           if ((*iLBCdec_inst).prevPLI != 1) {
               lag=inlag;
               compCorr(&maxcc, &gain,
                   (*iLBCdec_inst).prevResidual,
                   lag, NSUB, SUBL);
               if (gain > 1.0) gain = 1.0;
           }

           /* previous frame lost, use recorded lag and gain */

           else {
               lag=(*iLBCdec_inst).prevLag;
               gain=(*iLBCdec_inst).prevGain;
           }

           /* Attenuate signal and scale down pitch pred gain if
              several frames lost consecutively */

           if ((*iLBCdec_inst).consPLICount > 1)
               gain *= (float)0.9;

           /* Compute mixing factor of pitch repetition and noise */
Experimental - Expires August 20th, 2002 137 Internet Low Bit Rate Codec February 2002 if (gain > XT_MIX) pitchfact = YT_MIX; else if (gain < XB_MIX) pitchfact = YB_MIX; else pitchfact = YB_MIX + (gain - XB_MIX) * (YT_MIX - YB_MIX) / (XT_MIX - XB_MIX); /* compute concealed residual */ (*iLBCdec_inst).energy = 0.0; for (i=0; i= GAINTHRESHOLD) { /* Compute mixing factor of pitch repeatition and noise */ if (gain > XT_MIX) pitchfact = YT_MIX; else if (gain < XB_MIX) pitchfact = YB_MIX; else pitchfact = YB_MIX + (gain - XB_MIX) * (YT_MIX - YB_MIX) / (XT_MIX - XB_MIX); /* compute concealed residual for 3 subframes */ for (i=0; i<3*SUBL; i++) { (*iLBCdec_inst).seed=((*iLBCdec_inst).seed* 69069L+1) & (0x80000000L-1); randlag = 50 + ((signed long) (*iLBCdec_inst).seed)%70; /* noise component */ pick = i - randlag; if (pick < 0) randvec[i] = gain * (*iLBCdec_inst).prevResidual[BLOCKL+pick]; else randvec[i] = gain * randvec[pick]; Andersen et. al. Experimental - Expires August 20th, 2002 139 Internet Low Bit Rate Codec February 2002 /* pitch repeatition component */ pick = i - lag; if (pick < 0) PLCresidual[i] = gain * (*iLBCdec_inst).prevResidual[BLOCKL+pick]; else PLCresidual[i] = gain * PLCresidual[pick]; /* mix noise and pitch repeatition */ PLCresidual[i] = (pitchfact * PLCresidual[i] + ((float)1.0 - pitchfact) * randvec[i]); } /* interpolate concealed residual with actual residual */ offset = 3*SUBL; for (i=0; i #include #include "iLBC_define.h" Andersen et. al. 
Experimental - Expires August 20th, 2002 141 Internet Low Bit Rate Codec February 2002 #include "constants.h" /*----------------------------------------------------------------* * Find index in array such that the array element with said * index is the element of said array closest to "value" * according to the squared-error criterion *---------------------------------------------------------------*/ void nn( int *index, /* (o) index of array element closest to value */ float *array, /* (i) data array */ float value, /* (i) value */ int arlength /* (i) dimension of data array */ ){ int i; float bestcrit,crit; crit=array[0]-value; bestcrit=crit*crit; *index=0; for(i=1;i dim1){ /* printf("enh_upsample.c: shortened filter: filterlength=%d > dim1=%d\n", filterlength, dim1); */ hfl2=(int) (dim1/2); for(j=0;jENH_SLOP) slop=ENH_SLOP; e=b+blockl-1; bll0=bl-slop; if(bll0<0){ bll0=0;} bll1=bl+slop; if(bll1+blockl >= idatal){ bll1=idatal-blockl-1;} Andersen et. al. Experimental - Expires August 20th, 2002 144 Internet Low Bit Rate Codec February 2002 corrdim=bll1-bll0+1; /* compute upsampled correlation (corr33) and find location of max */ mycorr1(corr22,idata+bll0,corrdim+blockl-1,idata+b,blockl); enh_upsample(corr33,corr22,corrdim,polyphaser, ENH_FL0,ENH_UPS0); tloc=0; maxv=corr33[0]; for(i=1;imaxv){ tloc=i; maxv=corr33[i]; } } /* make vector can be upsampled without ever running outside bounds */ *bl2= (float)bll0+ (float)tloc/(float)ENH_UPS0+(float)1.0; tloc2=(int)(tloc/ENH_UPS0); if(tloc>tloc2*ENH_UPS0){tloc2++;} st=bll0+tloc2-ENH_FL0; vectl=blockl+2*ENH_FL0; if(st<0){ for(i=0;i<-st;i++){ vect[i]=0.0;} for(i=-st;iidatal){ for(i=0;i alpha0 * w00){ if( w00 < 1) w00=1; denom = (w11*w00-w10*w10)/(w00*w00); if( denom > 0.0001){ /* eliminates numerical problems for if smooth */ A = (float)sqrt( (alpha0- alpha0*alpha0/4)/denom); B = -alpha0/2 - A * w10/w00; B = B+1; } else{ /* essentially no difference between cycles; smoothing not needed */ A= 0.0; B= 1.0; } /* create smoothed 
sequence */ psseq=sseq+hl*blockl; for(i=0;i=0;q--){ bbb[q]=bbb[q+1]-period[ppl[q+1]]; nn(ppl+q,plocs,bbb[q]+hblockl-period[ppl[q+1]],periodl); if(bbb[q]-ENH_OVERHANG>=0) refiner(sseq+q*blockl,bbb+q,idata,idatal,b,bbb[q], blockl,period[ppl[q+1]]); else{ psseq=sseq+q*blockl; for(i=0;i 0.0) { return (float)(ftmp1*ftmp1/ftmp2); } else { return (float)0.0; } } /*----------------------------------------------------------------* * interface for enhancer *---------------------------------------------------------------*/ int enhancerInterface( float *out, /* (o) enhanced signal */ float *in, /* (i) unenhanced signal */ iLBC_Dec_Inst_t *iLBCdec_inst /* (i) buffers etc */ ){ float *enh_buf, *enh_period; float dummy[2]; int iblock, isample; int lag, ilag; float cc, maxcc; enh_buf=(*iLBCdec_inst).enh_buf; enh_period=(*iLBCdec_inst).enh_period; for(isample = 0; isample maxcc) { maxcc = cc; lag = ilag; } } enh_period[iblock+ENH_NBLOCKS_EXTRA] = (float)lag; } for(iblock = 0; iblock max_ssq){ max_ssq = ssq[n]; max_ssq_n = n; } } /* calculate return index */ if( max_ssq_n == 0) return 1; if( max_ssq_n == (NSUB-1) ) return NSUB-1; if( ssq[max_ssq_n-1] > ssq[max_ssq_n+1] ) return max_ssq_n; return max_ssq_n+1; } Andersen et. al. Experimental - Expires August 20th, 2002 154 Internet Low Bit Rate Codec February 2002 A.21 gaincorr_Encode.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code gaincorr_Encode.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. 
   ******************************************************************/

   #ifndef __iLBC_GAINCORR_ENCODE_H
   #define __iLBC_GAINCORR_ENCODE_H

   int gaincorr_Encode(    /* (o) index to quantized gain correction
                                  factor */
       float *decresidual, /* (i) the decoded residual vector without
                                  gain correction */
       int start_pos,      /* (i) the position of the start state in
                                  the residual vector */
       float *residual     /* (i) the target residual vector */
   );

   #endif

A.22 gaincorr_Encode.c

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       gaincorr_Encode.c

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

   ******************************************************************/

   #include "iLBC_define.h"
   #include "constants.h"
   #include <math.h>   /* sqrt() is used in the function body */

   int gaincorr_Encode(    /* (o) index to quantized gain correction
                                  factor */
       float *decresidual, /* (i) the decoded residual vector without
                                  gain correction */
       int start_pos,      /* (i) the position of the start state in
                                  the residual vector */
Experimental - Expires August 20th, 2002 155 Internet Low Bit Rate Codec February 2002 float *residual /* (i) the target residual vector */ ){ float state_energy = 0.0, dec_state_energy = 0.0, residual_energy = 0.0, dec_residual_energy = 0.0; int i, k, index; float state_loss_factor, residual_loss_factor, correction_factor; float factor, std_decresidual, one_minus_factor_scaled; int gaussstart; /* calculation of state energies */ for(k=0; k 1) correction_factor = 1; index=(int)(correction_factor*16)-1; if (index<0) index=0; if (sqrt(residual_energy/state_energy)<0.25) index=15; factor=(float)(index+1)/(float)16.0; for(k=0;k #include #include #include "constants.h" #include "filter.h" /*----------------------------------------------------------------* * quantizer for the gain in the gain-shape coding of residual *---------------------------------------------------------------*/ float gainquant( /* (o) quantized gain value */ float in, /* (i) gain value */ float maxIn, /* (i) maximum of gain value */ int cblen, /* (i) number of quantization indices */ int *index /* (o) quantization index */ ){ int i, tindex; float minmeasure,measure, *cb, scale; /* ensure a lower bound on the scaling factor */ scale=maxIn; if (scale<0.1) scale=(float)0.1; /* select the quantization table */ if (cblen == 8) cb = gain_sq3; else cb = gain_sq4; /* select the best index in the quantization table */ minmeasure=10000.0; for (i=0;i /*----------------------------------------------------------------* * Construct codebook vector for given index. 
*---------------------------------------------------------------*/ void getCBvec( float *cbvec, /* (o) Constructed codebook vector */ float *mem, /* (i) Codebook buffer */ int index, /* (i) Codebook index */ int lMem, /* (i) Length of codebook buffer */ int cbveclen /* (i) Codebook vector length */ ){ int j, k, n, filtno, memInd, sInd, eInd, sFilt, eFilt; float accum, tmpbuf[MEML]; int base_size; int ilow, ihigh, ilen; float alfa, alfa1; /* Determine size of codebook sections */ base_size=lMem-cbveclen+1; if (cbveclen==SUBL) base_size+=cbveclen/2; /* No filter -> First codebook section */ if (index < base_size) { Andersen et. al. Experimental - Expires August 20th, 2002 160 Internet Low Bit Rate Codec February 2002 /* first non-interpolated vectors */ if (index 5) {ilen=5; ilow=ihigh+1-ilen;} /* no interpolation */ memcpy(cbvec, mem+lMem-k/2, ilow*sizeof(float)); /* interpolation */ alfa1=(float)1.0/(float)ilen; alfa=0.0; for (j=ilow; j<=ihigh; j++) { cbvec[j]=((float)1.0-alfa)*mem[lMem-k/2+j]+ alfa*mem[lMem-k+j]; alfa+=alfa1; } /* no interpolation */ memcpy(cbvec+ihigh+1, mem+lMem-k+ihigh+1, (cbveclen-1-ihigh)*sizeof(float)); } } /* Higher codebbok sections based on filtering */ else { /* filter number (i.e. section number) */ filtno=index/base_size; /* first non-interpolated vectors */ if (index-filtno*base_size lMem-1) eInd=lMem-memInd; for (j=sInd;j lMem-1) eInd=lMem-memInd; for (j=sInd;j 5) {ilen=5; ilow=ihigh+1-ilen;} Andersen et. al. 
               /* no interpolation */

               memcpy(cbvec, tmpbuf+lMem-k/2, ilow*sizeof(float));

               /* interpolation */

               alfa1=(float)1.0/(float)ilen;
               alfa=0.0;
               for (j=ilow; j<=ihigh; j++) {
                   cbvec[j]=((float)1.0-alfa)*
                       tmpbuf[lMem-k/2+j]+alfa*tmpbuf[lMem-k+j];
                   alfa+=alfa1;
               }

               /* no interpolation */

               memcpy(cbvec+ihigh+1, tmpbuf+lMem-k+ihigh+1,
                   (cbveclen-1-ihigh)*sizeof(float));
           }
       }
   }

A.27 helpfun.h

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       helpfun.h

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

   ******************************************************************/

   #ifndef __iLBC_HELPFUN_H
   #define __iLBC_HELPFUN_H

   void autocorr(
       float *r,       /* (o) autocorrelation vector */
       const float *x, /* (i) data vector */
       int N,          /* (i) length of data vector */
       int order       /* (i) largest lag for calculated
                              autocorrelations */
   );

   void window(
       float *z,       /* (o) the windowed data */
       const float *x, /* (i) the original data vector */
       const float *y, /* (i) the window */
       int N           /* (i) length of all vectors */
   );

   void levdurb(
       float *a,       /* (o) lpc coefficient vector starting
                              with 1.0 */
       float *k,       /* (o) reflection coefficients */
       float *r,       /* (i) autocorrelation vector */
       int order       /* (i) order of lpc filter */
   );

   void interpolate(
       float *out,     /* (o) the interpolated vector */
       float *in1,     /* (i) the first vector for the
                              interpolation */
       float *in2,     /* (i) the second vector for the
                              interpolation */
       float coef,     /* (i) interpolation weights */
       int length      /* (i) length of all vectors */
   );

   void bwexpand(
       float *out,     /* (o) the bandwidth expanded lpc
                              coefficients */
       float *in,      /* (i) the lpc coefficients before bandwidth
                              expansion */
       float coef,     /* (i) the bandwidth expansion factor */
       int length      /* (i) the length of lpc coefficient vectors */
   );

   void vq(
       float *Xq,      /* (o) the quantized vector */
       int *index,     /* (o) the quantization index */
       const float *CB,/* (i) the vector quantization codebook */
       float *X,       /* (i) the vector to quantize */
       int n_cb,       /* (i) the number of vectors in the codebook */
       int dim         /* (i) the dimension of all vectors */
   );

   void gvq(
       float *Xq,      /* (o) the quantized vector */
       int *index,     /* (o) the quantization index */
       float *CB,      /* (i) the shape codebook */
       float *X,       /* (i) the vector to quantize */
       int n_cb,       /* (i) the number of vectors in the shape
                              codebook */
       int dim,        /* (i) dimension of all vectors */
       float in_ene,   /* (i) the energy of the input vector */
       float factor,   /* (o) resulting gain factor */
       int targlen     /* (i) dimension of all vectors */
   );

   void SplitVQ(
       float *qX,      /* (o) the quantized vector */
       int *index,     /* (o) a vector of indexes for all vector
                              codebooks in the split */
       float *X,       /* (i) the vector to quantize */
       const float *CB,/* (i) the quantizer codebook */
       int nsplit,     /* (i) the number of vector splits */
       const int *dim, /* (i) the dimension of X and qX */
       const int *cbsize /* (i) the number of vectors in the
                              codebook */
   );

   void sort_sq(
       float *xq,      /* (o) the quantized value */
       int *index,     /* (o) the quantization index */
       float x,        /* (i) the value to quantize */
       const float *cb,/* (i) the quantization codebook */
       int cb_size     /* (i) the size of the quantization codebook */
   );

   int LSF_check(
       float *lsf,     /* (i) a table of lsf vectors */
       int dim,        /* (i) the dimension of each lsf vector */
       int NoAn        /* (i) the number of lsf vectors in the
                              table */
   );

   #endif

A.28 helpfun.c

   /******************************************************************

       iLBC Speech Coder ANSI-C Source Code

       helpfun.c

       Copyright (c) 2001, Global IP Sound AB.
       All rights reserved.

   ******************************************************************/

   #include
   #include "iLBC_define.h"
   #include "constants.h"

   /*----------------------------------------------------------------*
    *  calculation of auto correlation
    *---------------------------------------------------------------*/

   void autocorr(
       float *r,       /* (o) autocorrelation vector */
       const float *x, /* (i) data vector */
       int N,          /* (i) length of data vector */
       int order       /* (i) largest lag for calculated
                              autocorrelations */
   ){
       int     lag, n;
       float   sum;

       for (lag = 0; lag <= order; lag++) {
           sum = 0;
           for (n = 0; n < N - lag; n++)
               sum += x[n] * x[n+lag];
           r[lag] = sum;
       }
   }

   /*----------------------------------------------------------------*
    *  window multiplication
    *---------------------------------------------------------------*/

   void window(
       float *z,       /* (o) the windowed data */
       const float *x, /* (i) the original data vector */
       const float *y, /* (i) the window */
       int N           /* (i) length of all vectors */
   ){
       int     i;

       for (i = 0; i < N; i++) {
           z[i] = x[i] * y[i];
       }
   }

   /*----------------------------------------------------------------*
    *  levinson-durbin solution for lpc coefficients
    *---------------------------------------------------------------*/

   void levdurb(
       float *a,       /* (o) lpc coefficient vector starting
                              with 1.0 */
       float *k,       /* (o) reflection coefficients */
       float *r,       /* (i) autocorrelation vector */
       int order       /* (i) order of lpc filter */
   ){
       float  sum, alpha;
       int     m, m_h, i;

       a[0] = 1.0;

       if (r[0] < EPS) { /* if r[0] < EPS, set LPC coeff. to zero */
           for (i = 0; i < order; i++) {
               k[i] = 0;
               a[i+1] = 0;
           }
       } else {
           a[1] = k[0] = -r[1]/r[0];
           alpha = r[0] + r[1] * k[0];
           for (m = 1; m < order; m++){
               sum = r[m + 1];
               for (i = 0; i < m; i++){
                   sum += a[i+1] * r[m - i];
               }
               k[m] = -sum / alpha;
               alpha += k[m] * sum;
               m_h = (m + 1) >> 1;
               for (i = 0; i < m_h; i++){
                   sum = a[i+1] + k[m] * a[m - i];
                   a[m - i] += k[m] * a[i+1];
                   a[i+1] = sum;
               }
               a[m+1] = k[m];
           }
       }
   }

   /*----------------------------------------------------------------*
    *  interpolation between vectors
    *---------------------------------------------------------------*/

   void interpolate(
       float *out,     /* (o) the interpolated vector */
       float *in1,     /* (i) the first vector for the
                              interpolation */
       float *in2,     /* (i) the second vector for the
                              interpolation */
       float coef,     /* (i) interpolation weights */
       int length      /* (i) length of all vectors */
   ){
       int i;
       float invcoef;

       invcoef = (float)1.0 - coef;
       for (i = 0; i < length; i++)
           out[i] = coef * in1[i] + invcoef * in2[i];
   }

   /*----------------------------------------------------------------*
    *  lpc bandwidth expansion
    *---------------------------------------------------------------*/

   void bwexpand(
       float *out,     /* (o) the bandwidth expanded lpc
                              coefficients */
       float *in,      /* (i) the lpc coefficients before bandwidth
                              expansion */
       float coef,     /* (i) the bandwidth expansion factor */
       int length      /* (i) the length of lpc coefficient vectors */
   ){
       int i;
       float chirp;

       chirp = coef;

       out[0] = in[0];
       for (i = 1; i < length; i++) {
           out[i] = chirp * in[i];
           chirp *= coef;
       }
   }

   /*----------------------------------------------------------------*
    *  vector quantization
    *---------------------------------------------------------------*/

   void vq(
       float *Xq,      /* (o) the quantized vector */
       int *index,     /* (o) the quantization index */
       const float *CB,/* (i) the vector quantization codebook */
       float *X,       /* (i) the vector to quantize */
       int n_cb,       /* (i) the number of vectors in the codebook */
       int dim         /* (i) the dimension of all vectors */
   ){
       int     i, j;
       int     pos, minindex;
       float   dist, tmp, mindist;

       pos = 0;
       minindex = 0;   /* defined even if no codevector is selected */
       mindist = FLOAT_MAX;
       for (j = 0; j < n_cb; j++) {
           dist = X[0] - CB[pos];
           dist *= dist;
           for (i = 1; i < dim; i++) {
               tmp = X[i] - CB[pos + i];
               dist += tmp*tmp;

               /* stop early once this codevector cannot win */
               if (dist >= mindist) goto next;
           }

           if (dist < mindist) {
               mindist = dist;
               minindex = j;
           }
   next:
           pos += dim;
       }
       for (i = 0; i < dim; i++) {
           Xq[i] = CB[minindex*dim + i];
       }
       *index = minindex;
   }

   /*----------------------------------------------------------------*
    *  split vector quantization
    *---------------------------------------------------------------*/

   void SplitVQ(
       float *qX,      /* (o) the quantized vector */
       int *index,     /* (o) a vector of indexes for all vector
                              codebooks in the split */
       float *X,       /* (i) the vector to quantize */
       const float *CB,/* (i) the quantizer codebook */
       int nsplit,     /* (i) the number of vector splits */
       const int *dim, /* (i) the dimension of X and qX */
Experimental - Expires August 20th, 2002 168 Internet Low Bit Rate Codec February 2002 const int *cbsize /* the number of vectors in the codebook */ ){ int cb_pos, X_pos, i; cb_pos = 0; X_pos= 0; for (i = 0; i < nsplit; i++) { vq(qX + X_pos, index + i, CB + cb_pos, X + X_pos, cbsize[i], dim[i]); X_pos += dim[i]; cb_pos += dim[i] * cbsize[i]; } } /*----------------------------------------------------------------* * scalar quantization *---------------------------------------------------------------*/ void sort_sq( float *xq, /* (o) the quantized value */ int *index, /* (o) the quantization index */ float x, /* (i) the value to quantize */ const float *cb, /* (i) the quantization codebook */ int cb_size /* (i) the size of the quantization codebook */ ){ int i; if (x <= cb[0]) { *index = 0; *xq = cb[0]; } else { i = 0; while ((x > cb[i]) && i < cb_size - 1) i++; if (x > ((cb[i] + cb[i - 1])/2)) { *index = i; *xq = cb[i]; } else { *index = i - 1; *xq = cb[i - 1]; } } } /*----------------------------------------------------------------* * check for stability of lsf coefficients *---------------------------------------------------------------*/ int LSF_check( /* (o) 1 for stable lsf vectors and 0 for nonstable ones */ float *lsf, /* (i) a table of lsf vectors */ int dim, /* (i) the dimension of each lsf vector */ int NoAn /* (i) the number of lsf vectors in the table */ Andersen et. al. Experimental - Expires August 20th, 2002 169 Internet Low Bit Rate Codec February 2002 ){ int k,n,m, Nit=2, change=0,pos; float tmp; static float eps=(float)0.039; /* 50 Hz */ static float eps2=(float)0.0195; static float maxlsf=(float)3.14; /* 4000 Hz */ static float minlsf=(float)0.01; /* 0 Hz */ /* LSF separation check*/ for (n=0;nmaxlsf) { lsf[pos]=maxlsf; change=1; } } } } return change; } A.29 hpInput.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code hpInput.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. 
Andersen et. al. Experimental - Expires August 20th, 2002 170 Internet Low Bit Rate Codec February 2002 ******************************************************************/ #ifndef __iLBC_HPINPUT_H #define __iLBC_HPINPUT_H void hpInput( float *In, /* (i) vector to filter */ int len, /* (i) length of vector to filter */ float *Out, /* (o) the resulting filtered vector */ float *mem /* (i/o) the filter state */ ); #endif A.30 hpInput.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code hpInput.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include "constants.h" /*----------------------------------------------------------------* * Input high-pass filter *---------------------------------------------------------------*/ void hpInput( float *In, /* (i) vector to filter */ int len, /* (i) length of vector to filter */ float *Out, /* (o) the resulting filtered vector */ float *mem /* (i/o) the filter state */ ){ int i; float *pi, *po; /* all-zero section*/ pi = &In[0]; po = &Out[0]; for (i=0; i #include "iLBC_define.h" #include "gainquant.h" #include "getCBvec.h" /*----------------------------------------------------------------* * Construct decoded vector from codebook and gains. Andersen et. al. 
Experimental - Expires August 20th, 2002 174 Internet Low Bit Rate Codec February 2002 *---------------------------------------------------------------*/ void iCBConstruct( float *decvector, /* (o) Decoded vector */ int *index, /* (i) Codebook indices */ int *gain_index, /* (i) Gain de-quantization indices */ float *mem, /* (i) Buffer for codevector construction */ int lMem, /* (i) Length of buffer */ int veclen, /* (i) Length of vector */ int nStages /* (i) Number of codebook stages */ ){ int j,k; float gain[NSTAGES]; float cbvec[SUBL]; /* gain de-quantization */ gain[0] = gaindequant(gain_index[0], 1.0, 16); if (nStages > 1) gain[1] = gaindequant(gain_index[1], (float)fabs(gain[0]), 8); if (nStages > 2) gain[2] = gaindequant(gain_index[2], (float)fabs(gain[1]), 8); /* codebook vector construction and construction of total vector */ getCBvec(cbvec, mem, index[0], lMem, veclen); for (j=0;j 1) { for (k=1; k #include #include "iLBC_define.h" #include "gainquant.h" #include "createCB.h" #include "filter.h" /*----------------------------------------------------------------* * Search routine for codebook encoding and gain quantization. *---------------------------------------------------------------*/ void iCBSearch( int *index, /* (o) Codebook indices */ int *gain_index, /* (o) Gain quantization indices */ float *intarget, /* (i) Target vector for encoding */ float *mem, /* (i) Buffer for codebook construction */ Andersen et. al. 
Experimental - Expires August 20th, 2002 176 Internet Low Bit Rate Codec February 2002 int lMem, /* (i) Length of buffer */ int lTarget, /* (i) Length of vector */ int nStages, /* (i) Number of codebook stages */ float *weightDenum, /* (i) weighting filter coefficients */ float *weightState /* (i) weighting filter state */ ){ int i, j, icount, stage, best_index; float max_measure, gain, measure, crossDot; float gains[NSTAGES]; float cb[(MEML+SUBL+1)*CBEXPAND*SUBL]; float target[SUBL]; int base_index, sInd, eInd, base_size; float buf[MEML+SUBL+2*FILTERORDER]; float invenergy[512], *pp; /* copy target */ memcpy(target, intarget, lTarget*sizeof(float)); /* Determine size of codebook sections */ base_size=lMem-lTarget+1; if (lTarget==SUBL) base_size=lMem-lTarget+1+lTarget/2; /* setup buffer for weighting */ memcpy(buf,weightState,sizeof(float)*FILTERORDER); memcpy(buf+FILTERORDER,mem,lMem*sizeof(float)); memcpy(buf+FILTERORDER+lMem,intarget,lTarget*sizeof(float)); /* weighting */ AllPoleFilter(buf+FILTERORDER, weightDenum, lMem+lTarget, FILTERORDER); /* Construct the codebook and target needed */ createCB(cb, invenergy, buf+FILTERORDER, lMem, lTarget); memcpy(target, buf+FILTERORDER+lMem, lTarget*sizeof(float)); /* The Main Loop over stages */ for (stage=0;stage 0.0) measure = crossDot*crossDot*invenergy[icount]; } else { measure = crossDot*crossDot*invenergy[icount]; } /* check if measure better */ if(measure>max_measure){ best_index = icount; max_measure = measure; gain = crossDot*invenergy[icount]; } } /* set search range for following codebook sections */ base_index=best_index; /* unrestricted search */ if (RESRANGE == -1) { sInd=0; eInd=base_size-1; } /* restriced search around best index from first codebook section */ else { sInd=base_index-RESRANGE/2; if (sInd < 0) sInd=0; eInd = sInd+RESRANGE; if (eInd>=base_size) { eInd=base_size-1; sInd=eInd-RESRANGE; } } /* search of higher codebook sections */ for (i=1; i 0.0) measure = crossDot*crossDot* invenergy[icount]; 
} else { measure = crossDot*crossDot*invenergy[icount]; } /* check if measure better */ if(measure>max_measure){ best_index = icount; max_measure = measure; gain = crossDot*invenergy[icount]; } } } /* record best index */ index[stage] = best_index; /* gain quantization */ if(stage==0){ if (gain<0.0) gain = 0.0; if (gain>1.0) gain = 1.0; gain = gainquant(gain, 1.0, 16, &gain_index[stage]); } else { if(fabs(gain) > fabs(gains[stage-1])){ gain = gain * (float)fabs(gains[stage-1])/ (float)fabs(gain); } Andersen et. al. Experimental - Expires August 20th, 2002 179 Internet Low Bit Rate Codec February 2002 gain = gainquant(gain, (float)fabs(gains[stage-1]), 8, &gain_index[stage]); } /* Update target */ for(j=0;j #include #include "helpfun.h" #include "lsf.h" #include "iLBC_define.h" #include "constants.h" /*----------------------------------------------------------------* * interpolation of lsf coefficients for the decoder *---------------------------------------------------------------*/ void LSFinterpolate2a_dec( float *a, /* (o) lpc coefficients for a sub frame */ float *lsf1, /* (i) first lsf coefficient vector */ float *lsf2, /* (i) second lsf coefficient vector */ float coef, /* (i) interpolation weight */ int length /* (i) length of lsf vectors */ ){ float lsftmp[FILTERORDER]; interpolate(lsftmp, lsf1, lsf2, coef, length); lsf2a(a, lsftmp); } /*----------------------------------------------------------------* * obtain quantized lsf coefficients from quantization index *---------------------------------------------------------------*/ void SimplelsfUNQ( float *lsfunq, /* (o) quantized lsf coefficients */ Andersen et. al. 
    int *index      /* (i) quantization index */
){
    int i,j, pos, cb_pos;
    float lsfhat[FILTERORDER];

    /* decode last LSF */

    pos = 0;
    cb_pos = 0;
    for (i = 0; i < LSF_NSPLIT; i++) {
        for (j = 0; j < dim_ml[i]; j++) {
            lsfunq[FILTERORDER + pos + j] = cb_ml[cb_pos +
                (long)(index[LSF_NSPLIT + i])*dim_ml[i] + j];
        }
        pos += dim_ml[i];
        cb_pos += size_ml[i]*dim_ml[i];
    }

    /* decode prediction error for first LSF */

    pos = 0;
    cb_pos = 0;
    for (i = 0; i < LSF_NSPLIT; i++) {
        for (j = 0; j < dim_p[i]; j++) {
            lsfunq[pos + j] = cb_p[cb_pos +
                (long)(index[i])*dim_p[i] + j];
        }
        pos += dim_p[i];
        cb_pos += size_p[i]*dim_p[i];
    }

    /* add prediction, mean, and dequantized prediction error
       to obtain output LSF */

    for (i = 0; i < FILTERORDER; i++) {
        lsfhat[i] = lsfpred[i] *
            (lsfunq[FILTERORDER + i] - lsfmean[i]);
        lsfunq[i] += lsfmean[i] + lsfhat[i];
    }
}

/*----------------------------------------------------------------*
 *  obtain synthesis and weighting filters from lsf coefficients
 *---------------------------------------------------------------*/

void DecoderInterpolateLSF(
    float *syntdenum,   /* (o) synthesis filter coefficients */
    float *weightnum,   /* (o) weighting numerator coefficients */
    float *weightdenum, /* (o) weighting denominator coefficients */
    float *lsfunq,      /* (i) quantized lsf coefficients */
    int length,         /* (i) length of lsf coefficient vector */
Experimental - Expires August 20th, 2002 182 Internet Low Bit Rate Codec February 2002 iLBC_Dec_Inst_t *iLBCdec_inst /* (i) the decoder state structure */ ){ int i, pos, lp_length; float lp[FILTERORDER + 1], *lsfunq2; lsfunq2 = lsfunq + length; lp_length = length + 1; /* subframe 1: Interpolation between old and first */ LSFinterpolate2a_dec(lp, (*iLBCdec_inst).lsfunqold, lsfunq, coef[0], length); bwexpand(syntdenum, lp, CHIRP_SYNTDENUM, lp_length); bwexpand(weightnum, lp, CHIRP_WEIGHTNUM, lp_length); bwexpand(weightdenum, lp, CHIRP_WEIGHTDENUM, lp_length); /* subframes 2 to 6: interpolation between first and last LSF */ pos = lp_length; for (i = 1; i < 6; i++) { LSFinterpolate2a_dec(lp, lsfunq, lsfunq2, coef[i], length); bwexpand(syntdenum+pos, lp, CHIRP_SYNTDENUM, lp_length); bwexpand(weightnum + pos, lp, CHIRP_WEIGHTNUM, lp_length); bwexpand(weightdenum + pos, lp, CHIRP_WEIGHTDENUM, lp_length); pos += lp_length; } /* update memory */ for (i = 0; i < length; i++) { (*iLBCdec_inst).lsfunqold[i] = lsfunq2[i]; } } A.39 LPCencode.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code LPCencode.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ Andersen et. al. 
Experimental - Expires August 20th, 2002 183 Internet Low Bit Rate Codec February 2002 #ifndef __iLBC_LPCENCOD_H #define __iLBC_LPCENCOD_H void LPCencode( float *syntdenum, /* (i/o) synthesis filter coefficients before/after encoding */ float *weightnum, /* (i/o) weighting numerator coefficients before/after encoding */ float *weightdenum, /* (i/o) weighting denumerator coefficients before/after encoding */ int *lsf_index, /* (o) lsf quantization index */ float *data, /* (i) lsf coefficients to quantize */ iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the encoder state structure */ ); #endif A.40 LPCencode.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code LPCencode.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include #include "iLBC_define.h" #include "helpfun.h" #include "lsf.h" #include "constants.h" /*----------------------------------------------------------------* * lpc analysis (subrutine to LPCencode) *---------------------------------------------------------------*/ void SimpleAnalysis( float *lsf, /* (o) lsf coefficients */ float *data, /* (i) new data vector */ float *lpc_buffer /* (i) buffer containing old data */ ){ int k, is,i; float temp[BLOCKL], lp[FILTERORDER + 1], lp2[FILTERORDER + 1], r[FILTERORDER + 1]; Andersen et. al. 
    for (i = 0; i < BLOCKL; i++)
        lpc_buffer[LPC_AHEADL + i] = data[i];

    /* No lookahead, last window is asymmetric */

    for (k = 0; k < LPC_N; k++) {
        is = k*LPC_AHEADL;
        if (k < (LPC_N - 1))
            window(temp, lpc_win, lpc_buffer + is, BLOCKL);
        else
            window(temp, lpc_asymwin, lpc_buffer + is, BLOCKL);
        autocorr(r, temp, BLOCKL, FILTERORDER);
        window(r, r, lpc_lagwin, FILTERORDER + 1);
        levdurb(lp, temp, r, FILTERORDER);
        bwexpand(lp2, lp, CHIRP_SYNTDENUM, FILTERORDER+1);
        a2lsf(lsf + k*FILTERORDER, lp2);
    }

    /* lpc_buffer holds floats, so the copy size is counted in
       floats */

    memcpy(lpc_buffer, lpc_buffer+BLOCKL, LPC_AHEADL*sizeof(float));
}

/*----------------------------------------------------------------*
 *  lsf interpolator and conversion from lsf to a coefficients
 *  (subroutine to SimpleInterpolateLSF)
 *---------------------------------------------------------------*/

void LSFinterpolate2a_enc(
    float *a,       /* (o) lpc coefficients */
    float *lsf1,    /* (i) first set of lsf coefficients */
    float *lsf2,    /* (i) second set of lsf coefficients */
    float coef,     /* (i) weighting coefficient to use between
                           lsf1 and lsf2 */
    long length     /* (i) length of coefficient vectors */
){
    float lsftmp[FILTERORDER];

    interpolate(lsftmp, lsf1, lsf2, coef, length);
    lsf2a(a, lsftmp);
}

/*----------------------------------------------------------------*
 *  lsf interpolator (subroutine to LPCencode)
 *---------------------------------------------------------------*/

void SimpleInterpolateLSF(
    float *syntdenum,   /* (o) the synthesis filter denominator
                               resulting from the quantized
                               interpolated lsf */
    float *weightnum,   /* (o) the weighting filter numerator
                               resulting from the unquantized
                               interpolated lsf */
    float *weightdenum, /* (o) the weighting filter denominator
                               resulting from the unquantized
                               interpolated lsf */
    float *lsf,         /* (i) the unquantized lsf coefficients */
    float *lsfq,        /* (i) the quantized lsf coefficients */
    float *lsfold,      /* (i) the unquantized lsf coefficients of
                               the previous signal frame */
    float *lsfqold,     /* (i) the quantized lsf coefficients of
                               the previous signal frame */
    int length          /* (i) should equal FILTERORDER */
){
    int i, pos, lp_length;
    float lp[FILTERORDER + 1], *lsf2, *lsfq2;

    lsf2 = lsf + length;
    lsfq2 = lsfq + length;
    lp_length = length + 1;

    /* subframe 1: Interpolation between old and first set of
       lsf coefficients */

    LSFinterpolate2a_enc(lp, lsfqold, lsfq, coef[0], length);
    bwexpand(syntdenum, lp, CHIRP_SYNTDENUM, lp_length);
    LSFinterpolate2a_enc(lp, lsfold, lsf, coef[0], length);
    bwexpand(weightnum, lp, CHIRP_WEIGHTNUM, lp_length);
    bwexpand(weightdenum, lp, CHIRP_WEIGHTDENUM, lp_length);

    /* subframe 2 to 6: Interpolation between first and second
       set of lsf coefficients */

    pos = lp_length;
    for (i = 1; i < NSUB; i++) {
        LSFinterpolate2a_enc(lp, lsfq, lsfq2, coef[i], length);
        bwexpand(syntdenum + pos, lp, CHIRP_SYNTDENUM, lp_length);
        LSFinterpolate2a_enc(lp, lsf, lsf2, coef[i], length);
        bwexpand(weightnum + pos, lp, CHIRP_WEIGHTNUM, lp_length);
        bwexpand(weightdenum + pos, lp, CHIRP_WEIGHTDENUM,
            lp_length);
        pos += lp_length;
    }

    /* update memory */

    for (i = 0; i < length; i++) {
        lsfold[i] = lsf2[i];
        lsfqold[i] = lsfq2[i];
    }
}

/*----------------------------------------------------------------*
 *  lsf quantizer (subroutine to LPCencode)
 *---------------------------------------------------------------*/
void SimplelsfQ(
    float *lsfq,    /* (o) quantized lsf coefficients
                           (dimension FILTERORDER) */
    int *index,     /* (o) quantization index */
    float *lsf      /* (i) the lsf coefficient vector to be
                           quantized (dimension FILTERORDER) */
){
    int i;
    float e[FILTERORDER], lsfhat[FILTERORDER];

    /* Quantize second LSF with memoryless split VQ */

    SplitVQ(lsfq + FILTERORDER, index + LSF_NSPLIT,
        lsf + FILTERORDER, cb_ml, LSF_NSPLIT, dim_ml, size_ml);

    /* Calculate prediction error for first LSF from second */

    for (i = 0; i < FILTERORDER; i++) {
        lsfhat[i] = lsfpred[i] *
            (lsfq[FILTERORDER + i] - lsfmean[i]);
        e[i] = lsf[i] - lsfmean[i] - lsfhat[i];
    }

    /* Quantize prediction error */

    SplitVQ(lsfq, index, e, cb_p, LSF_NSPLIT, dim_p, size_p);
    for (i = 0; i < FILTERORDER; i++)
        lsfq[i] += lsfmean[i] + lsfhat[i];
}

/*----------------------------------------------------------------*
 *  lpc encoder
 *---------------------------------------------------------------*/

void LPCencode(
    float *syntdenum,   /* (i/o) synthesis filter coefficients
                                 before/after encoding */
    float *weightnum,   /* (i/o) weighting numerator coefficients
                                 before/after encoding */
    float *weightdenum, /* (i/o) weighting denominator coefficients
                                 before/after encoding */
    int *lsf_index,     /* (o) lsf quantization index */
    float *data,        /* (i) new data vector to analyze (passed
                               on to SimpleAnalysis) */
    iLBC_Enc_Inst_t *iLBCenc_inst
                        /* (i/o) the encoder state structure */
){
    float lsf[FILTERORDER * LPC_N], lsfq[FILTERORDER * LPC_N];
    int change=0;

    SimpleAnalysis(lsf, data, (*iLBCenc_inst).lpc_buffer);
    SimplelsfQ(lsfq, lsf_index, lsf);
    change=LSF_check(lsfq, FILTERORDER, LPC_N);
Experimental - Expires August 20th, 2002 187 Internet Low Bit Rate Codec February 2002 SimpleInterpolateLSF(syntdenum, weightnum, weightdenum, lsf, lsfq, (*iLBCenc_inst).lsfold, (*iLBCenc_inst).lsfqold, FILTERORDER); } A.41 lsf.h /****************************************************************** iLBC Speech Coder ANSI-C Source Code lsf.h Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #ifndef __iLBC_LSF_H #define __iLBC_LSF_H void a2lsf( float *freq, /* (o) lsf coefficients */ float *a /* (i) lpc coefficients */ ); void lsf2a( float *a_coef, /* (o) lpc coefficients */ float *freq /* (i) lsf coefficients */ ); #endif A.42 lsf.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code lsf.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include #include Andersen et. al. Experimental - Expires August 20th, 2002 188 Internet Low Bit Rate Codec February 2002 #include "iLBC_define.h" /*----------------------------------------------------------------* * conversion from lpc coefficients to lsf coefficients *---------------------------------------------------------------*/ void a2lsf( float *freq, /* (o) lsf coefficients */ float *a /* (i) lpc coefficients */ ){ float steps[NUMBER_OF_STEPS] = {(float)0.00635, (float)0.003175, (float)0.0015875, (float)0.00079375}; float step; int step_idx; int lsp_index; float p[HALFORDER]; float q[HALFORDER]; float p_pre[HALFORDER]; float q_pre[HALFORDER]; float old_p, old_q, *old; float *pq_coef; float omega, old_omega; int i; float hlp, hlp1, hlp2, hlp3, hlp4, hlp5; for (i = 0; i < HALFORDER; i++){ p[i] = (float)-1.0 * (a[i + 1] + a[FILTERORDER - i]); q[i] = a[FILTERORDER - i] - a[i + 1]; } p_pre[0] = (float)-1.0 - p[0]; p_pre[1] = - p_pre[0] - p[1]; p_pre[2] = - p_pre[1] - p[2]; p_pre[3] = - p_pre[2] - p[3]; p_pre[4] = - 
p_pre[3] - p[4];
    p_pre[4] = p_pre[4] / 2;

    q_pre[0] = (float)1.0 - q[0];
    q_pre[1] = q_pre[0] - q[1];
    q_pre[2] = q_pre[1] - q[2];
    q_pre[3] = q_pre[2] - q[3];
    q_pre[4] = q_pre[3] - q[4];
    q_pre[4] = q_pre[4] / 2;

    omega = 0.0;
    old_omega = 0.0;

    old_p = FLOAT_MAX;
    old_q = FLOAT_MAX;

    /* Here we loop through lsp_index to find all the
       FILTERORDER roots for omega. */

    for (lsp_index = 0; lsp_index < FILTERORDER; lsp_index++){

        /* Depending on lsp_index being even or odd, we
           alternatively solve the roots for the two LSP
           equations. */

        if ((lsp_index % 2) == 0){
            pq_coef = p_pre;
            old = &old_p;
        } else {
            pq_coef = q_pre;
            old = &old_q;
        }

        /* Start with low resolution grid */

        for (step_idx = 0, step = steps[step_idx];
            step_idx < NUMBER_OF_STEPS;){

            /* cos(10piw) + pq(0)cos(8piw) + pq(1)cos(6piw) +
               pq(2)cos(4piw) + pq(3)cos(2piw) + pq(4) */

            hlp = (float)cos(omega * TWO_PI);
            hlp1 = (float)2.0 * hlp + pq_coef[0];
            hlp2 = (float)2.0 * hlp * hlp1 - (float)1.0 +
                pq_coef[1];
            hlp3 = (float)2.0 * hlp * hlp2 - hlp1 + pq_coef[2];
            hlp4 = (float)2.0 * hlp * hlp3 - hlp2 + pq_coef[3];
            hlp5 = hlp * hlp4 - hlp3 + pq_coef[4];

            if (((hlp5 * (*old)) <= 0.0) || (omega >= 0.5)){

                if (step_idx == (NUMBER_OF_STEPS - 1)){

                    if (fabs(hlp5) >= fabs(*old)) {
                        freq[lsp_index] = omega - step;
                    } else {
                        freq[lsp_index] = omega;
                    }

                    if ((*old) >= 0.0){
                        *old = (float)-1.0 * FLOAT_MAX;
                    } else {
                        *old = FLOAT_MAX;
                    }

                    /* restart from the coarse-grid position and
                       leave the refinement loop */

                    omega = old_omega;
                    step_idx = NUMBER_OF_STEPS;
                } else {

                    if (step_idx == 0){
                        old_omega = omega;
                    }
                    step_idx++;
                    omega -= steps[step_idx];

                    /* Go back one grid step */

                    step = steps[step_idx];
                }
            } else {

                /* increment omega until they are of different
                   sign, and we know there is at least one root
                   between omega and old_omega */

                *old = hlp5;
                omega += step;
            }
        }
    }

    for (i = 0; i < FILTERORDER; i++) {
        freq[i] = freq[i] * TWO_PI;
    }
}

/*----------------------------------------------------------------*
 *  conversion from lsf coefficients to lpc coefficients
 *---------------------------------------------------------------*/

void lsf2a(
    float *a_coef,  /* (o) lpc coefficients */
    float *freq     /* (i) lsf coefficients */
){
    int i, j;
    float hlp;
    float p[HALFORDER], q[HALFORDER];
    float a[HALFORDER + 1], a1[HALFORDER], a2[HALFORDER];
    float b[HALFORDER + 1], b1[HALFORDER], b2[HALFORDER];

    for (i = 0; i < FILTERORDER; i++) {
        freq[i] = freq[i] * PI2;
    }

    /* Check input for ill-conditioned cases. This part is not
       found in the TIA standard. It involves the following 2 IF
       blocks. If "freq" is judged ill-conditioned, then we first
       modify freq[0] and freq[FILTERORDER-1] (normally
       FILTERORDER = 10 for LPC applications), then we adjust the
       other "freq" values slightly */

    if ((freq[0] <= 0.0) || (freq[FILTERORDER - 1] >= 0.5)){

        if (freq[0] <= 0.0) {
            freq[0] = (float)0.022;
        }

        if (freq[FILTERORDER - 1] >= 0.5) {
            freq[FILTERORDER - 1] = (float)0.499;
        }

        hlp = (freq[FILTERORDER - 1] - freq[0]) /
            (float) (FILTERORDER - 1);

        for (i = 1; i < FILTERORDER; i++) {
            freq[i] = freq[i - 1] + hlp;
        }
    }

    memset(a1, 0, HALFORDER*sizeof(float));
    memset(a2, 0, HALFORDER*sizeof(float));
    memset(b1, 0, HALFORDER*sizeof(float));
    memset(b2, 0, HALFORDER*sizeof(float));
    memset(a, 0, (HALFORDER+1)*sizeof(float));
    memset(b, 0, (HALFORDER+1)*sizeof(float));

    /* p[i] and q[i] compute cos(2*pi*omega_{2j}) and
       cos(2*pi*omega_{2j-1}) in eqs.
       4.2.2.2-1 and 4.2.2.2-2. Note that for this code p[i]
       specifies the coefficients used in Q_A(z) while q[i]
       specifies the coefficients used in P_A(z) */

    for (i = 0; i < HALFORDER; i++){
        p[i] = (float)cos(TWO_PI * freq[2 * i]);
        q[i] = (float)cos(TWO_PI * freq[2 * i + 1]);
    }

    a[0] = 0.25;
    b[0] = 0.25;

    for (i = 0; i < HALFORDER; i++){
        a[i + 1] = a[i] - 2 * p[i] * a1[i] + a2[i];
        b[i + 1] = b[i] - 2 * q[i] * b1[i] + b2[i];
        a2[i] = a1[i];
        a1[i] = a[i];
        b2[i] = b1[i];
        b1[i] = b[i];
    }

    for (j = 0; j < FILTERORDER; j++){

        if (j == 0){
            a[0] = 0.25;
            b[0] = -0.25;
        } else {
            a[0] = b[0] = 0.0;
        }

        for (i = 0; i < HALFORDER; i++){
            a[i + 1] = a[i] - 2 * p[i] * a1[i] + a2[i];
            b[i + 1] = b[i] - 2 * q[i] * b1[i] + b2[i];
            a2[i] = a1[i];
            a1[i] = a[i];
            b2[i] = b1[i];
            b1[i] = b[i];
        }

        a_coef[j + 1] = 2 * (a[HALFORDER] + b[HALFORDER]);
    }

    a_coef[0] = 1.0;
}

A.43 packing.h

/******************************************************************

    iLBC Speech Coder ANSI-C Source Code

    packing.h

    Copyright (c) 2001,
    Global IP Sound AB.
    All rights reserved.

******************************************************************/

#ifndef __PACKING_H
#define __PACKING_H

void dopack(
    unsigned char **bitstream,  /* (i/o) on entrance pointer to
                                   place in bitstream to pack new
                                   data, on exit pointer to place
                                   in bitstream to pack future
                                   data */
    int *index,                 /* (i) the value to pack */
    int bitno                   /* (i) the number of bits that the
                                   value will fit within */
);

void unpack(
    unsigned char **bitstream,  /* (i/o) on entrance pointer to
                                   place in bitstream to unpack
                                   new data from, on exit pointer
                                   to place in bitstream to unpack
                                   future data from */
    int *index,                 /* (o) resulting value */
    int bitno                   /* (i) number of bits used to
                                   represent the value */
);

#endif
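The contract described by the dopack and unpack prototypes (write or read a value of bitno bits, and leave the stream position ready for the next value) can be illustrated with a minimal sketch. It is not the draft's packing.c: the bitcursor type and the names put_bits and get_bits are hypothetical, the position is kept in an explicit cursor rather than in static state, and the most-significant-bit-first order within each value is an assumption of this sketch only.

```c
#include <assert.h>

/* Hypothetical bit cursor (not part of the draft): a byte buffer
   plus the number of bits already written to or read from it. */
typedef struct {
    unsigned char *buf;
    int bitpos;
} bitcursor;

/* Append the low `bitno` bits of `value`, most significant bit
   first (an assumption of this sketch). The buffer must be zeroed
   beforehand, because bits are only ever set, never cleared. */
static void put_bits(bitcursor *c, int value, int bitno)
{
    int i;

    assert(bitno >= 0 && bitno < 32);
    for (i = bitno - 1; i >= 0; i--) {
        if ((value >> i) & 1)
            c->buf[c->bitpos >> 3] |=
                (unsigned char)(0x80 >> (c->bitpos & 7));
        c->bitpos++;
    }
}

/* Read back `bitno` bits in the same order and advance the cursor. */
static int get_bits(bitcursor *c, int bitno)
{
    int i, bit, value = 0;

    assert(bitno >= 0 && bitno < 32);
    for (i = 0; i < bitno; i++) {
        bit = (c->buf[c->bitpos >> 3] >> (7 - (c->bitpos & 7))) & 1;
        value = (value << 1) | bit;
        c->bitpos++;
    }
    return value;
}
```

Under these assumptions, writing the value 5 in 3 bits, 300 in 9 bits and 1 in 1 bit into a zeroed two-byte buffer yields the bytes 0xB2 0xC8 and a cursor position of 13. Reading back with the same widths returns the same three values.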
Experimental - Expires August 20th, 2002 193 Internet Low Bit Rate Codec February 2002 A.44 packing.c /****************************************************************** iLBC Speech Coder ANSI-C Source Code packing.c Copyright (c) 2001, Global IP Sound AB. All rights reserved. ******************************************************************/ #include #include #include "iLBC_define.h" #include "constants.h" #include "helpfun.h" #include "string.h" #define BBL 100 /*----------------------------------------------------------------* * packing of bits into bitstream, i.e., vector of bytes *---------------------------------------------------------------*/ void dopack( unsigned char **bitstream, /* (i/o) on entrance pointer to place in bitstream to pack new data, on exit pointer to place in bitstream to pack future data */ int *index, /* (i) the value to pack */ int bitno /* (i) the number of bits that the value will fit within bitno equal to 0 will cause the bitstream to be flushed to integer bytes */ ){ static int bb[BBL]; static int bbs=0; static int bbe=0; int i; /* place the individual bits in individual integers in the table bb */ if( bitno > 0){ for(i=0;i>i)&1; } /* flush the bitstream to an integer number of bytes */ if(bitno==0){ memset(bb+bbe, 0, (8-bbe)*sizeof(int)); Andersen et. al. Experimental - Expires August 20th, 2002 194 Internet Low Bit Rate Codec February 2002 bbe=8; } /* write the bits into bitstream */ while( bbe-bbs>=8){ **bitstream = 0; for(i=0;i<8;i++) **bitstream += (unsigned char) bb[bbs++]<> i) & 1; *bitstream += 1; } /* the bits are combined into the index value */ *index=0; for(i=0;i #include #include "iLBC_define.h" Andersen et. al. 
Experimental - Expires August 20th, 2002 196 Internet Low Bit Rate Codec February 2002 #include "constants.h" #include "filter.h" /*----------------------------------------------------------------* * decoding of the start state *---------------------------------------------------------------*/ void StateConstructW( int idxForMax, /* (i) 7-bit index for the quantization of max amplitude */ int *idxVec, /* (i) vector of quantization indexes */ float *syntDenum, /* (i) synthesis filter denumerator */ float *out, /* (o) the decoded state vector */ int len /* (i) length of a state vector */ ){ float maxVal, tmpbuf[FILTERORDER+2*STATE_LEN], *tmp, numerator[FILTERORDER+1]; float foutbuf[FILTERORDER+2*STATE_LEN], *fout; int k,tmpi; /* decoding of the maximum value */ maxVal = state_frgq[idxForMax]; maxVal = (float)pow(10,maxVal)/(float)4.5; /* initialization of buffers and coefficients */ memset(tmpbuf, 0, FILTERORDER*sizeof(float)); memset(foutbuf, 0, FILTERORDER*sizeof(float)); for(k=0; k #include #include "iLBC_define.h" #include "constants.h" #include "filter.h" #include "helpfun.h" Andersen et. al. 
Experimental - Expires August 20th, 2002 198 Internet Low Bit Rate Codec February 2002 /*----------------------------------------------------------------* * predictive noise shaping encoding of scaled start state * (subrutine for StateSearchW) *---------------------------------------------------------------*/ void AbsQuantW( float *in, /* (i) vector to encode */ float *syntDenum, /* (i) denominator of synthesis filter */ float *weightNum, /* (i) numerator of weighting filter */ float *weightDenum, /* (i) denominator of weighting filter */ int *out, /* (o) vector of quantizer indexes */ int len /* (i) length of vector to encode and vector of quantizer indexes */ ){ float *target, targetBuf[FILTERORDER+STATE_LEN], *syntOut, syntOutBuf[FILTERORDER+STATE_LEN], *weightOut, weightOutBuf[FILTERORDER+STATE_LEN], toQ, xq; int n; int index; /* initialization of buffers for filterings */ memset(targetBuf, 0, FILTERORDER*sizeof(float)); memset(syntOutBuf, 0, FILTERORDER*sizeof(float)); memset(weightOutBuf, 0, FILTERORDER*sizeof(float)); /* initialization of pointers for filterings */ target = &targetBuf[FILTERORDER]; syntOut = &syntOutBuf[FILTERORDER]; weightOut = &weightOutBuf[FILTERORDER]; /* encoding loop */ for(n=0;n maxVal*maxVal){ maxVal = fout[k]; } } maxVal=(float)fabs(maxVal); /* encoding of the maximum amplitude value */ if(maxVal < 10.0){ maxVal = 10.0; } maxVal = (float)log10(maxVal); sort_sq(&dtmp, &index, maxVal, state_frgq, 64); /* decoding of the maximum amplitude representation value, and corresponding scaling of start state */ maxVal=state_frgq[index]; utmp=index; *idxForMax=utmp; maxVal = (float)pow(10,maxVal); maxVal = (float)(4.5)/maxVal; for(k=0;k0; i--) mem[i] = mem[i-1]; mem[0] = *po; po++; } } Andersen et. al. Experimental - Expires August 20th, 2002 203