S. V. Andersen
A. Duric
R. Hagen
W. B. Kleijn
J. Linden
M. N. Murthi
J. Skoglund
J. Spittka
Internet Draft
Document: draft-andersen-ilbc-00.txt Global IP Sound
Category: Experimental
Feb. 20th 2002
Expires: Aug. 20th 2002
Internet Low Bit Rate Codec
Status of this Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document specifies a speech codec suitable for robust voice
communication over IP. The codec is designed for narrow band speech
and results in a payload bit rate of 13.967 kbit/s with an encoding
frame length of 30 ms. The codec enables graceful speech quality
degradation in the case of lost frames, which occurs in connection
with lost or delayed IP packets.
Table of Contents
Status of this Memo................................................1
Andersen et. al. 1
Internet Low Bit Rate Codec February 2002
Abstract...........................................................1
Table of Contents..................................................1
1. INTRODUCTION....................................................5
2. OUTLINE OF THE CODEC............................................5
2.1 Encoder........................................................5
2.2 Decoder........................................................7
3. ENCODER PRINCIPLES..............................................8
3.1 LPC Analysis and Quantization..................................8
3.1.1 Computation of Autocorrelation Coefficients..................8
3.1.2 Computation of LPC Coefficients..............................9
3.1.3 Computation of LSF Coefficients from LPC Coefficients.......10
3.1.4 Quantization of LSF Coefficients............................10
3.1.5 Stability Check of LSF Coefficients.........................12
3.1.6 Interpolation of LSF Coefficients...........................12
3.2 Calculation of the Residual...................................12
3.3 Perceptual Weighting Filter...................................13
3.4 Start State Encoder...........................................13
3.4.1 Start State Estimation......................................13
3.4.1 All-Pass Filtering and Scale Quantization...................13
3.4.2 Scalar Quantization.........................................14
3.5 Codebook Encoding.............................................14
3.5.1 Perceptual Weighting of Codebook Memory and Target..........14
3.5.2 Codebook Creation...........................................15
3.5.2.1 Creation of a Base Codebook...............................15
3.5.2.2 Codebook Augmentation.....................................15
3.5.2.3 Codebook Expansion........................................16
3.5.3 Codebook Search.............................................17
3.5.3.1 The Codebook Search at Each Stage.........................17
3.5.3.2 The Gain Quantization at Each Stage.......................18
3.5.3.3 Preparation of Target for Next Stage......................19
3.6 Gain Correction Encoding......................................19
3.7 Bitstream Definition..........................................20
4. DECODER PRINCIPLES.............................................21
4.1 LPC Filter Reconstruction.....................................21
4.2 Start State Reconstruction....................................22
4.3 Excitation Decoding Loop......................................22
4.4 Multistage Adaptive Codebook Decoding.........................23
4.4.1 Construction of the Decoded Excitation Signal...............23
4.5 Packet Loss Concealment.......................................23
4.5.1 Block Received Correctly and Previous Block also Received...23
4.5.2 Block Not Received..........................................24
4.5.3 Block Received Correctly When Previous Block Not Received...24
4.6 Enhancement...................................................25
4.6.1 Outline of the Enhancement Unit.............................25
4.6.2 Determination of the Pitch-Synchronous Sequences............27
4.6.3 Re-estimation of the Current Sample-Sequence................27
Andersen et. al. Experimental - Expires August 20th, 2002 2
4.7 Synthesis Filtering...........................................29
5. SECURITY CONSIDERATIONS........................................29
6. REFERENCES.....................................................30
7. ACKNOWLEDGEMENTS...............................................30
8. AUTHOR'S ADDRESSES.............................................31
APPENDIX A REFERENCE IMPLEMENTATION...............................33
A.1 iLBC_test.c...................................................34
A.2 iLBC_encode.h.................................................39
A.3 iLBC_encode.c.................................................40
A.4 iLBC_decode.h.................................................46
A.5 iLBC_decode.c.................................................47
A.6 iLBC_define.h.................................................54
A.7 constants.h...................................................57
A.8 constants.c...................................................58
A.9 anaFilter.h..................................................130
A.10 anaFilter.c.................................................130
A.11 createCB.h..................................................131
A.12 createCB.c..................................................132
A.13 doCPLC.h....................................................135
A.14 doCPLC.c....................................................136
A.15 enhancer.h..................................................141
A.16 enhancer.c..................................................141
A.17 filter.h....................................................150
A.18 filter.c....................................................151
A.19 FrameClassify.h.............................................153
A.20 FrameClassify.c.............................................154
A.21 gaincorr_Encode.h...........................................155
A.22 gaincorr_Encode.c...........................................155
A.23 gainquant.h.................................................157
A.24 gainquant.c.................................................157
A.25 getCBvec.h..................................................159
A.26 getCBvec.c..................................................160
A.27 helpfun.h...................................................163
A.28 helpfun.c...................................................165
A.29 hpInput.h...................................................170
A.30 hpInput.c...................................................171
A.31 hpOutput.h..................................................172
A.32 hpOutput.c..................................................172
A.33 iCBConstruct.h..............................................174
A.34 iCBConstruct.c..............................................174
A.35 iCBSearch.h.................................................175
A.36 iCBSearch.c.................................................176
A.37 LPCdecode.h.................................................180
A.38 LPCdecode.c.................................................181
A.39 LPCencode.h.................................................183
A.40 LPCencode.c.................................................184
A.41 lsf.h.......................................................188
A.42 lsf.c.......................................................188
A.43 packing.h...................................................193
A.44 packing.c...................................................194
A.45 StateConstructW.h...........................................196
A.46 StateConstructW.c...........................................196
A.47 StateSearchW.h..............................................198
A.48 StateSearchW.c..............................................198
A.49 syntFilter.h................................................201
A.50 syntFilter.c................................................202
1. INTRODUCTION
This document contains the description of an algorithm for the
coding of speech signals sampled at 8 kHz. The iLBC codec has a bit
rate of 13.967 kbit/s using a block-independent linear-predictive
coding (LPC) algorithm. The codec operates at block lengths of 30 ms
and produces 419 bits per block which can be packetized in 53 bytes.
The described algorithm results in a speech coding system with a
controlled response to packet losses similar to what is known from
pulse code modulation (PCM) with packet loss concealment (PLC), such
as the ITU-T G.711 standard [3] which operates at a fixed bit rate of
64 kbit/s. At the same time, the described algorithm enables fixed
bit rate coding with a quality-versus-bit rate tradeoff close to
what is known from code-excited linear prediction (CELP). A suitable
RTP payload format for this codec is specified in [1].
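The relation between the block size, the bit rate, and the packet size quoted above can be checked with a few lines of C. This is only an illustrative sketch; the macro and function names are chosen here and are not part of the reference implementation.

```c
/* Bit-rate arithmetic for one block, using the figures given in the text. */
#define BITS_PER_BLOCK 419   /* bits produced per encoded block */
#define BLOCK_MS        30   /* block length in milliseconds    */

/* Payload bit rate in bit/s: bits per block divided by block duration. */
double ilbc_bit_rate(void)
{
    return BITS_PER_BLOCK * 1000.0 / BLOCK_MS;
}

/* Number of whole bytes needed to carry one block (rounded up). */
int ilbc_bytes_per_block(void)
{
    return (BITS_PER_BLOCK + 7) / 8;
}
```

Since 419 bits do not fill a whole number of bytes, 5 of the 424 bits in a 53-byte payload are left unused.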
Some of the applications for which this coder is suitable are:
real-time communications such as videoconferencing and telephony,
streaming audio, archival, and messaging.
This document is organized as follows. In Section 2 a brief outline
of the codec is given. The specific encoder and decoder algorithms
are explained in Sections 3 and 4, respectively. A C-code reference
implementation is provided in Appendix A.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC 2119 [2].
2. OUTLINE OF THE CODEC
The codec consists of an encoder and a decoder described in Sections
2.1 and 2.2, respectively.
The essence of the codec is LPC and block-based coding of the LPC
residual signal. For each 240 sample block, the following major
steps are done. An LPC filter is computed to produce the residual
signal. The codec uses DPCM coding of the dominant part, in terms of
energy, of the residual signal for the block. The dominant state is
of length 58 samples and forms a start state for dynamic codebooks
constructed from the already coded parts of the residual signal.
These dynamic codebooks are used to code the remaining parts of the
residual signal. By this method, coding independence between blocks
is achieved, resulting in elimination of propagation of perceptual
degradations due to packet loss. The method facilitates high-quality
packet loss concealment (PLC).
2.1 Encoder
The input to the encoder is 16 bit uniform PCM sampled at 8 kHz.
The input is partitioned into blocks of BLOCKL=240 samples. Each
block is divided into NSUB=6 consecutive sub-blocks of SUBL=40
samples each.
For each input block, the encoder performs two FILTERORDER=10
linear-predictive coding (LPC) analyses. The first analysis applies
a smooth window centered over the 2nd sub-block and extending to the
middle of the 5th sub-block. The second LPC analysis applies a smooth
window centered over the 5th sub-block and extending to the end of
the 6th sub-block. For both LPC analyses, a set of line spectral
frequencies (LSFs) is obtained, quantized, and interpolated to
obtain LSF coefficients for each sub-block.
Subsequently, the LPC residual is computed using the quantized and
interpolated LPC analysis filters. The two consecutive sub-blocks of
residual exhibiting the maximal energy are identified. Within these
2 sub-blocks, the start state (segment) is selected from two
choices: the first 58 samples or the last 58 samples of the 2
consecutive sub-blocks. The selected segment is the one of higher
energy. The start state is encoded with a DPCM method.
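The start state selection described above can be sketched as follows. This is a simplified illustration that uses plain energy comparisons; the function names are chosen here, and the reference implementation in Appendix A may apply additional weighting or tie-breaking rules.

```c
#define SUBL            40   /* samples per sub-block             */
#define NSUB             6   /* sub-blocks per block              */
#define STATE_SHORT_LEN 58   /* length of the start state segment */

/* Sum of squares of n samples. */
static double energy(const double *x, int n)
{
    double e = 0.0;
    for (int i = 0; i < n; i++)
        e += x[i] * x[i];
    return e;
}

/* Returns the index, within the block, of the first sample of the
   58-sample start state. residual holds NSUB*SUBL samples. */
int find_start_state(const double *residual)
{
    int best = 0;
    double best_e = -1.0;

    /* Step 1: pair of consecutive sub-blocks with maximal energy. */
    for (int nsub = 0; nsub < NSUB - 1; nsub++) {
        double e = energy(residual + nsub * SUBL, 2 * SUBL);
        if (e > best_e) {
            best_e = e;
            best = nsub;
        }
    }

    /* Step 2: first or last 58 samples of the 80-sample pair. */
    const double *seg = residual + best * SUBL;
    int diff = 2 * SUBL - STATE_SHORT_LEN;            /* 22 samples */
    double e_first = energy(seg, STATE_SHORT_LEN);
    double e_last  = energy(seg + diff, STATE_SHORT_LEN);

    return best * SUBL + (e_last > e_first ? diff : 0);
}
```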
For encoding of the remaining 22 samples of the 2 sub-blocks
containing the start state and the remaining four sub-blocks,
gain-shape coding is performed using a codebook generated from the
available already coded samples. The codebook is used in NSTAGES=3
stages in a successive refinement approach. The resulting 3 gain
factors are encoded with 4, 3, and 3 bit scalar quantization,
respectively. The codebook search method employs noise
shaping derived from the LPC filters and minimization of the squared
error between the target vector and the code vectors. Each code
vector in this codebook comes from one of NSECTION=4 codebook
sections. The first section is filled with delayed, already encoded
residual vectors. The code vectors of the remaining 3 codebook
sections are constructed by predefined linear combinations of
vectors in the first section of the codebook. The linear combination
coefficients differ from one section to the next.
The codebook encoding is done in 3 steps:
1. The remaining 22 samples of the 2 sub-blocks containing the start
state are encoded using a codebook of size 256 constructed from the
58 samples of the encoded start state.
2. If the block contains sub-blocks later in time than the ones
containing the start state, each of these sub-blocks is
subsequently encoded using a codebook encoding method. A new
codebook is constructed for each sub-block since the number of
available encoded residual signal samples increases.
3. If the block contains sub-blocks earlier in time than the ones
encoded for the start state then a procedure equal to the one
applied for sub-blocks later in time is applied, but with time-
reversed signals since encoding is now performed backwards in time.
Within steps 2 and 3 above, 4 sub-blocks are encoded. The
codebooks have code vector dimension 40 and construction of the
codebook for the 4 coding instances is performed as follows: 1) A
codebook of size 256 is created from the 80 samples of the already
coded residual signal; 2) A codebook of size 512 is created from the
120 samples of the already coded residual signal; 3) A codebook of
size 512 is created from the 147 samples of the already coded
residual signal closest to the sub-blocks to be encoded; 4) A
codebook of size 512 is created from the 147 samples of the already
coded residual signal closest to the sub-blocks to be encoded. The
only difference between steps 2 and 3 is that the 40 sample signal
to be encoded, as well as the already coded signal for codebook
construction, is time-reversed before encoding in step 3.
Since codebook encoding with squared-error matching is known to
produce a coded signal of less power than the scalar DPCM coded
signal, a gain correction factor is calculated by comparing the
power loss in the codebook encoding to the power loss in the scalar
DPCM coding. The gain correction factor is quantized with 4 bits and
is used to scale down the start state to produce a signal with a
smooth power contour over the block.
2.2 Decoder
For packet communications, typically a jitter buffer placed at the
receiving end decides whether the packet containing an encoded signal
block has been received or lost. This logic is not part of the codec
described here. For each received encoded signal block the decoder
performs a decoding. For each lost signal block the decoder performs
a PLC operation.
The decoding for each block starts with decoding and interpolation
of the LPC coefficients. Subsequently the start state is decoded.
For codebook encoded segments, each segment is decoded by
constructing the 3 code vectors given by the received codebook
indices in the same way as the code vectors were constructed in the
encoder. The 3 gain factors are also decoded and the resulting
decoded signal is given by the sum of the 3 codebook vectors, each
scaled by its respective gain.
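The reconstruction of a codebook encoded segment can be sketched as a gain-weighted sum. The function name and argument layout below are assumptions made for illustration; the code vectors and decoded gains are taken as already available.

```c
#define NSTAGES 3   /* number of codebook stages */

/* decoded[i] = sum over stages of gain[s] * cbvec[s][i] */
void construct_excitation(const double *cbvec[NSTAGES],
                          const double gain[NSTAGES],
                          int len, double *decoded)
{
    for (int i = 0; i < len; i++) {
        decoded[i] = 0.0;
        for (int s = 0; s < NSTAGES; s++)
            decoded[i] += gain[s] * cbvec[s][i];
    }
}
```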
An enhancement algorithm is applied on the reconstructed excitation
signal. This enhancement augments the periodicity of voiced speech
regions. The enhancement is optimized under the constraint that the
enhancement signal (defined as the difference between the enhanced
excitation and the excitation signal prior to enhancement) has a
short-time energy that does not exceed a preset fraction of the
short-time energy of the speech signal.
A packet loss concealment (PLC) operation is easily embedded in the
decoder. The PLC operation can, e.g., be based on repetition of LPC
filters and obtaining the LPC residual signal using a long term
prediction estimate from previous residual blocks.
3. ENCODER PRINCIPLES
This section describes the principles of each component of the
encoder algorithm.
3.1 LPC Analysis and Quantization
The input to the LPC analysis module is a high-pass filtered speech
buffer, speech_hp, that contains 300 (LOOKBACK + BLOCKL = 60 + 240
= 300) speech samples, where samples 0 through 59 are from the
previous block and samples 60 through 299 are from the current
block. No look-ahead into the next block is used. For the very first
block processed, the look back samples are assumed to be zeros.
For each input block, the LPC analysis calculates two sets of
FILTERORDER=10 LPC filter coefficients using the autocorrelation
method and the Levinson-Durbin recursion. The first set, lsf1,
represents the spectral properties of the input signal at the center
of the second subblock while the other set, lsf2, represents the
spectral characteristics as measured at the center of the fifth
subblock. The details of the computation shall be executed as
described in 3.1.1 through 3.1.6.
3.1.1 Computation of Autocorrelation Coefficients
The first step in the LPC analysis procedure is to calculate
autocorrelation coefficients using windowed speech samples. This
windowing is the only difference in the LPC analysis procedure for
the two sets of coefficients. For the first set, a 240 sample long
standard symmetric Hanning window is applied to samples 0 through
239 of the input data. In c-like pseudo code, the first window,
win1, is hence calculated as:
win1[i] = 0.5 * (1.0 - cos((2 * PI * (i + 1))/(BLOCKL + 1)));
i=0,...,119
win1[i] = win1[BLOCKL - i - 1]; i=120,...,239
The windowed speech speech_hp_win1 is then obtained by multiplying
the 240 first samples of the input speech buffer with the window
coefficients:
speech_hp_win1[i] = speech_hp[i] * win1[i]; i=0,...,BLOCKL-1
From these 240 windowed speech samples, 11 (FILTERORDER + 1)
autocorrelation coefficients, acf1, are calculated:
acf1[lag] += speech_hp_win1[n] * speech_hp_win1[n + lag];
lag=0,...,FILTERORDER; n=0,...,BLOCKL-lag-1
In order to make the analysis more robust against numerical
precision problems, a spectral smoothing procedure is applied by
windowing the autocorrelation coefficients with a Gaussian window
before the LPC coefficients are computed. Also, a white noise floor
is added to the autocorrelation function by multiplying coefficient
zero by 1.0001 (40 dB below the energy of the windowed speech
signal). These two steps are implemented by multiplying the
autocorrelation coefficients with the following window:
win3[0] = 1.0001;
win3[i] = exp(-0.5 * ((2 * PI * 60.0 * i) /FS)^2);
i=1,...,FILTERORDER
Then, the windowed autocorrelation function acf1_win is obtained by:
acf1_win[i] = acf1[i] * win3[i]; i=0,...,FILTERORDER
The second set of autocorrelation coefficients, acf2_win, is
obtained in a similar manner. The window, win2, is applied to
samples 60 through 299, i.e., the entire current block. The window
consists of two segments: the first (samples 0 to 219) being half a
Hanning window with length 440 and the second being a quarter of a
cycle of a cosine wave. By using this asymmetric window, an LPC
analysis centered in the fifth subblock is obtained without the need
for any look-ahead, which would have added delay. The asymmetric
window is defined as:
win2[i] = (sin(PI * (i + 1) / 441))^2; i=0,...,219
win2[i] = cos((i - 220) * PI / 40); i=220,...,239
and the windowed speech is computed by:
speech_hp_win2[i] = speech_hp[i + LOOKBACK] * win2[i];
i=0,...,BLOCKL-1
The windowed autocorrelation coefficients are then obtained in
exactly the same way as for the first analysis instance.
The windows win1, win2, and win3 are typically generated in
advance and the arrays are stored in ROM rather than
repeating the calculation for every block.
3.1.2 Computation of LPC Coefficients
From the 11 smoothed autocorrelation coefficients, acf1_win and
acf2_win, the 11 LPC coefficients, lp1 and lp2, are calculated in
the same way for both analysis locations using the well known
Levinson-Durbin recursion. The first LPC coefficient is always 1.0,
resulting in 10 unique coefficients.
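For reference, a textbook form of the Levinson-Durbin recursion is sketched below. It uses the convention A(z) = 1 + a[1]z^-1 + ... + a[order]z^-order, so that a[0] is always 1.0. The function name and the returned prediction error are choices made for this sketch and need not match the reference implementation.

```c
#define FILTERORDER 10

/* Solves for a[0..order] from autocorrelation r[0..order].
   Requires order <= FILTERORDER. Returns the final prediction
   error energy. */
double levinson_durbin(const double *r, double *a, int order)
{
    double tmp[FILTERORDER + 1];
    double err = r[0];

    a[0] = 1.0;
    for (int i = 1; i <= order; i++)
        a[i] = 0.0;
    if (err <= 0.0)
        return 0.0;               /* silent input: keep A(z) = 1 */

    for (int m = 1; m <= order; m++) {
        double acc = r[m];
        for (int i = 1; i < m; i++)
            acc += a[i] * r[m - i];
        double k = -acc / err;    /* reflection coefficient */

        for (int i = 1; i < m; i++)
            tmp[i] = a[i] + k * a[m - i];
        for (int i = 1; i < m; i++)
            a[i] = tmp[i];
        a[m] = k;
        err *= (1.0 - k * k);
    }
    return err;
}
```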
After determining the LPC coefficients, a bandwidth expansion
procedure is applied in order to smooth the spectral peaks in the
short-term spectrum. The bandwidth expansion is obtained by the
following modification of the LPC coefficients:
lp1_bw[i] = lp1[i] * chirp^i; i=0,...,FILTERORDER
lp2_bw[i] = lp2[i] * chirp^i; i=0,...,FILTERORDER
where "chirp" is a real number between 0 and 1 that typically has a
value of around 0.8.
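The bandwidth expansion amounts to scaling coefficient i by chirp raised to the power i, which can be done with a running product as in this sketch (the function name is illustrative):

```c
/* lp_bw[i] = lp[i] * chirp^i for i = 0..order */
void bw_expand(const double *lp, double *lp_bw, double chirp, int order)
{
    double c = 1.0;                /* chirp^0 */
    for (int i = 0; i <= order; i++) {
        lp_bw[i] = lp[i] * c;
        c *= chirp;                /* next power of chirp */
    }
}
```

Because chirp^0 = 1, the leading coefficient 1.0 is left unchanged.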
3.1.3 Computation of LSF Coefficients from LPC Coefficients
Thus far, two sets of LPC coefficients that represent the short-term
spectral characteristics of the speech signal for two different time
locations within the current block have been determined. These
coefficients should be quantized and interpolated. Before
doing so, it is advantageous to convert the LPC parameters into
another type of representation called the Line Spectral Frequencies
(LSF). The LSF parameters are used because they are better suited
for quantization and interpolation than the regular LPC
coefficients. Many computationally efficient methods for calculating
the LSFs from the LPC coefficients have been proposed in the
literature. The detailed implementation of one applicable method can
be found in Appendix A.42. The two arrays of LSF coefficients
obtained, lsf1 and lsf2, are of dimension 10 (FILTERORDER).
3.1.4 Quantization of LSF Coefficients
Since the LPC filters defined by the two sets of LSFs are needed
also in the decoder, the LSF parameters need to be quantized and
transmitted as side information. The total number of bits required
to represent the quantization of the two LSF representations for one
block of speech is 52 with 24 and 28 bits for lsf1 and lsf2,
respectively. For computational reasons, both LSF vectors are
quantized using 3-split vector quantization (VQ). That is, the LSF
vectors are split into three subvectors which are each quantized
with a regular VQ. First, the quantized version of lsf2, qlsf2, is
obtained by memoryless split VQ. Then qlsf1 is obtained by
predictive split VQ of lsf1. The prediction of the (mean-removed)
lsf1 is calculated by multiplying qlsf2 by a set of predictor
coefficients (one for each of the 10 components of the vector).
After subtracting the resulting prediction from lsf1 the resulting
prediction error is quantized with a second 3-split VQ.
The following c-like definitions explain how each LSF vector (lsf1
and lsf2) is split by defining the position of the first coefficient
for each split vector (for added clarity, we additionally provide
the corresponding dimension for each split vector):
lsf1_splitfirst[LSF_NSPLIT] = {0, 3, 6};
lsf1_splitdim[LSF_NSPLIT] = {3, 3, 4};
lsf2_splitfirst[LSF_NSPLIT] = {0, 4, 7};
lsf2_splitdim[LSF_NSPLIT] = {4, 3, 3};
For each of the split vectors, a separate codebook of quantized
values has been designed using a standard VQ training method for a
large database containing speech from a large number of speakers
recorded under various conditions. The size of each of the six
codebooks associated with the split definitions above is:
int lsf1_cbsize[LSF_NSPLIT] = {256, 256, 256};
int lsf2_cbsize[LSF_NSPLIT] = {512, 512, 1024};
The second set of LSF coefficients, lsf2, is quantized with a
standard memoryless split vector quantization (VQ) structure using
the squared error criterion in the LSF domain. The split VQ
quantization consists of the following steps:
1) Quantize the first 3 LSF coefficients with a VQ codebook of size
512.
2) Quantize the LSF coefficients 4, 5, and 6 with a VQ codebook of
size 512.
3) Quantize the last 4 LSF coefficients with a VQ codebook of size
1024.
This procedure gives 3 quantization indices and the quantized second
set of LSF coefficients qlsf2.
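A memoryless split VQ with a squared error criterion can be sketched generically as follows. The codebook layout (vectors stored back to back) and the function name are assumptions for illustration; the trained codebooks themselves are given in the reference implementation and are not reproduced here.

```c
/* Quantizes x split by split. cb[s] holds cbsize[s] code vectors of
   dim[s] values each, stored back to back. The chosen index per split
   is written to index[s] and the quantized vector to qx. */
void split_vq(const double *x, double *qx, int *index,
              const double *const *cb, const int *dim,
              const int *cbsize, int nsplit)
{
    int pos = 0;                       /* first coefficient of split */
    for (int s = 0; s < nsplit; s++) {
        double best = -1.0;
        index[s] = 0;
        for (int j = 0; j < cbsize[s]; j++) {
            const double *cv = cb[s] + j * dim[s];
            double d = 0.0;            /* squared error to code vector */
            for (int k = 0; k < dim[s]; k++) {
                double e = x[pos + k] - cv[k];
                d += e * e;
            }
            if (best < 0.0 || d < best) {
                best = d;
                index[s] = j;
            }
        }
        for (int k = 0; k < dim[s]; k++)
            qx[pos + k] = cb[s][index[s] * dim[s] + k];
        pos += dim[s];
    }
}
```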
The quantization of the first set of LSF coefficients is done on the
prediction error obtained by predicting the first set of LSF
coefficients from the quantized second set of LSF coefficients. The
prediction error, e, is obtained by:
lsfhat[i] = lsfpred[i] * (qlsf2[i] - lsfmean[i]);
i=0,...,FILTERORDER-1
e[i] = lsf1[i] - lsfmean[i] - lsfhat[i]; i=0,...,FILTERORDER-1
where lsfhat is the predicted, mean-removed first set of LSF
coefficients. The prediction coefficients, lsfpred, and the mean
vector, lsfmean, are pre-computed and stored values.
The prediction error e is quantized with a standard memoryless split
vector quantization (VQ) structure using the squared error criterion
in the LSF domain. The split VQ quantization consists of the
following steps:
1) Quantize the first 3 prediction error values with a VQ codebook
of size 256.
2) Quantize the prediction error values 4, 5, and 6 with a VQ
codebook of size 256.
3) Quantize the last 4 prediction error values with a VQ codebook of
size 256.
This procedure gives 3 quantization indices and the quantized
prediction error values qe. The first set of LSF coefficients qlsf1
is given by:
qlsf1[i] = qe[i] + lsfmean[i] + lsfhat[i];
i=0,...,FILTERORDER-1
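The prediction and reconstruction formulas above can be collected into two small helper functions, sketched here with illustrative names. If the prediction error were passed through unquantized (qe equal to e), the reconstruction would return lsf1 exactly, which is a convenient sanity check.

```c
#define FILTERORDER 10

/* Computes the predicted mean-removed vector lsfhat and the
   prediction error e from lsf1 and the quantized qlsf2. */
void lsf_prediction_error(const double *lsf1, const double *qlsf2,
                          const double *lsfmean, const double *lsfpred,
                          double *lsfhat, double *e)
{
    for (int i = 0; i < FILTERORDER; i++) {
        lsfhat[i] = lsfpred[i] * (qlsf2[i] - lsfmean[i]);
        e[i] = lsf1[i] - lsfmean[i] - lsfhat[i];
    }
}

/* Rebuilds qlsf1 from the quantized prediction error qe. */
void lsf_reconstruct(const double *qe, const double *lsfmean,
                     const double *lsfhat, double *qlsf1)
{
    for (int i = 0; i < FILTERORDER; i++)
        qlsf1[i] = qe[i] + lsfmean[i] + lsfhat[i];
}
```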
The result of the quantization of each of the two LSF coefficient
sets is a set of 3 indices. A first set of indices represents qlsf1
and is encoded with 8+8+8=24 bits. A second set of indices
represents qlsf2 and is encoded with 9+9+10=28 bits. The total number of
bits used for LSF quantization in a block is thus 52 bits.
3.1.5 Stability Check of LSF Coefficients
The LSF representation of the LPC filter has the nice property that
the coefficients are ordered by increasing value, i.e., lsf(n) >
lsf(n-1), 0 < n < 10, if the corresponding synthesis filter is
stable. Since we are employing a split VQ scheme it is possible that
at the split boundaries the LSF coefficients are not ordered
correctly and hence the corresponding LP filter is unstable. To
ensure that the filter used is stable, a stability check is
performed for the quantized LSF vectors. If it turns out that the
coefficients are not ordered appropriately (with a safety margin of
50 Hz to ensure that formant peaks are not too narrow) they will be
moved apart. The detailed method for this can be found in Appendix
A.42. The same procedure is performed in the decoder. This ensures
that exactly the same LSF representations are used in both encoder
and decoder.
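The idea of the stability check can be illustrated with the simplified sketch below. It is not the method of Appendix A.42: it only pushes a coefficient upwards when it is closer than a minimum gap to its predecessor, and it assumes that the LSFs and the gap are expressed in the same unit (for example Hz, with a gap of 50). The function name is chosen here.

```c
#define FILTERORDER 10

/* Enforces lsf[n] >= lsf[n-1] + min_gap for n = 1..FILTERORDER-1.
   Returns the number of coefficients that were moved. Running the
   same deterministic procedure in encoder and decoder keeps both
   sides in agreement. */
int lsf_enforce_order(double *lsf, double min_gap)
{
    int moved = 0;
    for (int n = 1; n < FILTERORDER; n++) {
        if (lsf[n] - lsf[n - 1] < min_gap) {
            lsf[n] = lsf[n - 1] + min_gap;   /* simplified: push up only */
            moved++;
        }
    }
    return moved;
}
```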
3.1.6 Interpolation of LSF Coefficients
From the two sets of LSF coefficients that are computed for each
block of speech, different LSFs are obtained for each subblock by
means of interpolation. This procedure is performed for the original
LSFs, lsf1 and lsf2 as well as the quantized versions qlsf1 and
qlsf2 since both versions are used in the encoder. Here follows a
brief summary of the interpolation scheme while the details are
found in the C code of Appendix A. In the first sub-block, the average
of the second LSF vector from the previous block and the first LSF
vector in the current block is used. For sub-blocks two through five
the LSFs used are obtained by linear interpolation from lsf1 (and
qlsf1) to lsf2 (and qlsf2) with lsf1 used in subblock two and lsf2
in subblock five. In the last subblock, lsf2 is used. For the very
first block it is assumed that the previous block has the same LSF
vectors as the current one.
The interpolation method is standard linear interpolation in the LSF
domain. The interpolated LSF values are converted to LPC
coefficients for each sub-block.
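The interpolation scheme summarized above can be sketched as follows. The equally spaced weights 0, 1/3, 2/3, and 1 for sub-blocks two through five are an assumption consistent with linear interpolation between the two end points; the exact weights are defined by the reference implementation. The function name is illustrative.

```c
#define FILTERORDER 10
#define NSUB        6

/* out[sb] receives the LSF vector used in sub-block sb+1. */
void interpolate_lsf(const double *lsf2_prev, const double *lsf1,
                     const double *lsf2,
                     double out[NSUB][FILTERORDER])
{
    for (int k = 0; k < FILTERORDER; k++) {
        /* Sub-block 1: average of previous lsf2 and current lsf1. */
        out[0][k] = 0.5 * (lsf2_prev[k] + lsf1[k]);

        /* Sub-blocks 2..5: linear interpolation from lsf1 to lsf2. */
        for (int sb = 1; sb <= 4; sb++) {
            double w = (sb - 1) / 3.0;       /* weight of lsf2 */
            out[sb][k] = (1.0 - w) * lsf1[k] + w * lsf2[k];
        }

        /* Sub-block 6: lsf2 is used directly. */
        out[5][k] = lsf2[k];
    }
}
```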
A reference implementation of the LSF encoding is given in Appendix
A.40. A reference implementation of the corresponding decoding can
be found in Appendix A.38.
3.2 Calculation of the Residual
The block of speech samples is filtered by the quantized and
interpolated LPC filters to yield the residual signal. In
particular, the corresponding LPC analysis filter for each subblock
is used to filter the speech samples for the same subblock. The
filter memory at the end of each subblock is carried over to the LPC
filter of the next subblock. The signal at the output of each LP
analysis filter constitutes the residual signal for the
corresponding subblock.
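A direct-form analysis filter with explicit memory is sketched below to illustrate how the filter state carries over between sub-blocks. It assumes the convention A(z) = a[0] + a[1]z^-1 + ... with a[0] = 1.0; the function name and memory layout are chosen here and may differ from anaFilter.c.

```c
#define FILTERORDER 10

/* Filters len samples of in through A(z) and writes the residual to
   out. mem[0..FILTERORDER-1] holds the most recent input samples
   (mem[0] is the newest) and is updated so that the next call
   continues where this one stopped. */
void ana_filter(const double *in, const double *a, int len,
                double *out, double *mem)
{
    for (int n = 0; n < len; n++) {
        double acc = a[0] * in[n];
        for (int k = 1; k <= FILTERORDER; k++)
            acc += a[k] * mem[k - 1];

        /* Shift the memory and store the current input sample. */
        for (int k = FILTERORDER - 1; k > 0; k--)
            mem[k] = mem[k - 1];
        mem[0] = in[n];

        out[n] = acc;
    }
}
```

Calling the function once per sub-block with that sub-block's coefficients and the same mem array gives the carry-over behaviour described above.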
A reference implementation of the residual calculating filter is
found in Appendix A.10.
3.3 Perceptual Weighting Filter
In principle any good design of perceptual weighting filter can be
applied in the encoder without compromising this codec definition.
A simplified design with low complexity is to apply the filter
1/A(z/0.4) in the LPC residual domain. Here A(z) is the filter
obtained from unquantized but interpolated LSF coefficients.
3.4 Start State Encoder
The start state containing STATE_SHORT_LEN=58 maximum energy
residual samples is quantized using a common 6-bit scale quantizer
for the block and a 3-bit scalar quantizer operating on the scaled
samples in the weighted speech domain. Now we describe the state
encoding in greater detail.
3.4.1 Start State Estimation
The two sub-blocks containing the start state are determined by
finding the two consecutive sub-blocks in the block having the
highest power, i.e., the following measure is computed:
nsub=1,...,NSUB-1
ssqn[nsub] = 0.0;
for (i=(nsub-1)*SUBL; i< [...]

[...] >max_measure){
    best_index = cb_index;
    max_measure = measure;
    gain = crossDot*invDot;
}
After the search in the base codebook, the iterative search loop is
continued into the 3 expanded sections of the adaptive codebook.
This can be done as a full search. However, to save computations
this part of the search can be constrained to indices in a restricted
range RESRANGE around the best_index identified in the base
codebook. This is obtained by identifying a start index sInd and an
end index eInd as:
base_index = best_index;
sInd = base_index - RESRANGE/2;
if (sInd < 0) sInd = 0;
eInd = sInd + RESRANGE;
if (eInd >= base_size) {
    eInd = base_size - 1;
    sInd = eInd - RESRANGE;
}
With these definitions, the iterative search can be continued over
the following 3 intervals:
1) cb_index=sInd+base_size to cb_index=eInd+base_size;
2) cb_index=sInd+2*base_size to cb_index=eInd+2*base_size;
3) cb_index=sInd+3*base_size to cb_index=eInd+3*base_size;
After these iterations the best codebook index, best_index, has been
determined. A good compromise between computational complexity and
speech quality is obtained by choosing RESRANGE=33.
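The computation of the restricted search range can be wrapped in a small function, as in this sketch. The function name is illustrative, and base_size is assumed to be larger than RESRANGE so that the clamped start index stays non-negative.

```c
#define RESRANGE 33

/* Computes the start and end index of the restricted search range
   around base_index within a base codebook of base_size entries. */
void restricted_range(int base_index, int base_size,
                      int *sInd, int *eInd)
{
    *sInd = base_index - RESRANGE / 2;   /* integer division: 16 */
    if (*sInd < 0)
        *sInd = 0;
    *eInd = *sInd + RESRANGE;
    if (*eInd >= base_size) {
        *eInd = base_size - 1;
        *sInd = *eInd - RESRANGE;
    }
}
```

For example, with base_size = 100 a best index of 5 gives the range 0 to 33, while a best index of 95 gives the range 66 to 99.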
3.5.3.2 The Gain Quantization at Each Stage
The gain follows as a result of the registration
gain = crossDot*invDot;
each time the max_measure is surpassed in the search procedure
outlined in section 3.5.3.1.
In the first stage, the gain is limited to the range 0.0 to 1.0.
if (gain<0.0) gain = 0.0;
if (gain>1.0) gain = 1.0;
Subsequently this gain is quantized by finding the nearest
representation value in the quantization table gain_sq4. The
resulting gain index is the index to this representation value in
the quantization table.
The gains of subsequent stages are quantized using a quantization
table that is obtained by multiplying the values in the table
gain_sq3 by a scale value. This scale value equals 0.1 or the
absolute value of the quantized gain representation value obtained
in the previous stage, whichever is larger. Again, the resulting
gain index is the index to the nearest representation value in the
quantization table.
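The two quantization rules above can be sketched in C as follows.
This is a minimal illustration and not the reference implementation:
the function names are hypothetical, and the tables are passed in by
the caller because the actual contents of gain_sq3 and gain_sq4 are
defined in the reference code, not here.

```c
#include <stddef.h>

static float absf(float x) { return x < 0.0f ? -x : x; }

/* Index of the entry of scale*table[] that is nearest to value. */
static int nearest_index(float value, const float *table, size_t n,
                         float scale)
{
    int best = 0;
    float best_err = absf(value - scale * table[0]);
    for (size_t i = 1; i < n; i++) {
        float err = absf(value - scale * table[i]);
        if (err < best_err) { best_err = err; best = (int)i; }
    }
    return best;
}

/* First stage: limit the gain to [0.0, 1.0], then search the table. */
int quantize_first_gain(float gain, const float *table, size_t n,
                        float *qgain)
{
    if (gain < 0.0f) gain = 0.0f;
    if (gain > 1.0f) gain = 1.0f;
    int idx = nearest_index(gain, table, n, 1.0f);
    *qgain = table[idx];
    return idx;
}

/* Later stages: the table is scaled by max(0.1, |previous gain|). */
int quantize_next_gain(float gain, float prev_qgain,
                       const float *table, size_t n, float *qgain)
{
    float scale = absf(prev_qgain);
    if (scale < 0.1f) scale = 0.1f;
    int idx = nearest_index(gain, table, n, scale);
    *qgain = scale * table[idx];
    return idx;
}
```

A caller would pass gain_sq4 to quantize_first_gain and gain_sq3 to
quantize_next_gain, feeding each quantized gain into the next stage.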
3.5.3.3 Preparation of Target for Next Stage
Before redoing the search for the next stage the target vector is
updated by subtracting from it the selected shape vector times the
corresponding quantized gain.
A reference implementation of the codebook encoding is found in
Appendix A.36.
3.6 Gain Correction Encoding
The start state is quantized in a relatively model independent
manner using 3 bits per sample. In contrast, the remaining
parts of the block are encoded using an adaptive codebook. This
codebook will produce high matching accuracy whenever there is a
high correlation between the target and a segment found in the buf
variable. For unvoiced speech segments, this is not necessarily so.
The result becomes a signal block for which the start state is
encoded with much higher accuracy than the remaining block.
Perceptually, the main problem with this is that the time envelope
of the signal energy becomes unsteady. To overcome this problem, the
start state is scaled down with a factor that approximates the
energy loss in the remaining parts of the signal block. The
determination of this scale factor is the last step in the encoding
process.
First, four energies per sample are determined: Esst in the start
state target, Ess in the decoded start state, Eet in the remaining
parts of the excitation signal target, and Ee in the remaining part
of the decoded excitation signal.
If the ratio sqrt(Eet/Esst) is larger than or equal to 0.25, a
correction factor is determined as
correction_factor = sqrt( (Ee/Eet) / (Ess/Esst) );
The correction factor is uniformly quantized in the range from 0.0
to 1.0 using 4 bit quantization as obtained, e.g., by the following
lines of c-code:
if(correction_factor > 1) correction_factor = 1;
index=(int)(correction_factor*16)-1;
if (index<0) index=0;
However, if the ratio sqrt(Eet/Esst) is less than 0.25, it is taken
as an indication that even the original signal energy did not have a
steady time envelope. In this case the correction factor is forced
to 1.0 by selecting the index equal to 15.
A reference implementation of the gain correction encoding is listed
in Appendix A.22.
3.7 Bitstream Definition
The total number of bits used to describe one block of 30 ms speech
is 419 bits, giving a bit rate of 13.967 kbit/s. The detailed bit
allocation is shown in the table below.
When one block is represented in the payload of a single packet, 53
bytes are needed for the 419 bits; 5 bits are unused in the last
byte.
Bitstream structure:
Parameter                                              Bits
-----------------------------------------------------------
                                           Split 1        8
                   LSF 1                   Split 2        8
LSF                                        Split 3        8
                   ----------------------------------------
                                           Split 1        9
                   LSF 2                   Split 2        9
                                           Split 3       10
                   ----------------------------------------
                                           Sum           52
-----------------------------------------------------------
Block Class.                                              3
-----------------------------------------------------------
Scale Factor State Coder                                  6
-----------------------------------------------------------
                   Sample 0                               3
Quantized          Sample 1                               3
Residual              :                                   :
State                 :                                   :
Samples               :                                   :
                   Sample 56                              3
                   Sample 57                              3
                   ----------------------------------------
                                           Sum          174
-----------------------------------------------------------
                                           Stage 1        8
                   Indices sub-block 1     Stage 2        8
                                           Stage 3        8
                   ----------------------------------------
                                           Stage 1        9
                   Indices sub-block 2     Stage 2        9
                                           Stage 3        9
CB sub-blocks      ----------------------------------------
                                           Stage 1        9
                   Indices sub-block 3     Stage 2        9
                                           Stage 3        9
                   ----------------------------------------
                                           Stage 1        9
                   Indices sub-block 4     Stage 2        9
                                           Stage 3        9
                   ----------------------------------------
                                           Sum          105
-----------------------------------------------------------
                                           Stage 1        4
                   Gains sub-block 1       Stage 2        3
                                           Stage 3        3
                   ----------------------------------------
                                           Stage 1        4
                   Gains sub-block 2       Stage 2        3
                                           Stage 3        3
Gain sub-blocks    ----------------------------------------
                                           Stage 1        4
                   Gains sub-block 3       Stage 2        3
                                           Stage 3        3
                   ----------------------------------------
                                           Stage 1        4
                   Gains sub-block 4       Stage 2        3
                                           Stage 3        3
                   ----------------------------------------
                                           Sum           40
-----------------------------------------------------------
                                           Stage 1        8
CB for 22 samples in start state           Stage 2        8
                                           Stage 3        8
                   ----------------------------------------
                                           Sum           24
-----------------------------------------------------------
                                           Stage 1        4
Gain for 22 samples in start state         Stage 2        3
                                           Stage 3        3
                   ----------------------------------------
                                           Sum           10
-----------------------------------------------------------
Position 22 sample segment                                1
-----------------------------------------------------------
Gain correction factor                                    4
-----------------------------------------------------------
SUM                                                     419
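The totals in this section can be checked with a few constants. The
sketch below only restates the bit allocation listed above; the
constant and function names are hypothetical and do not appear in the
reference code.

```c
/* Bits per 30 ms block, one entry per group of the bit allocation. */
enum {
    BITS_LSF           = 3 * 8 + 2 * 9 + 10,   /* 52  */
    BITS_BLOCK_CLASS   = 3,
    BITS_STATE_SCALE   = 6,
    BITS_STATE_SAMPLES = 58 * 3,               /* 174 */
    BITS_CB_INDICES    = 3 * 8 + 3 * (3 * 9),  /* 105 */
    BITS_CB_GAINS      = 4 * (4 + 3 + 3),      /* 40  */
    BITS_STATE_CB      = 3 * 8,                /* 24  */
    BITS_STATE_GAIN    = 4 + 3 + 3,            /* 10  */
    BITS_POSITION      = 1,
    BITS_GAIN_CORR     = 4,

    BITS_TOTAL = BITS_LSF + BITS_BLOCK_CLASS + BITS_STATE_SCALE +
                 BITS_STATE_SAMPLES + BITS_CB_INDICES + BITS_CB_GAINS +
                 BITS_STATE_CB + BITS_STATE_GAIN + BITS_POSITION +
                 BITS_GAIN_CORR,

    /* Whole bytes needed for one block, and padding in the last byte. */
    BYTES_PER_BLOCK = (BITS_TOTAL + 7) / 8,
    UNUSED_BITS     = BYTES_PER_BLOCK * 8 - BITS_TOTAL
};

/* Bits per millisecond equals kbit/s; one block covers 30 ms. */
double bit_rate_kbps(void)
{
    return BITS_TOTAL / 30.0;
}
```

With these values BITS_TOTAL is 419, BYTES_PER_BLOCK is 53 and
UNUSED_BITS is 5, in agreement with the figures given in the text.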
4. DECODER PRINCIPLES
This section describes the principles of each component of the
decoder algorithm.
4.1 LPC Filter Reconstruction
The decoding of the LP filter parameters is very straightforward.
For a set of six indices the corresponding LSF vectors are found by
simple table look up. The three split vectors are concatenated to
obtain qlsf2 and the quantized prediction error vector for the first
LSF. The prediction vector is calculated from qlsf2 and added
together with the LSF mean vector to the decoded prediction error
vector to obtain qlsf1 in the same way as was described for the
encoder in Section 3.1.4. The next step is the stability check
described in Section 3.1.5 followed by the interpolation scheme
described in Section 3.1.6. The only difference is that only the
quantized LSFs are known at the decoder and hence the unquantized
LSFs are not processed.
A reference implementation of the LPC filter reconstruction is given
in Appendix A.38.
4.2 Start State Reconstruction
The scalar encoded SCLEN state samples are reconstructed by first
forming a set of samples from the index stream SINDEX[n],
multiplying the set with 1/SCAL=10^QMAX/4.5, and then filtering the
block with the inverse dispersion (all-pass) filter used in the
encoder (as described in section 3.4).
The remaining STATE_ACBLEN samples in the state are reconstructed by
the same adaptive codebook technique as described in section 4.3.
The location bit determines whether these are the first or the last
STATE_ACBLEN samples of the state vector. If the remaining
STATE_ACBLEN are the first samples of the state vector, then the
scalar encoded SCLEN state samples are time-reversed before
initialization of the adaptive codebook memory vector.
A reference implementation of the start state reconstruction is
given in Appendix A.46.
4.3 Excitation Decoding Loop
The decoding of the LPC excitation vector proceeds in the same order
in which the residual was encoded at the encoder. That is, after the
decoding of the entire state vector, the forward subblocks
(corresponding to samples occurring after the state vector samples)
are decoded, and then the backward subblocks (corresponding to
samples occurring before the state vector) are decoded, resulting in
a fully decoded block of excitation signal samples.
In particular, each subblock is decoded using the multistage
adaptive codebook decoding module which is described in section 4.4.
This module relies upon an adaptive codebook memory that is
constructed before each run of the adaptive codebook decoding. The
construction of the adaptive codebook memory in the decoder is
identical to the method outlined in section 3.5.2. Therefore for the
initial forward subblock, the last STATLEN=80 samples of the length
LMEM=147 adaptive codebook memory are filled with the samples of the
state vector. For subsequent forward subblocks, the first SUBL=40
samples of the adaptive codebook memory are discarded, the remaining
samples are shifted by SUBL samples towards the beginning of the
vector, while the newly decoded SUBL=40 samples are placed at the
end of the adaptive codebook memory. For backward subblocks, the
construction is similar except that every vector of samples involved
is first time-reversed.
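The memory handling described above amounts to a shift-and-append
buffer. A minimal sketch follows; the function names are
hypothetical, and zeroing the unused leading part of the memory for
the initial sub-block is an assumption, since the text only specifies
the last STATLEN samples.

```c
#include <string.h>

#define LMEM    147   /* adaptive codebook memory length */
#define SUBL     40   /* sub-block length                */
#define STATLEN  80   /* state vector length             */

/* Initial forward sub-block: the state fills the end of the memory.
   Assumption: the leading LMEM-STATLEN samples are set to zero. */
void acb_mem_init(float *mem, const float *state)
{
    memset(mem, 0, (LMEM - STATLEN) * sizeof(float));
    memcpy(mem + LMEM - STATLEN, state, STATLEN * sizeof(float));
}

/* Subsequent sub-blocks: drop the oldest SUBL samples, shift the
   rest towards the beginning, and append the newly decoded samples. */
void acb_mem_update(float *mem, const float *decoded)
{
    memmove(mem, mem + SUBL, (LMEM - SUBL) * sizeof(float));
    memcpy(mem + LMEM - SUBL, decoded, SUBL * sizeof(float));
}

/* Time reversal used for backward sub-blocks (dst != src). */
void reverse_copy(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i++) dst[i] = src[n - 1 - i];
}
```

For backward sub-blocks the same two memory functions would be called
on vectors that have first been passed through reverse_copy.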
A reference implementation of the excitation decoding loop is found
in Appendix A.5.
4.4 Multistage Adaptive Codebook Decoding
The Multistage Adaptive Codebook Decoding module is used at both the
sender (encoder) and the receiver (decoder) ends to produce a
synthetic signal in the residual domain that is eventually used to
produce synthetic speech. The module takes the index values used to
construct vectors that are scaled and summed together to produce a
synthetic signal that is the output of the module.
4.4.1 Construction of the Decoded Excitation Signal
The unpacked index values provided at the input to the module are
references to extended codebooks, which are constructed as described
in Section 3.5.2 with the only difference that it is based on the
codebook memory without the perceptual weighting. The unpacked 3
indexes are used to look up 3 codebook vectors. The unpacked 3 gain
indexes are used to decode the corresponding 3 gains. In this
decoding, the successive rescaling described in Section 3.5.3.2 is
applied.
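As an illustration, the decoded excitation can be formed as a
gain-weighted sum of the three codebook vectors. The sketch below
assumes that the codebook vectors have already been looked up; the
function names are hypothetical, and the gain tables are supplied by
the caller rather than reproduced here.

```c
#include <stddef.h>

#define CB_NSTAGES 3

static float absf(float x) { return x < 0.0f ? -x : x; }

/* Decode the stage gains. The first gain is read directly from its
   table; each later table value is rescaled by max(0.1, |previous
   decoded gain|), mirroring the encoder-side quantization. */
void decode_gains(const int *gain_index, const float *first_table,
                  const float *later_table, float *gain)
{
    gain[0] = first_table[gain_index[0]];
    for (int k = 1; k < CB_NSTAGES; k++) {
        float scale = absf(gain[k - 1]);
        if (scale < 0.1f) scale = 0.1f;
        gain[k] = scale * later_table[gain_index[k]];
    }
}

/* Sum the scaled codebook vectors into the output excitation. */
void construct_excitation(float *out, size_t len,
                          const float **cbvec, const float *gain)
{
    for (size_t i = 0; i < len; i++) {
        out[i] = 0.0f;
        for (int k = 0; k < CB_NSTAGES; k++)
            out[i] += gain[k] * cbvec[k][i];
    }
}
```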
A reference implementation of the adaptive codebook decoding is
listed in Appendix A.34.
4.5 Packet Loss Concealment
If packet loss occurs, the decoder receives a signal saying that
information regarding a block is lost. For such blocks a Packet Loss
Concealment (PLC) unit can be used to create a decoded signal that
masks the effect of that packet loss. In the following we will
describe an example of a PLC unit that can be used with the iLBC
codec. As the PLC unit is used only at the decoder, the PLC unit
does not affect interoperability between implementations. Other PLC
implementations can therefore be used.
The example PLC described operates on the LP filters and the
excitation signals and is based on the following principles:
4.5.1 Block Received Correctly and Previous Block also Received
If the block is received correctly, the PLC only records state
information of the current block that can be used in case the next
block is lost. The LP filters for each subblock, each first stage
adaptive codebook lag (which can be construed as pitch) for each
subblock that runs the adaptive codebook decoding, and the entire
decoded excitation signal are all saved in the PLCState structure.
All this information will be needed if the following block is lost.
4.5.2 Block Not Received
If the block is not received, the block substitution is based on
doing a pitch synchronous repetition of the excitation signal which
is filtered by modified versions of the previous block's LP filters.
The previous block's information is stored in the structure
PLCState.
First, the previous block's LP filters are bandwidth expanded (the
effect of which is to pull the roots away from the unit circle to
mute the resonance of the filters) to produce the LP filters that
are used in the synthesis of the substituted block.
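A common way to realize such a bandwidth expansion is to replace
A(z) by A(z/gamma) with 0 < gamma < 1, which scales the k-th filter
coefficient by gamma^k and moves every root r to gamma*r. The sketch
below shows this operation; the function name is hypothetical, and
the text does not state which expansion factor the example PLC uses.

```c
/* Bandwidth expansion of an LP filter.
   a[0..order] holds the coefficients of A(z), with a[0] = 1.
   out[k] = a[k] * gamma^k gives the coefficients of A(z/gamma). */
void bw_expand(float *out, const float *a, float gamma, int order)
{
    float g = 1.0f;
    for (int k = 0; k <= order; k++) {
        out[k] = a[k] * g;
        g *= gamma;
    }
}
```

Applying the function repeatedly, or with a smaller gamma, gives the
stronger muting described for consecutive lost blocks.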
A correlation analysis is performed on the previous block's
excitation signal in order to detect the amount of pitch periodicity
and a pitch value. The correlation measure is also used to decide on
the voicing level (the degree to which the previous block's
excitation was a voiced or roughly periodic signal). The excitation
in the previous block is used to create an excitation for the block
to be substituted such that the pitch of the previous block is
maintained. Therefore, the new excitation is constructed in a pitch
synchronous manner. In order to avoid a buzzy sounding substituted
block, a random excitation is mixed with the new pitch periodic
excitation and the relative use of the two components is computed
from the correlation measure (voicing level).
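The construction of the substitute excitation can be sketched as
follows. This is only an illustration under stated assumptions: the
function name is hypothetical, the linear mixing rule between the
periodic and the random component is assumed (the text does not give
the exact rule), and the random samples are supplied by the caller.

```c
/* Build a substitute excitation of n samples.
   prev    : excitation of the previous block (n samples)
   noise   : caller-supplied random excitation (n samples)
   pitch   : pitch period in samples, 1 <= pitch <= n
   voicing : assumed to lie in [0, 1]; 1 means fully periodic */
void plc_excitation(float *out, const float *prev, const float *noise,
                    int n, int pitch, float voicing)
{
    for (int i = 0; i < n; i++) {
        /* Repeat the last pitch period of the previous block. */
        float periodic = prev[n - pitch + (i % pitch)];
        /* Assumed mixing rule: linear blend of the two components. */
        out[i] = voicing * periodic + (1.0f - voicing) * noise[i];
    }
}
```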
For the block to be substituted, the newly constructed excitation
signal is then passed through the newly constructed LP filters to
produce the speech that will be substituted for the lost block.
For several consecutive lost blocks, the packet loss concealment
continues in a similar manner. The correlation measure of the last
received block is still used along with the same pitch value. The LP
filters of the last received block are also used again, but the
bandwidth expansion is increased for consecutive lost blocks (as the
length in time from the last received block increases). This
increases the muting of the resonance of the spectral envelope. The
energy of the substituted excitation for consecutive lost blocks is
decreased, leading to a dampened excitation, and therefore dampened
speech.
4.5.3 Block Received Correctly When Previous Block Not Received
For the case in which a block is received correctly when the
previous block was not received, the correctly received block's
directly decoded speech (based solely on the received block) is not
used as the actual output. The reason for this is that the directly
decoded speech does not necessarily smoothly merge into the
synthetic speech generated for the previous lost block. If the two
signals are not smoothly merged, an audible discontinuity is
accidentally produced. Therefore, a correlation analysis between the
two blocks of excitation signal (the excitation of the previous
concealed block and the excitation of the current received block) is
performed to find the best phase match. Then a simple overlap-add
procedure is performed to smoothly merge the previous excitation
into the current block's excitation.
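The two operations can be sketched as below. The function names are
hypothetical, the phase match is shown as a plain cross-correlation
maximum, and the linear cross-fade weights are an assumption, since
the text only calls for a simple overlap-add.

```c
/* Return the lag in [0, max_lag] that maximizes the cross-correlation
   between ref[0..len-1] and sig[lag..lag+len-1].
   sig must hold at least len + max_lag samples. */
int best_phase(const float *ref, const float *sig, int len, int max_lag)
{
    int best = 0;
    float best_corr = 0.0f;
    for (int lag = 0; lag <= max_lag; lag++) {
        float corr = 0.0f;
        for (int i = 0; i < len; i++) corr += ref[i] * sig[lag + i];
        if (lag == 0 || corr > best_corr) {
            best_corr = corr;
            best = lag;
        }
    }
    return best;
}

/* Cross-fade the continuation of the concealed excitation into the
   first `overlap` samples of the current block (in place).
   Assumption: linearly increasing weights. */
void overlap_add(float *cur, const float *prev_cont, int overlap)
{
    for (int i = 0; i < overlap; i++) {
        float w = (float)(i + 1) / (float)(overlap + 1);
        cur[i] = (1.0f - w) * prev_cont[i] + w * cur[i];
    }
}
```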
The exact implementation of the packet loss concealment does not
influence interoperability of the codec.
A reference implementation of the packet loss concealment is
suggested in Appendix A.14. Exact compliance with this suggested
algorithm is not needed for an implementation to be fully
compatible with the overall codec specification.
4.6 Enhancement
The decoder contains an enhancement unit that operates on the
reconstructed excitation signal. The enhancement unit increases the
perceptual quality of the reconstructed signal by reducing the
speech-correlated noise (more accurately: speech-dependent noise)
in the voiced speech segments. The enhancement unit has advantages
over the postfilters that are conventionally used for a similar
purpose.
To understand the motivation for the enhancement unit, it is useful
to define an enhancement signal that is the subtraction of the
distorted input signal from the enhanced output signal. In
conventional postfiltering operators, the relative power of the
enhancement signal will vary strongly as a function of time. In
certain time intervals the enhancement signal has (too) much energy,
and in others it has (too) little. The enhancement operation
settings usually form a heuristic compromise between such time
regions. The need for a compromise results from the postfiltering
operation being based on the input signal only, except for signal
power conservation. In other words, the conventionally used
postfilters operate in open-loop fashion.
4.6.1 Outline of the Enhancement Unit
The enhancement unit of iLBC introduces a second constraint on the
enhanced signal, in addition to the first constraint that conserves
the short-term power between the input and output of the enhancer.
The second constraint is that the enhancement signal (which is
defined as a difference signal resulting from subtracting the
distorted signal from the enhanced signal) is constrained to have a
power that is less than or equal to a certain fraction of the power
of the distorted speech signal. The second constraint prevents the
common artifacts resulting from "over-enhancement" during some time
intervals that are common to conventional postfilters. Yet, the
second constraint does not significantly affect the effectiveness of
the enhancement in sustained voiced regions, where
enhancement of speech signals corrupted by speech-correlated noise
is typically most needed.
The speech enhancement unit includes two basic steps, each performed
for each current time sample of the signal. The pitch track or delay
track that is determined and used in the iLBC coder is an input to
the first step. The first step consists of refining the pitch track
so as to allow a sampling of the distorted input signal using
sampling intervals of precisely one pitch period, starting from the
current sample, to obtain a pitch-period-synchronous sequence. Thus,
the procedure creates such a pitch-period-synchronous sequence for
each sample of the coded excitation (the sample of the distorted
speech signal being also a sample of the corresponding
pitch-period-synchronous sequence).
To simplify processing, the pitch-period-synchronous sequence is
determined simultaneously for a set of consecutive samples of the
distorted input signal (i.e., for a block of that signal). We refer
to such a set of consecutive excitation-signal samples (block) as a
sample-sequence. Our simultaneous determination of pitch-period-
synchronous sequences for an entire sample-sequence results in a
pitch-period-synchronous sequence of sample-sequences.
The second step of our enhancement operator includes re-estimating
each sample based on the corresponding pitch-period-synchronous
sequence, the first signal-power constraint, and the second
constraint operating on the enhancement signal. The sequence of re-
estimated samples (the re-estimated signal block) forms the enhanced
excitation signal. The enhanced speech signal is more periodic than
the distorted speech signal, when the signal is voiced (and the
pitch-period-synchronous sequence corresponds to a nearly periodic
sampling of the distorted signal). To simplify the processing, the
re-estimation procedure is also performed simultaneously for a
sample-sequence, rather than for each sample individually.
Concatenation of the re-estimated sample sequences (excitation
signal blocks) results in the reconstructed excitation signal.
It is noted that in regions where the speech signal is not
nearly-periodic, the speech enhancement system does not change the
distorted signal significantly because of the second constraint.
However, whenever the distorted speech signal is nearly periodic,
the speech enhancement system effectively removes or reduces the
audible distortion. It is also noted that the second constraint not
only results in a reduction of artifacts, but that it also results
in an insensitivity to lack of robustness of determination of pitch-
period-synchronous sequences.
In the following two subsections, we first discuss the determination
of the pitch-synchronous sequence of sample-sequences for the
current sample-sequence and then the re-estimation of the
sample-sequence. Concatenation of the re-estimated sample-sequences
forms the reconstructed excitation signal of the iLBC coder.
4.6.2 Determination of the Pitch-Synchronous Sequences
Upon receiving the pitch track, the enhancer refines this for a
particular block (sample sequence), to obtain a pitch-period-
synchronous sequence of sample-sequences. Such a pitch-period-
synchronous sequence of sample-sequences is determined for each
consecutive block of samples (each block forms a sample-sequence).
The pitch-period-synchronous sequence of sample-sequences is
determined recursively, both forward- and backward-in-time.
We describe the determination of the pitch-period-synchronous
sequence in more detail for the backward iterative
procedure. The forward iterative procedure is analogous. The
sequence of sample-sequences is determined in a computationally
efficient, recursive manner.
The reference sample-sequence of an iteration step is initially
(i.e., for the first iteration step) defined as the current block of
samples. Each subsequent reference sample-sequence is found
recursively in the following steps. In a first step, a signal
segment is up-sampled to create a set of polyphase signals that have
the same sampling rate as the original signal. Each polyphase
signal is offset by a different fractional sampling interval. In a
second step, a subset of sample-sequences of the various polyphase
signals is then identified as candidate sample-sequences. This
subset of sample sequences falls within a certain range of time
delays that is close to the pitch period obtained from the iLBC
decoder. In a third step, one sample sequence is selected from the
set of candidate sample sequences. The selected sample-sequence is
the sample-sequence that has the highest correlation coefficient
with the reference sample-sequence. In the final step of each
iteration, the selected sample-sequence replaces the reference
sample-sequence to prepare for the next iteration. The procedure is
repeated until the required number of sample-sequences backward-in-
time is found, which depends on the parameter settings used for the
iLBC coder.
The forward-in-time part of the pitch-period-synchronous sequence
process is determined in a manner analogous to the backward-in-time
part of the pitch-period-synchronous sequence. The number of sample-
sequences forward-in-time and the number of sample-sequences
backward-in-time can be varied individually, to obtain the desired
delay and performance characteristics.
4.6.3 Re-estimation of the Current Sample-Sequence
For each successive sample-sequence (i.e., each successive block of
the excitation signal), a re-estimation of the sample-sequence is
performed. This re-estimation is determined from the current pitch-
synchronous sequence of sample-sequences, through a constrained
optimization procedure.
Let x_m be a vector representing a sample-sequence with index m
within the current pitch-synchronous sequence of sample-sequences.
The determination of this pitch-synchronous sample sequence was
described in section 4.6.2. Furthermore, let z be the re-estimated
current sample sequence. We then define the following cross-
correlation based periodicity criterion that defines a measure of
periodicity for the pitch-period-synchronous sequence:
e = sum_{m=-W, m!=0}^{m=W} a_m z^T x_m, (1)
where T indicates conjugate transpose, != indicates not equal, and
the set of coefficients a_m form a weighting window that specifies
the weightings of the respective inner product between the re-
estimated sample-sequence and the sample-sequences. We use a
centered Hanning weighting modified so as to set a_0 to 0.
The objective of the re-estimation procedure is to find the modified
current sample-sequence z that maximizes the periodicity criterion
(1) under two constraints. The first constraint is that the
modified vector have the same energy as the original vector:
z^T z = x_0^T x_0 . (2)
The second constraint is that the difference vector, i.e., the
modification, have relatively low energy:
(z-x_0)^T (z-x_0) <= b x_0^T x_0 , (3)
where the value selected for b is positive and less than unity, with
a larger value resulting generally in stronger enhancement of the
signal periodicity. It is clear that, for small b, non-periodic
signals cannot generally be converted into nearly-periodic signals.
The purpose of the second constraint is to prevent production of an
enhanced signal that is significantly different from the original
signal. This also means that the second constraint limits the
numerical size of the errors that the enhancement procedure can
make.
To achieve constrained optimization, the Lagrange multiplier
technique can be used. We distinguish two solution regions for the
optimization: 1) the region where the second constraint is not
activated (in this solution region inequality (3) is a true
inequality) and 2) the region where the second constraint is
activated (in this solution region (3) is an equality constraint).
Let us define
y = sum_{m=-W, m!=0}^{m=W} a_m x_m, (4)
Then, in the first case, where the second constraint is not
activated, the optimized re-estimated vector is simply a scaled
version of y:
z = y sqrt( x_0^T x_0 / (y^T y)). (5)
In the second case, where the second constraint is activated and
becomes an equality constraint, we have that
z= Ay + B x_0 (6)
where
A = sqrt((b-b^2/4) x_0^T x_0/(y^Ty - (y^T x_0)^2/(x_0^T x_0))) (7)
and
B = 1 - b/2 - A (y^T x_0)/(x_0^T x_0). (8)
It is now seen that the entire re-estimation of the current sample-
sequence, from a given pitch-synchronous sequence of sample-
sequences, can be performed in three simple steps. In a first step,
we determine the vector z that optimizes the periodicity with only
the first constraint activated. The resulting trial solution is
given by equation (5). In a second step, we check if this trial
solution satisfies the second constraint given by inequality (3). If
it does, this trial solution for z is used and the third step is
omitted. If
this is not the case, then we determine solution (6) of the
optimization, where both the first and the second constraint are
considered as equality constraints.
As was mentioned before, the reconstructed excitation signal
consists of the concatenation of the re-estimated current sample-
sequences.
Appendix A.16 contains a listing of a reference implementation for
the enhancement method.
4.7 Synthesis Filtering
Upon decoding or PLC of the LP excitation block, the decoded speech
block is obtained by running the decoded LP synthesis filter over
the block. For decoded signal blocks the LP coefficients are changed
at the first sample of every sub block. For PLC blocks, one solution
is to apply the last LP coefficients of the last decoded speech
block for all sub blocks.
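A direct-form sketch of the synthesis filter 1/A(z) is given below.
The function name and the sign convention A(z) = 1 + a[1] z^-1 + ...
+ a[order] z^-order are assumptions; the reference code may order or
sign its coefficients differently.

```c
/* All-pole synthesis filter 1/A(z), order >= 1.
   a[0..order] : LP coefficients with a[0] = 1 (assumed convention)
   mem[0..order-1] : previous outputs, most recent first; updated so
   that the filter state carries over to the next call. */
void synth_filter(float *out, const float *in, int len,
                  const float *a, int order, float *mem)
{
    for (int n = 0; n < len; n++) {
        float acc = in[n];
        for (int k = 1; k <= order; k++)
            acc -= a[k] * mem[k - 1];
        /* Shift the state and store the newest output sample. */
        for (int k = order - 1; k > 0; k--)
            mem[k] = mem[k - 1];
        mem[0] = acc;
        out[n] = acc;
    }
}
```

To change the LP coefficients at the first sample of every sub block,
the function would be called once per sub block with that sub block's
coefficients while the same mem array is passed through all calls.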
The reference implementation for the synthesis filtering can be
found in appendix A.50.
5. SECURITY CONSIDERATIONS
This algorithm for the coding of speech signals is not subject to
any known security consideration; however, its RTP payload format
[1] is subject to several considerations, which are addressed there.
6. REFERENCES
[1] A. Duric and S. V. Andersen, "RTP Payload Format for iLBC
Speech", draft-duric-avt-gips-ilbc-00.txt, February 2002.
[2] S. Bradner, "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
[3] ITU-T Recommendation G.711, available online from the ITU
bookstore at http://www.itu.int.
7. ACKNOWLEDGEMENTS
The authors wish to thank Henry Sinnreich for great support of this
initiative and also wish to thank à. for their valuable feedback and
comments.
8. AUTHORS' ADDRESSES
Soren Vang Andersen
Global IP Sound AB
Rosenlundsgatan 54
Stockholm, S-11863
Sweden
Phone: +46 8 54553040
Email: soren.andersen@globalipsound.com
Alan Duric
Global IP Sound AB
Rosenlundsgatan 54
Stockholm, S-11863
Sweden
Phone: +46 8 54553040
Email: alan.duric@globalipsound.com
Roar Hagen
Global IP Sound AB
Rosenlundsgatan 54
Stockholm, S-11863
Sweden
Phone: +46 8 54553040
Email: roar.hagen@globalipsound.com
W. Bastiaan Kleijn
Global IP Sound AB
Rosenlundsgatan 54
Stockholm, S-11863
Sweden
Phone: +46 8 54553040
Email: bastiaan.kleijn@globalipsound.com
Jan Linden
Global IP Sound Inc.
900 Kearny Street, suite 500
San Francisco, CA-94133
USA
Phone: +1 415 397 2555
Email: jan.linden@globalipsound.com
Manohar N. Murthi
1630 Eagle Dr.
Sunnyvale, CA-94087
USA
Phone: +1 408 749 8160
Email: mnmurthi@yahoo.com
Jan Skoglund
Global IP Sound Inc.
900 Kearny Street, suite 500
San Francisco, CA-94133
USA
Phone: +1 415 397 2555
Email: jan.skoglund@globalipsound.com
Julian Spittka
Global IP Sound Inc.
900 Kearny Street, suite 500
San Francisco, CA-94133
USA
Phone: +1 415 397 2555
Email: julian.spittka@globalipsound.com
APPENDIX A REFERENCE IMPLEMENTATION
This appendix contains the complete c-code for a reference
implementation of encoder and decoder for the specified codec.
The c-code consists of the following files with highest level
functions:
iLBC_test.c: main function for evaluation purpose
iLBC_encode.h: encoder header
iLBC_encode.c: encoder function
iLBC_decode.h: decoder header
iLBC_decode.c: decoder function
the following files containing global defines and constants:
iLBC_define.h: global defines
constants.h: global constants header
constants.c: global constants memory allocations
and the following files containing subroutines:
anaFilter.h: lpc analysis filter header
anaFilter.c: lpc analysis filter function
createCB.h: codebook construction header
createCB.c: codebook construction function
doCPLC.h: packet loss concealment header
doCPLC.c: packet loss concealment function
enhancer.h: signal enhancement header
enhancer.c: signal enhancement function
filter.h: general filter header
filter.c: general filter functions
FrameClassify.h: start state classification header
FrameClassify.c: start state classification function
gaincorr_Encode.h: gain correction encoder header
gaincorr_Encode.c: gain correction encoder function
gainquant.h: gain quantization header
gainquant.c: gain quantization function
getCBvec.h: codebook vector construction header
getCBvec.c: codebook vector construction function
helpfun.h: general purpose header
helpfun.c: general purpose functions
hpInput.h: input high pass filter header
hpInput.c: input high pass filter function
hpOutput.h: output high pass filter header
hpOutput.c: output high pass filter function
iCBConstruct.h: excitation decoding header
iCBConstruct.c: excitation decoding function
iCBSearch.h: excitation encoding header
iCBSearch.c: excitation encoding function
LPCdecode.h: lpc decoding header
LPCdecode.c: lpc decoding function
LPCencode.h: lpc encoding header
LPCencode.c: lpc encoding function
lsf.h: line spectral frequencies header
lsf.c: line spectral frequencies functions
packing.h: bitstream packetization header
packing.c: bitstream packetization functions
StateConstructW.h: state decoding header
StateConstructW.c: state decoding functions
StateSearchW.h: state encoding header
StateSearchW.c: state encoding function
syntFilter.h: lpc synthesis filter header
syntFilter.c: lpc synthesis filter function
The implementation is portable and should work on many different
platforms. It is not optimized for any particular platform; such
optimization is not difficult and is left as an exercise for the
reader.
A.1 iLBC_test.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
iLBC_test.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "iLBC_define.h"
#include "iLBC_encode.h"
#include "iLBC_decode.h"
#include "constants.h"
/* #include "iLBCInterface.h" */
#define ILBCNOOFWORDS (ILBCFLOAT_GIPS_NOOFBYTES/2)
/* Runtime statistics */
#include <time.h>
#ifndef CLOCKS_PER_SEC
#define CLOCKS_PER_SEC 1000
#endif
#define TIME_PER_FRAME 30
/*----------------------------------------------------------------*
* Initiation of encoder instance.
*---------------------------------------------------------------*/
short initEncode( /* (o) Number of bytes encoded */
iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) Encoder instance */
){
int i;
memset((*iLBCenc_inst).anaMem, 0,
ILBCFLOAT_GIPS_FILTERORDER*sizeof(float));
for (i=0; i1) {
printf("\nERROR - Wrong mode - 0, 1 allowed\n"); exit(3);}
/* do actual decoding of block */
iLBC_decode(decblock, (unsigned char *)encoded_data,
iLBCdec_inst, mode);
/* convert to short */
for(k=0;kMAX_SAMPLE)
dtmp=MAX_SAMPLE;
decoded_data[k] = (short) dtmp;
}
return (short)ILBCFLOAT_GIPS_BLOCKL;
}
/*----------------------------------------------------------------*
* Main program to test iLBC encoding and decoding
*
* Usage:
* exefile_name.exe <inputfile> <channelfile> <outputfile>
*
*---------------------------------------------------------------*/
int main(int argc, char* argv[])
{
/* Runtime statistics */
float starttime;
float runtime;
float outtime;
FILE *ifileid,*efileid,*ofileid;
short encoded_data[ILBCNOOFWORDS], data[ILBCFLOAT_GIPS_BLOCKL];
int blockcount = 0;
iLBC_Enc_Inst_t Enc_Inst;
iLBC_Dec_Inst_t Dec_Inst;
/* get arguments and open files */
if(argc != 4 ){
fprintf(stderr, "%s inputfile channelfile outputfile\n",
argv[0]); exit(1);}
if( (ifileid=fopen(argv[1],"rb")) == NULL){
fprintf(stderr,"Cannot open input file %s\n", argv[1]);
exit(2);}
if( (efileid=fopen(argv[2],"wb")) == NULL){
fprintf(stderr, "Cannot open channel file %s\n",
argv[2]); exit(3);}
if( (ofileid=fopen(argv[3],"wb")) == NULL){
fprintf(stderr, "Cannot open output file %s\n",
argv[3]); exit(3);}
/* print info */
fprintf(stderr, "\n");
fprintf(stderr,
"*---------------------------------------------------*\n");
fprintf(stderr,
"* *\n");
fprintf(stderr,
"* ilbclibtest *\n");
fprintf(stderr,
"* *\n");
fprintf(stderr,
"* *\n");
fprintf(stderr,
"*---------------------------------------------------*\n");
fprintf(stderr, "\nInput file : %s\n", argv[1]);
fprintf(stderr,"Channel file : %s\n", argv[2]);
fprintf(stderr,"Output file : %s\n\n", argv[3]);
/* Initialization */
initEncode(&Enc_Inst);
initDecode(&Dec_Inst);
/* Runtime statistics */
starttime=clock()/(float)CLOCKS_PER_SEC;
/* loop over input blocks */
while( fread(data,sizeof(short),
ILBCFLOAT_GIPS_BLOCKL,ifileid)==ILBCFLOAT_GIPS_BLOCKL){
blockcount++;
/* encoding */
fprintf(stderr, "--- Encoding block %i --- ",blockcount);
encode(&Enc_Inst, encoded_data, data);
fprintf(stderr, "\r");
/* write byte file */
fwrite(encoded_data,sizeof(short),ILBCNOOFWORDS,efileid);
/* decoding */
fprintf(stderr, "--- Decoding block %i --- ",blockcount);
decode(&Dec_Inst, data, encoded_data, 1);
fprintf(stderr, "\r");
/* write output file */
fwrite(data,sizeof(short),ILBCFLOAT_GIPS_BLOCKL,ofileid);
}
/* Runtime statistics */
runtime = (float)(clock()/(float)CLOCKS_PER_SEC-starttime);
outtime = (float)((float)blockcount*
(float)TIME_PER_FRAME/1000.0);
printf("\nLength of speech file: %.1f s\n", outtime);
printf("Time to run iLBC_encode+iLBC_decode:");
printf(" %.1f s (%.1f %% of realtime)\n", runtime,
(100*runtime/outtime));
/* close files */
fclose(ifileid); fclose(efileid); fclose(ofileid);
}
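The decode wrapper in A.1 converts each decoded float sample to a
16-bit value and limits it to MAX_SAMPLE before the cast. The
following is a minimal sketch of that saturation step, not the
draft's own code. The helper names and the limits -32768 and 32767
are assumptions made for illustration; the draft's MIN_SAMPLE and
MAX_SAMPLE values are defined outside this section.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limits of a 16-bit PCM sample; the draft's own MIN_SAMPLE
   and MAX_SAMPLE values are not shown in this section. */
#define SKETCH_MIN_SAMPLE (-32768.0f)
#define SKETCH_MAX_SAMPLE (32767.0f)

/* Hypothetical helper: clamp one float sample to the 16-bit range and
   truncate it toward zero, as the (short) cast in A.1 does. */
short sketch_saturate(float x)
{
    if (x < SKETCH_MIN_SAMPLE)
        x = SKETCH_MIN_SAMPLE;
    else if (x > SKETCH_MAX_SAMPLE)
        x = SKETCH_MAX_SAMPLE;
    return (short)x;
}

/* Hypothetical helper: convert a block of n float samples to 16-bit
   PCM with saturation. */
void sketch_float_to_pcm16(short *out, const float *in, size_t n)
{
    size_t k;

    assert(out != NULL && in != NULL);
    for (k = 0; k < n; k++)
        out[k] = sketch_saturate(in[k]);
}
```

Clamping before the cast matters because converting an out-of-range
float to short is not well defined in C.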
A.2 iLBC_encode.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
iLBC_encode.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_ILBCENCODE_H
#define __iLBC_ILBCENCODE_H
#include "iLBC_define.h"
void iLBC_encode(
unsigned char *bytes, /* (o) encoded data bits iLBC */
float *block, /* (i) speech vector to encode */
iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the general encoder
state */
);
#endif
A.3 iLBC_encode.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
iLBC_encode.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <math.h>
#include <stdlib.h>
#include "iLBC_define.h"
#include "LPCencode.h"
#include "FrameClassify.h"
#include "StateSearchW.h"
#include "StateConstructW.h"
#include "helpfun.h"
#include "constants.h"
#include "packing.h"
#include "iCBSearch.h"
#include "iCBConstruct.h"
#include "hpInput.h"
#include "anaFilter.h"
#include "syntFilter.h"
#include "gaincorr_Encode.h"
#include <string.h>
/*----------------------------------------------------------------*
* main encoder function
*---------------------------------------------------------------*/
void iLBC_encode(
unsigned char *bytes, /* (o) encoded data bits iLBC */
float *block, /* (i) speech vector to encode */
iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the general encoder
state */
){
float data[BLOCKL];
float residual[BLOCKL], reverseResidual[BLOCKL];
int start, idxForMax, idxVec[STATE_LEN];
float reverseDecresidual[BLOCKL], mem[MEML];
int n, k, kk, meml_gotten, Nfor, Nback, i;
int dummy=0;
int gain_index[NSTAGES*NASUB], extra_gain_index[NSTAGES];
int cb_index[NSTAGES*NASUB],extra_cb_index[NSTAGES] ;
int lsf_i[LSF_NSPLIT*LPC_N];
unsigned char *pbytes;
int diff, start_pos, state_first;
float en1, en2;
int index, gc_index;
int subcount, subframe;
float gainadjusttarget[BLOCKL];
float weightState[FILTERORDER];
float syntdenum[NSUB*(FILTERORDER+1)];
float weightnum[NSUB*(FILTERORDER+1)];
float weightdenum[NSUB*(FILTERORDER+1)];
float decresidual[BLOCKL];
/* high pass filtering of input signal if such is not done
prior to calling this function */
/* hpInput(block, BLOCKL, data, (*iLBCenc_inst).hpimem); */
/* otherwise simply copy */
memcpy(data,block,BLOCKL*sizeof(float));
/* LPC of hp filtered input data */
LPCencode(syntdenum, weightnum, weightdenum, lsf_i, data,
iLBCenc_inst);
/* inverse filter to get residual */
for (n=0; n en2) {
state_first = 1;
start_pos = (start-1)*SUBL;
} else {
state_first = 0;
start_pos = (start-1)*SUBL + diff;
}
/* scalar quantization of state */
StateSearchW(&residual[start_pos],
&syntdenum[(start-1)*(FILTERORDER+1)],
&weightnum[(start-1)*(FILTERORDER+1)],
&weightdenum[(start-1)*(FILTERORDER+1)], &idxForMax,
idxVec, STATE_SHORT_LEN);
StateConstructW(idxForMax, idxVec,
&syntdenum[(start-1)*(FILTERORDER+1)],
&decresidual[start_pos], STATE_SHORT_LEN);
/* predictive quantization in state */
if (state_first) { /* put adaptive part in the end */
/* setup memory */
memset(mem, 0, (MEML-STATE_SHORT_LEN)*sizeof(float));
memcpy(mem+MEML-STATE_SHORT_LEN, decresidual+start_pos,
STATE_SHORT_LEN*sizeof(float));
memset(weightState, 0, FILTERORDER*sizeof(float));
/* encode subframes */
iCBSearch(extra_cb_index, extra_gain_index,
&residual[start_pos+STATE_SHORT_LEN], mem+MEML-stMemL,
stMemL, diff, NSTAGES,
&weightdenum[(start-1)*(FILTERORDER+1)], weightState);
/* construct decoded vector */
iCBConstruct(&decresidual[start_pos+STATE_SHORT_LEN],
extra_cb_index, extra_gain_index, mem+MEML-stMemL,
stMemL, diff, NSTAGES);
}
else { /* put adaptive part in the beginning */
/* create reversed vectors for prediction */
for(k=0; k 0 ){
/* setup memory */
memset(mem, 0, (MEML-STATE_LEN)*sizeof(float));
memcpy(mem+MEML-STATE_LEN, decresidual+(start-1)*SUBL,
STATE_LEN*sizeof(float));
memset(weightState, 0, FILTERORDER*sizeof(float));
/* loop over subframes to encode */
for (subframe=0; subframe 0 ){
/* create reverse order vectors */
for( n=0; n MEML ){ meml_gotten=MEML; }
for( k=0; k
#include
#include
#include "iLBC_define.h"
#include "StateConstructW.h"
#include "LPCdecode.h"
#include "iCBConstruct.h"
#include "doCPLC.h"
#include "helpfun.h"
#include "constants.h"
#include "packing.h"
#include "string.h"
#include "enhancer.h"
#include "hpOutput.h"
#include "syntFilter.h"
/*----------------------------------------------------------------*
* frame residual decoder function (subroutine to iLBC_decode)
*---------------------------------------------------------------*/
void Decode(
float *decresidual, /* (o) decoded residual frame */
int start, /* (i) location of start state */
int idxForMax, /* (i) codebook index for the maximum value */
int *idxVec, /* (i) codebook indexes for the samples in the
start state*/
float *syntdenum, /* (i) the decoded synthesis filter
coefficients */
int *cb_index, /* (i) the indexes for the adaptive codebook */
int *gain_index, /* (i) the indexes for the corresponding
gains */
int *extra_cb_index, /* (i) the indexes for the adaptive
codebook part of start state */
int *extra_gain_index, /* (i) the indexes for the corresponding
gains */
int state_first, /* (i) 1 if the non-adaptive part of the start state
comes first, 0 if that part comes last */
int gc_index /* (i) the index for the gain correction factor */
){
float reverseDecresidual[BLOCKL], mem[MEML];
int n, k, meml_gotten, Nfor, Nback, i;
int diff, start_pos;
int subcount, subframe;
float factor;
float std_decresidual, one_minus_factor_scaled;
int gaussstart;
diff = STATE_LEN - STATE_SHORT_LEN;
if(state_first == 1) start_pos = (start-1)*SUBL;
else start_pos = (start-1)*SUBL + diff;
/* decode scalar part of start state */
StateConstructW(idxForMax, idxVec,
&syntdenum[(start-1)*(FILTERORDER+1)],
&decresidual[start_pos], STATE_SHORT_LEN);
if (state_first) { /* put adaptive part in the end */
/* setup memory */
memset(mem, 0, (MEML-STATE_SHORT_LEN)*sizeof(float));
memcpy(mem+MEML-STATE_SHORT_LEN, decresidual+start_pos,
STATE_SHORT_LEN*sizeof(float));
/* construct decoded vector */
iCBConstruct(&decresidual[start_pos+STATE_SHORT_LEN],
extra_cb_index, extra_gain_index, mem+MEML-stMemL,
stMemL, diff, NSTAGES);
}
else {/* put adaptive part in the beginning */
/* create reversed vectors for prediction */
for(k=0; k 0 ){
/* setup memory */
memset(mem, 0, (MEML-STATE_LEN)*sizeof(float));
memcpy(mem+MEML-STATE_LEN, decresidual+(start-1)*SUBL,
STATE_LEN*sizeof(float));
/* loop over subframes to decode */
for (subframe=0; subframe 0 ){
/* create reverse order vectors */
for( n=0; n MEML ){ meml_gotten=MEML; }
for( k=0; k0) { /* the data are good */
/* decode data */
pbytes=bytes;
unpack( &pbytes,lsf_i+0,lsf_bits[0]);
unpack( &pbytes,lsf_i+1,lsf_bits[1]);
unpack( &pbytes,lsf_i+2,lsf_bits[2]);
unpack( &pbytes,lsf_i+3,lsf_bits[3]);
unpack( &pbytes,lsf_i+4,lsf_bits[4]);
unpack( &pbytes,lsf_i+5,lsf_bits[5]);
unpack( &pbytes,&start,start_bits);
unpack( &pbytes,&idxForMax,scale_bits);
for(k=0;k0; i--)
mem[i] = mem[i-1];
mem[0] = *pi;
po++;
pi++;
}
}
A.11 createCB.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
createCB.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_CREATECB_H
#define __iLBC_CREATECB_H
void createCB(
float *cb, /* (o) Codebook */
float *invenergy, /* (o) Energy of codebook vectors inverted */
float *mem, /* (i) Buffer to create codebook from */
int lMem, /* (i) Length of buffer */
int cbveclen /* (i) Length of codevector */
);
#endif
A.12 createCB.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
createCB.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include "iLBC_define.h"
#include "constants.h"
#include <string.h>
/*----------------------------------------------------------------*
* Construct a codebook section and calculate inverted energy of
* each codevector.
*---------------------------------------------------------------*/
int createSection( /* (o) Number of vectors constructed */
float *cb, /* (o) Codebook */
float *energy, /* (o) Energy of codebook vectors */
float *mem, /* (i) Buffer to create codebook from */
int lMem, /* (i) Length of buffer */
int cbveclen /* (i) Length of codevector */
){
int j, k, cb_index;
int ilow, ihigh, ilen;
float alfa, alfa1;
float *pp, *ppe, *ppo, *ppi;
/* index counter */
cb_index=0;
ppe=energy;
/* first non-interpolated vector */
k=cbveclen;
*ppe=0.0;
pp=mem+lMem-k;
memcpy(cb+cb_index*cbveclen, pp, cbveclen*sizeof(float));
for (j=0; j 5) {ilen=5; ilow=ihigh+1-ilen;}
/* no interpolation */
*ppe=0.0;
pp=mem+lMem-k/2;
memcpy(cb+cb_index*cbveclen, pp, ilow*sizeof(float));
for (j=0; j lMem-1) eInd=lMem-memInd;
pp=mem+sInd+memInd; pp1=&cbfilters[filtno-1][sInd];
for (j=sInd;j0.0)
invenergy[j]=(float)1.0/(invenergy[j]+EPS);
}
}
A.13 doCPLC.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
doCPLC.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_DOLPC_H
#define __iLBC_DOLPC_H
void doThePLC(
float *PLCresidual, /* (o) concealed residual */
float *PLClpc, /* (o) concealed LP parameters */
int PLI, /* (i) packet loss indicator: 0 = no PL, 1 = PL */
float *decresidual, /* (i) decoded residual */
float *lpc, /* (i) decoded LPC (only used for no PL) */
int inlag, /* (i) pitch lag */
iLBC_Dec_Inst_t *iLBCdec_inst /* (i/o) decoder instance */
);
#endif
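The concealment routine doThePLC in A.14 maps the pitch prediction
gain to a mixing factor by a piecewise-linear rule, blends a pitch
repetition of the previous residual with a noise-like component, and
draws the noise lag from a linear congruential generator. The sketch
below illustrates those three steps in isolation. It is not the
draft's code: the function and type names are hypothetical, and the
breakpoints are passed as parameters because the values of XB_MIX,
XT_MIX, YB_MIX and YT_MIX are defined outside this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter bundle standing in for the draft's XB_MIX,
   XT_MIX, YB_MIX and YT_MIX constants, whose values are not shown. */
typedef struct {
    float xb, xt;   /* lower and upper gain breakpoints */
    float yb, yt;   /* mixing factor at or below xb, at or above xt */
} sketch_mix_t;

/* Piecewise-linear map from pitch gain to the pitch mixing factor. */
float sketch_pitchfact(float gain, const sketch_mix_t *m)
{
    assert(m != NULL && m->xt > m->xb);
    if (gain > m->xt)
        return m->yt;
    if (gain < m->xb)
        return m->yb;
    return m->yb + (gain - m->xb) * (m->yt - m->yb) / (m->xt - m->xb);
}

/* Blend one pitch-repetition sample with one noise sample. */
float sketch_mix(float pitch_sample, float noise_sample, float pitchfact)
{
    return pitchfact * pitch_sample + (1.0f - pitchfact) * noise_sample;
}

/* One step of the generator used in A.14: multiplier 69069,
   increment 1, result kept to 31 bits. */
unsigned long sketch_next_seed(unsigned long seed)
{
    return (seed * 69069UL + 1UL) & 0x7fffffffUL;
}

/* Noise lag derived from the seed, in the range 50..119 samples. */
int sketch_rand_lag(unsigned long seed)
{
    return 50 + (int)(seed % 70UL);
}
```

A high pitch gain therefore favors pitch repetition, while a low gain
shifts the concealed residual toward the noise-like component.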
A.14 doCPLC.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
doCPLC.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <math.h>
#include <string.h>
#include "iLBC_define.h"
/*----------------------------------------------------------------*
* Compute cross correlation and pitch gain for pitch prediction
* of last subframe at given lag.
*---------------------------------------------------------------*/
void compCorr(
float *cc, /* (o) cross correlation coefficient */
float *gc, /* (o) gain */
float *buffer, /* (i) signal buffer */
int lag, /* (i) pitch lag */
int nsub, /* (i) number of subframes */
int subl /* (i) subframe length */
){
int i;
float ftmp1, ftmp2;
ftmp1 = 0.0;
ftmp2 = 0.0;
for (i=0; i 0.0) {
*cc = ftmp1*ftmp1/ftmp2;
*gc = (float)fabs(ftmp1/ftmp2);
}
else {
*cc = 0.0;
*gc = 0.0;
}
}
/*----------------------------------------------------------------*
* Packet loss concealment routine. Conceals a residual signal
* and LP parameters. If no packet loss, update state.
*---------------------------------------------------------------*/
void doThePLC(
float *PLCresidual, /* (o) concealed residual */
float *PLClpc, /* (o) concealed LP parameters */
int PLI, /* (i) packet loss indicator: 0 = no PL, 1 = PL */
float *decresidual, /* (i) decoded residual */
float *lpc, /* (i) decoded LPC (only used for no PL) */
int inlag, /* (i) pitch lag */
iLBC_Dec_Inst_t *iLBCdec_inst /* (i/o) decoder instance */
){
int lag, randlag;
float gain, maxcc;
int i, pick, offset;
float ftmp, ftmp1, randvec[BLOCKL], pitchfact;
/* Packet Loss */
if (PLI == 1) {
(*iLBCdec_inst).consPLICount += 1;
/* if previous frame not lost, determine pitch pred. gain */
if ((*iLBCdec_inst).prevPLI != 1) {
lag=inlag;
compCorr(&maxcc, &gain, (*iLBCdec_inst).prevResidual,
lag, NSUB, SUBL);
if (gain > 1.0) gain = 1.0;
}
/* previous frame lost, use recorded lag and gain */
else {
lag=(*iLBCdec_inst).prevLag;
gain=(*iLBCdec_inst).prevGain;
}
/* Attenuate signal and scale down pitch pred gain if
several frames lost consecutively */
if ((*iLBCdec_inst).consPLICount > 1) gain *= (float)0.9;
/* Compute mixing factor of pitch repetition and noise */
if (gain > XT_MIX)
pitchfact = YT_MIX;
else if (gain < XB_MIX)
pitchfact = YB_MIX;
else
pitchfact = YB_MIX + (gain - XB_MIX) *
(YT_MIX - YB_MIX) / (XT_MIX - XB_MIX);
/* compute concealed residual */
(*iLBCdec_inst).energy = 0.0;
for (i=0; i= GAINTHRESHOLD) {
/* Compute mixing factor of pitch repetition
and noise */
if (gain > XT_MIX)
pitchfact = YT_MIX;
else if (gain < XB_MIX)
pitchfact = YB_MIX;
else
pitchfact = YB_MIX + (gain - XB_MIX) *
(YT_MIX - YB_MIX) / (XT_MIX - XB_MIX);
/* compute concealed residual for 3 subframes */
for (i=0; i<3*SUBL; i++) {
(*iLBCdec_inst).seed=((*iLBCdec_inst).seed*
69069L+1) & (0x80000000L-1);
randlag = 50 + ((signed long)
(*iLBCdec_inst).seed)%70;
/* noise component */
pick = i - randlag;
if (pick < 0)
randvec[i] = gain *
(*iLBCdec_inst).prevResidual[BLOCKL+pick];
else
randvec[i] = gain * randvec[pick];
/* pitch repetition component */
pick = i - lag;
if (pick < 0)
PLCresidual[i] = gain *
(*iLBCdec_inst).prevResidual[BLOCKL+pick];
else
PLCresidual[i] = gain * PLCresidual[pick];
/* mix noise and pitch repetition */
PLCresidual[i] = (pitchfact * PLCresidual[i] +
((float)1.0 - pitchfact) * randvec[i]);
}
/* interpolate concealed residual with actual
residual */
offset = 3*SUBL;
for (i=0; i
#include
#include "iLBC_define.h"
#include "constants.h"
/*----------------------------------------------------------------*
* Find index in array such that the array element with said
* index is the element of said array closest to "value"
* according to the squared-error criterion
*---------------------------------------------------------------*/
void nn(
int *index, /* (o) index of array element closest to value */
float *array, /* (i) data array */
float value, /* (i) value */
int arlength /* (i) dimension of data array */
){
int i;
float bestcrit,crit;
crit=array[0]-value;
bestcrit=crit*crit;
*index=0;
for(i=1;i dim1){
/* printf("enh_upsample.c: shortened filter:
filterlength=%d > dim1=%d\n", filterlength, dim1); */
hfl2=(int) (dim1/2);
for(j=0;jENH_SLOP) slop=ENH_SLOP;
e=b+blockl-1;
bll0=bl-slop; if(bll0<0){ bll0=0;}
bll1=bl+slop; if(bll1+blockl >= idatal){ bll1=idatal-blockl-1;}
corrdim=bll1-bll0+1;
/* compute upsampled correlation (corr33) and find location of
max */
mycorr1(corr22,idata+bll0,corrdim+blockl-1,idata+b,blockl);
enh_upsample(corr33,corr22,corrdim,polyphaser,
ENH_FL0,ENH_UPS0);
tloc=0; maxv=corr33[0];
for(i=1;imaxv){
tloc=i;
maxv=corr33[i];
}
}
/* make a vector that can be upsampled without ever running
outside bounds */
*bl2= (float)bll0+ (float)tloc/(float)ENH_UPS0+(float)1.0;
tloc2=(int)(tloc/ENH_UPS0); if(tloc>tloc2*ENH_UPS0){tloc2++;}
st=bll0+tloc2-ENH_FL0;
vectl=blockl+2*ENH_FL0;
if(st<0){
for(i=0;i<-st;i++){ vect[i]=0.0;}
for(i=-st;iidatal){
for(i=0;i alpha0 * w00){
if( w00 < 1) w00=1;
denom = (w11*w00-w10*w10)/(w00*w00);
if( denom > 0.0001){ /* avoids numerical problems when the
cycles are nearly identical */
A = (float)sqrt( (alpha0- alpha0*alpha0/4)/denom);
B = -alpha0/2 - A * w10/w00;
B = B+1;
}
else{ /* essentially no difference between cycles;
smoothing not needed */
A= 0.0;
B= 1.0;
}
/* create smoothed sequence */
psseq=sseq+hl*blockl;
for(i=0;i=0;q--){
bbb[q]=bbb[q+1]-period[ppl[q+1]];
nn(ppl+q,plocs,bbb[q]+hblockl-period[ppl[q+1]],periodl);
if(bbb[q]-ENH_OVERHANG>=0)
refiner(sseq+q*blockl,bbb+q,idata,idatal,b,bbb[q],
blockl,period[ppl[q+1]]);
else{
psseq=sseq+q*blockl;
for(i=0;i 0.0) {
return (float)(ftmp1*ftmp1/ftmp2);
}
else {
return (float)0.0;
}
}
/*----------------------------------------------------------------*
* interface for enhancer
*---------------------------------------------------------------*/
int enhancerInterface(
float *out, /* (o) enhanced signal */
float *in, /* (i) unenhanced signal */
iLBC_Dec_Inst_t *iLBCdec_inst /* (i) buffers etc */
){
float *enh_buf, *enh_period;
float dummy[2];
int iblock, isample;
int lag, ilag;
float cc, maxcc;
enh_buf=(*iLBCdec_inst).enh_buf;
enh_period=(*iLBCdec_inst).enh_period;
for(isample = 0; isample maxcc) {
maxcc = cc;
lag = ilag;
}
}
enh_period[iblock+ENH_NBLOCKS_EXTRA] = (float)lag;
}
for(iblock = 0; iblock max_ssq){
max_ssq = ssq[n];
max_ssq_n = n;
}
}
/* calculate return index */
if( max_ssq_n == 0) return 1;
if( max_ssq_n == (NSUB-1) ) return NSUB-1;
if( ssq[max_ssq_n-1] > ssq[max_ssq_n+1] ) return max_ssq_n;
return max_ssq_n+1;
}
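The tail of the classification function above selects the start state
from the per-subframe energies: it finds the subframe with the largest
energy and returns a 1-based index that pairs it with its stronger
neighbor. The sketch below restates that selection as a standalone
function. The name sketch_start_index is hypothetical, the energies
are taken as an input array because the code that computes them is
not recoverable from this section, and the loop initialization is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given nsub subframe energies, return the index
   s (1 <= s <= nsub-1) such that the 0-based subframes s-1 and s form
   the start state. */
int sketch_start_index(const float *ssq, int nsub)
{
    int n, max_n = 0;
    float max_ssq;

    assert(ssq != NULL && nsub >= 2);

    /* locate the subframe with maximum energy (assumed initialization) */
    max_ssq = ssq[0];
    for (n = 1; n < nsub; n++) {
        if (ssq[n] > max_ssq) {
            max_ssq = ssq[n];
            max_n = n;
        }
    }

    /* at either edge only one neighbor exists */
    if (max_n == 0)
        return 1;
    if (max_n == nsub - 1)
        return nsub - 1;

    /* otherwise pair the maximum with the stronger neighbor */
    if (ssq[max_n - 1] > ssq[max_n + 1])
        return max_n;
    return max_n + 1;
}
```

With this convention the encoder can locate the start state at sample
(s-1)*SUBL, which matches how start is used in A.3.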
A.21 gaincorr_Encode.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
gaincorr_Encode.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_GAINCORR_ENCODE_H
#define __iLBC_GAINCORR_ENCODE_H
int gaincorr_Encode( /* (o) index to quantized gain correction
factor */
float *decresidual, /* (i) the decoded residual vector without
gain correction */
int start_pos, /* (i) the position of the start state in the
residual vector */
float *residual /* (i) the target residual vector */
);
#endif
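The gain correction factor is quantized in A.22 to one of 16 uniform
levels: the factor is limited to 1, mapped to an index in 0..15, and
reconstructed as (index+1)/16. The sketch below shows only that
quantizer pair. The function names are hypothetical, and the
additional rule in A.22 that forces index 15 for a low
residual-to-state energy ratio is deliberately left out.

```c
#include <assert.h>

#define SKETCH_GC_LEVELS 16   /* number of quantizer levels used in A.22 */

/* Hypothetical helper: map a correction factor to an index 0..15. */
int sketch_gc_quant(float correction_factor)
{
    int index;

    if (correction_factor > 1.0f)
        correction_factor = 1.0f;
    index = (int)(correction_factor * SKETCH_GC_LEVELS) - 1;
    if (index < 0)
        index = 0;
    return index;
}

/* Hypothetical helper: reconstruct the factor from its index. */
float sketch_gc_dequant(int index)
{
    assert(index >= 0 && index < SKETCH_GC_LEVELS);
    return (float)(index + 1) / (float)SKETCH_GC_LEVELS;
}
```

Because the index is obtained by truncation, the reconstructed factor
never exceeds the limited input factor except at the lowest level,
where it is held at 1/16.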
A.22 gaincorr_Encode.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
gaincorr_Encode.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include "iLBC_define.h"
#include "constants.h"
#include <math.h>
int gaincorr_Encode( /* (o) index to quantized gain correction
factor */
float *decresidual, /* (i) the decoded residual vector without
gain correction */
int start_pos, /* (i) the position of the start state in the
residual vector */
float *residual /* (i) the target residual vector */
){
float state_energy = 0.0, dec_state_energy = 0.0,
residual_energy = 0.0, dec_residual_energy = 0.0;
int i, k, index;
float state_loss_factor, residual_loss_factor, correction_factor;
float factor, std_decresidual, one_minus_factor_scaled;
int gaussstart;
/* calculation of state energies */
for(k=0; k 1) correction_factor = 1;
index=(int)(correction_factor*16)-1;
if (index<0) index=0;
if (sqrt(residual_energy/state_energy)<0.25) index=15;
factor=(float)(index+1)/(float)16.0;
for(k=0;k
#include
#include
#include "constants.h"
#include "filter.h"
/*----------------------------------------------------------------*
* quantizer for the gain in the gain-shape coding of residual
*---------------------------------------------------------------*/
float gainquant( /* (o) quantized gain value */
float in, /* (i) gain value */
float maxIn, /* (i) maximum of gain value */
int cblen, /* (i) number of quantization indices */
int *index /* (o) quantization index */
){
int i, tindex;
float minmeasure,measure, *cb, scale;
/* ensure a lower bound on the scaling factor */
scale=maxIn;
if (scale<0.1) scale=(float)0.1;
/* select the quantization table */
if (cblen == 8)
cb = gain_sq3;
else
cb = gain_sq4;
/* select the best index in the quantization table */
minmeasure=10000.0;
for (i=0;i
/*----------------------------------------------------------------*
* Construct codebook vector for given index.
*---------------------------------------------------------------*/
void getCBvec(
float *cbvec, /* (o) Constructed codebook vector */
float *mem, /* (i) Codebook buffer */
int index, /* (i) Codebook index */
int lMem, /* (i) Length of codebook buffer */
int cbveclen /* (i) Codebook vector length */
){
int j, k, n, filtno, memInd, sInd, eInd, sFilt, eFilt;
float accum, tmpbuf[MEML];
int base_size;
int ilow, ihigh, ilen;
float alfa, alfa1;
/* Determine size of codebook sections */
base_size=lMem-cbveclen+1;
if (cbveclen==SUBL)
base_size+=cbveclen/2;
/* No filter -> First codebook section */
if (index < base_size) {
/* first non-interpolated vectors */
if (index 5) {ilen=5; ilow=ihigh+1-ilen;}
/* no interpolation */
memcpy(cbvec, mem+lMem-k/2, ilow*sizeof(float));
/* interpolation */
alfa1=(float)1.0/(float)ilen;
alfa=0.0;
for (j=ilow; j<=ihigh; j++) {
cbvec[j]=((float)1.0-alfa)*mem[lMem-k/2+j]+
alfa*mem[lMem-k+j];
alfa+=alfa1;
}
/* no interpolation */
memcpy(cbvec+ihigh+1, mem+lMem-k+ihigh+1,
(cbveclen-1-ihigh)*sizeof(float));
}
}
/* Higher codebook sections based on filtering */
else {
/* filter number (i.e. section number) */
filtno=index/base_size;
/* first non-interpolated vectors */
if (index-filtno*base_size lMem-1) eInd=lMem-memInd;
for (j=sInd;j lMem-1) eInd=lMem-memInd;
for (j=sInd;j 5) {ilen=5; ilow=ihigh+1-ilen;}
/* no interpolation */
memcpy(cbvec, tmpbuf+lMem-k/2, ilow*sizeof(float));
/* interpolation */
alfa1=(float)1.0/(float)ilen;
alfa=0.0;
for (j=ilow; j<=ihigh; j++) {
cbvec[j]=((float)1.0-alfa)*
tmpbuf[lMem-k/2+j]+alfa*tmpbuf[lMem-k+j];
alfa+=alfa1;
}
/* no interpolation */
memcpy(cbvec+ihigh+1, tmpbuf+lMem-k+ihigh+1,
(cbveclen-1-ihigh)*sizeof(float));
}
}
}
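For the interpolated codebook vectors, getCBvec joins two segments of
the codebook buffer with a short linear cross-fade over the indices
ilow..ihigh: the weight on the second segment starts at 0 and grows
by 1/ilen per sample. The sketch below isolates that cross-fade. The
name sketch_crossfade and its argument layout are hypothetical; in
A.26 the two segments are offsets into mem or tmpbuf, and ilen is
limited to 5.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: over indices ilow..ihigh, fade linearly from
   segment a to segment b. The weight on b starts at 0 and increases
   by 1/ilen per sample, where ilen = ihigh - ilow + 1. */
void sketch_crossfade(float *out, const float *a, const float *b,
                      int ilow, int ihigh)
{
    int j;
    int ilen = ihigh - ilow + 1;
    float alfa = 0.0f;
    float alfa1;

    assert(out != NULL && a != NULL && b != NULL);
    assert(ilow >= 0 && ilen >= 1);

    alfa1 = 1.0f / (float)ilen;
    for (j = ilow; j <= ihigh; j++) {
        out[j] = (1.0f - alfa) * a[j] + alfa * b[j];
        alfa += alfa1;
    }
}
```

Samples before ilow are copied from the first segment and samples
after ihigh from the second, as the two memcpy calls in A.26 do.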
A.27 helpfun.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
helpfun.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_HELPFUN_H
#define __iLBC_HELPFUN_H
void autocorr(
float *r, /* (o) autocorrelation vector */
const float *x, /* (i) data vector */
int N, /* (i) length of data vector */
int order /* (i) largest lag for calculated autocorrelations */
);
void window(
float *z, /* (o) the windowed data */
const float *x, /* (i) the original data vector */
const float *y, /* (i) the window */
int N /* (i) length of all vectors */
);
void levdurb(
float *a, /* (o) lpc coefficient vector starting with 1.0 */
float *k, /* (o) reflection coefficients */
float *r, /* (i) autocorrelation vector */
int order /* (i) order of lpc filter */
);
void interpolate(
float *out, /* (o) the interpolated vector */
float *in1, /* (i) the first vector for the interpolation */
float *in2, /* (i) the second vector for the interpolation */
float coef, /* (i) interpolation weights */
int length /* (i) length of all vectors */
);
void bwexpand(
float *out, /* (o) the bandwidth expanded lpc coefficients */
float *in, /* (i) the lpc coefficients before bandwidth
expansion */
float coef, /* (i) the bandwidth expansion factor */
int length /* (i) the length of lpc coefficient vectors */
);
void vq(
float *Xq, /* (o) the quantized vector */
int *index, /* (o) the quantization index */
const float *CB, /* (i) the vector quantization codebook */
float *X, /* (i) the vector to quantize */
int n_cb, /* (i) the number of vectors in the codebook */
int dim /* (i) the dimension of all vectors */
);
void gvq(
float *Xq, /* (o) the quantized vector */
int *index, /* (o) the quantization index */
float *CB, /* (i) the shape codebook */
float *X, /* (i) the vector to quantize */
int n_cb, /* (i) the number of vectors in the shape
codebook */
int dim, /* (i) dimension of all vectors */
float in_ene, /* (i) the energy of the input vector */
float factor, /* (o) resulting gain factor */
int targlen /* (i) dimension of all vectors */
);
void SplitVQ(
float *qX, /* (o) the quantized vector */
int *index, /* (o) a vector of indexes for all vector
codebooks in the split */
float *X, /* (i) the vector to quantize */
const float *CB, /* (i) the quantizer codebook */
int nsplit, /* (i) the number of vector splits */
const int *dim, /* (i) the dimension of X and qX */
const int *cbsize /* (i) the number of vectors in the
codebook */
);
void sort_sq(
float *xq, /* (o) the quantized value */
int *index, /* (o) the quantization index */
float x, /* (i) the value to quantize */
const float *cb, /* (i) the quantization codebook */
int cb_size /* (i) the size of the quantization codebook */
);
int LSF_check(
float *lsf, /* (i) a table of lsf vectors */
int dim, /* (i) the dimension of each lsf vector */
int NoAn /* (i) the number of lsf vectors in the table */
);
#endif
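Three of the routines declared above form the core of the LPC
analysis: autocorr computes the autocorrelation sequence, levdurb
solves for the predictor coefficients by the Levinson-Durbin
recursion, and bwexpand scales coefficient i by coef^i. The sketch
below repeats the logic of the A.28 listings under sketch_ names so
that it can be compiled on its own. The value of SKETCH_EPS is an
assumption; the draft's EPS is defined in iLBC_define.h, which is
outside this section.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold; the draft's EPS value is not shown here. */
#define SKETCH_EPS 2.220446e-16f

/* Autocorrelation for lags 0..order, as in autocorr() of A.28. */
void sketch_autocorr(float *r, const float *x, int N, int order)
{
    int lag, n;

    assert(r != NULL && x != NULL && order < N);
    for (lag = 0; lag <= order; lag++) {
        float sum = 0.0f;
        for (n = 0; n < N - lag; n++)
            sum += x[n] * x[n + lag];
        r[lag] = sum;
    }
}

/* Levinson-Durbin recursion, following levdurb() of A.28.
   a has order+1 entries (a[0] = 1.0), k has order entries. */
void sketch_levdurb(float *a, float *k, const float *r, int order)
{
    float sum, alpha;
    int m, m_h, i;

    assert(a != NULL && k != NULL && r != NULL && order >= 1);
    a[0] = 1.0f;
    if (r[0] < SKETCH_EPS) {        /* no signal energy: zero predictor */
        for (i = 0; i < order; i++) {
            k[i] = 0.0f;
            a[i + 1] = 0.0f;
        }
        return;
    }
    a[1] = k[0] = -r[1] / r[0];
    alpha = r[0] + r[1] * k[0];     /* prediction error energy */
    for (m = 1; m < order; m++) {
        sum = r[m + 1];
        for (i = 0; i < m; i++)
            sum += a[i + 1] * r[m - i];
        k[m] = -sum / alpha;
        alpha += k[m] * sum;
        m_h = (m + 1) >> 1;
        for (i = 0; i < m_h; i++) { /* in-place update of a[1..m] */
            sum = a[i + 1] + k[m] * a[m - i];
            a[m - i] += k[m] * a[i + 1];
            a[i + 1] = sum;
        }
        a[m + 1] = k[m];
    }
}

/* Bandwidth expansion: out[i] = coef^i * in[i], as in bwexpand(). */
void sketch_bwexpand(float *out, const float *in, float coef, int length)
{
    int i;
    float chirp = coef;

    assert(out != NULL && in != NULL && length >= 1);
    out[0] = in[0];
    for (i = 1; i < length; i++) {
        out[i] = chirp * in[i];
        chirp *= coef;
    }
}
```

For the autocorrelation sequence 1, 0.5, 0.25 of a first-order
process, the recursion yields a[1] close to -0.5 and a[2] close to 0,
so a second-order fit adds nothing beyond the first-order predictor.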
A.28 helpfun.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
helpfun.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <math.h>
#include "iLBC_define.h"
#include "constants.h"
/*----------------------------------------------------------------*
* calculation of auto correlation
*---------------------------------------------------------------*/
void autocorr(
float *r, /* (o) autocorrelation vector */
const float *x, /* (i) data vector */
int N, /* (i) length of data vector */
int order /* (i) largest lag for calculated autocorrelations */
){
int lag, n;
float sum;
for (lag = 0; lag <= order; lag++) {
sum = 0;
for (n = 0; n < N - lag; n++)
sum += x[n] * x[n+lag];
r[lag] = sum;
}
}
/*----------------------------------------------------------------*
* window multiplication
*---------------------------------------------------------------*/
void window(
float *z, /* (o) the windowed data */
const float *x, /* (i) the original data vector */
const float *y, /* (i) the window */
int N /* (i) length of all vectors */
){
int i;
for (i = 0; i < N; i++) {
z[i] = x[i] * y[i];
}
}
/*----------------------------------------------------------------*
* levinson-durbin solution for lpc coefficients
*---------------------------------------------------------------*/
void levdurb(
float *a, /* (o) lpc coefficient vector starting with 1.0 */
float *k, /* (o) reflection coefficients */
float *r, /* (i) autocorrelation vector */
int order /* (i) order of lpc filter */
){
float sum, alpha;
int m, m_h, i;
a[0] = 1.0;
if (r[0] < EPS) { /* if r[0] is (near) zero, set LPC coeff. to zero */
for (i = 0; i < order; i++) {
k[i] = 0;
a[i+1] = 0;
}
} else {
a[1] = k[0] = -r[1]/r[0];
alpha = r[0] + r[1] * k[0];
for (m = 1; m < order; m++){
sum = r[m + 1];
for (i = 0; i < m; i++){
sum += a[i+1] * r[m - i];
}
k[m] = -sum / alpha;
alpha += k[m] * sum;
m_h = (m + 1) >> 1;
for (i = 0; i < m_h; i++){
sum = a[i+1] + k[m] * a[m - i];
a[m - i] += k[m] * a[i+1];
a[i+1] = sum;
}
a[m+1] = k[m];
}
}
}
/*----------------------------------------------------------------*
* interpolation between vectors
*---------------------------------------------------------------*/
void interpolate(
float *out, /* (o) the interpolated vector */
float *in1, /* (i) the first vector for the interpolation */
float *in2, /* (i) the second vector for the interpolation */
float coef, /* (i) interpolation weights */
int length /* (i) length of all vectors */
){
int i;
float invcoef;
invcoef = (float)1.0 - coef;
for (i = 0; i < length; i++)
out[i] = coef * in1[i] + invcoef * in2[i];
}
/*----------------------------------------------------------------*
* lpc bandwidth expansion
*---------------------------------------------------------------*/
void bwexpand(
float *out, /* (o) the bandwidth expanded lpc coefficients */
float *in, /* (i) the lpc coefficients before bandwidth
expansion */
float coef, /* (i) the bandwidth expansion factor */
int length /* (i) the length of lpc coefficient vectors */
){
int i;
float chirp;
chirp = coef;
out[0] = in[0];
for (i = 1; i < length; i++) {
out[i] = chirp * in[i];
chirp *= coef;
}
}
/*----------------------------------------------------------------*
* vector quantization
*---------------------------------------------------------------*/
void vq(
float *Xq, /* (o) the quantized vector */
int *index, /* (o) the quantization index */
const float *CB, /* (i) the vector quantization codebook */
float *X, /* (i) the vector to quantize */
int n_cb, /* (i) the number of vectors in the codebook */
int dim /* (i) the dimension of all vectors */
){
int i, j;
int pos, minindex;
float dist, tmp, mindist;
pos = 0;
minindex = 0; /* defined result even if no codevector is selected */
mindist = FLOAT_MAX;
for (j = 0; j < n_cb; j++) {
dist = X[0] - CB[pos];
dist *= dist;
for (i = 1; i < dim; i++) {
tmp = X[i] - CB[pos + i];
dist += tmp*tmp;
if (dist >= mindist) goto next;
}
if (dist < mindist) {
mindist = dist;
minindex = j;
}
next: pos += dim;
}
for (i = 0; i < dim; i++) {
Xq[i] = CB[minindex*dim + i];
}
*index = minindex;
}
/*----------------------------------------------------------------*
* split vector quantization
*---------------------------------------------------------------*/
void SplitVQ(
float *qX, /* (o) the quantized vector */
int *index, /* (o) a vector of indexes for all vector
codebooks in the split */
float *X, /* (i) the vector to quantize */
const float *CB, /* (i) the quantizer codebook */
int nsplit, /* (i) the number of vector splits */
const int *dim, /* (i) the dimension of each split of X and qX */
const int *cbsize /* (i) the number of vectors in the codebook
of each split */
){
int cb_pos, X_pos, i;
cb_pos = 0;
X_pos= 0;
for (i = 0; i < nsplit; i++) {
vq(qX + X_pos, index + i, CB + cb_pos, X + X_pos,
cbsize[i], dim[i]);
X_pos += dim[i];
cb_pos += dim[i] * cbsize[i];
}
}
/*----------------------------------------------------------------*
* scalar quantization
*---------------------------------------------------------------*/
void sort_sq(
float *xq, /* (o) the quantized value */
int *index, /* (o) the quantization index */
float x, /* (i) the value to quantize */
const float *cb, /* (i) the quantization codebook */
int cb_size /* (i) the size of the quantization codebook */
){
int i;
if (x <= cb[0]) {
*index = 0;
*xq = cb[0];
} else {
i = 0;
while ((x > cb[i]) && i < cb_size - 1)
i++;
if (x > ((cb[i] + cb[i - 1])/2)) {
*index = i;
*xq = cb[i];
} else {
*index = i - 1;
*xq = cb[i - 1];
}
}
}
/*----------------------------------------------------------------*
* check for stability of lsf coefficients
*---------------------------------------------------------------*/
int LSF_check( /* (o) 1 if any lsf value had to be adjusted
and 0 otherwise */
float *lsf, /* (i) a table of lsf vectors */
int dim, /* (i) the dimension of each lsf vector */
int NoAn /* (i) the number of lsf vectors in the table */
){
int k,n,m, Nit=2, change=0,pos;
float tmp;
static float eps=(float)0.039; /* 50 Hz */
static float eps2=(float)0.0195;
static float maxlsf=(float)3.14; /* 4000 Hz */
static float minlsf=(float)0.01; /* 0 Hz */
/* LSF separation check*/
for (n=0;nmaxlsf) {
lsf[pos]=maxlsf;
change=1;
}
}
}
}
return change;
}
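The declarations and the clamping tail of LSF_check suggest a check in
two parts: neighbouring lsf values must be at least eps apart, and
every value must lie between minlsf and maxlsf. The following is a
minimal sketch under that assumption. The name lsf_check_sketch, the
midpoint repair rule, and the fixed count of two passes are
illustrative assumptions and are not taken from the listing above.

```c
#include <assert.h>

/* Hypothetical stand-in for LSF_check: returns 1 if any value was
   adjusted and 0 if the table was left untouched. */
int lsf_check_sketch(float *lsf, int dim, int n_vec)
{
    const float eps = 0.039f;    /* minimum spacing, about 50 Hz */
    const float eps2 = 0.0195f;  /* half of the minimum spacing */
    const float maxlsf = 3.14f;  /* upper bound, about 4000 Hz */
    const float minlsf = 0.01f;  /* lower bound, just above 0 Hz */
    int pass, m, k, pos, change = 0;

    assert(lsf != 0 && dim >= 2 && n_vec >= 1);

    for (pass = 0; pass < 2; pass++) {      /* assumed pass count */
        for (m = 0; m < n_vec; m++) {       /* each lsf vector */
            for (k = 0; k < dim - 1; k++) { /* each neighbour pair */
                pos = m * dim + k;
                if (lsf[pos + 1] - lsf[pos] < eps) {
                    /* assumed repair: spread the pair around its
                       midpoint so that it ends up eps apart */
                    float mid = 0.5f * (lsf[pos] + lsf[pos + 1]);
                    lsf[pos] = mid - eps2;
                    lsf[pos + 1] = mid + eps2;
                    change = 1;
                }
                if (lsf[pos] < minlsf) {
                    lsf[pos] = minlsf;
                    change = 1;
                }
                if (lsf[pos] > maxlsf) {
                    lsf[pos] = maxlsf;
                    change = 1;
                }
            }
        }
    }
    return change;
}
```

A caller would pass the table of quantized lsf vectors, as LPCencode
does with lsfq, FILTERORDER, and LPC_N, and treat a return value of 1
as a sign that the quantizer output had to be repaired.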
A.29 hpInput.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
hpInput.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_HPINPUT_H
#define __iLBC_HPINPUT_H
void hpInput(
float *In, /* (i) vector to filter */
int len, /* (i) length of vector to filter */
float *Out, /* (o) the resulting filtered vector */
float *mem /* (i/o) the filter state */
);
#endif
A.30 hpInput.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
hpInput.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include "constants.h"
/*----------------------------------------------------------------*
* Input high-pass filter
*---------------------------------------------------------------*/
void hpInput(
float *In, /* (i) vector to filter */
int len, /* (i) length of vector to filter */
float *Out, /* (o) the resulting filtered vector */
float *mem /* (i/o) the filter state */
){
int i;
float *pi, *po;
/* all-zero section*/
pi = &In[0];
po = &Out[0];
for (i=0; i
#include "iLBC_define.h"
#include "gainquant.h"
#include "getCBvec.h"
/*----------------------------------------------------------------*
* Construct decoded vector from codebook and gains.
*---------------------------------------------------------------*/
void iCBConstruct(
float *decvector, /* (o) Decoded vector */
int *index, /* (i) Codebook indices */
int *gain_index, /* (i) Gain de-quantization indices */
float *mem, /* (i) Buffer for codevector construction */
int lMem, /* (i) Length of buffer */
int veclen, /* (i) Length of vector */
int nStages /* (i) Number of codebook stages */
){
int j,k;
float gain[NSTAGES];
float cbvec[SUBL];
/* gain de-quantization */
gain[0] = gaindequant(gain_index[0], 1.0, 16);
if (nStages > 1)
gain[1] = gaindequant(gain_index[1],
(float)fabs(gain[0]), 8);
if (nStages > 2)
gain[2] = gaindequant(gain_index[2],
(float)fabs(gain[1]), 8);
/* codebook vector construction and construction of
total vector */
getCBvec(cbvec, mem, index[0], lMem, veclen);
for (j=0;j 1) {
for (k=1; k
#include <math.h>
#include <string.h>
#include "iLBC_define.h"
#include "gainquant.h"
#include "createCB.h"
#include "filter.h"
/*----------------------------------------------------------------*
* Search routine for codebook encoding and gain quantization.
*---------------------------------------------------------------*/
void iCBSearch(
int *index, /* (o) Codebook indices */
int *gain_index, /* (o) Gain quantization indices */
float *intarget, /* (i) Target vector for encoding */
float *mem, /* (i) Buffer for codebook construction */
int lMem, /* (i) Length of buffer */
int lTarget, /* (i) Length of vector */
int nStages, /* (i) Number of codebook stages */
float *weightDenum, /* (i) weighting filter coefficients */
float *weightState /* (i) weighting filter state */
){
int i, j, icount, stage, best_index;
float max_measure, gain, measure, crossDot;
float gains[NSTAGES];
float cb[(MEML+SUBL+1)*CBEXPAND*SUBL];
float target[SUBL];
int base_index, sInd, eInd, base_size;
float buf[MEML+SUBL+2*FILTERORDER];
float invenergy[512], *pp;
/* copy target */
memcpy(target, intarget, lTarget*sizeof(float));
/* Determine size of codebook sections */
base_size=lMem-lTarget+1;
if (lTarget==SUBL)
base_size=lMem-lTarget+1+lTarget/2;
/* setup buffer for weighting */
memcpy(buf,weightState,sizeof(float)*FILTERORDER);
memcpy(buf+FILTERORDER,mem,lMem*sizeof(float));
memcpy(buf+FILTERORDER+lMem,intarget,lTarget*sizeof(float));
/* weighting */
AllPoleFilter(buf+FILTERORDER, weightDenum, lMem+lTarget,
FILTERORDER);
/* Construct the codebook and target needed */
createCB(cb, invenergy, buf+FILTERORDER, lMem, lTarget);
memcpy(target, buf+FILTERORDER+lMem, lTarget*sizeof(float));
/* The Main Loop over stages */
for (stage=0;stage 0.0)
measure = crossDot*crossDot*invenergy[icount];
}
else {
measure = crossDot*crossDot*invenergy[icount];
}
/* check if measure better */
if(measure>max_measure){
best_index = icount;
max_measure = measure;
gain = crossDot*invenergy[icount];
}
}
/* set search range for following codebook sections */
base_index=best_index;
/* unrestricted search */
if (RESRANGE == -1) {
sInd=0;
eInd=base_size-1;
}
/* restricted search around best index from first
codebook section */
else {
sInd=base_index-RESRANGE/2;
if (sInd < 0) sInd=0;
eInd = sInd+RESRANGE;
if (eInd>=base_size) {
eInd=base_size-1;
sInd=eInd-RESRANGE;
}
}
/* search of higher codebook sections */
for (i=1; i 0.0)
measure = crossDot*crossDot*
invenergy[icount];
}
else {
measure = crossDot*crossDot*invenergy[icount];
}
/* check if measure better */
if(measure>max_measure){
best_index = icount;
max_measure = measure;
gain = crossDot*invenergy[icount];
}
}
}
/* record best index */
index[stage] = best_index;
/* gain quantization */
if(stage==0){
if (gain<0.0) gain = 0.0;
if (gain>1.0) gain = 1.0;
gain = gainquant(gain, 1.0, 16, &gain_index[stage]);
}
else {
if(fabs(gain) > fabs(gains[stage-1])){
gain = gain * (float)fabs(gains[stage-1])/
(float)fabs(gain);
}
gain = gainquant(gain, (float)fabs(gains[stage-1]),
8, &gain_index[stage]);
}
/* Update target */
for(j=0;j
#include
#include "helpfun.h"
#include "lsf.h"
#include "iLBC_define.h"
#include "constants.h"
/*----------------------------------------------------------------*
* interpolation of lsf coefficients for the decoder
*---------------------------------------------------------------*/
void LSFinterpolate2a_dec(
float *a, /* (o) lpc coefficients for a sub frame */
float *lsf1, /* (i) first lsf coefficient vector */
float *lsf2, /* (i) second lsf coefficient vector */
float coef, /* (i) interpolation weight */
int length /* (i) length of lsf vectors */
){
float lsftmp[FILTERORDER];
interpolate(lsftmp, lsf1, lsf2, coef, length);
lsf2a(a, lsftmp);
}
/*----------------------------------------------------------------*
* obtain quantized lsf coefficients from quantization index
*---------------------------------------------------------------*/
void SimplelsfUNQ(
float *lsfunq, /* (o) quantized lsf coefficients */
int *index /* (i) quantization index */
){
int i,j, pos, cb_pos;
float lsfhat[FILTERORDER];
/* decode last LSF */
pos = 0;
cb_pos = 0;
for (i = 0; i < LSF_NSPLIT; i++) {
for (j = 0; j < dim_ml[i]; j++) {
lsfunq[FILTERORDER + pos + j] = cb_ml[cb_pos +
(long)(index[LSF_NSPLIT + i])*dim_ml[i] + j];
}
pos += dim_ml[i];
cb_pos += size_ml[i]*dim_ml[i];
}
/* decode prediction error for first LSF */
pos = 0;
cb_pos = 0;
for (i = 0; i < LSF_NSPLIT; i++) {
for (j = 0; j < dim_p[i]; j++) {
lsfunq[pos + j] = cb_p[cb_pos +
(long)(index[i])*dim_p[i] + j];
}
pos += dim_p[i];
cb_pos += size_p[i]*dim_p[i];
}
/* add prediction, mean, and dequantized prediction error
to obtain output LSF */
for (i = 0; i < FILTERORDER; i++) {
lsfhat[i] = lsfpred[i] *
(lsfunq[FILTERORDER + i] - lsfmean[i]);
lsfunq[i] += lsfmean[i] + lsfhat[i];
}
}
/*----------------------------------------------------------------*
* obtain synthesis and weighting filters from lsf coefficients
*---------------------------------------------------------------*/
void DecoderInterpolateLSF(
float *syntdenum, /* (o) synthesis filter coefficients */
float *weightnum, /* (o) weighting numerator coefficients */
float *weightdenum, /* (o) weighting denominator
coefficients */
float *lsfunq, /* (i) quantized lsf coefficients */
int length, /* (i) length of lsf coefficient vector */
iLBC_Dec_Inst_t *iLBCdec_inst /* (i) the decoder state
structure */
){
int i, pos, lp_length;
float lp[FILTERORDER + 1], *lsfunq2;
lsfunq2 = lsfunq + length;
lp_length = length + 1;
/* subframe 1: Interpolation between old and first */
LSFinterpolate2a_dec(lp, (*iLBCdec_inst).lsfunqold, lsfunq,
coef[0], length);
bwexpand(syntdenum, lp, CHIRP_SYNTDENUM, lp_length);
bwexpand(weightnum, lp, CHIRP_WEIGHTNUM, lp_length);
bwexpand(weightdenum, lp, CHIRP_WEIGHTDENUM, lp_length);
/* subframes 2 to 6: interpolation between first and last
LSF */
pos = lp_length;
for (i = 1; i < 6; i++) {
LSFinterpolate2a_dec(lp, lsfunq, lsfunq2, coef[i],
length);
bwexpand(syntdenum+pos, lp, CHIRP_SYNTDENUM, lp_length);
bwexpand(weightnum + pos, lp, CHIRP_WEIGHTNUM, lp_length);
bwexpand(weightdenum + pos, lp, CHIRP_WEIGHTDENUM,
lp_length);
pos += lp_length;
}
/* update memory */
for (i = 0; i < length; i++) {
(*iLBCdec_inst).lsfunqold[i] = lsfunq2[i];
}
}
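SimplelsfUNQ rebuilds the first lsf vector as decoded prediction
error plus mean plus a prediction from the second vector. The sketch
below isolates that relation and pairs it with the error computation
used on the encoder side, to show that the two steps are inverses
when the error itself is not quantized. The dimension DIM and the
tables mean3 and pred3 are invented for illustration; the codec's own
lsfmean and lsfpred tables are not shown in this section.

```c
#include <assert.h>

#define DIM 3 /* illustrative dimension; the codec uses FILTERORDER */

/* Hypothetical mean and predictor tables, invented for this example. */
static const float mean3[DIM] = {0.4f, 1.2f, 2.3f};
static const float pred3[DIM] = {0.5f, 0.5f, 0.5f};

/* Encoder side: prediction error of the first vector, given the
   already quantized second vector. */
void lsf_pred_error(float *e, const float *lsf1, const float *lsf2q)
{
    int i;
    assert(e != 0 && lsf1 != 0 && lsf2q != 0);
    for (i = 0; i < DIM; i++) {
        float hat = pred3[i] * (lsf2q[i] - mean3[i]);
        e[i] = lsf1[i] - mean3[i] - hat;
    }
}

/* Decoder side: add mean and prediction back onto the error. */
void lsf_pred_decode(float *lsf1, const float *e, const float *lsf2q)
{
    int i;
    assert(lsf1 != 0 && e != 0 && lsf2q != 0);
    for (i = 0; i < DIM; i++) {
        float hat = pred3[i] * (lsf2q[i] - mean3[i]);
        lsf1[i] = e[i] + mean3[i] + hat;
    }
}
```

Because both sides predict from the quantized second vector, the
decoder can form the same prediction as the encoder without any
extra side information.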
A.39 LPCencode.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
LPCencode.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_LPCENCOD_H
#define __iLBC_LPCENCOD_H
void LPCencode(
float *syntdenum, /* (i/o) synthesis filter coefficients
before/after encoding */
float *weightnum, /* (i/o) weighting numerator coefficients
before/after encoding */
float *weightdenum, /* (i/o) weighting denominator coefficients
before/after encoding */
int *lsf_index, /* (o) lsf quantization index */
float *data, /* (i) new data vector for the lpc analysis */
iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the encoder state
structure */
);
#endif
A.40 LPCencode.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
LPCencode.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <string.h>
#include "iLBC_define.h"
#include "helpfun.h"
#include "lsf.h"
#include "constants.h"
/*----------------------------------------------------------------*
* lpc analysis (subroutine to LPCencode)
*---------------------------------------------------------------*/
void SimpleAnalysis(
float *lsf, /* (o) lsf coefficients */
float *data, /* (i) new data vector */
float *lpc_buffer /* (i/o) buffer containing old data */
){
int k, is,i;
float temp[BLOCKL], lp[FILTERORDER + 1], lp2[FILTERORDER + 1],
r[FILTERORDER + 1];
for (i = 0; i < BLOCKL; i++)
lpc_buffer[LPC_AHEADL + i] = data[i];
/* No lookahead, last window is asymmetric */
for (k = 0; k < LPC_N; k++) {
is = k*LPC_AHEADL;
if (k < (LPC_N - 1))
window(temp, lpc_win, lpc_buffer + is, BLOCKL);
else
window(temp, lpc_asymwin, lpc_buffer + is, BLOCKL);
autocorr(r, temp, BLOCKL, FILTERORDER);
window(r, r, lpc_lagwin, FILTERORDER + 1);
levdurb(lp, temp, r, FILTERORDER);
bwexpand(lp2, lp, CHIRP_SYNTDENUM, FILTERORDER+1);
a2lsf(lsf + k*FILTERORDER, lp2);
}
memcpy(lpc_buffer, lpc_buffer+BLOCKL, LPC_AHEADL*sizeof(float));
}
/*----------------------------------------------------------------*
* lsf interpolator and conversion from lsf to a coefficients
* (subroutine to SimpleInterpolateLSF)
*---------------------------------------------------------------*/
void LSFinterpolate2a_enc(
float *a, /* (o) lpc coefficients */
float *lsf1, /* (i) first set of lsf coefficients */
float *lsf2, /* (i) second set of lsf coefficients */
float coef, /* (i) weighting coefficient to use between lsf1
and lsf2 */
long length /* (i) length of coefficient vectors */
){
float lsftmp[FILTERORDER];
interpolate(lsftmp, lsf1, lsf2, coef, length);
lsf2a(a, lsftmp);
}
/*----------------------------------------------------------------*
* lsf interpolator (subroutine to LPCencode)
*---------------------------------------------------------------*/
void SimpleInterpolateLSF(
float *syntdenum, /* (o) the synthesis filter denominator
resulting from the quantized interpolated lsf */
float *weightnum, /* (o) the weighting filter numerator
resulting from the unquantized interpolated lsf */
float *weightdenum, /* (o) the weighting filter denominator
resulting from the unquantized interpolated lsf */
float *lsf, /* (i) the unquantized lsf coefficients */
float *lsfq, /* (i) the quantized lsf coefficients */
float *lsfold, /* (i) the unquantized lsf coefficients of the
previous signal frame */
float *lsfqold, /* (i) the quantized lsf coefficients of the
previous signal frame */
int length /* (i) should equal FILTERORDER */
){
int i, pos, lp_length;
float lp[FILTERORDER + 1], *lsf2, *lsfq2;
lsf2 = lsf + length;
lsfq2 = lsfq + length;
lp_length = length + 1;
/* subframe 1: Interpolation between old and first set of
lsf coefficients */
LSFinterpolate2a_enc(lp, lsfqold, lsfq, coef[0], length);
bwexpand(syntdenum, lp, CHIRP_SYNTDENUM, lp_length);
LSFinterpolate2a_enc(lp, lsfold, lsf, coef[0], length);
bwexpand(weightnum, lp, CHIRP_WEIGHTNUM, lp_length);
bwexpand(weightdenum, lp, CHIRP_WEIGHTDENUM, lp_length);
/* subframe 2 to 6: Interpolation between first and second
set of lsf coefficients */
pos = lp_length;
for (i = 1; i < NSUB; i++) {
LSFinterpolate2a_enc(lp, lsfq, lsfq2, coef[i], length);
bwexpand(syntdenum + pos, lp, CHIRP_SYNTDENUM, lp_length);
LSFinterpolate2a_enc(lp, lsf, lsf2, coef[i], length);
bwexpand(weightnum + pos, lp, CHIRP_WEIGHTNUM, lp_length);
bwexpand(weightdenum + pos, lp, CHIRP_WEIGHTDENUM,
lp_length);
pos += lp_length;
}
/* update memory */
for (i = 0; i < length; i++) {
lsfold[i] = lsf2[i];
lsfqold[i] = lsfq2[i];
}
}
/*----------------------------------------------------------------*
* lsf quantizer (subroutine to LPCencode)
*---------------------------------------------------------------*/
void SimplelsfQ(
float *lsfq, /* (o) quantized lsf coefficients
(dimension FILTERORDER) */
int *index, /* (o) quantization index */
float *lsf /* (i) the lsf coefficient vector to be
quantized (dimension FILTERORDER ) */
){
int i;
float e[FILTERORDER], lsfhat[FILTERORDER];
/* Quantize second LSF with memoryless split VQ */
SplitVQ(lsfq + FILTERORDER, index + LSF_NSPLIT, lsf +
FILTERORDER, cb_ml, LSF_NSPLIT, dim_ml, size_ml);
/* Calculate prediction error for first LSF from second */
for (i = 0; i < FILTERORDER; i++) {
lsfhat[i] = lsfpred[i] * (lsfq[FILTERORDER + i] -
lsfmean[i]);
e[i] = lsf[i] - lsfmean[i] - lsfhat[i];
}
/* Quantize prediction error */
SplitVQ(lsfq, index, e, cb_p, LSF_NSPLIT, dim_p, size_p);
for (i = 0; i < FILTERORDER; i++)
lsfq[i] += lsfmean[i] + lsfhat[i];
}
/*----------------------------------------------------------------*
* lpc encoder
*---------------------------------------------------------------*/
void LPCencode(
float *syntdenum, /* (i/o) synthesis filter coefficients
before/after encoding */
float *weightnum, /* (i/o) weighting numerator coefficients
before/after encoding */
float *weightdenum, /* (i/o) weighting denominator coefficients
before/after encoding */
int *lsf_index, /* (o) lsf quantization index */
float *data, /* (i) new data vector for the lpc analysis */
iLBC_Enc_Inst_t *iLBCenc_inst /* (i/o) the encoder state
structure */
){
float lsf[FILTERORDER * LPC_N], lsfq[FILTERORDER * LPC_N];
int change=0;
SimpleAnalysis(lsf, data, (*iLBCenc_inst).lpc_buffer);
SimplelsfQ(lsfq, lsf_index, lsf);
change=LSF_check(lsfq, FILTERORDER, LPC_N);
SimpleInterpolateLSF(syntdenum, weightnum, weightdenum,
lsf, lsfq, (*iLBCenc_inst).lsfold,
(*iLBCenc_inst).lsfqold, FILTERORDER);
}
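SimpleAnalysis derives each set of lpc coefficients by running
autocorr, a lag window, levdurb, and bwexpand in sequence. The sketch
below follows the last two steps on a small hand-made autocorrelation
sequence, so that the result can be checked by hand. It uses the
textbook form of the Levinson-Durbin recursion rather than the exact
loop of levdurb; the names lpc_from_autocorr and bw_expand and the
expansion factor 0.9 are assumptions made for this example (the value
of CHIRP_SYNTDENUM is not given in this section).

```c
#include <assert.h>

/* Levinson-Durbin recursion in textbook form. a[0..order] receives
   the predictor polynomial with a[0] = 1, r[0..order] is the
   autocorrelation sequence. Returns the final prediction error. */
float lpc_from_autocorr(float *a, const float *r, int order)
{
    float err, k, acc, tmp;
    int m, i;

    assert(a != 0 && r != 0 && order >= 1 && r[0] > 0.0f);
    a[0] = 1.0f;
    for (i = 1; i <= order; i++)
        a[i] = 0.0f;
    err = r[0];
    for (m = 1; m <= order; m++) {
        acc = r[m];
        for (i = 1; i < m; i++)
            acc += a[i] * r[m - i];
        k = -acc / err;                /* reflection coefficient */
        for (i = 1; i <= m / 2; i++) { /* symmetric in-place update */
            tmp = a[i] + k * a[m - i];
            a[m - i] += k * a[i];
            a[i] = tmp;
        }
        a[m] = k;
        err *= (1.0f - k * k);
    }
    return err;
}

/* Bandwidth expansion: out[i] = gamma^i * in[i]. */
void bw_expand(float *out, const float *in, float gamma, int len)
{
    float chirp = 1.0f;
    int i;
    for (i = 0; i < len; i++) {
        out[i] = chirp * in[i];
        chirp *= gamma;
    }
}
```

For the autocorrelation sequence 1, 0.5, 0.25 of a first-order
process, the recursion gives a[1] = -0.5 and a[2] = 0, and expansion
with a factor of 0.9 scales a[1] to -0.45 while leaving a[0] at 1.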
A.41 lsf.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
lsf.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __iLBC_LSF_H
#define __iLBC_LSF_H
void a2lsf(
float *freq, /* (o) lsf coefficients */
float *a /* (i) lpc coefficients */
);
void lsf2a(
float *a_coef, /* (o) lpc coefficients */
float *freq /* (i) lsf coefficients */
);
#endif
A.42 lsf.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
lsf.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include <string.h>
#include <math.h>
#include "iLBC_define.h"
/*----------------------------------------------------------------*
* conversion from lpc coefficients to lsf coefficients
*---------------------------------------------------------------*/
void a2lsf(
float *freq, /* (o) lsf coefficients */
float *a /* (i) lpc coefficients */
){
float steps[NUMBER_OF_STEPS] =
{(float)0.00635, (float)0.003175, (float)0.0015875,
(float)0.00079375};
float step;
int step_idx;
int lsp_index;
float p[HALFORDER];
float q[HALFORDER];
float p_pre[HALFORDER];
float q_pre[HALFORDER];
float old_p, old_q, *old;
float *pq_coef;
float omega, old_omega;
int i;
float hlp, hlp1, hlp2, hlp3, hlp4, hlp5;
for (i = 0; i < HALFORDER; i++){
p[i] = (float)-1.0 * (a[i + 1] + a[FILTERORDER - i]);
q[i] = a[FILTERORDER - i] - a[i + 1];
}
p_pre[0] = (float)-1.0 - p[0];
p_pre[1] = - p_pre[0] - p[1];
p_pre[2] = - p_pre[1] - p[2];
p_pre[3] = - p_pre[2] - p[3];
p_pre[4] = - p_pre[3] - p[4];
p_pre[4] = p_pre[4] / 2;
q_pre[0] = (float)1.0 - q[0];
q_pre[1] = q_pre[0] - q[1];
q_pre[2] = q_pre[1] - q[2];
q_pre[3] = q_pre[2] - q[3];
q_pre[4] = q_pre[3] - q[4];
q_pre[4] = q_pre[4] / 2;
omega = 0.0;
old_omega = 0.0;
old_p = FLOAT_MAX;
old_q = FLOAT_MAX;
/* Here we loop through lsp_index to find all the FILTERORDER
roots for omega. */
for (lsp_index = 0; lsp_index < FILTERORDER; lsp_index++){
/* Depending on lsp_index being even or odd, we
alternatively solve the roots for the two LSP equations. */
if ((lsp_index % 2) == 0){
pq_coef = p_pre;
old = &old_p;
} else {
pq_coef = q_pre;
old = &old_q;
}
/* Start with low resolution grid */
for (step_idx = 0, step = steps[step_idx];
step_idx < NUMBER_OF_STEPS;){
/* cos(10piw) + pq(0)cos(8piw) + pq(1)cos(6piw) +
pq(2)cos(4piw) + pq(3)cos(2piw) + pq(4) */
hlp = (float)cos(omega * TWO_PI);
hlp1 = (float)2.0 * hlp + pq_coef[0];
hlp2 = (float)2.0 * hlp * hlp1 - (float)1.0 +
pq_coef[1];
hlp3 = (float)2.0 * hlp * hlp2 - hlp1 + pq_coef[2];
hlp4 = (float)2.0 * hlp * hlp3 - hlp2 + pq_coef[3];
hlp5 = hlp * hlp4 - hlp3 + pq_coef[4];
if (((hlp5 * (*old)) <= 0.0) || (omega >= 0.5)){
if (step_idx == (NUMBER_OF_STEPS - 1)){
if (fabs(hlp5) >= fabs(*old)) {
freq[lsp_index] = omega - step;
} else {
freq[lsp_index] = omega;
}
if ((*old) >= 0.0){
*old = (float)-1.0 * FLOAT_MAX;
} else {
*old = FLOAT_MAX;
}
omega = old_omega;
step_idx = NUMBER_OF_STEPS;
} else {
if (step_idx == 0){
old_omega = omega;
}
step_idx++;
omega -= steps[step_idx];
/* Go back one grid step */
step = steps[step_idx];
}
} else {
/* increment omega until they are of different sign,
and we know there is at least one root between omega
and old_omega */
*old = hlp5;
omega += step;
}
}
}
for (i = 0; i < FILTERORDER; i++) {
freq[i] = freq[i] * TWO_PI;
}
}
/*----------------------------------------------------------------*
* conversion from lsf coefficients to lpc coefficients
*---------------------------------------------------------------*/
void lsf2a(
float *a_coef, /* (o) lpc coefficients */
float *freq /* (i) lsf coefficients */
){
int i, j;
float hlp;
float p[HALFORDER], q[HALFORDER];
float a[HALFORDER + 1], a1[HALFORDER], a2[HALFORDER];
float b[HALFORDER + 1], b1[HALFORDER], b2[HALFORDER];
for (i = 0; i < FILTERORDER; i++) {
freq[i] = freq[i] * PI2;
}
/* Check input for ill-conditioned cases. This part is not
found in the TIA standard. It involves the following 2 IF
blocks. If "freq" is judged ill-conditioned, then we first
modify freq[0] and freq[FILTERORDER-1] as needed, then we
respace the other "freq" values evenly between these two */
if ((freq[0] <= 0.0) || (freq[FILTERORDER - 1] >= 0.5)){
if (freq[0] <= 0.0) {
freq[0] = (float)0.022;
}
if (freq[FILTERORDER - 1] >= 0.5) {
freq[FILTERORDER - 1] = (float)0.499;
}
hlp = (freq[FILTERORDER - 1] - freq[0]) /
(float) (FILTERORDER - 1);
for (i = 1; i < FILTERORDER; i++) {
freq[i] = freq[i - 1] + hlp;
}
}
memset(a1, 0, HALFORDER*sizeof(float));
memset(a2, 0, HALFORDER*sizeof(float));
memset(b1, 0, HALFORDER*sizeof(float));
memset(b2, 0, HALFORDER*sizeof(float));
memset(a, 0, (HALFORDER+1)*sizeof(float));
memset(b, 0, (HALFORDER+1)*sizeof(float));
/* p[i] and q[i] compute cos(2*pi*omega_{2j}) and
cos(2*pi*omega_{2j-1}) in eqs. 4.2.2.2-1 and 4.2.2.2-2.
Note that for this code p[i] specifies the coefficients
used in Q_A(z) while q[i] specifies the coefficients used
in P_A(z) */
for (i = 0; i < HALFORDER; i++){
p[i] = (float)cos(TWO_PI * freq[2 * i]);
q[i] = (float)cos(TWO_PI * freq[2 * i + 1]);
}
a[0] = 0.25;
b[0] = 0.25;
for (i = 0; i < HALFORDER; i++){
a[i + 1] = a[i] - 2 * p[i] * a1[i] + a2[i];
b[i + 1] = b[i] - 2 * q[i] * b1[i] + b2[i];
a2[i] = a1[i];
a1[i] = a[i];
b2[i] = b1[i];
b1[i] = b[i];
}
for (j = 0; j < FILTERORDER; j++){
if (j == 0){
a[0] = 0.25;
b[0] = -0.25;
} else {
a[0] = b[0] = 0.0;
}
for (i = 0; i < HALFORDER; i++){
a[i + 1] = a[i] - 2 * p[i] * a1[i] + a2[i];
b[i + 1] = b[i] - 2 * q[i] * b1[i] + b2[i];
a2[i] = a1[i];
a1[i] = a[i];
b2[i] = b1[i];
b1[i] = b[i];
}
a_coef[j + 1] = 2 * (a[HALFORDER] + b[HALFORDER]);
}
a_coef[0] = 1.0;
}
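The root search in a2lsf steps omega along a coarse grid until the
evaluated polynomial changes sign, then repeats the search with
successively smaller steps from the steps table. The sketch below
shows that coarse-to-fine idea on an arbitrary function. It is a
simplified variant and not the loop used in a2lsf: the name
grid_root, the function pointer interface, and the test function
line_root_03 are assumptions made for this example.

```c
#include <assert.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Search [lo, hi) for a sign change of f. steps[] holds positive,
   decreasing step sizes. Returns the grid point closest to the root
   on the finest grid, or hi if no sign change is found. */
float grid_root(float (*f)(float), float lo, float hi,
                const float *steps, int nsteps)
{
    float x = lo, prev, next, val;
    int s = 0;

    assert(f != 0 && steps != 0 && nsteps >= 1 && lo < hi);
    prev = f(lo);
    while (x < hi) {
        assert(steps[s] > 0.0f);
        next = x + steps[s];
        val = f(next);
        if (prev * val <= 0.0f) {    /* sign change in [x, next] */
            if (s == nsteps - 1)     /* finest grid: closer end */
                return (abs_f(val) < abs_f(prev)) ? next : x;
            s++;                     /* rescan with a finer step */
        } else {
            x = next;                /* same sign: keep stepping */
            prev = val;
        }
    }
    return hi;
}

/* Example function with a single root at 0.3. */
float line_root_03(float x) { return x - 0.3f; }
```

With the four step sizes used in a2lsf (0.00635 down to 0.00079375),
the returned value lies within one finest step of the true root, which
is the resolution the lsf conversion can be expected to achieve.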
A.43 packing.h
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
packing.h
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#ifndef __PACKING_H
#define __PACKING_H
void dopack(
unsigned char **bitstream, /* (i/o) on entrance pointer to
place in bitstream to pack new data, on exit
pointer to place in bitstream to pack
future data */
int *index, /* (i) the value to pack */
int bitno /* (i) the number of bits that the value will fit
within */
);
void unpack(
unsigned char **bitstream, /* (i/o) on entrance pointer to
place in bitstream to unpack new data from, on exit
pointer to place in bitstream to unpack future data from*/
int *index, /* (o) resulting value */
int bitno /* (i) number of bits used to represent the value */
);
#endif
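The dopack and unpack declarations above describe writing a value
into a byte stream with a given number of bits and reading it back.
The following is a minimal sketch of that idea with an explicit
cursor in place of hidden state. The type bit_cursor, the functions
put_bits, get_bits, and align_bits, and the choice of storing the
least significant bit first are all assumptions made for this
example; they are not the interface or the bit order defined by this
document.

```c
#include <assert.h>

typedef struct {
    unsigned char *buf; /* byte buffer, zeroed before packing */
    int bitpos;         /* index of the next bit to write or read */
} bit_cursor;

/* Append the nbits least significant bits of value, LSB first. */
void put_bits(bit_cursor *c, unsigned int value, int nbits)
{
    int i;
    assert(c != 0 && nbits >= 0 && nbits <= 16);
    for (i = 0; i < nbits; i++) {
        if ((value >> i) & 1u)
            c->buf[c->bitpos >> 3] |=
                (unsigned char)(1u << (c->bitpos & 7));
        c->bitpos++;
    }
}

/* Read back nbits bits in the same order and rebuild the value. */
unsigned int get_bits(bit_cursor *c, int nbits)
{
    unsigned int value = 0;
    int i;
    assert(c != 0 && nbits >= 0 && nbits <= 16);
    for (i = 0; i < nbits; i++) {
        unsigned int bit =
            (c->buf[c->bitpos >> 3] >> (c->bitpos & 7)) & 1u;
        value |= bit << i;
        c->bitpos++;
    }
    return value;
}

/* Advance to the next byte boundary, as a flush would. */
void align_bits(bit_cursor *c)
{
    c->bitpos = (c->bitpos + 7) & ~7;
}
```

Keeping the position in a caller-owned structure lets several streams
be packed independently, at the cost of one extra argument per call.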
A.44 packing.c
/******************************************************************
iLBC Speech Coder ANSI-C Source Code
packing.c
Copyright (c) 2001,
Global IP Sound AB.
All rights reserved.
******************************************************************/
#include
#include
#include "iLBC_define.h"
#include "constants.h"
#include "helpfun.h"
#include <string.h>
#define BBL 100
/*----------------------------------------------------------------*
* packing of bits into bitstream, i.e., vector of bytes
*---------------------------------------------------------------*/
void dopack(
unsigned char **bitstream, /* (i/o) on entrance pointer to
place in bitstream to pack new data, on exit pointer to
place in bitstream to pack future data */
int *index, /* (i) the value to pack */
int bitno /* (i) the number of bits that the value will fit
within; bitno equal to 0 causes the bitstream to be
flushed to a whole number of bytes */
){
static int bb[BBL];
static int bbs=0;
static int bbe=0;
int i;
/* place the individual bits in individual integers in the table
bb */
if( bitno > 0){
for(i=0;i<bitno;i++) bb[bbe++]=((*index)>>i)&1;
}
/* flush the bitstream to an integer number of bytes */
if(bitno==0){
memset(bb+bbe, 0, (8-bbe)*sizeof(int));
bbe=8;
}
/* write the bits into bitstream */
while( bbe-bbs>=8){
**bitstream = 0;
for(i=0;i<8;i++)
**bitstream += (unsigned char) bb[bbs++]<*> i) & 1;
*bitstream += 1;
}
/* the bits are combined into the index value */
*index=0;
for(i=0;i
#include <math.h>
#include <string.h>
#include "iLBC_define.h"
#include "constants.h"
#include "filter.h"
/*----------------------------------------------------------------*
* decoding of the start state
*---------------------------------------------------------------*/
void StateConstructW(
int idxForMax, /* (i) 7-bit index for the quantization of max
amplitude */
int *idxVec, /* (i) vector of quantization indexes */
float *syntDenum, /* (i) synthesis filter denominator */
float *out, /* (o) the decoded state vector */
int len /* (i) length of a state vector */
){
float maxVal, tmpbuf[FILTERORDER+2*STATE_LEN], *tmp,
numerator[FILTERORDER+1];
float foutbuf[FILTERORDER+2*STATE_LEN], *fout;
int k,tmpi;
/* decoding of the maximum value */
maxVal = state_frgq[idxForMax];
maxVal = (float)pow(10,maxVal)/(float)4.5;
/* initialization of buffers and coefficients */
memset(tmpbuf, 0, FILTERORDER*sizeof(float));
memset(foutbuf, 0, FILTERORDER*sizeof(float));
for(k=0; k
#include <math.h>
#include <string.h>
#include "iLBC_define.h"
#include "constants.h"
#include "filter.h"
#include "helpfun.h"
/*----------------------------------------------------------------*
* predictive noise shaping encoding of scaled start state
* (subroutine for StateSearchW)
*---------------------------------------------------------------*/
void AbsQuantW(
float *in, /* (i) vector to encode */
float *syntDenum, /* (i) denominator of synthesis filter */
float *weightNum, /* (i) numerator of weighting filter */
float *weightDenum, /* (i) denominator of weighting filter */
int *out, /* (o) vector of quantizer indexes */
int len /* (i) length of vector to encode and vector of
quantizer indexes */
){
float *target, targetBuf[FILTERORDER+STATE_LEN],
*syntOut, syntOutBuf[FILTERORDER+STATE_LEN],
*weightOut, weightOutBuf[FILTERORDER+STATE_LEN],
toQ, xq;
int n;
int index;
/* initialization of buffers for filterings */
memset(targetBuf, 0, FILTERORDER*sizeof(float));
memset(syntOutBuf, 0, FILTERORDER*sizeof(float));
memset(weightOutBuf, 0, FILTERORDER*sizeof(float));
/* initialization of pointers for filterings */
target = &targetBuf[FILTERORDER];
syntOut = &syntOutBuf[FILTERORDER];
weightOut = &weightOutBuf[FILTERORDER];
/* encoding loop */
for(n=0;n maxVal*maxVal){
maxVal = fout[k];
}
}
maxVal=(float)fabs(maxVal);
/* encoding of the maximum amplitude value */
if(maxVal < 10.0){
maxVal = 10.0;
}
maxVal = (float)log10(maxVal);
sort_sq(&dtmp, &index, maxVal, state_frgq, 64);
/* decoding of the maximum amplitude representation value,
and corresponding scaling of start state */
maxVal=state_frgq[index];
utmp=index;
*idxForMax=utmp;
maxVal = (float)pow(10,maxVal);
maxVal = (float)(4.5)/maxVal;
for(k=0;k0; i--)
mem[i] = mem[i-1];
mem[0] = *po;
po++;
}
}
*