lbg

Functions

int main(int argc, char *argv[])

lbg [ option ] [ infile ]

  • -l int

    • length of vector \((1 \le M + 1)\)

  • -m int

    • order of vector \((0 \le M)\)

  • -s int

    • random seed

  • -e int

    • target codebook size \((2 \le I_E)\)

  • -C str

    • double-type initial codebook

  • -I str

    • int-type output codebook index

  • -n int

    • minimum number of vectors in a cluster \((1 \le V)\)

  • -i int

    • maximum number of iterations \((1 \le N)\)

  • -d double

    • convergence threshold \((0 \le \varepsilon)\)

  • -r double

    • splitting factor \((0 < r)\)

  • infile str

    • double-type input vectors

  • stdout

    • double-type codebook

If -C option is not specified, the initial codebook is generated from the whole collection of training data as follows:

\[ \boldsymbol{c}_0 = \frac{1}{T} \sum_{t=0}^{T-1} \boldsymbol{x}(t) \]
where the codebook size is one.

Parameters
  • argc[in] Number of arguments.

  • argv[in] Argument vector.

Returns

0 on success, 1 on failure.

See also

imsvq msvq gmm extract

class sptk::LindeBuzoGrayAlgorithm

Design codebook.

The input is the \(M\)-th order input vectors:

\[ \begin{array}{cccc} \boldsymbol{x}_0, & \boldsymbol{x}_1, & \ldots, & \boldsymbol{x}_{T-1}, \end{array} \]
where \(T\) is the number of vectors. The output is the \(M\)-th order codebook vectors:
\[ \begin{array}{cccc} \boldsymbol{c}_0, & \boldsymbol{c}_1, & \ldots, & \boldsymbol{c}_{I-1}, \end{array} \]
where \(I\) is the codebook size. The codebook size is determined by the given initial codebook size \(I_0\) and target codebook size \(I_E\). In the implemented algorithm, codebook size is repeatedly doubled from the initial codebook size while \(I < I_E\).

The codebook is generated by the following algorithm:

  • Step 0: Set \(I \leftarrow I_0\).

  • Step 1: Split the codebook vectors as

    \[\begin{split} \boldsymbol{c}_i = \left\{ \begin{array}{ll} \boldsymbol{c}_i + r \boldsymbol{\epsilon}, & 0 \le i < I \\ \boldsymbol{c}_{i-I} - r \boldsymbol{\epsilon}, & I \le i < 2I \end{array} \right. \end{split}\]
    where \(\boldsymbol{\epsilon}\) is a \(M\)-th order vector of random numbers and \(r\) is the splitting factor.

  • Step 2: Update the codebook \(N\) times until the convergence is reached. The stop criterion is

    \[ \left| \frac{D_{n-1} - D_{n}}{D_{n}} \right| < \varepsilon \]
    where \(D_{n}\) is the total distance between the updated codebook vectors at \(n\)-th iteration and the input vectors.

  • Step 3: If the number of vectors in a cluster \(j\) is less than the pre-determined threshold value \(V\), the corresponding codebook vector is updated as

    \[\begin{split}\begin{eqnarray} \boldsymbol{c}_j &=& \boldsymbol{c}_{i_{max}} - r \boldsymbol{\epsilon}, \\ \boldsymbol{c}_{i_{max}} &=& \boldsymbol{c}_{i_{max}} + r \boldsymbol{\epsilon}, \end{eqnarray}\end{split}\]
    where \(i_{max}\) is the index of the cluster that have the largest number of input vectors.

  • Step 4: Set \(I \leftarrow 2I\). If \(I \ge I_E\) exit, otherwise go to Step 1.

Public Functions

LindeBuzoGrayAlgorithm(int num_order, int initial_codebook_size, int target_codebook_size, int min_num_vector_in_cluster, int num_iteration, double convergence_threshold, double splitting_factor, int seed)
Parameters
  • num_order[in] Order of vector, \(M\).

  • initial_codebook_size[in] Initial codebook size, \(I_0\).

  • target_codebook_size[in] Target codebook size, \(I_E\).

  • min_num_vector_in_cluster[in] Lower bound of number of vectors in a cluster, \(V\).

  • num_iteration[in] Number of iterations, \(N\).

  • convergence_threshold[in] Convergence threshold, \(\varepsilon\).

  • splitting_factor[in] Splitting factor, \(r\).

  • seed[in] Random seed.

inline int GetNumOrder() const
Returns

Order of vector.

inline int GetInitialCodeBookSize() const
Returns

Initial codebook size.

inline int GetTargetCodeBookSize() const
Returns

Target codebook size.

inline int GetMinNumVectorInCluster() const
Returns

Minimum number of vectors in a cluster.

inline int GetNumIteration() const
Returns

Number of iterations.

inline double GetConvergenceThreshold() const
Returns

Convergence threshold.

inline double GetSplittingFactor() const
Returns

Splitting factor.

inline int GetSeed() const
Returns

Random seed.

inline bool IsValid() const
Returns

True if this object is valid.

bool Run(const std::vector<std::vector<double>> &input_vectors, std::vector<std::vector<double>> *codebook_vectors, std::vector<int> *codebook_indices) const
Parameters
  • input_vectors[in] \(M\)-th order input vectors. The shape is \([T, M+1]\).

  • codebook_vectors[inout] \(M\)-th order codebook vectors. The shape is \([I, M+1]\).

  • codebook_indices[out] \(T\) codebook indices.

Returns

True on success, false on failure.