lbg

Functions

int main(int argc, char *argv[])

lbg [ option ] [ infile ]

  • -l int

    • length of vector \((1 \le M + 1)\)

  • -m int

    • order of vector \((0 \le M)\)

  • -s int

    • random seed

  • -e int

    • target codebook size \((2 \le I_E)\)

  • -C str

    • double-type initial codebook

  • -I str

    • int-type output codebook index

  • -S str

    • double-type output total distance

  • -n int

    • minimum number of vectors in a cluster \((1 \le V)\)

  • -i int

    • maximum number of iterations \((1 \le N)\)

  • -d double

    • convergence threshold \((0 \le \varepsilon)\)

  • -r double

    • splitting factor \((0 < r)\)

  • infile str

    • double-type input vectors

  • stdout

    • double-type codebook

If -C option is not specified, the initial codebook is generated from the whole collection of training data as follows:

\[ \boldsymbol{c}_0 = \frac{1}{T} \sum_{t=0}^{T-1} \boldsymbol{x}(t) \]
where the codebook size is one.

Parameters:
  • argc[in] Number of arguments.

  • argv[in] Argument vector.

Returns:

0 on success, 1 on failure.

See also

imsvq msvq gmm extract

class LindeBuzoGrayAlgorithm

Design codebook.

The input is the \(M\)-th order input vectors:

\[ \begin{array}{cccc} \boldsymbol{x}_0, & \boldsymbol{x}_1, & \ldots, & \boldsymbol{x}_{T-1}, \end{array} \]
where \(T\) is the number of vectors. The output is the \(M\)-th order codebook vectors:
\[ \begin{array}{cccc} \boldsymbol{c}_0, & \boldsymbol{c}_1, & \ldots, & \boldsymbol{c}_{I-1}, \end{array} \]
where \(I\) is the codebook size. The codebook size is determined by the given initial codebook size \(I_0\) and target codebook size \(I_E\). In the implemented algorithm, codebook size is repeatedly doubled from the initial codebook size while \(I < I_E\).

The codebook is generated by the following algorithm:

  • Step 0: Set \(I \leftarrow I_0\).

  • Step 1: Split the codebook vectors as

    \[\begin{split} \boldsymbol{c}_i = \left\{ \begin{array}{ll} \boldsymbol{c}_i + r \boldsymbol{\epsilon}, & 0 \le i < I \\ \boldsymbol{c}_{i-I} - r \boldsymbol{\epsilon}, & I \le i < 2I \end{array} \right. \end{split}\]
    where \(\boldsymbol{\epsilon}\) is a \(M\)-th order vector of random numbers and \(r\) is the splitting factor.

  • Step 2: Update the codebook \(N\) times until the convergence is reached. The stop criterion is

    \[ \left| \frac{D_{n-1} - D_{n}}{D_{n}} \right| < \varepsilon \]
    where \(D_{n}\) is the total distance between the updated codebook vectors at \(n\)-th iteration and the input vectors.

  • Step 3: If the number of vectors in a cluster \(j\) is less than the pre-determined threshold value \(V\), the corresponding codebook vector is updated as

    \[\begin{split}\begin{eqnarray} \boldsymbol{c}_j &=& \boldsymbol{c}_{i_{max}} - r \boldsymbol{\epsilon}, \\ \boldsymbol{c}_{i_{max}} &=& \boldsymbol{c}_{i_{max}} + r \boldsymbol{\epsilon}, \end{eqnarray}\end{split}\]
    where \(i_{max}\) is the index of the cluster that have the largest number of input vectors.

  • Step 4: Set \(I \leftarrow 2I\). If \(I \ge I_E\) exit, otherwise go to Step 1.

Public Functions

LindeBuzoGrayAlgorithm(int num_order, int initial_codebook_size, int target_codebook_size, int min_num_vector_in_cluster, int num_iteration, double convergence_threshold, double splitting_factor, int seed)
Parameters:
  • num_order[in] Order of vector, \(M\).

  • initial_codebook_size[in] Initial codebook size, \(I_0\).

  • target_codebook_size[in] Target codebook size, \(I_E\).

  • min_num_vector_in_cluster[in] Lower bound of number of vectors in a cluster, \(V\).

  • num_iteration[in] Number of iterations, \(N\).

  • convergence_threshold[in] Convergence threshold, \(\varepsilon\).

  • splitting_factor[in] Splitting factor, \(r\).

  • seed[in] Random seed.

inline int GetNumOrder() const
Returns:

Order of vector.

inline int GetInitialCodeBookSize() const
Returns:

Initial codebook size.

inline int GetTargetCodeBookSize() const
Returns:

Target codebook size.

inline int GetMinNumVectorInCluster() const
Returns:

Minimum number of vectors in a cluster.

inline int GetNumIteration() const
Returns:

Number of iterations.

inline double GetConvergenceThreshold() const
Returns:

Convergence threshold.

inline double GetSplittingFactor() const
Returns:

Splitting factor.

inline int GetSeed() const
Returns:

Random seed.

inline bool IsValid() const
Returns:

True if this object is valid.

bool Run(const std::vector<std::vector<double>> &input_vectors, std::vector<std::vector<double>> *codebook_vectors, std::vector<int> *codebook_indices, double *total_distance) const
Parameters:
  • input_vectors[in] \(M\)-th order input vectors. The shape is \([T, M+1]\).

  • codebook_vectors[inout] \(M\)-th order codebook vectors. The shape is \([I, M+1]\).

  • codebook_indices[out] \(T\) codebook indices.

  • total_distance[out] Total distance.

Returns:

True on success, false on failure.