lbg
Functions
-
int main(int argc, char *argv[])
lbg [ option ] [ infile ]
-l int
length of vector \((1 \le M + 1)\)
-m int
order of vector \((0 \le M)\)
-s int
random seed
-e int
target codebook size \((2 \le I_E)\)
-C str
double-type initial codebook
-I str
int-type output codebook index
-S str
double-type output total distance
-n int
minimum number of vectors in a cluster \((1 \le V)\)
-i int
maximum number of iterations \((1 \le N)\)
-d double
convergence threshold \((0 \le \varepsilon)\)
-r double
splitting factor \((0 < r)\)
infile str
double-type input vectors
stdout
double-type codebook
If -C option is not specified, the initial codebook is generated from the whole collection of training data as follows:
\[ \boldsymbol{c}_0 = \frac{1}{T} \sum_{t=0}^{T-1} \boldsymbol{x}(t) \]where the codebook size is one.- Parameters:
argc – [in] Number of arguments.
argv – [in] Argument vector.
- Returns:
0 on success, 1 on failure.
-
class LindeBuzoGrayAlgorithm
Design codebook.
The input is the \(M\)-th order input vectors:
\[ \begin{array}{cccc} \boldsymbol{x}_0, & \boldsymbol{x}_1, & \ldots, & \boldsymbol{x}_{T-1}, \end{array} \]where \(T\) is the number of vectors. The output is the \(M\)-th order codebook vectors:\[ \begin{array}{cccc} \boldsymbol{c}_0, & \boldsymbol{c}_1, & \ldots, & \boldsymbol{c}_{I-1}, \end{array} \]where \(I\) is the codebook size. The codebook size is determined by the given initial codebook size \(I_0\) and target codebook size \(I_E\). In the implemented algorithm, codebook size is repeatedly doubled from the initial codebook size while \(I < I_E\).The codebook is generated by the following algorithm:
Step 0: Set \(I \leftarrow I_0\).
Step 1: Split the codebook vectors as
\[\begin{split} \boldsymbol{c}_i = \left\{ \begin{array}{ll} \boldsymbol{c}_i + r \boldsymbol{\epsilon}, & 0 \le i < I \\ \boldsymbol{c}_{i-I} - r \boldsymbol{\epsilon}, & I \le i < 2I \end{array} \right. \end{split}\]where \(\boldsymbol{\epsilon}\) is a \(M\)-th order vector of random numbers and \(r\) is the splitting factor.Step 2: Update the codebook \(N\) times until the convergence is reached. The stop criterion is
\[ \left| \frac{D_{n-1} - D_{n}}{D_{n}} \right| < \varepsilon \]where \(D_{n}\) is the total distance between the updated codebook vectors at \(n\)-th iteration and the input vectors.Step 3: If the number of vectors in a cluster \(j\) is less than the pre-determined threshold value \(V\), the corresponding codebook vector is updated as
\[\begin{split}\begin{eqnarray} \boldsymbol{c}_j &=& \boldsymbol{c}_{i_{max}} - r \boldsymbol{\epsilon}, \\ \boldsymbol{c}_{i_{max}} &=& \boldsymbol{c}_{i_{max}} + r \boldsymbol{\epsilon}, \end{eqnarray}\end{split}\]where \(i_{max}\) is the index of the cluster that have the largest number of input vectors.Step 4: Set \(I \leftarrow 2I\). If \(I \ge I_E\) exit, otherwise go to Step 1.
Public Functions
-
LindeBuzoGrayAlgorithm(int num_order, int initial_codebook_size, int target_codebook_size, int min_num_vector_in_cluster, int num_iteration, double convergence_threshold, double splitting_factor, int seed)
- Parameters:
num_order – [in] Order of vector, \(M\).
initial_codebook_size – [in] Initial codebook size, \(I_0\).
target_codebook_size – [in] Target codebook size, \(I_E\).
min_num_vector_in_cluster – [in] Lower bound of number of vectors in a cluster, \(V\).
num_iteration – [in] Number of iterations, \(N\).
convergence_threshold – [in] Convergence threshold, \(\varepsilon\).
splitting_factor – [in] Splitting factor, \(r\).
seed – [in] Random seed.
-
inline int GetNumOrder() const
- Returns:
Order of vector.
-
inline int GetInitialCodeBookSize() const
- Returns:
Initial codebook size.
-
inline int GetTargetCodeBookSize() const
- Returns:
Target codebook size.
-
inline int GetMinNumVectorInCluster() const
- Returns:
Minimum number of vectors in a cluster.
-
inline int GetNumIteration() const
- Returns:
Number of iterations.
-
inline double GetConvergenceThreshold() const
- Returns:
Convergence threshold.
-
inline double GetSplittingFactor() const
- Returns:
Splitting factor.
-
inline int GetSeed() const
- Returns:
Random seed.
-
inline bool IsValid() const
- Returns:
True if this object is valid.
-
bool Run(const std::vector<std::vector<double>> &input_vectors, std::vector<std::vector<double>> *codebook_vectors, std::vector<int> *codebook_indices, double *total_distance) const
- Parameters:
input_vectors – [in] \(M\)-th order input vectors. The shape is \([T, M+1]\).
codebook_vectors – [inout] \(M\)-th order codebook vectors. The shape is \([I, M+1]\).
codebook_indices – [out] \(T\) codebook indices.
total_distance – [out] Total distance.
- Returns:
True on success, false on failure.