pitch

Functions

int main(int argc, char *argv[])

pitch [ option ] [ infile ]

  • -a int

    • algorithm used for pitch extraction

      • 0 RAPT

      • 1 SWIPE’

      • 2 REAPER

      • 3 DIO

      • 4 Harvest

  • -p int

    • frame shift [point] \((1 \le P)\)

  • -s double

    • sampling rate [kHz] \((6 < F_s \le 98)\)

  • -L double

    • minimum F0 to search for [Hz] \((10 < F_l < F_h)\)

  • -H double

    • maximum F0 to search for [Hz] \((F_l < F_h < 500F_s)\)

  • -t0 double

    • voicing threshold for RAPT \((-0.6 \le T \le 0.7)\)

  • -t1 double

    • voicing threshold for SWIPE’ \((0.2 \le T \le 0.5)\)

  • -t2 double

    • voicing threshold for REAPER \((-0.5 \le T \le 1.6)\)

  • -t3 double

    • voicing threshold for DIO \((0.02 \le T \le 0.2)\)

  • -t4 double

    • voicing threshold for Harvest \((0.0 \le T \le 0.2)\)

  • -o int

    • output format

      • 0 pitch \((F_s / F_0)\)

      • 1 F0

      • 2 log F0

  • infile str

    • double-type waveform

  • stdout

    • double-type pitch

If \(T\) is raised, the number of voiced frames increases, except for SWIPE’.
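The option ranges listed above can be collected into a small pre-flight check. The following is only a sketch that mirrors the documented ranges; the names `ThresholdRange`, `GetThresholdRange`, and `AreOptionsValid` are hypothetical and not part of the command or the library, and the actual command may validate its arguments differently.

```cpp
// Inclusive voicing-threshold range for one algorithm (hypothetical helper type).
struct ThresholdRange {
  double min;
  double max;
};

// Returns the documented range of -t0 .. -t4 for the algorithm index given to -a.
inline ThresholdRange GetThresholdRange(int algorithm) {
  switch (algorithm) {
    case 0:  return {-0.6, 0.7};   // RAPT
    case 1:  return {0.2, 0.5};    // SWIPE'
    case 2:  return {-0.5, 1.6};   // REAPER
    case 3:  return {0.02, 0.2};   // DIO
    case 4:  return {0.0, 0.2};    // Harvest
    default: return {0.0, 0.0};    // Unknown algorithm; rejected by the caller.
  }
}

// Checks the documented constraints. The sampling rate is in kHz, as for -s,
// and the F0 bounds are in Hz, as for -L and -H.
inline bool AreOptionsValid(int frame_shift, double sampling_rate_khz,
                            double lower_f0, double upper_f0,
                            double threshold, int algorithm) {
  if (algorithm < 0 || 4 < algorithm) return false;
  if (frame_shift < 1) return false;                              // 1 <= P
  if (sampling_rate_khz <= 6.0 || 98.0 < sampling_rate_khz) return false;
  if (lower_f0 <= 10.0 || upper_f0 <= lower_f0) return false;     // 10 < Fl < Fh
  if (500.0 * sampling_rate_khz <= upper_f0) return false;        // Fh < 500 Fs
  const ThresholdRange range = GetThresholdRange(algorithm);
  return range.min <= threshold && threshold <= range.max;
}
```

Note that the upper F0 bound \(500F_s\) with \(F_s\) in kHz is the Nyquist frequency in Hz, so at 16 kHz the maximum F0 must stay below 8000 Hz.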

The following is a simple example that extracts pitch from data.d:

pitch -s 16 -p 80 -L 80 -H 200 -o 1 < data.d > data.f0
Parameters:
  • argc[in] Number of arguments.

  • argv[in] Argument vector.

Returns:

0 on success, 1 on failure.

See also

pitch_mark, excite

class PitchExtraction

Extract pitch (fundamental frequency) from waveform.

The input is the whole audio waveform and the output is the sequence of fundamental frequencies. The implemented extraction algorithms are RAPT [1], SWIPE’ [2], REAPER [3], DIO [4], and Harvest [5].

[1] D. Talkin, “A robust algorithm for pitch tracking,” Speech Coding and Synthesis, pp. 497-518, 1995.

[2] A. Camacho, “SWIPE: A sawtooth waveform inspired pitch estimator for speech and music,” Doctoral dissertation, 2007.

[3] D. Talkin, “REAPER: Robust epoch and pitch estimator,” https://github.com/google/REAPER, 2015.

[4] M. Morise, H. Kawahara and H. Katayose, “Fast and reliable F0 estimation method based on the period extraction of vocal fold vibration of singing voice and speech,” Proc. of AES 35th International Conference, 2009.

[5] M. Morise, “Harvest: A high-performance fundamental frequency estimator from speech signals,” Proc. of Interspeech, pp. 2321-2325, 2017.

Public Types

enum Algorithms

Pitch extraction algorithms.

Values:

enumerator kRapt
enumerator kSwipe
enumerator kReaper
enumerator kDio
enumerator kHarvest
enumerator kNumAlgorithms

Public Functions

PitchExtraction(int frame_shift, double sampling_rate, double lower_f0, double upper_f0, double voicing_threshold, Algorithms algorithm)
Parameters:
  • frame_shift[in] Frame shift in points.

  • sampling_rate[in] Sampling rate in Hz.

  • lower_f0[in] Lower bound of F0 in Hz.

  • upper_f0[in] Upper bound of F0 in Hz.

  • voicing_threshold[in] Threshold for determining voiced/unvoiced.

  • algorithm[in] Algorithm used for pitch extraction.
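The constructor takes the sampling rate in Hz, whereas the `-s` option of the command takes kHz, so command-style values need converting before they are passed in. The sketch below illustrates that conversion together with a checked mapping from the `-a` integer to an enumerator. The enum here is a local mirror written for illustration, assuming the enumerators take the values 0 to 4 in the listed order; `ToAlgorithm` and `KhzToHz` are hypothetical helpers.

```cpp
// Local mirror of the documented enumerators, for illustration only.
// The real type is PitchExtraction::Algorithms.
enum Algorithms { kRapt, kSwipe, kReaper, kDio, kHarvest, kNumAlgorithms };

// Maps the integer given to -a onto an enumerator.
// Returns false if the value is out of range or the output pointer is null.
inline bool ToAlgorithm(int value, Algorithms* algorithm) {
  if (value < 0 || kNumAlgorithms <= value || algorithm == nullptr) {
    return false;
  }
  *algorithm = static_cast<Algorithms>(value);
  return true;
}

// Converts a sampling rate in kHz (command line) to Hz (constructor).
inline double KhzToHz(double sampling_rate_khz) {
  return 1000.0 * sampling_rate_khz;
}
```

Under these assumptions, the command-line example `-s 16 -p 80 -L 80 -H 200` would correspond to the constructor arguments `frame_shift = 80`, `sampling_rate = 16000.0`, `lower_f0 = 80.0`, and `upper_f0 = 200.0`.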

inline bool IsValid() const
Returns:

True if this object is valid.

bool Run(const std::vector<double> &waveform, std::vector<double> *f0, std::vector<double> *epochs, PitchExtractionInterface::Polarity *polarity) const
Parameters:
  • waveform[in] Waveform.

  • f0[out] Extracted pitch in Hz.

  • epochs[out] Pitchmark (valid only for REAPER).

  • polarity[out] Polarity (valid only for REAPER).

Returns:

True on success, false on failure.
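A typical call sequence is to check `IsValid()` once and then call `Run()`. The sketch below wraps that sequence in a function template so that it compiles without the library headers; it works with any type exposing the two documented members. `ExtractF0` is a hypothetical helper, and passing null pointers for the REAPER-only outputs `epochs` and `polarity` is an assumption that should be checked against the library before relying on it.

```cpp
#include <vector>

// Runs any extractor that exposes the documented IsValid() and Run() members
// and returns only the F0 sequence.
// Assumption: null pointers are accepted for the epochs and polarity outputs.
template <typename Extractor>
bool ExtractF0(const Extractor& extractor, const std::vector<double>& waveform,
               std::vector<double>* f0) {
  if (f0 == nullptr || !extractor.IsValid()) {
    return false;
  }
  return extractor.Run(waveform, f0, nullptr, nullptr);
}
```

With the library available, `extractor` would be a `PitchExtraction` object constructed with the six arguments documented above, for example a frame shift of 80, a sampling rate of 16000.0, F0 bounds of 80.0 and 200.0, a voicing threshold within the range of the chosen algorithm, and an enumerator such as `kRapt`.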