# Difference between revisions of "an HDP-HMM for Systems with State Persistence"

## Revision as of 16:45, 20 November 2011

(Draft 1)

## Introduction

### The Big Picture

The Hidden Markov Model (HMM) is one of the most effective and widely used probabilistic models for time series data. A serious limitation of this model is the need to fix the number of states a priori. In real-life applications this number is rarely easy to determine. The usual approach is trial and error: fit models with several different numbers of states and choose the one that gives the best results. This limitation can be overcome by a class of probabilistic models called “Bayesian nonparametric models”. These models approach the problem by defining distributions that can have an infinite number of parameters. The assumption, however, is that even though the distribution could have infinitely many parameters, only a finite number of them are required to explain the observed data.

A major breakthrough in the range of applications of Bayesian nonparametric methods came with a paper published in 2005 by Yee Whye Teh, Michael I. Jordan, et al., titled “Hierarchical Dirichlet Processes” (HDP). The paper describes a way to define a nonparametric Bayesian prior that allows atoms (which can be seen as components of mixture models) to be shared between groups of data (draws from mixture models). One application of this model was an extension of the Hidden Markov Model in which the number of states does not have to be specified in advance. The new model, named the HDP-HMM, allows the number of states to be infinite.

The paper that I’m reviewing here proposes an augmented version of the HDP-HMM that addresses the problem of state persistence. This problem also arises in the original Hidden Markov Model. In systems where it occurs, states tend to persist over time, i.e. the probability of a transition to a different state is lower than the probability of staying in the current state. The authors not only provide a solution to this problem, but also give it a fully Bayesian treatment.

### Speaker Diarization Problem

Speaker Diarization is the task of segmenting or annotating an audio recording into temporal segments, where each segment corresponds to a specific event or speaker. One example of where Speaker Diarization could be useful is the analysis of a few minutes of recorded audio from a radio broadcast. These few minutes could consist of commercials, intro music, speech by the host, and speech by the guest. On such a recording, the Speaker Diarization problem could be defined as identifying speech and non-speech segments. Audio recordings of meetings with a known or unknown number of participants are another example of the use of Speaker Diarization. In this case the task is to answer the question of “who spoke when”.

The most common technique for solving the Speaker Diarization problem, which has also shown better results than others, is to use Hidden Markov Models. In this setting, each speaker is associated with a specific state of the HMM, and transitions among these states represent transitions among speakers. However, a model like this suffers from a serious limitation: it requires the number of speakers to be known in advance, since that number is needed to design the structure of the model.
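To make the "one state per speaker" idea concrete, here is a minimal sketch in Python. It is not taken from the paper: the two speakers, the one-dimensional features, the unit-variance Gaussian emissions, and all numeric values are assumptions chosen for illustration. It runs standard Viterbi decoding on a fixed two-state HMM to label each frame with a speaker.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state sequence.

    log_init[k]     : log prior of starting in state k
    log_trans[j][k] : log probability of moving from state j to state k
    log_emit[t][k]  : log-likelihood of frame t under state k
    """
    num_states = len(log_init)
    score = [log_init[k] + log_emit[0][k] for k in range(num_states)]
    back = []  # back[t-1][k] = best predecessor of state k at frame t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for k in range(num_states):
            best = max(range(num_states), key=lambda j: prev[j] + log_trans[j][k])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][k] + log_emit[t][k])
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(num_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def gaussian_log_pdf(x, mean):
    """Log density of a unit-variance Gaussian (illustrative emission model)."""
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)

# Hypothetical setup: two speakers whose 1-D features cluster around 0 and 5.
speaker_means = [0.0, 5.0]
frames = [0.1, -0.2, 0.3, 4.8, 5.1, 5.3, 0.0]

log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_emit = [[gaussian_log_pdf(x, m) for m in speaker_means] for x in frames]

who_spoke_when = viterbi(log_init, log_trans, log_emit)
```

In this toy run, `who_spoke_when` assigns the first three frames and the last frame to speaker 0 and the middle three frames to speaker 1. Note that the number of speakers (the length of `speaker_means`) has to be fixed before decoding, which is exactly the limitation described above.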

The authors of this paper propose a solution to the Speaker Diarization problem for situations where prior knowledge of the number of participants is unavailable. Their solution is based on a slightly modified version of a Bayesian nonparametric model called the “Hierarchical Dirichlet Process–Hidden Markov Model (HDP-HMM)”. The modified version, named the “Sticky HDP-HMM”, imposes state persistence on the model.
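The effect of imposing state persistence can be illustrated with a small Python sketch. This is a finite, simplified stand-in and not the model from the paper: it assumes a fixed number of states `num_states`, a symmetric Dirichlet prior with total concentration `alpha` on each row of the transition matrix, and a hypothetical extra weight `kappa` added to the self-transition entry of each row. Setting `kappa = 0` gives a prior with no bias toward staying in the same state.

```python
import random

def sample_dirichlet(concentrations, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(draws)
    return [d / total for d in draws]

def sample_transition_matrix(num_states, alpha, kappa, rng):
    """Sample a transition matrix row by row.

    Row j is drawn from a Dirichlet whose concentration is alpha / num_states
    everywhere, plus an extra kappa on entry j (the self-transition).
    The names alpha and kappa are illustrative assumptions.
    """
    rows = []
    for j in range(num_states):
        concentrations = [alpha / num_states] * num_states
        concentrations[j] += kappa  # bias toward staying in state j
        rows.append(sample_dirichlet(concentrations, rng))
    return rows

def mean_self_transition(matrix):
    """Average probability of remaining in the current state."""
    return sum(matrix[j][j] for j in range(len(matrix))) / len(matrix)

rng = random.Random(0)
plain = sample_transition_matrix(5, alpha=5.0, kappa=0.0, rng=rng)
sticky = sample_transition_matrix(5, alpha=5.0, kappa=20.0, rng=rng)

print("mean self-transition, no bias:  ", mean_self_transition(plain))
print("mean self-transition, with bias:", mean_self_transition(sticky))
```

With these assumed values, the prior mean of each self-transition probability is 1/5 without the bias and 21/25 with it, so sampled state sequences from the biased matrix tend to stay in one state for much longer stretches.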