
Derivation of User Browsing Model

2015-08-14 19:28
Mathematicians hate words like “trivial, traditional, …”

Prelude

Derivation in detail
E-step

M-step

Prelude

UBM [1] is a simple but efficient model for estimating position bias. The model is structured as follows:



Figure 1 UBM: Multiple Browsing

The complete-data likelihood:

$$
P(c, a, e, m | q, u, r, d, \Theta) = P(c | a, e) P(a | q, u) P(e | r, d, m) P(m | q)
$$

The incomplete-data likelihood:

$$
P(c | q, u, r, d, \Theta) = \sum_{a,e} P(c | a, e) P(a | q, u) \sum_{m} P(e | r, d, m) P(m | q)
$$

The conditional distributions of $a$ and $e$ are Bernoulli:

$$
P(a | q, u) = \begin{cases}
\alpha_{uq} & \quad \mathrm{if} \quad a = 1 \\
1-\alpha_{uq} & \quad \mathrm{if} \quad a = 0
\end{cases}
$$

$$
P(e | r, d, m) = \begin{cases}
\gamma_{rdm} & \quad \mathrm{if} \quad e = 1 \\
1-\gamma_{rdm} & \quad \mathrm{if} \quad e = 0
\end{cases}
$$

With the deterministic assumption that a click happens exactly when the result is both attractive and examined:

$$
c = 1 \iff a = e = 1
$$

Writing $\mu_{mq} = P(m | q)$, the complete-data likelihood becomes:

$$
P(c, a, e, m | q, u, r, d, \Theta) =
\begin{cases}
\alpha_{uq} \gamma_{rdm} \mu_{mq} & ,\quad \mathrm{if} \quad c = 1 \\
\alpha_{uq} (1- \gamma_{rdm}) \mu_{mq} & ,\quad \mathrm{if} \quad a = 1 \quad \& \quad e = 0 \\
( 1 - \alpha_{uq} ) \gamma_{rdm}\mu_{mq} & ,\quad \mathrm{if} \quad a = 0 \quad \& \quad e = 1 \\
( 1 - \alpha_{uq} ) ( 1 - \gamma_{rdm} ) \mu_{mq} & ,\quad \mathrm{if} \quad a = 0 \quad \& \quad e = 0
\end{cases}
$$

The incomplete-data likelihood becomes:

$$
P(c | q, u, r, d, \Theta) =
\begin{cases}
\alpha_{uq}\sum_{m}\gamma_{rdm}\mu_{mq} & ,\quad \mathrm{if} \quad c = 1 \\
1 - \alpha_{uq}\sum_{m}\gamma_{rdm}\mu_{mq} & ,\quad \mathrm{if} \quad c = 0
\end{cases}
$$
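As a quick sanity check, the marginalization can be reproduced numerically by summing the complete-data likelihood over $a$, $e$ and $m$. The sketch below is only an illustration: the function names `joint` and `marginal` and all parameter values are made up for the example and do not come from the paper.

```python
from itertools import product

def joint(c, a, e, m, alpha_uq, gamma_rd, mu_q):
    """Complete-data likelihood P(c, a, e, m | ...) with c = 1 iff a = e = 1."""
    if c != int(a == 1 and e == 1):
        return 0.0  # combination ruled out by the deterministic assumption
    p_a = alpha_uq if a == 1 else 1.0 - alpha_uq
    p_e = gamma_rd[m] if e == 1 else 1.0 - gamma_rd[m]
    return p_a * p_e * mu_q[m]

def marginal(c, alpha_uq, gamma_rd, mu_q):
    """Incomplete-data likelihood P(c | ...): the joint summed over a, e, m."""
    modes = range(len(mu_q))
    return sum(joint(c, a, e, m, alpha_uq, gamma_rd, mu_q)
               for a, e, m in product((0, 1), (0, 1), modes))

# Hypothetical values for one (u, q, r, d) with two values of m.
alpha_uq = 0.6
gamma_rd = [0.9, 0.3]   # gamma_{rdm} for m = 0, 1
mu_q = [0.25, 0.75]     # mu_{mq} for m = 0, 1 (sums to 1)

closed_form = alpha_uq * sum(g * w for g, w in zip(gamma_rd, mu_q))
p_click = marginal(1, alpha_uq, gamma_rd, mu_q)
p_skip = marginal(0, alpha_uq, gamma_rd, mu_q)
```

Here `p_click` should agree with `closed_form` and `p_skip` with `1 - closed_form`, up to floating-point rounding.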

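The log-likelihood function becomes, where $S_{uqrd}^{\bullet}$ and $S_{uqrd}^{\circ}$ denote the numbers of clicked and unclicked records for the tuple $(u, q, r, d)$: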

\begin{eqnarray*}
\log L(\Theta) & = & \sum_{\text{all records}} \log P(c | q, u, r, d, \Theta) \\
& = & \sum_{u,q} \sum_{r, d} \bigg \{ S_{uqrd}^{\bullet} \log P(c = 1 | q, u, r, d, \Theta)
+ S_{uqrd}^{\circ} \log P(c = 0 | q, u, r, d, \Theta) \bigg \} \\
& = & \sum_{u,q} \sum_{r, d} \bigg \{ S_{uqrd}^{\bullet} \log \left [ \alpha_{uq}\sum_{m}\gamma_{rdm}\mu_{mq} \right ]
+ S_{uqrd}^{\circ} \log \left [ 1 - \alpha_{uq}\sum_{m}\gamma_{rdm}\mu_{mq} \right ] \bigg \}
\end{eqnarray*}
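The step from a sum over records to a sum over counts can be illustrated with a few lines of Python. This is a minimal sketch on a made-up click log: the record layout `(u, q, r, d, c)`, the dictionary layout of the parameters and every number below are assumptions for the example only.

```python
import math
from collections import Counter

# Hypothetical click log: one (u, q, r, d, c) tuple per impression.
records = [
    ("u1", "q1", 1, 1, 1),
    ("u1", "q1", 1, 1, 0),
    ("u1", "q1", 1, 1, 0),
    ("u2", "q1", 2, 1, 0),
    ("u2", "q1", 2, 1, 1),
]
# Hypothetical parameters: alpha[(u, q)], gamma[(r, d)][m], mu[q][m].
alpha = {("u1", "q1"): 0.6, ("u2", "q1"): 0.4}
gamma = {(1, 1): [0.9, 0.3], (2, 1): [0.7, 0.2]}
mu = {"q1": [0.25, 0.75]}

def click_probability(u, q, r, d):
    """P(c = 1 | q, u, r, d) = alpha_uq * sum_m gamma_rdm * mu_mq."""
    return alpha[(u, q)] * sum(g * w for g, w in zip(gamma[(r, d)], mu[q]))

# S^bullet (clicked) and S^circ (unclicked) counts per (u, q, r, d).
clicks = Counter((u, q, r, d) for u, q, r, d, c in records if c == 1)
skips = Counter((u, q, r, d) for u, q, r, d, c in records if c == 0)

def log_likelihood_grouped():
    """Log-likelihood written with the aggregated counts."""
    total = 0.0
    for key in set(clicks) | set(skips):
        p = click_probability(*key)
        total += clicks[key] * math.log(p) + skips[key] * math.log(1.0 - p)
    return total

def log_likelihood_per_record():
    """Log-likelihood written as a plain sum over all records."""
    total = 0.0
    for u, q, r, d, c in records:
        p = click_probability(u, q, r, d)
        total += math.log(p if c == 1 else 1.0 - p)
    return total

ll_grouped = log_likelihood_grouped()
ll_records = log_likelihood_per_record()
```

Both functions should return the same value up to rounding, since grouping identical records only reorders the sum.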

The so-called deterministic relationship $c \leftrightarrow (a, e)$ and the log-likelihood formula above are so misleading that at first I omitted $a$ and $e$ as latent variables. Despite all kinds of tricks tried in the M-step, I failed to derive the iteration formulas appended to the paper that way.

Derivation in detail

E-step

The posterior distribution of the latent variables after the $t$-th iteration is:

$$
Q^t (a, e, m | c, u, q, r, d, \Theta^t) = \frac{ P(c,a,e,m | u, q, r, d, \Theta^t ) }{\sum_{a, e,m} P(c,a,e,m | u, q, r, d, \Theta^t ) }
$$

Thus,

$$
Q^t (a=1, e=1, m | c=1, u,q,r,d,\Theta^t ) = \frac{ \gamma_{rdm}^t \mu_{mq}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t }
$$

$$
Q^t (a=1, e=0, m | c=0, u,q,r,d,\Theta^t ) = \frac{ \alpha_{uq}^t ( 1 - \gamma_{rdm}^t ) \mu_{mq}^t }{
1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t}
$$

$$
Q^t (a=0, e=1, m | c=0, u,q,r,d,\Theta^t ) = \frac{ (1 - \alpha_{uq}^t ) \gamma_{rdm}^t \mu_{mq}^t }{
1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t}
$$

$$
Q^t (a=0, e=0, m | c=0, u,q,r,d,\Theta^t ) = \frac{ (1 - \alpha_{uq}^t ) ( 1 - \gamma_{rdm}^t ) \mu_{mq}^t }{
1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t}
$$
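The four posteriors can be tabulated directly from these closed forms. The following is a minimal sketch, assuming a hypothetical helper `posterior` and made-up parameter values; it checks that each conditional distribution sums to one.

```python
def posterior(c, alpha_uq, gamma_rd, mu_q):
    """Return {(a, e, m): Q^t(a, e, m | c, ...)} for one (u, q, r, d).

    gamma_rd[m] stands for gamma_{rdm} and mu_q[m] for mu_{mq}.
    """
    exam = sum(g * w for g, w in zip(gamma_rd, mu_q))
    table = {}
    for m, (g, w) in enumerate(zip(gamma_rd, mu_q)):
        if c == 1:
            # A click forces a = e = 1; only m remains uncertain.
            table[(1, 1, m)] = g * w / exam
        else:
            z = 1.0 - alpha_uq * exam  # P(c = 0 | ...)
            table[(1, 0, m)] = alpha_uq * (1.0 - g) * w / z
            table[(0, 1, m)] = (1.0 - alpha_uq) * g * w / z
            table[(0, 0, m)] = (1.0 - alpha_uq) * (1.0 - g) * w / z
    return table

# Hypothetical parameter values with two values of m.
q_click = posterior(1, 0.6, [0.9, 0.3], [0.25, 0.75])
q_skip = posterior(0, 0.6, [0.9, 0.3], [0.25, 0.75])
total_click = sum(q_click.values())
total_skip = sum(q_skip.values())
```

Both totals should equal 1 up to rounding, which confirms that the denominators above are the right normalizers.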

M-step

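The free energy with respect to $Q^t$, keeping only the expected complete-data log-likelihood (the entropy of $Q^t$ does not depend on $\Theta$), is: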

\begin{eqnarray*}
F(Q^t , \Theta) & = & \sum_{\mathrm{obs}} \sum_{a,e, m} Q^t (a,e,m | \mathrm{obs}) \log P(c, a, e, m | u, q, r, d, \Theta ) \\
\quad & = & \sum_{u, q} \sum_{r, d} \bigg \{ S_{uqrd}^{\bullet} \sum_{m} Q^t (a=1,e=1, m | c=1, r,d,u,q) \log (\alpha_{uq} \gamma_{rdm} \mu_{mq}) \\
\quad & + & S_{uqrd}^{\circ} \Big [ \sum_{m} Q^t (a=1,e=0, m | c=0, r,d,u,q) \log \big ( \alpha_{uq} (1 - \gamma_{rdm}) \mu_{mq} \big ) \\
\quad & \quad & + \sum_{m} Q^t (a=0,e=1, m | c=0, r,d,u,q) \log \big( (1- \alpha_{uq} ) \gamma_{rdm} \mu_{mq} \big ) \\
\quad & \quad & + \sum_{m} Q^t (a=0,e=0, m | c=0, r,d,u,q) \log \big( (1- \alpha_{uq} ) (1 - \gamma_{rdm} ) \mu_{mq} \big ) \Big ] \bigg \} \\
& = & \sum_{u, q} \sum_{r, d} \bigg [ \big ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} Q^t (a=1, e=0 | c=0, r,d,u,q) \big ) \log \alpha_{uq} \\
\quad & \quad & \qquad \qquad + S_{uqrd}^{\circ} Q^t (a=0 | c=0, r,d,u,q) \log (1- \alpha_{uq}) \bigg ] \\
\quad & + & \sum_{r, d, m} \bigg \{ \Big [ \sum_{u, q} \big ( S_{uqrd}^{\bullet} Q^t (a=1,e=1, m | c=1, r,d,u,q) \\
\quad & \quad & \qquad \qquad + S_{uqrd}^{\circ} Q^t (a=0,e=1, m | c=0, r,d,u,q) \big ) \Big ] \log \gamma_{rdm} \\
\quad & \quad & \qquad + \Big [ \sum_{u, q} S_{uqrd}^{\circ } Q^t (e=0, m | c=0, r,d,u,q) \Big ] \log ( 1 - \gamma_{rdm} ) \bigg \} \\
\quad & + & \sum_{m, q} \bigg \{ \sum_{u, r, d} \Big [ S_{uqrd}^{\bullet} Q^t (m | c=1, r,d, u,q) + S_{uqrd}^{\circ} Q^t (m | c=0, r,d, u,q) \Big ] \bigg \} \log \mu_{mq}
\end{eqnarray*}
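The last equality relies on marginals of the E-step posteriors that are not written out explicitly. As a sketch of that intermediate step, summing the posteriors over the variables that do not appear gives:

\begin{eqnarray*}
Q^t (m | c=1, r,d,u,q) & = & \frac{ \gamma_{rdm}^t \mu_{mq}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t } \\
Q^t (a=1, e=0 | c=0, r,d,u,q) & = & \frac{ \alpha_{uq}^t ( 1 - \sum_{m} \gamma_{rdm}^t \mu_{mq}^t ) }{ 1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t } \\
Q^t (a=0 | c=0, r,d,u,q) & = & \frac{ 1 - \alpha_{uq}^t }{ 1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t } \\
Q^t (e=0, m | c=0, r,d,u,q) & = & \frac{ ( 1 - \gamma_{rdm}^t ) \mu_{mq}^t }{ 1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t } \\
Q^t (m | c=0, r,d,u,q) & = & \frac{ ( 1 - \alpha_{uq}^t \gamma_{rdm}^t ) \mu_{mq}^t }{ 1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t }
\end{eqnarray*}

The second and third lines use $\sum_{m} \mu_{mq}^t = 1$, and the last line uses $\alpha (1 - \gamma) + (1 - \alpha) \gamma + (1 - \alpha)(1 - \gamma) = 1 - \alpha \gamma$.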

Maximizing over $\alpha_{uq}, \gamma_{rdm}, \mu_{mq}$ under the constraint $\sum_{m} \mu_{mq} = 1$ leads to the update formulas:

For $\alpha_{uq}$,

\begin{eqnarray*}
\alpha_{uq}^{t+1} & = & \frac{ \sum_{r, d} \big ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} Q^t (a=1, e=0 | c=0, r,d,u,q) \big ) }{
\sum_{r, d} \Big [ S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} \big ( Q^t (a=1, e=0 | c=0, r,d,u,q) + Q^t (a=0 | c=0, r,d,u,q) \big ) \Big ] } \\
\quad & = & \frac{ \sum_{r, d} \big ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} Q^t (a=1, e=0 | c=0, r,d,u,q) \big ) }{ \sum_{r, d} \big ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} \big ) } \\
\quad & = & \frac{1}{S_{uq}} \Big ( \sum_{r,d} S_{uqrd}^{\circ} \frac{ \alpha_{uq}^t ( 1 - \sum_{m} \gamma_{rdm}^t \mu_{mq}^t ) }{
1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} + S_{uq}^{\bullet} \Big )
\end{eqnarray*}

where $S_{uq} = \sum_{r,d} ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} )$ and $S_{uq}^{\bullet} = \sum_{r,d} S_{uqrd}^{\bullet}$. The second equality holds because $Q^t (a=1, e=0 | c=0, \cdot) + Q^t (a=0 | c=0, \cdot) = 1$, and the last step uses $\sum_{m} \mu_{mq}^t = 1$.
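A minimal sketch of this update for a single $(u, q)$ pair follows. The function name `update_alpha`, the dictionary layout of the counts and all numbers are assumptions made for the illustration, not the paper's implementation.

```python
def update_alpha(alpha_uq, mu_q, gamma, clicks_uq, skips_uq):
    """One update of alpha_{uq} for a fixed (u, q).

    clicks_uq[(r, d)] and skips_uq[(r, d)] hold S^bullet_{uqrd} and
    S^circ_{uqrd}; gamma[(r, d)][m] is gamma_{rdm}; mu_q[m] is mu_{mq}.
    """
    numerator = 0.0
    total = 0
    for rd in set(clicks_uq) | set(skips_uq):
        exam = sum(g * w for g, w in zip(gamma[rd], mu_q))
        # Q^t(a=1, e=0 | c=0): attractive but not examined, given no click.
        q_attr_unexamined = alpha_uq * (1.0 - exam) / (1.0 - alpha_uq * exam)
        n_click = clicks_uq.get(rd, 0)
        n_skip = skips_uq.get(rd, 0)
        numerator += n_click + n_skip * q_attr_unexamined
        total += n_click + n_skip
    return numerator / total

# Hypothetical data: 3 clicks and 7 skips at (r, d) = (1, 1).
gamma = {(1, 1): [0.9, 0.3]}
mu_q = [0.25, 0.75]
alpha_new = update_alpha(0.6, mu_q, gamma, {(1, 1): 3}, {(1, 1): 7})
```

Every click counts fully toward attractiveness, while every skip contributes only the posterior probability that the result was attractive but not examined.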

For $\gamma_{rdm}$,

\begin{eqnarray*}
\gamma_{rdm}^{t+1} & = & \frac{ \sum_{u,q} \left ( S_{uqrd}^{\bullet} Q^t (m | c=1, r,d,u,q) + S_{uqrd}^{\circ} Q^t (e=1, m | c=0, r,d,u,q) \right ) }{
\sum_{u,q} \left ( S_{uqrd}^{\bullet} Q^t (m | c=1, r,d,u,q) + S_{uqrd}^{\circ} Q^t ( m | c=0, r,d,u,q) \right ) } \\
\quad & \equiv & A / B ,
\end{eqnarray*}

in which

\begin{eqnarray*}
A & = & \sum_{u,q} \Big ( S_{uqrd}^{\bullet} Q^t (m | c=1, r,d,u,q) + S_{uqrd}^{\circ} Q^t (e=1, m | c=0, r,d,u,q) \Big ) \\
& = & \sum_{u,q} \Big ( S_{uqrd}^{\circ} \frac{ (1 - \alpha_{uq}^t ) \gamma_{rdm}^t \mu_{mq}^t }{1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} + S_{uqrd}^{\bullet} \frac{ \gamma_{rdm}^t \mu_{mq}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} \Big ) \\
B & = & \sum_{u,q} \Big ( S_{uqrd}^{\bullet} Q^t (m | c=1, r,d,u,q) + S_{uqrd}^{\circ} Q^t ( m | c=0, r,d,u,q) \Big ) \\
& = & \sum_{u,q} \Big ( S_{uqrd}^{\circ} \frac{ (1 - \alpha_{uq}^t \gamma_{rdm}^t ) \mu_{mq}^t }{1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} + S_{uqrd}^{\bullet} \frac{ \gamma_{rdm}^t \mu_{mq}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} \Big )
\end{eqnarray*}
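The ratio $A / B$ can be sketched for a fixed $(r, d)$ as follows. As before, `update_gamma`, the data layout and the numbers are hypothetical choices for the illustration.

```python
def update_gamma(gamma_rd, alpha, mu, clicks_rd, skips_rd):
    """One update of gamma_{rdm} for a fixed (r, d), for all m at once.

    clicks_rd[(u, q)] and skips_rd[(u, q)] hold S^bullet_{uqrd} and
    S^circ_{uqrd}; alpha[(u, q)] is alpha_{uq}; mu[q][m] is mu_{mq}.
    """
    n_modes = len(gamma_rd)
    a_sum = [0.0] * n_modes  # A, one entry per m
    b_sum = [0.0] * n_modes  # B, one entry per m
    for uq in set(clicks_rd) | set(skips_rd):
        a_uq = alpha[uq]
        mu_q = mu[uq[1]]
        exam = sum(g * w for g, w in zip(gamma_rd, mu_q))
        z = 1.0 - a_uq * exam  # P(c = 0 | ...)
        n_click = clicks_rd.get(uq, 0)
        n_skip = skips_rd.get(uq, 0)
        for m in range(n_modes):
            q_m_click = gamma_rd[m] * mu_q[m] / exam                 # Q(m | c=1)
            q_e1_m_skip = (1.0 - a_uq) * gamma_rd[m] * mu_q[m] / z   # Q(e=1, m | c=0)
            q_m_skip = (1.0 - a_uq * gamma_rd[m]) * mu_q[m] / z      # Q(m | c=0)
            a_sum[m] += n_click * q_m_click + n_skip * q_e1_m_skip
            b_sum[m] += n_click * q_m_click + n_skip * q_m_skip
    return [a / b for a, b in zip(a_sum, b_sum)]

# Hypothetical data: one (u, q) pair with 3 clicks and 7 skips.
alpha = {("u1", "q1"): 0.6}
mu = {"q1": [0.25, 0.75]}
gamma_new = update_gamma([0.9, 0.3], alpha, mu,
                         {("u1", "q1"): 3}, {("u1", "q1"): 7})
```

Since $Q^t(e=1, m | c=0) \le Q^t(m | c=0)$, each entry of `a_sum` is at most the matching entry of `b_sum`, so the updated values stay in $[0, 1]$.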

For $\mu_{mq}$, with $\lambda_{q}$ a normalizing constant fixed by the constraint,

$$
\mu_{mq}^{t+1} = \lambda_{q} \sum_{u, r, d} \Big [ S_{uqrd}^{\bullet} Q^t (m | c=1, r,d, u,q) + S_{uqrd}^{\circ} Q^t (m | c=0, r,d, u,q) \Big ] .
$$

Writing $S_{q} = \sum_{u,r,d} ( S_{uqrd}^{\bullet} + S_{uqrd}^{\circ} )$, the constraint gives $\sum_{m} \mu_{mq}^{t+1} = 1 \implies \lambda_{q} S_{q} = 1 \implies \lambda_{q} = \frac{1}{S_{q}}$, thus

\begin{eqnarray*}
\mu_{mq}^{t+1} & = & \frac{1}{S_{q}} \sum_{u,r,d} \Big [ S_{uqrd}^{\circ} \frac{ (1 - \alpha_{uq}^t \gamma_{rdm}^t ) \mu_{mq}^t }{1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} + S_{uqrd}^{\bullet} \frac{ \gamma_{rdm}^t \mu_{mq}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} \Big ] \\
& = & \frac{ \mu_{mq}^t }{S_{q}} \sum_{u,r,d} \Big [ S_{uqrd}^{\circ} \frac{ 1 - \alpha_{uq}^t \gamma_{rdm}^t }{1 - \alpha_{uq}^t \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} + S_{uqrd}^{\bullet} \frac{ \gamma_{rdm}^t }{ \sum_{m} \gamma_{rdm}^t \mu_{mq}^t} \Big ]
\end{eqnarray*}
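Putting the three updates together gives one full EM iteration. The sketch below is an illustration under assumptions: the function names, the `counts` layout (one `(clicks, skips)` pair per `(u, q, r, d)`), the initial values and the toy data are all made up. It tracks the log-likelihood, which an exact EM step should never decrease, and the normalization of $\mu_{mq}$.

```python
import math

def exam_prob(gamma_rd, mu_q):
    """sum_m gamma_{rdm} mu_{mq}."""
    return sum(g * w for g, w in zip(gamma_rd, mu_q))

def log_likelihood(counts, alpha, gamma, mu):
    total = 0.0
    for (u, q, r, d), (n_click, n_skip) in counts.items():
        p = alpha[(u, q)] * exam_prob(gamma[(r, d)], mu[q])
        total += n_click * math.log(p) + n_skip * math.log(1.0 - p)
    return total

def em_step(counts, alpha, gamma, mu):
    """One E-step plus M-step; returns new (alpha, gamma, mu)."""
    n_modes = len(next(iter(mu.values())))
    a_num = {k: 0.0 for k in alpha}
    a_den = {k: 0.0 for k in alpha}
    g_num = {k: [0.0] * n_modes for k in gamma}
    g_den = {k: [0.0] * n_modes for k in gamma}
    m_num = {k: [0.0] * n_modes for k in mu}
    m_den = {k: 0.0 for k in mu}
    for (u, q, r, d), (n_click, n_skip) in counts.items():
        a, g, w = alpha[(u, q)], gamma[(r, d)], mu[q]
        exam = exam_prob(g, w)
        z = 1.0 - a * exam  # P(c = 0 | ...)
        a_num[(u, q)] += n_click + n_skip * a * (1.0 - exam) / z
        a_den[(u, q)] += n_click + n_skip
        m_den[q] += n_click + n_skip  # accumulates S_q
        for m in range(n_modes):
            q_m_click = g[m] * w[m] / exam
            q_e1_m_skip = (1.0 - a) * g[m] * w[m] / z
            q_m_skip = (1.0 - a * g[m]) * w[m] / z
            weight = n_click * q_m_click + n_skip * q_m_skip
            g_num[(r, d)][m] += n_click * q_m_click + n_skip * q_e1_m_skip
            g_den[(r, d)][m] += weight
            m_num[q][m] += weight
    new_alpha = {k: a_num[k] / a_den[k] for k in alpha}
    new_gamma = {k: [n / dn for n, dn in zip(g_num[k], g_den[k])] for k in gamma}
    new_mu = {k: [n / m_den[k] for n in m_num[k]] for k in mu}
    return new_alpha, new_gamma, new_mu

# Hypothetical toy data: (clicks, skips) per (u, q, r, d).
counts = {
    ("u1", "q1", 1, 1): (30, 70),
    ("u1", "q1", 2, 1): (10, 90),
    ("u2", "q1", 1, 1): (50, 50),
    ("u2", "q1", 2, 1): (20, 80),
}
alpha = {("u1", "q1"): 0.5, ("u2", "q1"): 0.5}
gamma = {(1, 1): [0.8, 0.4], (2, 1): [0.6, 0.2]}
mu = {"q1": [0.5, 0.5]}

history = [log_likelihood(counts, alpha, gamma, mu)]
for _ in range(20):
    alpha, gamma, mu = em_step(counts, alpha, gamma, mu)
    history.append(log_likelihood(counts, alpha, gamma, mu))
```

The sketch assumes every key of `alpha`, `gamma` and `mu` occurs in `counts`; otherwise a denominator would be zero. A non-decreasing `history` is a cheap consistency check on the derivation, though it does not prove that the formulas match the paper's appendix.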

[1] G. Dupret and B. Piwowarski. A user browsing model to predict search engine click data from past observations. In ACM SIGIR Conference, 2008.