
[2014] A Detailed Explanation of the Mathematical Principles in word2vec (peghoty)

peghoty (peghoty@163.com)
July 2014

Contents

1 Preface
2 Preliminaries
  2.1 The sigmoid function
  2.2 Logistic regression
  2.3 The Bayes formula
  2.4 Huffman coding
    2.4.1 Huffman trees
    2.4.2 Constructing a Huffman tree
    2.4.3 Huffman coding
3 Background
  3.1 Statistical language models
  3.2 The n-gram model
  3.3 The neural probabilistic language model
  3.4 Understanding word vectors
4 Models based on Hierarchical Softmax
  4.1 The CBOW model
    4.1.1 Network structure
    4.1.2 Gradient computation
  4.2 The Skip-gram model
    4.2.1 Network structure
    4.2.2 Gradient computation
5 Models based on Negative Sampling
  5.1 The CBOW model
  5.2 The Skip-gram model
  5.3 The negative sampling algorithm
6 Some source-code details
  6.1 Approximate computation of σ(x)
  6.2 Storing the vocabulary
  6.3 Newline characters
  6.4 Low-frequency and high-frequency words
  6.5 Window and context
  6.6 Adaptive learning rate
  6.7 Parameter initialization and training
  6.8 Multithreaded parallelism
  6.9 Some questions and reflections

§1 Preface

word2vec is a toolkit for obtaining word vectors that Google open-sourced in 2013. It is simple and efficient, and it has therefore attracted a great deal of attention. Because its author Tomas Mikolov and his colleagues gave few algorithmic details in the two related papers ([3], [4]), the toolkit acquired a certain air of mystery.
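Some readers who could not contain their curiosity therefore chose to dissect the source code to find out what is going on.

I first came across word2vec in October 2013, when I read the paper [7], whose main work was to carry the SENNA algorithms ([8]) over to a Chinese setting. I found it interesting and wrote an implementation (see [20]), but training the character vectors took too long, so I used word2vec to supply them instead. Unexpectedly, the Chinese word segmentation results were quite good, which immediately raised my opinion of word2vec and my curiosity along with it. Later I saw a number of concrete applications of word2vec, and Tomas Mikolov's team themselves extended it to sentences and documents ([6]). I therefore felt it was worth understanding the algorithmic principles inside word2vec in order to follow their subsequent research. So I sat down and read the code carefully, and I now more or less understand what it does. My first impression was: this is clearly a very simple shallow structure, so why do so many people describe it as Deep Learning?

Dissecting the word2vec source code taught me a good deal about programming technique as well as about the algorithms. Having spent the effort to read the code, I have organized what I understood into this article as a reference for anyone who needs it. While preparing it I had several helpful discussions with a member of a deep-learning discussion group ([15, 16]), for which I am grateful. I also consulted material written by others, all of which is listed in the references, and I thank those authors as well.

§2 Preliminaries

This section introduces some background that word2vec relies on: the sigmoid function, the Bayes formula, and Huffman coding.

§2.1 The sigmoid function

The sigmoid function is one of the activation functions commonly used in neural networks. It is defined as

$$\sigma(x) = \frac{1}{1 + e^{-x}},$$

with domain $(-\infty, +\infty)$ and range $(0, 1)$. Figure 1 shows its graph.

Figure 1: Graph of the sigmoid function.

The derivative of the sigmoid function has the form

$$\sigma'(x) = \sigma(x)\,[1 - \sigma(x)],$$

from which it follows easily that the derivatives of $\log \sigma(x)$ and $\log(1 - \sigma(x))$ are

$$[\log \sigma(x)]' = 1 - \sigma(x), \qquad [\log(1 - \sigma(x))]' = -\sigma(x). \tag{2.1}$$

Formula (2.1) is used in later derivations.

§2.2 Logistic regression

Binary classification problems arise constantly in practice: whether an email is spam, whether a customer is a potential customer, whether an online transaction is fraudulent, and so on. Let $\{(x_i, y_i)\}_{i=1}^{m}$ be the sample data of a binary classification problem, where $x_i \in \mathbb{R}^n$ and $y_i \in \{0, 1\}$. A sample with $y_i = 1$ is called a positive example, and one with $y_i = 0$ a negative example.

Using the sigmoid function, for any sample $x = (x_1, x_2, \cdots, x_n)^\top$ the hypothesis function of the binary classification problem can be written as

$$h_\theta(x) = \sigma(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n),$$

where $\theta = (\theta_0, \theta_1, \cdots, \theta_n)^\top$ is the parameter vector to be determined. To simplify notation, introduce $x_0 = 1$ and extend $x$ to $(x_0, x_1, x_2, \cdots, x_n)^\top$; where no confusion can arise, the extended vector is still denoted $x$. Then $h_\theta$ can be written compactly as

$$h_\theta(x) = \sigma(\theta^\top x) = \frac{1}{1 + e^{-\theta^\top x}}.$$

Taking the threshold $T = 0.5$, the decision rule is

$$y(x) = \begin{cases} 1, & h_\theta(x) \ge 0.5; \\ 0, & h_\theta(x) < 0.5. \end{cases}$$

How is the parameter $\theta$ found? The usual approach is to fix an overall loss function of the form

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(x_i, y_i)$$

and then optimize it to obtain the optimal parameter $\theta^*$. In practice, the loss of a single sample is usually taken to be the log-likelihood function

$$\mathrm{cost}(x_i, y_i) = \begin{cases} -\log(h_\theta(x_i)), & y_i = 1; \\ -\log(1 - h_\theta(x_i)), & y_i = 0. \end{cases}$$

This is a piecewise function; it can also be written as the single expression

$$\mathrm{cost}(x_i, y_i) = -y_i \cdot \log(h_\theta(x_i)) - (1 - y_i) \cdot \log(1 - h_\theta(x_i)).$$

§2.3 The Bayes formula

The Bayes formula, due to the English mathematician Thomas Bayes, describes the relationship between two conditional probabilities. Let $P(A)$ and $P(B)$ denote the probabilities that events $A$ and $B$ occur, $P(A|B)$ the probability that $A$ occurs given that $B$ has occurred, and $P(A, B)$ the probability that $A$ and $B$ occur together. Then

$$P(A|B) = \frac{P(A, B)}{P(B)}, \qquad P(B|A) = \frac{P(A, B)}{P(A)},$$

and combining the two gives

$$P(A|B) = P(A)\,\frac{P(B|A)}{P(B)},$$

which is the Bayes formula.

§2.4 Huffman coding

This section gives a brief introduction to Huffman coding (the material is taken mainly from the encyclopedia entries cited as [10]). We first introduce the definition of a Huffman tree and the algorithm for constructing one.

§2.4.1 Huffman trees

In computer science, a tree is an important nonlinear data structure in which data elements (called nodes in a tree) are organized according to branching relationships.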
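A set of pairwise disjoint trees is called a forest. Some commonly used concepts related to trees are listed below.

• Path and path length
  In a tree, the route from a node down to one of its child or descendant nodes is called a path. The number of branches on the route is called the path length. If the root is assigned level 1, the path length from the root to a node at level $L$ is $L - 1$.

• Node weight and weighted path length
  If a node of the tree is assigned a (non-negative) number with some meaning, this number is called the weight of the node. The weighted path length of a node is the product of the path length from the root to that node and the weight of the node.

• Weighted path length of a tree
  The weighted path length of a tree is defined as the sum of the weighted path lengths of all its leaf nodes.

A binary tree is an ordered tree in which every node has at most two subtrees. The two subtrees are usually called the "left subtree" and the "right subtree"; "ordered" means that left and right are distinguished and cannot be interchanged.

Given $n$ weights as $n$ leaf nodes, construct a binary tree. If its weighted path length is the minimum possible, the tree is called an optimal binary tree, also known as a Huffman tree.

§2.4.2 Constructing a Huffman tree

Given $n$ weights $\{w_1, w_2, \cdots, w_n\}$ as the $n$ leaf nodes of a binary tree, a Huffman tree can be constructed with the following algorithm.

Algorithm 2.1 (Huffman tree construction)
(1) Regard $\{w_1, w_2, \cdots, w_n\}$ as a forest of $n$ trees (each tree consisting of a single node).
(2) Select from the forest the two trees whose roots have the smallest weights and merge them as the left and right subtrees of a new tree; the weight of the new root is the sum of the weights of the roots of its left and right subtrees.
(3) Delete the two selected trees from the forest and add the new tree to the forest.
(4) Repeat steps (2) and (3) until only one tree remains in the forest. That tree is the desired Huffman tree.

Here is a concrete example of Algorithm 2.1.

Example 2.1 Suppose that during the 2014 World Cup a number of football-related microblog posts were collected, and that after counting, the six words "I", "like", "watch", "Brazil", "football" and "World Cup" were found to occur 15, 8, 6, 5, 3 and 1 times respectively. Taking these six words as leaf nodes and their counts as weights, construct a Huffman tree.

Figure 2: Construction of the Huffman tree.

Using Algorithm 2.1, the tree is easily obtained; the construction is shown in Figure 2, whose sixth step is the final Huffman tree. The figure shows that the more frequent a word is, the closer it lies to the root.

The nodes created by merging are marked in yellow in the figure. Since every merge combines two nodes into one new node, a Huffman tree with $n$ leaf nodes contains $n - 1$ newly created nodes. In this example $n = 6$, so there are 5 new nodes.

As mentioned above, the two subtrees of a binary tree are distinguished as left and right; for a non-leaf node this means its two children are distinguished as left and right. In this example the child with the larger frequency is always made the left child and the one with the smaller frequency the right child. This is only a convention; making the more frequent node the right child would work equally well.

§2.4.3 Huffman coding

In data communication, the text to be transmitted must be converted into a binary string, with different arrangements of 0s and 1s representing the characters. For example, suppose the message is "AFTER DATA EAR ARE ART AREA". The character set is "A, E, R, T, F, D", and the letters occur 8, 4, 5, 3, 1, 1 times respectively. We want to design codes for these letters.

To distinguish 6 letters, the simplest binary scheme is fixed-length coding with 3 bits ($2^3 = 8 > 6$): the codes 000, 001, 010, 011, 100, 101 can be assigned to "A, E, R, T, F, D", and the receiver decodes the message three bits at a time. The code length clearly depends on the number of distinct characters in the message. If 26 distinct characters may occur, the fixed code length is 5 ($2^5 = 32 > 26$). When transmitting a message, however, we want the total length to be as short as possible. In practice, characters differ in how often they occur: A, B, C are used far more often than X, Y, Z. It is natural to give short codes to frequent characters and long codes to rare ones, so as to shorten the encoding of the whole message.

To make a variable-length code a prefix code (that is, no character's code may be a prefix of another character's code), each character of the character set can be used as a leaf node of a binary coding tree. To obtain the shortest message length, the frequency of each character is assigned to its node as the weight. The lower the frequency, the smaller the weight, and the smaller the weight, the lower the leaf sits in the tree, so rare characters get long codes and frequent characters get short ones. This guarantees the minimum weighted path length of the tree, which in effect is the shortest length of the transmitted message. The problem of finding the shortest message length is thus converted into the problem of building a Huffman tree whose leaf nodes are the characters of the character set and whose weights are their frequencies. A binary prefix code designed from a Huffman tree is called a Huffman code; it satisfies the prefix condition and also guarantees the shortest total encoded length.

The word2vec tool described in this article also uses Huffman coding. It treats the words of the training corpus as leaf nodes and their occurrence counts in the corpus as weights, and assigns a Huffman code to every word by constructing the corresponding Huffman tree.

Figure 3 gives the Huffman codes of the six words of Example 2.1, with the convention that the left child (larger frequency) is coded 1 and the right child (smaller frequency) is coded 0.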
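As a rough illustration of Algorithm 2.1 together with the coding convention just stated, the following Python sketch builds Huffman codes from a table of weights. The function name `huffman_codes` and the labels `w1` to `w6` are hypothetical, the use of a `heapq` priority queue is an implementation choice that the text does not prescribe, and ties between equal weights are broken by insertion order, which the text leaves unspecified.

```python
import heapq
from itertools import count


def huffman_codes(weights):
    """Return {symbol: code} for a dict {symbol: weight}; symbols must be strings.

    Convention assumed from the text: at every merge the heavier subtree
    becomes the left child and is coded 1, the lighter one is coded 0.
    """
    tie = count()  # unique tie-breaker so the heap never compares tree payloads
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w_small, _, small = heapq.heappop(heap)  # lightest root in the forest
        w_large, _, large = heapq.heappop(heap)  # second-lightest root
        # An internal node is a (left, right) tuple with the heavier subtree on the left.
        heapq.heappush(heap, (w_small + w_large, next(tie), (large, small)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "1")  # left child (heavier) is coded 1
            walk(node[1], prefix + "0")  # right child (lighter) is coded 0
        else:
            codes[node] = prefix or "0"  # a single-symbol input gets the code "0"

    walk(heap[0][2], "")
    return codes


# Weights of Example 2.1, with placeholder labels instead of the actual words.
example = {"w1": 15, "w2": 8, "w3": 6, "w4": 5, "w5": 3, "w6": 1}
codes = huffman_codes(example)
```

Each of the five merges creates one internal node, matching the count of $n - 1$ new nodes noted in §2.4.2.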
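With the left child coded 1 and the right child coded 0, the Huffman codes of the six words "I", "like", "watch", "Brazil", "football" and "World Cup" are 0, 111, 110, 101, 1001 and 1000 respectively.

Figure 3: Huffman coding.

Note that so far two conventions have been adopted for Huffman trees and Huffman codes: (1) the node with the larger weight is the left child and the node with the smaller weight is the right child; (2) the left child is coded 1 and the right child is coded 0. In the word2vec source code, the child with the larger weight is coded 1 and the child with the smaller weight is coded 0. For consistency with the conventions above, "left child" below always means the child with the larger weight.

§3 Background

word2vec is a tool for generating word vectors, and word vectors are closely tied to language models. It is therefore worth first reviewing some facts about language models.

§3.1 Statistical language models

With the rapid growth of the Internet, huge amounts of text, image, audio and video data are produced every day. Processing these data and extracting valuable information from them depends on Natural Language Processing (NLP), in which the statistical language model is a key component. It underlies all of NLP and is widely used in speech recognition, machine translation, word segmentation, part-of-speech tagging, information retrieval and other tasks.

Example 3.1 In a speech recognition system, given a speech segment Voice, we need to find the text segment Text that maximizes the probability $p(\mathrm{Text}|\mathrm{Voice})$. By the Bayes formula,

$$p(\mathrm{Text}|\mathrm{Voice}) = \frac{p(\mathrm{Voice}|\mathrm{Text}) \cdot p(\mathrm{Text})}{p(\mathrm{Voice})},$$

where $p(\mathrm{Voice}|\mathrm{Text})$ is the acoustic model and $p(\mathrm{Text})$ is the language model ([18]).

Put simply, a statistical language model is a probabilistic model for computing the probability of a sentence, and it is usually built from a corpus. What is the probability of a sentence? Let $W = w_1^T := (w_1, w_2, \cdots, w_T)$ denote a sentence formed by the $T$ words $w_1, w_2, \cdots, w_T$ in order. The joint probability of $w_1, w_2, \cdots, w_T$,

$$p(W) = p(w_1^T) = p(w_1, w_2, \cdots, w_T),$$

is the probability of the sentence. By the Bayes formula it can be decomposed as the chain

$$p(w_1^T) = p(w_1) \cdot p(w_2|w_1) \cdot p(w_3|w_1^2) \cdots p(w_T|w_1^{T-1}), \tag{3.1}$$

where the (conditional) probabilities $p(w_1), p(w_2|w_1), p(w_3|w_1^2), \cdots, p(w_T|w_1^{T-1})$ are the parameters of the language model. Once all of these parameters have been computed, the probability $p(w_1^T)$ of any given sentence $w_1^T$ can be obtained quickly.

This looks simple, but implementing it is troublesome. Consider first the number of model parameters. For one given sentence of length $T$, $T$ parameters must be computed. Suppose the dictionary $D$ corresponding to the corpus has size (vocabulary size) $N$. For arbitrary sentences of length $T$ there are in theory $N^T$ possibilities, each requiring $T$ parameters, so $T N^T$ parameters are needed in total. This is only a crude estimate that ignores repeated parameters, but the order of magnitude is still daunting. Moreover, once computed, these probabilities have to be stored, which requires a great deal of memory.

How are the parameters computed? Common methods include the n-gram model, decision trees, maximum entropy models, maximum entropy Markov models, conditional random fields and neural networks. This article discusses only the n-gram model and the neural network approach. We begin with the n-gram model.

§3.2 The n-gram model

Consider the approximate computation of $p(w_k|w_1^{k-1})$ for $k > 1$. By the Bayes formula,

$$p(w_k|w_1^{k-1}) = \frac{p(w_1^k)}{p(w_1^{k-1})},$$

and by the law of large numbers, when the corpus is large enough, $p(w_k|w_1^{k-1})$ can be approximated as

$$p(w_k|w_1^{k-1}) \approx \frac{\mathrm{count}(w_1^k)}{\mathrm{count}(w_1^{k-1})}, \tag{3.2}$$

where $\mathrm{count}(w_1^k)$ and $\mathrm{count}(w_1^{k-1})$ denote the number of times the word strings $w_1^k$ and $w_1^{k-1}$ occur in the corpus. Clearly, when $k$ is large, collecting these counts is very time-consuming.

Formula (3.1) shows that the probability of a word depends on all the words before it. What if we assume that it depends only on a fixed number of preceding words? This is the basic idea of the n-gram model. It makes a Markov assumption of order $n - 1$: the probability of a word depends only on the $n - 1$ words before it, that is,

$$p(w_k|w_1^{k-1}) \approx p(w_k|w_{k-n+1}^{k-1}),$$

so that (3.2) becomes

$$p(w_k|w_1^{k-1}) \approx \frac{\mathrm{count}(w_{k-n+1}^{k})}{\mathrm{count}(w_{k-n+1}^{k-1})}. \tag{3.3}$$

Taking $n = 2$ as an example,

$$p(w_k|w_1^{k-1}) \approx \frac{\mathrm{count}(w_{k-1}, w_k)}{\mathrm{count}(w_{k-1})}.$$

This simplification not only makes each individual parameter easier to estimate (the word strings to be matched are shorter) but also reduces the total number of parameters.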
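The $n = 2$ case of (3.3) can be sketched in a few lines of Python. This is only a minimal illustration: the function name `bigram_probability` and the toy corpus are made up for the example, and the value returned for an unseen history is an arbitrary placeholder rather than anything the text prescribes.

```python
from collections import Counter


def bigram_probability(tokens, prev, word):
    """Estimate p(word | prev) as count(prev, word) / count(prev), i.e. (3.3) with n = 2."""
    unigrams = Counter(tokens)                  # count(w_{k-1})
    bigrams = Counter(zip(tokens, tokens[1:]))  # count(w_{k-1}, w_k)
    if unigrams[prev] == 0:
        return 0.0  # unseen history: the raw ratio is undefined, 0.0 is a placeholder
    return bigrams[(prev, word)] / unigrams[prev]


# Hypothetical toy corpus, already tokenized.
tokens = "a dog is running in the room a cat is running in the room".split()
p_dog_given_a = bigram_probability(tokens, "a", "dog")
p_is_given_dog = bigram_probability(tokens, "dog", "is")
p_cat_given_the = bigram_probability(tokens, "the", "cat")
```

On this corpus the raw counts already produce estimates of exactly 0 and exactly 1, which is typical of unsmoothed count ratios on small data.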
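How large should the parameter $n$ of an n-gram model be? In general, the choice of $n$ must take both computational complexity and model quality into account.

Table 1: Number of model parameters as a function of n

  n              number of model parameters
  1 (unigram)    2 × 10^5
  2 (bigram)     4 × 10^10
  3 (trigram)    8 × 10^15
  4 (4-gram)     16 × 10^20

Regarding computational complexity, Table 1 shows how the number of parameters of an n-gram model grows with $n$, assuming a dictionary size of $N = 200000$ (roughly the order of magnitude of the Chinese vocabulary). The number of parameters is in fact exponential in $n$, namely $O(N^n)$, so $n$ cannot be taken very large. In practice the trigram model with $n = 3$ is used most often.

Regarding model quality, in theory the larger $n$ is, the better. Today, the massive data available on the Internet and improvements in hardware make it possible to compute higher-order language models (for example $n > 10$), but once $n$ is large enough the gain in quality diminishes. For example, going from $n = 1$ to 2 and from 2 to 3 improves the model markedly, whereas going from 3 to 4 brings no marked improvement (see the relevant chapter of Wu Jun's book "The Beauty of Mathematics"). There is also a trade-off between reliability and discriminative power: more parameters give better discrimination, but each parameter is then estimated from fewer instances, which lowers reliability. A compromise between the two is needed.

The n-gram model also involves an important step called smoothing. Returning to formula (3.3), consider two questions:

1. If $\mathrm{count}(w_{k-n+1}^{k}) = 0$, can we conclude that $p(w_k|w_1^{k-1})$ equals 0?
2. If $\mathrm{count}(w_{k-n+1}^{k}) = \mathrm{count}(w_{k-n+1}^{k-1})$, can we conclude that $p(w_k|w_1^{k-1})$ equals 1?

Clearly not. Yet the problem cannot be avoided, however large the corpus is. Smoothing techniques are designed to deal with it; they are not discussed here, and the reader is referred to [11].

In summary, the main work in an n-gram model is counting the occurrences of the various word strings in the corpus and smoothing the results. The computed probabilities are stored, and when the probability of a sentence is needed, the relevant parameters are looked up and multiplied together.

In machine learning, however, there is a general recipe: after modelling the problem, construct an objective function for it, optimize that objective to obtain a set of optimal parameters, and finally use the model corresponding to those parameters to make predictions. For a statistical language model, applying maximum likelihood gives the objective function

$$\prod_{w \in C} p(w|\mathrm{Context}(w)),$$

where $C$ denotes the corpus and $\mathrm{Context}(w)$ denotes the context of the word $w$, that is, the set of words surrounding $w$. When $\mathrm{Context}(w)$ is empty, we take $p(w|\mathrm{Context}(w)) = p(w)$. In particular, for the n-gram model above, $\mathrm{Context}(w_i) = w_{i-n+1}^{i-1}$.

Remark 3.1 The difference between the corpus $C$ and the dictionary $D$: the dictionary $D$ is extracted from the corpus $C$ and contains no repeated words, whereas the corpus $C$ is the entire text content, repeated words included.

In practice the maximum log-likelihood is normally used, that is, the objective function is taken to be

$$\mathcal{L} = \sum_{w \in C} \log p(w|\mathrm{Context}(w)), \tag{3.4}$$

which is then maximized.

In (3.4) the probability $p(w|\mathrm{Context}(w))$ is regarded as a function of $w$ and $\mathrm{Context}(w)$:

$$p(w|\mathrm{Context}(w)) = F(w, \mathrm{Context}(w), \theta),$$

where $\theta$ is the set of parameters to be determined. Once (3.4) has been optimized and the optimal parameter set $\theta^*$ obtained, $F$ is uniquely determined, and any probability $p(w|\mathrm{Context}(w))$ can afterwards be computed through $F(w, \mathrm{Context}(w), \theta^*)$. Compared with n-gram, this approach does not need to compute and store all the probability values in advance; it obtains them by direct computation, and with a suitably chosen model the number of parameters in $\theta$ can be made far smaller than the number of parameters of an n-gram model.

Clearly, the crux of this approach is the construction of the function $F$. The next subsection describes one way of constructing $F$ with a neural network. It is singled out because it can be regarded as the predecessor, or the foundation, of the algorithmic framework of word2vec.

§3.3 The neural probabilistic language model

This subsection describes the neural probabilistic language model proposed by Bengio et al. in the paper "A neural probabilistic language model" (2003) ([2]). The model uses one important tool: the word vector.

What is a word vector? Put simply, to every word $w$ in the dictionary $D$ we assign a real-valued vector of fixed length, $v(w) \in \mathbb{R}^m$; $v(w)$ is called the word vector of $w$, and $m$ is the length of the word vector. Word vectors are discussed further in the next subsection.

Being a neural probabilistic language model, it naturally uses a neural network. Figure 4 shows the structure of this network, which consists of four layers: the input layer, the projection layer, the hidden layer and the output layer. $W$ and $U$ are the weight matrices between the projection and hidden layers and between the hidden and output layers respectively, and $p$ and $q$ are the bias vectors of the hidden and output layers respectively.

Figure 4: Structure of the neural network.

Remark 3.2 When the neural network of [2] is mentioned, it is more commonly viewed as the three-layer structure shown in Figure 5.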
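This article describes it as the four-layer structure of Figure 4, partly for ease of description and partly to make it easier to compare with the network structure used in word2vec.

Figure 5: Structure of the three-layer neural network.

Remark 3.3 The authors of [2] also consider the case in which neurons of the projection layer are connected directly to neurons of the output layer, which adds a corresponding weight matrix. This article ignores that case, which does not affect the understanding of the essence of the algorithm. In their numerical experiments the authors found that introducing this weight matrix does not improve model quality but does reduce the number of training iterations.

For any word $w$ in the corpus $C$, take $\mathrm{Context}(w)$ to be the $n - 1$ words preceding it (as in n-gram). The pair $(\mathrm{Context}(w), w)$ is then one training sample. We now describe how the sample $(\mathrm{Context}(w), w)$ is processed by the neural network of Figure 4. Note that once the corpus $C$ and the word vector length $m$ are given, the sizes of the projection layer and the output layer are determined: the former is $(n-1)m$ and the latter is $N = |D|$, the vocabulary size of the corpus $C$. The size $n_h$ of the hidden layer is a tunable parameter specified by the user.

Why is the size of the projection layer $(n-1)m$? Because the input layer contains the word vectors of the $n - 1$ words in $\mathrm{Context}(w)$, and the projection-layer vector $x_w$ is constructed by concatenating these $n - 1$ word vectors end to end in order, giving one long vector of length $(n-1)m$. Given $x_w$, the subsequent computation is straightforward:

$$z_w = \tanh(W x_w + p), \qquad y_w = U z_w + q, \tag{3.5}$$

where $\tanh$ is the hyperbolic tangent function, used as the activation function of the hidden layer. In the formula above, $\tanh$ applied to a vector means that it is applied to each component of the vector.

Remark 3.4 A reader may ask: for the first few words of a sentence in the corpus, there are fewer than $n - 1$ preceding words; what then? In that case one or several padding vectors can be added artificially; they also take part in training.

The vector $y_w = (y_{w,1}, y_{w,2}, \cdots, y_{w,N})^\top$ obtained from these two steps is merely a vector of length $N$ whose components cannot be interpreted as probabilities. If we want the component $y_{w,i}$ to represent the probability that, given the context $\mathrm{Context}(w)$, the next word is the $i$-th word of the dictionary $D$, a softmax normalization is still required. After normalization, $p(w|\mathrm{Context}(w))$ can be expressed as

$$p(w|\mathrm{Context}(w)) = \frac{e^{y_{w,i_w}}}{\sum_{i=1}^{N} e^{y_{w,i}}}, \tag{3.6}$$

where $i_w$ denotes the index of the word $w$ in the dictionary $D$.

Formula (3.6) gives a functional expression for the probability $p(w|\mathrm{Context}(w))$; in other words, it provides the function $F(w, \mathrm{Context}(w), \theta)$ mentioned in the previous subsection. Which parameters $\theta$ must be determined? They fall into two groups:

• Word vectors: $v(w) \in \mathbb{R}^m$ for $w \in D$, together with the padding vectors.
• Neural network parameters: $W \in \mathbb{R}^{n_h \times (n-1)m}$, $p \in \mathbb{R}^{n_h}$; $U \in \mathbb{R}^{N \times n_h}$, $q \in \mathbb{R}^{N}$.

All of these parameters are obtained by the training algorithm. It is worth noting that in most machine learning algorithms the input is known, whereas in this neural probabilistic language model the input $v(w)$ itself must also be obtained by training.

Next we briefly analyse the computational cost of the model. In the network of Figure 4, the projection, hidden and output layers have sizes $(n-1)m$, $n_h$ and $N$ respectively. Consider the parameters involved:

(1) $n$ is the number of words in the context of a word, usually no more than 5;
(2) $m$ is the word vector length, usually of order $10^1 \sim 10^2$;
(3) $n_h$ is specified by the user and usually need not be large, for example of order $10^2$;
(4) $N$ is the vocabulary size of the corpus; it depends on the corpus but is usually of order $10^4 \sim 10^5$.

Combining this with (3.5) and (3.6), it is easy to see that most of the computation of the model is concentrated in the matrix-vector operation between the hidden layer and the output layer and in the softmax normalization on the output layer. Much of the subsequent research has therefore aimed at optimizing this part, and the work on word2vec is one instance.

What advantages does the neural probabilistic language model have over the n-gram model? There are two main ones.

1. Similarity between words can be expressed through the word vectors.
   For example, suppose that in some (English) corpus the sentence S1 = "A dog is running in the room" occurs 10000 times, while S2 = "A cat is running in the room" occurs only once.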
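   Under the n-gram model, $p(S_1)$ will certainly be far larger than $p(S_2)$. Note, however, that the only difference between S1 and S2 is dog versus cat, and these two words play the same role both syntactically and semantically, so $p(S_1)$ and $p(S_2)$ ought to be close.
   The values $p(S_1)$ and $p(S_2)$ computed by the neural probabilistic language model, by contrast, are roughly equal. The reasons are: (1) the model assumes that "similar" words have similar word vectors; (2) the probability function is smooth with respect to the word vectors, that is, a small change in a word vector produces only a small change in the probability. Consequently, for sentences such as
     A dog is running in the room
     A cat is running in the room
     The cat is running in a room
     A dog is walking in a bedroom
     The dog was walking in the room
     ...
   as soon as one of them occurs in the corpus, the probabilities of the others increase accordingly.

2. A model based on word vectors has smoothing built in (by (3.6), $p(w|\mathrm{Context}(w)) \in (0, 1)$ is never zero), so no extra processing of the kind needed for n-gram is required.

Finally, let us step back and ask what role the word vectors play in the neural probabilistic language model as a whole. During training they are auxiliary parameters that help construct the objective function; after training they appear to be merely a by-product of the language model. This by-product should not be underestimated, however, and the next subsection discusses it further.

§3.4 Understanding word vectors

After the discussion in the previous subsection, the reader should have a first idea of what word vectors are. We now look at them more closely.

In NLP tasks we hand natural language to machine learning algorithms, but machines cannot understand human language directly, so the first step is to turn language into mathematics. How can natural language be expressed mathematically? Word vectors offer a good way.

The simplest kind of word vector is the one-hot representation: a word is represented by a very long vector whose length is the size $N$ of the dictionary $D$; exactly one component is 1 and all others are 0, and the position of the 1 corresponds to the index of the word in the dictionary. This representation has drawbacks. It is prone to the curse of dimensionality, especially in Deep Learning settings, and it cannot capture the similarity between words well.

The other kind is the Distributed Representation, first proposed by Hinton in 1986 ([1]), which overcomes these drawbacks of the one-hot representation. The basic idea is to map, by training, every word of a language to a short vector of fixed length ("short" relative to the "long" one-hot vectors). All these vectors form a word vector space in which each vector can be seen as a point. By introducing a "distance" on this space, the (lexical and semantic) similarity between words can be judged from the distance between them. word2vec uses word vectors of this Distributed Representation kind.

Why is it called a Distributed Representation? Many people have asked this question. My own understanding is as follows. In a one-hot representation the vector has a single non-zero component, which is highly concentrated (rather like staking everything on one throw). In a Distributed Representation the vector has many non-zero components and is relatively spread out (rather like spreading the risk): the information about the word is distributed over the components. This is much like distributed parallelism in parallel computing.

To make the idea more concrete, here is an informal example.

Example 3.2 Suppose $a$ distinct points are distributed in a two-dimensional plane. Given one of them, we want to find the point in the plane that is nearest to it. How do we do this? First, set up a rectangular coordinate system, with respect to which every point corresponds uniquely to a coordinate pair $(x, y)$. Next, introduce the Euclidean distance. Finally, compute the distance between the given point and each of the other $a - 1$ points; the point (or points) with the smallest distance is what we are looking for.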
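The procedure of Example 3.2 can be written down directly. The sketch below is only an illustration: the function name `nearest_point` and the sample coordinates are invented for the example, and when several points are equally near it returns the first one found, whereas the example allows for several answers.

```python
import math


def nearest_point(points, query_index):
    """Return the index of the point closest (Euclidean distance) to points[query_index]."""
    qx, qy = points[query_index]
    best_index, best_dist = None, math.inf
    for i, (x, y) in enumerate(points):
        if i == query_index:
            continue  # skip the query point itself
        d = math.hypot(x - qx, y - qy)  # Euclidean distance in the plane
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


# Hypothetical coordinates for a = 4 points.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.5, 0.2)]
closest_to_first = nearest_point(points, 0)
```

The same loop works unchanged in higher dimensions if the distance line is replaced by a sum over all components.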
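In the example above, the coordinates $(x, y)$ play the role of a word vector: they quantify mathematically the position of a point in the plane. Once the coordinate system has been set up, obtaining the coordinates of a point is easy. In NLP tasks, however, obtaining word vectors is much more complicated, and word vectors are not unique: their quality depends on the training corpus, the training algorithm and other factors.

How are word vectors obtained? Many different models can be used to estimate them, including the well-known LSA (Latent Semantic Analysis) and LDA (Latent Dirichlet Allocation). Neural network algorithms are another commonly used approach, and the neural probabilistic language model of the previous subsection is a good example. In that model the goal is to produce a language model, and the word vectors are only a by-product. In fact, in most cases word vectors and language models are bundled together, and both are obtained at once when training finishes. The idea of training a language model with a neural network was first proposed by Wei Xu of Baidu IDL (Institute of Deep Learning) ([14]). The classic paper in this area is "A Neural Probabilistic Language Model", published by Bengio in JMLR in 2003, which was followed by a series of related studies, including word2vec from Tomas Mikolov's team at Google.

Figure 6: SENNA performance in per-word accuracy for POS, and F1 score for all the other tasks. Timing corresponds to the time needed by SENNA to pass over the given test data set (Macbook Pro i7, 2.8GHz, Intel MKL). For PSG, F1 score is the one over all sentences.

A good set of word vectors is very valuable. For example, Ronan Collobert's team used word vectors in the software package SENNA ([12]) for tasks such as POS, CHK and NER and obtained good results (see the table in Figure 6).

I recently learned of an application of word vectors in machine translation ([13]). According to the report, Tomas Mikolov's team at Google developed a technique for automatically generating dictionaries and phrase tables that can convert one language into another. The technique uses data mining to build structural models of the two languages and then compares them. The set of relationships between the words of a language, its "language space", can be represented mathematically as a set of vectors. Different languages share many properties in their vector spaces, so translation can be achieved by mapping and transforming one vector space into another. The technique works very well: the accuracy of translation between English and Spanish reaches 90%.

When explaining how the algorithm works, the introduction of [5] gives a simple example that helps in understanding how word vectors work. It is described below.

Figure 7: Positions of five words in the two vector spaces (left: English; right: Spanish).

Consider the two languages English and Spanish, and train the corresponding word vector spaces E(nglish) and S(panish). Take the five English words one, two, three, four, five, and let their word vectors in E be $u_1, u_2, u_3, u_4, u_5$. For plotting, reduce the dimension by principal component analysis (PCA) to obtain the two-dimensional vectors $v_1, v_2, v_3, v_4, v_5$, and plot these five points in the plane, as in the left panel of Figure 7.

Similarly, take the Spanish words uno, dos, tres, cuatro, cinco (corresponding to one, two, three, four, five), and let their word vectors in S be $s_1, s_2, s_3, s_4, s_5$. After PCA the two-dimensional vectors are $t_1, t_2, t_3, t_4, t_5$; plot them in the plane as well (a suitable rotation may be needed), as in the right panel of Figure 7.

Comparing the two panels, it is easy to see that the relative positions of the five words in the two vector spaces are nearly the same. This shows that the structures of the vector spaces of the two languages are similar, and it further supports the use of distance in a word vector space to describe the similarity between words.

Note that word vectors are defined for "words"; in fact the idea can also be extended to finer or coarser granularity, such as character vectors ([7]) or sentence vectors and document vectors ([6]), which provide better representations for characters, sentences, documents and similar units.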
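§4 Models based on Hierarchical Softmax

With the preparation above, this section formally introduces the two important models used in word2vec: the CBOW model (Continuous Bag-of-Words Model) and the Skip-gram model (Continuous Skip-gram Model). For these two models, the author Tomas Mikolov gives in [5] the schematic diagrams shown in Figure 8 and Figure 9.

As the figures show, both models consist of three layers: the input layer, the projection layer and the output layer. The former predicts the current word $w_t$ given its context $w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2}$ (see Figure 8); the latter does exactly the opposite, predicting the context $w_{t-2}, w_{t-1}, w_{t+1}, w_{t+2}$ given the current word $w_t$ (see Figure 9).

Figure 8: The CBOW model.
Figure 9: The Skip-gram model.

For the CBOW and Skip-gram models, word2vec provides two frameworks, designed on the basis of Hierarchical Softmax and Negative Sampling respectively. This section describes the CBOW and Skip-gram models based on Hierarchical Softmax.

In §3.2 we mentioned that the objective function of a neural-network-based language model is usually taken to be the log-likelihood function

$$\mathcal{L} = \sum_{w \in C} \log p(w|\mathrm{Context}(w)), \tag{4.1}$$

in which the key is the construction of the conditional probability function $p(w|\mathrm{Context}(w))$. The model of [2] gives one way of constructing this function (see (3.6)).

For the CBOW model based on Hierarchical Softmax in word2vec, the objective function to be optimized also has the form (4.1). For the Skip-gram model based on Hierarchical Softmax, the objective function has the form

$$\mathcal{L} = \sum_{w \in C} \log p(\mathrm{Context}(w)|w). \tag{4.2}$$

In what follows, attention should therefore be focused on the construction of $p(w|\mathrm{Context}(w))$ or $p(\mathrm{Context}(w)|w)$. Keeping this in mind is important: it gives a clear goal and prevents us from getting lost in tedious details. The two models are now described in detail from a mathematical point of view.

§4.1 The CBOW model

This subsection describes the first model in word2vec, the CBOW model.

§4.1.1 Network structure

Figure 10 shows the network structure of the CBOW model, which consists of three layers: the input layer, the projection layer and the output layer. Taking the sample $(\mathrm{Context}(w), w)$ as an example (and assuming that $\mathrm{Context}(w)$ consists of the $c$ words before and the $c$ words after $w$), the three layers are briefly described below.

1. Input layer: contains the word vectors of the $2c$ words in $\mathrm{Context}(w)$, namely $v(\mathrm{Context}(w)_1), v(\mathrm{Context}(w)_2), \cdots, v(\mathrm{Context}(w)_{2c}) \in \mathbb{R}^m$. As before, $m$ denotes the length of a word vector.
2. Projection layer: sums the $2c$ vectors of the input layer, that is, $x_w = \sum_{i=1}^{2c} v(\mathrm{Context}(w)_i) \in \mathbb{R}^m$.
3. Output layer: the output layer corresponds to a binary tree. It is the Huffman tree built with the words that occur in the corpus as leaf nodes and the number of occurrences of each word in the corpus as weights. This Huffman tree has $N$ $(= |D|)$ leaf nodes, corresponding to the words of the dictionary $D$, and $N - 1$ non-leaf nodes (the nodes marked in yellow in the figure).

Figure 10: Network structure of the CBOW model.

Comparing the network diagram of the neural probabilistic language model in §3.3 (Figure 4) with the structure of the CBOW model (Figure 10), three main differences are apparent:

1. (From input layer to projection layer) The former concatenates the vectors; the latter sums them.
2. (Hidden layer) The former has a hidden layer; the latter has none.
3. (Output layer) The former has a linear structure; the latter has a tree structure.

For the neural probabilistic language model of §3.3 we pointed out that most of the computation is concentrated in the matrix-vector operation between the hidden layer and the output layer and in the softmax normalization on the output layer. The comparison above shows that the CBOW model makes targeted changes to exactly these expensive parts: first, the hidden layer is removed; second, the output layer is replaced by a Huffman tree, which lays the foundation for the Hierarchical Softmax technique.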
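The first of the three differences can be made concrete with a short Python sketch that contrasts the two projection layers. The function names and the toy vectors are invented for illustration; the code shows only the projection step and is not taken from the word2vec source.

```python
def concat_projection(context_vectors):
    """Neural probabilistic LM style: join the context vectors end to end."""
    return [x for v in context_vectors for x in v]


def sum_projection(context_vectors):
    """CBOW style: add the context vectors component-wise, x_w = sum_i v(Context(w)_i)."""
    return [sum(components) for components in zip(*context_vectors)]


# Hypothetical context of 4 word vectors, each of length m = 2.
context = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x_concat = concat_projection(context)  # length grows with the number of context words
x_sum = sum_projection(context)        # length stays m regardless of the context size
```

With summation the projection vector always has length $m$, whereas with concatenation its length is the number of context words times $m$, which is why the projection layer of §3.3 has size $(n-1)m$.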
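§4.1.2 Gradient computation

Hierarchical Softmax is a key technique used in word2vec to improve performance. For ease of description, some notation is introduced before the technique itself is discussed. Consider a leaf node of the Huffman tree, and suppose it corresponds to the word $w$ of the dictionary $D$. Write

1. $p^w$: the path from the root to the leaf node corresponding to $w$.
2. $l^w$: the number of nodes on the path $p^w$.
3. $p^w_1, p^w_2, \cdots, p^w_{l^w}$: the $l^w$ nodes on the path $p^w$, where $p^w_1$ is the root and $p^w_{l^w}$ is the node corresponding to the word $w$.
4. $d^w_2, d^w_3, \cdots, d^w_{l^w} \in \{0, 1\}$: the Huffman code of the word $w$, consisting of $l^w - 1$ bits, where $d^w_j$ is the code of the $j$-th node on the path $p^w$ (the root has no code).
5. $\theta^w_1, \theta^w_2, \cdots, \theta^w_{l^w - 1} \in \mathbb{R}^m$: the vectors associated with the non-leaf nodes on the path $p^w$, where $\theta^w_j$ is the vector of the $j$-th non-leaf node on the path $p^w$.

Remark 4.1 Strictly speaking, what we want are the vectors of the words of the dictionary $D$ (that is, of the leaf nodes of the Huffman tree). Why, then, is a vector of the same length also defined for every non-leaf node of the Huffman tree? These are in fact only auxiliary vectors used by the algorithm; their exact purpose is explained below.

Having introduced this much abstract notation, let us make it concrete with a simple example. Consider Figure 11, which again uses Example 2.1 of the preliminaries, and take the word $w$ = "football".

In Figure 11, the 5 nodes joined by the 4 red edges form the path $p^w$, whose length is $l^w = 5$. The 5 nodes on the path are $p^w_1, p^w_2, p^w_3, p^w_4, p^w_5$, with $p^w_1$ the root. The values $d^w_2, d^w_3, d^w_4, d^w_5$ are 1, 0, 0, 1 respectively, that is, the Huffman code of "football" is 1001. In addition, $\theta^w_1, \theta^w_2, \theta^w_3, \theta^w_4$ denote the vectors of the 4 non-leaf nodes on the path $p^w$.

Figure 11: Notation for the word $w$ = "football".

With the network structure of Figure 10, how should the conditional probability function $p(w|\mathrm{Context}(w))$ be defined? More specifically, how can the vector $x_w \in \mathbb{R}^m$ and the Huffman tree be used to define $p(w|\mathrm{Context}(w))$?

Take the word $w$ = "football" of Figure 11 as an example. Going from the root to the leaf node "football" involves 4 branchings (each red edge corresponds to one branching), and each branching can be regarded as one binary classification.

Since the problem is viewed from the standpoint of binary classification, for each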