๐Ÿง  Deep Learning/RNN

๐Ÿง  Deep Learning/RNN

[RNN] Seq2seq Learning - Encoder & Decoder, Attention, Feedforward Neural Network

Sequence-to-sequence model `Seq2Seq` ๋ชจ๋ธ์€ words, letters, features of images ๋“ฑ์˜ sequence data๋ฅผ Inputs์œผ๋กœ ์‚ฌ์šฉํ•˜๋ฉฐ Outputs ๋˜ํ•œ ๋˜๋‹ค๋ฅธ sequence data์ด๋‹ค. ์—ฌ๊ธฐ์„œ ์ž…๋ ฅ์— ์‚ฌ์šฉํ•˜๋Š” sequence์— ํ•ด๋‹นํ•˜๋Š” item์˜ ๊ฐœ์ˆ˜์™€ ์ถœ๋ ฅ์˜ sequence์— ํ•ด๋‹นํ•˜๋Š” item์˜ ๊ฐœ์ˆ˜๊ฐ€ ๋™์ผํ•  ํ•„์š”๋Š” ์—†๋‹ค. ์ด๋Ÿฌํ•œ sequence-to-sequence ๋ชจ๋ธ์€ ๋ฒˆ์—ญ ๋จธ์‹ ์œผ๋กœ ์‚ฌ์šฉ๋˜๋ฉฐ ์ด ๊ฒฝ์šฐ sequence๋Š” ๋‹จ์–ด๋“ค๋กœ ๊ตฌ์„ฑ๋˜๋ฉฐ, output ๋˜ํ•œ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ๋‹จ์–ด๋“ค๋กœ ๊ตฌ์„ฑ๋œ๋‹ค. Encoder-Decoder Seq2Seq ๋ชจ๋ธ์€ `Encoder`์™€ `Decoder`๋กœ ๊ตฌ์„ฑ๋œ๋‹ค. ๊ฐ๊ฐ์˜ ์—ญํ• ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. Encoder : in..

๐Ÿง  Deep Learning/RNN

[RNN] ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง - RNN, Vanilla RNN, encoder-decoder, BPTT, LSTM, GRU, Attention

๊ธฐ์–ต์„ ๊ฐ–๋Š” ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ RNN ๊ธฐ์–ต์„ ์ „๋‹ฌํ•˜๋Š” ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง ์‹œ๊ฐ„๊ณผ ๊ณต๊ฐ„์  ์ˆœ์„œ ๊ด€๊ณ„๊ฐ€ ์žˆ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ `์ˆœ์ฐจ ๋ฐ์ดํ„ฐ(sequence data)`๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค. ์ˆœ์ฐจ ๋ฐ์ดํ„ฐ๋Š” ์‹œ๊ณต๊ฐ„์˜ ์ˆœ์„œ ๊ด€๊ณ„๋กœ ํ˜•์„ฑ๋˜๋Š” ๋ฌธ๋งฅ ๋˜๋Š” `์ฝ˜ํ…์ŠคํŠธ(context)`๋ฅผ ๊ฐ–๋Š”๋‹ค. ํ˜„์žฌ ๋ฐ์ดํ„ฐ๋ฅผ ์ดํ•ดํ•  ๋•Œ ์•ž๋’ค์— ์žˆ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํ•จ๊ป˜ ์‚ดํŽด๋ณด๋ฉด์„œ ์ฝ˜ํ…์ŠคํŠธ๋ฅผ ํŒŒ์•…ํ•ด์•ผ ํ˜„์žฌ ๋ฐ์ดํ„ฐ์˜ ์—ญํ• ์„ ์ดํ•ดํ•  ์ˆ˜ ์žˆ๋‹ค. ์ธ๊ณต ์‹ ๊ฒฝ๋ง์ด ๋ฐ์ดํ„ฐ์˜ ์ˆœ์„œ๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ์ฝ˜ํ…์ŠคํŠธ๋ฅผ ๋งŒ๋“ค๋ ค๋ฉด ๋ฐ์ดํ„ฐ์˜ ์ˆœ์ฐจ ๊ตฌ์กฐ๋ฅผ ์ธ์‹ํ•  ์ˆ˜ ์žˆ์–ด์•ผ ํ•˜๊ณ , ๋ฐ์ดํ„ฐ์˜ ์ฝ˜ํ…์ŠคํŠธ ๋ฒ”์œ„๊ฐ€ ๋„“๋”๋ผ๋„ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ์–ด์•ผ ํ•œ๋‹ค. ์ด๋Ÿฐ ์ ๋“ค์„ ๊ณ ๋ คํ•˜์—ฌ ๋งŒ๋“  ์ธ๊ณต ์‹ ๊ฒฝ๋ง์ด ๋ฐ”๋กœ `์ˆœํ™˜ ์‹ ๊ฒฝ๋ง(RNN: Recurrent Neural Network)`์ด๋‹ค. ์ˆœ๋ฐฉํ–ฅ ์‹ ๊ฒฝ๋ง์ด๋‚˜ ์ปจ๋ฒŒ๋ฃจ์…˜ ์‹ ๊ฒฝ๋ง๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ ์ˆœํ™˜ ..

๐Ÿง  Deep Learning/RNN

[RNN] RNN, LSTM and GRU

์ˆœํ™˜ ์‹ ๊ฒฝ๋ง ํ•™์Šต ํ•™์Šต์ด๋ž€ ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•ด parameters๋ฅผ ์ถ”์ •ํ•œ๋‹ค๋Š” ๊ฒƒ์„ ์˜๋ฏธํ•œ๋‹ค. ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง์€ t์‹œ์ ๊นŒ์ง€์˜ ๊ณผ๊ฑฐ ์ •๋ณด๋ฅผ ํ™œ์šฉํ•˜์—ฌ y๋ฅผ ์˜ˆ์ธกํ•˜๊ฒŒ ๋˜๋Š”๋ฐ, ํ•™์Šต ๋Œ€์ƒ์ด ๋˜๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ๋Š” 3๊ฐ€์ง€์ด๋‹ค. ์ฒซ ๋ฒˆ์งธ๋Š” t ์‹œ์ ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ˜์˜ํ•œ W_xh ๊ฐ€์ค‘์น˜, ๋‘ ๋ฒˆ์งธ๋Š” t ์‹œ์  ์ด์ „์˜ ์ •๋ณด๋ฅผ ๋ฐ˜์˜ํ•˜๋Š” W_hh ๊ฐ€์ค‘์น˜, ๊ทธ๋ฆฌ๊ณ  t ์‹œ์ ์˜ y๋ฅผ ์˜ˆ์ธกํ•  ๋•Œ ํ™œ์šฉํ•˜๋Š” W_hy ๊ฐ€์ค‘์น˜์ด๋‹ค. ํ•ด๋‹น ํŒŒ๋ผ๋ฏธํ„ฐ๋Š” ๋งค ์‹œ์ ๋งˆ๋‹ค ๊ณต์œ ํ•˜๋Š” ๊ตฌ์กฐ์ด๋ฉฐ(parameter sharing), ๋งค ์‹œ์  ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ตฌ์„ฑํ•˜๋Š” ๊ฐ’์ด ๊ฐ™๋‹ค. ๋˜ํ•œ ์ตœ์ ์˜ W๋Š” W๋ฅผ ๋งค ์‹œ์  ์ ์šฉํ–ˆ์„ ๋•Œ Loss๊ฐ€ ์ตœ์†Œ๊ฐ€ ๋˜๋Š” W์ด๋‹ค. hidden state์™€ ์˜ˆ์ธก๊ฐ’์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ณ„์‚ฐ๋œ๋‹ค. ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง์ด 3๊ฐ€์ง€์˜ ๊ฐ€์ค‘์น˜๋ฅผ ์ถ”๋ก ํ•˜๋Š” ํ•™์Šต๊ณผ์ •์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. Los..

๐Ÿง  Deep Learning/RNN

[RNN] Recurrent Neural Networks and Attention(Introduction)

์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ ์˜ˆ์ธก ๋ถ„์„ ๋ฐฉ๋ฒ•๋ก  ํŠธ๋ Œ๋“œ `์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ(Time Series Data)`๋ž€, ์‹œ๊ฐ„์˜ ํ๋ฆ„์— ๋”ฐ๋ผ ์ˆœ์„œ๋Œ€๋กœ ๊ด€์ธก๋˜์–ด ์‹œ๊ฐ„์˜ ์˜ํ–ฅ์„ ๋ฐ›๊ฒŒ ๋˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ๋งํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์—๋Š” `์‹œ๊ณ„์—ด ๋‹จ๋ณ€๋Ÿ‰ ๋ฐ์ดํ„ฐ(Univariate time series data)`, `์‹œ๊ณ„์—ด ๋‹ค๋ณ€๋Ÿ‰ ๋ฐ์ดํ„ฐ(Multivariate time series data)`, `์‹œ๊ณ„์—ด ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ(Time series image data)` ๋“ฑ์ด ์žˆ๋‹ค. ์ „ํ†ต ํ†ต๊ณ„ ๊ธฐ๋ฐ˜ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ ๋ถ„์„ ๋ฐฉ๋ฒ•๋ก  ์ด๋™ํ‰๊ท ๋ฒ•(Moving average) ์ง€์ˆ˜ํ‰ํ™œ๋ฒ•(Exponential smoothing) ARIMA(Autoregressive integrated moving average) ๋ชจ๋ธ SARIMA(Seasonal ARIMA) ๋ชจ๋ธ ..

Junyeong Son
'๐Ÿง  Deep Learning/RNN' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๊ธ€ ๋ชฉ๋ก