01 Introduction - PAI-yoonsung/lstm-paper GitHub Wiki

1 Introduction

This article is a tutorial-like introduction, initially developed as supplementary material for lectures focused on Artificial Intelligence.

Interested readers can deepen their knowledge of Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) by tracing their evolution since the early nineties.

Today's publications on LSTM-RNN use slightly different notation and a much more condensed presentation of the derivations.

Nevertheless, the authors found the presented approach very helpful, and we are confident this publication will find its audience.

Machine learning is concerned with the development of algorithms that automatically improve with practice. Ideally, the more the learning algorithm is run, the better it becomes.

It is the task of the learning algorithm to create a classifier function from the presented training data.

The performance of the built classifier is then measured by applying it to previously unseen data.

Artificial Neural Networks (ANN) are inspired by biological learning systems and loosely model their basic functions.

Biological learning systems are complex webs of interconnected neurons.

Neurons are simple units that accept a vector of real-valued inputs and produce a single real-valued output.

The most common standard neural network type is the feed-forward neural network.

Here, sets of neurons are organised in layers: one input layer, one output layer, and at least one intermediate hidden layer.

Feed-forward neural networks are limited to static classification tasks.

Therefore, they are limited to providing a static mapping between input and output.

To model time prediction tasks, we need a so-called dynamic classifier.

We can extend feed-forward neural networks towards dynamic classification.

To gain this property, we need to feed signals from previous timesteps back into the network.

These networks with recurrent connections are called Recurrent Neural Networks (RNN) [74], [75].

RNNs are limited to looking back in time for approximately ten timesteps [38], [56].

This is because the fed-back signal either vanishes or explodes.

This issue was addressed with Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) [22], [41], [23], [60].

LSTM networks are, to a certain extent, biologically plausible [58] and capable of learning more than 1,000 timesteps, depending on the complexity of the built network [41].

In the early, ground-breaking papers by Hochreiter [41] and Graves [34], the authors used different notations, which made further development error-prone and inconvenient to follow.

To address this, we developed a unified notation and drew descriptive figures to support the interested reader in understanding the related equations of the early publications.

In the following, we slowly dive into the world of neural networks, and specifically LSTM-RNNs, with a selection of their most promising extensions documented so far.

We successively explain how neural networks evolved from a single perceptron to something as powerful as LSTM.

This includes vanilla LSTM which, although no longer used in practice, is covered as the fundamental evolutionary step.

With this article, we support beginners in the machine learning community in understanding how LSTM works, with the intention of motivating its further development.

This is the first document that covers LSTM and its extensions in such great detail.
