What is Machine learning - yarak001/machine_learning_common GitHub Wiki

What is Macine Learning

  • Definitions
    • Mitchellโ€™s Machine Learning
      • โ€œ๊ธฐ๊ณ„ํ•™์Šต ๋ถ„์•ผ๋Š” ๊ฒฝํ—˜์— ์˜ํ•ด computer program์ด ์ž๋™์ ์œผ๋กœ ์„ฑ๋Šฅ๊ฐœ์„ ๋˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ์งˆ๋ฌธ๊ณผ ์—ฐ๊ด€๋˜์–ด ์žˆ๋‹คโ€ -Mitchellโ€™s Machine Learning-
      • โ€œ์–ด๋–ค task T์— ๋Œ€ํ•ด P์— ์˜ํ•ด ์ธก์ •๋˜๊ณ  ๊ฒฝํ—˜ E๋กœ ํ–ฅ์ƒ๋˜๋ฉด Computer program์€ ์–ด๋–ค task์™€ ์–ด๋–ค ์„ฑ๋Šฅ์ธก์ • P์— ๊ด€ํ•˜์—ฌ ๊ฒฝํ—˜ E๋กœ๋ถ€ํ„ฐ ํ•™์Šตํ•œ๋‹ค๊ณ  ํ•œ๋‹ค.โ€œ -Mitchellโ€™s Machine Learning-
        • E: ์ˆ˜์ง‘๋˜์–ด์•ผ ํ•  data
        • T: S/W์— ํ•„์š”ํ•œ ๊ฒฐ์ •์‚ฌํ•ญ
        • P: ๊ฒฐ๊ณผ๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๋ฐฉ๋ฒ•
    • Elements of Statistical Learning
      • โ€œ๋งŽ์€ ๋ถ„์•ผ์—์„œ ๊ฑฐ๋Œ€ํ•œ data๊ฐ€ ๋งŒ๋“ค์–ด์ง€๊ณ  ์žˆ๊ณ , ์ด data๋ฅผ ์ดํ•ดํ•˜๋Š” ๊ฒƒ์ด ํ†ต๊ณ„ํ•™์ž์˜ ์ง์—…์ด๋‹ค. ์ค‘์š”ํ•œ pattern๊ณผ trend๋ฅผ ์ถ”์ถœํ•˜๊ณ  โ€œdata๊ฐ€ ์ด์•ผ๊ธฐํ•˜๋Š” ๊ฒƒโ€์„ ์ดํ•ดํ•˜๋Š” ๊ฒƒ์„ ์šฐ๋ฆฌ๋Š” data๋กœ๋ถ€ํ„ฐ ํ•™์Šต์ด๋ผ ๋ถ€๋ฅธ๋‹คโ€
    • Pattern Recognition
      • โ€œPattern์ธ์‹์€ ๊ณตํ•™์— ๊ทผ๋ณธ์„ ๋‘” ๋ฐ˜๋ฉด, machine learning์€ computer๊ณผํ•™์œผ๋กœ๋ถ€ํ„ฐ ์ž๋ผ๋‚ฌ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜, ๊ทธ ํ™œ๋™๋“ค์€ ๊ฐ™์€ ๋ถ„์•ผ์˜ ๋‘๊ฐ€์ง€ ์–‘์ƒ์œผ๋กœ ๋ณด์—ฌ์งˆ์ˆ˜ ์žˆ๋‹คโ€ฆโ€
        • ๊ณตํ•™์  ๊ด€์ ์—์„œ ์‹œ์ž‘ํ•˜์—ฌ Computer๊ณผํ•™์„ ๋ฐฐ์šฐ๊ณ  ํ™œ์šฉํ•จ => ์šฐ๋ฆฌ๊ฐ€ ๋ณธ๋ฐ›์•„์•ผํ•  ์„ฑ์ˆ™ํ•œ ์ ‘๊ทผ๋ฐฉ์‹
        • ๋ฐฉ๋ฒ•์— ๋Œ€ํ•ด ์ฃผ์žฅํ•˜๋Š” ๋ถ„์•ผ์™€ ์ƒ๊ด€์—†์ด, โ€œdata๋กœ๋ถ€ํ„ฐ ํ•™์Šตโ€์— ์˜ํ•œ ๊ฒฐ๊ณผ๋‚˜ ํ†ต์ฐฐ๋ ฅ์— ์ข€๋” ๊ทผ์ ‘ํ•จ์œผ๋กœ์„œ ์šฐ๋ฆฌ์˜ ์š”๊ตฌ์‚ฌํ•ญ์„ ๋งŒ์กฑ์‹œํ‚ค๋ฉด, ์šฐ๋ฆฌ๋Š” ๊ทธ๊ฒƒ์„ ๊ธฐ๊ณ„ํ•™์Šต์ด๋ผ ๋ช…๋ช…ํ•  ์ˆ˜ ์žˆ๋‹ค.
    • An Alogorithmic Perspective
      • โ€œMachine learning์˜ ๊ฐ€์žฅ ํฅ๋ฏธ๋กœ์šด ์š”์†Œ๋Š” ์ฃผ๋กœ computer๊ณผํ•™, ํ†ต๊ณ„ํ•™, ์ˆ˜ํ•™, ๊ณตํ•™์˜ ์—ฌ๋Ÿฌ ๋‹ค๋ฅธ ํ•™๋ฌธ ๋ถ„์•ผ์— ๊ฑธ์ณ ์žˆ๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. ...machine learning์€ computer๊ณผํ•™์—์„œ ํ™•๊ณ ํ•œ ์ธ๊ณต์ง€๋Šฅ์˜ ํ•œ ๋ถ„์•ผ๋กœ ์—ฐ๊ตฌ๋œ๋‹ค. ...์ด๋Ÿฌํ•œ ์•Œ๊ณ ๋ฆฌ๋“ฌ์ด ์ž‘๋™ํ•˜๋Š” ์ด์œ ๋ฅผ ์ดํ•ดํ•˜๋ ค๋ฉด ์ปดํ“จํ„ฐ ๊ณผํ•™ ํ•™๋ถ€์ƒ๋“ค๋กœ๋ถ€ํ„ฐ ์ข…์ข… ๋ˆ„๋ฝ๋˜๋Š” ์ผ์ •๋Ÿ‰์˜ ํ†ต๊ณ„ ๋ฐ ์ˆ˜ํ•™์  ์†Œํ”ผ์ผ€์ด์…˜(sophistics)์ด ํ•„์š”ํ•˜๋‹ค => ์—ฌ๋Ÿฌ ๋ฐฉ๋ฉด์˜ ํ•™๋ฌธ๊ณผ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ์Œ, ํ•œ ๋ถ„์•ผ์— ๋„ˆ๋ฌด ์ง์ฐนํ•˜๋ฉด ์•ˆ๋จ
  • Definitions Machine Learning
    • โ€œMachine Learning์€ ์„ฑ๋Šฅ์ธก์ฒญ์— ๋Œ€ํ•œ ๊ฒฐ์ •์„ ์ผ๋ฐ˜ํ™”ํ•˜๋Š” data๋กœ๋ถ€ํ„ฐ model ํ›ˆ๋ จ์ด๋‹ค
      • Model์„ ํ›ˆ๋ จ => ํ›ˆ๋ จ ์˜ˆ์‹œ ์ œ์‹œ
      • Model => ๊ฒฝํ—˜์„ ํ†ตํ•ด ํš๋“ํ•œ ์ƒํƒœ ์ œ์•ˆ
      • ๊ฒฐ์ •์„ ์ผ๋ฐ˜ํ™” => ๊ฒฐ์ •์ด ํ•„์š”ํ•  ๋ฏธ๋ž˜์— ๋ณด์ด์ง€ ์•Š๋Š” ์ž…๋ ฅ๊ณผ ์˜ˆ์ƒํ•˜์ง€ ๋ชปํ•œ ์ž…๋ ฅ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์˜์‚ฌ ๊ฒฐ์ •์„ ๋‚ด๋ฆด ์ˆ˜ ์žˆ๋Š” ๋Šฅ๋ ฅ์„ ์‹œ์‚ฌํ•œ๋‹ค.
      • ์„ฑ๋Šฅ์ธก์ •์— ๋Œ€ํ•œ => ๋ชฉํ‘œ ์š”๊ตฌ์‚ฌํ•ญ๊ณผ model์ด ์ค€๋น„ํ•ด์•ผํ•  ์„ฑ๋Šฅ์˜ ๋ฐฉํ–ฅ์„ฑ ์ œ์‹œ

What Are the Key Concepts in Machine Learning?

  • Data

    ์ดํ•ด๋ฅผ ์œ„ํ•ด ๋‹จ์ˆœํžˆ row์™€ column์„ ์ง€๋‹Œ DB table์ด๋‚˜ Excel spreedsheet๋กœ ์ƒ๊ฐํ•  ์ˆ˜ ์žˆ์Œ(๋น„์ •ํ˜• data๋Š” ์—ฌ๊ธฐ์„œ ์ œ์™ธํ•จ)

    • Instance
      • data์˜ ๋‹จ์ผ ํ–‰(row)
      • Domain์œผ๋ฅด๋ถ€ํ„ฐ์˜ ๊ด€์ธก์น˜
    • Feature
      • data์˜ ๋‹จ์ผ column(feature)
      • ๊ด€์ธก์น˜์˜ ๊ตฌ์„ฑ์ด๋ฉฐ, data instance์˜ ์†์„ฑ(attribute)
      • ํŠน์ • feature๋Š” model์˜ ์ž…๋ ฅ์œผ๋กœ, ๋‹ค๋ฅธ featuer๋Š” model์˜ ์ถœ๋ ฅ ๋˜๋Š” ์˜ˆ์ธก๊ฐ’์œผ๋กœ ์‚ฌ์šฉ
    • Data Type
      • feature๋Š” data type์„ ๊ฐ€์ง
      • ์‹ค์ œ ๋˜๋Š” ์ •์ˆ˜๊ฐ’(real value), ๋ฒ”์ฃผํ˜•๊ฐ’(categorical value), ์ˆœ์„œ๊ฐ’(ordinary value)
      • ๋ฌธ์ž์—ด, ๋‚ ์งœ, ์‹œ๊ฐ„, ๋ณต์žกํ•œ type์„ ๊ฐ€์งˆ ์ˆ˜ ์žˆ์Œ
      • ์ „ํ†ต์ ์ธ machine learning ๋ฐฉ๋ฒ•์œผ๋กœ ์ž‘์—…์‹œ ์‹ค์ œ๊ฐ’์ด๋‚˜ ๋ฒ”์ฃผํ˜• ๊ฐ’ ์‚ฌ์šฉ
    • Datasets
      • Instance์˜ ์ง‘ํ•ฉ
      • Machine Learning ์ž‘์—…์‹œ ๋ชฉ์ ์— ๋”ฐ๋ฅธ ๋ช‡๊ฐœ์˜ dataset ํ•„์š”
    • Training Dataset
      • model์„ ํ›ˆ๋ จ์‹œํ‚ค๊ธฐ ์œ„ํ•ด์„œ machine learning algorithm์— ์ฃผ์–ด์ง€๋Š” dataset
    • Testing Dataset
      • model์˜ ์ •ํ™•๋„ ๊ฒ€์ฆ์„ ์œ„ํ•ด ์‚ฌ์šฉ๋˜๋Š” dataset(ํ›ˆ๋ จ์—๋Š” ์‚ฌ์šฉ๋˜์ง€ ์•Š์Œ),
      • ๊ฒ€์ฆ(Validation) dataset๋ผ ํ•  ์ˆ˜ ์žˆ์Œ
  • Learning

    Algorithm์„ ์ด์šฉํ•œ ์ž๋™ํ™”๋œ ํ•™์Šต์˜ ์‹ค์ œ

    • Induction
      • Machine Learning์€ ์œ ๋„(induction) ๋˜๋Š” ์œ ๋„ํ•™์Šต(inductive learning)์ด๋ผ ๋ถˆ๋ฆฌ๋Š” ๊ณผ์ •์„ ํ†ตํ•ด ํ•™์Šต
      • ์œ ๋„๋ž€ ํŠน์ • ์ •๋ณด(dataset)์œผ๋กœ๋ถ€ํ„ฐ ์ผ๋ฐ˜ํ™”(generalization)(model)์„ ๋งŒ๋“œ๋Š” ์ถ”๋ก  process
    • Generalization
      • ํ›ˆ๋ จ ์ค‘์— ๋ณด์ด์ง€ ์•Š๋˜ ํŠน์ • data instance๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์˜ˆ์ธก, ๊ฒฐ์ •์„ ํ•ด์•ผํ•˜๋ฏ€๋กœ ์ผ๋ฐ˜ํ™”๊ฐ€ ํ•„์š”
    • Over-Learning
      • ํ›ˆ๋ จ data์— ๋„ˆ๋ฌด ๊ฐ€๊น๊ฒŒ(closely) ํ•™์Šต๋˜์–ด ์ผ๋ฐ˜ํ™”ํ•˜์ง€ ์•Š์•˜์„ ๊ฒฝ์šฐ
      • ํ›ˆ๋ จ data์™ธ data์— ํ˜•ํŽธ์—†๋Š” ์„ฑ๋Šฅ์„ ๊ฐ€์ง
      • over fitting์ด๋ผ๊ณ ๋„ ํ•จ
    • Under-Learning
      • model์˜ ํ•™์Šต๊ณผ์ •์ด ๋„ˆ๋ฌด ์ผ์ฐ ์ข…๋ฃŒ๋˜์–ด database๋กœ๋ถ€ํ„ฐ ์ถฉ๋ถ„ํžˆ ํ•™์Šตํ•˜์ง€ ๋ชปํ•œ ๊ฒฝ์šฐ
      • ์ผ๋ฐ˜ํ™”๋Š” ์ข‹์ง€๋งŒ ๋ชจ๋“  data์— ๋‚ฎ์€ ์„ฑ๋Šฅ์„ ๊ฐ€์ง
      • under fitting์ด๋ผ๊ณ ๋„ ํ•จ
    • Online Learning
      • ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•  ๋•Œ domain์—์„œ data instance๋ฅผ updateํ•œ ๋ฐฉ๋ฒ•
      • noise๊ฐ€ ๋งŽ์€ data์— ๊ฐ•๋ ฅํ•œ ๋ฐฉ๋ฒ•์ด ํ•„์š”ํ•˜์ง€๋งŒ domain์˜ ํ˜„์žฌ์ƒํƒœ์™€ ๋” ์ž˜ ๋งž๋Š” model์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Œ
    • Offline Learning
      • ์ค€๋น„๋œ data์— ๋Œ€ํ•ด ์ƒ์„ฑ๋œ ํ›„ ๊ด€์ฐฐ๋˜์ง€ ์•Š์€ data์— ๋Œ€ํ•ด ์šด์˜์ ์œผ๋กœ ์‚ฌ์šฉ๋˜๋Š” ๋ฐฉ๋ฒ•
      • ํ›ˆ๋ จdata์˜ ๋ฒ”์œ„๊ฐ€ ์•Œ๋ ค์ ธ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ํ›ˆ๋ จ ๊ณผ์ •์„ ์ฃผ์˜๊นŠ๊ฒŒ ์ œ์–ดํ•˜๊ณ  ์กฐ์ •ํ•  ์ˆ˜ ์žˆ์Œ
      • Model์„ ์ค€๋น„ํ•œ ํ›„์—๋Š” update๋˜์ง€ ์•Š์œผ๋ฉฐ, domain์ด ๋ณ€๊ฒฝ๋  ๊ฒฝ์šฐ ์„ฑ๋Šฅ์ด ์ €ํ•˜๋  ์ˆ˜ ์žˆ์Œ
    • Supervised Learning
      • ์—์ธก์ด ํ•„์š”ํ•œ ๋ฌธ์ œ์— ์ผ๋ฐ˜์ ์ธ ํ•™์Šต process
      • โ€œ๊ต์œก ๊ณผ์ •(teaching process)โ€๋Š” model์˜ ์˜ˆ์ธก๊ณผ ์ •๋‹ต์„ ๋น„๊ตํ•˜์—ฌ model๋‚ด์—์„œ ์ˆ˜์ •
    • Unsupervised Learning
      • ์˜ˆ์ธก์ด ํ•„์š”ํ•˜์ง€ ์•Š์€ data์˜ ๊ตฌ์กฐ๋ฅผ ์ผ๋ฐ˜ํ™”ํ•˜๊ธฐ ์œ„ํ•œ ํ•™์Šต process
      • ์ž์—ฐ์ ์ด ๊ตฌ์กฐ๊ฐ€ ํ™•์ธ๋˜๊ณ  ์„œ๋กœ ์—ฐ๊ด€๋œ instance๋“ค๋กœ exploited๋จ
  • Modelling

    ๊ธฐ๋ณธ์ ์œผ๋กœ program์ธ Machine Learning๊ณผ์ •์— ์˜ํ•ด ๋งŒ๋“ค์–ด์ง„ ์ธ๊ณต๋ฌผ, ์šฐ๋ฆฌ๊ฐ€ ํ’€๋ ค๊ณ ํ•˜๋Š” ๋ฌธ์ œ์˜ model์ด๋ฏ€๋กœ model์ด๋ผ๊ณ  ๋ถ€๋ฆ„

    • Model Selection
      • ์„ค์ •(configure)์™€ model ํ›ˆ๋ จ๊ณผ์ • => model selection process
      • ๋งค๊ณผ์ •์—์„œ ์‚ฌ์šฉ ๋˜๋Š” ์ˆ˜์ •ํ•  model์„ ๊ฐ€์ง€๊ณ  ์žˆ์Œ
      • machine learning algorithm ์„ ํƒ์€ Model selection์˜ ํ•œ ๊ณผ์ •
      • ๋ฌธ์ œ์— ๋Œ€ํ•ด ์กด์žฌํ•˜๋Š” ๋ชจ๋“  ๊ฐ€๋Šฅํ•œ model์ค‘, ์„ ํƒ๋œ ํ›ˆ๋ จ dataset์— ์ฃผ์–ด์ง„ algorithm๊ณผ algorithm ์„ค์ •์€ ์ตœ์ข…์ ์œผ๋กœ ์„ ํƒ๋œ model์„ ์ œ๊ณตํ•  ๊ฒƒ์ž„
    • Inductive Bias
      • ์„ ํƒ๋œ model์— ๋‚ด์ œ๋œ ์ œํ•œ๊ฐ’
      • ๋ชจ๋“  model์€ ํŽธํ–ฅ๋˜์–ด ์žˆ์–ด model ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜๊ณ  ์ •์˜์ƒ ๋ชจ๋“  model์—๋Š” ์˜ค๋ฅ˜ ์กด์žฌ(๊ด€์ธก์—ฃ ์ผ๋ฐ˜ํ™”ํ•œ model์ž„)
      • ํŽธํ–ฅ์€ model์˜ ๊ตฌ์„ฑ๊ณผ ๋ชจ๋ธ์„ ์ƒ์„ฑํ•˜๋Š” algorithm ์„ ํƒ์„ ํฌํ•จํ•˜์—ฌ model์—์„œ ๋งŒ๋“ค์–ด์ง„ ์ผ๋ฐ˜ํ™”์— ์˜ํ•ด ๋„์ž…
      • Machine learning ๋ฐฉ๋ฒ•์€ ๋‚ฎ์€ or ๋†’์€ bias๋ฅผ ๊ฐ€์ง„ mode์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๊ณ , ๋งค์šฐ ํŽธํ–ฅ๋œ model์˜ bias๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ๋Š” ์ „๋žต์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.
    • Model Variance
      • ํ›ˆ๋ จํ•œ dataset์— model์ด ์–ผ๋งˆ๋งŒํผ ๋ฏผ๊ฐํ•œ๊ฐ€
      • variance๋ฅผ ์ค„์ด๋Š” ์ „์ˆ ์€ ๋‹ค๋ฅธ ์ดˆ๊ธฐ์กฐ๊ฑด์œผ๋กœ dataset์„ ์—ฌ๋Ÿฌ๋ฒˆ ์‹คํ–‰ํ•˜๊ณ  model ์„ฑ๋Šฅ์˜ ํ‰๊ท ๊ฐ’์„ ์ทจํ•จ
    • Bias-Variance Tradeoff
      • model selection์€ bias์™€ variance์˜ trade-off๋กœ ์ƒ๊ฐํ•  ์ˆ˜ ์žˆ์Œ
      • ๋‚ฎ์€ bias๋Š” ๋†’์€ variance๋ฅผ ๊ฐ€์งˆ๊ฒƒ์ด๋ฉฐ, ์‚ฌ์šฉ๊ฐ€๋Šฅํ•œ model์„ ์–ป๊ธฐ์œ„ํ•ด ๋” ๋งŽ์€ ํ›ˆ๋ จ์ด ํ•„์š”
      • ๋†’์€ bias๋Š” ๋‚ฎ์€ variance๋ฅผ ๊ฐ€์งˆ ๊ฒƒ์ด๋ฉฐ, ๋น ๋ฅธ ํ›ˆ๋ จ์„ ๋  ๊ฒƒ์ด์ง€๋งŒ ๋‚ฎ๊ณ , ์ œํ•œ๋œ ์„ฑ๋Šฅ์„ ๊ฒฝํ—˜

What Problems Can Machine Learning Address?

Machine Learning์„ ์ดํ•ดํ•˜๋Š” ํ•œ ๋ถ€๋ถ„์œผ๋กœ, ์–ด๋–ค type์˜ ๋ฌธ์ œ๋ฅผ machine learning์ด ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š”์ง€ ์ดํ•ดํ•˜๋Š” ๊ฒƒ์ด ์œ ์šฉํ•จ

์ง๋ฉดํ•œ ๋ฌธ์ œ type์„ ์•Ž์œผ๋กœ์„œ ํ•„์š”ํ•œ data์™€ algorithm์— ๋Œ€ํ•ด ์ƒ๊ฐํ•˜๊ฒŒ ๋˜๋ฏ€๋กœ ๊ฐ€์น˜์žˆ์Œ

  • Example Problems
    • Spam ํƒ์ง€
    • ์‹ ์šฉcard ์‚ฌ๊ธฐ ํƒ์ง€
    • ์ˆซ์ž ์ธ์‹
    • ์Œ์„ฑ ์ดํ•ด
    • ์–ผ๊ตด ์ธ์‹
    • ์ œํ’ˆ ๊ถŒ์žฅ
    • ์˜ํ•™ ์ง„๋‹จ
    • ์ฃผ์‹ ๊ฑฐ๋ž˜
    • ๊ณ ๊ฐ์„ธ๋ถ„ํ™”
    • ํ˜•์ƒ ํƒ์ง€
  • Types of Problems
    • Classification
    • Regression
    • Clustering
    • Rule Extraction

What Algorithm Does Machine Learning Provide?

Machine learning์„ ๊ณต๋ถ€ํ•˜๋Š”๊ฒƒ์€ machine learning algorithm์„ ๊ณต๋ถ€ํ•˜๋Š” ๊ฒƒ

์—ฌ๋Ÿฌ ๋ฐฉ๋ฒ•์ข…๋ฅ˜์™€ ํ™•์žฅ์ด ์žˆ๊ณ , ์–ด๋–ค ๊ฒƒ์ด ์—ญํ•™์  Algorithm์„ ๊ตฌ์„ฑํ•˜๋Š”์ง€ ๋น ๋ฅด๊ฒŒ ํŒ๋‹จํ•˜๊ธฐ๊ฐ€ ์–ด๋ ค์›€

  • Algorithm Similarity
    • machine learning algorithm์„ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ๋งŽ์Œ
    • Algorithm์œ ์‚ฌ์„ฑ์ด ๊ฐ€์žฅ ๊ฐ„๋‹จํ•œ ๋ฐฉ๋ฒ•
    • ๊ธฐ๋Šฅ์ด๋‚˜ ํ˜•ํƒœ์˜ ์œ ์‚ฌ์„ฑ์— ์˜ํ•ด groupํ™” ex)tree based methods, neural network inspired methods
    • tree based method์ด๋ฉด์„œ neural netword inspired method ์ธ Learning Vector Quantization๊ณผ ๊ฐ™์€ algorithm์ด ์กด์žฌํ•จ
    • Regression๊ณผ Clustering์ฒ˜๋Ÿผ ๋ฌธ์ œ์ข…๋ฅ˜์™€ algorithm์ข…๋ฅ˜ ๋™์ผํ•œ ์ด๋ฆ„์ธ category๊ฐ€ ์กด์žฌํ•จ
    • Machine Learning Algorithm ์ž์ฒด๋กœ ์™„๋ฒฝํ•˜๊ณ , ์™„์ „ ์ถฉ๋ถ„ํ•œ model์€ ์—†์Œ
  • Regression
    • model์— ์˜ํ•œ ์˜ˆ์ธก๊ฐ’์˜ error์ธก์ •์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ฐ˜๋ณต์ ์œผ๋กœ ์ •์ œ๋œ ๋ณ€์ˆ˜๋“ค๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ modelling
    • ํ†ต๊ณ„์˜ ํ•ต์‹ฌ์ด๋ฉฐ ํ†ต๊ณ„์  machine learning์œผ๋กœ ํ†ตํ•ฉ๋จ
    • ๋ฌธ์ œ์ข…๋ฅ˜์™€ algorithm์ข…๋ฅ˜๋กœ ์–ธ๊ธ‰ํ•  ์ˆ˜ ์žˆ์–ด ํ˜ผ๋™ํ•  ์ˆ˜ ์žˆ์Œ
    • ์‹ค์ œ regression์€ ์‹คํ–‰๊ณผ์ •(process)์ž„
    • Example
      • Ordinary Least Squares Linear Regression
      • Logistic Regression
      • Stepwise Regression
      • Multivariative Adaptive Regression Splines(MARS)
      • Locally Estimated Scatterplot Smoothing(LOESS)
  • Instance-based Methods
    • model์— ๊นŠ๊ฒŒ ์ค‘์š”ํ•˜๊ฑฐ๋‚˜ ํ•„์š”ํ•œ ํ›ˆ๋ จ data์˜ ์˜ˆ์ œ ๋˜๋Š” instance๊ธฐ๋ฐ˜์œผ๋กœ ๊ฒฐ์ • ๋ฌธ์ œ๋ฅผ model
    • ์˜ˆ์ œdata database๋ฅผ ๋งŒ๋“ค๊ณ  ์ตœ์ match์™€ ์—์ธก์„ ์œ„ํ•ด ์‹ ๊ทœ data๋ฅผ ์œ ์‚ฌ์ธก์ •์„ ์‚ฌ์šฉํ•˜์—ฌ database์™€ ๋น„๊ต
    • ์ด๋Ÿฐ ์ด์œ ๋กœ winner-take all method, case-based and memory based ํ•™์Šต์ด๋ผ ํ•จ
    • Example
      • k-Nearest Neighbour(kNN)
      • Learning Vector Quantization(LVQ)
      • Self-Organizing Map(SOM)
  • Regularization Methods
    • ์ผ๋ฐ˜ํ™”์— ์ข‹์€ ์ข€๋” ๊ฐ„๋‹จํ•œ model ์„ ํ˜ธํ•˜๋ฉด์„œ, model์˜ ๋ณต์žก์„ฑ์„ ๊ธฐ๋ฐ˜์œผ๋กœํ•˜์—ฌ model์— penalty๋ฅผ ๋ถ€์—ฌํ•˜์—ฌ ๋‹ค๋ฅธ model(์ฃผ๋กœ regression)์„ ํ™•์žฅ
    • Example
      • Ridge Regression
      • Least Absolute Shrinkage and Selection Operator(LASSO)
      • Elastic Net
  • Decision Tree Learning
    • Data์˜ ์‹ค์ œ ์†์„ฑ๊ฐ’์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋งŒ๋“ค์–ด์ง„ ๊ฒฐ์ • model์„ ๊ตฌ์„ฑํ•˜๋Š” ๋ฐฉ๋ฒ•
    • ํ•ด๋‹น record์— ๋Œ€ํ•œ ๊ฒฐ์ •์ด ๋งŒ๋“ค์–ด์งˆ๋•Œ๊นŒ์ง€ tree ๊ตฌ์กฐ๋ฅผ fork
    • classification์ด๋‚˜ regression ๋ฌธ์ œ์˜ data๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ›ˆ๋ จ
    • Examples
      • Classification and Regression Tree(CART)
      • Iterative Dichotomiser3(ID3)
      • C4.5
      • Chi-squared Automatic Interaction Detection(CHAID)
      • Decision Stump
      • Random Forest
      • Muitivariative Adaptive Regression Splines(MARS)
      • Gradient Boosting Machines(BGM)
  • Bayesian
    • Classification๊ณผ regression ๋ฌธ์ œ์— ๋ช…์‹œ์ ์œผ๋กœ ๋ฒ ์ด์ฆˆ ์ด๋ก ์„ ์ ์šฉํ•œ ๋ฐฉ๋ฒ•
    • Examples
      • Naive Bayes
      • Averaged One-Dependence Estimators(AODE)
      • Bayesian Belief Network(BBN)
  • Kernel Methods
    • ๋ฐฉ๋ฒ•๊ณผ ๋ฐฉ๋ฒ•์˜ ์ง‘ํ•ฉ์ฒด์ธ Support Vector Machine(SVM)์ด ๊ฐ€์žฅ ์ž˜ ์•Œ๋ ค์ง
    • ์‰ฝ๊ฒŒ modellingํ•  ์ˆ˜ ์žˆ๋Š” classification์ด๋‚˜ regression๋ฌธ์ œ์—์„œ input data๋ฅผ ๊ณ ์ฐจ์›์˜ vector๊ณต๊ฐ„์œผ๋กœ mapping
    • Examples
      • Support Vector Machine(SVM)
      • Radial Basis Function(RBF)
      • Linear Discriminate Analysis(LDA)
  • Clustering Methods
    • regression์ฒ˜๋Ÿผ ๋ฌธ์ œ์™€ ๋ฐฉ๋ฒ•์„ ์–˜๊ธฐํ•จ
    • centroid-based๋‚˜ hierarchical๊ณผ ๊ฐ™์€ modelling์  ์ ‘๊ทผ์œผ๋กœ ์กฐ์งํ™”๋จ
    • ์ตœ๋Œ€ ์œ ์‚ฌ๋„์˜ group์œผ๋กœ data๋ฅผ ์กฐ์งํ™”ํ•˜๊ธฐ ์œ„ํ•ด์„œ data์˜ ๊ณ ์œ ๊ตฌ์กฐ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•
    • Examples
      • k-Means
      • Expectation Maximisation(EM)
  • Association Rule Learning
    • data๋‚ด ์†์„ฑ์˜ ๊ด€์ฐฐ๋œ ์—ฐ๊ด€๊ด€๊ณ„๋ฅผ ์„ค๋ช…ํ•  ์ˆ˜ ์žˆ๋Š” ์ตœ์ ์˜ ๊ทœ์น™์„ ์ฐพ์•„๋‚ด๋Š” ๋ฐฉ๋ฒ•
    • ์ด๋Ÿฌํ•œ ๊ทœ์น™์€ ์กฐ์ง์ด ์ด์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋Œ€๊ทœ๋ชจ ์ฐจ์› ๋ฐ์ดํ„ฐ ์„ธํŠธ์—์„œ ์ค‘์š”ํ•˜๊ณ  ์ƒ์—…์ ์œผ๋กœ ์œ ์šฉํ•œ ์—ฐ๊ด€์„ฑ์„ ๋ฐœ๊ฒฌํ•  ์ˆ˜ ์žˆ๋‹ค.
    • Examples
      • Apriori algorithm
      • Eclat algorithm
  • Artificial Neural Networks
    • ์ƒ๋ฌผํ•™์  ์‹ ๊ฒธ๋ง์—์„œ ์˜๊ฐ์„ ์–ป์Œ
    • classification๊ณผ regression์— ์ผ๋ฐ˜์ ์œผ๋กœ ์‚ฌ์šฉ๋˜๋Š” pattern matching์ด์ž๋งŒ ๋ชจ๋“  ๋ฌธ์ œ์— ๋Œ€ํ•ด์„œ ์ˆ˜๋ฐฑ๊ฐ€์ง€์˜ algorithm๊ณผ ๋ณ€ํ˜•์œผ๋กœ ๊ตฌ์„ฑ๋œ ์–ด๋งˆ์–ด๋งˆํ•œ ์˜์—ญ
    • Examples(Deep Learning ๊ณผ ๋ถ„๋ฆฌ)
      • Perceptron
      • Back-Propagation
      • Hopfield Network
      • Self-Organizing Map(SOM)
      • Learning Vector Quantization(LVQ)
  • Deep Learning
    • ์ธ๊ณต ์‹ ๊ฒฝ๋ง(ANN)์˜ ์ตœ์‹  version
    • ์ ๋” ํฌ๊ณ , ๋ณต์žกํ•œ ์‹ ๊ฒฝ๋ง์„ ๊ตฌ์„ฑ, labelling๋œ ์ž‘์€ data๋ฅผ ๊ฐ€์ง„ ํฐ dataset์—์„œ์˜ ๋ฐ˜์ง€๋„ํ•™์Šต์™€ ๊ด€๋ จ
    • Examples
      • Restricted Boltzman Machine(RBM)
      • Deep Belief Networks(DBN)
      • Convolutional Network
      • Stacked Auto-encoders
  • Dimensionality Reduction
    • Clustering์ฒ˜๋Ÿผ data๋‚ด ๊ณ ์œ ํ•œ ๊ตฌ์กฐ๋ฅผ ์ฐพ๊ณ  ํƒ์ƒ‰
    • ๋น„์ง€๋„ ๋ฐฉ์‹์ด๋‚˜ ์ ์€ ์ •๋ณด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ data๋ฅผ ์š”์•ฝํ•˜๊ฑฐ๋‚˜ ์„ค๋ช…
    • ๊ณ ์ฐจ์› data๋ฅผ visualizeํ•˜๊ฑฐ๋‚˜ ์ง€๋„ํ•™์Šต์—์„œ ์‚ฌ์šฉํ•  data๋ฅผ ๋‹จ์ˆœํ™”ํ• ๋•Œ ์œ ์šฉ
    • Examples
      • Principal Component Analysis(PCA)
      • Partial Least Squares Regression(PLS)
      • Sammon Mapping
      • Multidimensional Scaling(MDS)
      • Project Pursuit
  • Ensemble Methods
    • ๊ฐ๊ฐ ๋…๋ฆฝ์ ์œผ๋กœ ํ›ˆ๋ จ๋œ ๋‹ค์ˆ˜์˜ model๋ฅผ ๊ฒฐํ•ฉํ•˜์—ฌ ์˜ˆ์ธก๊ฐ’๋“ค์€ ํŠน์ •ํ•œ ๋ฐฉ์‹์œผ๋Ÿฌ ์ „์ฒด ์˜ˆ์ธก๊ฐ’์„ ๋งŒ๋“ฆ
    • ์–ด๋–ค type์˜ model๋“ค์„ ๊ฒฐํ•ฉํ•˜๊ณ , ์–ด๋–ป๊ฒŒ ๊ฒฐํ•ฉํ• ๊ฒƒ์ธ์ง€์— ๋…ธ๋ ฅ์ด ํ•„์š”
    • ๊ฐ•๋ ฅํ•˜๊ณ  ์ธ๊ธฐ ์žˆ๋Š” ๊ธฐ์ˆ 
    • Examples
      • Bagging
      • Bootstrapped Aggregation(Boosting)
      • AdaBoost
      • Stacked Generalization(blending)
      • Gradient Boosting Machines(GBM)
      • Random Forest

What Other Fields Are Related to Machine Learning

  • Foundations

    ์ˆ˜ํ•™๊ณผ computer ๊ณผํ•™๋ถ„์•ผ ๊ธฐ๋ฐ˜

    ๋ฐฉ๋ฒ•์€ ์„ ํ˜•๊ณผ ํ–‰๋ ฌ ๋Œ€์ˆ˜๋กœ, ๋™์ž‘์€ ํ†ต๊ณ„์™€ ํ™•๋ฅ ๋กœ ์ดํ•ด

    • Probability
    • Statistics
    • Artificial Intelligence
  • Progenitors

    • Computational Intelligence
    • Data Mining
    • Data Science