Week17 Day3 - ai-esg/our-history GitHub Wiki

Team NLP Group 11 Week17 Day3

Date

  • 2021.11.24 (Wed)

Team members

  • 문석암_T2075
  • 박마루찬_T2078
  • 박아멘_T2090
  • 우원진_T2137
  • 윤영훈_T2142
  • 장동건_T2185
  • 홍현승_T2250

Weekly schedule

Peer session

Leaderboard profile

Many teams are using GIFs.

  1. Project name
  2. AI_ESG + icon
    • boostcourse icon
  3. 고군산도 (Gogunsan Islands)
    • One island each
  4. Tantrum meme images

Final choice

Tantrum memes

  • 문석암 -
  • 박마루찬 - 단비
  • 박아멘 -
  • 우원진 - 꼬부기 (Squirtle)
  • 윤영훈 -
  • 장동건 -
  • 홍현승 - 짱구 (Shin-chan)

Optimization

  • Everyone keeps experimenting on their own

Final project

Final project Wiki? Document management

  • Communicate things like what we discussed in meetings

    • The effort and thinking we put in
    • Techniques we learned
    • What we found out
  • Documentation for reproducing the work

    • Makes it easier for us to look up things we have forgotten later

Model

  • sparse Embedding (mrc repo)
    • The tester writes a few questions,
    • then retrieves the most similar ones from the contexts
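The manual check described above (write a few questions, pull the closest contexts) can be sketched as follows. This is a minimal standard-library TF-IDF retriever, not the code in the mrc repo. The names `tokenize`, `build_index`, and `retrieve` are hypothetical, and whitespace tokenization is an assumption that a Korean morphological analyzer would likely improve on.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokens; swap in a real tokenizer for Korean.
    return text.lower().split()


def vectorize(tokens, idf):
    # Sparse TF-IDF vector as a dict; tokens unseen at index time are ignored.
    counts = Counter(tok for tok in tokens if tok in idf)
    return {tok: tf * idf[tok] for tok, tf in counts.items()}


def build_index(contexts):
    tokenized = [tokenize(c) for c in contexts]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF so a term found in every context keeps a nonzero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
           for tok, df in doc_freq.items()}
    return idf, [vectorize(toks, idf) for toks in tokenized]


def cosine(a, b):
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, contexts, top_k=3):
    """Return the top_k (context, score) pairs, best match first."""
    idf, vectors = build_index(contexts)
    q_vec = vectorize(tokenize(question), idf)
    scored = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(contexts[i], score) for score, i in scored[:top_k]]
```

Calling `retrieve(question, contexts)` for each hand-written question and reading the returned contexts is enough for the kind of eyeball test described here.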

Preprocessing

  • Characters to keep
().,?!

~:;''""

  • Handling ㅋㅋ, ㅜㅜ, ㅠㅠ, ㅎㅎ: limit repeated runs to 2 characters

  • Promotional sentences and ads

Attraction 간데메공원: 40 in total -> 20 or more are promotional or real estate

The top 10 to 20 or so are of very good quality

Start with the top 20

10,000 / 20 / 200,000 / 200 million characters

Remove results when there are fewer than 50?

  • Filter on specific keywords in the title

    • 빌라 (villa), 아파트 (apartment), 부동산 (real estate), 정치 (politics)
  • When a blog post covers several places

    • Split into paragraphs, then judge relevance
      • Specific words
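The preprocessing rules above could be sketched roughly like this. It assumes that characters outside the keep list are replaced with a space, that paragraphs are separated by blank lines, and that "relevance" means the attraction name appears in the paragraph. The names `clean_text`, `has_blocked_title`, and `relevant_paragraphs` are hypothetical.

```python
import re

# Punctuation to keep, taken from the list above.
KEEP_PUNCT = "().,?!~:;'\""
# Assumption: Hangul syllables and jamo, Latin letters, digits, and whitespace
# also survive; anything else becomes a space.
_DISALLOWED = re.compile(
    r"[^0-9A-Za-z가-힣ㄱ-ㅣ\s" + re.escape(KEEP_PUNCT) + r"]"
)
# A run of 3 or more of the same ㅋ/ㅜ/ㅠ/ㅎ is cut down to 2.
_REPEATS = re.compile(r"([ㅋㅜㅠㅎ])\1{2,}")

BLOCKED_TITLE_KEYWORDS = ("빌라", "아파트", "부동산", "정치")


def clean_text(text):
    text = _DISALLOWED.sub(" ", text)
    text = _REPEATS.sub(r"\1\1", text)
    # Collapse the spaces left behind by removed characters.
    return re.sub(r"[ \t]+", " ", text).strip()


def has_blocked_title(title):
    """True when the title contains one of the blocked keywords."""
    return any(keyword in title for keyword in BLOCKED_TITLE_KEYWORDS)


def relevant_paragraphs(body, spot_name):
    """Keep only the paragraphs that mention the attraction by name."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    return [p for p in paragraphs if spot_name in p]
```

For example, `clean_text("좋아요ㅋㅋㅋㅋㅋ ★★ 최고!!")` gives `"좋아요ㅋㅋ 최고!!"`. Matching on the attraction name alone is only a first guess at the "specific words" check; a list of related words per attraction might work better.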

Folder structure

  • install
    • install_requirements
  • data #ignore, shared via KakaoTalk
    • FE
    • BE
    • MODEL
    • DATA_GEN
      • tour_spot_name.json
  • code
    • FE
    • BE
    • MODEL
    • DATA_GEN
      • src

        • util
          • request.py
          • replace.py (basic preprocessing) re.sub
        • blog.py
        • instagram.py
        • google.py
        • from_api.py
      • crawling.py

    • main.py
    • .env (environment file) #ignore, service keys
      • .env.template lists only which entries exist, as a dict
  • .github

Data format

{
    "서울":{
        "관광지":{
            "남산타워":[
                {
                    "context" : "title\n~~~",
                    "url" : "https://~",
                    "type" : "blog"
                },
                {
                    "context" : "title\n~~~",
                    "url" : "https://~",
                    "type" : "instagram"
                },
                {
                    "context" : "title\n~~~",
                    "url" : "https://~",
                    "type" : "google"
                },
                {
                    "context" : "~~~",
                    "url" : "https://~",
                    "type" : "guide"
                }
            ],
            "???":[
                
            ]
        },
        "문화시설":{
        },
        "행사/공연/축제":{
        },
        "레포츠":{
        }
    },
    "경기":{
        ...
    },
    ...
}
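One way to consume this layout is to flatten it into one record per collected entry. The sketch below reads the nesting as region (서울 = Seoul) -> category (관광지 = tourist attractions, 문화시설 = cultural facilities) -> attraction (남산타워 = Namsan Tower) -> list of entries. The sample values and `example.com` URLs are placeholders, and the helper names `iter_records` and `split_context` are hypothetical. It also assumes, from the sample above, that `guide` entries carry no title line.

```python
from collections import Counter

# Tiny inline sample in the layout above; the strings are placeholders.
sample = {
    "서울": {
        "관광지": {
            "남산타워": [
                {"context": "title\nbody text",
                 "url": "https://example.com/1", "type": "blog"},
                {"context": "body only",
                 "url": "https://example.com/2", "type": "guide"},
            ],
        },
        "문화시설": {},
    },
}


def iter_records(data):
    """Yield one flat dict per entry, tagged with region, category, and spot."""
    for region, categories in data.items():
        for category, spots in categories.items():
            for spot, entries in spots.items():
                for entry in entries:
                    yield {"region": region, "category": category,
                           "spot": spot, **entry}


def split_context(entry):
    # Assumption: every type except "guide" stores the title on the first line.
    if entry["type"] == "guide":
        return None, entry["context"]
    title, _, body = entry["context"].partition("\n")
    return title, body


records = list(iter_records(sample))
type_counts = Counter(record["type"] for record in records)
```

Flat records make it easy to count entries per attraction or per `type`, which is what the "top 20" and "fewer than 50" checks in the preprocessing notes would need.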