AI Model Performance Metrics - 100-hours-a-week/16-Hot6-wiki GitHub Wiki
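1. Baseline model performance metrics
2. Details of the identified performance analysis
3. Concrete plan for the optimization techniques to apply
4. Expected performance after applying the optimizations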
Model | Image resolution | Processing time | VRAM | Mode | Notes | Adopted |
---|---|---|---|---|---|---|
DALL-E3 | 1024x1024 | unknown | unknown | txt→img | Easy to use through the API; cost is a concern | ✅ planned for v1 |
DeepSeek Janus Pro 7b | 768x768 | 4.2s | approx. 40GB | txt→img | Underwhelming quality for the VRAM it uses | ❌ |
Stable Diffusion v1.5 | 1024x1024 | 3s | approx. 11GB | img→img | Lightweight, but underwhelming quality | ❌ |
Stable Diffusion v2.1 | 1024x1024 | 3s | approx. 11GB | img→img | Lightweight, but underwhelming quality | ❌ |
Stable Diffusion XL | 1024x1024 | 8s | | img→img | Expected to perform well once fine-tuned with LoRA | ✅ after v1 |
Stable Diffusion XL + ControlNet | 1024x1024 | 8s | | txt→img | Fine-tuning example | ❌ |
HiDream I1 Full | 1024x1024 | unknown | approx. 30GB | txt→img | Fine-tuning example whose base model is Llama 3.1 | ❌ |
Initial Image
Given this desk photo, generate a realistic image by compositing a desk setup that aesthetically matches and complements the existing environment
blurry, low resolution, distorted, deformed, people, human, cartoon, anime, painting, clutter, messy background, bad lighting, duplicate, twin, cloned
"A clean, modern desk setup in a bright open office, with a large LG monitor displaying a forest wallpaper and a MacBook beneath it. Add a minimalist LED desk lamp on the left, a sleek wireless charger pad beside the laptop, and all cables neatly organized and hidden. The scene should feel calm, tidy, and professional, with natural lighting and no clutter."
blurry, low resolution, distorted, deformed, people, human, cartoon, anime, painting, clutter, messy background, bad lighting, duplicate, twin, cloned
explain details
This image depicts a workspace setup with a focus on visual content and technology. Here are the details:
- Primary Display:
  - There is a large monitor mounted on a stand, displaying a forest scene with tall redwood trees. The time on the screen is 2:55.
- Secondary Display:
  - A laptop is placed on the desk, and its screen is displaying the same forest scene as the larger monitor.
- Cables and Accessories:
  - Several cables are connected to the laptop, possibly for power, data transfer, or peripherals.
  - There is a small device with a blue button on the desk, which might be a USB hub or a small external hard drive.
- Work Environment:
  - The workspace appears to be in an office or a lab setting, with other desks and chairs visible in the background.
  - There are people working in the background, indicating a collaborative or professional environment.
- Signage:
  - A sign on the desk reads "Please do not touch the screens," suggesting that the screens are sensitive or important.
The setup seems to be designed for monitoring or analyzing visual content, possibly for research, development, or quality assurance purposes.
1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() |
explain details
The image depicts a modern workspace setup with a focus on productivity and comfort. Here are the details:
- Monitor and Laptop:
  - There is a large monitor mounted on a stand above a laptop. Both display the same image of a serene forest with tall redwood trees.
  - The laptop is placed on a black mouse pad, and it is connected to the monitor via a cable.
- Lighting:
  - A desk lamp with an adjustable arm is positioned to the left of the monitor. The lamp provides focused lighting, which is beneficial for tasks requiring attention to detail.
- Charging and Power:
  - A power adapter is plugged into the laptop, indicating it is being charged.
  - There is a power strip on the desk, which likely accommodates multiple devices.
- Plant:
  - A small potted plant is placed on the right side of the desk, adding a touch of greenery to the workspace.
- Clock:
  - The time displayed on the laptop is 8:56, which could be the current time or a reminder.
- Workspace Setup:
  - The desk is organized, with essential items like the laptop, monitor, lamp, and plant within easy reach.
  - The setup suggests a focus on both functionality and aesthetics, creating a pleasant and efficient working environment.
This setup is ideal for someone who needs to work for extended periods, requiring good lighting, a comfortable keyboard, and a touch of nature to stay motivated.
1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() |
"A clean, modern desk setup in a bright open office, with a large LG monitor displaying a forest wallpaper and a MacBook beneath it. Add a minimalist LED desk lamp on the left, a sleek wireless charger pad beside the laptop, and all cables neatly organized and hidden. The scene should feel calm, tidy, and professional, with natural lighting and no clutter."
1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() |
Given this desk photo, generate a realistic image by compositing a desk setup that aesthetically matches and complements the existing environment
blurry, low resolution, distorted, deformed, people, human, cartoon, anime, painting, clutter, messy background, bad lighting, duplicate, twin, cloned
1 | 2 |
---|---|
![]() | ![]() |
- Hyperparameters like strength, guidance_scale, num_inference_steps were varied across images to observe their influence on the results.
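A sweep like the one described above can be organized as a small grid. The sketch below is only an illustration: the specific values are hypothetical (the page does not list the settings that were tried), and `build_sweep` and `effective_steps` are helper names introduced here. The `effective_steps` helper reflects how diffusers-style img2img pipelines typically behave, where `strength` scales how many of the requested `num_inference_steps` are actually run; that the experiments used such a pipeline is an assumption based on the parameter names.

```python
from itertools import product


def build_sweep(strengths, guidance_scales, step_counts):
    """Return one settings dict per combination of the three hyperparameters."""
    return [
        {"strength": s, "guidance_scale": g, "num_inference_steps": n}
        for s, g, n in product(strengths, guidance_scales, step_counts)
    ]


def effective_steps(strength, num_inference_steps):
    """Approximate denoising steps actually run in img2img: steps * strength."""
    return min(int(num_inference_steps * strength), num_inference_steps)


# Hypothetical values; the source does not list the settings that were tried.
sweep = build_sweep(
    strengths=[0.25, 0.5, 0.75],
    guidance_scales=[5.0, 7.5],
    step_counts=[20, 40],
)

for settings in sweep:
    steps = effective_steps(settings["strength"], settings["num_inference_steps"])
    print(settings, "-> effective steps:", steps)
    # With a loaded img2img pipeline (assumed, not shown here), each run would be
    # something like:
    # image = pipe(prompt=prompt, negative_prompt=negative_prompt,
    #              image=init_image, **settings).images[0]
```

Recording the settings dict next to each output image makes it easier to attribute a visual difference to a single hyperparameter.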
Given this desk photo, generate a realistic image by compositing a desk setup that aesthetically matches and complements the existing environment
blurry, low resolution, distorted, deformed, people, human, cartoon, anime, painting, clutter, messy background, bad lighting, duplicate, twin, cloned
1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|
![]() | ![]() | ![]() | ![]() | ![]() |
- Hyperparameters like strength, guidance_scale, num_inference_steps were varied across images to observe their influence on the results.
"A clean, modern desk setup in a bright open office, with a large LG monitor displaying a forest wallpaper and a MacBook beneath it. Add a minimalist LED desk lamp on the left, a sleek wireless charger pad beside the laptop, and all cables neatly organized and hidden. The scene should feel calm, tidy, and professional, with natural lighting and no clutter."
"blurry, low resolution, distorted, clutter, people, human, cartoon, overexposed, deformed, text"
Given this desk photo, generate a realistic image by compositing a desk setup that aesthetically matches and complements the existing environment
- blurry, low resolution, distorted, deformed, people, human, cartoon, anime, painting, clutter, messy background, bad lighting, duplicate, twin, cloned
- low quality, bad quality, sketches
1 | 2 |
---|---|
![]() | ![]() |
Architecture: appears to be an image generation model built on Llama-3.1-8B-Instruct
"A clean, modern desk setup in a bright open office, with a large LG monitor displaying a forest wallpaper and a MacBook beneath it. Add a minimalist LED desk lamp on the left, a sleek wireless charger pad beside the laptop, and all cables neatly organized and hidden. The scene should feel calm, tidy, and professional, with natural lighting and no clutter."
"blurry, low resolution, distorted, clutter, people, human, cartoon, overexposed, deformed, text"
"A modern office workspace featuring a large monitor on a stand displaying a forest scene with tall redwood trees, showing the time as 2:55. A laptop is placed on the desk, mirroring the same forest image. Multiple cables are connected to the laptop, possibly for data and power. A small device with a blue button sits on the desk, likely a USB hub or external drive. The background shows other desks and people working, indicating a collaborative lab or office environment. A visible sign on the desk reads 'Please do not touch the screens,' suggesting the setup is for research, monitoring, or visual content analysis."
"blurry, low resolution, distorted, clutter, people, human, cartoon, overexposed, deformed, text"
- Token indices sequence length is longer than the specified maximum sequence length for this model (128 > 77). Running this sequence through the model will result in indexing errors
  Token indices sequence length is longer than the specified maximum sequence length for this model (128 > 77). Running this sequence through the model will result in indexing errors
  The following part of your input was truncated because max_sequence_length is set to 128 tokens: ['for research, monitoring, or visual content analysis.']
- The image may not have been generated correctly because the prompt exceeded the token limit.
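One way to avoid silent truncation is to check the prompt against the token budget before generation and drop trailing sentences until it fits. The sketch below is illustrative only: `trim_to_token_budget` and `whitespace_count` are hypothetical helpers, and the whitespace counter is a stand-in for the real tokenizer. In practice the text encoder's own tokenizer would be passed as `count_tokens`, with the budget set to the limit reported in the warning (77 or 128 in the log above).

```python
import re


def trim_to_token_budget(prompt, count_tokens, max_tokens):
    """Drop trailing sentences until the prompt fits within max_tokens.

    count_tokens is any callable returning the token count of a string.
    Returns (trimmed_prompt, dropped_sentences). A single sentence that is
    still too long is returned unchanged rather than cut mid-sentence.
    """
    # Split after sentence-ending punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]
    kept = list(sentences)
    while len(kept) > 1 and count_tokens(" ".join(kept)) > max_tokens:
        kept.pop()
    return " ".join(kept), sentences[len(kept):]


def whitespace_count(text):
    # Stand-in tokenizer for illustration only; real subword tokenizers
    # usually produce more tokens than a whitespace split.
    return len(text.split())


prompt = (
    "A modern office workspace with a large monitor showing a forest scene. "
    "A laptop on the desk mirrors the same image. "
    "A sign on the desk asks visitors not to touch the screens."
)
trimmed, dropped = trim_to_token_budget(prompt, whitespace_count, max_tokens=22)
print(trimmed)
print(dropped)
```

Dropping whole sentences keeps the remaining prompt grammatical, at the cost of losing the details that were listed last, so the most important description should come first.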
We test performance using LoRAs that other developers have already fine-tuned.
- Base Model: Stable Diffusion XL 1.0
- File Size: 217.88 MB
- Example Image
- Base Model: Stable Diffusion XL 1.0
- File Size: 217.87 MB
- Note: The 3D feature is not needed, so `3D 0ff1c3` is removed from the Trigger Words, and any other unneeded parts are removed from the prompt before use.
- Example Image
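Removing an unwanted trigger word from a comma-separated prompt can be done with a small helper. This is a minimal sketch: `remove_trigger_words` is a name introduced here, and the example prompt is hypothetical. Only the phrase `3D 0ff1c3` comes from the note above.

```python
def remove_trigger_words(prompt, unwanted):
    """Remove comma-separated prompt parts that match any unwanted phrase."""
    blocked = {phrase.strip().lower() for phrase in unwanted}
    parts = [part.strip() for part in prompt.split(",")]
    kept = [part for part in parts if part and part.lower() not in blocked]
    return ", ".join(kept)


# Hypothetical LoRA example prompt; only "3D 0ff1c3" comes from the note above.
example_prompt = "3D 0ff1c3, modern desk setup, natural lighting"
cleaned = remove_trigger_words(example_prompt, ["3D 0ff1c3"])
print(cleaned)  # modern desk setup, natural lighting
```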
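- Collect images of a variety of desks using ChatGPT-4o
- Collect a variety of images suited to the service using the LoRAs introduced above
- Reduce the model's computational cost through model quantization.
- Because user satisfaction comes first for our service, optimization prioritizes securing that satisfaction.
- The first goal is to use LoRA so that the model can generate images optimized for the service.
- Once such images can be generated, the next goal is to reduce computation through model quantization and similar techniques, shortening the time users have to wait and lowering server load.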
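To make the quantization step in the plan above more concrete, the following toy sketch shows symmetric per-tensor int8 quantization in plain Python. It is not the method chosen for this project (the page does not specify one), and the weights are made up. A real model would be quantized with library tooling rather than code like this.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Toy weights for illustration; not taken from any real model.
weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(codes, scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of the two bytes needed for 16-bit floats, at the cost of a rounding error of at most half the scale per weight. Whether that error is acceptable for image quality would need to be checked against the satisfaction-first priority stated above.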