Design Documentation - Visuwanaath/AiJiraAssistant GitHub Wiki
Language for API framework:
Node.js (JavaScript) with Express
Hosting service:
AWS EC2 for the API
Model inference service:
AWS Bedrock
A smaller model on AWS Bedrock may be used for pre-parsing.
We might also experiment with Amazon SageMaker during the testing phase.
Potential model architecture:
An initial model for summarizing/parsing the incoming information
A primary model for inferring key points and issuing transactions against the API
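The two-stage flow above can be sketched as a simple pipeline. This is a hypothetical sketch, not a final implementation: `invoke` stands in for whatever Bedrock invocation wrapper we end up writing, and the model IDs and prompts are placeholders.

```javascript
// Hypothetical two-stage pipeline: a smaller model condenses the raw
// transcript, then the primary model extracts actionable items from it.
// `invoke(modelId, prompt)` is a placeholder for a Bedrock model call.
async function processTranscript(transcript, invoke) {
  // Stage 1: pre-parse/summarize with the smaller model.
  const summary = await invoke('pre-parse-model', `Summarize: ${transcript}`);

  // Stage 2: infer key points / API actions with the primary model.
  const actions = await invoke(
    'primary-model',
    `Extract action items: ${summary}`
  );

  return { summary, actions };
}
```

Injecting `invoke` keeps the pipeline testable with a stub before any AWS wiring exists.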
Primary third-party APIs:
Potential third-party dependencies:
Potential Models
Costs are estimated assuming roughly 13,500 input tokens and 1,350 output tokens per hour of speaking.
Model | Price per 1000 input tokens | Price per 1000 output tokens | Price for 1 hour of transcript |
---|---|---|---|
Amazon Titan Text Lite | $0.00015 | $0.0002 | $0.0023 |
Llama 3 (8B) | $0.0004 | $0.0006 | $0.00621 |
Llama 2 (13B) | $0.00075 | $0.001 | $0.01148 |
Command Light | $0.0003 | $0.0006 | $0.00486 |
Command R | $0.0005 | $0.0015 | $0.00878 |
Claude 3 Haiku | $0.00025 | $0.00125 | $0.00506 |
Claude Instant | $0.0008 | $0.0024 | $0.01404 |
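The per-hour figures follow directly from the token-count assumptions; since Bedrock prices are quoted per 1,000 tokens, the token counts are divided by 1,000. A small helper sketching that arithmetic:

```javascript
// Estimate the cost of one hour of transcript from per-1,000-token prices.
// Token-rate assumptions come from the note above the table.
const INPUT_TOKENS_PER_HOUR = 13500;
const OUTPUT_TOKENS_PER_HOUR = 1350;

function costPerHour(inputPricePer1k, outputPricePer1k) {
  return (
    (INPUT_TOKENS_PER_HOUR / 1000) * inputPricePer1k +
    (OUTPUT_TOKENS_PER_HOUR / 1000) * outputPricePer1k
  );
}

// Example: Llama 3 (8B) at $0.0004 input / $0.0006 output per 1,000 tokens.
console.log(costPerHour(0.0004, 0.0006).toFixed(5)); // "0.00621"
```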
Future notes:
AWS offers regex-powered output filtering and keyword filtering on the free tier.
AWS also offers topic filtering at a paid tier (see AWS Bedrock Pricing).
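Independent of AWS's managed filters, a minimal regex/keyword post-filter on model output could look like the sketch below. The blocked terms and patterns are illustrative placeholders, not a real policy.

```javascript
// Hypothetical output post-filter: hard-blocks responses matching a regex
// and redacts keyword matches. Terms and patterns below are examples only.
const BLOCKED_KEYWORDS = ['password', 'secret'];
const BLOCKING_PATTERNS = [/\b\d{3}-\d{2}-\d{4}\b/]; // e.g. SSN-like strings

function filterOutput(text) {
  // Hard block: refuse any output that matches a blocking pattern.
  if (BLOCKING_PATTERNS.some((re) => re.test(text))) {
    return { allowed: false, text: null };
  }

  // Soft filter: redact blocked keywords, case-insensitively.
  let filtered = text;
  for (const word of BLOCKED_KEYWORDS) {
    filtered = filtered.replace(new RegExp(word, 'gi'), '[REDACTED]');
  }
  return { allowed: true, text: filtered };
}
```

A filter like this could run on every model response before it reaches the Jira transaction step, regardless of which tier of AWS filtering we end up using.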