Design Documentation - Visuwanaath/AiJiraAssistant GitHub Wiki

Language for API framework:

Node.js with JavaScript + Express

Hosting service:

AWS + EC2 for the API

Model inference service:

AWS Bedrock

Smaller model for pre-parsing (potential use): AWS Bedrock

We might experiment with SageMaker during the testing phase.

Potential Model architecture:

An initial model for summarizing and parsing information

A primary model for inferring key points and carrying out transactions against the API
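
The two-stage flow above can be sketched roughly as follows. This is only an illustration under assumptions: `processTranscript`, `summarize`, and `extractActions` are hypothetical names, and the two model calls are passed in as plain async functions so the sketch does not depend on any particular Bedrock client.

```javascript
// Hypothetical two-stage pipeline. `summarize` stands in for the smaller
// pre-parsing model and `extractActions` for the primary model; neither
// name comes from the project itself.
async function processTranscript(transcript, { summarize, extractActions }) {
  // Stage 1: the smaller model condenses the raw transcript.
  const summary = await summarize(transcript);
  // Stage 2: the primary model infers key points and proposed API actions.
  const actions = await extractActions(summary);
  return { summary, actions };
}

// Stand-in stubs so the sketch runs without any AWS access.
const stubModels = {
  summarize: async (text) => text.slice(0, 40),
  extractActions: async (summary) => [{ type: "comment", body: summary }],
};

processTranscript("Alice will fix the login bug by Friday.", stubModels)
  .then((result) => console.log(result));
```

Passing the model calls in as functions keeps the pipeline testable and makes it easy to swap models while comparing prices.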

Primary third-party APIs:

Potential third party dependencies:

Potential Models

Costs are calculated assuming 13,500 input tokens and 1,350 output tokens per hour of speech.

| Model | Price per 1,000 input tokens | Price per 1,000 output tokens | Price for 1 hour of transcript |
| --- | --- | --- | --- |
| Amazon Titan Text Lite | $0.00015 | $0.0002 | $0.0023 |
| Llama 3 (8B) | $0.0004 | $0.0006 | $0.0062 |
| Llama 2 (13B) | $0.00075 | $0.001 | $0.0115 |
| Command-Light | $0.0003 | $0.0006 | $0.0049 |
| Command R | $0.0005 | $0.0015 | $0.0088 |
| Claude 3 Haiku | $0.00025 | $0.00125 | $0.0051 |
| Claude Instant | $0.0008 | $0.0024 | $0.0140 |
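
The per-hour cost follows from the stated token assumptions: divide each token count by 1,000 and multiply by the matching per-1,000-token price. A minimal sketch of that arithmetic is below; `transcriptCost` and `MODELS` are names chosen here for illustration, and the prices are copied from the table.

```javascript
// Assumed workload per hour of speech, as stated above the table.
const INPUT_TOKENS_PER_HOUR = 13500;
const OUTPUT_TOKENS_PER_HOUR = 1350;

// Prices in USD per 1,000 tokens, copied from the table.
const MODELS = [
  { name: "Amazon Titan Text Lite", inputPer1k: 0.00015, outputPer1k: 0.0002 },
  { name: "Llama 3 (8B)", inputPer1k: 0.0004, outputPer1k: 0.0006 },
  { name: "Llama 2 (13B)", inputPer1k: 0.00075, outputPer1k: 0.001 },
  { name: "Command-Light", inputPer1k: 0.0003, outputPer1k: 0.0006 },
  { name: "Command R", inputPer1k: 0.0005, outputPer1k: 0.0015 },
  { name: "Claude 3 Haiku", inputPer1k: 0.00025, outputPer1k: 0.00125 },
  { name: "Claude Instant", inputPer1k: 0.0008, outputPer1k: 0.0024 },
];

// Cost in USD of processing `hours` hours of transcript with one model.
function transcriptCost(model, hours = 1) {
  const inputCost = (INPUT_TOKENS_PER_HOUR / 1000) * model.inputPer1k;
  const outputCost = (OUTPUT_TOKENS_PER_HOUR / 1000) * model.outputPer1k;
  return hours * (inputCost + outputCost);
}

for (const model of MODELS) {
  console.log(`${model.name}: $${transcriptCost(model).toFixed(4)} per hour`);
}
```

For scale, `transcriptCost(MODELS[0], 1000)` comes to roughly $2.30 for 1,000 hours of transcript on Amazon Titan Text Lite.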

Future notes:

AWS offers regex-powered output filtering and keyword filtering on a free tier.

AWS also offers topic filtering at a paid tier (see AWS Bedrock Pricing).
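
To make the filtering idea concrete, here is a small local sketch of regex and keyword checks on model output. It is not the AWS Bedrock API and does not reflect how AWS implements its filters; `filterOutput`, the keyword list, and the pattern are all hypothetical examples.

```javascript
// Hypothetical local filter, NOT the AWS Bedrock API. It only illustrates
// what keyword-based and regex-based checks on model output look like.
const BLOCKED_KEYWORDS = ["password", "api key"]; // example values only
const BLOCKED_PATTERNS = [/\b\d{3}-\d{2}-\d{4}\b/]; // example: SSN-like numbers

function filterOutput(text) {
  const lower = text.toLowerCase();
  // Case-insensitive substring match for each blocked keyword.
  const keywordHits = BLOCKED_KEYWORDS.filter((word) => lower.includes(word));
  // Patterns have no `g` flag, so `test` keeps no state between calls.
  const patternHits = BLOCKED_PATTERNS
    .filter((re) => re.test(text))
    .map(String);
  return {
    allowed: keywordHits.length === 0 && patternHits.length === 0,
    keywordHits,
    patternHits,
  };
}

console.log(filterOutput("Created ticket for the login bug."));
console.log(filterOutput("The admin password is hunter2."));
```

A managed filter would replace this check in production; the sketch is only useful for reasoning about which outputs should be blocked.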