AI: Training AI Model - RyanL2004/teamlyse GitHub Wiki

🔍 Is SAMSum the Best Dataset for Training?

The SAMSum dataset is one of the best available for conversational summarization, but it’s not perfect for all aspects of your AI. Here’s why:

✅ Best for:

  • 🗣️ Dialogue-based meetings (since it’s built from human conversations).
  • 📄 Short summaries (great for summarizing meeting points).

⚠️ Limitations:

  • ❌ No structured meeting minutes.
  • ❌ Lacks focus on action items, insights, or solutions.
  • ❌ Consists of short, written chat conversations, not transcripts of real spoken meetings.

🚀 What We Can Do Next

To take your AI beyond plain summarization, we should:

📌 Fine-tune with a custom dataset that includes:

  • 📋 Meeting summaries with key takeaways.
  • 🏢 Real-world corporate meeting transcripts.
  • 🤝 Data from decision-making discussions.
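
One way to picture such a custom dataset is as a list of records that pair a transcript with its summary and the extra fields SAMSum lacks. The sketch below is only an illustration: the `MeetingExample` class and all of its field names are hypothetical, chosen here to mirror the bullets above, and the JSON Lines output is an assumed storage format.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class MeetingExample:
    """One training example for the custom dataset (hypothetical schema)."""

    transcript: str  # full meeting text, one "Speaker: utterance" per line
    summary: str     # short reference summary
    key_takeaways: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    action_items: list[str] = field(default_factory=list)


def to_jsonl(examples: list[MeetingExample]) -> str:
    """Serialize examples as JSON Lines, one example per line."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in examples)


# A made-up example record, for illustration only.
example = MeetingExample(
    transcript="Ana: The release slipped a week.\nBen: I'll update the roadmap today.",
    summary="The release slipped by a week; Ben will update the roadmap.",
    key_takeaways=["Release delayed by one week"],
    action_items=["Ben: update the roadmap"],
)
jsonl = to_jsonl([example])
```

Keeping decisions and action items as separate fields would let us train or evaluate on them later without re-annotating the transcripts.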

📌 Combine SAMSum with another dataset:

  • AMI Corpus (Annotated Meeting Transcripts) → 🏢 Corporate discussions.
  • ICSI Corpus (Meeting recordings) → 🎤 Long-form meeting insights.
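
Combining corpora mostly comes down to mapping each one onto a shared schema before training. A minimal sketch, assuming every corpus has already been loaded as a list of dictionaries: SAMSum records use the `dialogue` and `summary` fields, while the `transcript` and `abstract` names used for AMI here are hypothetical placeholders, and the toy records are made up.

```python
import random


def normalize(record: dict, text_key: str, summary_key: str, source: str) -> dict:
    """Map one corpus-specific record onto a shared {dialogue, summary, source} schema."""
    return {
        "dialogue": record[text_key].strip(),
        "summary": record[summary_key].strip(),
        "source": source,
    }


def combine(corpora: dict, seed: int = 0) -> list:
    """Normalize every corpus, then shuffle so training batches mix sources.

    `corpora` maps a source name to a (records, text_key, summary_key) tuple.
    """
    merged = []
    for source, (records, text_key, summary_key) in corpora.items():
        merged.extend(normalize(r, text_key, summary_key, source) for r in records)
    random.Random(seed).shuffle(merged)
    return merged


# Toy records standing in for the real corpora.
samsum = [{"dialogue": "Amy: Lunch at 1?\nBob: Sure.",
           "summary": "Amy and Bob will have lunch at 1."}]
ami = [{"transcript": "PM: We need a remote design.\nID: I'll draft it.",
        "abstract": "The team agreed to draft a remote design."}]

mixed = combine({
    "samsum": (samsum, "dialogue", "summary"),
    "ami": (ami, "transcript", "abstract"),  # hypothetical field names
})
```

Keeping a `source` field makes it possible to check later whether the model does worse on long meeting transcripts than on short chats.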

🔮 Will the Model Provide Real-Time Insights & Solutions?

✅ Right Now: Focusing on Summarization

For now, our fine-tuned model will:
✔️ Summarize each meeting once it has ended.
✔️ Extract the most important points.
✔️ Identify who said what (when the transcript includes speaker labels).
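
Attributing points to speakers only works if the transcript carries speaker labels. SAMSum dialogues put one `Name: utterance` per line, so a quick sanity check is to group utterances by speaker before summarizing. This is a minimal sketch under that formatting assumption; `utterances_by_speaker` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict


def utterances_by_speaker(transcript: str) -> dict:
    """Group utterances by speaker for lines shaped like 'Name: text'.

    Lines without a speaker label or without any text are skipped.
    """
    grouped = defaultdict(list)
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and text.strip():
            grouped[speaker.strip()].append(text.strip())
    return dict(grouped)


transcript = "Ana: The release slipped a week.\nBen: I'll update the roadmap.\nAna: Thanks!"
by_speaker = utterances_by_speaker(transcript)
```

If this returns an empty dictionary for a real transcript, the model has no labels to work with and the summary cannot reliably say who said what.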

🌟 Future Goal: AI That Gives Real-Time Insights

To make your AI more intelligent and interactive, we need additional models and fine-tuning to:
✔️ Analyze statements and problems during a meeting.
✔️ Provide recommendations and solutions.
✔️ Recognize decision-making patterns and best practices.


💡 How Can We Achieve This?

  • 🧠 Combine BART (for summarization) with a reasoning-capable language model (like DeepSeek or OpenAI GPT-3.5).
  • 🏢 Fine-tune it on corporate decision-making datasets.
  • ⚡ Process the transcript in real time with AI agents, instead of only at the end of the meeting.
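
The ideas above could fit together roughly like this: buffer utterances as they arrive, summarize each window, and pass the summary to a reasoning model for suggestions. The sketch below is a sketch under stated assumptions, not a real integration. `RollingInsights`, `fake_summarize`, and `fake_advise` are hypothetical names, and the two stand-in functions only mark where the fine-tuned BART model and an LLM call would plug in.

```python
from collections import deque
from typing import Callable, Deque, Optional


class RollingInsights:
    """Buffer live utterances and run summarize -> advise every `window` utterances."""

    def __init__(self, summarize: Callable[[str], str],
                 advise: Callable[[str], str], window: int = 3):
        self.summarize = summarize
        self.advise = advise
        self.window = window
        self.buffer: Deque[str] = deque(maxlen=window)  # keeps only the latest utterances
        self._since_last = 0

    def add(self, utterance: str) -> Optional[str]:
        """Add one utterance; return advice once a full window has accumulated."""
        self.buffer.append(utterance)
        self._since_last += 1
        if self._since_last < self.window:
            return None
        self._since_last = 0
        return self.advise(self.summarize("\n".join(self.buffer)))


# Stand-ins: a real setup would wrap the summarization model and an LLM here.
def fake_summarize(text: str) -> str:
    return text.splitlines()[-1]


def fake_advise(summary: str) -> str:
    return f"Suggested follow-up for: {summary}"


insights = RollingInsights(fake_summarize, fake_advise, window=2)
outputs = [insights.add(u) for u in
           ["Ana: Budget is tight.", "Ben: Can we cut scope?", "Ana: Maybe."]]
```

Passing the models in as plain callables keeps the buffering logic testable without loading any model, and lets us swap the reasoning model later.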

🔗 What’s Our Next Step?

For now, we will:
✔️ Fine-tune BART on SAMSum for summarization.
✔️ Test how well it summarizes meetings.
✔️ Add real-time intelligence later.
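
To test how well the model summarizes, generated summaries are usually compared against reference summaries with ROUGE. The sketch below is a deliberately simplified ROUGE-1 F1 (lowercased whitespace tokens, no stemming), meant only to show what the score measures. For real evaluation we would use an established ROUGE implementation; the example sentences are made up.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference summary and a generated one.

    Simplified: lowercased whitespace tokens, no stemming or punctuation handling.
    """
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "ben will update the roadmap"
candidate = "ben will update the plan"
score = rouge1_f1(reference, candidate)  # 4 of 5 unigrams match on each side
```

A score near 1.0 means the generated summary shares most of its words with the reference. It says nothing about whether action items or decisions were captured, which is one more reason to build the custom dataset.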


📌 So to Summarize:

  • ✅ SAMSum is great for summarization, but we will supplement it with meeting-specific datasets.
  • ✅ Right now, we focus on summarization.
  • ✅ **Later,