News 23 July 2023
-
A recent report from OpenUK estimates that in 2022 the Gross Value Added to the UK economy from Open Source Software was £13.59 billion. To put that in context, the UK Tech Sector contributed around £50 billion in 2022, so the directly attributable contribution from Open Source Software is roughly 27%, more than a quarter of the overall Tech Sector contribution.
-
17 Jul, Microsoft published a paper, “Retentive Network: A Successor to Transformer for Large Language Models”. RetNet resolves the “impossible triangle” by simultaneously achieving training parallelism, low-cost inference (in GPU memory, throughput, and latency), and performance competitive with the Transformer, with favorable scaling curves. The authors say the code will be released within two weeks.
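A small numpy sketch may help make the “impossible triangle” concrete: the retention mechanism described in the paper has a parallel form used for training and an equivalent recurrent form that allows O(1)-per-token inference. The sketch below uses a single head with no xPos rotation, normalization, or gating, so it illustrates the idea rather than reproducing Microsoft's implementation.

```python
# Sketch of RetNet-style retention in its two equivalent forms (single head,
# no xPos rotation or normalization); illustrative only, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4          # sequence length, head dimension
gamma = 0.9          # per-head exponential decay factor
Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))

# Parallel form (training): out = (Q K^T * D) V,
# where D[n, m] = gamma^(n-m) for n >= m and 0 otherwise.
idx = np.arange(T)
D = np.where(idx[:, None] >= idx[None, :],
             gamma ** (idx[:, None] - idx[None, :]), 0.0)
out_parallel = (Q @ K.T * D) @ V

# Recurrent form (inference): S_n = gamma * S_{n-1} + K_n^T V_n;  out_n = Q_n S_n.
# Each new token costs O(d^2) regardless of how long the sequence already is.
S = np.zeros((d, d))
out_recurrent = np.zeros((T, d))
for t in range(T):
    S = gamma * S + np.outer(K[t], V[t])
    out_recurrent[t] = Q[t] @ S

print(np.allclose(out_parallel, out_recurrent))  # True: both forms agree
```

The equivalence of the two forms is what lets RetNet train with full parallelism like a Transformer yet serve tokens with a constant-size recurrent state instead of a growing KV cache.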
-
17 Jul, HPC-AI released a pre-training solution for 65-billion-parameter large language models. Using LLaMA, currently the most widely used open large model, as the example architecture, the solution improves pre-training speed by 38% compared with mainstream options in the industry while needing only 32 A100/A800 GPUs, which can save large-model enterprises substantial training costs.
-
18 Jul, Tri Dao published a paper, “FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning”. FlashAttention-2 (1) tweaks the algorithm to reduce the number of non-matmul FLOPs, (2) parallelizes the attention computation across different thread blocks, even for a single head, to increase occupancy, and (3) distributes the work between warps within each thread block to reduce communication through shared memory. These changes yield around a 2× speedup over FlashAttention, reaching 50-73% of the theoretical maximum FLOPs/s on an A100 and approaching the efficiency of GEMM operations.
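For readers who want to see the tiling idea FlashAttention builds on, here is a numpy sketch of blockwise attention with an online softmax, which avoids materializing the full T×T score matrix. The FlashAttention-2 improvements listed above (fewer non-matmul FLOPs, thread-block parallelism, warp-level work partitioning) live in the CUDA kernel and are not modeled here; this is only an illustration of the underlying algorithm.

```python
# Blockwise attention with an online softmax: the memory-saving idea behind
# FlashAttention, shown in plain numpy (non-causal, single head, illustrative only).
import numpy as np

def blocked_attention(Q, K, V, block=64):
    """softmax(Q K^T / sqrt(d)) V computed one key/value block at a time,
    keeping only a running row-wise max and sum instead of the full T x T matrix."""
    T, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q)
    m = np.full(T, -np.inf)   # running max of the scores seen so far
    l = np.zeros(T)           # running sum of exp(scores - m)
    for start in range(0, T, block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T * scale                  # scores against this block only
        m_new = np.maximum(m, S.max(axis=1))
        P = np.exp(S - m_new[:, None])        # block-local softmax numerator
        correction = np.exp(m - m_new)        # rescale earlier partial results
        l = l * correction + P.sum(axis=1)
        out = out * correction[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]

# Check against naive attention that materializes the full score matrix.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(256, 32)) for _ in range(3))
S = Q @ K.T / np.sqrt(32)
P = np.exp(S - S.max(axis=1, keepdims=True))
ref = (P / P.sum(axis=1, keepdims=True)) @ V
print(np.allclose(blocked_attention(Q, K, V), ref))  # True
```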
-
19 Jul, Meta released Llama 2, a series of large language models trained on 40% more data (2 trillion tokens) than Llama 1, with the context length doubled to 4,096 tokens. Meta also provided fine-tuned dialogue models trained on over 100k samples. Llama 2 outperforms other open-source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests.
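The checkpoints can be tried through the Hugging Face transformers library. The snippet below is a hypothetical quick start rather than Meta's official example: the hub id meta-llama/Llama-2-7b-chat-hf is assumed here, the repositories are gated behind Meta's license acceptance, and the [INST] ... [/INST] turn markers follow the chat format the dialogue models were fine-tuned with.

```python
# Hypothetical quick start for a Llama 2 chat checkpoint via Hugging Face transformers.
# Assumes the gated hub id "meta-llama/Llama-2-7b-chat-hf" and an accepted license.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"          # smallest chat variant (assumed id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 2 chat models were fine-tuned with [INST] ... [/INST] turn markers.
prompt = "[INST] Explain the 4096-token context window in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```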
-
19 Jul, Reuters reported that Apple is working on an AI offering similar to OpenAI’s ChatGPT and Google’s Bard, sending its shares up as much as 2% to a record high. According to Apple employees, the new virtual assistant summarizes text and answers questions based on the data it has been trained on; the tool essentially replicates Bard, ChatGPT and Bing AI and works as a web application.
-
19 Jul, researchers from UCL, EleutherAI, Meta, StabilityAI and others published a paper, “Challenges and Applications of Large Language Models”. The authors argue that, given the fast pace of the field, it is difficult to identify the remaining challenges and the already fruitful application areas. They examine the challenges of LLMs from three perspectives: design challenges, which relate to decisions taken before deployment; behavioral challenges, which occur during deployment; and science challenges, which hinder academic progress.
-
20 Jul, Nature published a paper, “How to introduce quantum computers without slowing economic growth”. The researchers believe that quantum computing could transform society through new ways of simulating materials, optimizing processes and improving machine learning. They suggest that specialists should work together to create narratives around the usefulness of quantum technologies; however, the technology bottlenecks for quantum computing remain unclear, and open questions remain: would these benefits lead to more products and services better tailored to customer needs, what would the impact be on the wider industrial landscape, and what new business models might emerge?
-
21 Jul, StabilityAI released FreeWilly1 and FreeWilly2, large and capable instruction-fine-tuned open-access language models. FreeWilly1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically generated dataset using supervised fine-tuning (SFT) in the standard Alpaca format. Similarly, FreeWilly2 leverages the LLaMA 2 70B foundation model and reaches performance that compares favorably with GPT-3.5 on some tasks.
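Since both models were tuned with SFT data in the Alpaca format, the sketch below shows what that format looks like: the prompt template popularized by the Stanford Alpaca project, with instruction, optional input, and response sections. The helper is illustrative and not StabilityAI's training code.

```python
# The "standard Alpaca format" referenced above: a fixed prompt template around
# (instruction, input, output) records. Illustrative sketch, not StabilityAI's code.
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def format_example(example: dict) -> str:
    """Turn one SFT record into the prompt + target text used for fine-tuning."""
    if example.get("input"):
        prompt = ALPACA_WITH_INPUT.format(instruction=example["instruction"],
                                          input=example["input"])
    else:
        prompt = ALPACA_NO_INPUT.format(instruction=example["instruction"])
    return prompt + example["output"]

print(format_example({
    "instruction": "Summarize the FreeWilly release in one sentence.",
    "input": "",
    "output": "StabilityAI released two instruction-tuned LLaMA-based models.",
}))
```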
-
21 Jul, according to HDTECH, Google cofounder Sergey Brin has returned to Google to work on Gemini, the company’s secretive and highly ambitious general-purpose AI project. Gemini is expected to be a multimodal foundation model that powers other AI models, but no further details are known at the moment.