News 10th July 2023
Last week, Salesforce released XGen-7B, which achieves comparable or better results than state-of-the-art open-source LLMs of similar size (e.g. MPT, Falcon, LLaMA, Redpajama, OpenLLaMA), and its targeted evaluation on long-sequence modeling benchmarks shows the benefits of the 8K-sequence models over 2K- and 4K-sequence models. Salesforce estimates the training cost at $150K for 1T tokens under Google Cloud pricing for TPU-v4.
4th July, according to iTnews, ChatGPT has been used in peer reviews of Australian Research Council (ARC) grant applications. The ARC warned that this could be a breach of confidentiality, and has since released a statement advising peer reviewers not to use AI as part of their assessments.
4th July, OpenAI announced that the GPT-4 API and Code Interpreter are generally available to all paying API users. It also announced a deprecation plan for some older models, which will be retired in 2024.
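For reference, a minimal call against the now generally available GPT-4 API with the `openai` Python package (mid-2023 Chat Completions style); the prompt contents here are just placeholders:

```python
# Minimal GPT-4 Chat Completions call with the openai Python package (mid-2023 API style).
# Assumes OPENAI_API_KEY is set in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-4",  # now available to all paying API users
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize this week's AI news in one sentence."},
    ],
    temperature=0.2,
)
print(response["choices"][0]["message"]["content"])
```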
On 5th July, a group of researchers published a paper “FLACUNA: Unleashing the Problem-Solving Power of VICUNA using FLAN Fine-Tuning”. The researchers constructed a new dataset comprising a large number of tasks that demand problem-solving skills. Experimental findings strongly indicate that FLACUNA's enhanced problem-solving abilities are obtained by fine-tuning VICUNA on the FLAN dataset, leading to significant improvements across numerous benchmark datasets in INSTRUCTEVAL.
5th July, Nature published a paper “Accurate medium-range global weather forecasting with 3D neural networks”. The authors show that three-dimensional deep neural networks can be trained to forecast global weather patterns, including extreme weather, with accuracy greater than or equal to that of the best numerical weather prediction models.
5th July, researchers from Microsoft published a paper “LongNet: Scaling Transformers to 1,000,000,000 Tokens”. LongNet is a Transformer variant that can scale sequence length to more than 1 billion tokens without sacrificing performance on shorter sequences. Specifically, the authors propose dilated attention, which expands the attentive field exponentially as the distance grows.
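A toy sketch of the dilated-attention idea (not the paper's implementation; the segment lengths, dilation rates, and the simple averaging of branches below are illustrative): within each segment only every r-th position participates in attention, and segment length and dilation grow together, so per-query cost stays bounded while coverage extends with distance.

```python
# Simplified single-head dilated attention: each segment attends only over a
# dilated (subsampled) set of its own positions. Illustrative sketch only.
import torch
import torch.nn.functional as F

def dilated_attention(q, k, v, segment_len, dilation):
    """q, k, v: (seq_len, d). Dense attention restricted to every `dilation`-th
    position inside each segment of length `segment_len`."""
    seq_len, d = q.shape
    out = torch.zeros_like(q)
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        idx = torch.arange(start, end, dilation)          # sparse positions in this segment
        qs, ks, vs = q[idx], k[idx], v[idx]
        attn = F.softmax(qs @ ks.T / d ** 0.5, dim=-1)    # dense attention on the sparse subset
        out[idx] = attn @ vs
    return out

# Illustrative usage: mix several (segment_len, dilation) branches so nearby tokens
# are attended densely and distant tokens sparsely; here the branches are just averaged.
q = k = v = torch.randn(8192, 64)
y = sum(dilated_attention(q, k, v, s, r) for s, r in [(2048, 1), (4096, 2), (8192, 4)]) / 3
```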
6th July, Google and others published a paper “Focused Transformer: Contrastive Training for Context Scaling”. The paper introduces the Focused Transformer (FoT), a technique that employs a training process inspired by contrastive learning. This novel approach enhances the structure of the (key, value) space, enabling an extension of the context length.
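As a rough illustration of that contrastive flavor, here is a generic InfoNCE-style loss over attention keys: the query is pulled toward a key from the relevant document and pushed away from keys drawn from unrelated documents. This is not FoT's exact objective, and all names below are illustrative.

```python
# Generic InfoNCE-style contrastive loss over attention keys (illustrative, not FoT's code).
import torch
import torch.nn.functional as F

def key_contrastive_loss(query, positive_key, negative_keys, temperature=0.1):
    """query, positive_key: (d,); negative_keys: (n, d).
    Treats the positive key as the correct 'class' among 1 + n candidates."""
    pos = (query @ positive_key) / temperature
    neg = (negative_keys @ query) / temperature
    logits = torch.cat([pos.unsqueeze(0), neg]).unsqueeze(0)  # (1, 1 + n), positive at index 0
    target = torch.zeros(1, dtype=torch.long)                 # index of the positive key
    return F.cross_entropy(logits, target)

# Illustrative usage with random vectors.
d = 64
loss = key_contrastive_loss(torch.randn(d), torch.randn(d), torch.randn(32, d))
print(loss.item())
```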
7th July, according to The Washington Post and this report, ChatGPT, the highly popular AI chatbot introduced in November 2022, experienced a decline in its website's monthly traffic and unique visitors for the first time in June, as reported by Similarweb analytics.
10th July, according to The Verge, Instagram’s Threads app surpassed 100 million users within only five days of its release.