BloombergGPT is one of the first large language models (LLMs) built specifically for finance: a model in the spirit of GPT-4, but trained for and focused on the financial world. I think this will change the game in the coming months and years, especially when paired with computational plug-ins for GPT models such as Wolfram Alpha, which corrects GPT's notoriously poor math skills and gives it real computational power.

Bloomberg's announcement is here: https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/

The original paper behind BloombergGPT is a challenging read and a bit of a slog, but it is worthwhile for how it describes both the datasets these domain-specific models are trained on and how such models work: https://arxiv.org/pdf/2303.17564.pdf

As an experiment, I asked Bing's ChatGPT sidebar to summarize that 37-page scientific paper into a bullet-point summary, with page references, that an intelligent high school student could understand (a sketch of how to script this kind of summarization yourself appears at the end of this post). Here is what it wrote:
- BloombergGPT is a large language model for finance (page 3)
- It is trained on a huge amount of financial data and some general data (pages 5-10)
- It uses a novel tokenizer and a transformer-based architecture (pages 9-11)
- It outperforms existing models on financial tasks and does well on general tasks (pages 16-31)
- It can support various applications in financial technology (page 37)
Summary:
- The paper introduces BloombergGPT, a large language model (LLM) that is trained on a huge amount of financial data and can perform various natural language processing (NLP) tasks related to finance (page 3).
- An LLM is a type of artificial intelligence (AI) that can generate natural language text from some input, or prompt. It can also understand and answer questions, summarize texts, generate code, and more (page 3).
- An LLM is trained on a large collection of texts called a corpus. The quality and quantity of the corpus affect the performance of the LLM on different tasks and domains (page 5).
- BloombergGPT is trained on a corpus of over 700 billion tokens (word fragments or symbols), one of the largest domain-specific corpora ever created. It includes texts from news, filings, press releases, web pages, and other sources relevant to the financial industry (pages 5-8).
- BloombergGPT is also trained on some public datasets that cover a broad range of topics and domains, such as Wikipedia, C4, and The Pile. This helps the model to maintain general knowledge and skills while being specialized for finance (pages 8-9).
- BloombergGPT uses a Unigram tokenizer, which breaks text down into smaller subword units that the model can process efficiently, and a transformer-based architecture with 70 layers and 50 billion parameters, the building blocks of the model (pages 9-11). A minimal tokenizer sketch follows this summary.
- BloombergGPT is evaluated on several benchmarks that measure its ability to perform financial and general NLP tasks. The results show that BloombergGPT outperforms existing models on financial tasks by significant margins without sacrificing performance on general tasks (pages 16-31).
- BloombergGPT is a cutting-edge AI that can evaluate financial data quickly to help with risk assessments, gauge financial sentiment, and possibly even automate accounting and auditing activities (page 37).
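To make the tokenizer point above concrete, here is a minimal sketch of Unigram tokenization using the open-source sentencepiece library. This is not Bloomberg's code: the corpus file name and the tiny vocabulary size are illustrative assumptions (the paper's actual vocabulary is about 131,000 tokens).

```python
# Minimal sketch of Unigram tokenization, the algorithm the paper uses.
# Not Bloomberg's code: the corpus file and vocab size are illustrative.
import sentencepiece as spm

# Train a tiny Unigram model on a hypothetical file of financial text,
# one sentence per line (corpus.txt is an assumed local file).
spm.SentencePieceTrainer.train(
    input="corpus.txt",
    model_prefix="fin_unigram",
    model_type="unigram",   # the Unigram algorithm (pages 9-11)
    vocab_size=1000,        # toy size; BloombergGPT uses ~131k tokens
)

# Load the trained model and break a sentence into subword pieces.
sp = spm.SentencePieceProcessor(model_file="fin_unigram.model")
print(sp.encode("Treasury yields rose after the Fed's decision.", out_type=str))
# e.g. ['▁Treasury', '▁yield', 's', '▁rose', '▁after', ...]
```

The point of Unigram (as opposed to simple word splitting) is that rare financial terms get broken into reusable pieces instead of falling out of the vocabulary entirely.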
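And for anyone who wants to reproduce my summarization experiment programmatically rather than through Bing's sidebar, here is a rough sketch using the pypdf and openai Python packages. The PDF file name, the model name, and the crude truncation to fit a context window are all my assumptions, not anything from the paper.

```python
# Rough sketch of the summarization experiment done programmatically.
# Assumptions: the paper saved locally as bloomberggpt.pdf, an
# OPENAI_API_KEY set in the environment, and naive truncation because
# a 37-page paper will not fit in most models' context windows.
from pypdf import PdfReader
from openai import OpenAI

# Extract text page by page, tagging each page so the model can cite it.
reader = PdfReader("bloomberggpt.pdf")
text = "\n".join(
    f"[page {i + 1}] " + (page.extract_text() or "")
    for i, page in enumerate(reader.pages)
)

prompt = (
    "Summarize this paper as bullet points with page references, "
    "at a level an intelligent high school student can understand:\n\n"
    + text[:40000]  # naive truncation to stay within the context window
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; use whatever you have access to
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```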