A BERT-based summarization system for financial earnings calls. It takes a 60-minute transcript of such a call and compresses the contents down into 5 bullet points.
Financial earnings calls are important events in investment management: CEOs and CFOs present the results of the recent quarter, and a few invited analysts ask them questions at the end in a Q&A block.
Because this is very different prose from news, traditional summarization methods fail. So we pre-trained a transformer from scratch on a ton of high-quality (REUTERS only) finance news and then fine-tuned it on a large (100k sentences) self-curated corpus of expert-created summaries.
We also implemented a range of other systems for comparison.
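For anyone wondering what the "compress down to 5 bullet points" step could look like, here is a minimal sketch. It assumes an extractive setup (pick the most relevant sentences), which the description above does not confirm. The names `top_k_bullets` and `toy_score` are hypothetical, and `toy_score` is only a stand-in for whatever relevance score a fine-tuned model would produce.

```python
from typing import Callable, List


def top_k_bullets(
    sentences: List[str],
    score: Callable[[str], float],
    k: int = 5,
) -> List[str]:
    """Pick the k highest-scoring sentences and return them in transcript order.

    Assumption: extractive selection. The actual system may work differently.
    """
    # Rank sentence indices by score, best first (sorted() is stable on ties).
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore original order so the bullets read chronologically.
    keep = sorted(ranked[:k])
    return [sentences[i] for i in keep]


def toy_score(sentence: str) -> float:
    # Hypothetical placeholder for a model's relevance score:
    # counts digits, on the guess that reported figures tend to matter.
    return float(sum(ch.isdigit() for ch in sentence))


transcript = [
    "Good morning and welcome.",
    "Revenue grew 12% to 340 million.",
    "Thanks, operator.",
    "We expect margins of 18% next quarter.",
]
bullets = top_k_bullets(transcript, toy_score, k=2)
```

In a real pipeline, `toy_score` would be replaced by the fine-tuned model's per-sentence score, and `k` would be 5.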
Oh funny, I've been working on a similar project, analyzing earnings call transcripts using LLMs. My first attempt was with BERTopic. The results were awful. My second attempt was with a fine-tuned 7B version of Mistral with heavy prompt engineering; the results were actually super good in my opinion... plus it runs on a single 3090.
https://link.springer.com/chapter/10.1007/978-3-031-28238-6_...