If you're interested in Python, SLM fine-tuning, and/or emotion recognition, @PetrKorab presents a new, accessible tutorial based on the Mistral Small 3.1 model and a social media-based dataset. https://t.co/Zz5pSEBh28
This @TDataScience Python tutorial explains:
* Pre-processing of highly imbalanced data for LLM fine-tuning
* Last year's 𝐈𝐒𝐌𝐎𝐓𝐄 method from Nature for generating synthetic training samples https://t.co/X0xJdzTx9i
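To make the imbalance problem concrete, here is a minimal sketch of plain random oversampling. This is not ISMOTE and not the tutorial's code: it only duplicates existing minority-class rows rather than generating synthetic ones. The function name `oversample` and the toy emotion labels are assumptions made for illustration.

```python
import random
from collections import Counter


def oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one.

    Hypothetical helper: plain random oversampling, not ISMOTE.
    """
    rng = random.Random(seed)

    # Group the texts by their label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is topped up to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data: four "joy" posts, one "anger" post, one "fear" post.
texts = ["so happy", "love this", "great day", "yay", "this is outrageous", "I'm scared"]
labels = ["joy", "joy", "joy", "joy", "anger", "fear"]

balanced_texts, balanced_labels = oversample(texts, labels)
counts = Counter(balanced_labels)
```

After the call, each class has as many rows as the largest class. Synthetic-sample methods aim to add new, varied examples instead of exact copies.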
Two paths are redefining topic modeling: stronger statistical methods and generative LLM-driven approaches. @PetrKorab, Martin Feldkircher, and Márton Kardos break down where each shines and how they converge.
https://t.co/pHTVhQ3t9o
Discover how to use LLMs to automatically generate human-readable topic names from your model's output in this new article by @PetrKorab, Martin Feldkircher, and Márton Kardos.
https://t.co/pHTVhQ3t9o
This Towards Data Science tutorial sheds light on a new topic model - 𝐅𝐀𝐒𝐓𝐨𝐩𝐢𝐜, presented at last year’s 𝐍𝐞𝐮𝐫𝐈𝐏𝐒, and compares its scores with the state-of-the-art 𝐁𝐄𝐑𝐓𝐨𝐩𝐢𝐜 on a customer reviews dataset. https://t.co/A7mDrbEiqa
Text cleaning made simple. 𝐖𝐚𝐬𝐡𝐞𝐫 packs all standard text processing steps into a no-code studio.
Check the TMS blog to read more. https://t.co/1slN4M06HD
Manual topic labeling? NO MORE!! @PetrKorab's newest article dives into how to label topic models with LLMs, specifically using Turftopic and GPT-4o-mini for reproducible, human-readable names.
https://t.co/0opWrHzEX7
Highly interesting!
"What Do Central Bankers Talk About? Evidence From the BIS Archive" by Martin Feldkircher, Petr Koráb, and Viktoriya Teliha.
"This paper analyzes the evolution of central bank topics using a corpus of over 20,000 speeches spanning nearly three decades and a range of topic models. We identify thirteen themes, including monetary policy, financial stability, digital payments, and climate-related finance. Examining their development over time, we classify these themes as “evergreens”, “waning threads”, or emergent “rising stars”, and show that early adoption and topic leadership are nearly equally shared between emerging and advanced economies’ central banks."
https://t.co/L5YQOmMaLV
Central bank communication is more important than ever. In our new working paper (Martin Feldkircher, Viktoriya Teliha), we analyze central bank speeches with both classical and transformer-based topic models. https://t.co/C2NeFsEvjg
Prioritize customer emails like a pro. Learn how to build and evaluate a bigram topic model pipeline (using BERTopic & FASTopic) to classify incoming emails and reduce response times to critical issues. Read @PetrKorab's full article free now.
https://t.co/xJgDrV3igF
This @TDataScience article:
✔️ shows how topic models help Customer Care prioritize email communication and dynamically decrease response time
✔️ explains metrics for evaluating the quality of top-notch BERTopic and FASTopic models in BI
✔️ includes easy-to-follow Python code.
Have you ever cleaned 𝐭𝐞𝐱𝐭 𝐝𝐚𝐭𝐚𝐬𝐞𝐭𝐬? There's a lot to remove before jumping into text analytics: 𝘯𝘶𝘮𝘣𝘦𝘳𝘴, 𝘴𝘵𝘰𝘱𝘸𝘰𝘳𝘥𝘴, 𝘱𝘶𝘯𝘤𝘵𝘶𝘢𝘵𝘪𝘰𝘯, 𝘦𝘮𝘰𝘫𝘪𝘴, 𝘣𝘭𝘢𝘯𝘬 𝘴𝘱𝘢𝘤𝘦𝘴, encoding errors... 𝐖𝐚𝐬𝐡𝐞𝐫 makes it easy in a no-code way.
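For readers who prefer code, the cleaning steps listed above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not how Washer works internally: the `clean_text` function, the tiny stopword list, and the emoji ranges are all assumptions, and a real project would use a fuller stopword list and emoji coverage.

```python
import re
import string

# Tiny illustrative stopword list (assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}

# Rough emoji coverage: common pictograph blocks only (assumption, not exhaustive).
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text):
    """Lowercase, then drop emojis, numbers, punctuation, stopwords, extra spaces."""
    text = text.lower()
    text = EMOJI_PATTERN.sub(" ", text)   # emojis
    text = re.sub(r"\d+", " ", text)      # numbers
    text = text.translate(PUNCT_TABLE)    # punctuation
    # split() with no arguments also collapses repeated blank spaces.
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)


cleaned = clean_text("The  battery lasted 3 days!!! 😀 Great value, and fast shipping.")
print(cleaned)
```

Encoding errors are left out of this sketch because they usually need a dedicated repair step before any of the above.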
Journey to an unexplored world. Our article with Lucas Wise, @FlorianHorky, and @JarkoFidrmuc explains visualization strategies for Bitcoin wallet networks. We visualized 0.3M Bitcoin transactions for two episodes: Russia-Ukraine, 2020, and the banking crisis in Cyprus, 2013.
Boost the engagement factor of your visualizations by following along with @PetrKorab's latest tutorial, which shows how to create an animated word cloud of bigram frequencies to display a text dataset in an MP4 video. https://t.co/G5Y9Gf2ukB
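The input to such a word cloud is a table of bigram frequencies. Here is a minimal sketch of that counting step only. It is not the tutorial's code, and it does not do the animation or the MP4 export. The `bigram_counts` helper and the sample sentences are made up for illustration.

```python
from collections import Counter


def bigram_counts(docs):
    """Count adjacent word pairs (bigrams) across a list of documents."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        # zip(tokens, tokens[1:]) pairs each word with the one that follows it.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Made-up sample documents.
docs = [
    "customer service was great",
    "great customer service again",
    "slow customer service",
]

top = bigram_counts(docs).most_common(1)
print(top)
```

Computing these counts per time period (for example, per month) would give one frame of frequencies per period, which is what an animation would step through.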