Not frequenting this site anymore. Please find me on Mastodon where I am @[email protected] (https://t.co/juwj11SlgC) or LinkedIn (https://t.co/0wLrh8c6Bi)
@EnricoShippole struck again! 🔥
He just open-sourced a 590GB dataset containing 8 million SEC EDGAR filings (10-K, 10-Q, 8-K, etc.) and 43 billion tokens from 20 years of publicly traded company financial data. Includes both raw and parsed plaintext versions plus metadata.
Unofficial API providers charge hundreds of dollars per month for this data with strict rate limits. The official SEC EDGAR system limits you to 10 requests/second, meaning that crawling 8 million filings takes more than 9 days even at the maximum rate and one request per filing. This dataset gives you everything pre-crawled, cleaned, and parsed for free on Hugging Face.
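As a rough check of that crawl-time estimate, here is a back-of-the-envelope sketch. It assumes one request per filing by default; `requests_per_filing` is a hypothetical parameter added for illustration, since fetching a filing may also need an index request.

```python
SECONDS_PER_DAY = 86_400


def min_crawl_days(num_filings: int,
                   max_requests_per_second: float,
                   requests_per_filing: int = 1) -> float:
    """Lower bound on crawl time in days at a fixed request-rate cap.

    requests_per_filing is an assumption for illustration; the real
    number of requests needed per filing is not stated in the post.
    """
    total_requests = num_filings * requests_per_filing
    return total_requests / max_requests_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # 8 million filings at the 10 requests/second cap quoted above
    print(f"{min_crawl_days(8_000_000, 10):.2f} days at 1 request per filing")
    print(f"{min_crawl_days(8_000_000, 10, 2):.2f} days at 2 requests per filing")
```

Any retries, back-off, or extra index lookups only push the real figure higher than this lower bound.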
You can train financial LLMs on 43B tokens of real corporate filings without paying API fees, build retrieval systems that understand SEC filing structure and terminology, fine-tune models on specific filing types (10-K annual reports, 8-K material events, etc.), or create benchmarks for financial document understanding and reasoning. It gives researchers and developers free access to domain data that's been locked behind paywalls, keeping financial AI development open when everything else is closing up.
He is not finished: the next release will include all other filing types.
Blog: https://t.co/MGohzB1cku
Dataset: https://t.co/bcBZGDThn0
It was great talking with @AnalyticAnna - we took a quick but reasonably deep dive into #AzureSQL Hyperscale elastic pools. Do take a look when convenient; looking forward to your questions and feedback!
Discover the power of Hyperscale elastic pools! 🎯Optimize performance & costs for your databases. Learn more with Anna Hoffman & Arvind Shyamsundar on Data Exposed.
Watch 📺: https://t.co/nH6M0Pd5H1 #AzureSQL
A significant milestone - enabling customers to migrate to #AzureSQL while ensuring gr8 price / perf. Using elastic pools for migrations ensures both isolation (security) of different workloads, and resource pooling. Kudos to Ajith & team for this awesome capability!
Azure PowerShell and CLI for DMS now include Elastic Pools SKU recommendations! Elastic Pools can help you share resources and significantly reduce costs. Try it out today!
https://t.co/80jYowNeqk
In this episode of Data Exposed with @AnalyticAnna, learn about @bobwardms's new book - Azure SQL Revealed 2nd Edition - a must-read for anyone looking to translate their #SQLServer skills to Azure.
Watch 📺: https://t.co/TOSz7AT2yD #AzureSQL
📢📢📢Vector support for #AzureSQL DB is now in public preview, available to everyone, in every new and existing database.😍😍😍 Check out the announcement here, and then go build fantastic AI apps on your own data!
https://t.co/qu3Fbb3KCo
Now Generally Available (GA), Hyperscale elastic pools allow you to optimize performance and cost across a fleet of databases. Learn why you should consider Hyperscale elastic pools - this week on Data Exposed with @AnalyticAnna.
Watch 📺: https://t.co/yA1tjlXIIF #AzureSQL
Hot off the press, John Savill talks about #AzureSQL #Hyperscale database and how it's meant for all kinds of workloads, not just the bigger ones: https://t.co/eIsyMOkj0I
@ParentSquare MS Outlook classifies emails originating from addresses like donotreply+<random string>@parentsquare.com as junk email. My guess is that since each such source address is unique, with a random GUID, the algorithm thinks it is spam. Can your IT folks please look at this?
Run static T-SQL code analysis against your live database directly from "Server Explorer" with the latest release of my free "EF Core Power Tools" extension for Visual Studio - and consider using a Database project?
#sqlserver #dacfx #dotnet #azuresql
https://t.co/5EysaA9LXG
Analyze this! - "this" being a SQL Database project in Visual Studio - try it in my free "EF Core Power Tools" Visual Studio extension - static code analysis for T-SQL!
#sqlserver #azuresql #dotnet #dacfx
https://t.co/U7o6j1ID7X
@dataspeakers and #SQLServer folks - what are the major developer / data focused tech conferences in the Middle East and in Southeast Asia? I was looking at https://t.co/WwEmNbP4eH and did not spot any events catering to those two large geos.
🚀 Attention Data professionals!🎉
We're thrilled to announce the feature you've all been asking for!🌟 Introducing RegEx support in Azure SQL Database. Click the link below to sign up and experience the feature!
https://t.co/SBhkNljYQt
#sqlbits #AzureSQL #Regex @AzureSQL