Fink

@datageneralist

"The Data Generalist" | Data Expert. Finance enthusiast. | Career Advisor | Make learning a constant in your life.

Joined September 2011

2.5K Following

689 Followers

12.8K Posts

Pinned Tweet

Fink @datageneralist

11 months ago

While everyone is focused on topical AI content, I put together some timeless principles for understanding AI systems. The focus on key fundamentals will help you use AI tools more effectively for many years. https://t.co/FWlKNImDvJ

555

Fink @datageneralist

about 2 hours ago

@kanavtwt Wrote who Json was 6 years ago lol https://t.co/PNopoLRJ5u

Fink @datageneralist

about 4 hours ago

@_LoveLiberty @AngryTomtweets The gallon and probably at Costco

datageneralist retweeted

Rohan Paul

@rohanpaul_ai

1 day ago

dot-com bubble vs. a possible AI bubble.

From the famous "Dean of Valuation", Professor Aswath Damodaran, of NYU Stern School of Business:

“And that’s the real big difference between the dot-com boom and bust and the AI boom. We don’t know whether there’ll be a bust. History suggests there will be a bust. The dot-com boom and bust had no huge capital expenditure in that cycle. In fact, there was very little traditional CapEx, or even R&D, driving it. People started apps. They basically started going on it.

This has been the biggest infrastructure run-up I think I’ve ever seen in business. You can go back and compare it to the automobile business 100 years ago. The amount of money that’s being put into AI CapEx is immense, which means that when the correction comes, the pain will be more intense.

And herein lies the second problem. The dot-com boom and bust was almost entirely equity-funded. You think, so what? Well, when the bust came, those shareholders lost 60%, 70%, 80%, or 90% of their money. You felt sorry for them, but the loss was restricted to the shareholders.

The problem with the AI CapEx boom is that not only is it immense, but a big chunk of it is funded with debt, and the debt is coming from private capital rather than banks. There’s a very real chance that if there’s a correction and companies start having problems, that problem is going to show up as distress and default, and that really doesn’t stay restricted. It spills over into the rest of society.

I’m not saying it’s going to be 2008, but 2008 is an example of what happens when lenders overreach, when they lend money at too low a rate, and the correction comes. The pain spills over. So that is my concern with this big market illusion: the potential societal cost of having to deal with debt coming due that you’re unable to pay. It’s much more painful than your share price dropping 90% and you feeling the pain.”

---- From "Excess Returns" YouTube channel (link in comment)

115

361

419K

datageneralist retweeted

The Other Alistair @TotherAlistair

2 days ago

ow that hits hard

245

746

172K

Fink @datageneralist

1 day ago

File systems are great for fast research or rough drafts, but I wouldn't call any output fact-based when the LLM query is probabilistic.

Carter Rabasa

@crtr0

5 days ago

https://t.co/ro2ONOnaw4

180

354

110K

datageneralist retweeted

Shreya Shankar

@sh_reya

1 day ago

Building AI products is hard. But it's getting increasingly popular! I'm really excited to share that my friends and I are putting together (the best) lecture series on AI Product Engineering this summer!! We've got an awesome lineup of talks spanning data, evals, and UX. With more to come. The lecture series is completely free! And ~2k people have signed up already even though we haven't posted on social media yet! I can't wait. Join us and sign up: https://t.co/5DWcm4va5m

Fink @datageneralist

1 day ago

@p_millerd Sadly, if you're not an SME, much of your audience is wowed by BS.

195

datageneralist retweeted

Yasmine Khosrowshahi

@yasminekho

2 days ago

Seth Godin gave a masterclass on how to build an unforgettable brand in the age of AI:

1. Marketing is not about spend. It is about creating the conditions for other people to eagerly spread your idea.
2. Authenticity is overrated. What customers actually want is consistency. Show up the same way every single time and that is worth more than any Super Bowl ad.
3. Everything your company does is a marketing decision. How you answer the phone. What you charge. How you design things. Marketing is not a department. It is everything.
4. Trust is simple. Make a promise. Keep it. Especially when it is hard.
5. Successful brands are built with your customers talking about you. Not you talking about you.
6. A brand is not a logo. A brand is a promise. Nike has a brand. Hyatt has a logo. One of them you know exactly what to expect. The other you do not.
7. You are measuring the wrong things. Follower counts. Stock price. Open rates. False proxies will take your business in the wrong direction faster than anything else.
8. Social media followers mean nothing. Godin has 400,000 Instagram followers and says if he posts about a new book maybe 12 people buy it. The number is a distraction.
9. Stop trying to be famous. The goal is not to get more famous. The goal is to get less famous and more trusted.
10. Average marketing reaches average people. Average people will not buy your product. You need the people who will talk about you, challenge you, and eagerly pay more for better.
11. When you pick your customers you pick your future. Stop trying to reach everyone. Start trying to deeply serve someone specific.
12. Better beats louder every time. One guy running a wine email list with 130,000 subscribers does $30 million a year in revenue. No ads. No social media hustle. Just consistently better.
13. The real opportunity with AI is not making things cheaper. It is making things better. The businesses that use AI to deepen relationships will win. The ones using it to cut costs will race to the bottom.
14. Your job is not to do your job. Your job is to solve problems for other people and make things better by making better things. Everything else is just noise.
15. When AI becomes the buyer it will always choose the cheapest option. If your entire business strategy is being the cheapest, AI will destroy you. The only protection is being worth it in ways that cannot be easily measured.
16. The next level of marketing is permission at a depth nobody has achieved before. The brand that knows your tools, your projects, your needs, and shows up to help without being asked will be impossible to replace.
17. Most businesses will use AI to spam more people faster. The businesses that win will use AI to serve fewer people better. That gap is the biggest opportunity in marketing right now.
18. You have a squadron of summer interns available for twenty dollars a month. They are not that good but they are very eager. The businesses learning to be good bosses of AI right now will have an enormous advantage over everyone waiting to figure it out later.
19. The question every business should be asking is not how do I get more attention. It is how do I become the kind of business that people would genuinely miss if it disappeared tomorrow. That answer is your entire marketing strategy.

212

121K

Fink @datageneralist

2 days ago

@drgurner @Dr_Singularity Maybe it's a latency thing?

Fink @datageneralist

3 days ago

@businessbarista @tenex_labs Do they have to be based in NYC area?

138

Fink @datageneralist

3 days ago

@sh_reya Hiring budgets for AI practitioners will always be much higher than data practitioners. Sounds like a permanent challenge because it's harder to quantify the impact of data practitioners.

datageneralist retweeted

Rudy Havenstein, Senior Markets Commentator.

@RudyHavenstein

3 days ago

EXCLUSIVE: Iran-U.S. Memorandum of Understanding

188

24K

datageneralist retweeted

rahul

@rahulgs

3 days ago

1. as a mental model it is more correct to think of fable+ class models as english -> code interpreters - converts your idea into code into "correct" code regardless of problem complexity and output complexity (diff size). Fable 5 will be the worst of this new class of models

2. diff size/complexity is to be managed purely for review:
- small diffs - in high risk areas of code (auth/identity/data access/network access/money movement)
- large diffs for code that can be empirically verified (frontend/backend plumbing/code without network or db access/performance code that can be empirically verified)

3. time it takes to ship software is completely disconnected from time to produce the PR - how long the work takes depends fully on ability to review/merge code while managing risk at scale

4. solving the bottlenecks for above matter enormously- linters/testing/CI/shadow mode verification/empirical verification

5. agency matters enormously- what are the biggest bottlenecks to speeding up the loop and eliminating them? what are the problems that need solving and when do they need solving? what does it take to the solution to all of them today?

6. deep understanding of the full stack matters enormously- what problems are worth pursuing? is there a higher level of problem abstraction to address first? should I give it the sub-sub task, the sub task, or the task itself. what are the major risks with this PR (order of importance: security holes/correctness holes/performance holes). is there a higher speed way of producing data that allows me to merge this? should this be run in shadow or in a sandbox or a flag. understanding every line of logic may not be needed but understanding and managing risk matters enormously.

7. the cost of complexity itself is changing. it might be now worth "maintaining" 50% more code to get a 5% performance win. getting the right abstractions matter less because larger refactors are less tedious. code quality nits become huge drag. very likely, a much smarter model will be maintaining your code so worth taking on more technical debt now. taking the time to hand architect and rebuild systems comes with an enormous cost of velocity

8. if it quacks like a duck and walks like a duck, it's a duck. For low risk cases, it might be more sane to treat code chunks (services / functions) as a black box, like we do for neural networks: do full empirical verification only: has code produced correct outputs for the last 10,100,1000,10k inputs ? can we quarantine this large piece of code - no outbound access to network / database ? what happens when this code is wrong? do we get hacked/or crash(memory/cpu)/is an inconvenience? is it internal facing or external? what can we do to address these risks?

9. eventually, logical verification (line by line review) will come at an enormous cost- save it for where it matters and build systems that are tolerant to empirical verification. is there a decorator that prevents db / network access? correctness bugs are significantly easier to rectify than access bugs

10. what are the rails that allow for even faster iteration? code permissions can be opt in - db writes, db reads, network egress (to where?), PII access. how long does it take to get shadow mode data? how many PRs can be tested? What are the categories of diffs

147

291K

datageneralist retweeted

Rizèl Scarlett 🇦🇬🇬🇾

@blackgirlbytes

4 days ago

Interesting read from an Anthropic study on how people use Claude Code.

The more domain expertise you have in the task, the more successful you are with agentic coding.

Success was measured by:
- Passing test suites
- PRs/commits that matched the user’s intent

They rated domain expertise based on:
- Use of accurate, field-specific language
- Ability to give precise directions
- Spotting and correcting mismatches / errors

They made this chart to determine the levels of domain expertise based on:
- use of accurate domain knowledge
- ability to give precise directions
- ability to catch errors

11K

datageneralist retweeted

Hamid Bendaas 🇩🇿🇵🇸 @HBendaas

4 days ago

It is funny that some economists were vocally in distress about why a complete closing of the Strait was not having the kind of energy shock they expected and instead of the fundamentals of economics being wrong it was just run of the mill corruption and lying

873

470K

Fink @datageneralist

4 days ago

@YardsPerPass Messi makes the first touch look way too easy considering the pass was a rocket

Fink @datageneralist

4 days ago

@buccocapital It's virtue signaling for VC

Fink @datageneralist

4 days ago

@CNNPR @kylascan Congrats @kylascan !

datageneralist retweeted

Jamin Ball

@jaminball

4 days ago

Awesome keynote by @alighodsi at the @databricks summit. Some takeaways below.

TLDR - Databricks is now a data processing platform, a data platform, an agents platform and an apps platform

Architecture updates

Iceberg / table formats
- v3 is now GA: the unified data layer. Delta & Iceberg files are laid out identically on disk, no rewriting to share across formats
- v4 (targeting Q4 '26) finishes the job by unifying metadata. After that, Delta and Iceberg are effectively one format end-to-end

Lakeflow ingestion (3 modalities, all GA):
- ZeroBus: kills the need for Kafka. Hit one API endpoint with bursty/tiny-row data at high rate, no buffering, no millions of small files
- Spark Real-Time Mode: sub-10ms latency in open-source Spark, closing the gap with Flink (Spark's old micro-batch floor was ~1s)
- Lakeflow Designer: Alteryx-style drag-and-drop; talk to Genie, it generates inspectable, version-controllable Spark under the hood
- 100+ connectors now (Salesforce, Workday, NetSuite, Meta, Google Analytics, and more)

Lakebase + LTAP:
- Lakebase: open-source Postgres on the lake. Serverless autoscaling to zero, plus branching: clone a petabyte DB in <1s via copy-on-write (agents love it)
- LTAP: unifies the transactional (Lakebase) and analytical (Lakehouse) layers, a breakthrough the industry has chased for 40 years
- Reyden: new query engine hitting tens-of-ms latency

Unity AI Gateway: one pane of glass for all AI
- Single entry point for every agent, harness & model
- Commit spend to Databricks, buy tokens directly from OpenAI / Anthropic / Gemini on any cloud
- Budgets + alerts down to the individual level
- Guardrails, auditing, identity management for every agent
- Register any MCP server, authenticate once
- Free + open source (part of Unity Catalog + MLflow)

Genie: The Agents Platform, powered by Ontology

Instead of agents for-looping through your data live (slow, expensive, inaccurate), Genie Ontology runs in the background and builds a knowledge graph of your most important assets across lakehouse + Drive + SharePoint + email + many other sources. It ranks importance using "OntoRank" - basically PageRank for enterprise assets. Databricks' own instance has 4.5M ontology snippets.

That context feeds four agents:
Genie One: universal interface for any business user to ask questions across all data
Genie Agents: turn any Genie conversation into a deployable autonomous agent
Genie Code: coding agent that's elite at data engineering + ML/data science
Genie Zero Ops: monitors pipelines, fixes a 2am break in an isolated branch, pings you for one-click approval

New vertical apps
Lakewatch: agentic SIEM / "security lakehouse" (also acquiring Panther Labs)
Customer Lake: agentic CDP with LLM-powered identity dedupe + one-to-one "infinity campaigns"

117

140

62K

Fink @datageneralist

5 days ago

Fully endorse this message. https://t.co/SaSmRw1tYs
