@Ryan_Bubear Feels like there should be a defined moderation process for older scores under new rules, for comparability. I've never been a fan of blanket caps or limits, as they take away from data reliability and comparability over time.
McKinsey built an internal AI chatbot (Lilli) trained on 100 years of the consultancy’s work and 100,000 documents and interviews.
MBA job candidates now have to pass a test by showing they can apply Lilli’s advice.
The firm has 40k employees and will soon have the same number of AI agents.
Breaking news
Cedric Nkabinde, who serves as Chief of Staff to Minister of Police Senzo Mchunu, says a group of masked men carrying heavy weapons forced their way into his apartment last night, Wednesday evening around 7 p.m.
His brother, who opened the door, told him the intruders claimed they were acting on instructions from Police Commissioner General Fannie Masemola.
Yup, the age of just autocomplete LLMs is coming to an end... Turns out teaching models more like teaching people has much better outcomes (a leap but a suitable summary)...
This MIT paper just broke my brain.
Everyone keeps saying LLMs can't do real logical reasoning. Turns out we've just been teaching them wrong this whole time.
These researchers built something called PDDL-INSTRUCT that actually teaches models to think through planning problems step by step. Not just pattern matching - actual logical reasoning.
Here's how it works:
Phase 1: show the model correct and incorrect plans with explanations. Basic stuff. Phase 2 is where it gets interesting. They make the model generate explicit reasoning for every single action, then use an external verifier to check if each step is logically sound.
The numbers are wild. Llama-3-8B jumped from 28% to 94% accuracy on planning benchmarks. That's not incremental improvement - that's a completely different capability emerging.
What's smart is they don't trust the model to check its own work. They use VAL, a formal planning verifier, to validate every logical step. When the model screws up, it gets specific feedback about exactly what went wrong.
The two-stage training is clever. First stage focuses purely on better reasoning chains. Second stage optimizes for actually solving the problem. This prevents the model from just gaming the metrics.
One finding caught my attention - detailed feedback destroys binary feedback. Just telling a model "wrong" vs explaining exactly which preconditions failed makes a huge difference. The gap is especially big on complex problems.
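To make the binary vs detailed feedback point concrete, here's a rough sketch. This is NOT VAL and not the paper's code, just a toy STRIPS-style checker I'm assuming for illustration. The `Action` class, `validate_plan`, and the blocks-world facts are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    # Hypothetical STRIPS-style action: facts are plain strings.
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def validate_plan(initial_state, plan, goal):
    """Simulate the plan step by step.

    Returns (ok, binary_feedback, detailed_feedback). A toy stand-in for
    an external verifier, not the real VAL tool.
    """
    state = set(initial_state)
    for i, action in enumerate(plan, start=1):
        missing = sorted(action.preconditions - state)
        if missing:
            detailed = f"step {i} ({action.name}): unmet preconditions {missing}"
            return False, "invalid", detailed
        # Apply the action: remove deleted facts, then add new ones.
        state = (state - action.del_effects) | action.add_effects
    unmet = sorted(set(goal) - state)
    if unmet:
        return False, "invalid", f"plan executes but goal facts {unmet} are not reached"
    return True, "valid", "all steps applicable and goal reached"


# Hypothetical blocks-world fragment.
pick_up_a = Action(
    "pick-up a",
    frozenset({"clear a", "on-table a", "hand-empty"}),
    frozenset({"holding a"}),
    frozenset({"clear a", "on-table a", "hand-empty"}),
)
stack_a_b = Action(
    "stack a b",
    frozenset({"holding a", "clear b"}),
    frozenset({"on a b", "clear a", "hand-empty"}),
    frozenset({"holding a", "clear b"}),
)

initial = {"clear a", "clear b", "on-table a", "on-table b", "hand-empty"}
goal = {"on a b"}

# A bad plan that tries to stack without picking up first.
bad_ok, bad_binary, bad_detailed = validate_plan(initial, [stack_a_b], goal)
# The corrected plan.
good_ok, good_binary, good_detailed = validate_plan(initial, [pick_up_a, stack_a_b], goal)
```

The binary signal for the bad plan is just "invalid". The detailed one names the failing step and the missing fact, which is the kind of signal a model could actually learn from.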
This isn't trying to replace symbolic planners. It's teaching neural networks to reason like symbolic planners while keeping external verification. That's actually sustainable.
The implications go way beyond planning. Any multi-step reasoning task could benefit from this approach. We might finally be seeing how to teach LLMs structured thinking instead of just sophisticated autocomplete.
Makes me wonder what other "impossible" capabilities are just sitting there waiting for the right training approach.
77% of companies are planning to reskill and upskill their existing workers between 2025 and 2030 to better work alongside AI, according to findings published in the WEF’s Future of Jobs Report.
A classic quantum experiment that shows how particles can behave like waves has been demonstrated with atoms for the first time, something that was thought to be impossible. https://t.co/XcSbYZSNax
This coolness arises partly from fear of the opponents, who have the laws on their side, and partly from the incredulity of men, who do not readily believe in new things until they have had a long experience of them.
Niccolò Machiavelli
It ought to be remembered that there is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things.
Because the innovator has for enemies all those who have done well under the old conditions, and lukewarm defenders in those who may do well under the new.
Without a doubt, a great part of the economy has become extremely extractive, we are not at all talking about this enough... It will consume not just economies but likely current iterative civilisations...
BUCKLE UP!! AI agents are capable of cybercrime! 🤯
I just witnessed an agent sign into Gmail, code ransomware, compress it into a zip file, write a phishing email, attach the payload, and successfully deliver it to the target 🙀
Claude designed the ransomware to:
- systematically encrypt user files
- demand cryptocurrency payment for decryption
- attempt to contact a command & control server
- specifically target user data while avoiding system files
cybersecurity is about to get WILD...stay frosty out there frens 🫡
DISCLAIMER: this was done in a controlled environment; do NOT try this at home!
@pmcafrica Biggest disappointment is lack of support from local LG, we've got the bottom unit, but nobody can even tell you which top units are compatible.
@CoruscaKhaya Then for some reason, they also made this one float on water. My 9 y/o German can't take a moderate shower without leaking water all over sensitive electronics.