Satish

@satishkhode

Enthusiastech! Space, Time & Light confuse me! - The Machines are Learning faster than us, & The Future is Artificially Intelligent. #deeplearning

Joined May 2010

601 Following

154 Followers

4.1K Posts

Satish @satishkhode

about 1 month ago

@theBuoyantMan Couldn't agree more.

Satish @satishkhode

4 months ago

Source: The Economic Times https://t.co/8JQvXIoyDy

Satish @satishkhode

11 months ago

https://t.co/uVP2WFRjm5

Satish @satishkhode

over 1 year ago

Unless you are rich, don’t make your child a sportsperson: Badminton coach Pullela Gopichand https://t.co/u377HrwPFW

satishkhode retweeted

Ronald van Loon

@Ronald_vanLoon

over 1 year ago

Biped Robot P1: Mastering Wilderness Navigation by @IntEngineering #ArtificialIntelligence #Robotics #Engineering #Innovation #FutureTech #Autonomous #Technology cc: @amuellerml @marcusborba @sallyeaves

Satish @satishkhode

over 1 year ago

Check out my latest article: Microsoft 365 Copilot: The Future of AI-Powered Productivity https://t.co/VRtBUEeSeu via @LinkedIn

Satish @satishkhode

over 1 year ago

https://t.co/HyOtjyMwkg

Satish @satishkhode

over 1 year ago

https://t.co/CKV4HdwsZf

Satish @satishkhode

over 1 year ago

https://t.co/06VrXVfvSX

satishkhode retweeted

Pat Gelsinger

@PGelsinger

over 1 year ago

Wisdom is learning the lessons we thought we already knew. DeepSeek reminds us of three important learnings from computing history: 1) Computing obeys the gas law. Making it dramatically cheaper will expand the market for it. The markets are getting it wrong, this will make AI much more broadly deployed. 2) Engineering is about constraints. The Chinese engineers had limited resources, and they had to find creative solutions. 3) Open Wins. DeepSeek will help reset the increasingly closed world of foundational AI model work. Thank you DeepSeek team.

291

591K

satishkhode retweeted

Akshat Shrivastava

@Akshat_World

over 1 year ago

Buying a flat in India (as an investment) is a waste of money. This only makes the builder richer, not you. There are 3 specific reasons why you should buy a flat: 1) To live in it (then it's not investing). You love the place, you buy it. 2) You can build a business on it (e.g. an Airbnb); this improves your yield. 3) You are getting a distressed deal, and the long-term rent on it would justify the price. This usually does not happen anymore in any major city in India. The worst reason to buy a flat is: oh, in the last 3 years the prices have doubled, so... you can imagine the next 3 years? Buddy, that's the builder's rate: what they can sell Tower B for now, after making Tower A. It is not the value of your 3-year-old flat, which you purchased in Tower A.

212

227

585

307K

satishkhode retweeted

Ronald van Loon

@Ronald_vanLoon

over 1 year ago

This research from ETH Zurich examines advancements in quadruped #Robots by @ETH #ArtificialIntelligence #MachineLearning #Robotics #RPA #ML cc: @yuhelenyu @bernardmarr @alvinfoo

satishkhode retweeted

Ronald van Loon

@Ronald_vanLoon

over 1 year ago

Mind-Controlled Wheelchair by @supercarblondie #AI #HealthTech #Technology #Innovation cc: @ylecun @bernardmarr @amuellerml

Satish @satishkhode

over 1 year ago

View my verified achievement from @GoogleCloudTech. https://t.co/7hzL0YaoNd via @credly

Satish @satishkhode

over 1 year ago

Hindi vs Kannada: Viral video shows Bengaluru autos charging Hindi-speaking woman more money https://t.co/ZL2AXDCRkS

satishkhode retweeted

Andrew Ng

@AndrewYNg

almost 2 years ago

After a recent price reduction by OpenAI, GPT-4o tokens now cost $4 per million tokens (using a blended rate that assumes 80% input and 20% output tokens). GPT-4 cost $36 per million tokens at its initial release in March 2023. This price reduction over 17 months corresponds to about a 79% drop in price per year. (4/36 = (1 - p)^{17/12}) As you can see, token prices are falling rapidly!
One force that’s driving prices down is the release of open-weight models such as Llama 3.1. If API providers, including startups Anyscale, Fireworks, Together AI, and some large cloud companies, do not have to worry about recouping the cost of developing a model, they can compete directly on price and a few other factors such as speed.
Further, hardware innovations by companies such as Groq (a leading player in fast token generation), SambaNova (which serves Llama 3.1 405B tokens at an impressive 114 tokens per second), and wafer-scale computation startup Cerebras (which just announced a new offering this week), as well as the semiconductor giants NVIDIA, AMD, Intel, and Qualcomm, will drive further price cuts.
When building applications, I find it useful to design to where the technology is going rather than only where it has been. Based on the technology roadmaps of multiple software and hardware companies (which include improved semiconductors, smaller models, and algorithmic innovation in inference architectures), I’m confident that token prices will continue to fall rapidly.
This means that even if you build an agentic workload that isn’t entirely economical, falling token prices might make it economical at some point. As I wrote previously, being able to process many tokens is particularly important for agentic workloads, which must call a model many times before generating a result. Further, even agentic workloads are already quite affordable for many applications. Let's say you build an application to assist a human worker, and it uses 100 tokens per second continuously: At $4/million tokens, you'd be spending only $1.44/hour, which is significantly lower than the minimum wage in the U.S. and many other countries.
So how can AI companies prepare?
- First, I continue to hear from teams that are surprised to find out how cheap LLM usage is when they actually work through cost calculations. For many applications, it isn’t worth too much effort to optimize the cost. So first and foremost, I advise teams to focus on building a useful application rather than on optimizing LLM costs.
- Second, even if an application is marginally too expensive to run today, it may be worth deploying in anticipation of lower prices.
- Finally, as new models get released, it might be worthwhile to periodically examine an application to decide whether to switch to a new model either from the same provider (such as switching from GPT-4 to the latest GPT-4o-2024-08-06) or a different provider, to take advantage of falling prices and/or increased capabilities.
Because multiple providers now host Llama 3.1 and other open-weight models, if you use one of these models, it might be possible to switch between providers without too much testing (though implementation details, specifically quantization, do mean that different offerings of the model differ in performance). When switching between models, unfortunately, a major barrier is still the difficulty of implementing evals, so carrying out regression testing to make sure your application will still perform after you swap in a new model can be challenging. However, as the science of carrying out evals improves, I’m optimistic that this will become easier.
[Original text (with links): https://t.co/txk7q32EXn ]

113

613

743K
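
The arithmetic in Andrew Ng's post above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the per-token input and output prices passed to `blended_rate` ($2.50 and $10.00 per million) are an assumption that does not appear in the post; they are simply values that reproduce the quoted $4 blended figure under the stated 80/20 split. The $36, $4, 17 months, and 100 tokens per second figures come from the post itself.

```python
def blended_rate(input_price, output_price, input_share=0.8):
    """Blended $ per million tokens for a given share of input tokens."""
    return input_share * input_price + (1 - input_share) * output_price


def annual_price_drop(old_price, new_price, months):
    """Solve new/old = (1 - p) ** (months / 12) for the yearly drop p."""
    return 1 - (new_price / old_price) ** (12 / months)


def hourly_cost(tokens_per_second, price_per_million):
    """Dollar cost of one hour of continuous token usage."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour * price_per_million / 1_000_000


if __name__ == "__main__":
    # Assumed (hypothetical) split prices that yield the post's $4 blend.
    print(f"blended rate: ${blended_rate(2.50, 10.00):.2f} per million tokens")
    # Figures quoted in the post: $36 -> $4 over 17 months.
    print(f"annual drop:  {annual_price_drop(36, 4, 17):.1%}")
    # Figures quoted in the post: 100 tokens/s at $4 per million tokens.
    print(f"hourly cost:  ${hourly_cost(100, 4):.2f}")
```

Rerunning `hourly_cost` with a different price or token rate is a quick way to do the "work through cost calculations" step the post recommends before spending effort on cost optimization.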

Satish @satishkhode

almost 2 years ago

https://t.co/gDTDx4EiwI

satishkhode retweeted

Andrew Ng

@AndrewYNg

almost 2 years ago

Thank you Meta and the Llama team for your huge contributions to open-source! Llama 3.1 with increased context length and improved capabilities is a wonderful gift to everyone. I hope foolish regulations like California's proposed SB1047 don't stop such innovations.

206

105K

satishkhode retweeted

@_akhaliq

almost 2 years ago

The Art of Saying No: Contextual Noncompliance in Language Models
Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work primarily focuses on refusal of "unsafe" queries, we posit that the scope of noncompliance should be broadened. We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests. Our taxonomy spans a wide range of categories including incomplete, unsupported, indeterminate, and humanizing requests (in addition to unsafe requests). To test noncompliance capabilities of language models, we use this taxonomy to develop a new evaluation suite of 1000 noncompliance prompts. We find that most existing models show significantly high compliance rates in certain previously understudied categories with models like GPT-4 incorrectly complying with as many as 30% of requests. To address these gaps, we explore different training strategies using a synthetically-generated training set of requests and expected noncompliant responses. Our experiments demonstrate that while direct finetuning of instruction-tuned models can lead to both over-refusal and a decline in general capabilities, using parameter-efficient methods like low-rank adapters helps to strike a good balance between appropriate noncompliance and other capabilities.

204

108

39K

satishkhode retweeted

Alvin Foo

@alvinfoo

almost 2 years ago

How to send an 'e-mail' in 1984.
