@BikingEddy @GoogleAI it works with any headphones on any Android device. Currently only in the US, Mexico and India - but we're working on expanding very soon.
@chrisnk14 @GoogleAI yep, if you set the language on the left to "Detect language" it'll translate any language to the language you've selected on the right.
@ainativefirm @OfficialLoganK it's currently available on the google translate app on android with any headphones in the us, mexico and india. we're working on bringing it to ios and other regions soon.
if you don't have headphones connected, you'll hear translations on your phone speaker for both sides of the conversation. if you have headphones connected, the other person can either read text translations on screen or you can set audio output to your speakers for their language.
we're working on making these output settings less confusing though :)
Using Whisper is so 2023. Just use Gemini: pass in the raw audio and prompt the model directly with all the questions you have.
With instructor, we can get:
- The exact mispronounced word
- The timestamp when we did it
- Advice on how to do better
Flash truly is the model that keeps on giving @OfficialLoganK
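A rough sketch of the kind of structured result described above. This is not the poster's code and it doesn't call Gemini or instructor; it only shows, with the standard library, what a typed schema for the three bullets could look like. The class name, field names, and sample payload are all made up for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class PronunciationIssue:
    # Hypothetical field names; they mirror the three bullets in the post.
    word: str            # the exact mispronounced word
    timestamp_s: float   # when it happened, in seconds from the start of the audio
    advice: str          # how to do better


def parse_issues(raw_json: str) -> list[PronunciationIssue]:
    """Check that the payload is a JSON array and convert it to typed objects."""
    items = json.loads(raw_json)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of issues")
    return [
        PronunciationIssue(
            word=str(item["word"]),
            timestamp_s=float(item["timestamp_s"]),
            advice=str(item["advice"]),
        )
        for item in items
    ]


# Made-up payload for illustration only; not real model output.
sample = '[{"word": "croissant", "timestamp_s": 3.2, "advice": "Soften the final t."}]'
for issue in parse_issues(sample):
    print(f"{issue.timestamp_s:.1f}s {issue.word}: {issue.advice}")
```

With a library like instructor the schema would typically be a Pydantic model passed to the client, so the validation step here stands in for what that library handles.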
Gemini 1.5 002 beats OpenAI o1-preview on MATH, and it does it at 1/10th the cost and no thinking time.
When 2024 started, a lot of people were critical of Google falling behind OpenAI. However, since then they have gathered themselves and pulled the right strings, whether in hardware (chips), software (Pixel) or AI models (Gemini, Gemma, AlphaFold, etc.).
Really impressed by how the team at Google DeepMind has outdone itself month over month, bringing superior releases one after the other. Scaling these models at some of the cheapest price points has put them ahead quickly.
Excited to see what more is coming.
r/singularity
u/callmepyro
o1-preview Math benchmark in thread.
ChatGPT's new voice mode is insane for learning a new language.
Literally a private teacher who can correct your pronunciation and help you progress step by step.
By far one of the best use cases.
(Prompt I used below)
Salesperson probably gets a commission that's a % of total sale (piecewise comp) - so they're incentivized to sell most expensive version with all the options.
If they give you all the options with a breakdown of the specs, you're more likely to select a lower spec with only the options you need. They can probably do better and upsell if they used menu effects to their advantage though.
Anyways this kind of info asymmetry is not as much of a problem anymore, you're one search away from all the info you need, even if the search needs to start from a picture of a car you see at the dealership
You can now use your voice to add context to searches in Google Lens!
Press and hold on the shutter button in Lens, and it'll say "speak now to ask about this image." After speaking your question, let go of the button and Google Gemini will attempt to provide an answer.
@SajithCooray@harshap Tried scanning a UPI qr with a LankaQR enabled app too, but didn't work - seems it hasn't been implemented as part of the initial partnership. Not sure why not.
UPI is really everywhere... But it's made transacting as a tourist harder - permanently out-of-order ATMs, (anecdotally) fewer merchants accepting credit cards than in SL, etc.
Makes sense that UPI doesn't extend to tourists without the aadhar/kyc layer. But why not integrate with the bureau of immigration and let me create a prepaid upi ready account online? Are all the questions I answered for the visa app and immigration officer sahib not enough to suggest that I won't be laundering money? If not at least give me a $25/day transaction volume limit lol.
Yes, you can create a prepaid account (PPI) and use UPI with it, but doing so is a pain as far as I can tell - it's only for g20 countries, and you need to visit a PPI issuer who can "perform money exchange operations" in person to set up the account. Seems revolut is trying to solve the problem, but it's wait-list only for now.
Figuring out how to price something like context caching seems like such an interesting problem that requires a pretty deep understanding of the underlying infra and model architecture.
Even more interesting to try and infer/reverse engineer the high level infra and model architecture decisions given the pricing...
Gemini 1.5 Flash continues to be the best value proposition for anyone building with LLMs.
- $0.0875 / 1 million tokens (cache prompts < 128K)
- $0.175 / 1 million tokens (cache prompts > 128K)
- $1.00 / 1 million tokens per hour (cache storage)
Big 🚢 by @shresbm and team!!
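To make the pricing above concrete, here's a small back-of-the-envelope calculator. It's only a sketch: the function name is made up, it assumes "128K" means 128,000 tokens with the boundary falling in the cheaper tier, and it covers just the cache-related prices listed in the post (no regular input or output tokens).

```python
# Prices quoted in the post, in USD per 1 million tokens.
CACHED_READ_SHORT = 0.0875   # cached prompts under 128K tokens
CACHED_READ_LONG = 0.175     # cached prompts over 128K tokens
STORAGE_PER_HOUR = 1.00      # cache storage, per 1M tokens per hour

# Assumption: "128K" is 128,000 tokens and exactly 128,000 gets the cheaper rate.
TIER_BOUNDARY = 128_000
MILLION = 1_000_000


def cache_cost(cached_tokens: int, requests: int, hours_stored: float) -> float:
    """Estimate USD cost of reading a cached prompt `requests` times
    and keeping it in the cache for `hours_stored` hours."""
    rate = CACHED_READ_SHORT if cached_tokens <= TIER_BOUNDARY else CACHED_READ_LONG
    read_cost = cached_tokens / MILLION * rate * requests
    storage_cost = cached_tokens / MILLION * STORAGE_PER_HOUR * hours_stored
    return read_cost + storage_cost


# Example: a 100K-token cached prompt, read 10 times, stored for 1 hour.
print(f"${cache_cost(100_000, 10, 1):.4f}")
```

In that example the reads come to $0.0875 and the hour of storage to $0.10, so storage ends up the larger share unless the cache is hit often.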
@raaidrt Forgot to mention, but lots of merchants also don't have cash for change. So I'd need to carry hard cash in small enough denominations too lol.