I spent 10% of my life contributing to the development of the #VisionPro while I worked at Apple as a Neurotechnology Prototyping Researcher in the Technology Development Group. It’s the longest I’ve ever worked on a single effort. I’m proud and relieved that it’s finally announced. I’ve been working on AR and VR for ten years, and in many ways, this is a culmination of the whole industry into a single product. I’m thankful I helped make it real, and I’m open to consulting and taking calls if you’re looking to enter the space or refine your strategy.
The work I did supported the foundational development of Vision Pro, the mindfulness experiences, ▇▇▇▇▇▇ products, and also more ambitious moonshot research with neurotechnology. Like, predicting you’ll click on something before you do, basically mind reading. I was there for 3.5 years and left at the end of 2021, so I’m excited to experience how the last two years brought everything together. I’m really curious what made the cut and what will be released later on.
Specifically, I’m proud of contributing to the initial vision, strategy and direction of the ▇▇▇▇▇▇ program for Vision Pro. The work I did on a small team helped green light that product category, and I think it could have significant global impact one day.
The large majority of work I did at Apple is under NDA, and was spread across a wide range of topics and approaches. But a few things have become public through patents which I can cite and paraphrase below.
Broadly, a lot of the work I did involved detecting the mental state of users based on data from their body and brain when they were in immersive experiences.
So, a user is in a mixed reality or virtual reality experience, and AI models are trying to predict whether they are feeling curious, mind wandering, scared, paying attention, remembering a past experience, or in some other cognitive state. These states may be inferred through measurements like eye tracking, electrical activity in the brain, heartbeats and rhythms, muscle activity, blood density in the brain, blood pressure, skin conductance, etc.
There were a lot of tricks involved to make specific predictions possible, which the handful of patents I’m named on go into detail about. One of the coolest results involved predicting a user was going to click on something before they actually did. That was a ton of work and something I’m proud of. Your pupil reacts before you click in part because you expect something will happen after you click. So you can create biofeedback with a user's brain by monitoring their eye behavior, and redesigning the UI in real time to create more of this anticipatory pupil response. It’s a crude brain computer interface via the eyes, but very cool. And I’d take that over invasive brain surgery any day.
Other tricks to infer cognitive state involved quickly flashing visuals or sounds to a user in ways they may not perceive, and then measuring their reaction to it.
Another patent goes into detail about using machine learning and signals from the body and brain to predict how focused or relaxed you are, or how well you are learning, and then updating virtual environments to enhance those states. So, imagine an adaptive immersive environment that helps you learn, or work, or relax by changing what you’re seeing and hearing in the background.
All of these details are publicly available in patents, and were carefully written to not leak anything. There was a ton of other stuff I was involved with, and hopefully more of it will see the light of day eventually.
A lot of people have waited a long time for this product. But it’s still one step forward on the road to VR. And it’s going to take until the end of this decade for the industry to fully catch up to the grand vision for this tech.
Again, I’m open to consulting work and taking calls if your business is looking to enter the space or refine your strategy. Mostly, I’m proud and relieved this has finally been announced. It’s been over five years since I started working on this, and I spent a significant portion of my life on it, as did an army of other designers and engineers. I hope the whole is greater than the sum of the parts and Vision Pro blows your mind.
I've seen a lot of people asking "why does everyone think Twitter is doomed?"
As an SRE and sysadmin with 10+ years of industry experience, I wanted to write up a few scenarios that are real threats to the integrity of the bird site over the coming weeks.
I wish every single person in the West would listen to Putin's speech. Obviously, that won't happen, so let me summarise, as someone who has been a professional translator for 10+ years. He states, as he has done from the outset, what his intentions and complaints are in the plainest terms possible.🧵
🧵I will hold a seminar on #IEML (a language with the expressive power of a natural language and the computability of an algebra) over three afternoons (1 pm to 5 pm) on October 24, 25 and 26, 2022 at the Université de Montréal 1/7
I currently don't think that knowledge graphs are suitable as a general medium of representation, because mental representations require dynamicity, generativity and dealing with contradictions. Minds require prelinguistic, executable languages for representation.
Hello World! 🤖
Today we announce the release of our first open-source project, Graphgen.
It's a command line tool used to generate subgraphs for @graphprotocol using annotated Solidity contract files.
Check out the project on our medium post!
https://t.co/59PVip0lzT
Ad hoc loss term to avoid losing palettes during fast zooms. When all the pixels that sample a palette move off screen it seems to crush the palette and dump it into the background, where it can resolve into details.
https://t.co/KZaqKBACwG
#pytti #CLIP
Document parsing meets 🤗 Transformers!
📄#LayoutLMv2 and #LayoutXLM by @MSFTResearch are now available! 🔥
They're capable of parsing document images (like PDFs) by incorporating text, layout, and visual information, as in the @gradio demo below ⬇️
https://t.co/Hr0kFIPHXW
"Hosting SQLite databases on Github Pages" is absolutely brilliant: it adds a virtual filesystem to SQLite-compiled-to-WebAssembly in order to fetch pages from the database using HTTP range requests https://t.co/hs6X96LABZ
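A rough sketch of the idea, not the linked project's actual code: a SQLite file is a sequence of fixed-size pages numbered from 1, so a virtual filesystem can turn "read page N" into an HTTP `Range` header. The function name and the 4096-byte default page size are assumptions for illustration.

```python
def page_range_header(page_number: int, page_size: int = 4096) -> str:
    """Return the HTTP Range header value covering one SQLite page.

    Hypothetical helper: SQLite pages are numbered from 1, and page N
    occupies bytes (N - 1) * page_size through N * page_size - 1.
    """
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    start = (page_number - 1) * page_size
    end = start + page_size - 1  # HTTP byte ranges are inclusive
    return f"bytes={start}-{end}"


# Example: the header a client would send to fetch page 3 of a
# database that uses 1024-byte pages.
print(page_range_header(3, page_size=1024))
```

A static host that honors range requests then returns only those bytes, so a query touches a handful of pages instead of downloading the whole file.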
Here’s our new computer vision system, which achieves state-of-the-art results in image segmentation without needing any labeled training data. The model was trained on random, unlabeled data and reached those results quickly. It’s awesome.