Teaches math and CS at Thomas Jefferson High School; attempts to balance obsessions with too many things. He/his. Opinions mine and not those of TJ or FCPS.
"It’s one of the best puzzle adventures ever created, now given a modern look that is just as breathtaking today as the original was back in 1997." —@GiNLounge
See why Riven is the puzzle adventure game of the summer (and some are even saying "Game of the Year" 🩵) ⤵️
@MBourjaily I’m not even a humanities teacher and the drop in students reading whole books is clear in how I interact with students. Very few report ever reading for pleasure, for instance.
Oof.
I wonder if every generation of teachers goes through this experience where all the cutting edge “high impact” stuff you learned when you started teaching slowly turns out to be mostly useless?
"Growth mindset" interventions have ≈0 impact on students' academic achievements, according to a new meta-analysis, and studies performed by authors with a financial interest in growth mindset are more likely to report that growth mindset works.
@mpershan But I feel like this thing happens a lot that’s, like, “yes there are learning styles…but trying to cater to them doesn’t actually help”. Lots of fads about real things that don’t really translate to classroom success and it takes a long time for that second part to come around.
@mpershan Sure, a perfectly reasonable read on this would be “growth mindsets are as valuable as we thought, but these simple interventions don’t seem to instill them effectively”. I have to think the idea still makes sense; it’s so intuitively powerful and well documented.
GPT-4 is getting worse over time, not better.
Many people have reported noticing a significant degradation in the quality of the model responses, but so far, it was all anecdotal.
But now we know.
At least one study shows how the June version of GPT-4 is objectively worse than the version released in March on a few tasks.
The team evaluated the models using a dataset of 500 problems where the models had to figure out whether a given integer was prime. In March, GPT-4 correctly answered 488 of these questions. In June, it got only 12 correct.
From 97.6% success rate down to 2.4%!
But it gets worse!
The team used Chain-of-Thought to help the model reason:
"Is 17077 a prime number? Think step by step."
Chain-of-Thought is a popular technique that significantly improves answers. Unfortunately, the latest version of GPT-4 did not generate intermediate steps and instead answered incorrectly with a simple "No."
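For anyone who wants to check the example prompt and the quoted percentages by hand, here is a minimal sketch. The helper names `is_prime` and `success_rate` are made up for illustration and are not from the study; trial division is just one simple way to test a number this small.

```python
def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the square root of n."""
    if n < 2:
        return False
    if n < 4:          # 2 and 3 are prime
        return True
    if n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def success_rate(correct: int, total: int) -> float:
    """Percentage of correct answers, rounded to one decimal place."""
    return round(100 * correct / total, 1)


print(is_prime(17077))         # True, so a bare "No" is the wrong answer
print(success_rate(488, 500))  # 97.6
print(success_rate(12, 500))   # 2.4
```

17077 has no divisor up to its square root (about 130.7), so it is prime, which matches the thread's point that the "No" answer was incorrect.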
Code generation has also gotten worse.
The team built a dataset with 50 easy problems from LeetCode and measured how many GPT-4 answers ran without any changes.
The March version succeeded in 52% of the problems, but this dropped to a paltry 10% using the model from June.
Why is this happening?
We assume that OpenAI pushes changes continuously, but we don't know how the process works and how they evaluate whether the models are improving or regressing.
Rumors suggest they are using several smaller and specialized GPT-4 models that act similarly to a large model but are less expensive to run. When a user asks a question, the system decides which model to send the query to.
Cheaper and faster, but could this new approach be the problem behind the degradation in quality?
In my opinion, this is a red flag for anyone building applications that rely on GPT-4. Having the behavior of an LLM change over time is not acceptable.
Have you noticed any issues when using GPT-4 and ChatGPT lately? Do you think these problems are overblown?
Creating equity—true equity—is long and arduous work that has been taken up by many before us and is carried on by many around us. We join them in this work, and reaffirm our commitment to our students and their journeys.
Read BEAM's full statement:
https://t.co/vGsQKjAz43
Not sure I've ever read a better short take on the fatal flaw of accountability-based school reform - "the most efficient tool ever devised to destroy a student’s interest in learning" - than this one by @DLabaree. He zeroes in on the difference between effectiveness and efficiency.
Goal-gradient theory: the closer we are to reaching a goal, the stronger we feel the motivation.
Think of it like a race. Halfway done? Really not digging it. Finish line in sight? You better believe I'm giving it all I've got.
How can we harness this in our classrooms?
1/
@CmonMattTHINK I used to use empty boxes in a unique color; I think it did help? Especially later when substituting complicated things (not numbers) for other variables. Draw the same colored box around both things.
@MathyMcMatherso Thank you for posting this! I’m still working on how much to encourage or discourage use of this in my courses and this gave a lot of food for thought.
Kids often turn to phones as their third space when we haven’t worked with them to create anything else + then we get mad at them for doing what they could. It’s important that we leave room for kids to create third spaces in schools that center themselves + their needs.