The Python Steering Council has voted to remove the "experimental" label from the free-threaded ("nogil") builds for Python 3.14.
Big step towards making them the default in a future version of CPython!
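A quick way to see which kind of interpreter you are running, as a minimal sketch: it assumes CPython 3.13 or later for `sys._is_gil_enabled()` and falls back to "GIL on" when that function is missing. The helper names are made up for illustration.

```python
import sys
import sysconfig


def is_free_threaded_build():
    """True if this interpreter was compiled as a free-threaded build."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_currently_enabled():
    """True if the GIL is active in this process right now."""
    # sys._is_gil_enabled() exists on CPython 3.13+; assume the GIL is on
    # when it is absent (older versions always run with the GIL).
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


print("free-threaded build:", is_free_threaded_build())
print("GIL enabled now:", gil_currently_enabled())
```

Even on a free-threaded build the GIL can be switched back on at runtime, which is why the two checks are kept separate.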
This paper systematically evaluates 14 prompting techniques across 10 Software Engineering tasks using four different LLMs.
Methods 🔧:
→ Performance was measured using task-specific metrics like Accuracy, F1 score, CodeBLEU, and BLEU.
→ Linguistic features such as Lexical Diversity and Token Count were analyzed for correlation with performance.
→ Contrastive explanation was used to identify factors contributing to technique effectiveness.
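A rough sketch of the kind of correlation analysis described in the list above. The prompts and scores are invented, and type-token ratio plus whitespace token counts are stand-ins chosen here; the paper's exact feature definitions may differ.

```python
import math


def token_count(text):
    """Number of whitespace-separated tokens (a simplifying assumption)."""
    return len(text.split())


def type_token_ratio(text):
    """Lexical diversity as unique tokens divided by total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical prompts and task scores, for illustration only.
prompts = [
    "Fix the bug.",
    "Fix the bug in the function below and explain the fix step by step.",
    "You are a senior engineer. Review the code, find the defect, then patch it.",
]
task_scores = [0.52, 0.61, 0.66]

print("r(token count, score):", pearson([token_count(p) for p in prompts], task_scores))
print("r(lexical diversity, score):", pearson([type_token_ratio(p) for p in prompts], task_scores))
```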
----------------------------
Paper - arxiv.org/abs/2506.05614v1
Paper Title: "Which Prompting Technique Should I Use? An Empirical Investigation of Prompting Techniques for Software Engineering Tasks"
Training on wrong answers outpaces training on correct ones.
Each plausible wrong answer can teach a model up to 10 times more than a correct one.
Large language models refine their accuracy slowly when they learn only from correct examples.
This paper introduces Likra, which trains one model head on correct answers and another on incorrect ones and uses their likelihood ratio to choose responses. This approach shows that each plausible wrong example can boost accuracy up to 10 times more than each correct example and sharpens the model’s ability to avoid mistakes.
⚙️ The Core Concepts
The Likra model trains two separate prediction heads on a foundation model. One head learns from correct question-answer pairs and the other learns from incorrect pairs. At inference it compares the two heads' likelihoods for each answer option and selects the option with the highest likelihood ratio, i.e. the one the positive head favors most relative to the negative head.
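The selection step can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the function name is made up, and the per-option log-likelihoods are invented numbers standing in for the outputs of the two heads.

```python
def pick_by_likelihood_ratio(pos_logps, neg_logps):
    """Pick the option with the largest log-likelihood ratio.

    pos_logps[i]: log-likelihood of option i under the head trained on correct answers.
    neg_logps[i]: log-likelihood of option i under the head trained on wrong answers.
    """
    if not pos_logps or len(pos_logps) != len(neg_logps):
        raise ValueError("need two non-empty lists of equal length")
    # log(p_pos / p_neg) = log p_pos - log p_neg
    scores = [p - n for p, n in zip(pos_logps, neg_logps)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical values for four answer options.
pos = [-1.8, -1.9, -3.0, -2.5]  # positive head slightly prefers option 0
neg = [-1.2, -4.2, -1.5, -2.4]  # negative head flags option 0 as a typical mistake

best, scores = pick_by_likelihood_ratio(pos, neg)
print("chosen option:", best)
```

With these made-up numbers the positive head alone would choose option 0, a plausible wrong answer, while the ratio picks option 1 because the negative head rates option 0 as a likely error.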
⚙️ Experimental Results
Supervised fine-tuning on correct answers yields a smooth rise from 60% to 66% accuracy as examples increase. Likra shows a sharp jump after only a few hundred negative examples, reaching over 80% accuracy and outperforming the positive-only model by a wide margin.
🔍 Impact of Near-Miss Examples
Training the negative head with plausible but wrong answers delivers the largest gains. Random irrelevant answers still help, though less dramatically, and unrelated text from different tasks helps even less. This finding echoes the power of near-miss examples to guide learning.
📈 Shaping Model Confidence
The positive head gradually raises the likelihood of correct answers but leaves plausible wrong options relatively high. The negative head strongly lowers the likelihood of incorrect options while treating unrelated text as unlikely. Combining these effects lets the model distinguish correct answers more sharply.
⚖️ Implications
Negative examples reveal latent knowledge in the pretrained model and flip a switch that focuses probability mass on factual answers. This suggests that limited but carefully chosen wrong examples can accelerate learning and reduce hallucinations in language models.
----------------------------
Paper - arxiv.org/abs/2503.14391
Paper Title: "How much do LLMs learn from negative examples?"
Microsoft plans to #opensource the code behind the GitHub Copilot Chat extension under the MIT license in the coming months. They also aim to integrate core AI features directly into VS Code.
🔎 Find out more: https://t.co/JdSNhC9DB5
#InfoQ #SoftwareDevelopment
NVC++ now supports C++ Standard Parallelism, CUDA C++, OpenACC & OpenMP
They're interoperable - even in the same source file
Now you can combine libraries that use different parallel frameworks
Watch my @NVIDIAGTC talk anytime to learn more. It's free!
https://t.co/kooPvgEid4
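According to the "2019 Annual Online Shopping Trends" report, transactions for food services such as delivery orders reached 9.7365 trillion won last year, up 84.6% from the year before. The Fair Trade Commission estimates the 2019 domestic food delivery market at around 20 trillion won, more than twice that figure https://t.co/WOMvoGfFmg A market where the offline era is drawing to a close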