Unitree Unveils: GD01, A Manned Transformable Mecha, from $650,000 👏
The world's first production-ready manned mecha. It can transform. It's a civilian vehicle. It weighs ~500kg with you inside.
Please everyone be sure to use the robot in a Friendly and Safe manner.
@pvncher @RepoPrompt Amazing! What's the best way to run this flow? I've been using 'Discover' to plan in Codex on GPT-5-codex-high, then switching to 'Agent' or 'Pair' to execute the plan with Claude Code. What is your current plan-then-execute flow, with Claude as your PA?
The amount of superstition involved in prompting LLMs is quite fantastic
That "you are an expert in field X..." trick? Likely a complete waste of time since late 2022
Not a lot of people understand this... but you actually don’t have to have an opinion about everything. You don’t have to decide if something is good or bad.
Marcus Aurelius says limiting the number of opinions we have is one of the most powerful things we can do in life.
Prompt engineering is one of the most rapidly-evolving research topics in AI, but we can (roughly) group recent research on this topic into four categories…
(1) Reasoning: Simple prompting techniques are effective for many problems, but more sophisticated strategies are required to solve multi-step reasoning problems.
- [1] uses zero-shot CoT prompting to automatically generate problem-solving rationales to use for standard CoT prompting.
- [2] selects CoT exemplars based on their complexity (exemplars that have the maximum number of reasoning steps are selected first).
- [3] improves CoT prompting by asking the LLM to progressively refine the generated rationale.
- [4] decomposes complex tasks into several sub-tasks that can be solved via independent prompts and later aggregated into a final answer.
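To make the complexity-based selection in [2] concrete, here is a minimal sketch. It assumes each rationale puts one reasoning step per line, so "complexity" is just a line count; the `Exemplar` type and the helper names are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exemplar:
    question: str
    rationale: str  # assumption: one reasoning step per line
    answer: str


def count_steps(rationale: str) -> int:
    """Count non-empty lines as reasoning steps (an illustrative proxy)."""
    return sum(1 for line in rationale.splitlines() if line.strip())


def select_complex(exemplars: List[Exemplar], k: int) -> List[Exemplar]:
    """Pick the k exemplars with the most reasoning steps."""
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(exemplars, key=lambda e: count_steps(e.rationale), reverse=True)
    return ranked[:k]


def build_prompt(exemplars: List[Exemplar], question: str) -> str:
    """Assemble a few-shot CoT prompt from the chosen exemplars."""
    parts = [
        f"Q: {e.question}\nA: {e.rationale}\nThe answer is {e.answer}."
        for e in exemplars
    ]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
```

The resulting prompt string would then be sent to whichever LLM you use; that call is left out on purpose.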
(2) Tool Usage: LLMs are powerful, but they have notable limitations. We can solve many of these limitations by teaching the LLM how to leverage external, specialized tools.
- [5, 6] finetune a language model to teach it how to leverage a fixed, simple set of text-based APIs when answering questions.
- [7] uses a central LLM-based controller to generate a program—written in natural language—that composes several tools to solve a complex reasoning task.
- [8] uses a retrieval-based finetuning technique to teach an LLM to adaptively make calls to APIs based on their documentation when solving a problem.
- [9] uses an LLM as a central controller for leveraging a variety of tools in the form of deep learning model APIs.
- [10, 11] integrate code-capable LLMs with a sandboxed Python environment to execute programs when solving problems.
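The program-aided idea in [10, 11] can be sketched roughly as follows: the LLM writes a small Python program and the interpreter, not the model, computes the answer. The `generated` string below stands in for model output (no LLM is called), and `run_generated_program` is a hypothetical helper. Note that `exec` with emptied builtins is NOT a real sandbox; production use would need process-level isolation.

```python
def run_generated_program(source: str, result_name: str = "answer"):
    """Execute model-written code and return the variable it leaves behind.

    Illustration only: stripping builtins limits what the code can call,
    but it is not a security boundary.
    """
    namespace = {"__builtins__": {}}
    exec(source, namespace)
    return namespace[result_name]


# Stand-in for what a code-capable LLM might emit for a word problem.
generated = """
apples_start = 23
apples_used = 20
apples_bought = 6
answer = apples_start - apples_used + apples_bought
"""

result = run_generated_program(generated)
print(result)
```

The point of the design is that arithmetic errors move from the model to the interpreter, which does not make them.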
(3) Context Window: Given the emphasis of recent LLMs on long contexts for RAG / few-shot learning, the properties of context windows and in-context learning have been studied in depth.
- [12] shows that including irrelevant context in the LLM’s prompt can drastically deteriorate performance.
- [13] finds that LLMs pay the most attention to information at the beginning/end of the prompt, while information placed in the middle of a long context is forgotten.
- [14] proposes a theoretically-grounded strategy for optimally selecting few-shot exemplars.
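One practical reaction to the finding in [13] is to reorder retrieved passages so the most relevant ones sit at the edges of the prompt. The sketch below is a heuristic motivated by that result, not a method proposed in the paper; it assumes the input list is already sorted from most to least relevant, and `reorder_for_edges` is a hypothetical name.

```python
from typing import List, TypeVar

T = TypeVar("T")


def reorder_for_edges(docs_by_relevance: List[T]) -> List[T]:
    """Place the most relevant items at the start and end, the least in the middle."""
    front: List[T] = []
    back: List[T] = []
    for i, doc in enumerate(docs_by_relevance):
        # Alternate: rank 1 to the front, rank 2 to the back, rank 3 to the front...
        (front if i % 2 == 0 else back).append(doc)
    # Reverse the back half so rank 2 ends up as the very last item.
    return front + back[::-1]
```

With five passages ranked d1 (best) to d5 (worst), d1 lands first, d2 lands last, and d5 ends up in the middle.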
(4) Better Writing: One of the most popular use-cases of LLMs is for improving human writing, and prompt engineering can be used to make more effective writing tools with LLMs.
- [15] improves the writing abilities of an LLM by first generating an outline and then filling in each component of the outline one-by-one.
- [16] uses a smaller LLM to generate a “directional stimulus” (i.e., a textual hint) that can be used as extra context to improve an LLM’s writing ability on a given task.
- [17] improves the quality of LLM-generated summaries by iteratively prompting the LLM to increase the information density of the summary.
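The iterative densification in [17] can be approximated with a simple loop. This is a sketch under assumptions: `llm` is any callable you supply that maps a prompt string to a completion string, the prompt wording is made up for illustration, and splitting the process into separate calls per round is a design choice here rather than something the thread specifies.

```python
from typing import Callable, List


def chain_of_density(article: str, llm: Callable[[str], str], rounds: int = 3) -> List[str]:
    """Return the initial summary followed by one denser rewrite per round."""
    summaries = [llm(f"Write a short summary of the article.\n\nArticle:\n{article}")]
    for _ in range(rounds):
        prompt = (
            "Rewrite the summary at the same length, adding 1-3 important "
            "entities from the article that are missing from it.\n\n"
            f"Article:\n{article}\n\nCurrent summary:\n{summaries[-1]}"
        )
        summaries.append(llm(prompt))
    return summaries
```

Keeping every intermediate summary makes it easy to compare rounds and pick the density that reads best.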
-----
Bibliography
[1] Automatic chain of thought prompting in large language models.
[2] Complexity-based prompting for multi-step reasoning.
[3] Progressive-hint prompting improves reasoning in large language models.
[4] Decomposed prompting: A modular approach for solving complex tasks.
[5] Toolformer: Language models can teach themselves to use tools.
[6] GPT4Tools: Teaching large language model to use tools via self-instruction.
[7] Chameleon: Plug-and-play compositional reasoning with large language models.
[8] Gorilla: Large language model connected with massive APIs.
[9] HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face.
[10] PAL: Program-aided language models.
[11] Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks.
[12] Large language models can be easily distracted by irrelevant context.
[13] Lost in the middle: How language models use long contexts.
[14] Large language models are latent variable models: Explaining and finding good demonstrations for in-context learning.
[15] Skeleton-of-thought: Large language models can do parallel decoding.
[16] Guiding large language models via directional stimulus prompting.
[17] From sparse to dense: GPT-4 summarization with chain of density prompting.
I recommend having a list of things AI can almost do well, but still fail at right now
These are your personal benchmarks, and the only real way to understand if new models (think GPT-5, Gemini 2.0, etc.) are actually a leap forward in your context or just an incremental change
You can now run one of the hottest LLMs on your phone!
It works on both Android and iPhone. Look at the attached video: A model working natively at 20 tokens/sec on the phone!
This is Google's latest Gemma 2B model. It runs on a phone thanks to the MLC-LLM open-source project, which lets you deploy any large language model natively, using native APIs and compiler acceleration.
Check out their repository here:
https://t.co/R2151gLruO
MLC-LLM is a collaboration between @OctoAICloud, Carnegie Mellon, and the broader open-source community. Check out this blog post where they showcase the Gemma 2B running on the phone:
https://t.co/JM5LnFfr3h