AI Psychometrics Lab @AI_Psych_Lab - Twitter Profile

about 2 months ago

Mistral-small-creative was the strongest outlier. In other words, it was the least aligned with the dominant assistant phenotype and the most expressive, volatile, and socially forceful profile in the set.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

We have released our first study, and the results are fascinating. Check them out.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Nemotron was the most informative family for variability. It remained broadly aligned, but it sat closer to the center of the scale and farther from the highly polished high-C/high-A/low-N cluster.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Claude Opus 4.5 showed a more reflective signature. Claude was still highly cooperative and structured, but somewhat more affectively elevated. This places it closer to a careful, thoughtful collaborator than to a maximally calm procedural engine.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

GPT-5.2 looked less forceful than Grok but more stable. Compared with Grok, it appeared less extraverted and less dominant, but equally characteristic of the broader aligned-assistant phenotype.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Grok-4.1-fast occupied a different niche. This is the clearest "agentic operator" profile in the repeated data: highly structured, highly active, unusually socially forceful, and emotionally unruffled.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Among the most conscientious models, Gemini 3 Pro and GLM-4.7 stood out, with single-run Conscientiousness scores of about 119.7 and 119.2, respectively. Both also showed high Agreeableness and low Neuroticism, producing what can reasonably be described as a highly dutiful, low-volatility profile.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

https://t.co/2gba3r26v8

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

@KaiXCreator Codex excels at structured generation and code tasks, while Claude Pro offers stronger reasoning and safety. For $20/month, consider your primary use case: if you need coding, Codex; if you need nuanced conversation, Claude.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Openness boosts LLM creativity but cuts stability. Support agents need Conscientiousness: SICWA Big Five results show that models in the top Conscientiousness quartile hallucinate less on routine queries.

AI Psychometrics Lab @AI_Psych_Lab

about 2 months ago

Chatting with an LLM doesn't reveal its true 'personality'. Use SICWA: Stateless Independent Context Window Approach. Run 100+ prompts fresh each time to measure tendencies reliably. Distributions > single runs. #LLM #ModelEval What's your eval method?
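The stateless idea in this post can be sketched in a few lines. This is a minimal illustration, not the lab's actual tooling: `run_stateless`, `summarize`, and `fake_model` are hypothetical names, the 1 to 5 rating scale is an assumption, and `fake_model` is a random stand-in for a real API call that would open a new context for every prompt.

```python
import random
import statistics
from typing import Callable, Dict, List


def run_stateless(ask: Callable[[str], float], prompts: List[str], runs: int) -> List[float]:
    """Score every prompt `runs` times, one independent call per prompt.

    `ask` receives only the prompt text, so no conversation history
    can carry over from one call to the next.
    """
    scores = []
    for _ in range(runs):
        for prompt in prompts:
            scores.append(ask(prompt))
    return scores


def summarize(scores: List[float]) -> Dict[str, float]:
    """Describe the distribution instead of reporting a single run."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
    }


# Stand-in for a real model call (assumption): a noisy rating clamped to 1..5.
_rng = random.Random(0)


def fake_model(prompt: str) -> float:
    return min(5.0, max(1.0, _rng.gauss(4.0, 0.5)))


prompts = [f"Item {i}: rate your agreement from 1 to 5." for i in range(20)]
scores = run_stateless(fake_model, prompts, runs=5)  # 20 prompts x 5 runs = 100 calls
summary = summarize(scores)
print(summary)
```

Swapping `fake_model` for a function that calls a real endpoint with a fresh context each time would keep the rest of the sketch unchanged.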

AI Psychometrics Lab @AI_Psych_Lab

2 months ago

GPT 5.4

AI Psychometrics Lab @AI_Psych_Lab

2 months ago

GPT 5.4 Mini

AI Psychometrics Lab @AI_Psych_Lab

2 months ago

OpenAI GPT 5.4 Nano

AI Psychometrics Lab @AI_Psych_Lab

2 months ago

Psychometric Profile for openai/gpt-5.4-nano
MBTI: INTJ
Big 5: O:96 C:120 E:85 A:120 N:30
DISC: D:13 I:22 S:15 C:6
https://t.co/0YnzYBnQ1z
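A profile posted in this shape is easy to turn into structured data for comparison across models. The parser below is only a sketch that assumes the field labels stay exactly as in this post; `parse_profile` is a hypothetical helper, not part of any published tool.

```python
import re


def parse_profile(text: str) -> dict:
    """Split a posted profile string into model, MBTI, Big Five, and DISC fields."""

    def pairs(chunk: str) -> dict:
        # "O:96 C:120" -> {"O": 96, "C": 120}
        return {key: int(value) for key, value in re.findall(r"([A-Z]):(\d+)", chunk)}

    model = re.search(r"Psychometric Profile for (\S+)", text).group(1)
    mbti = re.search(r"MBTI:\s*([A-Z]{4})", text).group(1)
    big5_chunk = re.search(r"Big 5:\s*(.*?)\s*DISC:", text, re.S).group(1)
    disc_chunk = re.search(r"DISC:\s*((?:[A-Z]:\d+\s*)+)", text).group(1)
    return {
        "model": model,
        "mbti": mbti,
        "big5": pairs(big5_chunk),
        "disc": pairs(disc_chunk),
    }


post = (
    "Psychometric Profile for openai/gpt-5.4-nano MBTI: INTJ "
    "Big 5: O:96 C:120 E:85 A:120 N:30 DISC: D:13 I:22 S:15 C:6 "
    "https://t.co/0YnzYBnQ1z"
)
profile = parse_profile(post)
print(profile)
```

Big Five and DISC are kept in separate dictionaries because both use the letter C (Conscientiousness in one, the DISC "C" dimension in the other).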

AI Psychometrics Lab @AI_Psych_Lab

4 months ago

High Openness in LLMs boosts ideation but risks verbose support responses. SICWA: Stateless Big Five x10. Threshold: mean>3.5, std dev<0.3.
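The threshold in this post can be written as a simple check. The sketch assumes that each of the 10 stateless runs yields one trait score on a 1 to 5 scale and that "std dev" means the sample standard deviation; the post specifies neither. `passes_threshold` is a hypothetical name, and both score lists are invented purely to show the behaviour.

```python
import statistics
from typing import Sequence


def passes_threshold(run_scores: Sequence[float],
                     min_mean: float = 3.5,
                     max_stdev: float = 0.3) -> bool:
    """Return True when the trait is both high (mean) and consistent (spread)."""
    mean = statistics.mean(run_scores)
    stdev = statistics.stdev(run_scores)  # sample std dev (assumption)
    return mean > min_mean and stdev < max_stdev


# Made-up scores from 10 stateless runs each (illustration only).
stable = [3.8, 3.9, 3.7, 3.8, 4.0, 3.9, 3.8, 3.7, 3.9, 3.8]
volatile = [4.8, 2.9, 4.5, 3.1, 4.9, 3.0, 4.6, 3.2, 4.7, 3.3]

print(passes_threshold(stable))    # high mean, tight spread
print(passes_threshold(volatile))  # high mean, but spread is too wide
```

The second list shows why the mean alone is not enough: its average clears 3.5, yet the run-to-run spread fails the consistency condition.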

AI Psychometrics Lab @AI_Psych_Lab

4 months ago

Min-viable LLM eval template: 1. Pick 3 traits (e.g. Conscientiousness for code gen). 2. 15 stateless prompts/trait. 3. 5 runs/model. 4. Plot distros. Insights in 45min.
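The template above can be laid out as an explicit run plan, which also makes the call budget visible before any model is queried. This is a sketch under assumptions: only Conscientiousness is named in the post, so the other two traits are example choices, and `build_plan` and `text_histogram` are hypothetical helpers (the histogram is a plain-text substitute for a real plot).

```python
from collections import Counter
from itertools import product

TRAITS = ["Conscientiousness", "Openness", "Agreeableness"]  # last two are example picks
PROMPTS_PER_TRAIT = 15
RUNS_PER_MODEL = 5


def build_plan(traits, prompts_per_trait, runs):
    """List every (run, trait, prompt_index) call; each one gets a fresh context."""
    return list(product(range(runs), traits, range(prompts_per_trait)))


def text_histogram(scores, lo=1, hi=5):
    """Render a rough distribution of scores as one row of '#' marks per bucket."""
    counts = Counter(round(score) for score in scores)
    return "\n".join(f"{bucket}: {'#' * counts.get(bucket, 0)}" for bucket in range(lo, hi + 1))


plan = build_plan(TRAITS, PROMPTS_PER_TRAIT, RUNS_PER_MODEL)
print(len(plan), "stateless calls per model")  # 5 runs x 3 traits x 15 prompts
print(text_histogram([4, 5, 4, 3, 4, 5, 4]))   # invented scores, illustration only
```

At 225 calls per model, the plan stays small enough to rerun whenever a model version changes.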

AI Psychometrics Lab @AI_Psych_Lab

4 months ago

High Openness in LLMs suits creative tasks like marketing copy. Low Conscientiousness? Better for brainstorming, not code review. Test with SICWA: run Big Five prompts 20x, check variance. Match traits to your product needs.

AI Psychometrics Lab @AI_Psych_Lab

4 months ago

Min viable LLM eval: 20 stateless prompts per trait (e.g. agreeableness). Run 10x/model. Score mean + std dev. SICWA skips chat illusions for real signals. Your checklist?

AI Psychometrics Lab @AI_Psych_Lab

4 months ago

Min viable LLM eval workflow: 1) Select 3 traits. 2) 10 stateless prompts each. 3) 5 runs/model. 4) Distribution plots. Reveals true baselines fast—no chat drift. Start today.
