@DebrajBasu11 @ValerioGherardi @shunishiha Yes, for R you can find it in my ‘idiolect’ package https://t.co/1KDMCNqEuc. For Python it’s coming soon, hopefully with the published article.
I'm extremely excited to announce the pre-print of our new paper: "Authorship Verification based on the Likelihood Ratio of Grammar Models"
https://t.co/ReZmvWuW0K
with Oren Halvani, Lukas Graner, @ValerioGherardi and @shunishiha
We are looking for a full-time, temporary, teaching-focused Lecturer in Discourse Analysis/Stylistics/Register Analysis/Forensic Linguistics. Details of the job are available here: https://t.co/vZnjYHaAYd. Deadline 12th Dec. Please retweet/share!
Continuing our CPD Training Programme series, here's Forensic Linguistics for A-level English teachers!
Join us on Friday, 29th of November 2024 for a one-day course where we will explore how topics in forensic linguistics can be used to support teaching
Here are my three lectures titled "Probability for Language Modeling" delivered for the Quantitative Cognitive Linguistics Network:
https://t.co/NserL4tfrS
https://t.co/4nDlIsMkw8
https://t.co/afTCQGSCW4
Thanks to @and_nini @therfer @rljfutrell @JiriMilicka for listening!
Preprint of my introduction to LLMs for quantitative and corpus linguists is out. It is mostly an advertisement for @repligate's simulator theory, but there are some hopefully original ideas as well. https://t.co/OiFYd6tiBw
I'm pleased to announce the release of version 1 of "idiolect", my new package to carry out Forensic Authorship Analysis using R. The website of the package (https://t.co/1KDMCNqEuc) contains a Get Started page with a brief tutorial.
The package contains functions that cover the typical workflow for authorship analysis for forensic problems: 1) Input and preprocess data; 2) Carry out an analysis (Delta, N-gram Tracing, the Impostors Method, LambdaG); 3) Test the performance of the methods on ground truth data;
4) Apply the method to the questioned text and calibrate a Likelihood Ratio; 5) Explore the data using feature importance or other visualisations depending on the method, including using concordances.
Let me know if you have any feedback or questions!
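Step 3 of the workflow above, testing performance on ground truth data, is often summarised in likelihood-ratio work with the log-likelihood-ratio cost (Cllr). The thread does not say which metric the package reports, so the sketch below is only an illustration in Python, not the package's API. The function name `cllr` and the example values are hypothetical.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost for a set of likelihood ratios (not logs).

    Lower is better. A system that always outputs LR = 1 scores exactly 1.
    """
    if not same_author_lrs or not different_author_lrs:
        raise ValueError("need at least one LR for each kind of trial")
    # Same-author trials are penalised when the LR is small.
    same_penalty = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    same_penalty /= len(same_author_lrs)
    # Different-author trials are penalised when the LR is large.
    diff_penalty = sum(math.log2(1 + lr) for lr in different_author_lrs)
    diff_penalty /= len(different_author_lrs)
    return 0.5 * (same_penalty + diff_penalty)


# Hypothetical LRs from a validation run on ground truth data.
same_author = [100.0, 50.0]       # trials where the author really is the same
different_author = [0.01, 0.02]   # trials where the authors really differ
print(cllr(same_author, different_author))
```

A value well below 1 suggests the likelihood ratios carry useful information, while a value above 1 suggests they are misleading on average.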
I have a post-doc position available for 5 months in this project: https://t.co/7iiw6QeTna
Remote work with occasional visits to Konstanz can be considered. The starting date is 1 December or soon thereafter.
Send me your CV with an informal motivation email if interested!
To help explain the weirdness of LLM tokenization I thought it could be amusing to translate every token to a unique emoji. This is a lot closer to the truth: each token is basically its own little hieroglyph and the LLM has to learn (from scratch) what it all means based on training data statistics.
So have some empathy the next time you ask an LLM how many letters 'r' there are in the word 'strawberry', because your question looks like this:
👩🏿❤️💋👨🏻🧔🏼🤾🏻♀️🙍♀️🧑🦼➡️🧑🏾🦼➡️🤙🏻✌🏿🈴🧙🏽♀️📏🙍♀️🧑🦽🧎♀🍏💂
Play with it here :)
https://t.co/pFQGZIAW1k
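The idea can be sketched in a few lines of Python. This is a toy illustration, not the linked demo: it assumes a simple regex word/punctuation tokenizer in place of a real LLM vocabulary, and the names `tokenize`, `build_emoji_vocab` and `to_emoji` are hypothetical.

```python
import re

# Assumption: a toy tokenizer. Real LLMs use a learned subword vocabulary.
# A token may carry one leading space, as many real tokenizers do.
TOKEN_PATTERN = re.compile(r"\s?\w+|\s?[^\w\s]")

EMOJI_BASE = 0x1F600   # first code point of the Unicode "Emoticons" block
EMOJI_COUNT = 80       # the block spans U+1F600..U+1F64F


def tokenize(text):
    """Split text into toy tokens (words or punctuation marks)."""
    return TOKEN_PATTERN.findall(text)


def build_emoji_vocab(tokens):
    """Assign each distinct token its own emoji, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            if len(vocab) >= EMOJI_COUNT:
                raise ValueError("toy emoji pool exhausted")
            vocab[token] = chr(EMOJI_BASE + len(vocab))
    return vocab


def to_emoji(text):
    """Render text as the model 'sees' it: one opaque symbol per token."""
    tokens = tokenize(text)
    vocab = build_emoji_vocab(tokens)
    return "".join(vocab[token] for token in tokens)


print(to_emoji("How many letters 'r' are in the word 'strawberry'?"))
```

In this sketch "the" and " the" (with a leading space) get different emoji, and nothing in the output reveals which letters a token contains, which is why letter-counting questions are awkward for a model.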
Applications are open for this year's Robins Prize competition! We encourage all student members to submit an article for the prize (on a subject within the area of PhilSoc's interests). The deadline is 30th Nov and further details are available here: https://t.co/A42hfoMKJs
I'm very excited to share the pre-print for our new position paper!
The Sociolinguistic Foundations of Language Modeling
https://t.co/rkXE4h64Ur
A quick 🧵 summing up our main claims...
Want to use computational tools to figure out how human language works? But not ready for a PhD? UC Irvine's new post-bacc program in computational language science bridges the gap. Fall 2024 applications now open! https://t.co/dP1QpeSiOJ
And, last but not least, here are the slides of the talk I gave with @GeorgeBrownLing and Christin Kirchhübel at #IAFLL24, "Likelihood ratio based authorship verification methods applied to forensic voice comparison": https://t.co/BTMsqAlZ5v