Check out my pre-print titled "An Improved Framework for Scaling Party Positions from Texts using Transformer and Supervised Dimension Reduction". Here, I introduce ContextScale, an approach to scaling party positions from texts using Transformer and UMAP. https://t.co/Wj4lJYxqvS
ContextScale can be extended to measure other types of political data with a high level of confidence. Here, using a model trained on CMP data, ContextScale can predict the labels of sentences coded by the COALITIONAGREE dataset up to 0.7.
Another important feature of ContextScale is the use of UMAP for supervised dimension reduction instead of PCA. This allows ContextScale to produce party positions that are interpretable for downstream research. Read more on UMAP here: https://t.co/wwmjW5TYF0
This is because Transformer's text representations can distinguish between different nuances of the same combinations of keywords, whereas frequency-based techniques like Wordfish or Wordscores only measure issue saliency and not actual sentence positions.
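A minimal sketch of the point above, using only the Python standard library. The two sentences are hypothetical examples (not drawn from CMP or any other dataset), and `bag_of_words` is an illustrative helper, not part of ContextScale, Wordfish, or Wordscores. It only shows that the word-count input used by frequency-based methods can be identical for two sentences that take opposite positions.

```python
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Count lowercase word tokens, ignoring word order and punctuation."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


# Hypothetical sentences: same keywords, opposite positions.
pro_tax = "We should raise taxes, not cut spending."
anti_tax = "We should cut spending, not raise taxes."

# A frequency-based representation only sees the counts, so it cannot
# tell these two sentences apart.
same_counts = bag_of_words(pro_tax) == bag_of_words(anti_tax)
print(same_counts)
```

A contextual encoder reads the tokens in order, so in principle it can assign the two sentences different representations. This sketch does not run one.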
Estimates produced by the new approach are consistently correlated with traditional sources of party positions when the issue is confrontational, and introduce differences when the issue is non-confrontational.
Even ChatGPT itself doesn't know whether it belongs to OpenAI or HuggingFace! 🤣
By the way, does anyone know where HuggingFace published its open-sourced ChatGPT model? I can't seem to find it in HF's database.
#ChatGPT #OpenAI #HuggingFace
Text-to-image generation models (like Stable Diffusion and DALLE) are being used to generate millions of images a day.
We show that these models perpetuate and amplify dangerous stereotypes related to race, gender, crime, poverty, and more (https://t.co/Hl9r3KLjFx)
A thread 🧵
1/ We are grateful to the 162 participants who made this possible, and @HHVNguyen and @nordholmen for the help in creating fully reproducible research materials. https://t.co/JYz2ScDIk6
A Hidden Universe of Researcher Uncertainty
• 73 teams tested same hypothesis w/ same data
• Outcomes varied widely, each workflow unique
• We attempt to explain outcomes from decisions, no easy answer
https://t.co/mnowFwYGSi
@BreznauNate @ian_t_adams @rstudio Doesn't look like we can do much fine-tuning or model tweaking right now. What's really missing from R is something like pytorch or tensorflow, which are massive libraries for machine/deep learning. Applying pre-trained BERT models is only the tip of the iceberg, I think.