GDELT Project

@gdeltproject

The Official Twitter account of the Global Database of Events, Language, and Tone (GDELT) Project

Joined January 2014

2 Following

798 Followers

300 Posts

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Visual Explorer: OCR'ing A Year And A Half Of CSPAN Through Tesseract

To seed further research into the new kinds of insights that could be derived by searching and analyzing the onscreen text of our nation's governance using open OCR tools, today, in collaboration with the Internet Archive's TV News Archive and the multi-party Media-Data Research Consortium, we are releasing a new dataset of nearly a year and a half of Tesseract OCR'd text from CSPAN. It runs from January 1, 2022 through April 30, 2023 and applies Tesseract to each of the every-4-seconds Visual Explorer preview images. In all, 11,192 broadcasts totaling 10,375,897 images, representing 41.5 million seconds of airtime, were OCR'd by Tesseract, yielding 1.5GB of JSON containing 472MB of OCR'd text. https://t.co/AhLPisvu8a
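The airtime figure in the post above follows from the sampling rate of one preview image every 4 seconds. A quick sanity check, using only the numbers quoted in the post (the variable names are illustrative, not part of the dataset):

```python
# Figures quoted in the post above.
IMAGES = 10_375_897        # preview images OCR'd
SECONDS_PER_IMAGE = 4      # one Visual Explorer frame every 4 seconds
BROADCASTS = 11_192

airtime_seconds = IMAGES * SECONDS_PER_IMAGE
images_per_broadcast = IMAGES / BROADCASTS

print(f"{airtime_seconds:,} seconds of airtime")              # 41,503,588
print(f"{airtime_seconds / 1e6:.1f} million seconds")         # 41.5
print(f"about {images_per_broadcast:.0f} images per broadcast")  # about 927
```

At 4 seconds per image, roughly 927 images per broadcast works out to a little over an hour of airtime per broadcast on average.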


gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Fully Autonomous Diplomacy + Counter-Messaging Experiments With ChatGPT + GDELT

Given the ability of Large Language Models (LLMs) like ChatGPT to craft human-like prose, how easily could they be used to fully autonomously watch television news, identify narratives that run counter to US interests and generate articulate and fluent counter-messages for different mediums, ready for distribution and without any human intervention required? Such use cases are extremely ethically fraught, but their inevitable application raises the question of just how easy current tools might make this process and how usable the end results might be.

Overall, the results here suggest that ChatGPT and GDELT can be combined today with just a few lines of code to create a fully automated narrative monitoring and counter-messaging system. The results also suggest that ChatGPT 3.5 lacks the ability to fully recreate the unique voice of non-Western media, especially media systems that feature heavily contextualized narration. Even so, the results are not that far removed from some past human-driven counter-messaging efforts undertaken by Western nations. Most importantly, through proper prompt engineering, additional examples and fine-tuning, one could readily yield an LLM capable of writing in a more authentic voice.

The kind of fully automated counter-messaging workflow presented here raises myriad ethical and moral questions, but the near-certainty of these kinds of workflows proliferating in the immediate term necessitates a better understanding of what such systems might look like, and of their nuances, in order to identify and counter them.

In the end, the idea of a fully automated counter-messaging system is no longer science fiction: it is here today and available with just a few lines of code. https://t.co/BzzTnn02xK


gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

The timeline below compares the percentage of airtime across business television news channels since the start of last year that mentioned President Biden versus Elon Musk, showing that twice last year coverage of Musk nearly equaled that of Biden in a reflection of his outsized media persona. https://t.co/x7auGfBMWc


778

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

WashPost: The TikTok Fight Is A Generational Fight

The Post's Philip Bump includes a graph of mentions across television news using the TV Explorer: https://t.co/cTvOtUI9Tr

580


gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

WashPost: Ray Epps Seeks The Seemingly Impossible: An Apology From Tucker Carlson

References television news coverage of Ray Epps via the TV Explorer: https://t.co/bHOeC3gu17

450

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Spinmeisters Of Russia: The Bucha Massacre Likened To WWII Nazis https://t.co/CuCXcLScEl

374

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Fox News Dominates Mentions Of "Radical" Over The Past Decade

As the timeline and graph below show, Fox News has dominated mentions of the word "radical" over the past decade. https://t.co/EkBzRMETck

https://t.co/f8Y5RWyLJv

312

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Being "Canceled" Took Off In 2020 On Television News But Has Been Fading Since 2021

The timeline below tracks total mentions of "canceled" on television news, showing nearly equal mentions through mid-2020, when the term took off on Fox News, where it has been declining since a peak in March 2021. https://t.co/h49ZbwvX3S


253

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Pandemic Coverage Continues To Fade Across Both Online And Television News https://t.co/LvQ5adxT5m

222

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Mentions of "woke" and "wokeness" surged on Fox News from January 2021, but over the last three months have surged on CNN and MSNBC as well. https://t.co/hmUUUQ21gI

https://t.co/RpOEkFybVB

226

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

"Maga" Mentions Fade On CNN & Fox News But Continue On MSNBC https://t.co/ozeFj860Nx

178

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Mentions of an impending recession continue to fade away on television news channels. https://t.co/0ADGyjclQO

186

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Visual Explorer: Creating Visual Networks Of Facial Co-Occurrences On An Episode Of Russian TV News' 60 Minutes – Revisited

Last week we demonstrated using a simplistic facial extraction and visual clustering pipeline to extract the faces from a single episode of Russia 1's "60 Minutes" and build a co-occurrence graph of who appears alongside whom. To make the pipeline as easy to use as possible, we paired an older face extractor, which is less accurate than modern tools but extremely fast, with a perceptual hash-based clustering postprocessor that groups faces together to track them across frames. The results suggested considerable promise for this analytic approach, but also demonstrated the fundamental limitations of such a simple pipeline. Today we revisit that exploration using a modern face extraction and clustering pipeline that yields vastly more accurate results. https://t.co/JY39VjDiHZ


227

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Adding Confidence Scores To Tracking A Year Of Tucker Carlson On Russia 1's "60 Minutes"

Last month, in collaboration with the Internet Archive's TV News Archive, we demonstrated scanning a year of Russia 1's "60 Minutes" for all appearances of Tucker Carlson. Let's repeat that analysis with a more advanced tool that also generates a distance score between each extracted face and the source face, allowing us to post-filter to remove false positives, identify the strongest matches, etc. https://t.co/i6tRUMNljN
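The post-filtering step described above can be sketched as a threshold on the distance score. This is a minimal illustration, not the tool used in the post: the `FaceMatch` record, its field names, the `0.6` cutoff and the sample values are all hypothetical, and a real threshold would have to be tuned against known matches.

```python
from dataclasses import dataclass

@dataclass
class FaceMatch:
    broadcast_id: str     # hypothetical broadcast identifier
    offset_seconds: int   # position of the sampled frame in the broadcast
    distance: float       # lower means closer to the source face

def post_filter(matches, max_distance=0.6):
    """Keep matches at or below the cutoff, strongest (smallest distance) first."""
    kept = [m for m in matches if m.distance <= max_distance]
    return sorted(kept, key=lambda m: m.distance)

# Made-up candidate matches, for illustration only.
matches = [
    FaceMatch("show_a", 120, 0.31),
    FaceMatch("show_a", 124, 0.72),   # likely false positive, dropped
    FaceMatch("show_b", 900, 0.48),
]
strong = post_filter(matches)
for m in strong:
    print(m.broadcast_id, m.offset_seconds, m.distance)
```

Lowering `max_distance` trades recall for precision: fewer false positives survive, at the cost of discarding weaker true matches.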


204

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Sampling Russian television news broadcasts every 4 seconds and pairwise comparing those "visual ngrams" over an entire broadcast yields a powerful tool for cataloging advertising, identifying key advertising trends across the Russian television news landscape and how the ad economy is adjusting in the face of global sanctions. Using more sophisticated tooling for identifying ad content and using signature-based tracing approaches, it would be possible to fully automatically construct a live catalog of advertising activity across Russian television news to understand the brands, industries, products and services being advertised and how that composition has changed over the past year as the impact of sanctions has continued to build. https://t.co/msrMrEeqkz


160

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

In collaboration with the @internetarchive, the Visual Explorer extracts one frame every 4 seconds from each broadcast to create a "visual ngram" that non-consumptively captures the core visual narratives of the broadcast. What if we took all of those images for a given Russian TV news broadcast and pairwise compared each image to every other image in that broadcast based on pixel-level visual similarity (using a perceptual hash)? The end result would allow us to not only identify contiguous sequences (marking "shot changes"), but, most importantly, to identify repeated content that appears multiple times throughout a broadcast, ranging from a clip that is aired at different points in the broadcast to repeated advertisements. https://t.co/ZQ4YMk2i4U
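The pairwise comparison described above can be sketched with a simple "average hash", used here as a stand-in because the post does not say which perceptual hash was applied. The tiny 2x2 grayscale grids, the `max_distance` and `min_gap` parameters and all function names are assumptions for illustration; real frames would first be decoded and downscaled to a small grid.

```python
from itertools import combinations

def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the frame's mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

def repeated_frames(frames, max_distance=1, min_gap=2):
    """Compare every frame with every other frame; return index pairs that look
    alike but sit at least `min_gap` frames apart (repeats, not one shot)."""
    hashes = [average_hash(f) for f in frames]
    return [
        (i, j)
        for (i, hi), (j, hj) in combinations(enumerate(hashes), 2)
        if j - i >= min_gap and hamming(hi, hj) <= max_distance
    ]

# Hypothetical grayscale frames, one per 4-second sample.
frames = [
    [[200, 10], [10, 200]],   # 0: clip A
    [[198, 12], [11, 205]],   # 1: clip A continues
    [[10, 200], [200, 10]],   # 2: a different shot
    [[201, 9], [12, 199]],    # 3: clip A airs again
]
print(repeated_frames(frames))  # [(0, 3), (1, 3)]
```

Adjacent frames with similar hashes belong to the same shot, so a jump in Hamming distance between neighbors marks a shot change, while similar hashes far apart flag repeated clips or advertisements.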


gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Visualizing Who Appears Alongside Whom On An Episode Of Russian TV News' 60 Minutes

Who appears alongside whom on television news represents a key editorial decision of what voices to pair. From split-screen displays to the back-and-forth of presenters and guests, understanding co-occurrence patterns on television news offers a powerful lens into the underlying narrative storytelling of a broadcast. What if we could analyze such co-occurrence patterns automatically, generating a network visualization of the faces that appear onscreen in the same frame or subsequent frames over an entire broadcast? https://t.co/AHZNj4rAlY
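One way to turn "same frame or subsequent frames" into a network is to count face pairs per sampled frame. The sketch below assumes face detection and clustering have already produced a set of labels for each frame; the labels, the `window` parameter and the function name are hypothetical, and the post's actual pipeline may weight edges differently.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(frames, window=1):
    """Count face pairs seen in the same frame or within `window` later frames.

    `frames` is a list of sets of face-cluster labels, one set per sampled frame.
    """
    edges = Counter()
    for i, faces in enumerate(frames):
        # Pairs sharing the frame itself (for example, a split screen).
        for a, b in combinations(sorted(faces), 2):
            edges[(a, b)] += 1
        # Pairs where one face is followed by another in a subsequent frame.
        for later in frames[i + 1 : i + 1 + window]:
            for a in faces:
                for b in later:
                    if a != b:
                        edges[tuple(sorted((a, b)))] += 1
    return edges

# Hypothetical labels for five consecutive sampled frames.
frames = [{"host"}, {"host", "guest_1"}, {"guest_1"}, {"guest_2"}, {"host"}]
edges = cooccurrence_edges(frames)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weighted edge list can be loaded into any graph tool for visualization; the weights are raw counts, so faces with long back-and-forth exchanges end up with the heaviest edges.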


153

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Yesterday, in collaboration with the @internetarchive's TV News Archive, we announced the availability of more than 1 billion words of transcribed and translated Belarusian, Iranian, Russian and Ukrainian television news broadcasts. How might we examine these transcripts with ChatGPT to understand what a day of Russian television news says about Ukrainian president Volodymyr Zelensky? https://t.co/qJgISDJpx9


152

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

In collaboration with the @internetarchive, more than a billion words of Belarusian, Iranian, Russian and Ukrainian television news are now accessible for narrative analysis: https://t.co/PiKzvy40s4

https://t.co/LTHpajLWNc

gdeltproject retweeted

Kalev Leetaru

@kalevleetaru

about 3 years ago

Rep. Marjorie Taylor Greene's (MTG) disapproval of military support to Ukraine remains popular on Russian state television, such as this excerpt of her CPAC speech and one of her Tucker Carlson appearances. https://t.co/KUO4QpmTQC

https://t.co/h8x0Dek3Fi

169
