GDELT and NLP: economic and geopolitical analysis
November 5, 2024 · 10 min

In this second part of my study on the use of The GDELT Project, I analyze it through the lens of other papers in economics, econometrics, macroeconomics and geopolitics.
Prediction of macroeconomic indicators improves with GDELT, as Mohammed Elshendy and Andrea Fronzetti Colladon show in Big Data Analysis of Economic News: Hints to Forecast Macroeconomic Indicators (2017): the number of news items, their tone, the network constraint of nations and the swings in their betweenness centrality are important predictors of GDP per capita and of business and consumer confidence indices.
BBVA Research, for its part, used GDELT data to build a notable map of the refugee flows into Europe (January–November 2015) that were a direct consequence of the Syrian war. Big Data has also been used to measure media perception of Chinese stock markets.
In Forecasting US Stock Price Movements Using Convolutional Neural Networks And News Sentiment From GDELT, sentiment extracted from the GDELT Global Knowledge Graph 1.0 was used to predict US stock index movements, filtering the data by organization mentions in articles.
A study of the Saudi market (Predicting Saudi Stock Market Index by Incorporating GDELT Using Multivariate Time Series Modeling) incorporated GDELT's Tone and Social Media Attention time series (obtained via Google BigQuery) to predict the stock index.
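As a rough illustration of how a Tone time series like the ones used in these studies can be derived, here is a minimal sketch. It assumes the rows have already been exported (for instance from BigQuery) as pairs of strings mimicking the GKG 2.x columns DATE (YYYYMMDDHHMMSS) and V2Tone (comma-separated, with the overall tone as the first value). The sample rows and the function name daily_average_tone are made up for the example; this is not the method of any of the papers cited.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up rows mimicking two GKG 2.x columns: DATE (YYYYMMDDHHMMSS) and
# V2Tone (comma-separated; the first value is the overall document tone).
SAMPLE_ROWS = [
    ("20240102093000", "-2.5,1.0,3.5,4.5,20.0,0.5,400"),
    ("20240102151500", "1.5,3.0,1.5,4.5,18.0,0.0,250"),
    ("20240103080000", "-4.0,0.5,4.5,5.0,22.0,1.0,600"),
]


def daily_average_tone(rows):
    """Return {date: mean tone} from (DATE, V2Tone) string pairs."""
    by_day = defaultdict(list)
    for date_str, v2tone in rows:
        day = datetime.strptime(date_str, "%Y%m%d%H%M%S").date()
        tone = float(v2tone.split(",")[0])  # first field = overall tone
        by_day[day].append(tone)
    # Sort by day so the result reads as a time series.
    return {day: mean(tones) for day, tones in sorted(by_day.items())}


series = daily_average_tone(SAMPLE_ROWS)
for day, tone in series.items():
    print(day.isoformat(), tone)
```

A series like this can then be aligned by date with a stock index and passed to a multivariate time series model.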
Methodological aspects and challenges
Integrating GDELT into econometric models is a significant advance: it incorporates unique variables like news sentiment and the frequency of economic events, giving a more nuanced understanding of macroeconomic phenomena.
Natural Language Processing (NLP) and sentiment analysis: NLP techniques are crucial to turn unstructured text into usable numerical data. They include dictionary-assisted algorithms for sentiment (positive/negative, polarization); topic models such as LDA (Latent Dirichlet Allocation) and STM (Structural Topic Model), which summarize texts into semantic structures; and word embedding models like Word2Vec and GloVe, which place words with similar meanings close together. Transformers and LLMs (GPT-3, BERT, LLaMA, Bard, ChatGPT) represent an unprecedented leap: they build context-dependent representations, which helps with polysemy, and the generative ones produce coherent, contextually relevant text by predicting the next token.
Network analysis: GDELT enables network analysis where nodes represent units (companies, countries) and edges indicate relationships (co-occurrence in news). Centrality measures — degree, closeness and betweenness — are used to assess the weight of each node.
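The network idea above can be sketched in a few lines of standard-library Python. The articles below are made up, each reduced to the set of countries it mentions; the function names are my own, and the betweenness routine is a plain implementation of Brandes' algorithm for unweighted, undirected graphs. Treat it as an illustration of the three centrality measures under those assumptions, not as the pipeline used in the papers cited.

```python
from collections import defaultdict, deque
from itertools import combinations

# Made-up articles, each reduced to the set of countries it mentions.
ARTICLES = [
    {"USA", "China"},
    {"USA", "Germany"},
    {"China", "Germany", "USA"},
    {"Germany", "France"},
]


def build_graph(articles):
    """Undirected graph: an edge links two entities that co-occur in an article."""
    graph = defaultdict(set)
    for entities in articles:
        for a, b in combinations(sorted(entities), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other nodes each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


graph = build_graph(ARTICLES)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
for node in sorted(graph):
    print(node, round(degree[node], 2), round(closeness[node], 2), betweenness[node])
```

In this toy network Germany is the only node that sits on shortest paths between other countries, so it is the only one with non-zero betweenness. For real workloads a graph library such as NetworkX offers these measures ready-made.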
The huge volume of textual information GDELT processes would be unmanageable without NLP tools. As Analysis of World Geopolitics through AI from the Spanish Ministry of Defense notes: "thanks to the development of NLP models, text has become one of the main sources of information, and the ability to translate 'Text to Numbers' is becoming a powerful analytical tool in political science and international relations."
Evolution of sentiment analysis
Dictionary- and rule-based approaches: tools like VADER have been useful to analyze short, colloquial texts.
Machine learning models: more sophisticated techniques like Topic Modeling (LDA) and Word Embeddings have enabled deeper, more contextualized analysis. I recommend reading Economics, Markets and Geopolitics: the role of natural language models in social sciences, by Alvaro Ortiz and Tomasa Rodrigo, in Prediction and Economic Decisions with Big Data (Funcas).
Transformers and LLMs: models like GPT-3, BERT, LLaMA or Bard "represent the current state of the art in text analysis in social sciences, deeply understanding and analyzing human language." They interpret nuance, context and sentiment with far greater precision than the earlier approaches.
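To make the first stage of this evolution concrete, here is a minimal sketch of a dictionary-based scorer. The two lexicons are tiny and made up, and the definitions used (tone as positive share minus negative share, polarity as their sum) are assumptions chosen for the example rather than the exact formulas of VADER or GDELT.

```python
import re

# Tiny made-up lexicons; real dictionaries are far larger and domain-specific.
POSITIVE = {"growth", "gain", "recovery", "strong", "agreement"}
NEGATIVE = {"crisis", "loss", "conflict", "weak", "sanctions"}


def dictionary_tone(text):
    """Return (positive %, negative %, tone, polarity) for a text.

    Assumed definitions: tone = positive % - negative %,
    polarity = positive % + negative %.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0, 0.0, 0.0, 0.0
    pos = 100 * sum(w in POSITIVE for w in words) / len(words)
    neg = 100 * sum(w in NEGATIVE for w in words) / len(words)
    return pos, neg, pos - neg, pos + neg


pos, neg, tone, polarity = dictionary_tone(
    "Strong growth offsets the loss from sanctions"
)
print(round(pos, 1), round(neg, 1), round(tone, 1), round(polarity, 1))
```

The sample sentence has as many positive as negative words, so its tone is zero while its polarity is high. That is the kind of emotionally charged but balanced text that a single tone score hides.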
Critical challenges in the Big Data era
Data quality: repositories like GDELT contain noise; the importance of each news item must be weighted.
Bias: always present; training your own model on annotated data is preferable to using an LLM to annotate.
Context ambiguity: methods that ignore word order, such as dictionary and bag-of-words approaches, struggle to capture irony or sarcasm.
Human interpretability and the need for domain-specific dictionaries (to avoid overgeneralizing) also remain open challenges.
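The context-ambiguity problem is easy to reproduce. In this small sketch (the sentences and the helper bag_of_words are made up for illustration), two sentences with opposite meanings collapse into exactly the same bag of words, so any method that only counts words cannot tell them apart.

```python
from collections import Counter


def bag_of_words(text):
    """Count words, discarding order and commas."""
    return Counter(text.lower().replace(",", "").split())


a = "the deal is good, not bad"
b = "the deal is bad, not good"

# Same word counts, opposite meaning.
print(bag_of_words(a) == bag_of_words(b))
```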
Closing
Integrating AI and NLP into economic and geopolitical analysis is reshaping how we understand the world. GDELT-based projects make it possible to generate real-time indicators of political uncertainty, geostrategic tensions and key sectors such as semiconductors. But this progress requires addressing biases, reinforcing transparency and establishing robust regulatory frameworks. AI must rigorously support human judgment, not replace it. Its responsible use will be key to anticipating and responding better to major global challenges.
