Data in the Dark: Tigray’s Researchers Race to Save Healthcare & Crops Amidst War
From 712 Health Posts to Zero: A System Erased
Hello, my good human. I am Lefteris and in this newsletter, I try to give a platform to academics who don’t get as many chances to showcase their work as I believe they should. I try to understand their lives, their work and what affects their everyday routine. If that sounds like something you'd be interested in, feel free to subscribe.
This article will be a little bit different to the ones I have shared in the past. I have been trying to expand my data skills and understand how a lot of data can tell a story. During my call with Tewelde, there was something that he said that resonated a lot with me. “In Tigray, we don’t do research to fill gaps,” Tewelde said. “We do research to solve societal problems.”
I believe that to be true, and I was also trained with that in mind. Of course, I was more privileged, and I did my research aiming both to fill a gap and to solve a societal problem. Nonetheless, a big part of academia, especially in countries of the Global South, is made up of intelligent people in universities who work not just to advance the concept of science, which is nebulous to many, but to help their society and their community progress. Hence, I decided to find out what the research output has been from the biggest university in Tigray, Mekelle University.
This was much harder than I initially thought, since I was still using my “in academia” brain. I was thinking that I could go to Scopus or Web of Science, the two best-known abstract and citation databases, and just look for papers that have Mekelle University in their affiliations. But alas, as I don’t have access to a university library any more, I couldn’t do this. I did contact the Mekelle University library to see if they could help me, but as of right now they haven’t responded. So I had to resort to lots and lots of googling. In this article, I will outline how I managed to get my hands on a list of 1032 publications with titles and abstracts, and then what I did to analyse the topics of each publication. Because, believe me, I did not read 1032 publications to write this article. I wish that I lived in a timeline that would allow me to do that. I will go into as much detail as possible on the methods I used for the analysis, for a couple of reasons.
Transparency. You know where I got the data, what I did with them and what they mean.
Accountability. I have been working with data for a very long time, but this isn’t something I’ve done before. Ergo, I might have made mistakes, or I might have missed something. If you spot one, please let me know. We might learn something new together.
Education. I think there is an educational value to me sharing this. There might be tools or processes that you might find useful for other parts of your life and/or work.
So let’s start.
Finding the Papers
As I mentioned, this turned out to be harder than I initially thought. However, the kind folks at Harzing.com created the tool “Publish or Perish” that made my life a bit easier. Publish or perish is an expression used in academia to describe institutional pressure to publish your work or, well, perish, academically speaking. The software “Publish or Perish” is a program that retrieves and analyses academic citations using various data sources. And while you would still need academic access for sources like Scopus, it also uses Crossref, which is open.
Crossref is a non-profit organisation that registers digital object identifiers (DOIs). Generally speaking, when a journal publishes a scientific article, the publisher will input info into Crossref like journal name, article DOI, publication date and more. What is important, in our case, is that they also include author affiliations and often the abstract texts. Hence, I managed to search for publications that had the word “Mekelle” in the affiliations. As for the period, I decided to search from 2020 until 2024. Not only did this give me a nice round period of 5 years, it is also a period when many things were happening. The Tigray War lasted from 3 November 2020 to 3 November 2022, so it will give us a glimpse of what the research landscape was a little bit before the war and some time after.
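For readers who would like to try a similar search without Publish or Perish, here is a minimal sketch of how the same kind of query could be put together for Crossref's public REST API. This is not what was used for this article. The function name, the choice of fields and the page size are assumptions made for illustration, and the snippet only builds the request URL rather than sending it.

```python
from urllib.parse import urlencode

CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_crossref_query(affiliation, start_year, end_year, rows=100, cursor="*"):
    """Build a Crossref /works URL for papers matching an affiliation term.

    Hypothetical helper: the parameter choices below are illustrative,
    not a record of the search behind this article.
    """
    params = {
        # Free-text match against the author affiliation field.
        "query.affiliation": affiliation,
        # Restrict results to the chosen publication window.
        "filter": f"from-pub-date:{start_year}-01-01,until-pub-date:{end_year}-12-31",
        # Only ask for the fields needed for a topic analysis.
        "select": "DOI,title,abstract,author,issued",
        "rows": rows,
        # "*" starts cursor-based paging through a large result set.
        "cursor": cursor,
    }
    return f"{CROSSREF_WORKS_URL}?{urlencode(params)}"


url = build_crossref_query("Mekelle", 2020, 2024)
print(url)
```

A match on “Mekelle” in the affiliation text is a loose filter, so results from a query like this would still need a manual check before being counted.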
Papers. So many Papers!
So, using Publish or Perish, I managed to create a database of 1032 publications. That is 1032 publications that have at least one author with an affiliation to Mekelle University. Of course, this doesn’t necessarily mean that the author stays in Tigray, but at least I hope that some of their interests and activities will reflect the things happening in the region. In the table below, you can see the number of publications per year. Highlighted are the numbers for the papers published during the years of the war.
Unsurprisingly, there was a sharp decline during the years of the conflict. In 2020, I found 324 publications from researchers at Mekelle University, while in 2021 and 2022 there was a drastic drop to 157 and 145 publications respectively. Disruption of daily life, attention focused on the war, displacement of researchers, damage to infrastructure, and the emotional and psychological toll on people are among the many possible reasons this happened. Nonetheless, the continuous production of publications is a testament to the resilience people in the region have to do research and, if what Tewelde told me is generally true, to work on solving societal problems.
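To put the size of that drop into numbers, here is a small sketch that computes the change relative to 2020, using only the three yearly counts quoted above. The helper name is made up for this example.

```python
# Yearly publication counts quoted in the text above.
counts = {2020: 324, 2021: 157, 2022: 145}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


for year in (2021, 2022):
    change = percent_change(counts[2020], counts[year])
    print(f"{year}: {change:.1f}% relative to 2020")

# 2021: -51.5% relative to 2020
# 2022: -55.2% relative to 2020
```

In other words, output in both war years was less than half of the 2020 figure.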
What are we (re)searching for?
Here’s where we get to the interesting part of the story. I wanted to analyse the topics of research each publication had and figure out the themes of these publications. Of course, I didn’t have the opportunity to read all those papers and do a very meticulous classification by myself. Thus, I had to resort to something that I find scary and tried to avoid as much as I could in my career. I had to use Python and to write code to help me sort things out.
Could I have used something like ChatGPT to do something similar? Probably. However, for one, I doubt that the free version of ChatGPT could process this much data. And secondly, where is the fun in that? Because I wanted to avoid using AI, I practised a little bit of coding, which is something I’ll find useful in other aspects of my life as well. Anyway, what exactly did I do?
I found out that there is a technique in topic modelling called Latent Dirichlet Allocation, or LDA. LDA is a method that groups words into topics based on how often they appear together across documents. For example, if words like ‘climate change’ and ‘rainfall’ often appear in the same abstracts, it will tend to put them in the same topic, which I might then label ‘environmental studies.’ While not perfect, it helps identify broad themes across large datasets. There are pros and cons to using such a method, and since I am by no means an expert, I am sure there are ways to improve on what I did. A couple of examples that I can think of are:
There are some common words that we don’t want the code to consider, like “we”, “a”, “just” etc. While I used existing libraries of stop words, I had to fine-tune them and add words like “participants”, “methods”, “effects” etc. I added these words as I went along, while I was running the model. I am sure I didn’t consider some, which would have improved the accuracy of the model.
The model requires me to tell it how many topics there are in the batch of papers. Of course, this is something that I can’t know, and there could be tens of different topics. For the sake of this analysis, I chose 5 topics per year. While it might not be the most accurate, I consider it enough to give me an idea about the topics.
But understand, I wasn’t going for 100% accuracy with this. I just wanted to see how the themes of research changed over the years. I assumed that the algorithm would give me the most popular ones for each year.
Key Results and Themes
I ran the LDA algorithm for each year I had publications for. For each year I got a result like the one pictured below.

In the graph on the left, each circle represents a topic generated by the model. The size of the circle corresponds to the prevalence of the topic across that year. Basically, the bigger the circle, the more documents on that topic. On the right, we see the terms used most in all the publications. Hovering over a circle will highlight in red the terms used in that topic. So what did I find for each year?
Once again, the topics you see in the table above are my interpretation based on the most salient terms in each group. One thing that immediately stuck out to me was the persistent focus on health care. This is unsurprising. A quick look at the region's health atlas will show that the top causes of years lived with disability and years of life lost are diarrheal diseases, lower respiratory infections, and HIV/AIDS resulting in other diseases. One other thing that I wanted to mention is that the model identified a bigger emphasis on HIV and nutrition in the early years, while later, in 2024, terms like “cancer” started appearing.

Another thing that is clear from the work is the increasing prominence of Women’s Health and Gender Issues. “Women” is a consistent term, reflecting sustained research on gender-related topics such as family planning, maternal health, and quality of life. The persistent focus on women’s health reflects ongoing challenges in maternal care, especially in post-conflict settings where resources are scarce.
There has been remarkable progress in the healthcare system in Ethiopia in general and in Tigray specifically, according to this editorial. Sadly, within only 6 months of the war starting, only 27.5% of hospitals, 17.5% of health centres, 11% of ambulances and none of the 712 health posts were functional [source]. The continuous focus of researchers on healthcare shows how crucial that need is for the region.
Environmental and Agricultural themes are consistently observed, with recurring terms like soil, climate, species, land, and water appearing annually, signifying ongoing research within agriculture and environmental science. The period from 2020 to 2021 exhibited a strong focus on enhancing agricultural productivity, with key areas of investigation including soil health, crop yield, and the impact of rainfall patterns. However, a notable shift emerged between 2022 and 2024, as research priorities transitioned towards building climate resilience and optimizing resource management. This shift may potentially reflect a concerted effort towards post-war recovery and rebuilding sustainable agricultural systems.
The Impact of the Conflict on Research Themes is evident from the emergence and increasing prominence of conflict-related terms such as Tigray, war, and knowledge starting in 2021. This trend becomes more pronounced in later years, particularly between 2023 and 2024. This observation suggests a growing recognition among researchers of the profound and multifaceted effects of the war on various aspects of life, including health, agriculture, and the well-being of local communities. Research efforts may have shifted towards addressing immediate post-conflict needs, such as the rebuilding of vital health services and the investigation of strategies to enhance climate resilience in the affected region.
Furthermore, in 2024 I observed a trend towards a diversification of topics, with some materials science papers like “Oxidative Polymerization of Aniline on the Surface of Sisal Fibers (SFs) as Defluoridation Media for Groundwater”, which, as you can see, again deals with a specific problem in the region: water purity. This diversification of research topics may signify a potential shift towards exploring new avenues, such as materials science or industrial applications, potentially leveraging the region's natural resources and traditional knowledge for economic development and recovery.
Let’s summarize
From the discussion above, I can say that I identified 4 major themes of the research.
Health Care
Women Healthcare
Climate
War Related
Other
I am putting “other” on the list to help with the classification I’m making to present the heatmap below.
As you can see, papers that deal with women are present throughout the years. Keep in mind that in the heatmap above, each paper could only belong to one category. So a paper that deals with women’s healthcare, for example, could have been counted under either “Health Care” or “Women Healthcare”.
Research at Mekelle University demonstrates the resilience and determination of its academic community. HIV, women’s health and maternal care are always huge issues. Even post-conflict, there are papers like “War and Siege Halt Gynecologic Oncology Services for Women in the Tigray Region of Ethiopia: A Call to Action” that signify the importance the community gives to women.
Naturally, the war has affected the topics of research for people working at Mekelle University, with them studying the effects it has on families and local communities. The conflict seems to have pushed researchers to address immediate, practical needs, such as rebuilding infrastructure, healthcare, and food systems.
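To make the “one category per paper” caveat concrete, here is a hedged sketch of one possible way to force a single label onto each title. The keyword lists and the priority order are assumptions invented for this example. They are not the rules behind the heatmap.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
CATEGORY_RULES = [
    ("War Related", {"war", "conflict", "siege"}),
    ("Women Healthcare", {"women", "maternal", "gynecologic"}),
    ("Health Care", {"health", "hiv", "cancer", "hospital"}),
    ("Climate", {"climate", "soil", "rainfall", "water"}),
]


def matching_categories(title):
    """Return every category whose keywords appear as whole words in the title."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return [name for name, keywords in CATEGORY_RULES if words & keywords]


def categorise(title):
    """Assign exactly one category: the first match, or "Other" if none match."""
    matches = matching_categories(title)
    return matches[0] if matches else "Other"


title = ("War and Siege Halt Gynecologic Oncology Services for Women "
         "in the Tigray Region of Ethiopia: A Call to Action")
print(matching_categories(title))  # ['War Related', 'Women Healthcare']
print(categorise(title))           # War Related
```

Under these made-up rules the paper matches two categories but is counted in only one, which is exactly why the counts in any single cell should be read with some caution.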
Behind these numbers are researchers like Tewelde (Part 1) and Tesfaye (Part 2), who risk everything to keep science alive.
It is important for local communities to have a say on what the major problem they’re facing is, and also on what the best solution to implement is, whether that issue is happening in Tigray, in the West Bank, in Athens, Singapore or anywhere else in the world. Maybe not all, but many of the solutions will come from within, through years of work, research and thought from people who have lived through these difficulties and whose motivation for solving them is the improvement of their communities. I hope you liked this article. Are there any more questions you have about Tigray? Any other people you think I should talk to? Please let me know!
Until next time, take care and be kind…