Bot Beat 3. Building Watchdog Tools with Nieman Fellow Jaemark Tordecilla

May 14, 2024

A veteran Filipino reporter supercharges investigative journalism in the developing world. (21 min)

Transcript

Jaemark: If you think newsrooms in the US are shrinking, you haven't been to a newsroom in a developing country where you basically do everything because you're so small and the resources are so few. And so if you could get a tool that takes away a lot of the grunt work and that helps you in that journalism value chain…

[Bot Beat music swells, then settles]

Jay (Narration): Welcome back to Bot Beat, a new podcast that tracks how journalists are harnessing AI in their reporting. Through quick chats with some of these reporters, I hope to key you in on how exactly these cutting-edge tools may change the way we work. I’m your host, Jay Kemp. I’m an independent journalist reporting on AI within my field. In this episode, we’ll be expanding our view beyond newsrooms in the U.S. I’ll be chatting with Jaemark Tordecilla, a current Nieman Fellow at Harvard and the former editor of the largest news website in the Philippines, GMA News. He also has a background in computer science and tech. He calls himself that guy in the newsroom playing around with all the emerging technologies. I first came across Jaemark’s work when I stumbled on an article he wrote for the Reuters Institute about an investigative tool he built. We’ll be chatting about the Philippines, what his tool is trained to do, and the interesting ways AI could be particularly empowering for under-resourced newsrooms in developing countries.

Jaemark: Yeah. So in the Philippines, the government is actually quite hostile to media organizations. The online landscape is rife with misinformation, and a lot of it is attributed to political actors, for example. So it's an uphill battle for many media organizations in the Philippines to find their space and to bring people back to the center, to not believe, you know, polarized news pages. In a similar way, it's happening here in America, where people get their information from polarized sources instead of the mainstream. What we find in the Philippines is that when, for example, there's a pandemic or a typhoon or another tragedy or something that's a matter of life and death, audiences still try to get their news from trusted sources. But our challenge has been during elections, where most of those who consume media from the traditional sources are the traditional news-reading audiences. And politics in the Philippines has always been personalistic. And so you see politicians building up their online presence to reach those audiences with the information that they want out there. And so it's continuing to be a challenge for media organizations to get the attention of those audiences.

Jay: And would you say that that relationship kind of spurred part of what went into you building this tool?

Jaemark: I just feel that it was a good use case to show media organizations back home that there are useful ways to use AI beyond, you know, generating content and generating articles. And there's always the big fear of this new technology, with all the hype, replacing our jobs, when in fact what I found was that, number one, it's not very good at what we do, which is getting the facts right. And the other thing is that, you know, generating content from it is kind of the least interesting way to use the technology right now. But there are things that it does really well, that it really does magically. It's really good at answering questions for which there are no right answers. But that's not what we do as journalists, right? We need precise, correct answers. It's also very good at summarizing information, getting through dense information and summarizing it for our benefit. And so that was the use case I saw when I developed the tool: to try to use it to make life easier for journalists going through jargon and thick documents.

Jay: Absolutely. I think so much of what we do as investigative journalists is trying to break down the very complex to the very understandable, and it's kind of necessary for our democratic relationship between the government and the people, is that press pipeline of information. So I think that's a really great place to kind of transition and talk a little bit more about the tool itself. You got to Harvard in the fall, and you were telling me a little bit about how you started messing around with AI. Could you kind of break that down a little bit for me – how that investigation, that experimentation began?

Jaemark: Yeah. So I took an introduction to generative AI class, which took us through the history of the models, of deep learning, from supervised to unsupervised learning, really, really technical stuff. And toward the end of the semester, my professor told us that a lot of the stuff that we were studying would become obsolete after OpenAI released a feature on ChatGPT that allowed you to develop apps without any coding.

Jay: Wow.

Jaemark: And so I was looking at the stuff that I was learning and trying to figure out if I could build a tool that would take us through audit reports from the Philippine government. So in the Philippines, there's an agency called the Commission on Audit, which is tasked with basically writing an audit report for every government agency and for every local government unit in the Philippines. So we're talking about the provincial government. We're talking city government, we're talking municipal government, town government. So we're looking at hundreds of audit reports being produced every year. That's not to include the national agencies. And so there's really a lot of documents there. So you could just imagine the amount of manpower it would require if you had to go through each and every one of them. And because they're technical documents, they contain a lot of jargon. They contain a lot of facts and figures. And it really takes an expert eye to find the stories that would resonate with our audiences and to find whether there's something inside the document that is worth doing a story over. And so what I did with my tool was basically help summarize these documents, and that would allow an experienced reporter- and this is very important- an experienced reporter to see at first glance whether there's a story worth doing inside that document. And, using what we learned in the first part of the class, I was looking at basically hundreds of hours of training and thousands of documents to even develop a model that would be useful for journalists in the Philippines. And so when the custom GPTs were released, I tried it out, played around with it for a couple of days and actually built something useful out of it. And like I said, I was looking for something that would make journalists' jobs easier, and I thought I had something. And I went back to journalists back home and asked them to test the tool and see if it actually did make their lives easier.
What I found is that the tool is more useful for reporters who have had experience reading audit documents in the past because they already know what red flags are out there after ChatGPT gives them a summary of items. And it really is just a screening tool for them. It's a time saver so that they don't have to spend a couple of hours just trying to determine whether one document is worth a deep dive, or whether they should move on to the next one. Considering there are hundreds of documents with, again, hundreds of pages, inside them.

Jay: I think what I really admire about the tool is the way that it saw a need that was very much being experienced by so many journalists across the country and then very, very directly addressed that need. You had mentioned to me in our last conversation, you said it was helpful for finding one kind of needle in one kind of haystack. Could you expand a little bit upon what you hope this tool will be the baseline for, for future tools, future projects, future innovation?

Jaemark: Right. So it really was my target to try to build a tool that would work with audit reports, because I used to be an editor of a big newsroom, where we really depended on our workers finding stories in these documents. And I remember in my article, shout out to your audience, I hope they check it out, my example showed how one story from one report actually led to corruption charges being filed against officials of our education ministry over the procurement of overpriced laptops. Right. And so it really has a real impact in terms of not just journalism, but also civil society and how things are run in government. And so I knew that, given that a lot of newsrooms and a lot of reporters are working on these documents to find stories in them, a tool like this would be immediately useful. But I know there are a lot of other types of documents at every newsroom. You know, there's a mountain of them that every newsroom has to go through, especially investigative newsrooms. We're talking about legal documents, for example, or environmental reports or even transcripts of town hall meetings here in the US. And if you already have a good idea of what you're looking for in that mountain of documents, if you know what the needle kind of looks like, AI, I think, does a very good job finding those needles for you in a particular haystack. And it helps, like I said, that if you already know what sort of information you're looking for, you're going to be better at using these tools. It really supercharges a veteran reporter who has had experience going through documents like these. A couple of weeks ago, we had Marty Baron, the former editor of the Washington Post and the Boston Globe, speaking to us at the Nieman headquarters. And he had this line where he said, we were able to put men on the moon before we put wheels on luggage. And wheels on luggage has made our lives infinitely easier.

Jay: That's awesome.

Jaemark: And so this tool is an example of that. It's going to make the lives of many journalists hopefully easier, even though it doesn't seem so novel. It's just a search tool. But, you know, hopefully it would reduce the friction of many processes. And I do hope that a lot of newsrooms continue to publish their use cases for generative AI, because I suspect that most newsrooms would find use cases like this: small things that would make their lives easier, or at least not make them want to tear their hair out. But yeah, it's all about these little incremental innovations. I think that's where the potential of generative AI lies. It's not going to be like this whole system that's going to generate all our content because, I think, what we're finding is that it really is an enhancement for already good journalists, to help them do their jobs better.

Jay: I agree, and I think we're going to find, you know, as our industry is grappling with this tool, that we're going to do a much better job at staying ahead of the curve if we are kind of sharing ideas and learning from each other across the news industry. So seeing a use case like this, and then not only seeing that you did it, but that you then shared what you did, how you did it, and sent it to other journalists to actually try it out and see if this was implementable in the field, I think that's really admirable. And I think that's the kind of open source information sharing that we're going to need to make sure that journalism survives the AI transition.

Jaemark: Yeah. And I think it's particularly important, for myself, working with journalists from back home, because we have unique sets of challenges in developing countries where, if you think newsrooms in the US are shrinking, you haven't been to a newsroom in a developing country where you basically do everything because you're so small and the resources are so few. And so if you could get a tool that takes away a lot of the grunt work and that helps you in that journalism value chain, whether it's for reporting or writing or editing or production and distribution, I think generative AI has applications for each of those components. And if you're able to find that, you can really multiply the good that you do with just a handful of people. And so I think newsrooms from developing countries have a lot to contribute to the conversation, too. And so, with a tool like this, I wanted it to be the catalyst for that conversation, to get people from back home and from across the world who are working in countries whose newsrooms are, you know, facing resource deficits to think about these things and to share what they could learn, because I think we've got a lot to share too. I've been doing workshops with journalists from back home, and one of the things I showed them, aside from this tool, is how you could use open source AI tools like Whisper to transcribe interviews in Tagalog, which is the de facto national language in the Philippines. And it really does a great job getting the Tagalog right. People are always surprised. And it's the type of thing that newsrooms here in the US take for granted, because you look at transcription as a solved problem already. You can go to Otter, or even your Microsoft Word document is able to do transcription right now. But in a lot of developing countries, transcription is really still a big pain point.
And so even just using a tool like that, that's open source, that's free for everyone to use, and enabling that, that already saves so much time for journalists who could, you know, be doing more reporting and doing more editing instead of spending their time transcribing stuff. I mean, it's not a perfect tool by any measure, but it's a good first draft for any journalist working in the field. But for example, the Philippines has hundreds of regional languages, and not all of them work with this. I think none of them have worked with this tool, only the national language does. And so we could actually work on further improving or fine-tuning tools like this to support even more regional languages. This tool itself works well with Filipino, but it doesn't work for other languages, like Georgian, for example, or like... If a newsroom in the Philippines is able to solve that fine-tuning problem, could they share that knowledge with journalists in Georgia or in Latin America or elsewhere in the world? Because it's important to- given that it's a solved problem for the US, US companies won't pay much attention to tools like this anymore. And so the innovation has to come from the places where it's still an issue. And the benefits would be substantial.

Jay: Absolutely. And I think you're touching on something that I think about a lot in regards to the internet as well, which is the way that people can use these tools as a leg up that they didn't have before. Thinking about people who can go viral on social media, or people who can use the internet to build a business from the ground up. The same thing goes for these kinds of tools, as long as you have the innovative spirit to do so. So I guess my last question for you is that I would really want to know what you would say to any journalists out there who are a little overwhelmed at the idea of using or building an AI tool, but see a need that could be filled if they had the spirit to do so.

Jaemark: Right. So now is the most exciting time to get into it because everything is still so nascent. I don't think there are any experts except for those who really do the research building these models, and they're usually inside the tech companies. But most of the people using these tools, even from the richest newsroom in the world, we all have the same access to the same level class GPT that everyone has. And so this is the most exciting time to get into it and to figure out use cases that we could share with the world. And I think the other big thing that my tools and my workshops do is that they show journalists from my country that this is not the tech that's going to replace us. You know, it's not a doomsday scenario by any means for journalists. And so we have a real opportunity to contribute to how this technology is developed and is being used. And those use cases are going to be very valuable. And so I am seeing this time as a great time for opportunity, especially for newsrooms around the world, to help shape its future because no one's figured it out quite yet.

Jay: Absolutely. And to experiment and to play around and to see what they can do.

Jaemark: Yep. Exactly.

[Bot Beat music swells, then settles]

Jay (Narration): A big thank-you to Jaemark Tordecilla for joining me in this episode. I also want to thank again my guests in the previous two episodes, Maggie Harrison from Futurism and Ishaan Jhaveri from the New York Times. And an extra shout-out to my mentor, editor, and now friend, Jeb Sharp. I hope you’ve been enjoying Bot Beat. I created this series to have a place to keep tabs on journalism and AI, and to think out loud with people about the possibilities and pitfalls. Now it’s your turn. I’d love to know what you think the urgent questions are, especially in a field that is moving so fast. And for journalists out there using AI – you want to talk to me about it? Comments and ideas can go to botbeatpod@gmail.com – I’d love to hear from you. I’m your host, Jay Kemp, and this is Bot Beat. Let’s catch up soon.