In today's fast-paced world, the consumption of news and information has become an integral part of our daily routine. However, the rise of fake news, deep fakes, and other forms of digital manipulation has created a need to scrutinize the authenticity of the information we consume. In conflict and crisis zones, where the consequences of false information can be severe and even fatal, the ability to authenticate media files such as photos, videos, audio, and texts in real time becomes crucial. Open source intelligence, or OSINT, has emerged as a popular method for collecting and analyzing information from publicly available sources in such situations. Despite their benefits, OSINT techniques have limitations: they are labor-intensive, resource-intensive, and not always foolproof. This article proposes a solution that aims to help journalists and news organizations collect and verify information from conflict and crisis zones, and examines the current use-cases for this solution and its possible limitations.
In today's digital era, the proliferation of fake news, deep fakes, and digital manipulation has made it increasingly challenging to discern fact from fiction. Emerging threats to authenticity have had particularly dire, even deadly, consequences in crisis and conflict zones. As such, the authentication of media files, such as videos and photos, has become paramount for journalism and news reporting.
Real-time verification of data authenticity is crucial in preventing the spread of false information, and this is where open source intelligence (OSINT) plays a vital role. OSINT involves the collection, analysis, and dissemination of publicly available information, and has become an invaluable tool for investigative reporting used by journalists and researchers to report on conflict and crisis zones.
OSINT collection and analysis encompasses a wide range of research and verification techniques for gathering publicly accessible information, ranging from satellite images to photos and videos. Some examples of these technical processes include:
Reverse image searches using engines like Google Images, TinEye, or Yandex find other instances of an image on the web and trace its origin. This can help to identify whether an image has been manipulated or taken out of context.
Metadata analysis of an image or video file can identify details such as the location, date, and time it was captured. This can help to verify the authenticity of an image or video and identify any inconsistencies.
Forensic analysis tools can be leveraged to examine the pixels and other technical elements of an image or video file to detect signs of manipulation. This can involve looking for inconsistencies in lighting, shadows, and reflections, as well as analyzing compression artifacts and other anomalies.
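Part of the metadata analysis step described above can be automated once the fields have been extracted from a file. The following is a minimal sketch, not a description of any particular tool: the field names `captured_at`, `lat`, and `lon`, the function name, and the bounding-box check are all assumptions made for illustration.

```python
from datetime import datetime, timezone

def find_metadata_inconsistencies(meta, received_at, expected_bbox=None):
    """Return a list of warnings for one media file's extracted metadata.

    meta          -- dict with hypothetical keys 'captured_at' (timezone-aware
                     datetime), 'lat' and 'lon' (floats); any key may be missing
    received_at   -- timezone-aware datetime when the newsroom received the file
    expected_bbox -- optional (min_lat, min_lon, max_lat, max_lon) describing
                     the area the file is claimed to come from
    """
    warnings = []

    captured_at = meta.get("captured_at")
    if captured_at is None:
        warnings.append("no capture timestamp")
    elif captured_at > received_at:
        # A file cannot have been captured after it was received.
        warnings.append("capture timestamp is later than receipt")

    lat, lon = meta.get("lat"), meta.get("lon")
    if lat is None or lon is None:
        warnings.append("no GPS coordinates")
    elif expected_bbox is not None:
        min_lat, min_lon, max_lat, max_lon = expected_bbox
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            warnings.append("GPS coordinates outside the claimed area")

    return warnings


# Example: a file whose timestamp and location both contradict the claim.
received = datetime(2023, 2, 7, 12, 0, tzinfo=timezone.utc)
suspect = {"captured_at": datetime(2023, 2, 8, tzinfo=timezone.utc), "lat": 10.0, "lon": 10.0}
print(find_metadata_inconsistencies(suspect, received, (35.0, 35.0, 38.0, 40.0)))
```

An empty list means only that these particular checks passed. Metadata can be edited, so a clean result is a starting point for verification rather than proof of authenticity.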
While the OSINT analysis process has made information verification more accessible, the current methods have limitations. They are labor-intensive, resource-heavy, and not foolproof, which can often hamper their effectiveness. Sourceable proposes a solution that could aid journalists and news organizations in efficiently collecting and verifying information from conflict and crisis zones, while also examining existing use-cases and potential limitations.
OSINT and visual forensics are more critical than ever for journalism and news today. With the rise of citizen journalism and social media, newsrooms are relying more heavily on user-generated content (UGC). However, UGC is often unverified, and its authenticity cannot always be guaranteed. OSINT techniques, such as reverse image searches, metadata analysis, and geolocation, can help verify the validity of visual data. By using these techniques, journalists can better ensure that the information they are reporting is accurate and reliable.
Moreover, the ability to authenticate visual data is particularly critical when reporting on conflict and crisis zones, where events unfold rapidly. In such situations, OSINT can provide a crucial advantage, allowing journalists to ensure information gathered is verifiable. This not only benefits journalists but also policymakers and the general public who rely on accurate information to make informed decisions.
The growth of OSINT and visual forensics has made it easier for professionals in journalism and news to verify the authenticity of visual data, ensuring that the information they report is accurate and reliable. This is essential for maintaining trust between the public and media outlets, which has been eroded in recent years. As such, the need for OSINT and visual forensics is likely to grow in the coming years, as the demand for accurate and reliable information continues to increase rapidly, especially as AI continues to evolve.
However, OSINT techniques can be extremely labor-intensive, time-consuming, and resource-heavy. In the field of journalism and news, where breaking stories need to be accurate, efficient, and quick, this process can be an impediment to the natural flow of information. Many newsrooms, such as the New York Times and the BBC, are expanding entire units dedicated solely to visual investigations; the two organizations annually spend around $2 million and $2.9 million respectively on visual forensics to verify content from UGC creators (UGCC) and other open-source information. However, many media outlets lack the expertise, time, and resources to conduct adequate levels of visual forensics to adapt to the growing need.
Sourceable is a social venture created by a group of cross-disciplinary graduate students from Columbia University to address the challenges of verifying breaking content for newsrooms. Their solution uses blockchain to ensure the metadata of the visual, audio, and written content is accurate in time, date, and geolocation. The tools also ensure that visual data shared with the media maintains the chain of custody of content creators, allowing journalists to ask follow-up questions to UGCC when needed. This streamlines the process of verification that journalists and newsrooms must undergo in their reporting, allowing them to break stories more quickly and accurately.
Sourceable’s in-app camera feature guarantees the accuracy of information uploaded to their platform by UGCC. The camera captures all critical metadata, eliminating any possibility of data manipulation by external sources. Metadata is information that describes the content of a media file, such as the date it was created, the location where it was captured, and the device that was used to create it. Because metadata is stored in plain text, it can be easily modified or deleted by anyone with access to the file. This makes it difficult to verify the authenticity of media files, especially those that come from conflict or crisis zones where the risk of tampering is high. By encrypting the metadata of media files, Sourceable establishes tamper-proof content by ensuring that the metadata cannot be modified or deleted without the encryption key.
However, encrypting metadata alone is not sufficient to establish the authenticity of media files. With this in mind, Sourceable employs additional techniques, such as cryptographic signatures in combination with metadata encryption, that enable journalists and news organizations to track the origin and maintain the chain of custody of the media file they are verifying.
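To illustrate how a cryptographic tag can bind a media file to its metadata, here is a minimal sketch using only the Python standard library. It is not Sourceable's implementation. It uses an HMAC (a keyed hash with a shared secret) as a stand-in for a digital signature, which would normally use a public/private key pair; the function names `seal` and `verify` and the metadata fields are hypothetical.

```python
import hashlib
import hmac
import json

def seal(media_bytes, metadata, key):
    """Return a hex tag that binds the media bytes and the metadata together."""
    payload = json.dumps(
        {
            "media_sha256": hashlib.sha256(media_bytes).hexdigest(),
            "metadata": metadata,
        },
        sort_keys=True,            # canonical key order so the tag is reproducible
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(media_bytes, metadata, key, tag):
    """Return True only if neither the media nor the metadata has changed."""
    expected = seal(media_bytes, metadata, key)
    return hmac.compare_digest(expected, tag)  # constant-time comparison


# Example with made-up values.
key = b"demo-key-not-for-production"
photo = b"\xff\xd8 example image bytes"
meta = {"captured_at": "2023-02-06T04:30:00Z", "lat": 36.2, "lon": 37.1}

tag = seal(photo, meta, key)
print(verify(photo, meta, key, tag))                     # unchanged file and metadata
print(verify(photo, {**meta, "lat": 33.5}, key, tag))    # edited location
```

Because the tag covers both the file hash and the metadata, changing either one causes verification to fail. With a true signature scheme, a newsroom could verify the tag using only a public key, without holding the secret used to create it.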
Sourceable is also developing a deep learning algorithm based on a GAN architecture to distinguish real media files from fake ones. Sourceable’s approach is to use a pre-trained GAN model and feed the suspected media file into the discriminator. The discriminator outputs a probability value that indicates how likely the media file is to be real. If the probability value is low, then the media file is likely to be fake.
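The decision rule described above can be sketched in a few lines. This is only an illustration: the `discriminator` argument is a hypothetical callable standing in for a trained model, and the 0.5 cut-off is an assumed default rather than a value from Sourceable's research. In practice a threshold would usually be chosen using held-out validation data.

```python
def label_from_score(p_real, threshold=0.5):
    """Map a discriminator's 'probability of being real' to a label."""
    if not 0.0 <= p_real <= 1.0:
        raise ValueError("p_real must be a probability between 0 and 1")
    return "likely real" if p_real >= threshold else "likely fake"

def screen(files, discriminator, threshold=0.5):
    """Score each file and label it.

    files         -- dict mapping a file name to its raw bytes
    discriminator -- hypothetical callable: bytes -> probability the file is real
    """
    return {
        name: label_from_score(discriminator(data), threshold)
        for name, data in files.items()
    }


# Example with a stand-in scoring function instead of a trained model.
def toy_discriminator(data):
    return 0.9 if data.startswith(b"real") else 0.1

print(screen({"a.jpg": b"real-bytes", "b.jpg": b"other-bytes"}, toy_discriminator))
```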
Currently, Sourceable is actively engaged in two primary tasks. Firstly, we are focused on the collection and aggregation of open-source manipulated and authentic images and videos. To achieve this, we utilize resources such as Kaggle, Meta's Deepfake Detection Challenge Dataset (DFDC), and other available online repositories. This comprehensive dataset allows us to gather a diverse range of media for analysis.
Secondly, we are exploring the utilization of pre-trained models that exhibit promising capabilities in accurately identifying deep fake and manipulated media. Some of the notable models we are considering include FaceForensics++, CoMoGAN, and DFDC models, among others. These models have demonstrated impressive performance in various studies.
Our objective is to leverage these pre-trained models and apply transfer learning techniques using the collected dataset. By fine-tuning these models and evaluating their performance, we aim to enhance their effectiveness in real-life scenarios. However, it's important to note that we are still in the early stages of our research, and we have not yet determined a specific algorithm or model to be used exclusively for practical applications.
To ensure that Sourceable’s app remains functional in regions with limited or unreliable connectivity, Sourceable relies on the offline data storage feature provided by Google Firestore. Sourceable built its backend architecture on Google Firebase because it offers a reliable and secure solution for processing real-time data within our app without the need for additional development efforts. When a user goes offline, Firestore stores all the data and changes made to the database locally on the device. This allows the application to continue working and modifying the data without an internet connection. Once the connection is restored, Firestore automatically synchronizes the local data with the server-side data, ensuring that all the changes made while offline are transferred to the database.
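The offline behaviour described above follows a general pattern: writes are applied to a local copy immediately, queued, and replayed to the server when connectivity returns. The sketch below models that pattern in plain Python. It does not use the Firestore SDK, and the class and method names are hypothetical; the `server` dictionary is only a stand-in for the remote database.

```python
class OfflineFirstStore:
    """Toy model of an offline-first document store."""

    def __init__(self):
        self.local = {}      # what the app reads from, online or offline
        self.pending = []    # writes made while offline, in order
        self.server = {}     # stand-in for the remote database
        self.online = False

    def write(self, doc_id, data):
        # The local copy is always updated first so the app stays usable.
        self.local[doc_id] = data
        if self.online:
            self.server[doc_id] = data
        else:
            self.pending.append((doc_id, data))

    def read(self, doc_id):
        return self.local.get(doc_id)

    def reconnect(self):
        # Replay queued writes in the order they were made, then clear the queue.
        self.online = True
        for doc_id, data in self.pending:
            self.server[doc_id] = data
        self.pending.clear()


# Example: an upload recorded while offline reaches the server after reconnecting.
store = OfflineFirstStore()
store.write("upload-1", {"caption": "Street scene", "status": "queued"})
print(store.read("upload-1"), store.server)   # readable locally, server still empty
store.reconnect()
print(store.server)
```

This sketch ignores conflict resolution, which a production database has to handle when the same document is changed on several devices while they are offline.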
The use of blockchain and machine learning for visual investigations is also an emerging trend in media organizations. Although blockchain technology has been in existence for over a decade, it is only in recent years that its potential for visual forensics has been explored. Similarly, while machine learning has been a subject of research for decades, recent advances in computing power and data availability have made it more practical to apply these techniques to large-scale media analysis. As a result, Sourceable strives to leverage these innovative technologies to aid media organizations in visual investigation tasks.
Using blockchain to verify metadata is an innovative solution to the problem of authenticating media files from conflict and crisis zones. By using blockchain, it is possible to create a tamper-proof ledger that records the origin and authenticity of visual data. This technology ensures that the data remains unaltered, and its authenticity can be verified at any point in the future.
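The tamper-evidence property of such a ledger comes from hash-linking: each entry stores the hash of the previous one, so changing an old record invalidates every later link. The sketch below shows only that mechanism, under the assumption of a single local list. A real blockchain adds distributed consensus across many nodes, which is omitted here, and the function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(record, prev_hash):
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append(chain, record):
    """Add a record to the chain, linking it to the current last entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(record, prev_hash),
    })

def is_intact(chain):
    """Recompute every link and report whether the chain is unmodified."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["record"], entry["prev_hash"]):
            return False
        prev_hash = entry["hash"]
    return True


# Example: editing an earlier record is detected.
ledger = []
append(ledger, {"file": "clip-001.mp4", "captured_at": "2023-02-06T04:30:00Z"})
append(ledger, {"file": "photo-002.jpg", "captured_at": "2023-02-06T05:10:00Z"})
print(is_intact(ledger))
ledger[0]["record"]["captured_at"] = "2022-01-01T00:00:00Z"
print(is_intact(ledger))
```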
However, there are other significant potential threats associated with Sourceable’s solution. One of the most significant concerns is the use of geolocation coordinates, which can expose users to security risks and data breaches. In conflict and crisis zones, the use of such technology can be particularly dangerous. Privacy concerns are also a significant issue, as users may not want their identity to be revealed.
To address these concerns, Sourceable is identifying the best possible cybersecurity measures to build into its solution. Current measures include using encryption to protect data in transit and at rest, as well as implementing robust authentication protocols to prevent unauthorized access to the system. With this in mind, it is crucial for Sourceable to work closely with those who will use our tools the most in order to ensure that user privacy and security are protected.
The development and deployment of Sourceable’s authentication process is an example of how technology can be used to address emerging threats of mis/disinformation from conflict and crisis zones. By working closely with local communities in Northern Syria, Southern Turkey, Lebanon, and Palestine, Sourceable has tested and demonstrated how valuable this type of tool can be for on-the-ground users, as well as for journalists, researchers, media experts, and other professionals who need to verify the authenticity of visual data in hard-to-reach areas.
Sourceable has run two pilot projects with journalists on the ground over the span of nearly a year, and the feedback from users has been essential for informing further development and improvements. The first pilot project was conducted over an 8-week period in the summer of 2022, during which journalists uploaded over 200 photos and videos from Syria and Turkey to the Sourceable app. The second pilot project ran from summer 2022 to April 2023 and focused on coverage of the earthquakes that struck Syria and Turkey in February 2023.
The two pilot projects identified the need to develop ways for users to upload content while offline, establish legal processes for accessing sensitive media, and streamline payment processes for users. By listening directly to UGCC feedback, Sourceable can meet their information needs while also ensuring privacy and security are protected. This approach is particularly important in conflict and crisis zones, where the stakes are especially high.
Expanding the use of Sourceable to other countries is a crucial step in developing a tool that can be used globally to verify media files from conflict and crisis zones. Sourceable is also exploring the possibility of working with journalism students from university campuses across the US in order to expand, scale, and identify early adopters of the technology. By engaging with university students, Sourceable can gain a better understanding of the challenges faced by future journalists and researchers who will enter the ever-changing fields of news, journalism, and technology. By working with local communities and implementing improvements that address their unique needs, it is possible to create a tool that meets the needs of journalists, researchers, and other professionals while also ensuring their safety and security.
Sourceable is not a panacea, but rather a tool to help professionals in their investigative reporting efforts. It is critical to emphasize that the media and NGOs must continue to exercise due diligence and conduct their own research to verify the accuracy of stories and media files posted on the platform.
While Sourceable has the potential to revolutionize the way media files from conflict and crisis zones are verified, it is not a foolproof solution. The technology must be used in conjunction with other methods to ensure that accurate information is disseminated to the public. By continuing to work with local communities and implementing robust cybersecurity measures, the developers of Sourceable can create a tool that meets the needs of professionals while also promoting safety and privacy for its users and, ultimately, revolutionizing news and journalism.
Lena Arkawi is a Syrian American activist and communications strategist based in New York. She has over a decade of experience working with Syrian-focused NGOs, promoting gender equality, youth empowerment, and humanitarian assistance. In 2021, Lena, along with a team, founded Sourceable, an innovative platform and mobile application empowering citizen journalists to document, verify, archive, and share critical news with media, human rights groups, and NGOs in real time.
Siddhant Rajeev Kumar is a seasoned and accomplished data scientist with a Master of Science degree in Data Science from Columbia University and a Bachelor of Technology degree in Electronics from the University of Mumbai. Currently serving as the Chief Technology Officer (CTO) at Sourceable Inc, Siddhant has led a team of talented developers in successfully building a mobile app and website.
Tariq Kenney-Shawa is a US Policy Fellow at Al-Shabaka, the Palestinian think tank and policy network, and an Intelligence Analyst at Storyful. He holds an MA in International Affairs from Columbia University and a BA in Political Science and Middle East Studies from Rutgers University. Tariq's research has focused on topics ranging from the role of narratives in both perpetuating and resisting occupation to the role open-source intelligence plays in liberation struggles. Follow Tariq on Twitter @tksshawa and visit his website at https://www.tkshawa.com/ for more of his writing and photography.