Promotion and tenure (P&T) practices have historically rewarded a prescriptive set of scholarly outputs: journal articles, books, and conference presentations, all centered on citation metrics. There have been many calls to expand P&T to include more types of nontraditional scholarship, such as artistic performances, compositions, and digital scholarship, and to formally acknowledge that there are more ways to share knowledge than the written word. We’re calling for an even more expansive view of scholarship: one that invites, accepts, and celebrates the entirety of research, including datasets. Datasets need to be considered high-impact scholarly objects, and every researcher has data. And while data sharing can stem from a compliance perspective (“my funder or journal told me that I had to share data”), we suggest focusing on why data sharing is a good thing to do for researchers, academia, and the greater public. Open Science can be leveraged to transform recognition models and processes in academia if we allow the use and reuse of our scholarship in novel and unexpected ways. This is not to suggest that all data can or should be shared, but there may be opportunities for researchers to share metadata, workflows, code, or algorithms, or even provide mediated access to their data. These different use cases and access mechanisms need to be encouraged and celebrated. We’re calling for researchers to challenge the Ivory Tower and more fully embrace Open Science. P&T, while flawed, represents the formal incentive structure in academia. This is not a call to do more, but a call to embrace what we are already doing and have it count more.
Dear Promotion and Tenure Committees,
Promotion and tenure (P&T) practices have historically rewarded prescriptive scholarly outputs and metrics: journal articles, books, and conference presentations are weighed alongside citation counts and journal impact factors. There have been calls to expand P&T to include more types of nontraditional scholarship, especially artistic performances, compositions, artwork, and scholar-activism, and to formally acknowledge that there are more ways to share knowledge than the written word (e.g., Okun 2022). We’re calling for an even more expansive view of scholarship: one that invites, accepts, and celebrates the entirety of the research lifecycle, including both high-impact data and data that represents null findings. Let’s celebrate data sharing and start a culture of data sharing rewards.
Data play an integral and central role in research across disciplines. Broadly defined, data are “anything you perform analysis on” (Briney 2015). Datasets are “the combined unit of your data and your documentation” (Deep Blue Data). Data are much more than numbers in a spreadsheet; they can be images, audio/video materials, architectural drawings, musical scores, etc. Data are present in all disciplines, even when the term feels like a contradiction or a ‘dirty word’ (Posner 2015; Hofelich Mohr et al. 2015). Data are the evidence underlying research, and they include code and software (e.g., McKiernan et al. 2023), oral histories, and archival materials. Data can be created for a project, reused to reproduce results, or remixed to explore a novel or emerging topic. While some disciplines have a longer history of interacting with ‘data’ and creating ‘datasets,’ most researchers have data that could and should be included in P&T decisions.
Before we continue, we want to acknowledge that there is a spectrum of options for appropriately sharing data, including not sharing it, that depend on a wide range of factors. For example, data collected in conjunction with communities may need to be shared in limited venues, or even remain private, to respect the wishes and needs of that community. Recognizing that data sharing may look like giving the data back to individuals to control, publish, or destroy, we turn to the CARE Principles for Indigenous Data Governance for guidance (Carroll et al. 2020). These principles emphasize the authority communities have to control and govern their data, and the responsibility researchers have to build relationships in order to appropriately decide what data can be shared while recognizing existing power differentials. So when we use the phrase “data sharing,” we are envisioning a world where data is as open as possible but as closed as necessary (Nason and Taitingfong 2023). Phrased slightly differently, we are advocating that data sharing decisions be intentional, and that the decisions made to share, or not share, data be thoughtful, documented, and appropriately rewarded in the P&T process.
With this nuance in mind, P&T Committee, we want to remind you of the influence you hold in effecting systemic change: if data sharing is important to you, your colleagues will value it, too. So we want to emphasize that data sharing is a good thing to do for researchers, academia, and, most importantly, the greater public. Sharing data helps build researcher reputation, reduce duplication of effort, ensure reproducibility and transparency in research (which is essential given the constant deluge of misinformation and its current exacerbation by AI), encourage reuse by those who may not have the funds for their own data collection, inspire new ideas and innovation, and otherwise demonstrate a commitment to public access. Valuing data sharing would encourage further sharing. While some researchers share because they are told to by a funder or journal, your encouragement may help researchers appreciate the value of Open Science and the impact of research data. Open Science can be leveraged to transform recognition models and processes in academia if we allow the use and reuse of our scholarship in novel and unexpected ways. Focusing on why data sharing is a good thing to do, both for researchers and the greater public, will help foster a culture of sharing. For the public, much of this research is conducted using taxpayer dollars, so we, the taxpayers, should have free access to research results that we paid for! Those of us in academia have a tendency, intentionally or unintentionally, to remain in our ivory towers. However, it benefits all when we share our data and spur real-world impacts (Bayley 2023).
With all of these benefits in mind, P&T Committee, the good news is that data sharing has been coming for years, and now you have the opportunity to formally acknowledge and reward it. As you may know, researchers who receive federal funding from the National Institutes of Health (NIH) in the United States are required as of January 2023 to write a data management and sharing plan (DMSP) that states how they are going to publicly share their research data or, where technical, ethical, or legal reasons preclude public sharing, how the data will be protected and destroyed. Federal funding agencies will increasingly require data sharing to some degree. The 2022 Office of Science and Technology Policy memo “Ensuring Free, Immediate, and Equitable Access to Federally Funded Research” (Nelson 2022) directs all federal funding agencies to develop guidelines for grantees. This includes smaller agencies, like the National Endowment for the Humanities. While federal funding agencies are requiring data sharing and allowing for expenses related to data curation, sharing, and preservation, it is no mean feat to prepare a dataset, and prepare it well, for publication. By leveraging this system already in place (grants), and building on already existing and required work (creating and implementing a DMSP), we can give credit where credit is due. We recognize that this may still be a limited group of faculty members (those at institutions who receive federal funding), but we also recognize that culture change happens slowly, over time, with intentional and deliberate support. Any starting point is better than standing still.
You may be wondering how to convince your colleagues that they, too, should share their data. Let us share a few examples of how publicly shared data has been reused in unique ways that broadly impact the greater public:
Poet Henry David Thoreau’s field notes were used to document that migratory birds are arriving, and trees and shrubs are budding, earlier in the spring, which is further evidence of global warming (Primack and Gallinat 2016).
Public property records that include racial covenants, which prevented people who were not white from purchasing homes, are being used to expose structural racism in local communities. An interdisciplinary team, in collaboration with communities across Minnesota, is working to expose structural racism in a visual way and encourage homeowners to update their property deeds to work towards reparations.
By analyzing a video posted on social media of scuba divers experiencing an earthquake while diving in the Persian Gulf, researchers were able to show that widely available, entertainment-grade microphones can be used to monitor for offshore seismic events, which could lead to vast improvements in early detection and warning of these events, especially in low- and middle-income countries (Salaree, Spica, and Huang 2023).
These examples remind us that datasets can be reused in ways that were never initially intended. Certainly, Thoreau wasn’t documenting the arrival dates of migratory birds with the goal of enabling future climate researchers. But curious and creative researchers have used his field notes to add to the growing body of literature proving that global warming is happening.
As we noted before, however, data sharing is not a binary choice; there are ethical and legal considerations that may impact a researcher’s ability to openly share data. Common reasons include:
Data collected on sensitive or marginalized populations that must be de-identified to protect said communities, but when sufficiently de-identified the data loses context and reusability;
Data that is collected in collaboration with marginalized communities, and data is shared exclusively with the community to control as they wish;
Proprietary data that is collected via a collaboration between a researcher and private industry, where the private company may hold copyright of the work;
Data that cannot be shared due to ethical reasons, like geographic locations of endangered species.
Even in these instances, there are opportunities to share metadata, workflows, code, research instruments, algorithms, or even provide mediated access to data. These different use cases and access mechanisms need to be encouraged and celebrated. Researchers should be rewarded if they share their data, but not punished when shared data is absent. Researchers should be given the opportunity to explain the nuance of their work and the creative ways they’ve been able to share some of their data (such as in a closed/restricted access repository), return the data to an appropriate governing body, or even include the studied communities in the process to help destroy the data.
But how would we examine datasets that are in different states of openness, with different degrees of use and reusability? Unfortunately, the answer is that this will not be straightforward. Each researcher, each project, and each dataset will have different considerations that impact what can or should be shared in different venues. For that reason, we will have to look beyond sheer metrics (views, downloads, citations) and get to qualitative measurements. Measuring dataset impact in a quantitative manner alone will not be useful. As the conveners of the recent Make Data Count Summit stated, we need “meaningful and responsible data metrics” (Puebla et al. 2023). The evaluation must go beyond views, downloads, and citations; otherwise, we end up where we started, with journal articles and book publications being the special sauce of P&T.
Examining datasets more holistically will bring more equity into the P&T process. The existing “systems of counting and classification that perpetuate oppression” (D’Ignazio and Klein 2020) are holding researchers back. Regardless of the labor and care that goes into teaching, researching, presenting, and publishing, historic practices have favored quantitative metrics over meaning and value. With datasets, we have the chance to correct this, and build systems and evaluations that focus on quality and impact, rather than quantity. Evaluating datasets and their impacts will require P&T committees to educate themselves. But the best news is, there are already experts in this on your campus: your peers in the library, including data, publishing, and research impact librarians, can help with that!
P&T, while flawed, is the incentive structure in academia. This is not a call to do more, but a call to embrace what we are already doing and have it count more.
Concerned Data Curators
Bayley, Julie. Creating meaningful impact: The essential guide to developing an impact-literate mindset. Emerald Publishing Limited, 2023. https://doi.org/10.1108/978-1-80455-189-920231016
Briney, Kristen. Data management for researchers: Organize, maintain and share your data for research success. Pelagic Publishing Ltd, 2015.
Carroll, Stephanie Russo, Ibrahim Garba, Oscar L. Figueroa-Rodríguez, Jarita Holbrook, Raymond Lovett, Simeon Materechera, Mark Parsons et al. "The CARE principles for indigenous data governance." Data Science Journal 19 (2020): 43-43. https://doi.org/10.5334/dsj-2020-043
Deep Blue Data. “Depositor Guide.” Deep Blue Data, Accessed October 12, 2023. https://deepblue.lib.umich.edu/data/depositor-guide#deposit-in-dbd
D’Ignazio, Catherine and Laura F. Klein. Data Feminism. MIT Libraries Experimental Collections Fund, 2020. https://doi.org/10.7551/mitpress/11805.001.0001
McKiernan, Erin C., Lorena Barba, Philip Bourne, Caitlin Carter, Zach Chandler, Sayeed Choudhury, et al. (2023) Policy recommendations to ensure that research software is openly accessible and reusable. PLoS Biol 21(7): e3002204. https://doi.org/10.1371/journal.pbio.3002204
Hofelich Mohr, Alicia, Josh Bishoff, Carolyn Bishoff, Steven Braun, C. Storino, and Lisa R. Johnston. “When data is a dirty word: a survey to understand data management needs across diverse research disciplines.” Bulletin of the Association for Information Science and Technology, 42 no. 1 (2015): 51-53. https://doi.org/10.1002/bul2.2015.1720420114
Nason, Mike and Riley Taitingfong. (2023 November 6) ORCID US Community Call. ORCID US Community. https://www.youtube.com/watch?v=sfw2MxIVEdQ.
Nelson, Alondra. "Memorandum for the Heads of Executive Departments and Agencies: Ensuring Free, Immediate, and Equitable Access to Federally Funded Research" (2022), https://doi.org/10.21949/1528361
Okun, Tema. “White Supremacy Culture Characteristics.” White Supremacy Culture. 2022 https://www.whitesupremacyculture.info/characteristics.html
Posner, Miriam. “Humanities Data: A Necessary Contradiction,” Miriam Posner’s Blog, June 25, 2015, https://miriamposner.com/blog/humanities-data-a-necessary-contradiction/
Primack, Richard B. and Amanda S. Gallinat. “Spring Budburst in a Changing Climate.” American Scientist, 104 no. (2016), 102–109. http://www.jstor.org/stable/44808881
Puebla, Iratxe, Daniella Lowenberg, Matt Buys, Carly Strasser, Julia Lane, Stefanie Haustein, Nicholas Robinson-Garcia, Mike Thelwall, and Thed van Leeuwen. “Make Data Count Summit presentations.” Make Data Count Summit, 22 September 2023, Washington DC. Zenodo. https://doi.org/10.5281/zenodo.8370593
Salaree, Amir, Zack Spica, and Yihe Huang. “Solving a Seismic Mystery With the Audio From a Diver’s Camera: A Case of Shallow Water T-Waves in the Persian Gulf.” Geophysical Research Letters, 50 no. 18 (2023), https://doi.org/10.1029/2023GL104544