Data Retention
In the era of data-driven decision-making, colleges and universities are increasingly collecting, and notably retaining and sharing, diversity-related data from students, employees, and affiliates, in some cases for decades after the individual's connection with the institution has ended. This data can provide valuable insights to promote inclusivity and equal opportunities, but important questions about data retention, privacy, and security must be weighed against these benefits. As a core component of effective data governance, policies and standards for data retention should increase transparency and accountability in the handling of PII.
Although institution type, location, and even department will play a role in compliance and regulation related to PII, the principle of data minimization holds that, even before retention is considered, an institution should collect only the data it needs to meet its goals and keep that data only as long as necessary. Officials at institutions within jurisdictions that pose risks to certain users should consider deleting any SOGI data that they are not required by law to maintain.
Determining who has the right to delete or keep data can be complex. Generally, the institution has the right to retain data for legitimate purposes, such as compliance with legal requirements, quality improvement practices, or research. However, under laws such as the European Union's General Data Protection Regulation (GDPR), covered individuals have the right to request the deletion of their data in certain circumstances. Understanding what laws apply to your institution is a critical step in determining data retention scenarios. Within the United States, each state may have specific requirements related to retention based on storage, sensitivity, reporting, and/or approved methods of destruction, and these requirements should be taken into consideration when deciding whether to collect data. In addition, there may be different requirements based on the type of institution (e.g., public, private, for-profit).
Schedules and Governance
A data retention schedule that specifies how long different types of data will be kept is helpful for operational use, accounts for the fact that gender data can change over time, and provides individuals with the information they need for truly informed consent. While it might be tempting to retain all data "just in case" it might be useful in the future, this approach creates significant risk. Retaining large amounts of data increases the chances of retaining incorrect or outdated information, and it escalates the potential damage if a data breach occurs or if data is requested and legally obtained by outside parties who could use it in harmful ways.
Instead, data retention decisions should be based on a clear understanding of what data is needed, why, and for how long. These decisions should be reviewed periodically to comply with new regulations, policies, and institutional priorities. In most cases, a retention schedule will apply to existing data, but institutional leaders should consider how any new data collection will fit into the established schedule. This schedule should balance the need to retain data for analysis, legal compliance, and historical record-keeping with the need to respect privacy and mitigate the risks to individuals whose data is collected.
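As a minimal sketch of what such a schedule could look like in machine-readable form, the Python below records, for each data type, why it is kept, for how long, and when the rule itself is next due for review. The data types, retention periods, review dates, and function names are all hypothetical and chosen only for illustration; actual values depend on the laws and policies that apply to a given institution.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    data_type: str       # category of data the rule covers
    purpose: str         # why the institution keeps this data
    retention_days: int  # how long to keep a record after collection
    next_review: date    # when the rule itself should be revisited


# Hypothetical schedule: these periods and dates are illustrative assumptions,
# not recommendations.
SCHEDULE = {
    "gender_identity": RetentionRule(
        "gender_identity", "student services", 365, date(2025, 7, 1)
    ),
    "enrollment_record": RetentionRule(
        "enrollment_record", "legal compliance", 7 * 365, date(2025, 7, 1)
    ),
}


def is_past_retention(data_type: str, collected_on: date, today: date) -> bool:
    """Return True if a record has outlived its retention period."""
    rule = SCHEDULE[data_type]
    return today > collected_on + timedelta(days=rule.retention_days)


def rules_due_for_review(today: date) -> list[str]:
    """List the data types whose retention rule is due for periodic review."""
    return sorted(
        name for name, rule in SCHEDULE.items() if rule.next_review <= today
    )
```

Keeping the purpose and review date next to each retention period is one way to make the "what, why, and for how long" of each decision visible to the people who later revisit it.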
Authority over data retention in specific data domains should be established as part of the data governance process. Including retention discussions in data governance generally provides an opportunity for key stakeholders to understand applicable laws and regulations and then discuss retention options, balancing current needs against future risks.
Although many institutions have decentralized systems, with different departments or units collecting and storing their own data, considering data retention from an institutional perspective is essential. Institutions with centralized data governance, including clear policies and procedures for all areas of the enterprise, will find that retention discussions fit well within this existing structure. Institutions that have not implemented a data governance structure might consider using data retention as a catalyst for driving wider-reaching discussions of data governance, including how data is shared externally with entities such as vendors, alumni, and other external associations. Identifying immediate needs and developing standard operating procedures for data retention in this context can enhance understanding of the critical role of, and need for, data governance more broadly. In addition to data retention schedules, it is important to regularly revisit other issues and considerations related to this data, including SOGI identities, categories, and security.
Archives and Backups
Every college and university should outline how it will back up, archive, and destroy paper and digital documents. Cleaning a database of information that could put people at risk is not sufficient if backups still contain that data. The approach to managing archives and backups should be comprehensive and unified; should incorporate input from several stakeholder groups; should apply throughout departments, units, and campuses; and should identify oversight and governance. The policy should be nimble enough to be amended or modified, and teams must record changes in a written policy document. All institutions should have clear plans ready to deploy to permanently delete data from databases and put protections in place the moment a shift in risk level is apparent, with duties already assigned to those who would be responsible for taking action. Deidentified data should be maintained only if it is not possible to use the data to re-identify individuals under any circumstances. Those requesting data may have the legal authority to acquire and use any key file or access-limited database that can be used to re-identify individuals; confidentiality policies and laws are not a safeguard.
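The point that purging a live database is not enough if backups still hold the same records can be sketched as a simple check. The Python below assumes, purely for illustration, that each storage location can report the set of record identifiers it contains; the store names, identifiers, and function name are hypothetical.

```python
def find_residual_copies(purged_ids, stores):
    """Report which stores still hold records that were meant to be deleted.

    purged_ids: identifiers of records that should no longer exist anywhere.
    stores: mapping of store name -> identifiers currently held in that store
            (live databases, backups, and archives alike).
    """
    residual = {}
    for name, ids_in_store in stores.items():
        leftover = set(purged_ids) & set(ids_in_store)
        if leftover:
            residual[name] = leftover
    return residual


# Hypothetical inventory: two records were removed from the live database,
# but copies remain in a backup and an off-site archive.
stores = {
    "primary_db": {"a2"},
    "nightly_backup": {"a1", "a2"},
    "offsite_archive": {"a3"},
}
remaining = find_residual_copies({"a1", "a3"}, stores)
```

A deletion would count as complete only when a check of this kind comes back empty for every location named in the institution's backup and archive plan.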
Data Retention Policy
A data retention policy can comprise any or all of the following sections:
- Data map and data taxonomy: These should cover all data collected, from whom, and where it exists in the institution.
- Relevant legal and business requirements: This section explains which laws and industry standards apply to the data you gather, as well as how your organization ensures compliance with those regulations. It explains what to do in the event of rule violations or data breaches and should be modified when needs and legislation change.
- Data retention policies and procedures: Each record and data type will have its own retention time frame and protocols. Consider the data storage location, format, backup techniques, and retention period. Specify the role or team that owns and manages each data type, as well as which types of documents don't need to be kept and should be destroyed immediately.
- Appropriate actions for discovery, legal, or audit requests: Institutions must have a uniform response to discovery, legal, or audit requests. This section should define the response procedure, who is accountable for making the response, and how it will be documented.
- Steps for data destruction: A critical element is specifying the data destruction processes that apply once the retention period expires or when the risk posed by the data changes, as well as how individuals can request that their data be deleted. Specify the method for shredding paper documents. Indicate which digital files must be removed manually and which are automatically cleaned by the system.
- Data archiving procedures: Some data is not needed for daily use but is archived for regulatory and compliance reasons. Archival protocols define the document formats, storage locations, and recovery methods. Certain paper documents might be kept off-site and retrieved only when needed.
- Exception processes: Institutions can make exceptions to data retention, deletion, or archival methods. Defining these cases, why they exist, the risk involved, and mitigation plans is critical to avoiding confusion or misinterpretation. Exception processes for other data may already be in place; ensure you work with the appropriate stakeholders to develop these policies.
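One lightweight way to use the list above is as a checklist against a draft policy. The Python below is a sketch under that assumption: the section labels paraphrase the bullets, and the function name and the sample draft are hypothetical. Because a policy can comprise any or all of these sections, a missing section is a prompt for discussion, not an error.

```python
# Section labels paraphrased from the list above.
SUGGESTED_SECTIONS = [
    "data map and taxonomy",
    "legal and business requirements",
    "retention policies and procedures",
    "discovery, legal, and audit requests",
    "data destruction",
    "data archiving",
    "exception processes",
]


def missing_sections(draft_sections):
    """Return the suggested sections that a draft policy does not yet cover."""
    present = {section.strip().lower() for section in draft_sections}
    return [section for section in SUGGESTED_SECTIONS if section not in present]


# Hypothetical draft that so far covers only two sections.
gaps = missing_sections(["Data destruction", "Exception processes "])
```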