Data Privacy and Security Procedural Guidance
  1. Data privacy
    Privacy is a fundamental value and right of human societies. It extends to all aspects of the lives of individuals: the social, cultural, religious, political, physical, and the informational. Its protection also promotes other core human values and human rights. However, privacy is not an absolute right. Privacy protection involves the delicate balance of considerations at individual, familial, and societal levels. The following guidance assists in determining such balances relative to the protection of the core interest at stake and the Foundational Principles at the core of the Framework.
    1. Lawfulness of data processing
      All data should be processed in accordance with all applicable laws, regulations, norms, and guidelines and should only be disclosed in situations where consent has been provided, or there is a legal or legitimate interest/appropriate need for that disclosure/use.
    2. Data privacy risks and safeguards
      • Assessments of data privacy risks should include disclosure risks, and any harms reasonably likely to occur in the event of disclosure. These disclosures may result in individual or group discrimination, stigmatization, profiling or categorization that leads to unfair, unethical or discriminatory treatment contrary to human rights. The reputational risks for persons or organizations of allowing particular uses of data should also be considered.
      • Data privacy safeguards should be proportionate to the sensitivity, nature, and possible benefits, risks, and uses of the data. Such safeguards may include controlled access, pseudonymization, and anonymization of data, and quantitative techniques such as differential privacy, k-anonymity, ℓ-diversity, and t-closeness.
      • Data processing agreements (also known as data transfer, data use, and data sharing agreements) between persons and/or organizations are an important privacy safeguard.
      • Consideration should be given to adopting mechanisms that address compelled disclosure requests by state authorities of identifiable data and that prevent unauthorized access by third parties.
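      Of the quantitative techniques named above, k-anonymity is the simplest to illustrate. The sketch below is a minimal example, not a prescribed method: it computes the size of the smallest group of records that share the same quasi-identifier values. The function name `k_anonymity`, the field names, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    identical values for every quasi-identifier. Returns 0 if empty."""
    if not records:
        return 0
    # Count how many records fall into each combination of quasi-identifier values.
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(classes.values())


# Hypothetical, already-generalized records (illustrative only).
records = [
    {"age_band": "30-39", "region": "North", "diagnosis": "A"},
    {"age_band": "30-39", "region": "North", "diagnosis": "B"},
    {"age_band": "40-49", "region": "South", "diagnosis": "A"},
]

k = k_anonymity(records, ["age_band", "region"])
print(k)  # the single 40-49/South record forms a group of size one
```

      A dataset whose result is below the chosen threshold would need further generalization or suppression before release. Note that k-anonymity alone does not guard against attribute disclosure, which is why ℓ-diversity and t-closeness are listed alongside it.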
    3. Consent and other lawful bases
      • Data should be used strictly in accordance with the data subject’s (or their legal representative’s) consent for processing, and/or the terms and conditions of authorization for lawful processing by competent bodies or institutions (e.g. terms and conditions set by research ethics committees, waivers of consent), and in compliance with international and national laws (including tribal, indigenous, and aboriginal laws), regulations, general ethical principles, and best practice standards that respect conditions on downstream uses of the data.
    4. Re-identification
      • Any attempt to re-identify individuals or to generate information (e.g. facial images or comparable representations) that could allow the identities of research participants to be readily ascertained should be strictly prohibited (and subject to sanction) unless expressly authorized by law.
      • Reasonable steps should be taken to prevent the identity of data subjects being leaked or determined through indirect means such as metadata, URLs, and message headers.
    5. Data quality
      • In order to promote responsible and valuable sharing, data and any associated metadata should be, to the greatest extent reasonably possible, accurate; verifiable; unbiased; current; stored in systems that enhance security, interoperability, and replicability; and in compliance with commonly accepted standards for data and metadata annotation.
      • Regular quality assessments of datasets should be conducted.
    6. Identifiable data disclosure to the public
      • Subject to any applicable laws and/or the terms and conditions of authorization for lawful processing by competent bodies or institutions (e.g. research ethics committees), identifiable data should only be disclosed publicly in a publication or other format if: (1) data subjects have provided their explicit consent to public disclosure of their identifiable data and have been made aware of any reasonably foreseeable risks associated with the disclosure, and the disclosure is necessary for the purpose concerned; (2) data subjects have knowingly made their identifiable data public by their own explicit actions or permissions; or (3) disclosure serves a public interest, is necessary for the purpose concerned, and adequate safeguards are in place.
    7. Data sustainability
      • Where appropriate and in accordance with the data subject’s (or their legal representative’s) consent for processing, and/or the terms and conditions of authorization for lawful processing by competent bodies or institutions, and subject to appropriate safeguards, data should be retained for future processing through both archiving and using appropriate indexing and retrieval systems.
      • A plan should be established for the possible discontinuance of a database or initiative, and in particular should establish, if possible, whether the data will be archived or transferred to another database for use in future initiatives. If such archiving or transfer to another database is foreseen, the plan should make clear that data will continue to be shared with data users subject to ongoing governance oversight through e.g. a research ethics committee and/or data access committee. The lawful basis for the archiving or transferring of data to another database for use in future initiatives (e.g. data subject consent) should be verified.
    8. Controlled access and registered access
      • Requests by data users for access to data should demonstrate to those managing access requests (e.g. data stewards, research ethics committees, and/or data access committees), at a minimum: (1) legitimate interest in and intended use(s) of the data; (2) accessibility of the data only to authorized individuals; (3) a reasonable and specified time period of data access; and (4) destruction of the data after agreed use.
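      The four minimum elements of an access request listed above could be checked mechanically before a request reaches a data access committee. The following is a sketch under stated assumptions: the `AccessRequest` class, its field names, and the `missing_elements` helper are hypothetical and do not correspond to any GA4GH schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AccessRequest:
    """Hypothetical access request record (field names are assumptions)."""
    intended_use: str = ""
    authorized_users: List[str] = field(default_factory=list)
    access_start: Optional[date] = None
    access_end: Optional[date] = None
    destruction_commitment: bool = False


def missing_elements(req: AccessRequest) -> List[str]:
    """List which of the four minimum elements the request fails to show."""
    missing = []
    if not req.intended_use.strip():
        missing.append("legitimate interest and intended use")
    if not req.authorized_users:
        missing.append("authorized individuals")
    if (req.access_start is None or req.access_end is None
            or req.access_end <= req.access_start):
        missing.append("specified access period")
    if not req.destruction_commitment:
        missing.append("destruction after agreed use")
    return missing


complete = AccessRequest(
    intended_use="Variant frequency analysis for study X",
    authorized_users=["researcher-a", "researcher-b"],
    access_start=date(2025, 1, 1),
    access_end=date(2025, 12, 31),
    destruction_commitment=True,
)
print(missing_elements(complete))         # nothing missing
print(missing_elements(AccessRequest()))  # all four elements missing
```

      A check like this only confirms that each element has been stated; judging whether the stated interest is legitimate or the period reasonable remains a matter for those managing access requests.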
    9. Data breach
    10. Accountability
      • All persons and organizations are accountable for promoting and protecting data privacy and security, including when data are shared with data users, repositories, and service providers.
      • Data stewards should keep track of all whereabouts of the data and the persons and/or organizations with access to the data.
      • Data stewards should clearly identify the individuals within their organization who are responsible for data privacy, data management, and reporting procedures (including a contact person or contact point for complaints). Appropriate and regular training for the identified individuals to discharge these duties should be provided.
      • Data stewards should track relevant new laws, regulations, policies, expectations, and best practices, sharing these with responsible individuals within their organization or entity, and with data users as appropriate.
      • Where relevant, ongoing communication links should be maintained between data stewards, data users, and research ethics committees and/or data access committees.
    11. Transparency
      • Policies and practices with respect to the privacy and security management of data and access arrangements should be made publicly available. Plain language summaries of these policies and practices and access arrangements should also be made public.
      • General information should be made openly available on an ongoing basis to data subjects as a group about how their data are being used and for what purposes.
      • For data that are not anonymized, a procedure should be established to provide individual data subjects, if they so request, information about how their data are being used and for what purposes.
    12. Complaints or inquiries
      • Procedures should be established to receive and respond to complaints or inquiries about policies and practices relating to the privacy and security of data or data access requests. The procedures should be easily accessible and simple to use and should involve a commitment to deal with all complaints in a timely fashion.
    13. Vulnerable populations
  2. Security
    Security is concerned with organizational, technical, and physical measures and standards to effectively manage risks to the confidentiality and integrity of data and the availability of resources and services. Due regard should be paid to the GA4GH Security Technology Infrastructure, which complements this policy. The following guidance promotes safe and effective data sharing environments.
    1. Organizational measures
      • As human errors are among the most difficult errors to control, organizations should, with ongoing commitment of adequate resources: (1) develop, monitor, and enforce policies (consistent with this policy) to secure data; (2) appoint a security officer responsible for implementing and enforcing security policies and practices, and responsible for monitoring them through standards, procedures, and baselines; (3) implement internal and external security reviews and audits; and (4) implement and require ongoing training and education of personnel on privacy and security policies and best practices.
      • The number of copies of data (as backup or otherwise) stored by persons or organizations should be kept to the minimum necessary to ensure adequate protection of the data in the event of primary copy data loss.
      • Each organization should implement Identity and Access Management (IAM) policies, procedures, and technologies to verify the identity of each individual to whom access rights are to be granted, and to ensure that each individual is given access to all of (and only) the type and volume of data and services required for a specified period of time. IAM includes identity proofing, credential issuance, rights authorization, identity authentication, and rights revocation. As part of the IAM policies, organizations should maintain a list of persons having access to data and the list should be reviewed regularly and authenticated.
      • Organizations that agree to recognize and accept authenticated identities and security attributes issued by other organizations (“federated identity”) have the responsibility of assuring the trustworthiness of the issuers, as well as the currency and authenticity of asserted identities. The GA4GH Authentication and Authorization Infrastructure (AAI) standard may be used to federate identity authentication and service authorization.
      • Consequences for data breaches should be clearly stipulated and enforced (see also the GA4GH Accountability Policy).
      • In the context of cloud computing, companies providing cloud computing services to store, analyze, or warehouse data should have good management infrastructure and robust data encryption capabilities. The responsibility is on the data user/organization to ensure this infrastructure is compliant with local laws and regulations when uploading data to the cloud. Organizations should ensure that cloud service providers have been independently audited against comprehensive and internationally recognized and respected information security standards, such as those promulgated by the International Organization for Standardization (ISO) and those set out in the Statement on Standards for Attestation Engagements (SSAE). Organizations should also ensure that cloud service providers hold up-to-date third-party audit certifications and that these are maintained throughout the duration of the cloud service.
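      The IAM guidance above calls for access that is limited to a specified period, can be revoked, and is recorded on a list that is reviewed regularly. A minimal sketch of that idea follows; the `AccessGrant` class and the `review_access_list` function are hypothetical names, and a real deployment would rely on an established IAM product rather than code like this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AccessGrant:
    """Hypothetical entry in an organization's access list."""
    user_id: str
    dataset: str
    expires_at: datetime
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        # Access holds only until expiry and only while not revoked.
        return not self.revoked and now < self.expires_at


def review_access_list(grants, now):
    """Split the access list into active grants and lapsed ones
    (expired or revoked) for a periodic review."""
    active = [g for g in grants if g.is_active(now)]
    lapsed = [g for g in grants if not g.is_active(now)]
    return active, lapsed


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
grants = [
    AccessGrant("alice", "cohort-1", datetime(2025, 12, 31, tzinfo=timezone.utc)),
    AccessGrant("bob", "cohort-1", datetime(2025, 1, 31, tzinfo=timezone.utc)),
    AccessGrant("carol", "cohort-2", datetime(2025, 12, 31, tzinfo=timezone.utc),
                revoked=True),
]
active, lapsed = review_access_list(grants, now)
print([g.user_id for g in active])
print([g.user_id for g in lapsed])
```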
    2. Technical measures
      • Physical and logical access to computer systems and networks should be restricted to authorized individuals, and access granted only for those information assets and functions required to perform the user’s assigned duties.
      • Whenever possible, data should be pseudonymized or anonymized at the earliest possible opportunity.
      • Where data are pseudonymized, an organization may assign a key to enable the data to be re-identified. The assigned key should not be derived from or related to the associated individual, should not be used for any other purpose, and should not disclose the mechanism used for re-identification. The direct identifiers associated with keys should be isolated on a separate dedicated server/network without external access. A defined procedure and auditable mechanism for reversing the pseudonymization to (re)attribute the data to a specific data subject should be in place.
      • Emergency-management and disaster-recovery plans and safeguards should be implemented, including regular back-ups.
      • Technical measures to secure data should comply with the relevant guidance and regulations (e.g. for clinical trials) and should aim to be interoperable with data sharing systems and software.
      • Every system that accesses, stores, or transmits data should record an audit log of all security-relevant events. Audit trails should be reviewed regularly, and all suspicious events should be investigated. Where possible, automated, enterprise-wide, audit trail monitoring, with alerts for misuse and algorithms to amend or terminate access, should be implemented. Audit logs should be maintained for a minimum of one year, or as otherwise required by applicable law, and carefully protected.
      • Configuration management of all hardware and software (including operating systems) should be implemented. Every change should be reviewed for potential privacy and security impacts.
      • Organizations should take recommended actions to protect data and services from known and emerging threats, which would include monitoring sources of security threat information and installing security-critical upgrades as soon as they become available and have passed quality assurance testing within the organization.
      • Organizations should protect data from new security vulnerabilities in any software used over the lifespan of a project involving the data. Such consideration should include ensuring that security patches to the software are promptly applied and that any vulnerabilities for which security patches cannot be applied in a timely way will be subject to scrutiny regarding alternative security safeguards.
      • Organizations should routinely test their security systems, and periodically (e.g. yearly) engage an independent third party to perform security assessment and penetration testing.
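      The pseudonymization guidance above (a key not derived from the individual, identifiers held separately, and an auditable reversal procedure) can be sketched as follows. This is an illustration under assumptions, not a reference implementation: `PseudonymVault` is a hypothetical class, and its in-memory dictionaries merely stand in for the isolated server that would hold the direct identifiers in practice.

```python
import secrets
from datetime import datetime, timezone


class PseudonymVault:
    """Hypothetical stand-in for the isolated store that maps keys to
    direct identifiers. Only the keys ever leave this component."""

    def __init__(self):
        self._key_to_identifier = {}
        self._identifier_to_key = {}
        self.audit_log = []  # record of every re-identification

    def pseudonymize(self, direct_identifier: str) -> str:
        # Reuse the existing key so one individual maps to one pseudonym.
        if direct_identifier in self._identifier_to_key:
            return self._identifier_to_key[direct_identifier]
        # Random key: not derived from, or related to, the individual.
        key = secrets.token_hex(16)
        while key in self._key_to_identifier:  # guard against collisions
            key = secrets.token_hex(16)
        self._key_to_identifier[key] = direct_identifier
        self._identifier_to_key[direct_identifier] = key
        return key

    def reidentify(self, key: str, requested_by: str, reason: str) -> str:
        # Every reversal is logged before the identifier is returned.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "key": key,
            "requested_by": requested_by,
            "reason": reason,
        })
        return self._key_to_identifier[key]


vault = PseudonymVault()
key = vault.pseudonymize("participant-0001")
print(len(key))  # 32 hexadecimal characters
```

      Using a random token rather than a hash of the identifier is a deliberate choice in this sketch: a hash of a name or record number can be recomputed by anyone who can guess the input, whereas a random key reveals nothing without access to the vault.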
    3. Physical measures
      • Computers, network equipment, media, and facilities used to collect, access, store, process, transport, or transmit data must be continuously protected using appropriate physical, technical, and procedural safeguards that limit access to authorized individuals.
      • Physical security measures should be in place to protect data from natural hazards such as floods, fires, or earthquakes.
      • Hardware used for sharing data should be tamper-resistant.