Guides: Project guide for researchers: Take note of data protection and data security

Data protection and data security

Data protection means protecting the privacy and trust of an individual (data subject) and preventing non-consensual processing of their personal data. The processing of personal data must always happen for a specified purpose and on a lawful basis.

The Uniarts Helsinki data protection policy describes the main principles, obligations and procedures that the university adheres to in the processing of personal data. All researchers must comply with the GDPR, the Uniarts Helsinki privacy policy and the university’s other guidelines.

Taking care of data security means taking care of the availability, confidentiality and integrity of data. Data security can be seen as comprising the practical procedures that are used to guarantee the security of data.

Data security must be taken care of during all stages of data processing, both when handling equipment and when choosing and using devices, systems, methods and services. Researchers must pay special attention to the secure processing of research material.

Classified and confidential materials as well as materials containing personal data may only be stored in a Uniarts Helsinki network folder. Researchers have access to a personal home folder and, on request, shared network folders that can be used to share files between team members. Home folders can only be accessed with a Uniarts ID or on a university computer, and the folder is personal. Network folders can only be accessed through the Uniarts Helsinki network or a Uniarts VPN connection. Please note that the tools may only be used to disclose the data to people who have the right to it, nobody else.

Processing personal data - responsibilities of researcher and project leader

Examples of data processing include the collection, recording, organisation, storage, adaptation or alteration, retrieval or transfer of personal data, as well as measures concerning personal data.

Individuals who process personal data as a part of a research project must comply with the codes of conduct in research data protection that the university has committed to following. University researchers must follow responsible conduct of research practices in processing personal data so that the ethics and quality of the research and the integrity of the scientific community are protected, as indicated in the European Code of Conduct for Research Integrity.

When drawing up research and data management plans, the requirements for processing personal data and compliance with the European GDPR must be taken into account.

All researchers must plan the processing of personal data in advance, covering its entire life cycle. The project’s risks must be assessed, and appropriate technical and organisational measures for protecting personal data must be taken based on the assessment. The processing of personal data must be limited to the minimum amount required for reaching the academic goals of the project. In addition, the data must be pseudonymised or anonymised.

The principal investigator is responsible for the project complying with data protection legislation and data protection policy. They must also make sure that researchers who are to process personal data receive proper orientation into data processing practices before proceeding. The principal investigator specifies the employees’ respective responsibilities and obligations in processing personal data based on data protection policy.

Privacy notice and controllers

The processing of personal data in each research project is detailed in the project’s privacy notice, which provides a more detailed account of the specific issues concerning the scientific research in question. The privacy notice describes, for example, the purpose for processing personal data and the rights of the research subjects. It also names the project’s person-in-charge or the group responsible for the research. Research subjects must be given sufficient information on the contents of the research project.

The university acts as the controller in research projects where the purpose and methods of processing personal data are defined by the university. This is the case with research projects which are approved by the university and whose funding is directed to the university. The university and the researcher may also act as joint controllers, as is the case when the researcher defines the purpose of processing personal data.

Anonymous data

Direct identifiers
Information that is sufficient on its own to identify an individual includes a person's full name, social security number, email address containing the personal name, and biometric identifiers (fingerprints, facial image, voice patterns, iris scan, hand geometry or manual signature). These types of data are called direct identifiers.

Strong indirect identifiers
Other information that may be used to identify an individual fairly easily includes a postal address, phone number, vehicle registration number, bibliographic citation of a publication by the individual, email address not in the form of the personal name, web address to a web page containing personal data, unusual job title, very rare disease, or position held by only one person at a time (e.g. chairperson in an organisation). A rare event can also reveal the identity of an individual. These types of information are called strong indirect identifiers.

At FSD, strong indirect identifiers also include the types of codes that can be used to unequivocally identify an individual from among a group of individuals. These include, for instance, a student ID number and an insurance or bank account number.

Indirect identifiers
Indirect identifiers (or quasi-identifiers) are pieces of information that on their own are not enough to identify someone but, when linked with other available information, could be used to deduce the identity of a person. Background variables and indirect identifiers include, for instance, age, gender, education, status in employment, economic activity and occupational status, socio-economic status, household composition, income, marital status, mother tongue, ethnic background, place of work or study and regional variables. Indirect identifiers relating to region of residence include, for example, post code, neighbourhood, municipality, and major region.

Dates can also be indirect identifiers. Date of birth is the most common example, but dates of death and dates of newsworthy events may also be indirect identifiers in research data when combined with other information. In health and medical research, treatment and sampling dates may also occasionally be indirect identifiers when linked to other information.
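The point that indirect identifiers become identifying only in combination can be illustrated with a short sketch. The records, field names and the helper `unique_records` below are hypothetical and invented for illustration; the sketch simply counts how many records are singled out by one attribute compared with several attributes together.

```python
from collections import Counter

# Hypothetical records that contain only indirect identifiers.
records = [
    {"age": 34, "occupation": "teacher", "municipality": "Helsinki"},
    {"age": 34, "occupation": "nurse", "municipality": "Helsinki"},
    {"age": 51, "occupation": "teacher", "municipality": "Espoo"},
    {"age": 51, "occupation": "nurse", "municipality": "Espoo"},
    {"age": 34, "occupation": "teacher", "municipality": "Espoo"},
]


def unique_records(rows, fields):
    """Return the rows whose combination of `fields` occurs exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [row for row in rows if counts[tuple(row[f] for f in fields)] == 1]


for fields in (["age"], ["occupation"], ["municipality"],
               ["age", "occupation", "municipality"]):
    print(fields, "->", len(unique_records(records, fields)), "unique record(s)")
```

In this made-up dataset no single attribute singles anyone out, yet the three attributes together make every record unique. Real data need the same kind of check on the actual variables, and a check of this kind does not by itself show that data are safe to share.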

Processing research data containing identifiers

Identifiable data may be used for scientific research when the use is appropriate, planned and justified, and when there is a legal basis for processing the data.

From the point of view of research participants, processing personal data carries the risk that confidential information relating to them is revealed to outsiders. Therefore, personal data processing must be planned carefully. Data protection must not be jeopardised, for example, by careless preservation or insecure digital transfers. You can adapt the various guarantees presented in these Data Management Guidelines, including data minimisation, pseudonymisation and anonymisation, for your purposes when processing personal data. Anonymisation is one way of making the data available for sharing and reuse. If necessary, the data can be further protected by administrative and technical data security solutions.

Minimisation

Only the minimum amount of personal data necessary to accomplish a task (e.g. research) should be collected. Personal data must not be collected just in case they might be useful in the future. There has to be a clear, specified need for collecting the personal data.
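As a minimal sketch of what minimisation can look like at the point of collection, the example below keeps only the fields named in a project's plan and discards the rest before anything is stored. The field names, the `NEEDED_FIELDS` set and the `minimise` helper are assumptions made up for illustration, not part of any university system.

```python
# Hypothetical field names; in practice the list of needed fields
# would come from the research and data management plan.
NEEDED_FIELDS = {"participant_code", "age_group", "instrument", "practice_hours"}


def minimise(raw_response: dict) -> dict:
    """Keep only the fields the research question actually requires."""
    return {k: v for k, v in raw_response.items() if k in NEEDED_FIELDS}


raw = {
    "participant_code": "P-017",
    "full_name": "Example Person",
    "email": "example.person@example.org",
    "age_group": "25-34",
    "instrument": "cello",
    "practice_hours": 14,
}

print(minimise(raw))
```

Using an explicit allow-list rather than a list of fields to remove means that any new, unplanned field is dropped by default.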

Pseudonymisation

Pseudonymisation refers to the removal or replacement of identifiers with pseudonyms or codes, which are kept separately and protected by technical and organisational measures. The data remain pseudonymous as long as the additional identifying information exists; individual data subjects cannot be identified from pseudonymous data without that additional, separate information.
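The replacement of identifiers with codes can be sketched as follows. This is an illustrative example only: the function `pseudonymise`, the `name` field and the `P-` code format are hypothetical, and the sketch assumes the returned key is then stored apart from the research data under its own access controls.

```python
import secrets


def pseudonymise(rows, id_field="name"):
    """Replace `id_field` with a random code.

    Returns (data, key). The key maps each code back to the original
    identifier and must be stored separately from the data.
    """
    key = {}    # code -> original identifier
    codes = {}  # original identifier -> code, so one person keeps one code
    data = []
    for row in rows:
        identifier = row[id_field]
        if identifier not in codes:
            code = "P-" + secrets.token_hex(4)
            while code in key:  # regenerate in the unlikely case of a collision
                code = "P-" + secrets.token_hex(4)
            codes[identifier] = code
            key[code] = identifier
        new_row = {k: v for k, v in row.items() if k != id_field}
        new_row["code"] = codes[identifier]
        data.append(new_row)
    return data, key


rows = [
    {"name": "Example Person A", "answer": 4},
    {"name": "Example Person B", "answer": 2},
    {"name": "Example Person A", "answer": 5},
]
data, key = pseudonymise(rows)
print(data)  # research data without names
```

Random codes are used here instead of codes derived from the name, so that the code itself reveals nothing about the person. Only direct identifiers are handled in this sketch; indirect identifiers left in the data may still enable identification.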

Data are not pseudonymous if a specific data subject is identifiable from the data alone, without additional information (ibid.). This could happen when indirect identifiers and exceptional records enable identification, even if personal identification numbers and other direct identifiers are stored separately and securely.

Pseudonymous data become anonymous when the separately kept identifying information (decryption key, personal data and information on the techniques used to pseudonymise the data) is destroyed. If you cannot dispose of the separately kept personal data, you can make pseudonymous data anonymous by destroying the decryption key and information on the pseudonymisation processes, and by re-arranging the data, for example, according to new, randomised case IDs. The data are anonymous if they cannot be linked to the original personal data with reasonable effort.
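Re-arranging the data under new, randomised case IDs might look roughly like the sketch below. It assumes one row per person and a hypothetical `code` field holding the old pseudonym; the function name `sever_link` is also invented for illustration. The old key and any notes on how the codes were generated would still have to be destroyed separately.

```python
import random
import secrets


def sever_link(pseudonymous_rows, code_field="code"):
    """Drop the old codes, shuffle the rows and assign new random case IDs.

    No mapping between old codes and new case IDs is created or kept.
    """
    rows = [{k: v for k, v in row.items() if k != code_field}
            for row in pseudonymous_rows]
    random.SystemRandom().shuffle(rows)  # new order unrelated to the old one

    new_ids = set()
    while len(new_ids) < len(rows):      # collect distinct random IDs
        new_ids.add(secrets.token_hex(4))

    for row, new_id in zip(rows, new_ids):
        row["case_id"] = new_id
    return rows


rows_in = [{"code": "P-1", "answer": 4}, {"code": "P-2", "answer": 2}]
rows_out = sever_link(rows_in)
print(rows_out)
```

A step like this only removes the link through the codes. Whether the result counts as anonymous still depends on the indirect identifiers that remain in the rows.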

For instance, research data of a longitudinal study remains identifiable for as long as the research group has the decryption key to the personal data of the research subjects. The data will not become anonymous even if the decryption key is coded twice. However, coding and double coding as well as pseudonymisation in general are useful guarantees to prevent the use of identifiers in analyses.

Anonymisation

Anonymisation refers to the various techniques and tools used to achieve anonymity. Data are anonymous if characteristic attributes (e.g. combinations of certain indirect identifiers) pertain to more than one person and a data subject cannot be identified with reasonable effort. When data are anonymous, individual data subjects cannot be identified from indirect identifiers or by combining the data with information available elsewhere. New data on the same research subjects cannot be added to an anonymous dataset. For the data to count as anonymous, anonymisation must be irreversible. An individual data unit (person) cannot be re-identified with reasonable effort based on the data provided or by combining the data with additional data points.
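One common way to test the condition that combinations of indirect identifiers "pertain to more than one person" is a k-anonymity-style group count. This technique is swapped in here for illustration and is not prescribed by the source. The sketch below uses hypothetical records and helper names (`smallest_group`, `age_band`) and shows how coarsening exact ages into ten-year bands raises the smallest group size from 1 to 2.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


def age_band(age, width=10):
    """Replace an exact age with a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


rows = [
    {"age": 31, "municipality": "Helsinki"},
    {"age": 38, "municipality": "Helsinki"},
    {"age": 52, "municipality": "Espoo"},
    {"age": 57, "municipality": "Espoo"},
]

# With exact ages every combination is unique.
print(smallest_group(rows, ["age", "municipality"]))  # 1

# After banding, each combination is shared by two people.
banded = [{"age_band": age_band(r["age"]), "municipality": r["municipality"]}
          for r in rows]
print(smallest_group(banded, ["age_band", "municipality"]))  # 2
```

A group size above 1 is a necessary check rather than proof of anonymity: the assessment also has to consider what other information could be combined with the data.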

Completely anonymous data do not exist, but with well-executed procedures one can achieve a result where individual persons cannot be identified with reasonable effort.

Personal data

Personal data means any information relating to an identified or identifiable natural person. A natural person is considered identifiable if they can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

The information may refer to an individual's private or family life, health, physical characteristics, professional activities, and social behaviour. Research data may also contain identifiers relating to research subjects' family and friends or other third parties. Identifying information relating to these persons also constitutes personal data.

Processing of special categories of personal data

As a rule, the processing of personal data belonging to special categories is prohibited. Such data reveal a person’s racial or ethnic origin, political opinions, religious or philosophical beliefs or trade union membership, or consist of data concerning health, sexual orientation or activity, or genetic and biometric data used for identifying the person. Such data merit specific protection, because their processing could create significant risks to the fundamental rights and freedoms of the individual.

Source for the texts on this page: The Finnish Social Science Data Archive (FSD)'s Data Management Guidelines

Contact us

Data protection
privacy@uniarts.fi
Minna Eskola

Legal Services
Titti Luukkainen

Research Services
researchservices@uniarts.fi

Uniarts Helsinki guidelines

More information

European Code of Conduct for Research Integrity
Office of the Data Protection Ombudsman
Söderlund, Liisa & Rehbinder, Maria: The Researcher's dilemma: Copyright and data protection

EU's General Data Protection Regulation (GDPR)

The EU General Data Protection Regulation (GDPR) governs data protection in all EU countries.

Personal data must always be processed in compliance with the data protection principles specified in data protection legislation. The data protection principles state that personal data must be:

processed lawfully, fairly and in a transparent manner in relation to the data subject
processed confidentially and securely
collected and processed for a specific and lawful purpose
collected only to the amount necessary with regard to the purpose of the processing
updated when required ‒ inaccurate personal data must be erased or rectified without delay
kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed.

Data protection tools for researchers

Plan your research project so that it involves as little processing of personal data as possible.
Analyse which personal data (amount and nature) are necessary for your research. Minimise the amount of personal data being processed.
Make a risk assessment and plan appropriate protection procedures for the entire life cycle of the data processing.
Keep the procedure guidelines at hand for situations such as data protection breaches.
Recognise the basis for processing.
Recognise the data subject’s rights related to the basis for processing and make sure they are fulfilled.
Document your data protection procedures to demonstrate your compliance with data protection regulations (accountability).
Recognise your role and responsibilities! As the controller, you are responsible for the lawfulness of the processing of personal data throughout its life cycle.
Build trust and ensure good conditions for future research by following data protection regulations and promoting transparency and openness.
Data protection tools are an essential part of every researcher’s tool kit. Increase your skill set and stay up to date.