Antti Orava, Information security and data protection officer
Merja Sagulin, Head of Research Services
Phone: +358 50 590 6797
Maria Rehbinder, Copyright Attorney
Phone: +358 50 570 3396
Data protection means protecting the privacy and trust of an individual (data subject) and preventing non-consensual processing of their personal data. The processing of personal data must always happen for a specified purpose and on a lawful basis.
Examples of data processing include the collection, recording, organisation, storage, adaptation or alteration, retrieval or transfer of personal data, as well as measures concerning personal data.
Individuals who process personal data as a part of a research project must comply with the codes of conduct in research data protection that the university has committed to following. University researchers must follow responsible conduct of research practices in processing personal data so that the ethics and quality of the research and the integrity of the scientific community are protected, as indicated in the European Code of Conduct for Research Integrity.
The required ethics review must be carried out before beginning the collection of personal data, following the instructions given by the Finnish National Board on Research Integrity (TENK) and the university. You may use the Horizon 2020 self-assessment guide as a basis for assessing how well the ethical principles are realised.
When drawing up research and data management plans, the requirements for processing personal data and compliance with the European GDPR must be taken into account.
All researchers must plan the processing of personal data in advance, covering its entire life cycle. The project’s risks must be assessed, and appropriate technical and organisational measures for protecting personal data must be taken based on the assessment. The processing of personal data must be limited to the minimum amount required for reaching the academic goals of the project. In addition, the data must be pseudonymised or anonymised.
The principal investigator is responsible for the project complying with data protection legislation and data protection policy. They must also make sure that researchers who are to process personal data receive proper orientation into data processing practices before proceeding. The principal investigator specifies the employees’ respective responsibilities and obligations in processing personal data based on data protection policy.
Project guide for researchers: Research ethics
The processing of personal data in each research project is detailed in the project’s privacy notice, which provides a more detailed account of the specific issues concerning the scientific research in question. The privacy notice describes, for example, the purpose for processing personal data and the rights of the research subjects. It also names the project’s person-in-charge or the group responsible for the research. Research subjects must be given sufficient information on the contents of the research project.
Research participants must also give their written consent for participation in the research.
The researcher draws up the privacy notice together with Legal Counsel (IPR) Maria Rehbinder (email@example.com) and Data Protection Officer Antti Orava (firstname.lastname@example.org). A template for the privacy notice and the research subject's consent form is available by contacting Merja Sagulin (email@example.com).
The university acts as the controller in research projects where the purpose and methods of processing personal data are defined by the university. This is the case with research projects which are approved by the university and whose funding is directed to the university. The university and the researcher may also act as joint controllers, as is the case when the researcher defines the purpose of processing personal data.
Information that is sufficient on its own to identify an individual includes a person's full name, social security number, email address containing the personal name, and biometric identifiers (fingerprints, facial image, voice patterns, iris scan, hand geometry or manual signature). These types of data are called direct identifiers.
Strong indirect identifiers
Other information that may be used to identify an individual fairly easily includes a postal address, phone number, vehicle registration number, bibliographic citation of a publication by the individual, email address not in the form of the personal name, web address to a web page containing personal data, unusual job title, very rare disease, or position held by only one person at a time (e.g. chairperson in an organisation). A rare event can also reveal the identity of an individual. These types of information are called strong indirect identifiers.
At FSD, strong indirect identifiers also include the types of codes that can be used to unequivocally identify an individual from among a group of individuals. These include, for instance, a student ID number or an insurance or bank account number.
Indirect identifiers (or quasi-identifiers) are the kind of information that on their own are not enough to identify someone but, when linked with other available information, could be used to deduce the identity of a person. Background variables and indirect identifiers include, for instance, age, gender, education, status in employment, economic activity and occupational status, socio-economic status, household composition, income, marital status, mother tongue, ethnic background, place of work or study and regional variables. Indirect identifiers relating to region of residence include, for example, post code, neighbourhood, municipality, and major region.
Dates can also be indirect identifiers. Date of birth is the most common example, but dates of death and dates of newsworthy events may also be indirect identifiers in research data when combined with other information. In health and medical research, treatment and sampling dates may also occasionally be indirect identifiers when linked to other information.
Identifiable data may be used for scientific research when the use is appropriate, planned and justified, and when there is a legal basis for processing the data (e.g. consent of participants or research carried out in the public interest).
From the point of view of research participants, the processing of personal data carries the risk that confidential information relating to them is revealed to outsiders. Therefore, personal data processing must be planned carefully. Data protection must not be jeopardised, for example, by careless preservation or insecure digital transfers. You can adapt the various guarantees presented in these Data Management Guidelines, including data minimisation, pseudonymisation and anonymisation, for your purposes when processing personal data. Anonymisation is one way of making the data available for sharing and reuse. If necessary, the data can be further protected by administrative and technical data security solutions.
Only the minimum amount of personal data necessary to accomplish a task (e.g. research) should be collected. Personal data must not be collected just in case they might be useful in the future. There has to be a clear, specified need for collecting the personal data.
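As an illustration of data minimisation, the following minimal Python sketch keeps an explicit list of the variables the research question needs and discards everything else before the data are stored. The function name minimise, the field names and the example records are hypothetical and not part of these guidelines.

```python
def minimise(records, needed_fields):
    """Return copies of the records that contain only the needed fields."""
    return [{f: r[f] for f in needed_fields if f in r} for r in records]


# Hypothetical raw records that contain more than the study needs.
collected = [
    {"name": "Example Person A", "phone": "000-0000", "age": 34, "education": "MA", "answer": 4},
    {"name": "Example Person B", "phone": "000-0001", "age": 51, "education": "BA", "answer": 2},
]

# The variables assumed to be necessary for the research question.
needed = ["age", "education", "answer"]

minimal = minimise(collected, needed)
print(minimal)
```

In practice, it is better not to ask for the unnecessary fields at all; filtering after collection is only a fallback.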
Pseudonymisation refers to the removal or replacement of identifiers with pseudonyms or codes, which are kept separately and protected by technical and organisational measures. The data remain pseudonymous as long as the additional identifying information exists. Pseudonymous data cannot be attributed to a specific data subject without that additional, separate information.
Data are not pseudonymous if a specific data subject is identifiable from the data alone, without additional information (ibid.). This could happen when indirect identifiers and exceptional records enable identification, even if personal identification numbers and other direct identifiers are stored separately and securely.
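The basic mechanics of pseudonymisation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function pseudonymise, the field names and the sample records are hypothetical, and the key table it returns would have to be stored separately from the research data and protected by technical and organisational measures.

```python
import secrets


def pseudonymise(records, direct_identifiers):
    """Split records into pseudonymous data and a separate key table."""
    key_table = {}      # pseudonym -> removed direct identifiers (store separately)
    pseudonymous = []
    for record in records:
        # Random code that is not derived from the personal data itself.
        pseudonym = secrets.token_hex(8)
        while pseudonym in key_table:
            pseudonym = secrets.token_hex(8)
        key_table[pseudonym] = {
            f: record[f] for f in direct_identifiers if f in record
        }
        cleaned = {f: v for f, v in record.items() if f not in direct_identifiers}
        cleaned["case_id"] = pseudonym
        pseudonymous.append(cleaned)
    return pseudonymous, key_table


# Hypothetical example records.
records = [
    {"name": "Example Person A", "email": "a@example.org", "age": 34, "education": "MA"},
    {"name": "Example Person B", "email": "b@example.org", "age": 51, "education": "BA"},
]

pseudo, key_table = pseudonymise(records, ["name", "email"])
print(pseudo)
```

Removing direct identifiers in this way does not deal with indirect identifiers: a record with an exceptional combination of background variables may still be identifiable from the data alone.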
Pseudonymous data become anonymous when separately kept identifying information (decryption key, personal data and information on the techniques used to pseudonymise the data) are destroyed. If you cannot dispose of the separately kept personal data, you can make pseudonymous data anonymous by destroying the decryption key and information on the pseudonymisation processes, and by re-arranging the data, for example, according to new, randomised case IDs. The data are anonymous if they cannot be linked to the original personal data with reasonable effort.
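The re-arranging step described above could look roughly like the following Python sketch. It assumes that the decryption key and the documentation of the pseudonymisation process are destroyed separately; the function reassign_case_ids, the field name case_id and the sample records are hypothetical. The function deliberately returns no mapping between old and new IDs.

```python
import random


def reassign_case_ids(pseudonymous_records, id_field="case_id"):
    """Shuffle the records and give them new IDs unrelated to the old ones."""
    rng = random.SystemRandom()          # randomness from the operating system
    shuffled = [dict(r) for r in pseudonymous_records]  # copy, leave input intact
    rng.shuffle(shuffled)
    for new_id, record in enumerate(shuffled, start=1):
        record[id_field] = new_id        # the old pseudonym is overwritten
    return shuffled


# Hypothetical pseudonymous records.
pseudonymous = [
    {"case_id": "a1f3", "age": 34},
    {"case_id": "9bc2", "age": 51},
    {"case_id": "77de", "age": 28},
]

anonymised = reassign_case_ids(pseudonymous)
print(anonymised)
```

New case IDs alone do not make the data anonymous. The original pseudonymous file must also be disposed of, and the remaining indirect identifiers must not allow linking back to the original personal data with reasonable effort.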
For instance, research data of a longitudinal study remains identifiable for as long as the research group has the decryption key to the personal data of the research subjects. The data will not become anonymous even if the decryption key is coded twice. However, coding and double coding as well as pseudonymisation in general are useful guarantees to prevent the use of identifiers in analyses.
Anonymisation refers to the various techniques and tools used to achieve anonymity. Data are anonymous if characteristic attributes (e.g. combinations of certain indirect identifiers) pertain to more than one person and a data subject cannot be identified with reasonable effort. When data are anonymous, individual data subjects cannot be identified from indirect identifiers or by combining the data with information available elsewhere. New data on the same research subjects cannot be added to an anonymous dataset. For the data to count as anonymous, anonymisation must be irreversible. An individual data unit (person) cannot be re-identified with reasonable effort based on the data provided or by combining the data with additional data points.
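One simple way to check the condition that characteristic attributes pertain to more than one person is to count how many records share each combination of indirect identifiers. The Python sketch below is only a first screening under stated assumptions: the function unique_combinations, the variable names and the sample records are hypothetical, and passing the check does not prove that the data are anonymous, since it ignores information available elsewhere.

```python
from collections import Counter


def unique_combinations(records, quasi_identifiers):
    """Return the combinations of indirect identifiers held by only one record."""
    counts = Counter(
        tuple(r.get(q) for q in quasi_identifiers) for r in records
    )
    return [combo for combo, n in counts.items() if n < 2]


# Hypothetical records with indirect identifiers only.
records = [
    {"age_group": "30-39", "municipality": "Municipality A", "occupation": "teacher"},
    {"age_group": "30-39", "municipality": "Municipality A", "occupation": "teacher"},
    {"age_group": "50-59", "municipality": "Municipality B", "occupation": "chairperson"},
]

risky = unique_combinations(records, ["age_group", "municipality", "occupation"])
print(risky)
```

Any combination the function returns belongs to a single record and would need further treatment, such as coarser categories or removal, before the data could be considered for sharing.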
Completely anonymous data do not exist, but with well-executed procedures one can achieve a result where individual persons cannot be identified with reasonable effort.
Source for the texts on this page: The Finnish Social Science Data Archive (FSD)'s Data Management Guidelines
The EU General Data Protection Regulation (GDPR) governs data protection in all EU countries.
Personal data must always be processed in compliance with the data protection principles specified in data protection legislation. The data-protection principles state that personal data must be
Personal data means any information relating to an identified or identifiable natural person. A natural person is considered identifiable if they can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
The information may refer to an individual's private or family life, health, physical characteristics, professional activities, and social behaviour. Research data may also contain identifiers relating to research subjects' family and friends or other third parties. Identifying information relating to these persons also constitutes personal data.
As a rule, the processing of personal data belonging to special categories is prohibited. Such data reveal the person’s racial or ethnic origin, political opinions, religious or philosophical beliefs or trade union membership, or consist of data concerning health, sexual orientation or activity, or genetic and biometric data used for identifying the person.
Such data merit specific protection, because their processing could create significant risks to the fundamental rights and freedoms of the individual.