Ethics and consent for data collection, use and reuse

Alex McKeown
August 31, 2023

The standard approach to securing consent for collecting and analysing data from individuals is, in a legal and regulatory sense, well defined. Assuming the uses to which the collected data are put are clear and explicit, and the capacity of those whose data is collected can be assured, the process of securing consent need not necessarily be especially ethically complex or contentious.

In the standard consent model, the person from whom consent is sought is provided with information about: why data is being collected; how it will be stored; what it will be used for; and what will be done with the findings from the analysis or use of the data. The standard consent process also includes provisions guaranteeing that those whose data is collected can revoke permission for use of their data and have it destroyed if they wish at any later point. Most of us are familiar with this kind of consent being sought, for example when downloading an app, or by a company whose commercial services we wish to use and for which it would need to collect data about us. We are also all familiar with consent being sought in a medical context, and indeed, much ethics literature leans towards thinking about consent models in this context. However, the ethical ramifications of consent in the medical and healthcare context go beyond only how the personal data collected is to be used or reused. These wider ethical dimensions are important and extensive, so here we focus mainly on the particular issue of consent for data use and reuse.

If your organisation’s data processes require only straightforward, standard consent of the kind outlined above then the ethical risks associated with the consent process should be relatively limited. However, machine learning, predictive analytics, and artificial intelligence (AI) are increasingly used in medical treatment and research and in numerous commercial contexts, and what it might be possible to derive from analysing data using these techniques poses a challenge to the ethical adequacy of the standard approach to consent.

Therefore, if your organisation does employ these contemporary data analysis techniques, this will have implications for how you will need to approach securing consent from people whose data you wish to collect. These innovations in data science are characterised by a greater degree of unpredictability about what might be found and what further uses these findings might entail.

Given such unpredictability, your organisation is ethically obliged to be clear about this with anyone whose data you seek consent to collect, and to be clear about any extra risks associated with data to be analysed using these techniques.

Ethics and consent in big data, machine learning, predictive analytics, and artificial intelligence

Traditional models of consent may not always be well suited to big data-driven projects and research, because they were conceived in an era before the machine learning and AI techniques that can pose a challenge to these traditional models were developed[1]. A hallmark of the claimed effectiveness of the big data analytic approach is, as indicated above, its ability to make novel predictive associations that can match, and may in future surpass, conventional human means. This has important ethical consequences of which your organisation should be aware.

Assuming that claims about the effectiveness of these new analytic techniques are realistic, if your organisation uses them or is considering doing so, it is vital for you to be aware of the possibility that, in view of their unpredictability, they may produce unanticipated insights and opportunities for future uses of the data you hold. Indeed, in many cases, it may well be that your organisation uses, or plans to use, machine learning and AI techniques precisely because they can reveal unpredicted, unexpected, or novel insights about the data.

Given this unpredictability, if your organisation uses, or is going to use, machine learning and AI techniques, it is ethically necessary that you consider whether, when, and how securing consent for the reuse of data for new purposes should be managed. This includes instances where those purposes might yield findings relevant to an individual whose data you hold – for example about their health, even if the data is not collected specifically for health-related purposes – and about which they may or may not want and expect to be informed.

For some general background about why your organisation is obliged to consider the ethical implications of its data governance processes, it might be helpful to read the first article in this series.

The specific context of consent for data use and reuse is ethically important for (at least) three reasons. The first two of these are particularly relevant if your organisation operates in health care or research. First, understanding the causal basis and progression of diseases requires their study over time. Second, as indicated above, since the apparent power of big data analytics derives from its ability to make novel predictions across disparate datasets about the interactions of similarly disparate risk factors, it may be difficult to predict the scope of any health-related findings yielded by the analysis.

Third, and of general relevance to a wide range of sectors, this predictive novelty limits what can be communicated to people whose data your organisation holds about how it might be used in future. What is ethically foundational here is the welfare of the people whose data your organisation holds, as this could be compromised if procedures concerning their personal information are not adequate or properly observed. The harms, as well as the benefits, that might follow from predictive analytics are unpredictable, and this unpredictability can threaten your organisation’s ability to meet its primary ethical obligation, namely ensuring the welfare of those people whose data it holds.

There is no overall consensus about how to define optimal participant welfare, or how consent for the reuse of data should be managed in an era of machine learning and AI[2]. However, several models of consent have been developed, which your organisation should consider in establishing the processes that it ought to implement to make sure that it meets its ethical obligations. Four of these are explained below.

Alternative models of consent

Blanket consent

In a ‘blanket’ consent model, individuals agree in advance for their data to be used for any future purposes considered appropriate and relevant by the organisation(s) that hold their data[3]. This has the advantage of maximising the potential uses to which the data can be put, but the disadvantage of failing to inform those individuals what these uses might be. This in turn may make some people reluctant to allow their data to be collected. Importantly, from a legal perspective, this model may, in worst-case scenarios, fail the regulatory test for consent. As such, if your organisation wishes to use blanket consent, you will be ethically obliged to explain this balance of risks and benefits to those whose data you wish to collect and use, and you will also need to be very clear about the envisaged uses, even if the list is very long.

Broad consent

A more clearly specified version of blanket consent is known as broad consent. In this model, permission is sought for a range of uses but not assumed for all purposes[4]. This model shares many of the advantages of the blanket consent model, although it too has potential drawbacks. For example, in the context of personal data used in health research, a study may be proposed which requires consent from individuals at high risk of developing a particular condition for the reuse of their data. Here, consent would depend on informing these individuals of their high-risk status.

Although this satisfies the traditional standard of consent that is sought for a specific purpose, it also presents its own ethical challenges, given the potential distress that such a disclosure might cause and its potential implications for a patient’s right not to know where risk is involved. Again, the particular range of ethical issues to be balanced arising from a broad consent approach should be explained to anyone whose data your organisation wishes to collect and use according to such a model.

Dynamic consent

Extending this approach, a third alternative is dynamic consent. This is similar to the traditional model, in that consent is sought on a case-by-case basis[5]. However, it differs in that consent is sought for each instance of reuse of the data for each specific purpose, rather than only for its initial use, as is the case in the standard model.

Dynamic consent has the advantage of meeting the usual ‘gold standard’ of consent, in the sense that it is properly informed, as permission is sought from participants to ‘opt-in’ for each new use of their data. However, it has the drawback, and not only in the context of health research, that it is unlikely to be suitable in instances where the individuals whose data is collected are unwilling or unable to have ongoing engagement with the institution which collected it.

Finally, there are reasons to ask whether this depiction of the ‘gold standard’ is accurate, given that what is sought is permission to reuse personal data. For instance, some people might refuse to allow their data to be used for particular purposes to which they object. As such, despite its benefits in terms of how well-informed the donors of the data are, dynamic consent can limit the scale, value, and validity of the analysis carried out on that data.

In the context of your own organisation, this aspect of dynamic consent might limit what it is able to achieve via analysis of the data that it holds. The complexity of this challenge is also amplified in the context of international business or research involving a range of datasets from different legal jurisdictions, as established protocols for conditions of reuse in these jurisdictions may not be uniform. With that in mind, here too it is clear that your organisation ought to be transparent and explicit about what a dynamic consent model will entail for those whose data it wishes to collect and use, if that is the model it proposes to use.

Meta consent

A fourth option, which still might be understood as a kind of dynamic consent, is ‘meta’ consent. Under a meta consent model, people whose consent is sought for data collection can choose how they prefer to provide consent – for example, whether they have a preference for a blanket or dynamic model for future uses of their data[6].

The meta consent approach, like dynamic consent, has the advantage of putting individuals whose data is collected in control of their data as much as possible. However, it may still not meet the gold standard of consent, given that it remains vulnerable to unpredictable and unknowable potential future uses of the data emerging when machine learning and AI techniques are applied in analysing it.
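For organisations that need to record these choices in their own systems, the four models described above can be sketched in code. The following is a minimal, illustrative Python sketch only: the names ConsentModel, ConsentRecord, and may_use are hypothetical and do not come from any standard or library, and the sketch assumes that each proposed reuse can be given a specific label and a broader category. Meta consent is represented simply by letting the individual choose which model applies to their record. Revocation is included because the standard consent process guarantees it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ConsentModel(Enum):
    """Hypothetical labels for the consent models discussed above."""
    BLANKET = "blanket"   # any future use the organisation deems appropriate
    BROAD = "broad"       # a pre-agreed range of categories of use
    DYNAMIC = "dynamic"   # opt-in for each specific new use


@dataclass
class ConsentRecord:
    # Under meta consent, the individual chooses this value themselves.
    model: ConsentModel
    # Broad consent: categories of use agreed in advance (assumed labels).
    permitted_categories: set[str] = field(default_factory=set)
    # Dynamic consent: specific uses the individual has opted in to.
    approved_uses: set[str] = field(default_factory=set)
    # The individual may withdraw permission at any later point.
    revoked: bool = False


def may_use(record: ConsentRecord, use: str, category: str) -> bool:
    """Return True if the recorded consent covers the proposed reuse."""
    if record.revoked:
        return False
    if record.model is ConsentModel.BLANKET:
        return True
    if record.model is ConsentModel.BROAD:
        return category in record.permitted_categories
    # Dynamic consent: only uses that were individually approved.
    return use in record.approved_uses
```

A record of this kind can show whether a proposed reuse falls within what was agreed, but it cannot resolve the underlying ethical problem: a use that was unforeseeable when consent was given may still fall inside a blanket or broad permission without the individual having been meaningfully informed about it.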

Given the wide range of individual, institutional, commercial, and societal interests involved in these scenarios, differences of opinion about which consent model is ethically and practically optimal are unavoidable. For instance, non-standard approaches to consent that enable easy reuse are, understandably, often favoured in the health science and policy arena, because they can expand research in beneficial new ways. However, for the reasons outlined above, public preferences for these approaches differ.

With all of that in mind, it is important to remember that what is considered a desirable and ethically appropriate approach to the reuse of data that your organisation holds may differ between those who hold the personal data and those whose personal data is held. Whatever the content of competing views might be, if traditional models of consent are inadequate, whether in healthcare, commerce, or any other industry, then a new and more satisfactory approach must be found. It is a vital ethical obligation of your organisation that it factors these considerations into the design of its data collection, handling, and reuse processes, and the consent procedures that accompany them.

How can IGS help your organisation to ensure ethical practice in its consent processes?

We can summarise what has been outlined here with a few key points about the ethical dimensions of consent for data collection, use and reuse, all of which IGS can help your organisation deal with to ensure that it observes the necessary ethical standards regarding those whose data it holds.

Securing consent is an ethical obligation as well as a legal one: even if your organisation’s data use and consent processes are lawful, the welfare of those whose data you wish to collect and use is at stake. As such, you are ethically obliged to ensure that the people whose data you wish to collect understand fully what is entailed by the analytic approach that you will take and what the balance of risks and benefits is, should they agree to provide their data under the terms that you give them.

IGS can help your organisation to ensure that the information it provides is comprehensive, explicit, and covers all reasonable risks that you and those whose data you wish to collect should be aware of.

Big data predictive analytics, machine learning and AI can have an impact on the adequacy of the standard informed consent procedure: because these modern data analysis techniques derive their value from being able to produce novel or unpredictable findings, they can undermine the standard model of consent, wherein the person supplying their data knows what the range of expectations from analysis of the data is likely to be.

Given that this can undermine the possibility that consent is properly informed, and therefore poses a risk to the ethical adequacy of the information you provide, IGS is able to assist your organisation in thinking through these risks, such that you can ensure you are able to respond to any concerns raised by those whose data you wish to collect.

There are several models of consent which can be employed where these new data science techniques are used, each of which has ethical implications for those whose data is used: the contemporary context of data science, involving predictive analytics, machine learning and AI, presents a range of challenges, which require a range of solutions to ensure that consent is adequate for the purposes to which your organisation wishes to put the data it holds. Each of these models has a different balance of ethical implications, which you are obliged to explain to those whose data you wish to hold.

IGS’s data ethics services can provide your organisation with support to identify the right consent model for its needs, and to develop appropriately robust information which can be used in the process of securing consent.

[1] https://www.frontiersin.org/articles/10.3389/fmed.2018.00013/full

[2] https://pubmed.ncbi.nlm.nih.gov/25669218/

[3] https://www.nature.com/articles/gim2011135

[4] https://link.springer.com/article/10.1186/s40504-019-0096-3

[5] https://www.nature.com/articles/ejhg2015239

[6] https://www.bmj.com/content/350/bmj.h2146