Understanding Data Flow Mapping
Data flow maps (also known as data flow diagrams) are visual representations of the flow of personal data in a specific processing activity, from its source all the way to its eventual destruction or disposal.
A good data flow map summarises the entire data lifecycle in one short visual representation. It shows which key systems, functions, parties, departments and data repositories are involved and how each contributes to the data flow. It provides an understandable overview of data handling structures and processes, making it easy to see who is handling data, and where and how that data is stored.
Data flow maps often use symbols borrowed from flow chart notation, such as rectangles, circles and arrows, to show data inputs, data outputs, data stores and the routes between them.
There is no single correct way to create a data flow map, and no required set of tools. Maps can be built in premium third-party software, drawn by hand, or created in standard desktop applications such as Microsoft PowerPoint or Microsoft Visio. In this article, we won’t be focusing on where you create your data flow maps or on specific tools, but rather on what to include in them and why they are important.
Importance of Data Flow Mapping
A data flow map is an indispensable component of any Data Protection Impact Assessment (DPIA), one of the most important documents an organisation must produce before it commences a potentially risky data processing activity. Alongside the DPIA, it’s important for a Data Protection Officer to have oversight of, and input into, the data flow map in order to understand the data processing activity and the potential risks involved.
A data flow map is often a high-level look at the life cycle of personal data in a given project. It can cover which organisations process the data and what data protection roles they have, which systems store the data, and the methods of transfer. Setting this out allows you to undertake a structured systems analysis.
Beyond being included in DPIAs, a clear data flow map is an essential first step in drafting a DPIA and thus in meeting your obligations under Article 35 UK GDPR. Being able to construct one at the outset, even if only at a high level, is a good indicator that you have fully grasped the core of the specific data processing activity.
Data flow maps can also help you track potential security issues and convey the potential benefit of the project to key stakeholders.
Any gaps in the flow that you cannot complete, or dots that you cannot link up, will serve as a starting point for your investigation into the intended processing activity. Keep in mind that this investigation should involve consultation with key business process owners.
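One way to make such gaps visible is to record the map as a plain list of flows and check it mechanically. The sketch below is illustrative only: the node names, the `Flow` tuple and the `find_gaps` helper are hypothetical, and it assumes that every part of the lifecycle should be traceable from a named source to a named disposal point.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Flow(NamedTuple):
    source: str       # where the data comes from
    destination: str  # where the data goes next


def reachable(start, edges):
    """Return every node that can be reached from `start` by following `edges`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def find_gaps(flows, start, end):
    """Report nodes that are cut off from the source or never lead to disposal."""
    forward = defaultdict(set)   # source -> destinations
    backward = defaultdict(set)  # destination -> sources
    nodes = {start, end}
    for flow in flows:
        forward[flow.source].add(flow.destination)
        backward[flow.destination].add(flow.source)
        nodes.update(flow)
    return {
        "not_reached_from_source": nodes - reachable(start, forward),
        "never_reaches_disposal": nodes - reachable(end, backward),
    }


# Hypothetical example: the marketing vendor receives data, but nothing
# in the map says what happens to it afterwards.
flows = [
    Flow("web form", "CRM"),
    Flow("CRM", "archive"),
    Flow("archive", "secure deletion"),
    Flow("CRM", "marketing vendor"),
]
gaps = find_gaps(flows, start="web form", end="secure deletion")
print(gaps)
```

Here the marketing vendor is flagged because no route leads from it to disposal, which is exactly the kind of open question to raise with the relevant business process owner.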
Benefits of Data Flow Mapping
A good data flow diagram will also allow you to make sure your organisation:
- Understands what is happening with its data, and visualises and identifies potential security and compliance issues.
- Validates the accuracy of the data flows with business stakeholders.
- Demonstrates a clear governance process.
- Improves data governance practices and identifies opportunities to streamline and simplify processes and technologies to better use data assets for commercial gains.
- Supports a sustainable process for maintaining records of processing activities, as data flow maps are easy to review and maintain whilst the business and/or technology environment evolves.
- Creates accurate and sufficiently detailed privacy notices in compliance with Articles 13 and 14 UK GDPR.
- Creates a centralised inventory using technology to achieve efficiency.
- Plans a risk-based approach to remediation by:
  - Identifying special categories of personal data that could be pseudonymised;
  - Identifying business areas that require data protection impact assessments;
  - Supporting data subject request response procedures; and
  - Identifying cross-border data flows that require appropriate transfer mechanisms.
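Two of the remediation points above lend themselves to a simple filter over a flow inventory. The following is a minimal sketch under stated assumptions: the `Flow` record, its field names and the example flows are hypothetical, the `SPECIAL_CATEGORIES` set is a shortened illustration rather than the full Article 9 UK GDPR list, and comparing country labels only flags candidates for review. Whether a transfer mechanism is actually required remains a legal assessment.

```python
from dataclasses import dataclass, field

# Shortened, illustrative labels; the authoritative list is in Article 9 UK GDPR.
SPECIAL_CATEGORIES = {"health", "genetic", "biometric", "ethnic origin"}


@dataclass
class Flow:
    name: str
    source_country: str
    destination_country: str
    data_categories: set = field(default_factory=set)
    pseudonymised: bool = False


def cross_border(flows):
    """Flows whose source and destination are in different countries."""
    return [f for f in flows if f.source_country != f.destination_country]


def pseudonymisation_candidates(flows):
    """Flows carrying special category data that is not yet pseudonymised."""
    return [
        f for f in flows
        if f.data_categories & SPECIAL_CATEGORIES and not f.pseudonymised
    ]


# Hypothetical inventory for illustration.
flows = [
    Flow("HR records to payroll provider", "UK", "UK", {"salary", "health"}),
    Flow("Support tickets to helpdesk vendor", "UK", "US", {"contact details"}),
    Flow("Clinical study export", "UK", "France", {"health", "genetic"}, pseudonymised=True),
]

for f in cross_border(flows):
    print("Review transfer mechanism:", f.name)
for f in pseudonymisation_candidates(flows):
    print("Consider pseudonymisation:", f.name)
```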
Creating Effective Data Flow Diagrams
Whilst individuals are free to construct data flow maps in the manner of their choosing, a consistent notation and a clear high-level strategy make it much easier to depict a processing activity in a data flow map that incorporates all essential information.
Data Flow Diagram Symbols and Notations
Consider the language, the colours and the design of your data flow map. Can you introduce a colour scheme to distinguish between controllers and processors? Can you distinguish flows of identifiable data from flows of de-identified data by using dashed and dotted lines? Can you use shapes to distinguish different kinds of storage arrangements?
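If you happen to generate diagrams from text, these design choices can be encoded once and applied consistently. The sketch below emits Graphviz DOT text; Graphviz is chosen purely as an assumption for illustration, and the `to_dot` helper, the role names, the colour choices and the example parties are all hypothetical. It fills controllers and processors in different colours, draws stores as cylinders, and uses dashed arrows for de-identified data.

```python
def to_dot(nodes, flows):
    """Render a data flow map as Graphviz DOT text.

    nodes maps a node name to its role: "controller", "processor" or "store".
    flows holds (source, destination, identifiable) tuples.
    """
    colours = {"controller": "lightblue", "processor": "lightyellow", "store": "lightgrey"}
    shapes = {"controller": "box", "processor": "box", "store": "cylinder"}
    lines = ["digraph data_flow {"]
    for name, role in nodes.items():
        lines.append(
            f'  "{name}" [shape={shapes[role]}, style=filled, fillcolor={colours[role]}];'
        )
    for source, destination, identifiable in flows:
        # Solid arrows carry identifiable data, dashed arrows de-identified data.
        style = "solid" if identifiable else "dashed"
        lines.append(f'  "{source}" -> "{destination}" [style={style}];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical parties and flows for illustration.
nodes = {
    "Hospital": "controller",
    "Analytics Ltd": "processor",
    "Research database": "store",
}
flows = [
    ("Hospital", "Analytics Ltd", True),
    ("Analytics Ltd", "Research database", False),
]
dot = to_dot(nodes, flows)
print(dot)
```

The resulting text can be pasted into any DOT renderer. The same mapping of roles to colours and line styles works equally well as a checklist when drawing by hand.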
With the above in mind, you may wish to consider different styles. The UK Health Security Agency (‘UKHSA’) has published guidance for all applications to access UKHSA data classified as ‘protected’. This guidance describes the acceptable ‘data flow diagrams’ that must be submitted with every application. In doing so, it draws on examples of notation styles, notably Yourdon (1989) and DeMarco (1978), and Gane and Sarson (1979).
In the notation styles of Yourdon and DeMarco, the following applies:
- Circles represent processes;
- Squares or rectangles denote external entities;
- Horizontal parallel lines symbolise data storage; and
- Arrows indicate data flow.
And in the notation style of Gane and Sarson:
- Rounded rectangles represent processes;
- Squares or rectangles denote external entities;
- Open-ended rectangles symbolise data storage; and
- Arrows indicate data flows.
Once you have either created a language of your own or adopted a published style such as those highlighted above, make sure to include a small table in your diagram to explain this language to readers.
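As an illustration of such a table, the two published styles described above can be held as a simple lookup and printed as a legend. This is only a sketch: the `NOTATIONS` dictionary and the `legend` helper are hypothetical names, and the symbol descriptions are taken from the lists above.

```python
# Symbol descriptions as summarised in this article.
NOTATIONS = {
    "Yourdon and DeMarco": {
        "process": "circle",
        "external entity": "square or rectangle",
        "data store": "horizontal parallel lines",
        "data flow": "arrow",
    },
    "Gane and Sarson": {
        "process": "rounded rectangle",
        "external entity": "square or rectangle",
        "data store": "open-ended rectangle",
        "data flow": "arrow",
    },
}


def legend(style):
    """Return a plain-text legend table for the chosen notation style."""
    symbols = NOTATIONS[style]
    width = max(len(element) for element in symbols)
    lines = [f"{'Element'.ljust(width)} | Symbol"]
    for element, symbol in symbols.items():
        lines.append(f"{element.ljust(width)} | {symbol}")
    return "\n".join(lines)


print(legend("Gane and Sarson"))
```

If you design your own language instead, adding a third entry to the lookup keeps the legend and the diagram in step.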
Data Flow Mapping Process
Whilst individuals are free to construct data flow maps in the manner of their choosing, the steps listed below are a good example of a high-level strategy that one can take to successfully depict a processing activity in a data flow diagram that incorporates all essential information.
Identify the end-to-end process
It is advisable to begin by fixing the two ends of the intended processing activity: its start and its finish. This will allow you to see how much space remains in the diagram and to prioritise the most important details for that space. When defining the start point, consider what personal data is collected, how it is collected and from whom. When defining the end point, keep in mind any relevant retention periods and the moment at which the data will be destroyed.
Identify Entities with Access to Data
Think about who the data controllers and data processors are in the project and make sure to give them plenty of space in your diagram, as they are the key parties. Any given point of the data flow should be easily attributable to at least one of them.
Think about which systems, departments and functions will be used and ensure they are depicted in your data flow diagram according to which data processor or controller will be responsible for engaging them. This is arguably even more important when an external entity is involved in the project, as it is likely to be more of an unknown factor. Understanding how an external entity will process the data and where it will store the data is crucial.
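The attribution rule described here can be checked with a few lines once systems and parties are listed. This is a minimal sketch, assuming each system has at most one responsible party; the `unattributed_systems` helper and all organisation and system names are hypothetical.

```python
from typing import Optional

PARTY_ROLES = {"controller", "processor"}


def unattributed_systems(system_owner, party_role):
    """Return systems whose owner is missing or is not a known controller or processor."""
    return sorted(
        system
        for system, owner in system_owner.items()
        if party_role.get(owner) not in PARTY_ROLES
    )


# Hypothetical parties and systems for illustration.
party_role = {"Acme Ltd": "controller", "CloudHost Inc": "processor"}
system_owner: dict[str, Optional[str]] = {
    "CRM": "Acme Ltd",
    "Backup storage": "CloudHost Inc",
    "Survey tool": None,           # responsible party not yet identified
    "Mailing platform": "MailCo",  # party's data protection role not yet classified
}

print(unattributed_systems(system_owner, party_role))
```

Anything this returns is a point in the flow that cannot yet be attributed to a controller or processor, and is usually where an external entity needs a closer look.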
Focus on the flow of data from start to finish
Think of what needs to happen to the data to move it from the start point to the finish point. Relevant questions here could be:
- Which organisation might need access to the data for this to happen?
- Where will the data be stored between start and finish?
- What security measures will protect the data both at rest and in transit?
- Will the data be de-identified at any point?
- What are the security layers?
- Will data repositories be used?
- Will any data be transferred outside the EU/EEA?
Accessibility
Consider the accessibility of your data flow diagram. Would it make sense to all readers? Can it be reached easily, without being locked behind access controls or logins? Are there any colours that might be difficult for certain audiences to distinguish? As the fundamental purpose of data flow maps is to summarise lengthy descriptions of data processing activities, it is very important not to jeopardise this by including elements in your data flow diagrams that would hinder their legibility for certain audiences.
A detailed view of what to include and consider in your data flow map
Although we have presented a high-level overview, the following elements are useful to bear in mind when considering how your data flow diagram should be presented in more detail.
- What are the categories of data subjects?
- What are the categories of personal data that will be processed?
- Why does the organisation need the data, how is it used and can it be minimised?
- What is the legal basis for processing?
- How is the personal data collected?
- What format will the personal data be stored in?
- What is the location of the data?
- Who needs access to the data?
- How is personal data shared internally within the organisation and externally with third parties?
- Is the personal data transferred outside of the EEA? If so, how? What safeguards are in place to protect the data?
- What is the applicable data retention schedule? When will the data be erased or destroyed?
- What different technical and organisational security measures apply to the personal data?
- Who is accountable for the data and how might this accountability change as the data moves through the organisation and to third parties?
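The questions above can double as a completeness checklist for each processing activity. The sketch below is one possible way to track them, not a prescribed record format: the `ProcessingRecord` class, its field names and the `unanswered` helper are hypothetical, with one field per question in the list.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProcessingRecord:
    # One field per question in the list above; None means "not yet answered".
    data_subjects: Optional[str] = None
    personal_data_categories: Optional[str] = None
    purpose_and_minimisation: Optional[str] = None
    legal_basis: Optional[str] = None
    collection_method: Optional[str] = None
    storage_format: Optional[str] = None
    location: Optional[str] = None
    access: Optional[str] = None
    sharing: Optional[str] = None
    international_transfers: Optional[str] = None
    retention: Optional[str] = None
    security_measures: Optional[str] = None
    accountability: Optional[str] = None


def unanswered(record):
    """List the questions that still have no answer for this record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Hypothetical, partially completed record.
record = ProcessingRecord(
    data_subjects="customers",
    legal_basis="contract",
    retention="6 years after account closure",
)
print(unanswered(record))
```

An empty result suggests the map has enough detail to be drawn in full. Anything left in the list points to a question for the relevant business process owner.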
Conclusion
Data flow maps (DFMs) are not only an essential component of a DPIA; they are also an important means of meeting numerous other obligations under the UK GDPR, such as respecting the principles of data minimisation, accountability and accuracy. There are a number of things to keep in mind when completing a DFM, from high-level steps, like ensuring the language and style you use are intelligible to all audiences, to specific details, like whether different categories of data subjects are involved.
Beyond these considerations, the main takeaway from this article is that a good DFM communicates a data processing activity concisely and accessibly, and visually represents all the important systems, processes, functions and parties involved in moving the data from start to finish for the intended purpose.