Data Ethics

Mind Map: Data Ethics

1. Advancements in measurement & data collection

1.1. Leads to debate about the right, legitimate, and proper ways to use data

1.2. Data creators, suppliers, and users should engage in ethical consideration of data

1.2.1. The aim must be to help focus on the issues and to remove confusion and ambiguity. Provide principles that can help people come to a conclusion about what is the right way to act and the right thing to do.

1.3. Progress in data science & technology is ongoing and shows no sign of plateauing.

1.3.1. New data technologies: blockchain, homomorphic computation, quantum computing.

1.3.2. Statistics & computer science: at the center of the data revolution. Previously very technically focused. Current tools, and the ways they are being used, are highly sensitive to subtle ethical issues. Two types of analysis: (1) aimed at research; (2) aimed at practice.

1.3.3. Hard to create precise regulations and guidelines because of the rapid changes

2. Consent & Purpose

2.1. Consent should be obtained before an intervention, based on an understanding of its implications and consequences.

2.1.1. (1) An understanding of what the data might be used for in the future. A difficult concept because the future is unknown. Data may be merged with other data sets to reveal information. Valuable for exploring aggregate properties of a population. Important for decisions relating to individuals. It might not be possible to say what use given data will contribute to.

2.1.2. (2) An understanding of how the data are to be used. Assumes that the person being asked to consent has the expertise and knowledge to understand the data. Obtain an explanation of the decision reached after such an assessment. Find a strategy for identifying which variables are important in reaching a decision. A defining characteristic of modern data analytic tools (e.g., neural networks, support vector machines, and ensemble systems) is that they are intrinsically complex.

2.2. Contains 3 Pillars: Accountability, Transparency and Responsibility

3. Trustworthiness

3.1. Key Factor is Provenance

3.1.1. Knowing where the data come from and knowing that the source has proven reliable in the past. More eyes mean fewer lies, and fewer mistakes. If you do not understand the data, you should not trust it.

3.1.2. Applies mostly to news articles and blogs

3.2. Occurs at the following levels.

3.2.1. Raw data

3.2.2. Analytic Methods

3.2.3. Analyst

3.2.4. Employer Organization

3.3. Comprehensive metadata

3.3.1. Understand the data and be able to track them back to origin. Requires that an analysis should be reproducible

3.3.2. If you do not understand the data, you should not trust it. A method cannot be trusted if it is being used to answer the wrong question. Variants of this theory
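
The points above about provenance, metadata, and reproducibility can be made concrete with a small sketch. The following is only an illustration, not a standard: the function names `provenance_record` and `matches_record`, and all field names, are hypothetical. It stores a content hash next to descriptive metadata so that an analysis can later be checked against exactly the data it used.

```python
import hashlib
import json


def provenance_record(raw_bytes, source, collected_on, method):
    """Build a small metadata record that ties a data set to its origin.

    All field names are illustrative assumptions, not a standard schema.
    """
    return {
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),  # fingerprint of the exact bytes analysed
        "source": source,              # where the data came from
        "collected_on": collected_on,  # when they were collected
        "method": method,              # how they were collected
    }


def matches_record(raw_bytes, record):
    """Return True if the bytes are the same ones the record describes."""
    return hashlib.sha256(raw_bytes).hexdigest() == record["sha256"]


# Made-up example data and metadata.
data = b"id,height_cm\n1,172\n2,181\n"
record = provenance_record(data, "hypothetical survey", "2020-01-15", "self-reported")
print(json.dumps(record, indent=2))
```

If even one byte of the data changes, `matches_record` returns False, which signals that a rerun of the analysis would not be using the original data.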

4. Privacy & Confidentiality

4.1. The right to be left alone and not be bothered.

4.2. The right to be protected from government intrusion.

4.3. Power to selectively reveal oneself to the world.

4.4. Key aspect of one’s identity.

4.5. Privacy and Data

4.5.1. Privacy exists only relative to the person from whom we wish to keep something hidden. Dependent on the context and the relationship between giver and receiver of data. Dependent on where the data will be put.

4.5.2. Adverse effect on the value of data sets if they allow people to opt in or opt out of being included.

4.5.3. Sometimes it is unethical not to use available data. Strike a balance between different kinds of risks and different kinds of gains. Risks and gains were clear and measurable

4.6. Privacy Paradox

4.6.1. Point out that the privacy balance is rather more nuanced than is typically understood. Offer privacy with one hand while creating privacy risks with the other.

4.7. Personal Data and Privacy

4.7.1. Names, addresses, usernames, passwords, account codes, e-mail addresses, and ID numbers. If this information is removed from a database, the entries are considered anonymized, in that they cannot be directly matched to their owners.
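
As a minimal sketch of the removal step described in 4.7.1, the code below drops direct-identifier fields from a list of records. The field names, the `DIRECT_IDENTIFIERS` set, and the sample records are assumptions made up for illustration.

```python
# Hypothetical field names standing in for the identifiers listed in 4.7.1.
DIRECT_IDENTIFIERS = {"name", "address", "username", "password", "email", "id_number"}


def strip_direct_identifiers(records):
    """Return copies of the records with direct-identifier fields removed."""
    return [
        {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
        for record in records
    ]


# Made-up sample records.
customers = [
    {"name": "A. Example", "email": "a@example.org", "postcode": "AB1", "age": 34},
    {"name": "B. Example", "email": "b@example.org", "postcode": "CD2", "age": 58},
]

stripped = strip_direct_identifiers(customers)
print(stripped)
```

Note that the remaining fields (here, postcode and age) may still narrow down who a record belongs to once the data are merged with other data sets, so removing direct identifiers alone should not be assumed to guarantee anonymity.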

5. Data Technology vs. Other Advanced Technologies

5.1. Data Technology requires careful consideration of ethical issues for the following reasons:

5.1.1. Aspect of societal infrastructure

5.1.2. Interconnectedness of data

5.1.3. Dynamic nature of data

5.1.4. Real-time and online analysis and decision making

5.1.5. Synergistic analysis through merging and combination of data sets

5.1.6. Lack of space, time, and social-context limitations

5.1.7. Ability to use for unexpected purposes & reveal unexpected information

5.1.8. Risk of exceptional intrusiveness

5.1.9. Potential for misuse, privacy breach, blackmail and other crimes

5.1.10. Subtle ownership issues

6. Metcalf: outward-facing general ethics codes for professions

6.1. Protect vulnerable populations

6.2. Protect/enhance the good reputation of the profession

6.3. Establish the profession as a distinct moral community

6.4. Provide a basis for public expectations and evaluation

6.5. Serve as a basis for adjudicating disputes among members

6.6. Create resilient institutions

6.7. Respond to past harms

7. Ethical codes for data collection and manipulation

7.1. Providing guidance on how to behave in difficult circumstances

7.2. Preserving privacy in a way that users and the public will find acceptable

7.3. Ensuring that data will benefit the public

7.4. Reassuring customers, the public and others about an organization's integrity

7.5. Reassuring employees that they work for a trustworthy organization

8. Principle based approach

8.1. Code of Practice of the UK Statistics Authority based on eight principles

8.2. Data Management and Use: Governance in the 21st Century based on five principles

8.3. The Accenture Universal Principles of Data Ethics that has twelve principles

8.4. The ACM Code of Ethics and Professional Conduct

8.5. Highest Level of Principles

8.5.1. Integrity

8.5.2. Honesty

8.5.3. Objectivity

8.5.4. Responsibility

8.5.5. Trustworthiness

8.5.6. Impartiality

8.5.7. Nondiscrimination

8.5.8. Transparency

8.5.9. Accountability

8.5.10. Fairness

8.5.11. Robustness

8.5.12. Resilience

8.5.13. Usability

8.5.14. Efficiency

8.5.15. Independence

8.5.16. Six main principles: (1) start with clear user need and public benefit; (2) use data and tools with minimum intrusion; (3) create robust data science models; (4) be alert to public perceptions; (5) be open and accountable; (6) keep data secure.


9.1. Compared to oil in that it is processed to extract value

9.1.1. However data are not consumed like oil and gas

9.1.2. Data can be reused

9.1.3. Cannot be depleted

9.2. Numbers are not data without metadata

9.2.1. The minimum metadata are the units of measurement, which amounts to making a comparison between two objects

9.2.2. Can data make sense outside of a theory or world view?
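
The claim that numbers are not data without metadata can be illustrated with a short sketch. The `Measurement` class and the `TO_METRES` table below are hypothetical names introduced only for this example; the point is that the unit travels with the value so comparisons stay meaningful.

```python
from dataclasses import dataclass

# Conversion factors to metres; this small set of units is an illustrative assumption.
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: str  # the minimum metadata: without it, `value` is just a number

    def in_metres(self):
        """Convert the value to metres using the attached unit."""
        return self.value * TO_METRES[self.unit]


a = Measurement(180, "cm")
b = Measurement(1.75, "m")

# The bare numbers 180 and 1.75 cannot be compared sensibly;
# once both are expressed in the same unit, the comparison is meaningful.
print(a.in_metres() > b.in_metres())
```

A unit that is missing from `TO_METRES` raises a `KeyError`, which is a deliberate choice in this sketch: a value with unknown metadata is rejected rather than silently treated as comparable.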

9.3. How data is used is important

9.3.1. Bringing different sets of data together and comparing them in different ways can create different uses.

9.3.2. Unlimited ways it can be used

9.4. Captured automatically in most cases.

9.4.1. No additional time or resources needed to capture

9.4.2. This is considered observational data, not at risk of being distorted

9.4.3. Administrative data can tell what people are and what people do, as distinct from what people say and what people claim

9.4.4. If it might be useful, record and store it. Cheap to record. Contrary to the minimization principle.

9.5. Individuals cast long data shadows

9.5.1. used in tracking individuals

9.5.2. emails

9.5.3. Possibility driven attitude

9.5.4. Credit card activity

9.5.5. Web activity

9.5.6. phone calls

9.5.7. can linger for extended periods

10. Personal Data

10.1. When combined with other personal data, can tell you much about other people whose data you may not have

10.2. The most ethically sensitive of all issues regarding data

10.3. GDPR

10.3.1. Superseded the UK's Data Protection Act 1998 (DPA)

10.3.2. EU wide regulation protecting individuals

10.3.3. Increases accountability of those processing personal data

10.3.4. Details obligations associated with personal data

10.3.5. Defines personal data as: ‘‘any information relating to an identified or identifiable natural person; an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person’’

10.3.6. Dictates how personal data can be used.

10.4. IP Addresses

10.5. Mobile Device identifiers

11. Data Ownership

11.1. If the data are anonymized

11.1.1. No problems arise

11.1.2. Data from research on 100 participants whose identifying aspects have been removed

11.1.3. research data

11.2. GDPR states that natural persons should have control of their own personal data

11.3. Digital watermarking

11.3.1. Steganography: a message or other information is concealed in a body of data

11.4. An identifying signal is embedded in the data

11.5. Often ill-defined
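
The idea of concealing an identifying signal in a body of data can be sketched with least-significant-bit (LSB) embedding. This is a swapped-in textbook technique chosen for illustration, not a method named in the notes above; the function names `embed` and `extract`, the sample values, and the message are all hypothetical.

```python
def embed(samples, message):
    """Hide `message` (bytes) in the least significant bits of integer samples."""
    # Turn each byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the message")
    marked = list(samples)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & ~1) | bit  # overwrite the lowest bit only
    return marked


def extract(samples, length):
    """Recover `length` bytes from the least significant bits of the samples."""
    out = bytearray()
    for start in range(0, length * 8, 8):
        byte = 0
        for sample in samples[start:start + 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)


pixels = list(range(100, 164))        # 64 made-up 8-bit values standing in for image data
marked = embed(pixels, b"owner:A")    # 7 bytes = 56 bits, which fits in 64 samples
print(extract(marked, 7))
```

Each sample changes by at most 1, so the marked data remain close to the original while still carrying the identifying message. A plain LSB mark like this is fragile (any re-encoding can destroy it), so it is a teaching sketch rather than a robust watermark.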

11.6. In other contexts

11.6.1. Ownership is determined by whether a person has the right to control use

11.6.2. The creator, or the legal entity for whom the creator is working, is typically the initial owner

11.7. Who owns

11.7.1. Google recording search histories

11.7.2. personal data, data ‘‘about’’ someone?

11.7.3. A shop records the prices of the goods you have bought so it can calculate how much to charge you

11.8. Various authors say

11.8.1. Ownership of personal data rests with the people whom the data are about

11.9. Alex Pentland

11.9.1. Individuals might have control over their personal data, so that ‘‘a person’s data would be equivalent to their ‘money.’ It would reside in an account where it would be controlled, managed, exchanged, and accounted for just like personal banking services operate today.’’ Described as idealistic and not necessarily realistic; not feasible.

11.10. Potential of data commons

11.10.1. Trusted organizations that store data from a variety of bodies

11.10.2. Similar to credit bureaus

11.10.3. Advantages include the range of bodies feeding in data, which reduces the risk of selection bias

11.10.4. Explored by the ADRN (Administrative Data Research Network), which links data from different UK government departments in a secure environment

11.11. Legal uncertainty exists

11.11.1. EU Inception Impact Assessment: personal data cannot be owned; strict rules on access are in place; a gap exists with regard to non-personal data; lack of legal harmonization across jurisdictions.