The exponential growth in data over the past decade has impacted the legal industry in two ways: it requires automated solutions for the cost-effective and efficient management of the volume and variety of big (legal) data, and it enables artificial intelligence techniques based on machine learning for the analysis of that data. While many legal practitioners focus on specific service niches, the impact of AI on the law is much broader than any individual niche. AI systems, and concerns for their ethical operation, are not new, but the scale of their impact and adoption in legal practice makes consideration of the ethics of these systems timely. While there has been recent progress in the development of ethical guidelines for AI systems, much of it is targeted at the developers of these systems in general, or at the actions of these AI systems as autonomous entities, rather than at the legal practice context. Much of the ethical guidance – whether for AI systems or for legal professionals – is captured in high-level principles; within more narrowly defined domains, more specific guidance may be appropriate to identify and assess ethical risks. As adoption and operation of AI software in routine legal practice becomes more commonplace, more detailed guidance on assessing the scope and scale of ethical risks is needed.


Date: December 10, 2020
Time: 15:20-15:45
Event: IEEE Big Data: Workshop on Applications of Artificial Intelligence in the Legal Industry
Topic: AI in the Law: Towards Assessing Ethical Risks
Venue: Online
Public: Public

To hire Dr. Steven Wright for consulting services, or to keynote, moderate, or host your next event, please email: Dr Steven A Wright

Performance Improvement in Ethics Assessment

Human systems with ethical issues have been observed to develop following the somewhat ad hoc CRIC cycle (Crisis, Response, Improvement, Complacency). Ethical issues become top of mind for the general public (and the legal profession) when egregious ethical failures come to light. Aggregate and individual progress in ethical performance can be difficult to predict in a CRIC cycle, and is often addressed only at points of drastic failure. Ethics is a branch of the humanities, with ethical performance typically assessed by manual human effort using natural language inquiries. Widely accepted standards of ethical behavior can change over time as new norms of behavior become socially accepted. Assessments of ethical conformance by groups of people (e.g., an organization or a profession) are important to establishing, and maintaining, public confidence in that group. Training individuals to improve the ethical conformance of their behavior is currently widespread in many organizations and professions. The deployment of Artificial Intelligence (AI) systems is also driving demand both for an increasing number of ethics assessments (due to the increasing number and variety of AI systems) and for ongoing ethics assessments, as these systems can learn and modify their behavior during operation. The CRIC cycle is ill-suited to these needs for ongoing improvements in ethical performance.

Rather than perpetuating CRIC, more systematic quality improvements can be achieved through continuous improvement quality cycles – e.g., the Plan-Do-Check-Act (PDCA) cycle. The PDCA cycle aligns with technological approaches for software performance improvement. Applying this approach in the context of ethical performance for software systems requires consideration of the appropriate metrics for ethics, and of the relevant measurement and testing procedures for assessing ethics performance. Much of the ethical guidance – whether for AI systems or for legal professionals – is captured in high-level principles (e.g., Jobin’s synthesized principles (Jobin et al., 2019)), rather than in guidance tailored to more narrowly defined domains. Some ethics regimes (e.g., for lawyers) have associated enforcement mechanisms that interpret those rules in the context of specific controversies. ML systems are domain specific because they learn best on data within a narrow, coherent data domain. Software verification and validation proceeds through mechanized testing based on specific test cases rather than natural language human inquiry. Because AI systems continue to learn during their operational phases, ongoing testing is required to verify and validate proper operation, including ethical constraints. The metrics for those ethical constraints themselves require further elucidation.
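
As one concrete illustration of what such a metric might look like, the minimal sketch below computes a demographic parity gap for the decisions of a hypothetical ML tool; the metric choice, the data, and any threshold applied to it are illustrative assumptions rather than an established benchmark.

```python
# Minimal sketch of one candidate ethics metric: the demographic parity gap,
# i.e., the difference in favorable-decision rates between two groups.
# The decisions and group labels below are hypothetical, for illustration only.
import numpy as np

def demographic_parity_gap(decisions, group):
    """Absolute difference in favorable-decision rates between group 0 and group 1."""
    decisions = np.asarray(decisions)
    group = np.asarray(group)
    rate_0 = decisions[group == 0].mean()
    rate_1 = decisions[group == 1].mean()
    return abs(rate_0 - rate_1)

decisions = [1, 0, 1, 1, 0, 1, 0, 0]   # 1 = favorable decision by the hypothetical tool
group     = [0, 0, 0, 0, 1, 1, 1, 1]   # membership in a protected group
print(f"Demographic parity gap: {demographic_parity_gap(decisions, group):.2f}")
```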


Guidance of greater detail and specificity may be appropriate to identify and assess ethical risks in the context of these domains; alternatively, data domains could be defined around ethical risks to better match ML capabilities. New metrics would seem to be required for the assessment of ethical risks, but which metrics? And how many are needed? Guidance and standards for ethics are still nascent and targeted at principles rather than at ethics performance benchmarks. Rather than developing metrics top down from broad principles, it may be more practical to develop them in the ethical context of particular tasks or of organizational or professional behavior patterns. From this perspective, the development of a set of metrics targeted at a particular ethical context (e.g., the rules of professional responsibility for lawyers) should proceed first. Later categorization of those metrics against a broader framework may provide a perspective on the scope of metric coverage and enable insights from metrics developed in other contexts.

Traditional software is developed by writing down the program logic that governs system behavior. With Machine Learning (ML) software, the rules are instead inferred from training data. For many software systems, the code base may rely on third-party libraries; in the case of ML systems, the training may likewise be done with third-party data. Most software development processes stress (to varying degrees) the testing of the software under development. Operation of large-scale, complex software systems typically frontloads functionality testing into acceptance tests and then handles software changes as release upgrades, which may receive some degree of regression testing depending on the operational environment and the software supply chain. ML software systems have fundamentally different operational characteristics because their learning mechanisms can change the behavior of the software outside of the traditional software update processes.
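
The contrast can be made concrete with a minimal sketch; the loan-style feature names, training data, and threshold below are assumptions for illustration, and the classifier is a generic scikit-learn model rather than any particular legal-tech product.

```python
# Hypothetical contrast between explicit program logic and behavior inferred from data.
from sklearn.linear_model import LogisticRegression

# Traditional software: the rule is written down and can be read and reviewed directly.
def approve_explicit(income, debt):
    return income > 3 * debt  # illustrative threshold; the logic IS the code

# ML software: the "rule" is whatever the model infers from (possibly third-party) training data.
X_train = [[50_000, 10_000], [20_000, 15_000], [80_000, 5_000], [30_000, 25_000]]
y_train = [1, 0, 1, 0]  # historical decisions used as labels
model = LogisticRegression().fit(X_train, y_train)

print(approve_explicit(40_000, 12_000))       # behavior follows from visible logic
print(model.predict([[40_000, 12_000]])[0])   # behavior follows from the training data
```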

The testing challenge for ML systems is clearly significant, even without considering the specific challenges of testing ML systems for compliance against high-level ethical objectives with new metrics and performance benchmarks. AI software can be expected to undergo revisions and updates just as other software does. With ML software continuing to learn new behavior during operation, ongoing assessments of ethics risks will be required. The data collection processes driving operation of the AI software may change over time, and the AI system may learn new behaviors from new data. In addition, the metrics and measurement techniques for assessing ethics performance can be expected to evolve over time. The number of changing components reinforces the need for a continuous improvement mechanism for assessing the ethical performance of ML systems. A PDCA cycle focused on assessment of ethics risks could help maintain and improve the ethical performance of these software systems.
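
One way such a cycle might be operationalized is sketched below: the Check step re-measures an ethics metric on recent operational data, and the Act step flags the system for review when the metric regresses past an agreed benchmark. The single scalar metric, the benchmark value, and the metric function are simplifying assumptions for illustration.

```python
# Minimal sketch of the Check and Act steps of a PDCA loop for ethics performance.
# ETHICS_BENCHMARK and the metric function are hypothetical placeholders.

ETHICS_BENCHMARK = 0.10  # assumed maximum acceptable value for the chosen ethics metric

def check_and_act(metric_fn, recent_decisions, recent_groups):
    """Check: re-measure the ethics metric on recent operational data.
    Act: trigger a review if ethical performance has regressed past the benchmark."""
    score = metric_fn(recent_decisions, recent_groups)
    if score > ETHICS_BENCHMARK:
        return f"ACT: metric {score:.2f} exceeds benchmark {ETHICS_BENCHMARK:.2f}; trigger review"
    return f"OK: metric {score:.2f} within benchmark; continue monitoring"
```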

Ethical guidance for the developers and operators of AI and autonomous systems is slowly emerging. Software technology approaches for verification and validation of ethical software will challenge the specificity of existing professional guidelines for ethical conformance. The exponential growth in data, and the consequent changes in business practices, demand responses from the professions, government, and the public. The volume of data challenges traditional natural language methods for ethics inquiries. New metrics and automated measurement approaches may be tractable if suitable ethical performance benchmarks can be established. Continuous improvement approaches (e.g., the PDCA cycle) could be applied to raise the performance benchmarks for ethical AI software over time.

If you need help with an AI ethics issue, contact me.

An extended treatment of this topic is available in a paper presented at the IEEE 4th International Workshop on Applications of Artificial Intelligence in the Legal Industry (part of the IEEE Big Data Conference 2020).

References

(Jobin et al., 2019) Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389–399.

The Problem Statement Problem

Problems have been described as the discrepancy between the current state and some desired future state. A problem-cause-solution pattern is common as a critical thinking approach, providing argumentation to support proposed solutions. This approach is particularly attractive if existing predictive models based on the causal actions are available. Given new inputs, an existing predictive model provides a mathematical basis for calculating (predicted) new results within the limitations of the model. As a mathematical technique, predictive models have been successfully applied in a variety of fields, from scientific endeavors to commercial activities like algorithmic stock trading, predicting accident risk for auto insurance, and healthcare outcomes.
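
A minimal sketch of that calculation is shown below, fitting a simple linear model to made-up observations and then predicting the outcome for a new input; the data values and the choice of model are illustrative assumptions.

```python
# Minimal sketch of a predictive model: infer a relationship from observed data,
# then calculate predicted results for new inputs (valid only within the model's limits).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])    # observed values of a single control variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])    # observed outcomes (made-up data)

model = LinearRegression().fit(X, y)
print(model.predict(np.array([[6]])))      # predicted outcome for a new input
```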


When considering potential bias in problem solving, the predictive model is often a starting place. A good model is both as accurate as possible and as simple as possible, making it easy to understand and apply – and also easy to misapply if its limitations are not understood. Most models reduce the number of control variables considered because this simplifies the model, enabling easier development, validation, and use. Models are typically validated over a limited range of control variable values, but reality may not be constrained to that range. Complex systems in the real world are often affected by multiple control variables, and those may interact in interesting non-linear ways. The phenomenon being modelled is typically assumed to have a stable pattern of behavior, but this assumption is not always true. Humans, animals, and artificial intelligence software can all exhibit learning behaviors that evolve over time. Predictive models of systems with learning behaviors, developed at one point in time, may not be valid after new behaviors are learned. While model developers strive for prediction accuracy, most models are approximations. The degree of precision of an approximation may limit the predictive power of a model.
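
The limited-validation-range point can be illustrated with a small sketch: a linear model fit over a range where the true behavior happens to look linear appears adequate, but extrapolating it well outside that range produces a badly wrong prediction. The data and ranges below are made up for illustration.

```python
# Minimal sketch of extrapolating beyond the validated range of a model.
import numpy as np
from sklearn.linear_model import LinearRegression

x_fit = np.linspace(0.0, 1.0, 30).reshape(-1, 1)  # narrow range used for fitting/validation
y_fit = np.sin(x_fit).ravel()                     # true behavior is non-linear overall
model = LinearRegression().fit(x_fit, y_fit)      # looks adequate on the fitted range

x_new = np.array([[8.0]])                            # input well outside the validated range
print("model prediction:", model.predict(x_new)[0])  # keeps rising roughly linearly
print("actual value:    ", np.sin(8.0))              # the true, non-linear value is far lower
```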

The tools we use alter our perception of the problems we solve – variously paraphrased along the lines of: if one has a hammer, one tends to look for nails (Quote Investigator, 2014). Even professionals with specialist expertise in particular fields tend to look at problems from the perspective of their profession. Indeed, they may risk liability issues if they deviate from professional norms. The approach of identifying a cause may be seen as argumentative or evaluative, e.g., when there are multiple causes or explanations leading to a problem. The model development approach looks for variables that can be isolated and controlled, but these may not be the only causes of problems. In a legal (liability) context, there is a notion of a proximate cause: a cause that produces particular, foreseeable consequences. This typically requires the court to determine that the injury would not have occurred “but for” the negligent act or omission (the proximate cause). The widespread adoption of big data collection and artificial intelligence techniques (e.g., machine learning) has increased attention on the need to move beyond statistical correlation to prove causality. Recent progress in the development of causality proofs (e.g., causality notations (Pearl & Mackenzie, 2018)) has enabled significant improvements in the development of predictive models. While the problem-cause-solution pattern is common, there are situations where action (or a solution) is required without establishing a cause. The continuing operation of the system may not afford time for a causal determination, or the costs of inaction may be too great. In this action-bias context, the objective may be to take “reasonable” actions (e.g., to avoid known bad outcomes) rather than attempt to resolve the problem.

A desired future state may be described in objective terms; a desired state, however, must be desired by some real human (i.e., it is subjective), as non-humans do not have desires. Broad consensus on some desired future state may provide an aura of objectivity. “Wicked problems” lie in the area where broad consensus on the desired future state does not exist. If there is no consensus on the desired future state, then that lack of consensus likely applies not just to proposed solutions, or to causality model selections, but also to the problem statement itself.


Problem statements delimit the scope of the problem to avoid extraneous matters and focus on the information relevant to the problem. The problem statement provides a context and forms a perspective on the problem. Perspectives include not just data observations, but also some meaning associated with those observations, focusing attention on the most relevant and important observations. Framing the problem from different perspectives may result in different solutions: different problem statements will likely have different causal explanations proposed, leading to different solution proposals. There is often a rush to solve a problem rather than to clarify the problem statement first. A problem statement should provide clarity around the four W’s of the problem:

  • Who – Who does the problem affect? Do they recognize it as a problem? Has anyone else validated that the problem is real? Who realizes the value if the problem is solved? Who else might have a useful perspective on the problem?
  • What – What is the nature of the problem? What attempts have been made to resolve the problem?
  • When – When does the problem happen? What are the antecedent and contemporary events? What is the Temporal Perspective?
  • Where – Where does this problem arise? Is there observational data of the problem context correlated with its occurrence? What is the Geographic Perspective on the problem?

When it comes to your problem – what type of problem solver are you? Ad hoc or intuitive problem solvers risk spending their effort solving the wrong problem and not achieving the impact they might hope for. A systematic approach – capturing the problem statement and then reframing it from multiple perspectives – may take more time initially, but it avoids the risk of solving the wrong problem. Why is the problem worth solving? Why are you trying to solve it? Once you have your client’s problem statement, you can refine it to focus on the problems that matter, for greater impact.

When developing the problem statement for your client, understanding diverse perspectives can impact the scope of the desired future state as well as the constraints on viable solutions. If you are developing client problem statements, you might be interested in our free Guide to Writing Problem Statements.

A course on the use of perspective to refine problem statements is now available.

  Problem Perspectives Course

If you need help with your problem statement, contact me.

References

(Quote Investigator, 2014) Quote Investigator (2014, May 8). https://quoteinvestigator.com/2014/05/08/hammer-nail/

(Pearl & Mackenzie, 2018) Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Basic Books.