I believe there is no deep difference between what can be achieved by a biological brain and what can be achieved by a computer. It therefore follows that computers can, in theory, emulate human intelligence — and exceed it.
In September 2017, during a school-year kick-off speech broadcast to students in 16,000 schools across Russia, Vladimir Putin announced, “Artificial intelligence [AI] is the future, not only for Russia but for all humankind. Whoever becomes the leader in this sphere will become the ruler of the world.” A year and a half later, Greg Allen, Chief of Strategy and Communications at the Department of Defense’s (DoD) Joint Artificial Intelligence Center (JAIC), reported, “Despite expressing concern on AI arms races, most of China’s leadership sees increased military usage of AI as inevitable and is aggressively pursuing it. China already exports armed autonomous platforms and surveillance AI.” Later that year, on November 5, 2019, Defense Secretary Mark Esper announced that China had exported lethal autonomous drones to the Middle East: “Chinese manufacturers are selling drones advertised as capable of full autonomy, including the ability to conduct lethal targeted strikes.” Countering Russian and Chinese pursuit, possession, and export of lethal autonomy, the 2018 DoD Artificial Intelligence Strategy emphasized:
Our adversaries and competitors are aggressively working to define the future of these powerful technologies according to their interests, values, and societal models. Their investments threaten to erode U.S. military advantage, destabilize the free and open international order, and challenge our values and traditions with respect to human rights and individual liberties.
The “powerful technologies” invoked in DoD’s AI Strategy and in the comments of Esper, Allen, and Putin are lethal autonomous weapons (LAWs), a subset of machines that employ AI. Although there is no internationally agreed-upon definition of LAWs, the DoD defines them as weapons that “can select and engage targets without further intervention by a human operator.” These are the “killer robots” referred to in the media and by organizations dedicated to banning them. Though technology for some LAWs exists, and variants of them have been on the battlefield for decades, fully autonomous lethal systems for offensive use have yet to make their battlefield debut.
In the quest to remain a “leader in this sphere,” the United States (U.S.) Congressional and Executive Branches have prioritized research and development of autonomy for military applications. These priorities are evident in the fiscal year 2020 National Defense Authorization Act (FY20 NDAA), the fiscal year 2019 National Defense Authorization Act (FY19 NDAA), the President’s Executive Order on Maintaining American Leadership in AI, the Pentagon’s Third Offset Strategy, the National Defense Strategy, and DoD’s AI Strategy. Efforts are underway within DoD to facilitate the development of weaponized autonomous platforms (LAWs) capable of operating offensively, beyond human control. At this time, DoD policy, reflected in Department of Defense Directive (DoDD) 3000.09, directs Combatant Commanders to “integrate autonomous and semiautonomous weapon systems into operational mission planning” and to identify how LAWs may satisfy operational needs.
So, in a word, LAWs are inescapable. The days of debating whether LAWs should be developed are over. Commentators have already shown that fully autonomous lethal weapons are not illegal per se, which is to say that the Law of Armed Conflict (LOAC) does not prohibit their use in all circumstances. Barring an agreed-upon prohibition, States are limited by their own policies, like DoDD 3000.09, and by the limitations of the technology itself; the popular concern about robots running amok exaggerates their capabilities. From the United States’ perspective, DoDD 3000.09 requires “appropriate levels of human judgment” over autonomous weapons, including those capable of full autonomy. Though LAWs are not prohibited under the LOAC, and they can operate lawfully, few commentators discuss pragmatic safeguards for ensuring that they actually do operate lawfully once fielded. The existing legal framework for identifying and addressing potential LOAC concerns in weapons systems is ill-suited to the unique nature of autonomous weapon systems, because of:
- What we are pursuing;
- Where we are getting it;
- How we are acquiring it.
Together, these vulnerabilities set the stage for building risk into LAWs, an already immature and risky technology. While rigorous testing serves a critical role in minimizing these and other risks, it cannot and should not be the cure-all. In a reality where the inevitable trajectory of clashing international interests tosses LAWs into the crucible of armed conflict, the LOAC requires consideration of its tenets during the design of LAWs’ “decision-making” models, in conjunction with those who will be held responsible for employing them: commanders, whose responsibility extends to the foreseeable consequences of their decisions. To this end, selected teams of judge advocates and combat-seasoned commanders, tasked as collaborators and issue-spotters, should be involved as early as possible in the design and development of LAWs’ learning models. There are no legal barriers to this involvement, and the current regulatory system allows immediate implementation, limited only by industry’s willingness to participate.
In support of this proposition, Section II first defines LAWs, briefly explains the underlying technology, and discusses the “black box” problem, while Section III examines how LAWs’ algorithms raise LOAC issues during their development. Section IV describes the current weapons review process and why it is inadequate to mitigate the LOAC issues and risk factors of what, where, and how. Section V explains efforts already in place, where blind spots remain, and what more should be done.
II. Defining Lethal Autonomous Weapons
Autonomy uses artificial intelligence (AI) to mimic human decision-making. Though the U.S. Government has no settled definition of AI, Section 238 of the FY19 NDAA defines it as a system that “performs tasks under varying and unpredictable circumstances without significant human oversight, or that can learn from experience and improve performance when exposed to data sets.” The DoD further describes autonomous systems as “self-directed toward a goal in that they do not require outside control, but rather are governed by laws and strategies that direct their behavior.” As stated in the introduction, DoD defines LAWs as weapons that “can select and engage targets without further intervention by a human operator.” Once deployed by a human, a LAW can identify a target and attack without further human direction, meaning it can operate with a human “out of the loop,” which is a particularly useful capability when operating in a swarm, in communications-denied or degraded areas, when the volume of data exceeds human capacity to review and analyze, or when there is not enough reaction time for human decision-making.
Autonomy is accomplished by algorithms, which are simply “a sequence of instructions telling a computer what to do,” or a set of problem-solving processes and rules. These instructions and rules are similar to the decision process a human uses to navigate through traffic to get to work, which can be optimized for different preferred outputs, like the most direct route, the fewest tolls, the most scenic route, or the one most convenient to a grocery store. Given a decision model, an algorithm predicts the best route. A subcategory of algorithms, called learning algorithms, enables autonomy in LAWs. A learning algorithm looks for patterns within inputs (e.g., facial images gathered by its sensors), makes a prediction, and learns from the outcome, continuously improving. Learning algorithms come in different forms and may be referred to as learners, learning systems, agents, or recognizers, depending on the method used to achieve learning and its objective. For this discussion, a LAW’s apparatus that enables autonomous “decision-making” will be referred to as a learner. Learners use deep learning and neural networks for unsupervised learning and “mimic the web of neurons in the human brain” by passing data through layers of filters, looking for patterns, until the data reach the output layer, which contains the answer. A programmer sets goals for the learner and may also use reinforcement reward signals to incentivize correct decisions or penalties to deter incorrect ones, a process called learning or training a model. After achieving its goal, the learner stores its experience to strengthen similar decision-making.
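To make these mechanics concrete, consider the minimal sketch below. It is illustrative only: the reward values, learning rate, and toy data are invented for this article, and a real system would rely on a deep neural network with millions of parameters rather than a handful of weights. Still, it shows how programmer-chosen reward and penalty signals shape a learner’s decision model during training.

```python
# Illustrative only: a toy learner whose decision model is shaped by
# programmer-chosen reward and penalty signals.
REWARD, PENALTY = 1.0, -1.0   # reinforcement signals set by the programmer
LEARNING_RATE = 0.1

def train(examples, epochs=50):
    """Each example pairs a feature vector with the 'correct' decision."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, correct in examples:
            score = sum(w * f for w, f in zip(weights, features))
            decision = score > 0
            signal = REWARD if decision == correct else PENALTY
            # The learner "learns from the outcome": a penalty nudges
            # every weight toward the correct decision.
            if signal == PENALTY:
                direction = 1.0 if correct else -1.0
                weights = [w + LEARNING_RATE * direction * f
                           for w, f in zip(weights, features)]
    return weights  # the trained "model": one weight per input feature

# Hypothetical training data: (features observed by sensors, desired decision).
training_data = [([1.0, 0.0, 1.0], True), ([0.0, 1.0, 0.0], False),
                 ([1.0, 1.0, 1.0], True), ([0.0, 0.0, 1.0], False)]
print(train(training_data))
```

Every choice in that loop, which signals to use, how strongly to correct errors, and which examples to train on, is a human decision that silently becomes part of the finished model.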
Lethal autonomous weapons will likely rely on several different types of learners. For example, one type, known as a recognizer, looks for patterns within images to classify and predict what an image depicts. Consider an example of a lethal autonomous drone trained to target snipers by a programmer unfamiliar with the LOAC. The programmer, learning or training the drone’s unsupervised recognizer to identify snipers, would give its software a data set containing images of service members (or combatants, generally), including those exhibiting characteristics associated with snipers. The recognizer would then apply layers of filters to the data to determine what it observed. The recognizer may look for identifying factors like a body in a prone position, camouflaged, motionless, physically isolated from other people, and with a weapon aimed in a particular direction. Each of these features forms one layer, or node, and at the output layer, the recognizer would determine whether it was looking at a sniper. Upon reaching an answer, the recognizer would create a model for image classification of snipers. It would then continually refine that model as it encounters more images. Despite our ability to fine-tune a learner’s model, employ reinforcement learning with rewards and penalties, and control the data sets used for training, a learner’s decision-making remains opaque.
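The layered structure of that hypothetical recognizer can be sketched schematically. In the fragment below, every cue name, weight, and threshold is invented for illustration; a trained deep network would learn its own filters and weights from data rather than have them hand-coded, which is exactly why its reasoning is so hard to inspect.

```python
# Schematic sketch of the hypothetical sniper recognizer: each node
# scores one visual cue, and the output layer combines the cues into a
# classification. The weights here are hand-set for illustration; a
# real network would learn millions of them.
CUES = ["prone", "camouflaged", "motionless", "isolated", "weapon_aimed"]
WEIGHTS = [0.25, 0.15, 0.20, 0.15, 0.25]   # invented, not learned
THRESHOLD = 0.80                           # output-layer decision point

def classify(cue_scores):
    """cue_scores: per-cue confidences (0..1) from upstream detectors."""
    total = sum(w * s for w, s in zip(WEIGHTS, cue_scores))
    return "sniper" if total >= THRESHOLD else "not a sniper"

# A figure matching every cue scores high; note that a prone hunter or
# an unconscious soldier could present nearly identical cues.
print(classify([0.9, 0.8, 1.0, 0.9, 0.95]))  # -> sniper
print(classify([0.9, 0.8, 1.0, 0.9, 0.0]))   # -> not a sniper
```

The sketch also previews the legal problem developed in Section V: outwardly identical cues can describe both lawful targets and protected persons, and nothing in the arithmetic distinguishes them.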
Evaluating a learner’s effectiveness and reliability proves difficult in machine learning because the decision-making occurs within its multiple layers of nodes and neural nets. This creates a “black box” scenario where algorithms create hidden algorithms unknown to software and testing engineers. According to a group of experts, called JASON, tasked with examining AI for DoD uses:
[T]he sheer magnitude, millions or billions of parameters (i.e., weights/biases/etc.), which are learned as part of the training of the net . . . makes it impossible to really understand exactly how the network does what it does. Thus the response of the network to all possible inputs is unknowable.
Ultimately, not only is testing the network’s response to all inputs impossible, but because a learner’s decision-making occurs in a black box, evaluators can never know why a learner acts the way it does. “You can’t just look inside a deep neural network to see how it works. A network’s reasoning is embedded in the behavior of thousands of simulated neurons, arranged into dozens or even hundreds of intricately interconnected layers.” The DoD’s AI Ethics Principles, which set standards for the use of AI, control for this limitation. One of the five principles is that AI is “traceable,” meaning technicians can examine how the software reached its conclusions. Explainable AI is just that—traceable and knowable—but its early-stage tools are not yet suited for LAWs.
And so the black box problem appears irreconcilable with the requirement for “appropriate levels of human judgment” over LAWs. One may be tempted to suggest rigorous testing will be sufficient, but it, too, has limits: “[T]he number of possible input states that such learning systems can be presented with is so large that not only is it impossible to test all of them directly, it is not even possible to test more than an insignificantly small fraction of them.” (emphasis added). If their decision-making models cannot be understood, and cannot be adequately tested, how is a commander to account for the reasonably foreseeable consequences of her decision to use LAWs? Commanders need not rely on faith alone; the black box has windows.
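Before turning to those windows, a back-of-the-envelope calculation shows the quoted claim is no exaggeration. Assume, purely for illustration, a sensor far cruder than any real targeting system would carry:

```python
# Even a crude 28x28-pixel grayscale sensor with 256 intensity levels
# per pixel admits 256**784 distinct inputs -- roughly 10^1888, dwarfing
# the ~10^80 atoms estimated in the observable universe.
import math

pixels = 28 * 28
digits = pixels * math.log10(256)   # number of decimal digits in 256**784
print(f"possible sensor inputs ~ 10^{digits:.0f}")
```

No test campaign samples more than a vanishing fraction of such a space, which is why the windows described next matter.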
To resolve the black box problem and our inability to adequately test machine learning models, DoD must continue its quest for explainable AI, and in the meantime fully exploit the multiple human touch points occurring across the design timeline that offer critical opportunities for human involvement and understanding. Among them:
- Training decisions, including what data to use;
- Goal selection;
- Choice and weighing of reward and penalty signals;
- Evaluation of the learner’s output and its final decision-making model;
- Adjustments to a learner’s architecture;
- Engineering of the machine-operator interface, and how operator adjustments may interact with the learner;
- Integration of recommendations from end users, legal advice and legal reviews into training decisions, goal selection, reinforcement, and evaluation;
- End-user interface options and command decision to employ.
These touch points provide the means for injecting human judgment into a learner even though, when operationalized, a LAW operates fully autonomously, outside human control. In the simplified sniper-targeting drone example above, the drone simply did what its programmer trained it to do by setting reward signals for finding and targeting snipers. The drone’s model for making targeting decisions was learner-made, but human-taught. A LAW’s decision-making ability is highly dependent upon how its learner’s models are programmed and trained, and so the accuracy and reliability of a LAW’s performance are directly tied to the human trainers whose inputs, rewards, goals, and adjustments are knowable at the time of programming. But human insight into the black box is fleeting. Once the human touch point passes, that window closes and the model’s neural nets run the show, building off training and additional inputs from the environment around it. New windows open as humans interact with the model, but determining which input or adjustment led to a particular output becomes nearly impossible. Leveraging these windows permits appropriate levels of human judgment and enables commander compliance with the LOAC.
III. Algorithms Raise Legal Issues
The United States is bound by the Law of Armed Conflict, which embodies international treaty law and customary international law. All weapon use, including use of fully autonomous lethal weapons, must adhere to the LOAC, which is to say it must comply with the principles of military necessity, humanity, proportionality, distinction, and honor. But ensuring LAWs’ programming correctly accounts for the LOAC represents only the low bar for legality; layered on top of LOAC requirements are operation-specific rules of engagement, policy considerations, human restraint, and international norms. Applying the LOAC tenets to military operations occurs during planning and execution, when a commander (or servicemember) makes real-time determinations as an operational situation unfolds. But autonomy changes that. The “when” in the decision-making process occurs much earlier. The United States has suggested that LOAC issues are tied to the LAWs’ programming, meaning a learner’s training must enable its later use to conform to the LOAC.
A. Legal Issues Under the Law of Armed Conflict
Among the LOAC issues raised by LAWs are the bedrock principles of distinction and proportionality. Distinction simply means only proper military objectives are made the subject of attack. A commander using a LAW must reasonably believe that the learner can distinguish between its intended target and those it must avoid. If used to select and engage targets autonomously, the LAW must be able to distinguish between combatants and non-combatants, and between military objectives and civilian objects. In conflicts where adversaries clearly indicate their military membership, by wearing a recognizable military uniform and openly bearing arms, a particular combatant’s targetable status would be readily apparent to a LAW. But where adversaries and civilians are outwardly indistinguishable, a combatant’s targetable status must be determined by other, less visible clues, like past behavior and intent. For LAWs, interpreting body language and context poses significant hurdles, though not insurmountable ones. Yet for a LAW to be used lawfully, the commander must reasonably believe that it can distinguish between correct and incorrect targets and behave predictably even when circumstances change after its mission commences. If not, the commander’s choice to employ the LAW in that particular circumstance would be unlawful.
To comply with proportionality, a commander must ensure an attack’s likely collateral damage is not excessive in relation to the concrete military advantage expected to be gained. After making a proportionality determination, the commander using his LAW must reasonably believe its effects will conform to his estimation of damage. But, as with the principle of distinction, the LAW’s programming and training were conducted and tested long before the facts of the commander’s engagement present themselves. So, the commander must be able to predict with reliability how the LAW will behave. Unlike servicemembers, whose training and decision-making are relatively transparent, LAWs’ deep learning models are opaque. The commander cannot know how a LAW was trained or how it will make decisions given situation-specific, real-time facts.
This disparity can be overcome to an extent with rigorous testing and evaluation and operator training, but ultimately, commander confidence requires well-trained LAWs—training which occurs during design. Thus, experts familiar with advising commanders on LOAC issues in military operations must be present during design to help equip LAWs’ learners with lawful and reliable parameters when their models are trained.
The DoD has determined that principles like distinction and proportionality are complicated and weighty enough to assign co-located legal advisors to deployed combat and combat support units. The DoD also mandates that combatant commanders obtain legal reviews of all plans, policies, directives, and rules of engagement for LOAC compliance. In an operational environment, a commander’s decision-making is reactive to real-time circumstances, informed by battlefield experience and accompanying legal advice and judgment, among other information. This is in sharp contrast to LAWs’ decision-making learners, which are trained and tested by human programmers—likely non-DoD. A combat learner’s decision-making models evolve and take shape in the hands of engineers long before the commander receives the weapon, and likely without meaningful legal guidance during training. Thus the key difference between addressing a commander’s real-time LOAC challenges and addressing a LAW’s LOAC challenges is not tied to who makes the decisions, but when they are made.
B. Issues Arise During Programming
In November 2016, the U.S. submitted to the Convention on Certain Conventional Weapons (CCW) Group of Governmental Experts (GGE), “Weapons that use autonomy in target selection and engagement seem unique in the degree to which they would allow consideration of targeting issues during the weapon’s development.” Restated:
[I]f it is possible to program how a weapon will function in a potential combat situation, it may be appropriate to consider the law of war implications of that programming. In particular, it may be appropriate for weapon designers and engineers to consider measures to reduce the likelihood that use of the weapon will cause civilian casualties. (emphasis added)
Such commentary reflects the U.S. view that LAWs must be designed in accordance with the LOAC, not that LAWs must themselves make legal decisions. To emphasize that point, the U.S. offered that “it might be appropriate to consider whether it is possible to program or build mechanisms into the weapon that would reduce the risk of civilian casualties.” In effect, this means the U.S. acknowledges that, although law of war issues typically arise within a particular military operation in real time, the unique character of autonomy bends the timeline for when such issues should be considered back to the point of programming.
In its August 28, 2017 submission to the GGE, the U.S. again emphasized the need to consider LOAC principles like distinction, proportionality, humanity, and military necessity when deciding whether to “develop or deploy an emerging technology in the area of lethal autonomous weapons systems.” In its January 2019 report on AI and National Security, the Congressional Research Service (CRS) reported that “domain adaptability” presents challenges for militaries when “systems developed in a civilian environment are transferred to a combat environment,” and that these failures are exacerbated when AI systems are deployed at scale. Thus the critical juncture for training an autonomous system’s learners to stay within the bounds of the LOAC lies squarely during design when the goals and parameters that guide a learner’s decisions are set. The design timeframe varies by the particular aspect of technology being developed, and so determining when a judge advocate’s involvement is timely must consider how the risks associated with autonomy render the current system for reviewing LOAC compliance in weapon systems inadequate.
IV. Current Process for Mitigating LOAC Issues is Inadequate
A. Legal Reviews for Weapons
When an agency contemplates buying a weapon, whether building one from scratch or adapting a commercially available variant, the current process requires at least one legal review and, for developmental weapons, an earlier legal review prior to full-scale engineering. As outlined in DoDD 3000.09, the acquisition of LAWs requires two legal reviews: a preliminary legal review prior to formal development, and another legal review prior to fielding. But these reviews examine a weapon’s legality too narrowly and too belatedly. Attorneys scrutinize weapons and weapon systems from many angles apart from “weapons reviews,” as these LOAC-focused reviews are called in shorthand (during an acquisition, for example), but only weapons reviews address potential LOAC concerns.
When conducting a weapons review, the legal advisor receives a requirements document, a general description of the weapon, a description of the mission, and the desired terminal ballistic effects of the weapon, along with tests and lab studies, if included. The attorney’s review focuses on whether the weapon is “illegal per se,” that is, whether the weapon is prohibited for all uses, including when the U.S. has agreed to a prohibition. The review also considers “whether the weapon is ‘inherently indiscriminate,’ i.e., if the weapon is capable, under any set of circumstances and, in particular, the intended concept of employment, of being used in accordance with the principles of distinction and proportionality.” Distilled further, if a weapon is not prohibited, if it can be aimed, and if its effects can be limited, it would pass legal review.
Under DoDD 3000.09, only autonomous weapons that use autonomy in new ways trigger what appear to be additional requirements. The drone in the sniper example above would have been subjected to senior official approval before formal development, and senior official approval again before fielding. Although DoDD 3000.09 directs rigorous verification and validation (V&V) and testing and evaluation (T&E), from a legal perspective, none of the enhanced measures mandated by DoDD 3000.09 actually requires any additional legal scrutiny beyond that already directed by Army Regulation (AR) 27-53 and DoDD 5000.01 for all weapons. All new weapons, whether autonomous or not, may receive a legal review before full-scale development, and must receive one prior to fielding. This means lethal, fully autonomous weapons used in ways never before seen in combat receive the same level of legal scrutiny as the L5 “Ribbon Gun,” a one-time contender to replace the Army’s tried-and-true M4 carbine. But lethal autonomous weapons are not M4s; LAWs are characterized by their software, which receives no scrutiny under the current weapons review process. Even if it did, the gates for weapons reviews occur so late in the acquisition process that any LOAC issues arising during design would have long since been set and obscured within a LAW’s algorithmic black box.
B. Risk Factors Unique to Lethal Autonomous Weapons
The major risk factors rendering the current process for identifying LOAC compliance concerns inadequate fall into three categories: what we are pursuing, where we are getting it, and how we are acquiring it. The following discussion addresses each.
1. What we are pursuing.
As discussed in Section II, the technology that enables autonomy in LAWs presents significant obstacles to understanding how it works, even for the experts who create it. The greatest obstacle to fielding LAWs is the inability to test and evaluate them, because combat presents near-infinite possibilities for LAWs’ decision-making.
The black box problem means we cannot know how a learner’s model makes decisions, what biases may be trained into the model, how it set about achieving its goals, how the built-in parameters affected its decision-making, and so on. The limited opportunities to observe the structure and contents of the black box, the human touch points, exist only when the model is trained. For the attorneys conducting weapons reviews, the aperture of these already narrow windows is further constricted by time and distance. Relatively far into the process, the LAW’s legal reviewer receives from the developer or acquiring agency a prepared batch of information. With only the provided documentation, testing, and lab results, the legal advisor must learn how the LAW operates well enough to opine as to its legality.
Even if weapons reviews examined software capabilities, the information provided must somehow be comprehensive enough to identify issues buried deep within the learner’s model at the points in time when humans imbued the model with judgment. The attorneys conducting the weapons reviews are separated by time and distance to such a degree that a written request for a weapons review and its accompanying enclosures simply cannot produce a picture of how the model was built. Unless a legal advisor versed in the weapons review process participated at key points in a model’s training, and could enhance and explain the information provided in the request for a weapons review, paper is simply insufficient to capture what must be glimpsed in person.
2. Where we are obtaining the technology.
Compounding our inability to adequately test LAWs, the research and development of their underlying technology occurs in scattered pockets, some within DoD but the vast majority outside it. A LAW will not arrive at the Pentagon’s front steps fully formed and ready for purchase. Thus DoD will most likely acquire various AI-enabled component technologies from multiple internal and external sources, often without knowing how they may ultimately be used, and then layer them on top of other AI technologies. Absent access to the design table, we are limited to testing the technology upon acquiring it (or seeking to acquire it).
Even if the technology is generated internally, or is industry-developed and internally refined, convincing researchers, scientists, engineers, and developers that collaborating with an attorney in the early stages of designing a learner is actually beneficial may require a colossal culture shift in how the role of the attorney, and attorneys themselves, are viewed. This institutional recoiling could dampen any willingness to flag projects raising possible LOAC issues, in order to avoid bringing attorneys into the design process, allowing those projects to slip through the cracks until they arrive at the required weapons review gate, too late for preventative legal involvement. Operating within the status quo, to the extent it excludes judge advocates from the design process, works to the detriment of the effective and lawful use of LAWs.
3. How we are acquiring the technology.
As discussed above, fully autonomous lethal weapons do not yet exist, but some capabilities do. Over time, machine learning capabilities will be layered together with other autonomous capabilities and then fitted to a physical platform, punctuated throughout by iterations of testing, modifying, and refining the technology specifically for DoD’s needs. Along the way, DoD will look to industry to pair its technology, expertise, and resources with DoD’s own to create the first LAWs. To effectuate this exchange, DoD will follow an acquisition strategy, or combination of strategies. Numerous strategies exist, but the traditional process follows the DoDD 5000-series, starting with DoDD 5000.01. The “5000-series,” for what are now called major capability acquisitions, has been derided as slow, ineffective, expensive, risk-averse, and cumbersome for industry and the DoD alike, making it a less attractive route for rapid development, production, and fielding of emerging technologies like LAWs. If DoD wished to develop a LAW from start to finish on its own, including research, development, testing, and evaluation (RDT&E), prototyping, and full-scale production, it would likely follow the 5000-series framework for a major capability acquisition. This scenario seems unlikely given that the lion’s share of research and development for LAWs’ enabling technology will occur outside DoD’s purview.
Other, more flexible pathways exist, and that flexibility makes them more attractive for acquiring cutting-edge technology. For example, Section 804 of the FY16 NDAA established Middle Tier Acquisitions (MTA) for two categories: rapid prototyping and rapid fielding to meet emerging military needs. They are intended to be completed quickly and are therefore exempt from the most cumbersome aspects of the 5000-series. Rapid prototyping requires operational capability within five years of a requirement, and rapid fielding means production within six months and complete fielding within five years of a validated requirement. Another authority flows from Section 2447d of the FY17 NDAA, which permits non-competitive follow-on production contracts or other transactions for prototype projects when the project “addresses a high priority warfighter need or reduces the costs of a weapon system.” Section 2447d also grants Service Secretaries transfer authority, which means they can transfer available procurement funds to pay for low-rate initial production.
Despite its reputation, the 5000-series has its own efficiencies. Department of Defense Directive 5000.71 enables combatant commands to request processing of urgent operational needs, which means a validated request sees a fielded solution within two years. This process may be used in conjunction with Section 806’s Rapid Acquisition Authority (RAA); used together, RAA and DoDD 5000.71 enable warfighter needs to be fulfilled exceptionally quickly.
Though not an acquisition pathway, the DoD may also pursue and adapt commercial technology derived from Independent Research and Development (IR&D) under 10 U.S.C. § 2372. Independent Research and Development envisions DoD adapting research and development conducted in the commercial sector for defense purposes. Under Section 2372, DoD reimburses contractor expenses for research and development conducted outside of the department’s control and without direct DoD funding. Projects must have potential interest to the DoD, and include those that improve U.S. weapon system superiority and promote development of critical technologies.
With all that flexibility and speed, one may wonder where in the process weapons reviews fall. Each acquisition pathway follows its own procedural rules and allows for varying degrees of overlap with other pathways, but the only one that dictates when weapons reviews must be conducted is the 5000-series. The 2019 version of Army Regulation (AR) 27-53 contemplates rapid acquisition strategies and acquisition of emerging technology and attempts to bridge the gap by requiring a weapons review pre-development for weapons or weapon systems sought through a rapid acquisition process. Acknowledging the importance of early reviews, AR 27-53, paragraph 6g requires preliminary legal reviews for pre-acquisition category projects, like advanced concept technology demonstrations, rapid fielding initiatives, and general technology development and maturation projects when the technology is “intended to be used . . . in military operations of any kind.”
Refocusing the issue, the 5000-series is the least likely path for acquiring LAWs’ technology because it is notoriously slow and rigid, but the governing DoD policy on LAWs, DoDD 3000.09, points to the DoDD 5000-series framework for the timing of weapons reviews within the acquisition process. Yet, as discussed in Section III.B, fluid timing of judge advocate involvement is a crucial element to mitigating LOAC issues. This is problematic.
Because LOAC issues raised by LAWs’ algorithms arise when the learners are trained, the current acquisition process, regardless of pathway, renders weapons reviews either too late, too narrow, or too disconnected from the various human touch points that allow consideration of targeting issues during the weapon’s development. Those human touch points offer crucial windows for appropriate levels of human judgment to be incorporated into LAWs’ algorithmic models and their training—judgment tempered by legal counsel similar to that which commanders receive during military operations. Fortunately, no regulatory hurdles prevent an enhanced legal advisor role, but hesitancy from industry could.
V. Building on Current Efforts to Address Blind Spots
The current legal framework allows for broadening the scope of judge advocate involvement. The services have taken steps to involve judge advocates earlier in a weapon’s development, even before the weapon or its technology enters the acquisition process. But these efforts are just the first steps, and blind spots remain. The following section touches on the permissive character of the regulations governing legal advisor involvement, some efforts to expand the scope of current legal advisor involvement, where vulnerabilities remain, and how to use existing resources to address them.
A. Getting the right people in the right place.
Generally, those seeking legal advice in carrying out DoD business may readily obtain it. The issue is not a lack of legal advisors but not knowing how, or being unwilling, to use them. Figuring out how judge advocates add value during the design and training of LAWs’ enabling technology opens doors of possibility but remains an unanswered question, partially because LAWs’ technology only exists in incomplete fragments, and partially because lawyer involvement in the earliest stages only occurs on an ad hoc basis, if at all.
Within the acquisition arena, attorneys play important roles throughout the process, but are not tasked with reviewing LOAC concerns in weapon systems. For instance, when an acquisition is contemplated, legal advisors located within requiring agencies prepare acquisition packages, provide support to contracting units reviewing proposed solicitations, participate as members of acquisition teams offering legal and non-legal counsel, and offer legal advice to source selection decision authorities. Within the Army, many of those attorneys are not co-located with the agency they support; rather, they belong to a contracting support unit (e.g., Contracting Support Brigades), a Staff Judge Advocate’s Office, or Army Materiel Command. Despite their involvement as legal advisors, these attorneys’ roles are not especially intended for spotting design or operational issues associated with the LOAC, and they are not physically co-located in the places most likely to encounter them. Their roles in refining requirements for a LAW would be more concerned with accurately describing what the LAW needs to be able to do, not how the LAW must do it. An attorney assisting with refining an agency’s needed capabilities for a LAW could simply include a requirement for LOAC compliance. But the complexity of translating what that actually means—and threading LOAC compliance through programmer, evaluator, and operator—lends itself poorly to simple insertion as a contractual requirement. Furthermore, downstream attorneys reviewing performance of that requirement are as ill-equipped as the weapons reviewer to spot potential flaws or operational defects in how a programmer trained a model to function within the LOAC. Recent efforts to modernize how the Army acquires emerging technology and advances certain types of technology set the stage for an expanded judge advocate role.
The Army’s hub of innovation and cutting-edge research resides within Army Futures Command (AFC), headquartered in Austin, Texas, with offices scattered throughout the U.S. Judge advocates and civilian attorneys working within AFC already advise its cross-functional teams (CFTs), Combat Capabilities Development Command (CCDC) research labs, and the Artificial Intelligence Task Force (AI TF) and its Applications Lab. The breadth of legal advice they offer remains in its nascent stages, but could include early issue-spotting across the spectrum of legal topics, including LOAC issues. This is one of the locations within the Army most likely to encounter the technology for LAWs in its earlier stages, either by virtue of the Army’s own internal research and development or as a result of some variety of Army-industry partnership. The judge advocates and civilian attorneys within AFC and its subordinate units may be dispatched outside of AFC, including upon industry request, wherever their presence is needed. The vulnerability resides in the assumption that AFC (and its sister-service equivalents) is an omniscient entity, when AFC is but one agency within DoD, with limited resources, capable of seeing only those projects that fall within its broad reach.
To standardize efforts on this issue, DoD should promulgate a consistent, uniformly applicable policy requiring the employment of judge advocates to identify LOAC issues in LAWs. The judge advocate/commander teams should be situated within AFC but mobile and readily available to whomever needs them. Recalling the sniper-targeting drone example from Section II, the programmer unfamiliar with the LOAC would doubtlessly also be unfamiliar with its prohibition on targeting those who are hors de combat, meaning they are “out of the fight.” It takes little imagination to envision a scenario where a sniper exhibits the same qualities as an unconscious soldier lying motionless beside his weapon, a civilian hunter awaiting a clear shot, or a medic rendering aid to a fallen comrade. Each may appear to be lying in a prone position, camouflaged, motionless, isolated, and aiming a weapon in a particular direction, yet only the sniper would be a valid target.
Training a learner’s model to identify the nuances of what makes the sniper’s legal status different—and thus subject to attack—requires both a firm understanding of the law that governs when one is out of the fight and of the characteristics, behavior, and tactics employed by one who is fairly in it. Put another way, the model must set a sniper apart from a teenager hiding with a paintball gun. The experienced operational commander (or former operator) would understand these characteristics and be able to articulate them so a programmer could train the model to search for and recognize them. The judge advocate versed in dispensing operational advice would complement the commander’s tactical expertise with legal perspective, adding dimension and detail to the programmer’s, and thereby the model’s, understanding of the LOAC. Lethal autonomous weapons’ models are simply extensions of humans’ prediction and problem-solving models; both need multiple sources of “expertise” in developing their decision-making. The entity within the Army with attorneys best situated to team up with commanders and offer their expertise at the critical time is AFC.
While judge advocates offer the advantages of training, experience, and education, they are not the only attorneys able to provide such support. The DoD abounds with highly capable civilian attorneys, including many with prior service as judge advocates, across all services. Their expertise and experience with military operations and the acquisition process provide a valuable resource. On the issue of whether the attorney must be conversant in coding, familiarity with the concepts would be desirable, but the emphasis should instead be on collaborating with the various experts designing the technology, which requires communication and interpersonal skills and a well-rounded support network as much as anything else.
B. Doing the right things.
The role of the judge advocate/commander (or operator) team should be to assist the engineers, scientists, and programmers in building LOAC durability into the deep learning algorithms’ architecture, leveraging the human touch points, so that when a commander or operator manipulates the LAW’s various capabilities and constraints, whatever machinations take place within the black box also stay within the bounds of the LOAC. As a practical matter, a LAW is useless unless a commander can reliably control it. Knowing that she or he is accountable for the foreseeable consequences of its behavior, a commander contemplating using a LAW that she or he does not understand would simply bench it. Outwardly, a commander experiences a model’s training through its performance and the LAW’s operator interface, which is the means by which the commander “makes informed and appropriate decisions in engaging targets.” Thus, the interface provides a critical means for the commander to set mission-specific parameters on the LAW. A recent study by the CCDC Army Research Lab (ARL) examined what it takes for a human to trust a robot. The research team found that soldiers reported lower trust after seeing a robot commit an error, even when the robot explained the reasoning behind its decisions. The lack of trust endured even when the robot made no more errors. The heart of the issue is trust, which means those responsible for designing LAWs’ deep learning models must have not only a keen awareness of commanders’ real-time operational needs but also the ability to translate those needs through the operator interface into a LOAC-resilient model.
To this end, as in the sniper-targeting drone example discussed above, a judge advocate/commander team would provide real-world operational scenarios; offer insights on the interplay among targeting decisions, the LOAC, theater-specific rules of engagement, and policy considerations; and explore how different options built into the operator interface could control for varying levels of risk (see the sketch below). The team could also assist with ensuring the machine and human share the same objective, and that they are able to adjust in unison as circumstances change. Related to this concept is understanding each other’s “lanes”: in other words, the machine and the human each knowing the limitations of the other’s decision capabilities, and how those limitations may change as objectives change. Integrating operational realities into a learner’s model means they must be taught, and what better teachers than those who bear the responsibility in real life?
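What might such interface options look like? The fragment below is purely hypothetical: every parameter name and value is invented for illustration and drawn from no actual or planned system. It sketches how commander-set, LOAC-informed constraints could bound whatever the black box “decides” internally.

```python
# Purely hypothetical sketch of mission-specific constraints a commander
# might set through a LAW's operator interface. All names and values
# are invented for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class MissionConstraints:
    engagement_area: str          # geographic bounds on autonomous action
    min_target_confidence: float  # classification confidence required to engage
    max_collateral_estimate: int  # proportionality ceiling set pre-mission
    abort_on_comms_loss: bool     # fail-safe behavior selected by the commander

def within_limits(confidence: float, collateral: int,
                  limits: MissionConstraints) -> bool:
    """The interface's job: keep the model's outputs inside commander-set
    bounds, whatever the black box 'decides' internally."""
    return (confidence >= limits.min_target_confidence
            and collateral <= limits.max_collateral_estimate)

mission = MissionConstraints("grid NK 3527", 0.95, 0, True)
print(within_limits(confidence=0.97, collateral=0, limits=mission))  # True
print(within_limits(confidence=0.97, collateral=2, limits=mission))  # False
```

Designing such options well, and knowing which constraints commanders actually need, is precisely where the judge advocate/commander team earns its keep.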
The DoD has mandated legal advice for all operational decision-makers and offers in-theater judge advocates to dispense it, but judge advocates offer more. They bring to the table critical thinking skills and a diversity of thought that are important to the collaborative process, exactly what they offer commanders in operational settings. Viewing the lawyers as teammates, as opposed to ivory tower gatekeepers, maximizes the skill set they possess. Providing researchers, programmers, and engineers the same access the military offers operational commanders just means the judge advocate’s place of duty changes; their advice is required at the design table, not just while deployed.
C. At the right time.
Ensuring the right people are in the right place at the right time hinges on when DoD gets its first opportunity to examine autonomous technology. If the first opportunity comes as part of the acquisition process, the Federal Acquisition Regulation (FAR) and its supplements, applicable to the vast majority of acquisitions, permit early and ongoing legal involvement beyond legal reviews. As discussed above, for developmental weapons or weapon systems, AR 27-53 provides that initial reviews may be made at the earliest possible stage, and pre-acquisition technology projects intended for military use must receive a preliminary legal review.
The Navy counterpart to AR 27-53, Secretary of the Navy Instruction 5000.2E, requires that potential acquisition or development of weapons receive a legal review during “the program decision process.” The Air Force equivalent, Air Force Instruction 51-401, requires a legal review “at the earliest possible stage in the acquisition process, including the research and development stage.” But in practice, across all services, actual legal advisor involvement more closely aligns with the baseline requirements discussed in Section IV.A, meaning early involvement of legal advisors to spot LOAC issues rarely occurs.
In an effort to integrate judge advocates earlier into the pre-acquisition process, the Air Force includes judge advocates as members of cross-functional acquisition teams, advising within an assigned portfolio, like F-15s; Cyberspace; or Intelligence, Surveillance, and Reconnaissance (ISR). Air Force judge advocates also provide direct legal support to the research labs. Of the ten research lab directorates, three have in-house legal counsel, and the remaining satellite locations receive support from a nearby legal office. If LOAC-specific issues arise, servicing legal advisors send them through their channels to a single office at the Air Force Judge Advocate’s Office (AF JAO).
In the Navy, the judge advocates performing weapons reviews engage in outreach with program managers, educating them about their responsibilities to get legal reviews and involve legal advisors in the acquisition process. Legal advisors are also physically located in or near some research labs, though their support does not envision addressing LOAC concerns. For all services, unless the researchers, programmers, and engineers know to ask, LOAC issues may well go unnoticed until it is too late to fix them. A DoD policy could change that.
As discussed in Section IV.B.2, DoD’s first opportunity to examine autonomous technology will likely arise from outside DoD. This scenario leads to the greatest challenge and most promising solution to mitigating the various risk factors bearing on LAWs and the LOAC: access. Specifically, whether industry is willing to bring DoD into its design process.
The DoD has been directed to engage with industry. In his March 2017 memorandum, the Deputy Secretary of Defense encouraged cooperation with industry: “While we must always be mindful of our legal obligations, they do not prevent us from carrying out our critical responsibility to engage with industry.” Congress goes beyond encouragement and directs the DoD to “accelerate the development and fielding of artificial intelligence capabilities [and to] ensure engagement with defense and private industries.” In Section 238(c)(2)(H) of the FY19 NDAA, Congress states that designated officials “shall work with appropriate officials to develop appropriate ethical, legal, and other policies for the Department governing the development and use of artificial intelligence enabled systems and technologies in operational situations.” (emphasis added). Industry engagement is not only permitted, it is mandated.
Though DoD may desire industry engagement, that willingness is not necessarily mutual. Barriers include mistrust of DoD, more lucrative and less cumbersome options elsewhere, resistance to supporting DoD’s mission, lack of awareness about opportunities to work with DoD, and lack of understanding of how to access those opportunities. The DoD has taken strides to address the latter four concerns by creating an approachable physical presence in tech hubs, like the Army Applications Lab in Capital Factory in Austin, Texas; SOFWERX in Tampa, Florida; the Air Force’s AFWERX innovation hubs in Washington, D.C., Las Vegas, and Austin; and the AI Lab in Pittsburgh, Pennsylvania. It has also expanded opportunities for quick-turnaround payoffs with on-the-spot contracts awarded during industry engagement events, like the Air Force’s Pitch Days, the Navy’s Small Business Innovation Research and Small Business Technology Transfer (SBIR/STTR) programs and NavalX, and the Army’s Innovation Days. Reverse Industry Days foster transparency and encourage communication by offering industry a chance to share its practices and lessons learned with the military, improving the military’s processes and securing more industry collaboration.
Pitch Days, Innovation Days, Industry and Reverse Industry Days, the flexible acquisition strategies discussed in Section IV.B.3, and ease of access to DoD’s storefront-type locations help nudge forward industry-DoD cooperation. But the intractable problem remains: fostering trust within industry that DoD’s participation during design does not equate to giving away the crown jewels. For many companies, guarding the inner workings of their processes and technology is the same as guarding the viability of the company itself. Allowing an unknown government employee to observe, poke, prod, and question is simply unthinkable. Overcoming that intransigence means taking consistent, measured steps to incentivize access.
This can and should be accomplished from many angles. Among them: tying design-process access to money by making it a condition of contract or other transaction award, with an emphasis on those agreements that entail researching, designing, and developing autonomous capabilities that could later be used in a LAW. As seen in DoD technology challenges, like the Defense Advanced Research Projects Agency’s (DARPA) robotics challenge, commercial start-ups placed a premium on “establishing themselves as the market standard” over and above their own investments in their technology. Commercial firms are willing to trade technology, or access to it, in exchange for recognition and DoD adoption.
Another is to start with small successes, sending judge advocates to participate in isolated, lower-threat projects. Judge advocates already support industry outreach efforts, as discussed in Section V.A; they are literally in the room when private sector innovators hawk their creations hoping for a deal with DoD. Leveraging that presence with training, a strong support network, and a clear objective (access to the design process) advances DoD’s interests in early involvement in the design of learners whose future calling may be within a LAW.
Most importantly, DoD needs a clear and consistent policy, announced to all potential industry partners, that its objective in pursuing machine learning autonomy is to actually be able to use it, which means minimizing the risk that vulnerabilities, indiscernible during testing, are smuggled inside the black boxes we buy. To achieve that, the policy should encourage industry to invite judge advocate/commander teams as collaborators and facilitators as early as possible, to identify and prevent possible LOAC issues before they arise. Whenever feasible, when DoD contemplates acquiring machine learning technology, the request for proposals should include a requirement that DoD receive the intellectual property (IP) and data necessary for weapons reviews. The potential contractor and DoD could negotiate a special license for the pertinent data, for the sole and express purpose of conducting weapons reviews, accounting for the need to recertify the license as the learner modifies itself over time.
These efforts could avoid costly delays in later acquisition stages; provide private developers a means to keep their valuable IP and data rights while allowing DoD the access it needs to engender trust and reliability for the end user; and prevent mishaps and other challenges during operation.
The complexity of how LAWs’ enabling technology learns, combined with its industry origins and unpredictable uses, and the rapid, risk-absorbing acquisition pathways employed to obtain it, requires adjusting the current process for identifying and addressing potential LOAC issues in weapon systems. Though weapons reviews serve an important and necessary function, and rigorous testing will ferret out many of the problems, they should not be the only safeguards against the unique LOAC issues posed by autonomy in weapon systems. Relying solely on weapons reviews and ad hoc requests for legal support fails to consider how autonomy transforms battlefield LOAC concerns into laboratory LOAC concerns, and ignores the limitations of arms-length legal reviews. Because no legal barriers exist to judge advocates’ enhanced participation in the design process, the DoD should take immediate action to incentivize the use of judge advocate/commander teams by commercial developers working on machine learning capabilities, and DoD organizations should be required to request them. Project managers, cross-functional team members, DoD employees engaging with industry, and anyone participating in projects to design machine learning models for DoD applications should be empowered to identify the human touch points when a judge advocate should be present. Lethal autonomous weapons will be commanders’ tools, intended to assist them in achieving mission success, and judge advocates will remain commanders’ trusted legal advisors. As the military prepares for LAWs to assume their inevitable place in formation, changing the fundamental nature of war, leveraging the judge advocate’s historical role as combat advisor is the right place to start.