Generative Pre-Trained Transformers and the Department of Defense’s
Own Generative Artificial Intelligence Large Language Model
By Lieutenant Colonel Thomas S. Hong
This commentary makes the case that the Department of Defense (DoD)
needs to create its own generative artificial intelligence (GAI) large
language model (LLM) similar to ChatGPT.[1]
Before digging in more thoroughly, however, I must start with a full
disclosure[2]
that this commentary was not written with the help of any generative
pre-trained transformer (GPT) chatbot or other GAI LLM.[3]
Given GPT technology’s uncanny ability to create or improve written
prose on many subjects, it was certainly tempting to run some ideas
through the free version of ChatGPT.[4]
However, pumping the brakes on the impulse, I decided that I enjoy the
mental spectacle of my own thoughts battling it out in my own
headspace. Some of those thoughts get kicked with a roundhouse and
stuffed in the recycling bin, while others are coddled like
rosy-cheeked newborns who make a parent’s heart swell with pride. But
to be frank, the temptation to use a GAI LLM tool is strong because
time is a precious resource, and GPT chatbots like ChatGPT do one thing
better than I do: they do not procrastinate when prompted to write out
a coherent thought. They move out with alacrity and, in seconds, spit
out impressive text on whatever general or specialized topic they are
given. I will readily admit that they also do so better than most
humans when accounting for speed, coherence, and writing mechanics.[5]
I will also disclose that this introduction survived the mental
recycling bin because I thought its first-person narrative could be a
discreet way to show that no GAI LLM was used in its drafting.[6]
I can imagine there will still be retorts of doubt, since newer
chatbots like GPT-4 are becoming more human-like in style, word
choice, and creative prose and can effortlessly provide first-person
commentary with a unique voice. However, if some sentences in this
commentary appear to be awkwardly constructed, note that these were
deliberately left that way as indicia of humanness or, at least, the
quirkiness of the author.[7]
Admittedly, it is easy to conceive that newer GPT chatbots, when
prompted, may mimic less-than-perfect writing that could be passed off
as solely human-created.[8]
From this initial disclosure and burgeoning debate, this commentary
turns to the topic at hand: the future of military law practice in the
age of GPT chatbot prevalence. More specifically, the focus here is on
whether all attorneys, paralegals, and other legal professionals like
judges and law professors should deeply contemplate what GPT chatbots
will do to, or for, the legal profession. Within this extended thought
process, one of the tantalizing questions to address is, “Does the DoD
need its own GPT chatbot?”[9]
For an emerging group of legal professionals, the answer is a
resounding “Affirmative!” However, if you disagree, I look forward to
hearing from you after you have finished this commentary and
considered the nascent thesis developed here.
Charting a path to an affirmative response for a DoD GPT chatbot can
be done with a review of technological breakthroughs in artificial
intelligence (AI). One good starting point is to recall when a
chess-playing AI named Deep Blue beat a world chess champion in
1997.[10]
When this made the news, one can imagine that legal professionals at
the time were only mildly amused, as their profession did not depend
on how well they performed on a chessboard.[11]
Some also probably registered similar reactions when AlphaGo beat a
Go grandmaster in 2016 and when DeepMind's agents beat professional
gamers at the strategy video game StarCraft II in 2019.[12]
Again, Go and video game expertise do not impact legal professionals
in the workplace. Perhaps forward-thinking legal professionals raised
more eyebrows when IBM’s Watson beat brainiac Jeopardy champions in
2011.[13] However, it was unlikely that legal professionals would put
down the coffee pot with an incredulous look when the news spread.[14]
In contrast, when GPT chatbots almost passed the bar exam in 2022 and
GPT-4 finally did in 2023, we got an eye-opening introduction to the
capabilities of GPT-powered chatbots, a technology that looks capable
of replacing large numbers of legal professionals.[15]
After reading many articles, conversing with various people, and
testing ChatGPT personally, many of us became convinced that we, as
legal and knowledge professionals, must get smart on this technology
and get energized now.[16]
In fact, Model Rule 1.1 of the American Bar Association’s (ABA’s)
Model Rules of Professional Conduct states that “[t]o maintain the
requisite knowledge and skill, a lawyer should keep abreast of changes
in the law and its practice, including the benefits and risks
associated with relevant technology.”[17]
Given the rapid rollout of various LLMs in the past year, it is easy
to say with confidence that GPTs are now considered highly relevant
technology for lawyers.
For purposes of this commentary, being “energized” means researching
and advocating for the DoD to develop its own GPT for legal
professionals and all DoD knowledge professionals.[18]
As the DoD starts to take ever wider steps to become energized, legal
professionals can be thought leaders who come to the table from the
users’ perspective with a desire to be valuable members of the
development team.[19]
They can and should be brainstormers, designers, trainers, testers,
and overall change agents in the specialized area of GPT chatbots.
After all, legal professionals are in the language and knowledge
business, and a DoD GPT will raise the language and knowledge
capacities of these personnel to a higher degree. How much “higher”
will largely depend on the size, quality, and usefulness of the
datasets and how powerful the AI engine behind the DoD GPT will be.[20]
Given our dual training and mastery of both civilian and military law
and regulations, coupled with access to vast amounts of data to train
a DoD GPT, including controlled unclassified information (CUI), DoD
legal professionals can become skilled users of a DoD GPT.[21]
Once they become highly proficient users, legal professionals can also
assist other, non-legal DoD knowledge professionals in the use of a
DoD GPT in their respective fields. As a professor quipped during a
legal research lecture at the U.S. Army’s Judge Advocate General’s
Legal Center and School (TJAGLCS), legal professionals are the
undisputed experts at searching through other staff sections’
regulations and telling them the answer when asked a non-legal
question couched as a legal one.[22]
With a DoD GPT that has the capabilities of similar GPTs in existence
or in development, the turnaround time from “legal” may be drastically
reduced.[23]
This would be just one of the many beneficial use cases for a DoD GPT.
The next consideration is how we can bring this capability to the DoD
now.
For legal professionals to take on this unofficial yet important goal
to join or even lead the effort to bring about a DoD GPT, teamwork and
collaboration will be essential. To further explore and develop this
multifaceted and multilayered idea, TJAGLCS formed the TJAGLCS AI
Study Group (TAISG).[24]
Volunteer faculty, center staff, and resident students—both long- and
short-term students at TJAGLCS—formed this informal and open study
group. The purpose of the group is to provide a digital and in-person
venue for legal professionals at TJAGLCS to discuss all topics related
to AI and to collaborate on projects that may benefit the Judge
Advocate General’s (JAG) Corps specifically and the DoD generally. The
study group members have shared articles and resources to stay
informed on the latest developments in AI and how they may apply to
the study and practice of military law. Moving forward, this group has
discussed the desire for further action. Although readers may see a
need for additional lines of effort, the initial seven areas discussed
by TAISG members are outlined below.
LTC Hong speaks at the Contract Attorney’s Course at The Judge Advocate General’s Legal Center and School in Charlottesville, VA. (Credit: Billie Suttles, TJAGLCS)
1. Self-Directed Non-Technical Education in AI and Its Myriad Subdisciplines
Many legal professionals in the Army JAG Corps, other Services' JAG
Corps, and the DoD studied AI technologies before the advent of
ChatGPT. Building on that foundation, the immediate goal should be to
drastically enlarge the number of legal professionals who will
self-study AI and its application to DoD legal practice. Although AI
for use in the military can take many forms and applications (such as
machine learning (ML) and machine vision for military intelligence or
targeting purposes), the ideal area for growth would be the study of
the technology that makes LLMs and GPT-powered chatbots capable of
passing the bar exam and possibly becoming AI legal experts. The aim
in calling for more students of AI technology from the legal
profession is to build up a large group of informed users who,
eventually, can act as expert users and advisors to AI scientists,
engineers, and technicians.
For those seeking external motivation to start a program of
self-study, keep in mind that your efforts are in line with the DoD's
2018 AI Strategy, which states that we “will harness the potential of
AI to transform all functions of the Department positively, thereby
supporting and protecting U.S. [Service members], safeguarding U.S.
citizens, defending allies and partners, and improving the
affordability, effectiveness, and speed of our operations.”[25]
Further plans for educating select members of the DoD workforce are
included in the 2020 DoD AI Education Strategy.[26]
Even though there are no current plans to resource formal AI education
for legal professionals, those interested can start learning through
self-study. Although not the focus of this commentary, there are many
readily accessible means to acquire knowledge in AI technologies for
the non-technical user. For example, Army legal professionals can take
advantage of free audio and e-books, videos, and courses from several
educational resources from the DoD.[27]
Interested legal professionals can reach out to the author for a
growing list of educational and training resources.[28]
2. Conduct Simple Experiments with Commercially Available GPT Chatbots
Knowing that many legal professionals have already experimented with
GPT chatbots, the effort advocated here is to be deliberate and
coordinated in capturing and sharing individually learned insights on
how to apply GPT chatbots in the practice of military law. Combined
with this is the need to share the lessons within the DoD to
exponentially increase the collective insights into the capabilities
and limitations of using existing GPT chatbots to conduct the DoD’s
legal support missions.
After the DoD establishes official policies on employee
use of commercial GPT chatbots, conducting experiments could mean
using the identical inputs (prompts) over time to see if outputs
(answers) change or using identical inputs between GPT chatbots to
gauge the accuracy or “expertise” levels of different GPT systems.[29]
These inputs and outputs would be saved and used to test a DoD GPT
chatbot if, or when, one is developed.[30]
This is just one example of how legal professionals can design
experiments to test and train future GPT chatbots. Sharing these
lessons learned with other legal professionals will not only help many
to understand the capabilities but also help them become connoisseurs
of GPT chatbots and know how to rate them appropriately.[31]
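To make this concrete, below is a minimal sketch of what a deliberate,
repeatable experiment might look like once official policy permits it.
It assumes access to a commercial chatbot API (OpenAI's Python client
is used purely for illustration); the prompts, model name, and log
file are hypothetical, and no government or personal information
should ever appear in the prompts.

```python
# A minimal sketch of the prompt-consistency experiment described above.
# Assumptions: DoD policy permits use of the commercial API, and the
# prompts contain no government or personal information. The prompts,
# model name, and log file below are hypothetical illustrations.
import csv
import datetime

from openai import OpenAI  # pip install openai (>=1.0)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FIXED_PROMPTS = [
    "Summarize the elements of a qui tam action under the False Claims Act.",
    "Explain the difference between a bid protest and a contract claim.",
]


def run_experiment(model: str = "gpt-4") -> None:
    """Send the same fixed prompts and append dated outputs to a CSV log,
    so identical inputs can be compared over time and across models."""
    with open("gpt_experiment_log.csv", "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for prompt in FIXED_PROMPTS:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            answer = response.choices[0].message.content
            writer.writerow(
                [datetime.date.today().isoformat(), model, prompt, answer]
            )


if __name__ == "__main__":
    run_experiment()
```

Running the same script weekly, or with a different model argument,
would build exactly the record contemplated above: identical inputs
over time and identical inputs across chatbots, saved for later
testing of a DoD GPT.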
One of the key factors to successfully developing a DoD GPT chatbot
may be how well legal professionals, as collective users, plug into
the ML operations (MLOps) methodologies that the AI community is
developing.[32]
One possible way for legal professionals to plug into the process
would be to become active and engaged team members of the integrated
product teams (IPTs) to help develop and shape the DoD GPT chatbot.[33]
They would also act as liaisons to the legal community, not just
passive legal reviewers after the product is ready to be rolled out.
The input and feedback from legal professionals in the IPTs can be
invaluable to developers, especially if there is a robust discussion
about GAI LLMs and GPTs among the larger legal community. After
integrating legal representatives into the MLOps process, the next
line of effort moves into researching technology companies and their
GPT chatbot products.
Before moving on, the point made above is re-emphasized here because of
its importance: individuals must wait for the official DoD policies and
guidance on how DoD personnel can use commercial chatbots to experiment
with government information, even if it may seem sterilized. The Navy’s
Chief Information Officer (CIO) issued a memorandum in September 2023 to
“offer interim guardrail guidance when considering the use of Generative
[AI] or [LLMs].”[34]
In the guidance, the CIO states that “[t]he responsibility and
accountability for user-induced vulnerabilities, violations and
unintended consequences incurred by the use and adoption of LLMs
ultimately resides with each individual organization’s respective
leadership.”[35]
However, the guidance warns that “aggregation of individual
generative AI prompts and outputs could lead to an inadvertent release
of sensitive or classified information.”[36]
In October 2023, the DoD’s Chief Digital and Artificial Intelligence
Officer (CDAO) issued his “Interim Guidance on Use of Generative
Artificial Intelligence” to the senior Pentagon leadership, combatant
command commanders, and defense agency and DoD field activity
directors.[37]
The guidance is marked controlled unclassified information, so the
author can only share it with other Federal Government personnel.
Needless to say, it is the responsibility of individual government
personnel to read this guidance and receive approval from the
appropriate leader before proceeding with using commercial GPTs for
government work.
A useful mental framework to have before engaging a commercial GPT
chatbot is to think that all your inputs and outputs will be publicly
available. In other words, what you think are private interactions with
the chatbot may not be. Just imagine that every keystroke inputted may
one day be broadcast on the jumbotron on One Times Square.[38]
Even without reading the DoD CDAO’s interim policy memorandum, reviewing
the Navy’s warning above should make government personnel pause before
deciding to ask a commercial GPT chatbot a question or give it a task
that might be incorporated in their government work. As more government
personnel become tempted to use commercial GPT chatbots to do their
work, it is imperative that leadership in every organization set up its
own policies to protect government information and avoid inadvertently
weakening operational security.
3. Market Research on GPT Chatbots and Requirements Development
Using the methods of market research prescribed by the Federal
Acquisition Regulation (FAR) and other techniques used in Federal
procurement (such as Broad Agency Announcements and Commercial
Solutions Openings), legal professionals can assist by conducting
informal market research on GPTs and by supporting requirements
development. Legal professionals would bring their trained minds and
critical thinking skills to the effort. One concrete way to assist
would be at the requirements development phase, where legal
professionals can help draft key performance parameters (KPPs). One of
the KPPs could be that the DoD GPT chatbot must, as a baseline
capability, be able to achieve certain tasks, such as passing the
Multistate Bar Examination (MBE) and Law School Admission Test (LSAT)
at a certain score or drafting a legal review on certain topics.[39]
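As a rough illustration of how such a KPP might be verified during
testing, the sketch below scores a candidate chatbot against a bank of
multiple-choice items and checks it against a pass threshold. The
question-bank format, the answer_question() placeholder, and the 75
percent threshold are hypothetical assumptions for illustration, not
an actual DoD test procedure.

```python
# A minimal sketch of verifying a pass-rate KPP against a question bank.
# The file format, answer_question() stub, and 75% threshold are all
# hypothetical placeholders, not an established DoD test procedure.
import json

PASS_THRESHOLD = 0.75  # hypothetical KPP: at least 75% of items correct


def answer_question(stem: str, choices: list[str]) -> str:
    """Placeholder: a real harness would query the candidate DoD GPT
    and return the letter of its selected choice (e.g., "A" or "B")."""
    raise NotImplementedError


def evaluate(question_file: str) -> bool:
    """Score the candidate model on multiple-choice items (e.g.,
    released MBE-style questions) and check the KPP threshold."""
    with open(question_file, encoding="utf-8") as f:
        # Expected format: [{"stem": ..., "choices": [...], "answer": "B"}, ...]
        items = json.load(f)
    correct = sum(
        1
        for item in items
        if answer_question(item["stem"], item["choices"]) == item["answer"]
    )
    score = correct / len(items)
    print(f"Score: {score:.1%} (threshold: {PASS_THRESHOLD:.0%})")
    return score >= PASS_THRESHOLD
```

The same harness could be pointed at different models or question
sets, giving an integrated product team an objective, repeatable check
on whether a baseline KPP is met.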
Although market research and requirements development on a DoD GPT
chatbot may seem daunting, the DoD’s CDAO is providing resources to
DoD organizations such as Tradewinds, a “framework for sourcing,
funding, and developing solutions to challenges in the [AI]/[ML],
digital, and data analytics space.”[40]
According to the website, Tradewinds “launched the Tradewinds
Solutions Marketplace, a ground-breaking digital repository of
post-competition, readily-awardable pitch videos, which address the
Government’s greatest challenges in the AI/ML, digital, and data
analytics space.”[41]
This online resource was built so DoD organizations can find companies
and contract vehicles focused on AI solutions.[42]
These resources and tools will allow legal professionals with little
experience in market research to act as the eyes, ears, and brains in
the search for a DoD GPT solution.
According to news reports, the DoD has already started developing a
ChatGPT-like prototype chatbot called Acqbot.[43]
Its website describes the tool as “Powering the evolution of
Government Contracting” and states that “Acqbot makes it easy for
contracting officers to manage the contract lifecycle.”[44]
Little is known about how the DoD is developing this capability, but
there may be developmental synergies between Acqbot and a general-use
DoD GPT. Moreover, the issues of datasets and security requirements,
discussed in the next line of effort, will be relevant to both Acqbot
and a DoD GPT.
4. Identification of Datasets and Security Requirements for a DoD GPT Chatbot
Initial research on what makes GPT chatbots so impressive points to
the LLM and the huge datasets behind the “leap ahead” level of
performance of GPT-4 and its competitors. However, given the limited
transparency about what constituted the datasets, it is unclear
exactly what sources the GPT chatbots used to produce the substantive
and fact-based outputs.[45]
Some GPT chatbots may be able to provide the sources they used in an
output, but basic questions remain as to how ChatGPT and other LLM
products can fully and accurately cite all sources used in the
output.[46]
For example, it would be impressive to see what sources a GPT chatbot
used to “know” the law and legal procedures on, say, qui tam lawsuits.
Was it “fed” volumes of cases, treatises, law review articles, legal
blogs, Wikipedia entries, and so on? If so, a full list should be
readily available for the users to review. Here, the legal
professional can fact-check and judge the adequacy of the sources as
well as look for updates or changes to the law.
In addition to checking references, legal professionals can also add
value by evaluating the level of reasoning, complexity, and
integration of various subjects in the outputs. It may be debatable
whether a GPT chatbot can actually “reason” and “integrate” multiple
ideas like a human, but the examples that some GPT developers have
provided seem to closely mimic human thought, such as understanding
why a meme is funny.[47]
Beyond the issues of checking references and assessing a GPT’s skills
in reasoning, many questions related to GPT databases and security are
left to explore. The questions run the gamut, including these that
come to mind: (1) How large should the raw data and training data be
to achieve the best result? (2) What training models work best with
the datasets used? (3) What is to be made of open source databases and
nonpublic, secured data? (4) What can a programmer do about older
datasets that may be useful on one level but not as useful in another?
(5) What about updates and new data created from the outputs? These
questions, issues, and more will need to be discussed with AI data
experts who can shed light on these matters. After all, we may be
asking the wrong questions due to a lack of technical and scientific
knowledge.
One major benefit of developing a DoD GPT would be the DoD's
retention of control over selecting the datasets used to train the
GPT. Techniques such as “federated learning,” in which a model is
trained across decentralized data sources without pooling the raw
data, could support this control, and the DoD could choose the
universe of data with more deliberation after consulting with AI
scientists regarding the optimal size of the data or number of
datasets.[48]
In other words, if the dataset universe is too small, it could affect
the quality, accuracy, completeness, and, ultimately, the utility of a
DoD GPT. If it is too large, other factors like time and cost could
increase and swing the cost/benefit analysis in the opposite
direction.
A significant concern that requires research from a multitude of
angles is cybersecurity and the protection of data in the development
and deployment of a DoD GPT. The required security at the different
stages of development could drive the focus, from algorithms to
computation to the storage of data, including inputs and outputs once
the GPT reaches full operational capability. The security protections
needed to prevent any compromise of a DoD GPT must be front and
center.[49]
Any deliberation on the possible use cases for a DoD GPT includes
considerations of when there should be hard limitations or
restrictions on the use of a DoD GPT. This research area could
dovetail with the existing research and legal analysis on the ethical
use of AI in national security and military operations.[50]
One concern that comes to mind is the possibility of unauthorized
users gaining access to input data, even within the DoD. For example,
if both Government prosecutors and defense counsel are using a DoD GPT
and they input data into the system (such as parts of a case file for
analysis) in preparation for trial, is it possible that one side can
use the DoD GPT to glean information on what the adversarial side is
asking the GPT and what outputs were given?[51]
This assumes that input data fed into a DoD GPT is looped back into
the dataset for use in computing answers to future prompts.[52]
Again, these questions and issues only touch the tip of the iceberg,
but if the best and brightest AI and data experts are partnered with
the DoD, the hope is that a strong security solution will be
implemented and a DoD GPT chatbot can be deployed.
5. Application and Use Cases for a DoD GPT Chatbot
In this commentary, the foundational questions of whether a DoD GPT
chatbot can and should be used in military legal practice have been
presumed answered in the affirmative. However, the hope is that a
robust discussion can occur
within the legal community as many legal professionals take up
interest in researching and writing about how a DoD GPT chatbot can
affect their area of practice. One way to frame the inquiry into
whether a DoD GPT can make an impact is to divide up the query by the
main practice areas found in the Army JAG Corps, such as legal
assistance, administrative law, ethics, military justice, contract and
fiscal law, national security, labor and employment, environmental,
and other highly specialized areas such as intellectual property
law.[53]
Some have already considered possible use cases for a GPT by asking
ChatGPT how it can be used in the legal profession.[54]
The typical use case list would contain common legal work such as
summarizing investigative files; drafting findings and
recommendations; developing direct- and cross-examinations based on
the investigative files and sworn statements; and routine ethics
opinions like gifts and post-government employment letters.[55]
The answers provided by ChatGPT only scratch the surface. If more
minds were put to this task, greater insights and recommendations
would be netted and shared, leading to still more insights and
recommendations and, thus, the start of a legal-focused AI spring or
boom coming out of the DoD.[56]
The proper applications of, and limits on, a DoD GPT's use cases
would need to be examined and expanded upon once possible use cases
are set for discussion and debate. It is easy to envision that the
debate on the acceptable and unacceptable uses of a DoD GPT can also
inform the application of the DoD's ethical principles for AI to
systems beyond LLMs (such as AI-powered computer vision).[57]
In any case, the DoD will likely need to partner with private
technology companies leading the field of AI to explore the
possibilities.
6. Assist in the Development of Acquisition Strategies
Congress has provided the DoD with new procurement authorities to
accomplish its mission to modernize for the future fight.[58]
Since developing a DoD GPT will most likely involve the private
sector, many legal professionals, as both future users and experts in
contract and fiscal law, can assist the procuring arm of the DoD in
developing an acquisition strategy.[59]
Although acquisition strategies are only required for major system
acquisition programs (FAR-based contracts) that rise to a certain
total cost level, a strategy can be developed for any procurement to
promote the goals of
effective, economical, and timely decisions.[60]
Even though Other Transactions (OTs) may be the preferred method for
partnering with industry on a DoD GPT, and OTs do not require an
acquisition strategy, developing one nonetheless will help shape how
the DoD can negotiate an OT agreement.[61]
Informed by personal study, market research, and individual and
collective experimentations, legal professionals immersed in the GPT
arena can help draft the statement of need, product or service
descriptions, logistic and security considerations, and many other
areas addressed in successful acquisition plans. Of special importance
for GPT technology procurements will be how to negotiate the data
rights and intellectual property (IP) elements. This emerging area
intersects Federal acquisitions law with IP, licensing arrangements,
and, perhaps, inventions and patent law. The DoD will need the experts
at the DoD IP Cadre to weigh in on how best to negotiate all relevant
aspects of a DoD GPT that fall into their areas of expertise.[62]
7. Collaboration within the DoD
As mentioned above, teaming with many DoD individuals, groups, and offices will be key in
this effort to bring about a DoD GPT. As with all successful
initiatives, a good starting point would be the creation of a central
digital gathering point within the DoD for legal professionals to
share and collaborate on projects with like-minded people. This
starting point could be a webpage on the DoD’s Milsuite.mil or a
Government Microsoft Teams channel that is open to all DoD legal
professionals.[63]
There are many different ways to create a central digital gathering
point, but the main point is that the time to create a network of
legal professionals is now.
Creating something informal yet effective could be done
with little cost and effort. The pandemic years have taught us that
computer-based communication tools are available so people can share
their thoughts within a large group with a few mouse clicks, a
microphone, and a video camera. If the central hub is set up and
opened, there is optimism in many corners of the DoD that all DoD
personnel will be able to join and participate. Although these ideas
are quite simple, thrilling questions remain as to who will create
this digital media space; when, where, and how they will create it;
and what it will consist of. After these questions are answered, still
more questions are sure to follow related to resourcing and
organizational sponsorships.
Conclusion
To those who agree with the thesis of this commentary—that the DoD
should create its own GPT—know there are strong desires in certain
legal circles to join in that effort. For now, these people are
informally engaging with one another in the lines of effort outlined
above. However, it is reasonable to think that if anything of
consequence is to see the light of day, it will only be because of
tremendous individual initiative, teamwork, and collaboration. If
enough legal professionals in the DoD see the urgency to act on these
(and other future lines of effort) and are open to working with
others, there may be positive short-term gains achieved without any
significant investment of capital or loss of time.
Another idea to foster is to take the above lines of effort and turn
them into a conference agenda. If a DoD GPT symposium can attract
hundreds of legal professionals to gather to discuss their thoughts,
research, and progress throughout the DoD, a groundswell of action can
develop. The hope is, even if the DoD is coming into the GPT
capability development late in the game, once the collective minds of
legal professionals are united on bringing a DoD GPT into the world,
the American people will welcome the news. They will welcome it
because it will be known that the incredible power of GPT technology
is being studied by, developed with the assistance of, and utilized by
the trusted legal professionals of the DoD; legal professionals who,
rather than operating with a profit motive, stand for principled
counsel, mastery of the law, stewardship, and servant leadership.[64]
LTC Hong is the Chair of the Contract and Fiscal Law Department at
The Judge Advocate General’s Legal Center and School in
Charlottesville, Virginia.
[1] This commentary was originally drafted in April 2023 before GPT-4
Turbo was rolled out to the public, but I will keep the original
ChatGPT reference here as it is a generic label for an AI
capability. Moreover, the proposal is for a GAI created, trained, and
maintained by the Government, not just a custom DoD application
programming interface (API) that connects to a commercial product.
[2]
We are now living in the era where disclosure of this nature may
become standard practice. It was reported that over 200 e-books
listed ChatGPT as an author or co-author in Amazon’s Kindle store;
however, the question remains—how many written works since ChatGPT’s
public debut have incorporated GPT outputs and not properly credited
it? See Greg Bensinger,
ChatGPT Launches Boom in AI-Written E-Books on Amazon,
Reuters (Feb. 21, 2023, 3:34 PM),
https://www.reuters.com/technology/chatgpt-launches-boom-ai-written-e-books-amazon-2023-02-21.
[3]
The term “GPT” is being used generically in this commentary. The
term “generative pre-training” was introduced in a paper written by
four OpenAI personnel in June 2018.
See Improving Language Understanding by Generative
Pre-Training, OpenAI.com (June 11, 2018), https://openai.com/research/language-unsupervised.
The paper can be downloaded at the link above. The term “generative
pre-trained transformer” was also coined by OpenAI and is the
combination of two ideas: unsupervised pre-training and
transformers. Id. A “chatbot” is defined as “a bot . . . that
is designed to converse with human beings” by the Merriam-Webster
online dictionary. Chatbot,
Merriam-Webster, https://www.merriam-webster.com/dictionary/chatbot (last visited
Jan. 2, 2024).
[4]
ChatGPT is a product created by OpenAI, which describes the product
as a model that “interacts in a conversational way.” See
Introducing ChatGPT,
OpenAI.com, https://openai.com/blog/chatgpt (last visited Jan. 2,
2024).
[5]
The next generation of ChatGPT is GPT-4 (which requires the
pay-to-use ChatGPT Plus) and is another OpenAI product that it
promotes as being able to “solve difficult problems with greater
accuracy, thanks to its broader general knowledge and
problem-solving abilities.”
GPT-4 Is OpenAI’s Most Advanced System, Producing Safer and More
Useful Responses,
OpenAI, https://openai.com/gpt-4 (last visited Jan. 2, 2024).
[6]
See Jack Dunhill,
GPT-4 Hires and Manipulates Human into Passing CAPTCHA Test,
IFLScience (Mar. 16, 2023),
https://www.iflscience.com/gpt-4-hires-and-manipulates-human-into-passing-captcha-test-68016
(describing GPT-4 using very human-like hacking plans and
execution, such as hiring a human to solve a Completely Automated
Public Turing Test to Tell Computers and Humans Apart (CAPTCHA) for
it).
[7]
Some companies claim that their products can detect whether a
written work is generated by AI, but the author's personal testing
on free sites showed that the detection technology is not consistent
and, thus, not useful.
[8]
Imperfect writing may be a strategy to pass written work product as
human-produced, similar to the story of when a computer simulated
speaking like a thirteen-year-old boy to pass the Turing test.
See Computer Simulating 13-Year-Old Boy Becomes First to Pass
Turing Test,
The Guardian (June 9, 2014),
https://theguardian.com/technology/2014/jun/08/super-computer-simulates-13-year-old-boy-passes-turing-test.
[9]
In this question, reference to the DoD would include all military
and Civilian legal personnel from all the Service components and
civilian agencies under the DoD umbrella.
[10]
Deep Blue Defeats Garry Kasparov in Chess Match,
History,
https://www.history.com/this-day-in-history/deep-blue-defeats-garry-kasparov-in-chess-match
(last visited Jan. 2, 2024).
[11]
See Deep Blue, IBM, https://www.ibm.com/history/deep-blue
(last visited Jan. 2, 2024).
[12]
Steven Borowiec,
AlphaGo Seals 4-1 Victory Over Go Grandmaster Lee Sedol,
The Guardian
(Mar. 15, 2016),
https://www.theguardian.com/technology/2016/mar/15/googles-alphago-seals-4-1-victory-over-grandmaster-lee-sedol;
James Vincent,
DeepMind’s AI Agents Conquer Human Pros at StarCraft II,
The Verge
(Jan. 24, 2019),
https://www.theverge.com/2019/1/24/18196135/google-deepmind-ai-starcraft-2-victory.
[13]
IBM’s Watson Supercomputer Crowned Jeopardy King, BBC (Feb. 10, 2011),
https://www.bbc.com/news/technology-12491688.
[14]
“Putting down the coffee pot” is a reference to a movie scene in
Glengarry Glen Ross, a 1992 American movie adapted by David
Mamet of his play by the same name. See MovieClips,
Put That Coffee Down! – Glengarry Glen Ross (1/10) Movie CLIP
(1992) HD,
YouTube
(May 23, 2012), https://www.youtube.com/watch?v=r6Lf8GtMe4M.
[15]
Michael James Bommarito & Daniel Martin Katz, GPT Takes the
Bar Exam (2022), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4314839;
GPT-4,
OpenAI, https://openai.com/research/gpt-4 (last visited Jan. 2, 2024). As
for passing the bar exam, see OpenAI’s technical report that shows
that GPT-4 passed the Uniform Bar Exam (UBE) with a score of
298/400. GPT-4, supra.
[16]
See, e.g., Andrew Perlman,
The Implications of ChatGPT for Legal Services and Society,
The Practice, Mar./Apr. 2023,
https://clp.law.harvard.edu/knowledge-hub/magazine/issues/generative-ai-in-the-legal-profession/the-implications-of-chatgpt-for-legal-services-and-society.
[17]
Model Rules of Pro. Conduct
r. 1.1 (Am. Bar Ass’n
2020).
[18]
The term “knowledge professional” can essentially cover all DoD
personnel because without general and specialized knowledge, DoD
personnel would not be capable of conducting the department’s
day-to-day business and accomplishing its missions. Even if the term
is limited to language-based knowledge, it would essentially cover
all DoD personnel.
[19]
In August 2023, the DoD “announced the establishment of a generative
[AI] task force, an initiative that reflects the DoD’s commitment to
harnessing the power of artificial intelligence in a responsible and
strategic manner. Deputy Secretary of Defense Dr. Kathleen Hicks
directed the organization of Task Force Lima; it will play a pivotal
role in analyzing and integrating generative AI tools, such as . . .
LLMs, across the DoD.” Press Release, U.S. Dep’t of Def., DOD
Announces Establishment of Generative AI Task Force (Aug. 10, 2023),
https://www.defense.gov/News/Releases/Release/Article/3489803/dod-announces-establishment-of-generative-ai-task-force.
[20]
There are many articles on how much data is needed to create an
effective GPT chatbot. See, e.g., Ameya Paleja,
Alpaca AI: Stanford Researchers Clone ChatGPT AI for just $600,
Interesting Eng’g
(Mar. 21, 2023),
https://interestingengineering.com/innovation/stanford-researchers-clone-chatgpt-ai.
[21]
The availability of CUI to interface with a DoD GPT can
potentially make the DoD GPT the most valuable GAI LLM in existence;
thus, care must be taken from the ground up to protect and secure a
DoD GPT.
[22]
The professor was Lieutenant Colonel Craig Scrogham, Vice-Chair of
the Contract and Fiscal Law Department at The Judge Advocate
General’s Legal Center and School, and he consented to this
attribution. I believe the quote is: “We are the best Control-F’ers
out there!”
[23] The task of finding sources of information to answer non-legal
questions may become easier with the development of DoD’s
GAMECHANGER. GAMECHANGER is a natural language processing website
that allows DoD personnel to “[s]earch over 50,000 policy
documents.” GAMECHANGER,
https://gamechanger.advana.data.mil/#/gamechanger (last visited Feb.
12, 2024) (requiring a Common Access Card). Based on its tagline, it
would make Control-F’ers’ lives easier. However, its utility may be
hampered by lackluster results; for example, the author searched the
database with the terms “generative ai,” “generative artificial
intelligence,” and “large language model,” and netted no DoD policy
documents—only executive orders and congressional hearing
transcripts and similar documents came up. If you have a Common
Access Card, you can try GAMECHANGER at the URL supra.
[24]
The Judge Advocate General’s Legal Center and School Artificial
Intelligence Study Group (TAISG)’s first meeting was held on 13
February 2023.
[25]
U.S. Dep’t of Def.,
Summary of the 2018 Department of Defense Artificial
Intelligence Strategy: Harnessing AI to Advance Our Security and
Prosperity
4 (2019) [hereinafter 2018
DoD AI Strategy Summary],
https://media.defense.gov/2019/Feb/12/2002088963/-1/-1/1/SUMMARY-OF-DOD-AI-STRATEGY.PDF.
[26]
U.S. Dep’t of Def.,
2020 Department of Defense Artificial Intelligence Education
Strategy
(2020) [hereinafter 2020
DoD AI Education Strategy],
https://www.ai.mil/docs/2020_DoD_AI_Training_and_Education_Strategy_and_Infographic_10_27_20.pdf.
[27]
Resources like Percipio (usarmy.percipio.com) and DoD MWR Libraries
(www.dodmwrlibraries.org) are available for free to Army personnel.
Another valuable resource is from the Air Force and its Digital
University (https://digitalu.af.mil/), which provides access to MIT
Horizon, “an enterprise-level content library to cover the latest
emerging technology topics.” MIT Horizon,
MIT Open Learning, https://openlearning.mit.edu/courses-programs/mit-horizon (last
visited Feb. 8, 2024). Lawyers should find MIT Horizon’s purpose
statement enticing: “Convenient micro-assets are designed to help
technical and non-technical learners stay informed about the latest
technologies to drive impact and innovation.” Id.
[28]
The study group has posted resources in its Microsoft Teams
Channel.
[29]
It may be the case that outputs will never be identical even with
the same input used each time because the GPT chatbots go through a
massive number of calculations for each input and the calculation
pathway may differ. See Jaron Lanier,
There Is No A.I., New Yorker (Apr. 20, 2023),
https://newyorker.com/science/annals-of-artificial-intelligence/there-is-no-ai
(“There is only a giant ocean of jello—a vast mathematical
mixing.”).
[30]
Security concerns on using a commercial GPT like ChatGPT by Federal
employees need to be addressed because even commercial companies are
starting to ban employee use of ChatGPT. See Mark Gurman,
Samsung Bans Staff’s AI Use After Spotting ChatGPT Data Leak,
Bloomberg
(May 2, 2023, 1:54 PM),
https://www.bloomberg.com/news/articles/2023-05-02/samsung-bans-chatgpt-and-other-generative-ai-use-by-staff-after-leak?leadSource=uverify%20wall.
[31]
Another way for DoD attorneys to get involved in testing and
evaluating chatbots is to participate in testing and voting
activities like LMSYS Chatbot Arena Leaderboard.
Chatbot Arena: Benchmarking LLMs in the Wild,
https://chat.lmsys.org/
(last visited Feb. 12, 2024). The activity created by LMSYS is to
compare the outputs of two chatbots and vote for the one that is
better. The technology behind this website seems to be an API
connected to all the mainstream chatbots and the user interface is
designed to make it convenient to compare many chatbots. The purpose
of the website is stated as: “Our mission is to build an open
crowdsourced platform to collect human feedback and evaluate LLMs
under real-world scenarios.” Id. at “About Us.”
[32]
See Harshil Patel, MLOps vs. DevOps vs. ModelOps,
Censius.ai, https://censius.ai/blogs/mlops-vs-devops-vs-modelops (last
visited Jan. 2, 2024) (“[Machine learning operations (MLOps)] is a way for data
scientists and operations experts to collaborate and communicate
in order to manage the production ML lifecycle.”).
[33]
The 2020 DoD AI Training and Education Strategy describes IPTs as
“multidisciplinary groups of [p]roduct [m]anagers and AI developers
whose roles are central to delivering AI capabilities.” 2020
DoD AI Education Strategy, supra note 26, at intro. The strategy calls for the creation of a cadre of IPTs
who are “composed of product managers, data scientists, AI/ML
engineers, IT technicians, and UI/UX designers from the ‘Create AI’,
‘Drive AI’, and ‘Embed AI’ archetypes.” Id. at 5 &
n.14.
[34] See
Memorandum from U.S. Dep’t of Navy, Chief Info. Officer, subject:
Department of the Navy Guidance on the Use of Generative Artificial
Intelligence and Large Language Models para. 1 (6 Sept. 2023),
https://www.doncio.navy.mil/ContentView.aspx?id=16442.
[37] The memorandum was sent to the author and is marked CUI; thus, it
is not publicly available. Government personnel must get a copy of
this memorandum and read it before utilizing commercial GPT chatbots
for any task, even if one thinks the contemplated prompt or input
does not implicate government information or work. The difficulty
will be knowing when an input would cross into the area of
government information or work (vice personal matters), thus,
triggering required reviews and approvals. These difficulties would
be eliminated for the most part if a DoD GPT chatbot existed. Even
with a DoD GPT chatbot, however, questions would arise as to how
certain inputs can be compartmentalized and access limited by or to
others, such as when electronic walls must be erected between the
prosecuting offices and trial defense attorneys.
[38]
More research is needed, but the retrieval of individual user’s
inputs and sessions may be possible even if the chatbot states that
it does not retain previous interactions for the sake of a user’s
privacy. For all we know, the data is still on the company servers
and the response that it does not retain previous interactions could
be just another guardrail placed by the company. Just like hacking
into a traditional network, people have tried, and will continue to
try, to hack GPT chatbots to get behind the guardrails.
[39]
Other standardized legal tests could be more military law-centric,
such as the exams given to graduate students at TJAGLCS. These exams
could include multiple-choice, short answer, or essay
questions.
[40]
About Tradewinds,
Tradewinds, https://www.tradewindai.com/about (last visited Jan. 2,
2024).
[43]
Jon Harper,
AI Bot Developed to Help Defense Department Write Contracts
Faster,
DefenseScoop
(Feb. 8, 2023),
https://defensescoop.com/2023/02/08/ai-bot-developed-to-help-defense-department-write-contracts-faster.
[44]
AcqBot, https://acqbot.com (last visited Feb. 9, 2024).
[45]
See Kyle Barr,
GPT-4 Is a Giant Black Box and Its Training Data Remains a
Mystery,
Gizmodo (Mar. 16, 2023),
https://gizmodo.com/chatbot-gpt4-open-ai-ai-bing-microsoft-1850229989.
[46]
According to a company called Scale AI, its LLM for U.S. National
Security can provide “citations” or sources used in the output.
See
Donovan: AI Digital Staff Officer for National Security,
Scale, https://scale.com/federal-llm (last visited Jan. 2, 2024). OpenAI
is also working toward the same goal as seen in their developer
forum, which talks about “ChatGPT-4 with Citations / Sources.”
See ChatGPT-4 with Citations / Sources,
OpenAI, https://community.openai.com/t/chatgpt-4-with-citations-sources/164323
(last visited Feb. 9, 2024).
[47]
See Stephen Johnson,
GPT-4 is Surprisingly Good at Explaining Jokes,
Freethink (Mar. 18, 2023),
https://www.freethink.com/robots-ai/gpt-4-jokes.
[50]
See Matthew Ivey,
The Ethical Midfield in Artificial Intelligence: Practical
Reflections for National Security Lawyers, 33
Geo. J. Legal Ethics
109 (2020).
[51]
If this were possible, the ABA's Rule 1.6 (Confidentiality of
Information) could be implicated for the attorney who used the GPT
chatbot and inputted certain details that could be considered a
disclosure of client information. See Model Rules of Pro. Conduct
r. 1.6 (Am. Bar Ass’n
2020).
[52] This hypothetical is only focused on teasing out the technical
capabilities (or vulnerabilities) of a GPT and legal professionals’
actions in gray areas. It would be easy to see that attempting to
secretly discover an adversarial party’s legal research, even if it
focused on search terms directed to a legal database, would be
unethical.
[53] For more details on these practice areas, see
U.S. Dep’t of Army, Field Manual
3-84,
Legal Support to Operations
(1 Sept. 2023).
[54]
See, e.g., Perlman, supra note 16.
[55]
Lieutenant Colonel Sean B. Zehtab contributed this list.
[56] The reference to spring or boom is to contrast with AI winters.
See Valeriia Kuka,
The Story of AI Winters and What It Teaches Us Today (History of
LLMs. Bonus),
Turing Post (June 30, 2023), https://www.turingpost.com/p/aiwinters.
[57]
See 2018
DoD AI Strategy Summary, supra note 25. The example of computer
vision needs to be further developed, but there may be parallels
between how LLMs can “hallucinate” and give a wrong answer and how
AI-powered computer vision misclassifies images and makes errors in
the identification of objects or people.
See Brian Hayes,
Computer Vision and Computer Hallucinations,
Am. Scientist,
Nov./Dec. 2015, at 380. The question will be what percentage of
error will be acceptable or unacceptable for deployment.
[58]
10 U.S.C. §§ 4021, 4022; see also Douglas Steinberg,
Leveraging the Department of Defense’s Other Transaction
Authority to Foster a Twenty-First Century Acquisition
Ecosystem, 49
Pub. Cont. L.J.
537 (2020).
[59]
An “acquisition strategy” is defined in the Federal Acquisition
Regulation as “the program manager’s overall plan for satisfying the
mission need in the most effective, economical, and timely manner.”
FAR 34.004 (2023).
[60]
The Defense Federal Acquisition Regulation Supplement requires a
written acquisition plan for any developmental acquisitions with a
total cost of $10 million or more or when deemed “appropriate by the
department or agency.” DFARS 207.103(d)(i)(A), (C) (Dec.
2023).
[61]
The DoD Other Transactions Guide contains useful information
on negotiating other transactions. See Off. of Under Sec’y of Def. for Acquisition & Sustainment,
U.S. Dep’t of Def., Other Transactions Guide (2023),
https://www.acq.osd.mil/asda/dpc/cp/policy/docs/guidebook/TAB%20A1%20-%20DoD%20OT%20Guide%20JUL%202023_final.pdf.
[62]
See Intellectual Property Cadre,
Off. of Assistant Sec'y of Def., Acquisition,
https://www.acq.osd.mil/asda/ae/ada/ip-cadre.html (last visited
Jan. 2, 2024).
[63]
Searching Milsuite.mil for groups focused on GPT technology netted
zero results. There are many communities under the search term
“artificial intelligence,” but those groups are based on
organizational affiliations or are not targeted to legal
professionals.
[64]
The U.S. Army JAG Corps has created these “Four Constants” but these
standards are readily applicable throughout the DoD legal
profession. For an explanation of the Four Constants, see
The Judge Advoc. Gen.’s Corps, U.S. Army, Four Constants
(n.d.),
https://www.jagcnet.army.mil/Sites/JAGC.nsf/46DCA0CA1EE75266852586C5004A681F/$File/US%20Army%20JAG%20Corps%20Four%20Constants%20Smart%20Card.pdf.