Consider a situation where an AI chatbot informs a user of the filing deadline for a personal injury claim. If that statement is later offered in court, is it barred by the rule against hearsay? Should the AI-generated statement be treated as the statement of a declarant under the hearsay rule? Or does the lack of human authorship raise new challenges regarding its admissibility?
As AI-generated communications become increasingly prevalent, courts must decide whether these machine outputs fall under existing hearsay rules or require new legal frameworks. The rules against hearsay were written for human speech; as machine intelligence advances and produces increasingly human-like statements, those rules may need to evolve.
Under the Federal Rules of Evidence, hearsay is a declarant’s out-of-court statement offered to prove the truth of the matter asserted in the statement.[1] The definition applies only to human declarants, so machine-generated evidence is generally treated as non-hearsay because no “person” made the statement.[2] Non-controversial examples of instrument-generated evidence include breathalyzer readings, GPS tracking data, and automated security system logs. These passive, non-interpretive data recordings are not treated as hearsay because they are not “statements” made by a person. Controversy arises, however, with statements made by “smart home” devices such as Alexa or Siri and with the outputs of generative AI chatbots, the most prevalent of which is ChatGPT. The legal issue these newer forms of machine-generated evidence present is that the statements are quasi-intelligent and designed to sound human, unlike machines such as breathalyzers or speed radar guns that record observable facts without exercising any discretion.
These systems, built to “communicate,” generate responses algorithmically, often in a tone that reads as conversational or advisory. Looking specifically at ChatGPT, GPT stands for Generative Pre-trained Transformer. The software achieves its human-like tone because it uses natural-language processing and is trained with reinforcement learning from human feedback. That human training matters in the hearsay context because it produces context-aware outputs. Although a chatbot may generate a statement autonomously, its output is shaped by human-created data, which raises the question whether these outputs echo the intentions and biases of the humans who trained it.
Even if AI-generated statements were treated as hearsay, they arguably could not qualify for admission under the enumerated exceptions, because most exceptions assume a human speaker for purposes of perception or reliability. The hearsay exceptions are designed to admit statements that carry indicia of trustworthiness based on the circumstances surrounding their making.
For example, under the excited utterance exception, a statement may be admissible if it relates to a startling event and was made while the declarant was under the stress of excitement that the event caused.[3] The reliability of such statements rests on the idea that the declarant’s stress or shock minimizes any opportunity for conscious reflection or fabrication. AI systems, as non-human entities, do not get startled. The same problem arises for exceptions such as present-sense impressions and dying declarations, which inherently depend on the declarant’s internal mental state.[4] An AI-generated output might seem to fall under the business records exception if it were produced as a routine part of a business’s regular practices, but a human custodian or other qualified witness must still attest to the record’s regularity and trustworthiness.[5] The reliability underlying these exceptions stems from human nature and behavior. Without that human connection, autonomous AI-generated outputs struggle to meet the foundational requirements of the rules.
Whether AI-generated statements should be treated as passive mechanical data or as something akin to human testimony is ultimately a question of policy. Treating AI statements as hearsay could limit the use of tools now integral to business, law enforcement, and everyday life. Yet accepting AI-generated output without scrutiny risks admitting false or biased evidence that lacks accountability. Unlike humans, artificial intelligence systems cannot be cross-examined, and their reasoning is hidden behind opaque algorithms. The reliability of machine-generated evidence lies in its programming, maintenance, and manner of operation.[6] In the same vein, human error at the programming stage can adversely affect machine output, leading to misanalysis and false statements in a system such as ChatGPT.[7] The danger is even greater with generative AI, which produces content that mimics the characteristics of the datasets it learns from.[8] Chatbots “hallucinate” facts, meaning they generate false information; facial recognition misidentifies people; and algorithmic tools have shown bias in criminal risk assessments.[9] Unlike human witnesses, AI systems lack perception and intent, yet they can produce statements that sound authoritative and factual. Traditional reliability safeguards do not apply here: it is impossible to probe the memory or bias of a machine that functions by algorithm. Admitting these outputs as objective or neutral gives undue weight to evidence that cannot be meaningfully questioned.
In confronting the rise of machine-generated evidence, courts must again face the foundational questions of “What is a statement?” and “Who, or what, is a declarant?” Artificial intelligence blurs the line between human testimony and mechanical output, generating responses that mirror human speech but lack memory, intent, or the ability to be cross-examined. Longstanding exceptions to the hearsay rule were not built for these algorithmic models trained on billions of data points. As this technology progresses, legal minds may need to develop new frameworks that treat AI not as a witness or passive machine, but as an entirely new category of evidence.
[1] Fed. R. Evid. 801.
[2] Fed. R. Evid. 801(b) (defining “declarant” as “the person who made the statement” (emphasis added)).
[3] Fed. R. Evid. 803(2).
[4] Fed. R. Evid. 803(1), 804(b)(2).
[5] Fed. R. Evid. 803(6).
[6] See G. Alexander Nunn, Machine-Generated Evidence, 16 SciTech Law. 4, 6 (2020).
[7] See Andrea Roth, Machine Testimony, 126 Yale L.J. 1972, 1978 (2017).
[8] See What is Generative AI?, NVIDIA, https://www.nvidia.com/en-us/glossary/generative-ai/ (last visited Apr. 30, 2025).
[9] See AI on Trial: Legal Models Hallucinate in 1 out of 6 (or More) Benchmarking Queries, Stan. Univ. Hum.-Ctr. A.I. (May 23, 2024), https://hai.stanford.edu/news/ai-trial-legal-models-hallucinate-1-out-6-or-more-benchmarking-queries (discussing AI hallucinations and sanctions faced by lawyers who cite fictional cases hallucinated by ChatGPT).