There are several open-source generative AI tools that can be downloaded and run on a local machine, which can reduce concerns about sending sensitive information to external servers and lower your environmental impact. For example, GPT4All and Dolly 2.0 let users operate a language model without connecting to the internet. Running these models locally can provide greater control over data because all information remains within your own computing environment.
Another option is the family of Llama-based models from Meta. While certain licensing restrictions apply, some variants can be deployed offline. They are typically lighter-weight than large commercial models and so require far less computing power. Additionally, the Hugging Face Transformers library offers a wide array of smaller language models that may be adapted for confidential tasks without exposing any data to a remote server.
There are also some inexpensive models designed especially for lawyers (beyond those offered by Lexis and Westlaw), which promise strong data security measures. We met the creator of Descrybe.ai at a recent conference, and the technology is impressive.
Before you copy any client document into ChatGPT, remember: Model Rule 1.6 does not require harm before a breach exists, only an un-consented “reveal.” ABA Formal Opinion 498 and Resolution 604 both caution that transmitting client material to an external generative-AI service can amount to disclosure, even if no human reads the text.¹ Running an open-weight model entirely on your own hardware avoids that transmission risk, but you are still responsible for reasonable safeguards, and many licenses (e.g., Llama 3’s) forbid using the weights to compete with the provider. Local inference also spares the planet the energy cost of sending every prompt to a distant data center, though it still draws real watts from your outlet.²
Why You Might Go Local
Confidentiality
No third-party server means no transmission outside the firm, reducing exposure—but not eliminating the duty of competence.³
The ABA warns that lawyers must “understand the technology sufficiently to ascertain its confidentiality implications,” and some states (e.g., Kentucky) now require written client consent before using GenAI at all.⁴
Environmental Footprint
Local inference shifts compute from hyperscale data centers (which collectively may emit an extra 1.3–1.7 Gt CO₂ by 2030) to your office, trimming network traffic and letting you choose green power sources.⁵ A single GPT-class prompt uses enough electricity to power a lightbulb for about 20 minutes; running an 8B model locally draws far less.⁶
Implementation Tips
Install a runner. Run brew install ollama (macOS/Linux), or download LM Studio or GPT4All on Windows.
Quantize wisely. Convert the weight file to 4-bit GGUF to cut memory use by roughly 75%. FlashAttention-2 and similar optimized kernels recover much of the lost speed.⁷
Treat prompts as documents. Anything you paste—local or cloud—goes straight into the model’s context window. Log retention, backups, and endpoint security still matter.
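To make the workflow concrete: once a runner such as Ollama is serving a model, you can query it over its local REST API so that prompts never leave your machine. The following Python sketch assumes Ollama's default localhost port (11434) and uses an example model tag; substitute whatever model you have pulled. It is an illustration, not a vetted integration.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here touches an external server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Assemble a non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_model(model: str, prompt: str) -> str:
    """Send the prompt to the locally running model and return its reply."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # "llama3.1:8b" is an example tag, not a recommendation.
    print(ask_local_model("llama3.1:8b",
                          "Summarize Model Rule 1.6 in one sentence."))
```

Because the request targets localhost, client text stays on your hardware; the remaining risks are the ones noted above (logs, backups, endpoint security), not network transmission.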
Selected References
MODEL RULES OF PRO. CONDUCT r. 1.6 (AM. BAR ASS’N 2024); ABA Comm. on Ethics & Pro. Resp., Formal Op. 498 (2021) (virtual practice); Resolution 604, AM. BAR ASS’N (Feb. 2023).
Meta Platforms, Inc., Llama 3.1 Community License Agreement §4 (2025); see also Nanda Kumar et al., Reconciling the Contrasting Narratives on the Environmental Impact of LLMs, Sci. Rep. (2024).
Kentucky Bar Ass’n, Real World Applications of Generative AI in the Legal Arena (2024).
Ollama, REST API Documentation (2025); LM Studio, Local Server Docs (2025).
Reuters, AI Economic Gains Likely to Outweigh Emissions Cost (Apr. 22, 2025); Axios, AI Warning (Apr. 22, 2025).
Kimberly Truong, OpenAI CEO Claims Polite Prompts Cost Millions in Electricity, PEOPLE (Apr. 26, 2025).
Tri Dao, FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning (2023) (preprint).
(This handout is intended for the Law Professors’ GenAI Sandbox. Feel free to redistribute within your institution.)