AI – English 800: Evaluating Resources

The Ethics of AI

Environmental issues

The rapid expansion of GenAI has sharply increased demand for the resources that power its processes, such as electricity and water.

Further reading: “A Computer Scientist Breaks Down Generative AI’s Hefty Carbon Footprint,” Scientific American, May 2023

Misinformation and bias

Many researchers who study the spread of misinformation (incorrect or false information) and disinformation (deliberately misleading information) are concerned about GenAI’s ability to spread false content easily and quickly.

Further reading: “An A.I. Researcher Takes On Election Deepfakes,” New York Times, April 2024

Copyright

Because Generative AI tools have advanced so rapidly, existing copyright law has struggled to adapt. It remains unclear who rightfully owns the copyright to GenAI outputs, and this area of law will continue to change as new ownership claims are made.

Further reading: “Boom in A.I. Prompts a Test of Copyright Law,” New York Times, December 2023

Data privacy

Some experts are concerned that Generative AI companies are not transparent about how users’ data are protected. Other concerns include the lack of consent from creators whose work was used to train AI tools.

Further reading: “Generative AI's privacy problem,” Axios, March 2024

Accessibility and equity

Generative AI has the potential to create content that is more accessible to everyone, including people who use assistive technology. However, it’s also important to ensure that GenAI-created content and tools are accessible and pass digital accessibility checks.

Further reading: “‘Without these tools, I’d be lost’: how generative AI aids in accessibility,” Nature, April 2024

Citing AI

Here are some fundamental ideas that hold true for citing AI-generated content, no matter which citation style you're using:

  • Do cite or acknowledge the outputs of generative AI tools when you use them in your work. This includes direct quotations and paraphrasing, as well as using the tool for tasks like editing, translating, idea generation, and data processing. 
  • Do not use sources that are cited by AI tools without reading those sources yourself. There are two different reasons for this:
    • Generative AI tools can create fake citations.
    • These tools may cite a real piece of writing, but the cited content may be inaccurate. 
  • Be flexible in your approach to citing AI-generated content, because emerging guidelines will always lag behind the current state of the technology and the ways it is applied. If you are unsure of how to cite something, include a note in your text that describes how you used a certain tool (for example, “I used ChatGPT to brainstorm an outline for this essay”). 
  • When in doubt, remember that we cite sources for two primary purposes: first, to give credit to the author or creator; and second, to help others locate the sources you used in your research. Use these two concepts to help make decisions about using and citing AI-generated content. 

Evaluating Sources for AI-Generated Content

It is critical to evaluate sources used in research for credibility, including accuracy and authority. Text-generating AI tools like ChatGPT have limitations for research, including:

  • making things up (called a “hallucination”)
  • referring to information sources that don’t exist
  • presenting false information in an authoritative tone

To address these limitations of generative AI tools, take these steps:

  • Double-check the information you get in a popular resource by confirming it in authoritative sources.
  • Use research databases and search engines to find specific sources to cite in academic papers.

Check the claim

When evaluating the accuracy of popular sources, you are evaluating the claim rather than the source. To check the claim, you need to locate it in another, trusted source. Rather than asking yourself “who is behind this information?”, ask “who can verify this information?” For low-stakes claims, a simple Google or Wikipedia search may suffice. For other claims, think about who is likely to care enough about the topic to publish about it. You may search for the topic on government websites, in trusted news sources, or through research databases and library search engines.

Check the citation

Search directly for the source. You may use Google Scholar or the library search tool to search for a specific article or book. If you cannot locate the source, the citation may have been hallucinated.

Search for the text directly

Online writers often use AI because of time constraints, the pressure to publish a lot of material quickly, and for the same reasons students might use it, such as convenience or unfamiliarity with the topic. If you ask ChatGPT or another generative AI tool to summarize a particular topic 20 different times, it usually produces the same answer, nearly verbatim. That means you can often Google a quotation from the popular source and find it almost exactly replicated across different sources.

Cut yourself some slack

AI is getting more efficient and effective as time passes, making AI-generated text difficult for even tech experts to identify immediately. Focus less on whether the text is AI-generated and more on the veracity of what it claims, which we evaluate with our information literacy skills, the same as with any other source.

Fact-Checking is Always Needed

AI "hallucination"
The official term in the field of AI is “hallucination.” It refers to the fact that these systems sometimes “make stuff up.” This happens because they are probabilistic, not deterministic.
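To illustrate what “probabilistic, not deterministic” means, here is a minimal sketch in Python. The vocabulary and probabilities below are invented for illustration; a real model computes such a distribution from billions of parameters. The point is that each next word is chosen by weighted random sampling, with no step that checks facts:

```python
import random

# Toy illustration: these words and probabilities are invented.
# A real language model derives such a distribution from its training.
next_word_probs = {
    "argues": 0.35,          # plausible continuation
    "cites": 0.30,           # plausible continuation
    "shows": 0.25,           # plausible continuation
    "(Smith, 2021)": 0.10,   # plausible-SOUNDING, possibly nonexistent source
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick the next word by weighted random choice, not by fact lookup."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# The same prompt can yield different continuations on different runs,
# and nothing in the sampling step verifies that the output is true.
for _ in range(5):
    print("The paper", sample_next_word(next_word_probs), "...")
```

Because each word is drawn from a distribution of what is statistically likely, the system can produce fluent text that was never checked against reality, which is why hallucinated content can sound so confident.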
 

ChatGPT often makes up fictional sources
One area where ChatGPT often gives fictional answers is when it is asked to create a list of sources. See the Twitter thread, "Why does chatGPT make up fake academic papers?" for a useful explanation of why this happens.
 

There is progress in making these models more truthful
There is progress in making these systems more truthful by grounding them in external sources of knowledge. Examples include Microsoft Copilot and Perplexity AI, which use internet search results to ground their answers. The internet sources used could themselves contain misinformation or disinformation, but at least Copilot and Perplexity link to the sources they used, so you can begin verification.
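As a rough sketch of what “grounding” means in practice: the system retrieves sources first, generates an answer constrained to that retrieved material, and returns the links for verification. The Python below is a hypothetical illustration, not Copilot’s or Perplexity’s actual code; the web_search function and its stand-in results are invented for the sketch:

```python
from dataclasses import dataclass

@dataclass
class Source:
    title: str
    url: str
    snippet: str

def web_search(query: str) -> list[Source]:
    """Hypothetical retrieval step; a real system calls a live search API."""
    # Stand-in result so the sketch runs; real systems return current pages.
    return [Source("Example page", "https://example.com/a",
                   "text relevant to the query")]

def grounded_answer(question: str) -> str:
    """Answer only from retrieved sources, and cite them for verification."""
    sources = web_search(question)
    # A real system would hand these snippets to a language model with an
    # instruction to answer using only this retrieved material.
    answer = f"Based on {len(sources)} source(s): {sources[0].snippet}"
    citations = "\n".join(f"[{i + 1}] {s.url}" for i, s in enumerate(sources))
    return answer + "\n" + citations

print(grounded_answer("What is GenAI's water footprint?"))
```

The design point to notice is the citation list at the end: grounding does not guarantee truth, since the retrieved pages can themselves be wrong, but it gives the reader a trail to follow, which is where your own source evaluation begins.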
 

Scholarly sources as grounding
There are also systems that combine language models with scholarly sources. For example:

  • Elicit
    A research assistant using language models to automate parts of researchers’ workflows. Currently, the main workflow in Elicit is Literature Review. If you ask a question, Elicit will show relevant papers and summaries of key information about those papers in an easy-to-use table. 
  • Consensus
    A search engine that uses AI to search for and surface claims made in peer-reviewed research papers. Ask a plain-English research question, and get word-for-word quotes from research papers related to your question. The source material used in Consensus comes from the Semantic Scholar database, which includes over 200M papers across all domains of science.