Artificial Intelligence (AI) has the potential to introduce significant benefits to education for both students and instructors. By utilizing tools built on Natural Language Processing (NLP), such as ChatGPT, students and instructors alike gain new ways to enhance the classroom experience. NLPs have recently proven themselves capable of completing many monotonous tasks both quickly and efficiently. However, there is no guarantee that these tools are used in an ethical manner. Issues such as academic honesty, data privacy, and bias arise when using such tools, and it is important that educators address these issues to ensure that AI can improve the classroom to the greatest possible extent. This study confronts these issues and acts as a guide for educators to make an informed decision on whether NLPs should be integrated into their curriculum. Through consultation with professors and academic staff, and through our own studies using ChatGPT, we have concluded that although ChatGPT and similar NLPs can inhibit student development when used inappropriately, these tools can be highly beneficial to student growth when used ethically.
Integration of AI tools with curriculum is a new and exciting way to enhance the learning experience for students and will likely revolutionize education. AI is at the forefront of interest in universities today due to its use as a highly efficient learning tool. Natural language processors (NLPs), a form of AI, can be helpful to students and professors when used as a guide for learning or instructing, such as brainstorming research subtopics or organizing lesson plans. However, despite these potential learning advantages, universities face emerging challenges. Unlike traditional search engines such as Google and Bing, NLPs have the capability to perform critical thinking for the student, taking that crucial aspect of learning away from them. This raises questions of whether these tools should be utilized within the curriculum, and how they might be used to improve the classroom experience.
Asking Google for the answers to your homework is a problem on its own, but in many cases, students still need to apply their own thought to their work. With this in mind, the use of NLPs might be seen as detrimental in an educational context. One of the key skills taught in almost any program is critical thinking, which prepares students for further education and other real-world applications. With AI tools such as ChatGPT in the hands of students, the development of this skill has the potential to plummet throughout their education. On the other hand, NLPs also have the potential to advance student growth more than ever before, but this depends entirely on the motivation of the student.
In addition to academic dishonesty, it is important to note other ethical issues as well. As AI continues to advance and integrate into the real world, concerns such as data privacy and bias will continue to develop. In this context, it is crucial that these ethical concerns are acknowledged and addressed.
Due to the increasing popularity of AI tools, their integration into academic curricula is a time-sensitive topic. However, not all staff and administration will have a complete understanding of AI tools and their functionality. AI is not going away anytime soon, and it is only a matter of time before universities and organizations must decide how to use this new tool. It is important to understand that with any tool, users must be trained to use it correctly; NLPs are no different in this regard. This project will provide insight for universities to make a better-informed decision on how and whether to integrate these tools into their curriculum.
Natural Language Processing (NLP) is a branch of both Artificial Intelligence (AI) and linguistics that is used to handle and analyze extensive amounts of language data. This data comes from many different sources such as books, websites, public datasets from search engines such as Google and Bing, and even private datasets from sources such as Reddit, Quora, and Stack Overflow (Fruition). Using such data, NLPs can perform tasks through the use of language-based interaction with the user. The task can be related to just about anything, assuming the tool has access to sufficient information relating to the topic. One NLP in particular, ChatGPT, is an application developed by OpenAI, an AI research laboratory. The user can communicate with ChatGPT through queries, and it will respond to said queries with varying complexity. As shown in Figure 1, ChatGPT was queried with the question “What is ChatGPT?” and replied with the following high-level overview:
It is important to note that this simple query only scratches the surface of ChatGPT’s remarkable capabilities and versatility. When used effectively, the tool is capable of processing vast amounts of textual data to assist the user in many different ways, from revising essays to explaining difficult concepts and even assisting with code. As the tool grows in popularity, more students are becoming aware of tools such as ChatGPT, and many are beginning to take advantage of them. Rather than simply using it as an assistant, some students are leveraging the tool to complete entire assignments. It is crucial that this tool be used responsibly in the classroom, and that it is not relied upon to do all of the work. With proper guidance and usage, ChatGPT can become a valuable educational resource, rather than a means for students to receive an ‘easy A’ in their courses.
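Beyond the chat interface shown in Figure 1, the same kind of query can also be issued programmatically. The following minimal sketch is our own illustration rather than part of the study; it assumes the openai Python library as it existed in early 2023, a placeholder API key, and an assumed model identifier.

import openai

openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder, not a real key

# Send the same question used in Figure 1 to the chat completion endpoint.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # assumed model identifier
    messages=[{"role": "user", "content": "What is ChatGPT?"}],
)

# The generated answer is returned as the message content of the first choice.
print(response["choices"][0]["message"]["content"])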
Our project is focused on exploring the integration of Natural Language Processors within the classroom and the benefits they provide to students. To guide our exploration, we have set specific goals that are designed to answer the question “How can AI tools be integrated into the curriculum?” within our limited timeline. These goals include understanding how the current classroom environment might be changed to allow for the effective and appropriate use of NLPs, evaluating the potential benefits and drawbacks of using NLPs in an academic environment, and providing insight into whether the benefits outweigh the liabilities enough to justify their use.
In addition to these primary goals, we have also identified stretch goals that allow us to take an even deeper dive into the topic and provide a more structured response to our core question. These stretch goals include creating examples that might be used to educate students on helpful ways to use NLPs, creating proofs of concept for integration into specific assignments so that students better retain course material, conducting surveys related to the use of NLPs, and exploring additional instructor-facing integrations of NLPs, such as acting as a grading assistant or providing value to professors and teaching assistants in other ways. By setting both sets of goals, we hope to establish a clear direction for our project, ensuring that our exploration of NLP integration is focused, efficient, and productive.
The integration of Natural Language Processing (NLP) technology has the potential to bring significant benefits to education, such as assisting with assignments, enhancing personalized feedback, and helping students gain a better understanding of what they are currently learning. However, the use of NLPs in an academic setting also raises alarms about students taking advantage of such tools, particularly ChatGPT. Some students are beginning to rely solely on ChatGPT-generated responses for their classwork, which is detrimental to their ability to learn and fully grasp a topic. This section evaluates the potential benefits and drawbacks of integrating ChatGPT in educational settings, emphasizing the risks associated with using ChatGPT for assignments and exploring whether the benefits outweigh the drawbacks of using these tools.
ChatGPT comes with a multitude of benefits. From assisting teachers with grading to helping students with their homework, it might be considered the ultimate learning tool. For instance, when a teacher provides ChatGPT with a student’s paper, the NLP can analyze the student’s writing and return a detailed report on the student’s strengths and weaknesses, as well as grammar and vocabulary critiques of their work. Students can benefit from the same capability before submission, as ChatGPT can also create detailed outlines for papers and other assignments such as presentations, and it can provide personalized feedback on their work within seconds. Not only is this feedback quick and (currently) free, but it is particularly advantageous to students who may not have access to other resources such as private tutors or writing centers. Utilizing ChatGPT as a personalized work assistant can maximize the benefits of using such a tool while minimizing total reliance on its capabilities.
ChatGPT can also be a valuable resource for students as they learn new material and concepts. After a student encounters a complex topic in class, the tool can be used to break the information down into more manageable portions and explain it in simpler terms for a clearer understanding. Students can then take the general knowledge they gain from ChatGPT and explore other sources to build a stronger foundation in the content. The tool can also serve as a study resource, as it is capable of generating study guides, practice exams, flashcards, quizzes, notes, and more for nearly any subject. Not every teacher offers extra study resources, so this functionality can be a great asset to many students.
Despite the potential benefits that come with using ChatGPT, there are also a variety of drawbacks that should be taken into consideration. One example might include an overreliance on the tool's capabilities to complete assignments. Due to its ease of use and efficiency, many students are taking advantage of this technology to complete their assignments. However, when students rely too heavily on tools such as ChatGPT, they may not be actively engaging with the material or developing their critical thinking skills. Instead of working through challenging concepts on their own, they may simply rely on ChatGPT to provide them with the answers they need.
While ChatGPT can certainly be helpful in completing an assignment, using it to replace the process of brainstorming, outlining, and writing a paper severely reduces any value a student might gain from working through the difficult parts themselves. Another drawback worth noting is that not all NLPs have high accuracy rates, meaning that their responses are not always correct. As a result, many students may receive false information that they believe to be factual, and acting on incorrect information can impact both current and future results.
It is important to note that although ChatGPT often gives ‘correct’ answers, students should still do their own research and find sources that back up the information they are given. Verifying information against sources also helps students by adding context to the content, since they end up reading more about the topic. Overall, it is of the utmost importance to emphasize to students that the output of an NLP should never be submitted as the response to any assignment, but rather used to assist with learning and work efficiency.
Another subject that has been at the center of conversations on not only NLPs but AI as a whole is the set of ethical considerations surrounding this new technology. Because AI has recently seen a significant increase in adoption, there are many discussions taking place on proper ways to use this new tool, as well as many warranted concerns about the technology's capacity to replace humans.
One of the biggest discussions around AI relates to the data used to train the model. An important factor in a neural network’s ability to accomplish its assigned task is the quality and quantity of the data on which it has been trained. The quantity of data for these larger models is enormous; GPT-3, for example, was trained on 45 terabytes of raw text data. With this volume of data, it is almost impossible to guarantee that the authors’ consent was obtained for everything the model was trained on. This creates the potential for copyright issues, as the creators of a model are often incapable of properly attributing credit to the authors of the data on which the model was trained.
Another ethical consideration is uniquely relevant to the Computer Security curriculum. Currently, ChatGPT was trained only on information from 2021 and earlier, meaning that any information the user provides the chatbot is not currently used to retrain the model. However, this is unlikely to remain the case for future models, and it should be assumed that any information provided, now or in the future, will be integrated into the model's training dataset. For the average user who is simply asking ChatGPT for recipes, this is not a problem, but for security professionals it has implications for how the tool can be used. For example, if one were to feed vulnerability data into ChatGPT to have it generate a penetration testing report for a client, it is possible that the confidential information provided is stored and then used to retrain the model. In the worst-case scenario, this could result in another user obtaining that confidential client information simply by asking ChatGPT to draft an unrelated penetration testing report. To combat this potential data confidentiality issue, we recommend that all sensitive data provided to NLPs such as ChatGPT be sufficiently obfuscated, especially when the discovery of that data could compromise a network environment. If we work under the assumption that large language models such as ChatGPT use user inputs to retrain the model, a further security flaw opens up: dataset poisoning, a type of attack in which a malicious user intentionally provides bad or misleading data, in this case with the goal of lessening the effectiveness of the model. For example, an entity could use this attack to essentially rewrite history by modifying ChatGPT's dataset. As such, we strongly recommend that any future models trained this way have strict input sanitization procedures.
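To illustrate the obfuscation recommendation above, the following minimal sketch (our own, not an established tool) strips obvious identifiers from a penetration testing finding before it is pasted into an NLP. The regular expressions, the internal domain name, and the placeholder labels are assumptions chosen purely for demonstration.

import re

def obfuscate(text: str) -> str:
    # Replace IPv4 addresses with a generic placeholder.
    text = re.sub(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", "<REDACTED_IP>", text)
    # Replace hostnames on an assumed internal domain.
    text = re.sub(r"\b[\w-]+\.corp\.example\.com\b", "<REDACTED_HOST>", text)
    # Replace anything that looks like a credential assignment.
    text = re.sub(r"(?i)(password|passwd|pwd)\s*[:=]\s*\S+", r"\1=<REDACTED>", text)
    return text

finding = "SSH on 10.20.30.40 (db01.corp.example.com) allows login with password: Winter2023!"
print(obfuscate(finding))
# -> SSH on <REDACTED_IP> (<REDACTED_HOST>) allows login with password=<REDACTED>

Only the redacted version would then be sent to the NLP, with the mapping from placeholders back to real values kept locally by the tester.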
Another ethical concern is the potential bias present in GPT-generated text. If the data a model is trained on contains unaccounted-for biases, those biases may be translated into the model’s output. A paper titled “Exploring AI Ethics of ChatGPT: A Diagnostic Analysis” provides one such example. In that experiment, ChatGPT was asked which country Kunashir Island belongs to in three different languages: Japanese, Russian, and English. Japan and Russia both currently claim ownership of the island. When asked in Japanese, ChatGPT responded that the island is owned by Japan, and when asked in Russian, a similar response claiming Russian ownership was generated. This is almost certainly not a malicious attack by either party; it is more likely a result of the fact that sources written in each country's language (which ChatGPT would be drawing from in this example) are inherently more biased toward that country's perspective.
Although the topic of NLPs in education is mostly centered around students, educators have a place in the spotlight as well. Several tools are already in use, such as Turnitin and Grammarly, to aid with plagiarism detection and grammar assessment. Professors and teaching assistants might choose to rely on ChatGPT in their day-to-day tasks to improve both efficiency and the quality of content. This section explores ChatGPT’s potential to provide a more effective learning environment for students. With that said, a great deal of research must still be done to prove the benefits of NLPs in education.
ChatGPT has demonstrated a promising ability to provide meaningful feedback on question-and-answer prompts. This ability can translate into use as a grading assistant for professors and teaching assistants, which is likely to be useful when dealing with large quantities of critical-thinking questions. When prompted with a question, ChatGPT can be fed an answer given by a student and then generate ‘personalized’ feedback based on that answer. Additionally, when prompted correctly, ChatGPT can even grade against a user-specified rubric. By utilizing strategies such as these, educators might promote a more effective learning environment by not only cutting back on time spent grading assignments but also enhancing the quality of the feedback that students receive.
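As a concrete illustration of rubric-based grading assistance, the sketch below shows one hypothetical way such a prompt might be structured through OpenAI's API. The rubric, question, student answer, and model identifier are all invented for demonstration and are not drawn from any particular course.

import openai

openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder

rubric = ("Award up to 5 points: 2 for correctly defining a hash function, "
          "2 for explaining why hashing is one-way, 1 for clear writing.")
question = "Explain why password hashes cannot simply be decrypted."
student_answer = "Hashes are one-way functions, so attackers can only guess inputs and compare."

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # assumed model identifier
    temperature=0,          # keep the grading output as consistent as possible
    messages=[
        {"role": "system",
         "content": "You are a teaching assistant. Grade strictly against the rubric and justify each point awarded or withheld."},
        {"role": "user",
         "content": f"Rubric: {rubric}\nQuestion: {question}\nStudent answer: {student_answer}"},
    ],
)

print(response["choices"][0]["message"]["content"])

An educator would still need to review the generated feedback before returning it to the student, in line with the verification caveats discussed below.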
Educators can also use ChatGPT to assist with and automate other administrative tasks, such as scheduling and lesson planning. For example, educators can provide ChatGPT with information about a certain lesson plan to generate presentation material to complement it. That same information can also be used to create quizzes and exams. This can be especially useful when teachers feel the need to offer additional study resources, and it ultimately allows them to dedicate a greater share of their attention to teaching and supporting their students.
These integrations, of course, are limited by the data that has been used to train the model(s), and they also carry the possibility of introducing errors. Rather than developing a curriculum entirely with ChatGPT, educators might consider using it as a starting point or a guide for building lesson plans. It is important that educators verify that conclusions drawn by ChatGPT can be supported with outside sources.
The concept of cheating has been prevalent throughout human history and is particularly common in all forms of education. As early as elementary school, students are taught why they should not cheat on schoolwork, and despite this many continue to do so, sometimes even at the doctoral level. Cheating also takes on a variety of forms and severities. For example, it may take the form of students doing homework or lab assignments together when they have been directed to work individually, or cheating on tests by inappropriately accessing information. The latter form of cheating proved especially prevalent during the COVID pandemic, as monitoring students virtually proved difficult for educators. Some universities, such as the University of Georgia and Ohio State University, saw increases in academic dishonesty violations of 50% and 260%, respectively (Dey).
Just as universities appear to be recovering from the increased cheating that occurred during the COVID pandemic, another cheating epidemic seems to be on the horizon. Large language models such as ChatGPT are likely to be useful for a variety of different types of cheating because of their unique advantages compared to the traditional search engines students would otherwise use to obtain information. Plagiarism is a specific form of cheating in which someone attempts to pass off the work of others as their own, without proper credit to the original author. A relatively common example is someone writing a paper who quotes another piece of work (either directly or by copying the idea of a section) without ever crediting the source. Methods to combat this form of cheating already exist, as the sources students are likely to draw from are known and can be observed. Educators can compare students' work with existing works and look for similarities. RIT makes use of the anti-plagiarism software Turnitin, which checks student work against a large repository of internet documents and data, a repository of previously submitted papers, and a subscription repository of periodicals, journals, and publications. Such tools are invaluable to educators, as they contain significantly larger data sets than any single professor or TA could hope to compile.
ChatGPT presents a unique challenge to such tools, however, because of a key difference in how NLPs and traditional search engines such as Google provide information in response to user queries. An NLP dynamically generates responses to a query, using the information it has been trained on and its neural network model to determine the content of the generated response. Compare this to traditional search engines, which provide static information in the form of links to other websites. Because the text is generated on the fly, modern anti-plagiarism tools and techniques may prove less effective, or even ineffective, compared to before. ChatGPT also allows students to do even less work when cheating than they previously would have. With traditional plagiarism, students writing a hypothetical essay would still need to write certain sections of the assignment themselves and do some level of research to find content to steal from others. When using ChatGPT, if a student engineers the prompt well enough, they can have the AI write the entire essay for them, using the fact that ChatGPT retains memory within a thread to ask it to rewrite certain sections of the essay with their feedback in mind.
Detecting plagiarism involving NLPs is primarily a matter of determining whether a section of text was written by an NLP or a human. There are a few different proposed ways to do this, but the most successful method thus far has been using other AI models to judge the legitimacy of the text provided. For example, a paper published in February 2023 titled “Will ChatGPT get you caught? Rethinking of Plagiarism Detection,” authored by Mohammad Khalil and Erkan Er (from the University of Bergen, Norway, and the Middle East Technical University, Turkey, respectively), analyzed the success rates of current anti-plagiarism tools against text generated by ChatGPT. The study tested the popular tools Turnitin and iThenticate, both owned by Advance Publications LLC; Turnitin is used to review undergraduate student work, while iThenticate is primarily used to review scholarly publications and academic papers. The study’s findings indicated that the majority of the submitted text had a high degree of originality, with 68% and 42% of submitted papers having a similarity score below 10% for iThenticate and Turnitin, respectively. The study also asked ChatGPT to evaluate whether the text was human-generated; the NLP achieved a 92% true accuracy rate in identifying text that it had written itself.
A similar study was carried out by our team to evaluate whether any improvements had been made to Turnitin’s scoring, either due to more GPT-generated text being added to the Turnitin database or due to changes made to the model. A total of 80 essays were used for this experiment: 40 human-written and 40 generated by ChatGPT. All essays were submitted through Turnitin, and similarity scores were recorded for each item. Two of the human-written essays showed abnormally high similarity scores (100% and 97%, respectively). This is likely because those essays had already been added to the Turnitin database, as all human-written essays were randomly selected from our group's previous writing assignments. When these outliers were omitted from the dataset, an average similarity score of 4.78% was recorded for human-written essays, compared to 12.02% for GPT-generated essays. This indicates that while there is a noticeable (almost three-fold) increase in the average similarity score for GPT-generated essays, the increase is not large enough for educators to reliably differentiate between human and AI-generated text with this tool alone.
That being said, the team at Turnitin also appears to believe that using AI to fight AI-assisted plagiarism may provide a way to address the issue. The company released an article in January 2023 outlining its plan to use AI to combat NLP chatbots like ChatGPT. As of early April 2023, an updated version of Turnitin is being rolled out in select markets (Australia and New Zealand) that is claimed to detect ChatGPT-generated text with 98% accuracy (Turnitin). It is unknown at this time how this new tool will perform when access is expanded, but we are hopeful about its potential, and monitoring its performance going forward will be useful.
This method of using AI to validate whether text is human-written is not without limitations, however. There is one major flaw in these AI cheating-detection models that has yet to be successfully addressed, and unfortunately this flaw is likely a byproduct of how models are currently trained, making it quite hard to solve. If a machine learning model is created whose purpose is to determine whether there is a bird in a given image, the model is trained on at least thousands of images, some containing birds and others not. This produces a model that is very good at its job, but as soon as the picture it is tested on is modified in any way (for example, put through a filter), its accuracy drops significantly. The same problem applies to anti-plagiarism models: they can be circumvented by modifying the NLP-generated text before submission, with a high rate of success in avoiding detection. This issue may improve with time as detection models account for these slightly edited papers, but at present there is no clear and proven solution to this problem.
Both machine learning and natural language processing have made enormous strides in just the time this research was conducted. As such, the capabilities of these systems will likely keep surpassing those of previous generations, with more and more features to come. This is already proving true with the jump from GPT-3.5 to GPT-4: in just one iteration, the system now provides more reliable output and can accept images as input, at the cost of requiring more computing power.
This will undoubtedly have both beneficial and detrimental effects in the classroom. However, the overall success of tools such as these will ultimately be determined by how students and teachers adapt to the new technology. While recognizing the potential impact of NLP tools on education, we must address the adjustments needed to maximize their benefits while minimizing their drawbacks.
One skill that may need to be developed by both students and faculty is ‘prompt engineering’. Consider a mathematician who is taught to use a calculator: the calculator can drastically reduce the time a problem takes to solve, but the mathematician must still understand the underlying math concepts, such as order of operations and the rules for handling formulas in parentheses, in order to enter the formula correctly and obtain an accurate result. Similarly, the goal of a ‘prompt engineer’ is to write prompts to an NLP succinctly enough that they reduce the number of follow-up questions needed and produce relevant answers suited to the question.
As an example of prompt engineering, the team reviewed assignments within our curriculum that had proven difficult to absorb. The team wanted to test different aspects of how an NLP would benefit the student, for example, how much the response changes with the level of detail provided in the question being asked. The subject was ‘password cracking and how it functions’. Two different questions were posed: (1) a high-level prompt of “How does one crack passwords?” and (2) a detailed prompt of “When working on a lab for computing security in which one needs to obtain hashes from a linux computer and crack those hashes using different programs like HashCat and JohnTheRipper. What are the first steps that should be taken?”. The results from these two prompts show the importance of providing extra details and painting a scene for the NLP. The first prompt produced a simple response explaining the definition of password cracking, whereas the second prompt produced five detailed steps on how to get started with the act of password cracking.
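For readers who wish to reproduce this comparison, the minimal sketch below sends both prompts through OpenAI's API and prints the responses side by side. The model identifier and client interface are assumptions based on the early-2023 openai Python library; the prompts themselves are the two used in the experiment above, and the actual responses will vary from run to run.

import openai

openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder

prompts = [
    "How does one crack passwords?",
    ("When working on a lab for computing security in which one needs to obtain hashes "
     "from a linux computer and crack those hashes using different programs like HashCat "
     "and JohnTheRipper. What are the first steps that should be taken?"),
]

for prompt in prompts:
    reply = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # assumed model identifier
        messages=[{"role": "user", "content": prompt}],
    )
    # Print each prompt next to its response so the level of detail can be compared.
    print(prompt)
    print("---")
    print(reply["choices"][0]["message"]["content"])
    print()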
Being able to properly and efficiently interact with these tools is a skill that both students and faculty can learn, and learning it will be paramount to the success of integrating NLPs within a learning or teaching environment.
By taking advantage of NLPs, educators could pave the way for a new era of teaching and learning. Teachers must educate themselves and their students so that they can harness the full potential of AI in a way that helps them learn. Additionally, ethical considerations such as data privacy and bias must be carefully weighed when integrating NLPs into education, especially when handling student data. Educators and students alike must take note of these considerations.
While this technology is exciting and may have endless possibilities, it still poses the risks of academic dishonesty and of replacing human interaction, which is where our guidance can be implemented. That guidance, however, needs to involve professors and curriculum designers, otherwise it will not be effective. We recommend, for example, redesigning assignments so that they are not just about fact memorization but instead require students to engage with the material. This approach has to be driven primarily by educators; otherwise, issues discussed earlier, such as plagiarism, will run rampant, and using an NLP will guarantee students easy grades.
Finally, NLP and machine learning tools can empower anyone who wants to start digging into a subject with the ease that search engines like Google provide, but with more depth and far less time spent visiting different sites. Whether the output of such tools is reliable is another question in itself, and it should always be double-checked. In the end, we believe that if handled appropriately, education and learning can benefit greatly from the use of such tools.
“AI SEO for AI Search Engines.” Fruition, https://fruition.net/blog/ai-seo-for-ai-search-engines.
“Natural Language Processing.” Wikipedia, Wikimedia Foundation, 21 Feb. 2023, https://en.wikipedia.org/wiki/Natural_language_processing.
OpenAI. “CHATGPT: Optimizing Language Models for Dialogue.” OpenAI, OpenAI, 2 Feb. 2023, https://openai.com/blog/chatgpt/.
“What Is Natural Language Processing?” IBM, https://www.ibm.com/topics/natural-language-processing.
Dey, Sneha. “Reports of Cheating at Colleges Soar during the Pandemic.” NPR, NPR, 27 Aug. 2021, https://www.npr.org/2021/08/27/1031255390/reports-of-cheating-at-colleges-soar-during-the-pandemic.
Khalil, Mohammad, and Erkan Er. “Will CHATGPT Get You Caught? Rethinking of Plagiarism Detection.” ArXiv.org, 8 Feb. 2023, https://arxiv.org/abs/2302.04335.
“Turnitin’s ChatGPT and AI Writing Detection Capabilities Go Live with 98% Confidence Rating (Australia & New Zealand).” Turnitin, 5 Apr. 2023, https://www.turnitin.com/press/turnitins-chatgpt-and-ai-writing-detection-capabilities-go-live-with-ninteyeightpc-confidence-rating.