In April 2022, I started reading articles about OpenAI, an artificial intelligence research lab, and its image-generating program, Dall-E (https://openai.com/blog/dall-e/). As I kept reading, I realized Dall-E is built on a version of GPT-3. GPT-3, or the Generative Pre-trained Transformer 3, is a large language model: an AI model that 'trains' on data in order to produce similar content. In the case of GPT-3, that content is human-like sentences.
GPT-3 was released in 2020, so I thought it would be more accessible than Dall-E. After surfing the internet, I found a few websites that use the GPT-3 model or something similar. By that point, most existing language-generating programs were built on at least GPT-2, a previous iteration of OpenAI's GPT series.
After playing with different websites that use some iteration of GPT for a while, I had a lightbulb moment. My classmates and I had recently received our third paper assignment for our Philosophy of Mind class. As soon as I saw the topics listed, I thought, “Yup, I am definitely using my essay pass on this.” Although I knew I didn’t want to write the essay myself, I saw some potential in it. So I emailed Professor Sehon asking if he would be interested in grading an AI-written essay. I told him I would like it to be graded as if a regular student had written it. He responded with something better: “Don’t tell your classmates and we’ll review it together as a class!” I knew it would be a great low-stakes opportunity to test the ‘mind’ of AI.
I used two websites to produce the paper: chibi.ai and shortlyai.com. The AI’s assignment was to explain, evaluate, and argue against Professor Sehon’s philosophical claim that “common-sense reason explanation of human action is irreducibly teleological.” To briefly summarize the claim, Professor Sehon states that explanations for actions are teleological, or goal-oriented. For example, the statement “Juan went to the kitchen in order to get coffee” explains Juan’s action in terms of goals – his whole reason for going to the kitchen is the goal of getting coffee. But I don’t want unnecessary confusion to make you stop reading, so I’ll leave the explanation there for now.
All in all, I was skeptical of how well the language model would respond. I thought most of what came out would just parrot whatever words it was fed, and I even told Professor Sehon not to put much faith in the paper’s quality. In my mind, there was no way that AI was going to write an even decent philosophy paper.
I got to work by writing a sentence on shortlyai.com for the program to riff off. The first few generated sentences were utterly off-topic. I kept writing a sentence or two, then deleting what I wrote, until I finally got a decent starting point for the first paragraph. As soon as the sentences made sense, I kept hitting the ‘Generate New Text’ button. When I felt the first website had written a decent amount, or that I was just going in circles typing and deleting text, I switched to chibi.ai.
Using chibi.ai turned out to be extremely entertaining. The AI understood this was a philosophy essay and started its first sentence with, “Famously, Kant argued that…” I was shocked. It was impressive that the program recognized it was writing about philosophy, but I guess when you use words like ‘teleology,’ there aren’t many subjects to turn to.
Soon enough, the artificial intelligence started developing its own thought experiment! I thought it was one of the funniest thought experiments I had ever read, mainly because it was such an absurd case. The AI considers the case of the hypothetical Emperor of China and his five-year-old son with Down syndrome. The Emperor’s advisors tell him he would be happier if his son did not have Down syndrome. They tell him that if he let the doctors operate, his son would be cured. But the Emperor is adamant about not curing his son; he takes great pride in his son’s condition.
Somehow the AI ties all this back to Immanuel Kant and then says, “If you find the view that it is wrong to treat people with Down Syndrome with medical intervention to make them more similar to the Emperor’s other children surprising, then ask yourself why that view is mistaken.”
After the absurd Emperor example, I wanted to get the AI back on track to the main prompt surrounding teleological action explanation. I wrote in a sentence with the word ‘Sehon’ so it would start writing more about the original prompt. When ‘Sehon’ was put into the program, the AI finally did what I thought it would be doing the whole time – parroting the inputted words. I assume that because ‘Sehon’ was not a familiar word for the language model, it spewed the original prompt three or four times. That is when I decided the AI had finished – it had written its first philosophy paper. I sent the completed paper over to Professor Sehon, and in one of the following classes, we went over it, paragraph by paragraph, giving it feedback under the guise that this was a past student’s paper.
The first paragraph was no big deal. The AI did its job in introducing the topic, with some slight misunderstandings, but it had the general idea down. The mistakes did not seem peculiar; any average student could have similar misunderstandings.
Paragraph two explains teleology so any reader can understand what it means. As the artificial intelligence says, “Teleology is a form of reasoning that makes use of purpose or goals to explain an action instead of using other causes.” I’d say that its definition is probably more succinct than the one I gave above. The program also came up with an example of teleology and the class tended to like it. “The examples are useful,” they said. “I like how they did X, Y, and not Z.” “I have a decent understanding of where the essay is going.” Generally, no big issues in this part of the paper other than my trying to hold in some laughter. Now came the AI’s magnum opus: the Chinese Emperor thought experiment.
Professor Sehon started to read through the chunk of text as we followed on the screen. He called out the mention of Kant, saying it was vaguely relevant. But what caught everyone’s attention was the Chinese Emperor. As soon as we hit the portion on curing the Emperor’s son’s Down syndrome, there were a few bursts of laughter and some concerned comments about the student who wrote the paper. Through some laughter, a few of my classmates said, “They do know Down syndrome isn’t curable, right?” But, overall, people generally liked the essay. It got the point across, they thought. It still leaves us with the question of what causes the Emperor’s actions, but it provides imagery and examples that the class liked and found easy to follow.
When we arrived at the final paragraph, everyone was confused – this was when the AI had regurgitated the prompt three or four times. Each sentence started with “explain and evaluate” or something like it, but Professor Sehon said, “I think the student had some problems with submitting or something like that.” That seemed to be enough of an explanation for the class.
Professor Sehon also loves doing polls during class. We usually all have clickers and express an opinion by pressing a letter. So Professor Sehon asked us, “What grade would you give the paper?” I gave it a D; I was biased because I knew an AI had written it. But I did not expect the class results. I assumed plenty of people would give it Ds and Fs. Surprisingly, most people gave it Bs and Cs. The AI had passed!
When my classmates found out that an AI wrote it, they were stunned and impressed. How could an AI write a convincing essay on such a specific topic? Considering the paper was for a philosophy of mind class, I’d like to say the AI’s product supports a stronger case for the intelligence of AI. The task may be narrow, but it is still exciting to see where artificial intelligence technologies could go. But still, this feels like an all too easy explanation. Does the artificial intelligence’s success with the paper say more about AI or about us?
For instance, my classmates were told the paper they were looking over was written by a past philosophy of mind student. I am sure that if most Bowdoin students had their paper up for class review, they would want their classmates to have a certain level of kindness in grading. Does the AI’s success tell us more about its mind or college student collegiality and kindness?
Furthermore, large language models learn from sentences that humans have created. It is through what humans have already written that the large language models start to make their sentences (but perhaps in many senses so do we). So, did the AI come up with something original? Or simply calculate a probability and output the most probable word or sequence of words? Does the AI know what is true? Or, again, is it just calculating probabilities?
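The next-word idea can be sketched with a toy bigram model – a vast simplification of GPT-3, included purely for illustration (the corpus and function names here are my own invention, not anything from the actual websites I used):

```python
from collections import Counter, defaultdict

# Toy illustration (nothing like GPT-3's scale): a bigram model that
# "learns" from human-written sentences and outputs the most probable
# next word, with no notion of meaning or truth.
corpus = (
    "juan went to the kitchen to get coffee . "
    "juan went to the store to get milk ."
).split()

# Count how often each word follows each other word in the training text.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_probable_next(word):
    """Return the most frequent next word seen in training, and its probability."""
    counts = following[word]
    total = sum(counts.values())
    best, count = counts.most_common(1)[0]
    return best, count / total

print(most_probable_next("went"))  # ('to', 1.0): "went" is always followed by "to"
```

A model like this can only ever echo patterns from its training text; a word it has never seen (say, ‘Sehon’) leaves it with nothing to go on, which matches what I saw when the program started regurgitating the prompt.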
When confronted with a new word, like ‘Sehon,’ the AI didn’t have much success, but I’m sure that if it had handled ‘Sehon’ well, we wouldn’t be as critical of what it is capable of. Still, success there would not tell us whether the AI knows anything. If anything, its failure suggests there is no understanding in the program. Sure, the AI assigned weights and biases to compute its probabilities, in what is meant to be a procedure loosely analogous to our brain processes. But I find it hard to believe that this artificial intelligence has assigned meaning and truth values to statements.
Despite the lack of meaning in the AI’s outputs, I think these programs will be most beneficial when paired with human intelligence. For instance, the AI’s paper might have been more believable if I had edited what it generated and added relevant information to the argument as it came up. It could help clear up writer’s block by bringing in new ideas. I don’t think I would ever have come up with an original thought experiment about a Chinese Emperor’s pride in his son with Down syndrome for a class paper, but I can use it as a springboard for a thought experiment of my own that makes more sense in the given context. In this sense, AI can prime new kinds of creativity in us.
Of course, ethical questions always ensue with artificial intelligence. Most professors don’t want their students to turn in AI-generated papers for ethical and learning reasons. If students use AI-generated content for their papers, they may be getting the work done but not retaining much of the course’s information. In the words of the artificial intelligence, “if you find the view that it is wrong to [claim AI-generated content as your own] surprising, then ask yourself why that view is mistaken.” Or maybe students need to know the information to revise the AI-generated content correctly and could gain a deeper understanding by applying the technology in a productive way. (My younger self would likely use a program like this as a cop-out, but perhaps most other writers are not as lazy as I was.)
Maybe journalists or creative writers can find AI helpful in treating writer’s block. Still, if AI is used in articles, who gets the credit – the writer, the AI, or the program’s creators? Also, because AI does not understand what it produces, it could produce vulgar or insensitive content. As with most technologies, we shouldn’t defer our knowledge to these programs. They are helpful, but they should not be overbearing, and we should make sure to keep them in check.
Link to AI-made essay: https://docs.google.com/document/d/1KIjDqQMKuBrEcnK5y05J9XCIsZUTocAxNrGMZE-zWWA/edit?usp=sharing