I've written about this before, but I think this is mainly a symptom of the perverse incentives the modern education system is built around.
College students don't take classes to learn; they take classes as part of a transaction to earn grades, a form of capital that is eventually exchanged for a degree. In many cases, actual passion for learning might hurt your grade if you spend effort outside the transactional scaffold to study extra topics or make personal projects. The assignments themselves are reduced to formulaic checklists that can be graded by an unskilled intern or TA, where essays with a topic sentence, three body paragraphs, and minimal grammatical errors are given full marks even if they're just academic drivel (or, more or less, bullshit).
So of course students end up cheating, or checking out and not learning much, just fulfilling requirements until they finish. A student using GPT to do assignments is naturally robbing themselves of any learning opportunities the assignment offers, but that doesn't matter in a context where the grade transaction is valued over actual education.
Maybe if we moved away from traditional grading structures we could rebuild our system around real learning and development.
For a while I had hoped that bootcamp-style education might get around some of those issues, but those programs have largely been warped around job-placement statistics, and so now teach the bare minimum needed to fool a recruiter into giving you an interview and maybe an offer.
Yes, and my god would it be nice to finally stop this charade of "school is for learning" when practically everyone goes through the process explicitly to land a job, all while the curriculum gets centered around application in jobs.
We need separate systems: one for actually learning without financial pressure, and one for joining the workforce. Before anyone brings up college/uni as the solution: it could be, if transformed, but as of right now too many people are in uni purely to get a job.
When you have a system organized around the morality of work and not working ostracizes you from society, health care, shelter, food security, etc, what do you expect?
My wish is that the next 20 years bring a post-scarcity society where wage-slaving for a corporation so your family doesn't die if they get cancer isn't so much a moral imperative.
Post-scarcity will never happen; we live on a finite planet, and that's why mass industry is as much of a problem as Capitalism. We need to divorce ourselves from the idea that more production is inherently better than less production.
Post scarcity has more to do with divorcing labor from production than producing unlimited amounts of widgets. A basic living, often proposed as a basic income, would then be available to all without everyone being fully employed. Our work morality, which equates labor with godliness, will make this hard to accept, but I dream that someday art can be labor valued on its own merit, or even building realms in Minecraft. I frankly don't see what's intrinsically superior about laboring at McDonald's, or as an accountant for some globo-corporation, over playing video games all day. But most people believe breaking your soul in a cubicle is somehow morally superior.
> Post scarcity has more to do with divorcing labor from production
For almost everything, we’re there. Especially when it comes to food production. There’s still some employment to be had, but precious little when compared with the number of people who can be fed by that limited labor.
For most of our basic needs, this holds true. Yet, the economy demands our labor, arguably hallucinating jobs into existence to meet those demands.
I agree. Realistically housing, medical care, and a few other basic necessities aren’t there. And farming the way we do is blasting the earth. We still have a way to go. But the nobility of exploitation and a life of pointless toil will likely persist without some serious dislocations in society. It’s possible AI will bring those about. But as a society we don’t let go of our prisons easily.
> College students don't take classes to learn; they take classes as part of a transaction to earn grades
I am really grateful that I was in a position where I was able to go to college to learn. My parents allowed me to live at home rent-free for 3 1/2 years while I worked a full-time job on 3rd shift. I took 2 classes a semester and didn't enroll in any of the school's degree programs. I paid mostly out of pocket and took whatever classes I thought would be useful: almost every CS class, a UX class, a few graphic design classes, some business law, project management, and an accounting class. My advisor always seemed really concerned that I wasn't working towards any particular degree. I never looked at my grades and never stressed about passing any exams because they didn't matter to me. All I wanted was to pull as much knowledge as I could out of a class.
Goodhart's Law back at it again, though there might be some adjacent law I'm not familiar with. It sure seems like we keep asking "how do we measure the success of this process?" and keep answering "money".
All this does is underscore for me how many careers that require a college degree are going to soon be made obsolete by technology. As my grandchildren grow into secondary-school age, this grandfather is going to be extolling the virtues of vocational trade schools.
When I was in school, I was lucky enough to have a philosophy teacher who would actually debate things with students. One of my favorite discussions with him was about our math classes banning the use of calculators, and his class banning the use of Wikipedia. His argument was that these restrictions trained us in the skills we would need to succeed on the exam, where we would not have calculators or Wikipedia to help us. He would intentionally make poor arguments like this with the expectation that we would be able to identify what was wrong; here, the flaw was that he was "kicking the can down the road", and he was hoping we would then ask: well, why aren't we allowed to use calculators/Wikipedia in the exam?
We did ask that (I think he was proud) and then we had a good discussion of the applicability of the exam to real life, where we concluded that real life circumstances might occur that imitate the restricted conditions of the exam - e.g. you may be in a situation where you don’t have a calculator but still need to do maths, or you may be asked to research something for which a Wikipedia page does not yet exist. Other good arguments I remember being suggested in favor of exams were “the time limit of exams approximates real life time pressure” and “competence in the most restricted form of the aptitude implies competence in less restricted forms, but the reverse is not true”.
I get the impression that schools today do not utilize exams like they did back then, which I think is probably a bad thing and definitely makes it harder to catch out ChatGPT-using students.
If all you can do is apply tools, with no real knowledge beyond that, you're not going to be more valuable than anyone else who can use those tools.
In the case of GPT, basically everyone on the planet.
That's not a rat race I'd want to be in, but to each their own.
Guided discussion about open-ended questions can be the most satisfying kind of teaching for both teachers and students. It’s also resistant to cheating with GPT, at least when conducted live and in person [1]. It becomes harder to pull off as classes become larger and subjects become more technical—but maybe that’s the kind of learning that should be replaced by interactive bots anyhow.
[1] I’m teaching a discussion-based university class online this semester, and last week one student with her camera off sounded as if she might be reading her comments aloud rather than speaking spontaneously. I wondered if she might have composed them with ChatGPT.
It's also the norm at small German universities and, till recently, many larger ones. It has not harmed German students and I believe puts Germany in a better position right now.
I listened to a podcast with Scott Aaronson that I'd highly recommend [0]. He's a theoretical computer scientist, but he was recruited by OpenAI to work on AI safety. He has a very practical view on the matter and is focusing his efforts on leveraging the probabilistic nature of LLMs to embed a digital, undetectable watermark. It nudges certain words to be paired together slightly more often than chance, so you can mathematically derive, with some level of certainty, whether an output (or even a section of an output) was generated by the LLM. It's really clever, and apparently he has a working prototype in development.
One workaround he hasn't figured out yet: asking for output in language X and then translating it into language Y. But that may still eventually be addressed.
I think watermarking would be a big step forward for practical AI safety, and ideally this method would be adopted by all major LLMs.
The scheme Aaronson is talking about is statistically sound and will not penalize a student who happens to have a particular style.
Basically, you replace the random noise used in the sampling with a cryptographic strategy which biases certain n-tuples based on a pseudorandom number generator. It does not work with temperature = 0 but almost no one uses that.
If the student was stupid enough to not modify the essay, you would be able to prove with 99.999% confidence that it was written by a particular LLM. If they modify it slightly then you will have a lower confidence, but they will still be caught.
> Scott Aaronson: Exactly. In fact, we have a pseudorandom function that maps the N-gram to, let’s say, a real number from zero to one. Let’s say we call that real number ri for each possible choice i of the next token. And then let’s say that GPT has told us that the ith token should be chosen with probability pi.
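To make the quoted scheme concrete, here is a minimal Python sketch of what I understand Aaronson to be describing: score each candidate token by r_i ** (1 / p_i) and pick the argmax. The `prf` helper, the hashing details, the function names, and the n-gram width are my own assumptions for illustration, not OpenAI's actual implementation:

```python
import hashlib
import math

def prf(key: bytes, ngram: tuple, token_id: int) -> float:
    """Pseudorandom value in [0, 1) derived from the secret key, the
    preceding n-gram, and a candidate token id (a stand-in for a real PRF)."""
    h = hashlib.sha256(key + repr((ngram, token_id)).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def sample_watermarked(probs: dict[int, float], ngram: tuple, key: bytes) -> int:
    """Pick the token maximizing r_i ** (1 / p_i), given the model's
    (nonzero) next-token probabilities p_i. By the Gumbel-max /
    inverse-CDF property, token i is still chosen with probability p_i,
    so the output distribution is unchanged."""
    return max(probs, key=lambda i: prf(key, ngram, i) ** (1.0 / probs[i]))

def detection_score(token_ids: list[int], key: bytes, n: int = 4) -> float:
    """Average -ln(1 - r) over the text. For text written independently
    of the key, r is uniform and this averages ~1.0; watermarked text,
    whose chosen r values skew toward 1, scores noticeably higher."""
    total = 0.0
    for t in range(n, len(token_ids)):
        r = prf(key, tuple(token_ids[t - n:t]), token_ids[t])
        total += -math.log(1.0 - r)
    return total / max(1, len(token_ids) - n)
```

The clever part is that anyone without the key sees ordinary sampling, while the key holder can check whether the chosen tokens' r values are suspiciously close to 1.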
OP makes a good argument as to why this would be ineffective:
> ... even if we could definitively tell whether any given word was produced by ChatGPT, we still couldn’t prevent cheating. The ideas on the paper can be computer-generated while the prose can be the student’s own. No human or machine can read a paper like this and find the mark of artificial intelligence.
> or even a section of an output was generated by the LLM
A probabilistic approach like the one described cannot detect small-scale LLM usage. It takes a lot of text for the random "undetectable" word pairings to become statistically evident with any sort of confidence.
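A rough back-of-envelope calculation supports this. Assuming the per-token score has roughly unit variance (as the exponential score in schemes like the one sketched above does), the average over T tokens only stands out once T is fairly large:

```python
import math

def tokens_needed(lift: float, sigmas: float = 5.0) -> int:
    """Tokens required before a mean per-token score sitting `lift`
    above the baseline is detectable at the given sigma level,
    assuming unit per-token variance (a simplifying assumption)."""
    return math.ceil((sigmas / lift) ** 2)

print(tokens_needed(0.3))  # 278 tokens for a healthy watermark lift
print(tokens_needed(0.1))  # 2500 tokens for low-entropy, hard-to-bias text
```

A sentence or two of LLM output pasted into an otherwise human essay simply doesn't contain enough signal.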
I don't know if these open-source models will be better. The ones I've seen so far have been pretty terrible. OpenAI did a million little hacks to get the LLM to produce such good responses. Even Google couldn't figure out a better way to train their model than to fit it to GPT responses.
This may be a heretical opinion here, but open source doesn't always win, especially when it's an incredibly technical task and it costs a lot to train and host. Why didn't open source get us a reasonable search engine? Why isn't LibreOffice as good as Microsoft Office? Why do we pay for any software?
I mean sure, you can always cheat. You can pay someone to write essays for you or take your tests. But watermarks would be a huge step forward to detect AI generated content.
As a student studying for exams, chatGPT is phenomenal. Instead of needing to ask my peers, I can take a practice question and ask chatGPT not for the solution but for a hint, which is a game changer.
I'm a professor and up to this semester I've required a paper to summarize some area. I've already decided (before this was posted) to stop requiring them. It's too easy to create a paper that summarizes material without learning anything.
Everyone learns and tests differently, so this is no panacea. But I absolutely loved this one exam I had where the prof had me in for a 20 min chat about a topic. An oral exam, but it was so informal and felt like a comfy chat.
Written tests are torture for me. But chatting about what I’ve learned?! Let’s go!
i was a university lecturer teaching a large foundation subject. at the start of semester i would show them how to plagiarise, summarise and obfuscate an essay — on-screen in real-time. it was lots of fun. we also showed them how our assessment practices were resistant to cheating (using a mixture of oral presentations, reflective writing, monitored testing and group moderation). it seemed to work well with very little plagiarism — except for those students who ignored (or missed) the message.
Wonder how many teachers/professors have considered using ChatGPT to grade papers (turning this tech around)? Like, you (or your grad assistants) could absolutely ease your crazy workloads the same way.
I would not be surprised if the students in my beginning programming class are using it; on a discussion assignment to give an example of a Python dictionary that you would want to use to solve a problem that interests you, about 60% of the entries had approximately the same structure for the concluding paragraph (all of them had different wording, but the theme was the same):
“This Python dictionary could be useful for $HOBBYIST who is learning about $HOBBY and could help people familiar with $HOBBY to remember terms or use it for guidance in improving their skills at $HOBBY.”
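For context, the kind of answer the assignment asks for looks something like this; the climbing glossary is purely my own hypothetical stand-in for a student's $HOBBY:

```python
# Hypothetical submission: a glossary a climbing hobbyist might keep,
# mapping jargon to plain-English definitions.
climbing_terms = {
    "beta": "information about how to do the moves on a route",
    "crux": "the hardest section of a climb",
    "flash": "climbing a route first try, with prior information",
    "onsight": "climbing a route first try, with no prior information",
}

# Print the glossary as a quick reference sheet.
for term, definition in climbing_terms.items():
    print(f"{term}: {definition}")
```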
I imagine OpenAI et al are saving all LLM output. Soon they can offer an anti-cheating service that checks a given submission to see if it was generated by their program.
> I’m a student you have no idea how much we are using ChatGPT
And honestly, I don't care much if they use any automation tool in their homework. They go to (and pay for) school to learn. If they're using tools to augment their learning, good for them. If they're using tools to fool others and themselves, it's their loss.
You can't put the cat back in the bag anymore; let's focus on improving education instead of redoubling efforts to de-automate things.
I teach (TA/Grad Researcher). We know how much you all use ChatGPT. Just like we know how much you all cheat. We also know that the level of cheating exponentially increased during covid. We're not dumb.
But the truth is there's little we can do about it. Often it is difficult to prove concretely, so there's a risk/reward function that needs to be considered. But I'll give you an example where I had clear proof. Two students uploaded code from another student into their GitHub. The git logs were not removed and had the name of the original owner (whose mistake was not making their repo private). This was taken to the department head. The department head did not want to do anything about it. Why? Because every cheater we prosecute gets logged, and administration determines that the number of cheaters is proportional to the number of students we prosecute. If a department has a lot of cheaters then we get reduced funding and a lot of audits. I'm sure you can put together the rest of the story here.
The truth is that there are only a handful of ways around this: small classes, in-class assessments, project-based homework. Repetitive "doing" homework is good and necessary, but it shouldn't be a major part of the grade (or even graded). Take-home exams are even fine and probably worthwhile, because when you write a take-home exam you operate under the assumption that it is open book and open internet.
We have to think about incentives here. If we're being honest, most students aren't there to learn; they are there to get their piece of paper. (How do we fix this?) Departments are not aligned to prioritize learning; they are incentivized to make money. Training for a degree is perfectly acceptable, but if the students aren't invested in it, then what incentive do they have not to cheat? (You're never going to win this game. Students are good at cheating and will always win.) We also have to consider that grades are the main signal, but that even though this may be a decent signal when aggregated across students, it is a very noisy signal for individuals. (Did you take the hard professor who never gives an A but from whom you learn a shit ton, or did you take the professor who gives easy As and you don't get a deep understanding of the material? Why do students pick one over the other in a very predictable way?)
ChatGPT isn't anything new, nor a crisis. But it does put more pressure on us to rethink how our system has changed and what incentives we've created.
The direct solution would be more oral examinations, presentations, that kind of thing.
We should also dramatically realign the education system to value real learning, personal and community growth, and a culture of learning, rather than just daycare (not that childcare for parents isn't tremendously important, believe me), funding zillions of BS college admin positions, job training for corporations that are too cheap to train their own employees, and grades as we know them. (Grades have so much nonsense loaded on top of them. Are they a measure of a student's learning? Are they an incentive? A punishment? A way of ranking students? It's literally all of this and more!) Then students would have much more incentive to actually learn (because the system would be operating in good faith) rather than trying to BS their way through an assignment quickly.
Unfortunately I don’t see that ever happening, at least in America.
It's an unusual window of two years or so where this has been quite possible yet still not realised or accepted by instructors. It's just too far outside the current paradigm. It has coincided with students not having to sit exams during COVID, which also made attainment easier and made degrees a less reliable, less clear-cut measurement.
I truly believe that in 10 years, for many subjects, we'll have all-new forms of assessment, based perhaps on oral exams, or on GPT-like AI models assisting with those, but in any case something unrecognisable from now.
An ex-colleague is a professor, and he recently caught a student handing in homework written by ChatGPT. The only reason he noticed is that he wanted to check one of the references and ended up nowhere. The student was stupid enough to let ChatGPT generate the bibliography too.
There is a difference between using GPT as a tool and using it as someone to outsource your work to. As a tool, it's like a super-powered spell checker, and I don't think any university in the world bans the use of spell checkers. As someone you outsource your work to, that's cheating by any standard.
How is this an issue? Just have in-class tests and exams: writing, multiple choice, etc. It would end up giving lower weight to slower, more thoughtful writers, though.
In law school, one professor asked if we preferred a take home exam or a proctored exam. To his surprise, we nearly unanimously chose proctored.
A take-home exam or an essay is not less stressful. It just expands the time boundaries of the stressor. The intensity of the stress perhaps never gets as high as in an in-class exam, but the total stress was, in our lived experience, much greater.
At my undergraduate school (Caltech) almost all exams were take home. At my law school there was only one take home exam.
I definitely found take home less stressful. Back in those days I was very much not a morning person. With take home I could take the test at a time of day when I was at peak alertness which often occurred at around midnight. For most tests there were several days between when the test was handed out and when it was due, so I could order the tests so that I could take the ones I needed more preparation for at the end of finals week.
With in class tests they might occur when I'm less alert, and they might schedule the subjects I need the most prep time for early in the week and the ones I don't need prep for late in the week.
They're still good. If a student uses ChatGPT to produce working code that they can understand and then modify, what's the problem? If you can't evaluate that, then you weren't making a significant effort to detect whether it was simply good googling anyway.
If the teacher evaluates the right things, ChatGPT isn't a problem, just an optional tool.
I was talking to a student at UC Berkeley the other day. He said ChatGPT use is very prevalent and professors don't know how to deal with it right now.
this problem is just an exacerbation of the longstanding duel between students and educational institutions caused by incentive misalignment due to credentialism. the proposed solution of using "ai-proof" evaluation techniques seems a bit like conducting a coding interview and not letting the interviewee check documentation (in the real world, you're encouraged to use this tool), but realigning higher education around the current year's best ai model is also a laughable prospect.
knowing the type of student I was, I would have heavily used chatGPT for assignments in the same way that I always used sparknotes. but now when I want to learn something out of intrinsic motivation, gpt4 is an extremely valuable tool (of many in the toolbox), and someone attempting to take it away from me would be doing my education a disservice. it's impossible to solve the problem that this article presents without addressing the underlying forces on students causing external regulation vs students being intrinsically motivated to learn
Anyone who thinks they can still grade this and catch its use by clever students should not be doing the evaluating.
There's zero way that you'll be able to detect a sufficiently careful use. Maybe the student asks it to write a report and then re-summarizes it in their own words.
That would save you 90% of the effort of doing the whole thing (and maybe it's even a valuable exercise).
I think we just need to move past what education was. It’s barely doing anything more than babysitting as it is anyway.
This is the only rational solution. I mean, most of the physics and math education I took never made sense to me. They would want you to memorize equations, or work without a calculator or without the internet. There is no situation in life anymore where you would be doing engineering without access to resources, the internet, and a calculator. The same could be applied to nearly all areas of education beyond about 5th grade. Education has been frozen in time, and this is just the final nail that will hopefully force a rework.
I don't know why you're downvoted. Absolutely, a lot of education was just grinding for pretense, and these automation tools greatly expose it.
The tools won't think for you, and we have to find a way to teach that, but that isn't what's being done.
I criticize blind faith in ChatGPT all the time and want people to use it with caution and intention, but I'm super happy at how it rocks so many boats. So many processes of questionable use have been rendered insignificant by semantic text generation. Good opportunity to improve.
The UK equivalent of college 'GPA' is degree 'class'. The class of my undergraduate degree was determined entirely by the scores I achieved on eight 3-hour exams taken in the final few weeks of my final year.
ChatGPT couldn't possibly help someone cheat such a system.
He's not impressed with its ability to produce a passing college essay, and is pretty certain that anyone who is able to finesse the mess that ChatGPT produced into one doesn't actually need to use it in the first place.
I taught intro computer science courses ~15 years ago, and went to university myself a bit before that. I know exactly how much "cheating" goes on on assignments, as does everybody involved. ChatGPT doesn't change that. It's the reason final exams are worth so much. To use a cliché, you're only cheating yourself if you want to get the 10% that assignments are worth without putting in the effort.
tl;dr: The letter points out how easy it is to put together a well-thought-out essay by leaning heavily on ChatGPT, and suggests possible solutions for testing students' ability to think.