Student assignments in a time of language modelling
Some thoughts on accountable writing and critical thinking
In a previous post about large language models (LLMs), I suggested that a lack of accountability is one reason for thinking that their use of ‘language’ is anything but human. I began that post because I wanted to think about how educators might respond, and there seemed to be quite a lot to get out of the way first. So this is the post I set out to write.
There are two ways I think educators can respond to the current surge in use of LLM-based tools. The first is to develop writing assignments that are accountable. I don’t say ‘LLM-proof’ because students will use the tools they have, and these tools are being advertised, reviewed and hyped as important to their success. But accountable assignments centre aspects of human writing that LLMs are not only bad at in their present iteration, but will always be bad at, and so the rewards for using them will be limited.
The second approach is to turn face-on to the technologies emerging from the AI labs and use them, as all technologies can be used, to ask questions, to develop critical skills, to think ‘out of the box’ and to imagine alternative futures.
So what would accountable writing assignments look like? I can think of several alternatives. In all of them, ‘writing’ could include media other than text, and modes other than the essay, but even if we imagine all of them as a standard 5000-word essay, there is still huge scope for experiment here.
Assignments that focus on writing from a particular position or point of view can help to produce accountable writers. Students are often reluctant to take up a stance, perhaps because school has taught them that ‘academic’ or ‘scientific’ writing absents the self. Or perhaps because social media has taught them that putting forward a public position can lead to a violent erasure of the self. On the other hand, some students put forward opinions confidently without recognising that they are partial or that they need justification (‘fluent bullshit’). Everything about the university experience should communicate that this is a space where positions can be safely held – but must be justified - and where disagreement does not lead to erasure. Writing at university should teach techniques like argumentation, referencing other people’s ideas, presenting evidence, reflecting on experience, defining and using terms. But the reason for all of these techniques is to make the writer accountable for the position they hold – the toolkit is meaningless without the purpose.
Writing about personal experience gives a natural sense of authority in what we have to say (who knows this better than I do?), and can be a good starting point for the practice of taking positions. But it can also be exposing, and more so for some students than for others. As Roz Ivanic explained in Writing and Identity, students can develop an identity and point of view without bringing in personal material. A stance can be taken in relation to an idea or a method, a professional situation or a set of data. A style can be developed in coding. Students can be asked to write as if they were somebody else – to try out an academic voice as a kind of ventriloquy, or write from different points of view, as a way of finding out what they think, or might think, for themselves. Of course LLMs can produce arguments for and against a position, and they can produce first person sentences. But they are very bad at connecting the two. The first person ‘I’ these models put forward has no coherent position of its own, and no reason to defend it.
A related approach to writing-as-positioning might be writing as becoming a member of a community. The terms ‘community of practice’ or ‘knowledge community’ are sometimes used in a rather idealistic way about university study. From a student perspective, the specialised use of language is just another reason for feeling that the subject they have chosen is trying to keep them out. But elements of community are there in every cohort, class or learning group - after all, they have at least their chosen course in common! - and can be fostered through writing. First year learning groups can build definitions together, or share and annotate references that they have found helpful. Second years can annotate examples, or undertake collaborative assignments, to help them share skills and clarify what they want their own writing to do well. Final year students can give feedback on each other’s work, perhaps agreeing criteria for good writing, and being graded on their contributions. Especially powerful, I think, is when authorised figures from the community, including lecturers and guest lecturers, share their journeys as writers. Research is a slow and contingent process, writing is difficult whether solo or in a team, and no writer is too experienced to still be learning.
In the same way, I think assignments that allow writing to unfold as a practice can generate accountability. Writing takes time. Break that time down into stages or activities or even roles (planner, curator, generator, editor, reference manager, proofreader…). Marks or feedback can be given on these different stages, or on drafts, or on reflective tasks alongside the production of writing. Writers develop over time: a good tutor is there for the journey. Having taught creative writing, I know that the right prompt, or a new discovery in reading, or just slogging away for long enough can produce sudden leaps forward and sideways. This is surely one of the joys of being involved with students as writers! But even the most surprising leap has some kind of run-up, and if we value the process as well as the achievement, students will value it as well.
Writing as process can also bring the body back into play. We do not want proctored exams to be the only place students’ bodies are connected with their writing. We can have in-person presentations, vivas, question and answer sessions and student-led events in which students account in an embodied way for the texts they have produced. We can hold live writing sprints, an activity that vividly connects the writing mind and body, practice and place, helping many students to overcome blocks and anxieties.
Finally, assignments that focus on what a piece of writing does in the world - for its readers and ‘users’ as well as its producers – seem to me to create a powerful kind of accountability. One approach is to ask students to write for different readers, ideally real ones. This leads on inevitably to questions of purpose. How does this writing try to organise the attention and intention of other human beings? What does it ask its readers to think, feel and do, beyond asking a tutor for a grade? Writing can also be focused on real-world changes that students are invested in. LLMs have no data in their training set about the challenges local communities face, no user needs or real-world evidence gathered by students themselves. Such evidence might address concerns that arise from students’ own lived experience, or the context of the classroom or local area. By drawing writing out of a situation and a need for change, these assignments support accountability beyond the ordering of words.
For students to be accountable for what they write, we must also create conditions in which they feel safe to take up different positions and to try new ways of communicating - which means responding to the many reasons they find this difficult. These reasons are not the same for all students. They are inseparable from the racialised and gendered bodies they bring to the act of writing, their histories of speaking and not speaking, writing and not writing, being and not being heard. As differently-bodied human beings we may struggle to inhabit these difficulties, but we can be present and accountable with our own vulnerable writing bodies. We can share with students the uncertainties of writing, either as curious writer-researchers who don’t have all the answers, or as actual collaborators, writing alongside students and helping them to find a readership - something progressive educators have advocated for decades.
All forms of accountable writing take time, attention and care, and not only from student writers. It is amazing how rarely this is commented on by the AI advocates who have become our go-to experts in assessment (though sometimes it is noted in passing that ‘these assessments may be more time consuming’). Those who see the AI revolution as a bracing antidote to ‘lazy’ academic habits such as ‘setting essays’ seem not to have written or given feedback on an essay. Even if it was the only kind of student writing ever assigned (it isn’t) there are many genres of essay – the personal statement, extended argument, research report, case study – and writing any of them well is demanding. Supporting them is demanding too. Tutors may have to negotiate topics, help students plan their work, suggest different approaches, give feedback on work in progress, recommend other writing as reading, and finally grade a long piece of work with only a brief heuristic for guidance. If extended writing gives students the space to develop their own interests and ideas, to express their own understanding and response, there are no off the peg solutions to teaching or grading it. The problem is not academic ‘laziness’ but the fact that student writing takes so much time and attention.
When curriculum teams are under pressure to produce courses that can easily be ‘scaled’, no wonder teachers as well as students look to technology to take some of the strain. The problem is not that the technology exists or that its users are lazy or unethical but that learning (and teaching) are developmental activities. They take time, attention, intention, and other resources. Given these resources, if we set assignments that centre accountability, we can tell students honestly that AI is of limited use to them. We can shine a light on the arcane tools of academic writing - referencing, acknowledging diverse viewpoints, citing evidence, using established methods and terms, and yes, even signing up to integrity policies - as ways of being accountable. We can have a better conversation about what writing is for.
Using the tools critically
An alternative approach to making LLMs less relevant is looking critically at what they are doing, how, and why. For a long time I’ve advocated for critical digital literacies - developing curious, questioning and creative approaches to the technologies we are offered as users. Educational responses to the current AI surge should, I think, come from this space, drawing on the long history of thoughtful responses to technology, rather than seeing this one as too advanced for human criticism, too protean, too disruptive, too inevitable. But I have some unease about the kinds of critique that are often suggested.
The current generation of LLMs can certainly be criticised. They get facts wrong, invent academic references, defame real people, show biases, and swerve from insults to abject apologies in the space of a few lines. Google’s LLM, Bard, claimed in its launch demo that the James Webb telescope “took the very first pictures of a planet outside of our own solar system”, and when someone pointed out that Bard was wrong, Google’s parent company, Alphabet, lost around $100bn in market value. Considering that we all use Google to check this kind of thing, you have to wonder how fact-checking AI is going to work when it is fully integrated into Google search. But for now, yes, we should keep checking.
Checking is what employees are being encouraged to do as they integrate generative language models into their workflows. And students should of course do the same. But language models are improving fast, thanks to the billions of dollars and thousands of human minds being put to work on them. GPT-4 is qualitatively better than its predecessors when it comes to reasoning, for example. Bing connects GPT-4 to the live internet, improving its accuracy and currency. Do we really think generative models won’t be adding genuine references soon, based on keyword searches? Crib sheets to help students spot the flaws in AI writing will be out of date before each semester is over.
And what is the endpoint of teaching human beings to be great AI fact-checkers? The Head of Assessment at the International Baccalaureate suggests that:
“When AI can essentially write an essay at the touch of a button, we need our pupils to master different skills, such as understanding if the essay is any good or if it has missed context, has used biased data or if it is lacking in creativity. These will be far more important skills than writing an essay, so the assessment tasks we set will need to reflect this.”
I see how student writers can benefit from asking ‘if an essay is any good’. (Personally, I think peer review is more helpful than hitting that GPT button. Peers are likely to make the same kind of errors and be working in the same ‘zone’ of development as students themselves.) But if we are not interested in improving students as writers at all, only in the ‘different skill’ of evaluation, students will soon be in the same position as the fact-checking Google users who can’t escape Google’s version of the ‘facts’. How are students to know good writing if they don’t write, or only write with AI tools in support (something the IB embraces)? How will they appreciate context if they don’t practise writing from a context of their own? How will they know what ‘bias’ feels like and where it comes from? How will they recognise ‘creativity’ and its absence when every corner of the internet is littered with LLM writing like plastic particles in the depths of the ocean? What do they think is the point of writing that a piece of AI writing may have ‘missed’?
Critical thinking doesn’t arise in empty space. It needs material to be critical with and about. It needs context and purpose and values to support its assessments. Good writers are usually good critics of writing, but that does not mean criticism in itself develops good writing. Practice matters.
The kind of critical thinking we ask for also matters. If we ask students only to assess how the generative AI project is working on its own terms, as users of its product, we accept those terms and may even help to improve the user experience. As Autumm Caines noted in a recent blog post, OpenAI:
‘… made ChatGPT available as a research preview to learn from real-world use. The spokesperson called that step a “critical part of developing and deploying capable, safe AI systems. We are constantly incorporating feedback and lessons learned”.’ (My italics.)
Every time we spin that GPT wheel, we are helping to improve and embed it, train new users and provide new use cases. A critical approach would surely invite students to assess the project on terms other than its own. What is the underlying technology here, and is it doing what it seems, or claims, to be doing? What is the business model? Who is profiting, and who is being exploited? What are the risks and how might we mitigate them? What could happen to human writing, thinking, and intellectual work in different scenarios of widespread use? What are the risks to different kinds of human being? Because writing and language are not side effects of a generic hive mind but come from embodied people of different genders, races, nationalities and vulnerabilities, whose access to AI tools and benefits is massively unequal. What alternative futures can we imagine? These critical questions could lead to some valuable thinking and learning that will stay relevant, whatever improvements may be made to the latest LLM applications.
The space of critical digital literacy is not an easy one to engage students with. To be meaningful, it probably has to come from inside the subjects they have chosen, and it helps if we have already established that questions about how we know, and what technology has to do with it, are an interesting and relevant part of their university experience. I have for some years been collecting practical ideas for raising these questions that don’t need anyone to use the word ‘epistemology’. But I do not suggest it is easy.
Using the tools creatively
What about the idea that LLMs can teach students to write (better)? Perhaps by offering sample responses, or encouraging students to write and refine their own prompts? Mike Sharples is one of the prominent thinkers who has argued for this approach, and he has valuable experience of using LLMs in creative ways.
In my own much briefer experience, I find that GPT does encourage a kind of playfulness in writing prompts for writing (‘prompt engineering’ as it will inevitably become known). Perhaps because there are no real students on the other end to react to (what they might perceive as) a lack of relevance and seriousness in my requests. Instead there is a willing slave/student turning out whatever I have imagined, instantly to order. There is something slightly compulsive about this, an idea I may visit in another blog post. But back at the chalk face, it is tough devising assignments that are both relevant to student learning as laid down in the course rubric and also engaging, compelling and original. If ChatGPT can help us do this more creatively, that is a good thing. But there is a lot of hit and miss involved. It’s not obvious why prompts that generate passable content from an LLM should generate original responses from students. If we are going to ask students to devote their precious time to a project, I’d rather trust expert guidance based on years of teaching, such as this from Kay Sambell and Sally Brown, or read some case studies in authentic assessment, like these from the University of Liverpool, or just share ideas with colleagues and negotiate with students about what they want to write.
A problem I have with sending students to ChatGPT to improve their own writing is that it is terrible at the things we usually want students to improve. Everything, that is, that I explored under the rubric of accountable assignments: understanding purpose, voice, readership and genre; recognising how specialised language can lend credibility (and how it can do the opposite); responding to different kinds of argument; taking up positions, and reflecting on writing as a process. Students definitely benefit from different examples to get a grasp on these issues. But for all the reasons outlined, I think they should be human examples, so students see writing as a diverse, messy, inexact, variously motivated practice they are developing for themselves. Then perhaps they can aspire to be a writer among writers, and not a human version of ChatGPT. Finding examples of writing is time consuming (I still have files of them collected over years). I welcome short cuts and borrowings. I just don’t think this short cut is a good one.
Research on the use of GPT-type tools to improve student writing is still in its early days. Even so, results such as this pre-print on the outcomes of student ChatGPT engagement, and more general findings about the difficulties non-experts have in devising prompts for LLMs, suggest there may not be as much benefit as is hyped or hoped for. Especially when you consider that time spent playing with these interfaces – compelling as they are – has an opportunity cost. There are other things students might be doing to improve their writing.
When I’ve tried ChatGPT and Bing as prompts for my own creative process, I’ve found them little better than random poetry generators and less useful than writing exercises. I’m reminded of the Oulipo movement of the mid-20th century that developed playful and arbitrary rules such as counting syllables, or not using the letter ‘e’ (especially tough in French!). Oulipo was founded by a writer (Queneau) and a mathematician (Le Lionnais), so the idea of using algorithmic rules to elicit writing is not new. It was closely associated with the surrealist movement, which André Breton defined as ‘pure psychic automatism’, so the idea of using automated input to unleash the unconscious in writing is not new either. (The idea of all that text swirling around inside large language models as a kind of writerly unconscious is an interesting one!) But as these examples show, GPT interfaces are not necessary for students to share innovative, creative and alternative prompts for writing. My teenager delights in using predictive text to generate random messages and making up scenarios to suit them. Any tool can be used creatively in the hands of creative educators and learners.
But tools are not neutral. Just as language is not ‘simply’ the words we use to express our meanings to other people, tools are not ‘simply’ the means we use for exercising our personal intentions in the world. Tools carry the history of how they were designed and made. They shape practices and contexts and possible futures. Oulipo and Surrealist tools came from a self-selecting avant-garde, but they were intentionally open, public, playful and free. LLM applications come from the most powerful corporations on the planet. With so many other tools we can use creatively, we must surely weigh the risks against the creative possibilities.