What is a language model, and what do language models look like when you encounter them online?
- In class:
- Explore: Sea and Spar Between
- Explore: AI Dungeon
What do language models look like right now and what should humanities scholars do with and/or about them?
-
Read:
- Kevin Roose, “The Brilliance and Weirdness of ChatGPT,” The New York Times, December 5, 2022.
- “The Gray Area” with Sean Illing, interview with Timnit Gebru (podcast)
- Emily M. Bender and Timnit Gebru et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (March 2021): 610-623.
- Marika Cifor, Patricia Garcia, et al., “The Feminist Data Manifest-No” (2019)
- Plato, from the Phaedrus, 247c-275b (Canvas).
- Jacques Derrida, “Plato’s Pharmacy,” part 1, from Dissemination, trans. Barbara Johnson (Univ. of Chicago Press, 1981) (Canvas) ← just get as far through / in as you can; we’ll work through this in class as well
-
Explore: ChatGPT
-
Optional:
- Abeba Birhane and Deborah Raji, “ChatGPT, Galactica, and the Progress Trap,” WIRED, December 9, 2022.
- Elisabeth Pain, “How to (Seriously) Read a Scientific Paper,” Science Careers, March 21, 2016.
What is the history of language generation in a literary context and what are some examples?
-
Read:
- Jessica Pressman, "Electronic literature as comparative literature," Futures of Comparative Literature (Routledge, 2017): 248-257.
- Theodor H. Nelson, “Chapter 0: Hyperworld” Literary Machines (1981).
- Shan Carter and Michael Nielsen. "Using artificial intelligence to augment human intelligence" Distill 2.12 (2017): e9.
- Daniel C. Howe and A. Braxton Soderman. "The aesthetics of generative literature: lessons from a digital writing workshop" Hyperrhiz Journal of New Media Cultures (2009).
- David Jhave Johnston, "ReRites: Machine Learning Poetry Edited by a Human," Glia (2019).
- Lai-Tze Fan, “Symbiotic Authorship: A Comparative Textual Criticism of AI-Generated and Human-Edited Poetry.” ReRites: Responses, edited by Stephanie Strickland, Anteism, 2019, pp. 57-64. (Canvas)
-
In class:
- Lab: Computationally-generated poetry (and other forms of literary text)
What does computer-generated literature look like right now, what are the models/tools used to make it, and how should literary scholars analyze it?
-
Read:
- Mark Riedl et al., “An Introduction to AI Story Generation,” Medium
- Some GPT-generated text:
- Pamela Mishkin. “Love and AI” (2021)
- James Yu, “Singular”
- Josh Dzieza, “The Great Fiction of AI”
- Pick one story from the Wordcraft Writers Workshop
- Jill Walker Rettberg, "Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis," Big Data & Society 9.2 (2022): 20539517221131290.
- Daphne Ippolito et al. "Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers" arXiv preprint arXiv:2211.05030 (2022).
-
Optional:
- Thomas Winters and Pieter Delobelle. "Survival of the Wittiest: Evolving Satire with Language Models" ICCC (2021).
- Eric Nichols, Leo Gao, and Randy Gomez. "Collaborative storytelling with large-scale neural language models" Motion, Interaction and Games (2020): 1-10.
-
In class:
- Lab: Text generation with the HuggingFace transformers library and BERT
What is the text used to train large language models and how has it been analyzed? How can literary scholars contribute to its analysis and/or critique?
- Read:
- Suchin Gururangan et al., “Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection,” arXiv preprint arXiv:2201.10474 (2022) https://arxiv.org/abs/2201.10474
- Eun Seo Jo and Timnit Gebru, “Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning,” Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (2020): 306-316
- Hannah Rose Kirk, Abeba Birhane, et al., “Handling and Presenting Harmful Text in NLP Research,” arXiv preprint arXiv:2204.14256 (2022) https://arxiv.org/abs/2204.14256
- Jessica Marie Johnson, “Markup Bodies: Black [Life] Studies and Slavery [Death] Studies at the Digital Crossroads,” Social Text 36.4 (December 2018).
- Saidiya Hartman, “Venus in Two Acts,” Small Axe 26 (June 2008).
What is the text that is generated by LLMs and can we use literature to analyze it? What are some other ways to make use of the text generated by LLMs and are they ethical? Can we look a little more under the hood to understand how LLMs make their predictions?
-
Read:
- Li Lucy and David Bamman, “Gender and Representation Bias in GPT-3 Generated Stories,” Proceedings of the Third Workshop on Narrative Understanding (2021): 48-55. Association for Computational Linguistics.
- Joon Sung Park et al., “Social Simulacra: Creating Populated Prototypes for Social Computing Systems,” Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (October 2022): 1-18.
- Brandon Rohrer, “Transformers from Scratch” ← spend as much or as little time on this as you’d like
-
Explore: Matt Wilkens et al., BERT for Humanists ← focus on textual apparatus; we will work through some of the tutorials in class this week and next
-
Optional:
- Ashish Vaswani et al., “Attention is All You Need,” Proceedings of the 31st Annual Conference on Neural Information Processing Systems (December 2017): 6000-6010.
- Tom Brown, Benjamin Mann, et al., “Language Models are Few-Shot Learners,” Proceedings of the 34th Annual Conference on Neural Information Processing Systems (December 2020): 1877-1901
- Anna Rogers, Olga Kovaleva, and Anna Rumshisky, “A Primer in BERTology: What We Know about How BERT Works,” Transactions of the Association for Computational Linguistics (2020) 8: 842–866.
- Enrique Manjavacas and Lauren Fonteyn, “Adapting vs. Pre-training Language Models for Historical Languages,” Journal of Data Mining and Digital Humanities (2022)
-
In class:
- Lab: More with the HuggingFace transformers library and BERT
What does literary scholarship that makes use of large language models look like right now? What might it look like in the future?
-
Read:
- Ted Underwood, “Do Humanists Need BERT?” The Stone and the Shell, July 15, 2019
- Rabea Kleymann, Andreas Niekler, and Manuel Burghardt, “Conceptual Forays: A Corpus-Based Study of ‘Theory’ in Digital Humanities Journals,” Journal of Cultural Analytics 7.4 (December 2022).
- M. Besher Massri, Inna Novalija, et al., “Harvesting Context and Mining Emotions Related to Olfactory Cultural Heritage,” Multimodal Technologies and Interaction 6.7 (2022)
- Hoyt Long, “Learning to Live with Machine Translation,” forthcoming in American Literary History (Canvas)
-
Optional:
- Jinbing Yang, Yann Ciaran Ryan, et al., “Detecting Sequential Genre Change in Eighteenth-Century Texts,” Computational Humanities 2022
- Dallas Card, Serina Chang, et al., “Computational analysis of 140 years of US political speeches reveals more positive but polarized framing of immigration,” PNAS 119.31 (2022)
- Li Lucy and David Bamman, “Characterizing Language Variation Across Social Media Communities with BERT,” Transactions of the Association for Computational Linguistics (2021) 9: 538–556.
- Margherita Parigini and Mike Kestemont, “The Roots of Doubt: Fine-Tuning a BERT Model to Explore a Stylistic Phenomenon,” Computational Humanities 2022
-
In class:
- Lab: Using BERT to detect word similarity
2/28 - Guest lecture by Jacob Eisenstein
-
Due: Final project proposal
-
Read:
- Bernd Bohnet, Vinh Q. Tran, et al., “Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models,” arXiv preprint arXiv:2212.08037 (2022). https://arxiv.org/abs/2212.08037
-
Optional:
- Jacob Eisenstein, Daniel Andor, et al. "Honest students from untrusted teachers: Learning an interpretable question-answering pipeline from a pretrained language model," arXiv preprint arXiv:2210.02498 (2022). https://arxiv.org/abs/2210.02498
- Sandeep Soni, Lauren Klein, and Jacob Eisenstein, “Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers,” Journal of Cultural Analytics (2021) (or just read this shorter version in Public Books)
How do broader critiques of capitalism, colonialism, and “capture” inform our understanding of LLMs, their uses, and their limits? Are the goals of computer science research and (digital) humanities scholarship compatible at all?
- Read:
- Matthew Hannah, “Toward a Political Economy of Digital Humanities,” forthcoming in Debates in the Digital Humanities 2023, ed. Matthew K. Gold and Lauren Klein (Univ. of Minnesota Press, 2023): 3-26. (Canvas).
- Meredith Whittaker, “The Steep Cost of Capture,” ACM Interactions 28.6 (November 2021): 50-55.
- Inioluwa Deborah Raji, Emily Bender, et al., “AI and the Everything in the Whole Wide World Benchmark,” Proceedings of the 35th Annual Conference on Neural Information Processing Systems (2021).
- Toma Tasovac and Natalia Ermolaev, “Parrots,” Startwords 3 (2022) and the three essays that issue includes:
- Ted Underwood, “Mapping the Latent Spaces of Culture”
- Gimena del Rio Riande, “On Spanish-Speaking Parrots”
- Lauren Klein, “Are Large Language Models Our Limit Case?”
- Olúfẹ́mi Táíwò, “Introduction,” in Elite Capture: How the Powerful Took Over Identity Politics (and everything else) (Haymarket Books, 2022) (Canvas)
- Read:
- TBD
Continued from 3/14: How do broader critiques of capitalism, colonialism, and “capture” inform our understanding of LLMs, their uses, and their limits? Are the goals of computer science research and (digital) humanities scholarship compatible at all?
-
Read:
- Nick Couldry and Ulises A. Mejias, “Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject,” Television & New Media 20.4 (2019).
- Karen Hao, “Artificial Intelligence is Creating a New World Order,” MIT Technology Review (2022) and the four essays that issue includes:
- Karen Hao and Heidi Swart, “South Africa’s private surveillance machine is fueling a digital apartheid”
- Karen Hao and Andrea Paola Hernández, “How the AI industry profits from catastrophe”
- Karen Hao and Nadine Freischlad, “The gig workers fighting back against the algorithms”
- Karen Hao, “A new vision of artificial intelligence for the people”
- Jason Lewis, Noelani Arista, et al., “Making Kin with the Machines,” Journal of Design and Science 4.3 (2018).
- Sabelo Mhlambi, “From Rationality to Relationality: Ubuntu as an Ethical & Human Rights Framework for Artificial Intelligence Governance” (2020)
-
Explore:
- Indigenous Protocol and Artificial Intelligence Working Group
- Malavika Jayaram, Aarathi Krishnan, et al., “AI Decolonial Manyfesto”
-
Explore (suspiciously!):
-
Optional:
- Paola Ricaurte, “Data Epistemologies, The Coloniality of Power, and Resistance,” Television & New Media 20.4 (2019)
- Abeba Birhane, “Algorithmic Colonization of Africa,” SCRIPTed 17.2 (August 2020)
Topics/readings to come from class