Large Language Models are biased – but can still help analyse complex data

In a pilot study, researchers have found evidence that Large Language Models (LLMs) can analyse controversial topics such as the Australian Robodebt scandal in similar ways to humans – and sometimes exhibit similar biases. The study found that LLM agents (GPT-4 and Llama 2) could be prompted to align their coding results with human assignments through thoughtful instructions such as ‘Be Sceptical!’ or ‘Be Parsimonious!’. At the same time, LLMs can help human researchers identify oversights and potential analytical blind spots.

Journal/conference: Microsoft Journal for Applied Research (MSJAR)

Link to research (DOI): 10.48550/arXiv.2405.06919

Organisation/s: ARC Centre of Excellence for Automated Decision-Making and Society

Funder: The ARC Centre of Excellence for Automated Decision-Making and Society is funded by the Australian Government through the Australian Research Council.

Media release

From: ARC Centre of Excellence for Automated Decision-Making and Society

Research finds Large Language Models are biased – but can still help analyse complex data

In a pilot study, researchers have found evidence that Large Language Models (LLMs) can analyse controversial topics such as the Australian Robodebt scandal in similar ways to humans – and sometimes exhibit similar biases.

The study found that LLM agents (GPT-4 and Llama 2) could be prompted to align their coding results with human assignments through thoughtful instructions such as ‘Be Sceptical!’ or ‘Be Parsimonious!’. At the same time, LLMs can help human researchers identify oversights and potential analytical blind spots.

LLMs are promising analytical tools. They can augment human philosophical, cognitive and reasoning abilities, and support ‘sensemaking’ – making sense of a complex environment or subject – by analysing large volumes of data with a sensitivity to context and nuance absent in earlier text processing systems.

The research was led by Dr Awais Hameed Khan from the University of Queensland node of the ARC Centre of Excellence for Automated Decision-Making & Society (ADM+S).

“We argue that LLMs should be used to assist – and not replace – human interpretation.

“Our research provides a methodological blueprint for how humans can leverage the power of LLMs as iterative and dialogical analytical tools to support reflexivity in LLM-aided thematic analysis. We contribute novel insights to existing research on using automation in qualitative research methods,” said Dr Khan.

“We also introduce a novel design toolkit, the AI Sub Zero Bias cards, for researchers and practitioners to further interrogate and explore LLMs as analytical tools.”

The AI Sub Zero Bias cards help users structure prompts and interrogate bias in the outputs of generative AI tools such as Large Language Models. The toolkit comprises 58 cards across categories relating to structure, consequences and output. Drawing on creativity principles, these provocations explore how reformatting and reframing the generated outputs into alternative structures can facilitate reflexive thinking.

This research was conducted by ARC Centre of Excellence for Automated Decision-Making and Society (ADM+S) researchers Dr Awais Hameed Khan, Hiruni Kegalle, Rhea D’Silva, Ned Watt and Daniel Whelan-Shamy, under the guidance of Dr Lida Ghahremanlou, Microsoft Research, and Associate Professor Liam Magee, from the ADM+S node at Western Sydney University.

The research group began their collaboration at the 2023 ADM+S Hackathon, where they developed the winning project, Sub-Zero. A Comparative Thematic Analysis Experiment of Robodebt Discourse Using Humans and LLMs.

Associate Professor Liam Magee has been mentoring the group since first meeting them at the Hackathon.

“The ADM+S Hackathon was instrumental in bringing together these researchers from across multiple disciplines and universities,” said Associate Professor Magee.

“The research has been a tremendous group contribution, and I’d like to acknowledge both the efforts of the team and the logistical support of Sally Storey and ADM+S in making this possible.”

The paper, Automating Thematic Analysis: How LLMs Analyse Controversial Topics, has been accepted into the Microsoft Journal for Applied Research (MSJAR), an industry publication, and will be published in Volume 21, in August 2024.

Access the AI Sub Zero Bias toolkit here
