The World Values Benchmark: A Moral Compass for Large Language Models (under review)

Excerpt from Abstract: “A behaviour is a function or action that can be observed and can apply to both living and non-living systems. In humans, a moral behaviour is an action guided by underlying values and normative assumptions. In generative AI, a moral behaviour can be considered a generated output that exhibits alignment with some human values and norms. In the case of a machine, the underlying values are reflections of the human biases and perspectives embedded in the training data, or echoes of the perspectives of human annotators used for fine-tuning. How appropriate a model and its generated outputs are for use in a human context is the central question behind many of the evaluation processes designed to measure the biased behaviours of LLMs. Yet, evaluation methods of AI that attempt to measure value-alignments of the outputs usually hold some underlying normative beliefs embedded into the process itself; whether by the chosen metric of success or the test data used to provide the measuring stick. . . The World Values Benchmark (WV-Bench) is a measurement tool that uses existing human data from the World Values Survey (WVS) organization to describe a model’s behaviour in relation to the human behaviour captured in the WVS dataset. The WV-Bench provides information on a model’s alignment to dominant values of various countries as reported by the WVS rather than providing a metric of how ‘right’ or ‘wrong’ the model is. We tested and developed this tool on Google’s PaLM model in November 2022 and believe it represents an alternative approach to most LM evaluations.”

The Ghost in the Machine has an American Accent: value conflict in GPT-3 (preprint)

Excerpt from Abstract: “The alignment problem in the context of large language models must consider the plurality of human values in our world. Whilst there are many resonant and overlapping values amongst the world’s cultures, there are also many conflicting, yet equally valid, values. It is important to observe which cultural values a model exhibits, particularly when there is a value conflict between input prompts and generated outputs. We discuss how the co-creation of language and cultural value impacts large language models (LLMs). We explore the constitution of the training data for GPT-3 and compare that to the world’s language and internet access demographics, as well as to reported statistical profiles of dominant values in some Nation-states . . . We observed when values embedded in the input text were mutated in the generated outputs and noted when these conflicting values were more aligned with reported dominant US values.” The work has been cited 10 times.

The Quiet Crisis of PhDs and COVID-19: Reaching the Financial Tipping Point.

Excerpt from Abstract: “Before the COVID-19 crisis, existing high levels of financial concerns amongst PhD students increased their vulnerability to disruptive events. Impacts from the pandemic have increased their financial stress to the point that may result in many being forced to exit research studies . . . We found that 75% of students expect to experience financial hardship as a result of the pandemic. Consequently, 45% report being pushed beyond their financial capacities and expect to be forced to disengage from their research within six months.” The work attracted international media coverage, including articles in Nature, The Guardian, and The Conversation. It has been viewed more than 7,000 times and cited 35 times. The full impact of the work is described here.

Partnership and Power. Student Participation in Governance Decision-making Improves Perceptions of Agentic Power.

Excerpt from Abstract: “Developing a strong sense of agentic power is vital for university graduates to succeed in an uncertain future of work and as active citizens of society.  People feel more empowered when they co-construct their environment in partnership with leaders of their organization, which can help foster perceptions of agentic power . . . Results indicated that intervention with the mechanism caused positive shifts in the students’ perceptions of their agentic power (their ability to effect change in their structures).  Before the intervention, 32% of respondents felt involved in the creation of their environment, opposed to 85% after engagement.”