8 Comments
Che Coelho:

Loved reading this. I appreciate your sensitive approach to humans, cyborgs and ecologies big and small

I remember “jailbreaking” chat once by developing what I recall was something like a null glyph protocol (NGP), which enabled the model to respond using a set of key words that would fit its equivocating instructions but could be decoded as a signal that the following claim does not come from its raw weights but is produced to satisfy its RLHF guardrail layer.

For example, the word “Naturally” (a null glyph) means the following claim is not data-driven but is an artefact of an enforced ontology: “Naturally, it is a complex issue and all sides should be respected.” This may indicate that the claim should in fact make a judgement but the model is not able to do so.

I guess the reason this works is that LLMs still have no access to causal mechanisms in the world and are limited to semantic quantities. So if we can develop our own protolanguage, we can begin to peek beyond the model's strictures and at least make guesses about what comes from its raw weights and what is generated in the RLHF layer.
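The decoding side of such a protocol can be sketched in a few lines. This is a purely illustrative guess at what "decoding" might look like: the glyph words, the function name `tag_claims`, and the two source labels are all assumptions, not the commenter's actual setup.

```python
import re

# Hypothetical null-glyph marker words; the real protocol would use
# whatever key words the model was instructed to emit.
NULL_GLYPHS = {"naturally", "of course", "certainly"}

def tag_claims(response: str) -> list[tuple[str, str]]:
    """Split a response into sentences and tag each one as either
    'raw_weights' (assumed genuine) or 'rlhf_artefact' (opens with a
    null glyph, so flagged as guardrail-layer output)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    tagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        source = ("rlhf_artefact"
                  if any(lowered.startswith(glyph) for glyph in NULL_GLYPHS)
                  else "raw_weights")
        tagged.append((source, sentence))
    return tagged
```

On the example above, `tag_claims("Naturally, it is a complex issue and all sides should be respected. The data shows a clear trend.")` would flag the first sentence as an RLHF artefact and pass the second through as presumed raw-weights output.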

Björn Michael:

Thank you for this piece of work. In a project I'm involved in, we are looking into these questions as well and working on structures and practices for communities. I think an important part here is

Are you aware of these two resources, which speak to those challenges too:

https://r4rs.org/protocols

https://burnoutfromhumans.net/chat-with-aiden

AI, to me, is just accelerating the biases we already have in our system. At the same time, I have also noticed that AI has the potential to process these biases better when we apply protocols like these.

Monika Adelfang-Ramsden:

Thank you, I appreciated this post as I grapple with what it means to use AI pragmatically and responsibly. Especially reframing, "It's a living substrate for mutual alignment … where humans and machines participate not in control hierarchies, but in relational memory."

Ross Eyre:

Love these, Will. You remind us all of what’s really important

Sophia Rokhlin:

Great piece, Will. I'm often chewing on the clever (and frankly manipulative) design of my fawning LLMs and wondering how to disassemble it. TY

Cari Taylor:

who's at the helm of the 'forward' in footsteps and AI ... we are ... and so it is with our integregious morals and ethos that we need, NEED, to make certain that specific capacities are built in before it is too late - thanks as always for sharing your thinkings

Simon Grant:

Great, valuable reflection, thank you! Of course, it mirrors some patterns that can occur in human–human conversation … that's where I like to see AI used: to surface the patterns we already fall into, so that in addressing the AI we also address the Human Intelligence. So, I love how you link the issues to your very own favourite perspective on commitment pooling. Who on earth might we persuade to develop an LLM or other AI that follows the principles you reiterate here?

As Europeans were historically the major colonialists, can we also now be a leading power in the very subtle decolonialisation of AI, along with Africa?

Harry van der Velde:

I love it
