Weekly Dwell #11

This week, I've been in conversations about artificial intelligence alignment, and whether we can actually come to any conclusions about what alignment is. With many technologies, the answers are simple and can be stated directly and clearly. This is what draws a lot of people to computer science, engineering, and mathematics: the ability to tackle hard problems, perhaps find multiple solutions, but still see and explain the answer clearly, in contrast to the looser, more flexible disciplines of the humanities and social sciences. But AI has largely broken down that distinction between social science and computer science, bringing in nuance and subjectivity that further muddies the debate. Ethics, privacy, consent, and agency now get discussed alongside code and the actual mechanics of making the technology do what it is meant to do.

On the topic of alignment, there is essentially no consensus on what "alignment" even means, beyond vague references to bias, discrimination, hate speech, and the like. Agreeing on those is already subjective, and it becomes even more so when we try to align on general "values," which is a big sticking point for much of the AI ethics industry. Values are a huge talking point within the industry, and yet no one really agrees on what values mean in general, which values matter most, or what "aligning on X value" even looks like. As with every company, the "values" chosen will be up to the organization, and specifically its leaders, and will hopefully trickle down in one way or another. But as we've seen, that doesn't always happen, and it almost never happens in a way that ensures every single engineer shares those values or understands what it means for a given value to be represented in the technology.

Further still: even if everything were aligned in every possible way, we still don't have an explainable way to know what is happening inside most of these LLMs. So even if all the inputs are correctly "aligned" according to whatever is agreed upon, there is no way to predict or confirm that everything coming out of the LLM will actually follow those aligned values and inputs.

This realization that alignment can never really be achieved raises a further ethical issue, which brings up several others, and the whole thing spirals until it is really about how powerful and unruly these technologies are. I'm not necessarily fearful of them, but I do think it will be hard to harness them while still ensuring they do what we want to the best of their ability. My interest lies more in small, augmented language models, specifically ones built and trained on smaller datasets that can be controlled. I believe the future lies in segmenting these kinds of intelligence, rather than in any sort of superintelligence.
