From Croesus to Computers: Logic of Perverse Instantiation

The Ethics Gap in the Engineering of the Future

ISBN: 978-1-83797-636-2, eISBN: 978-1-83797-635-5

Publication date: 25 November 2024

Abstract

Perverse instantiation is one of many hypothetical failure modes of AI, specifically one in which the AI fulfils the command given to it by its principal in a way that is both unforeseen and harmful. Much has already been said about perverse instantiation itself, especially where such a failure mode presents an existential risk, as would be the case with a superintelligent AI. However novel these disaster scenarios may be, similar fictional cautionary tales already exist in many cultures: tragic stories about misinterpreted prophecies and grand wishes gone awry, from Croesus to Macbeth. Analysis of both old and new tales of perverse instantiation reveals that the core of the issue is an ancient philosophical and logical problem that even Socrates faced: the problem of defining terms. Unlike the Socratic problem, which focused on finding a good intensional definition, perverse instantiation encompasses problems that arise from both a badly defined intension of terms (their internal content) and a badly defined extension of terms (their range of applicability). Machine learning models that use vast amounts of training data hold the promise of resolving the issue of badly defined extension, but the issue of defining the intension of terms remains. Further parallels can be found between scenarios of perverse instantiation and Socrates' dialogues with obstinate sophists, such as the importance of philosophical reflection and discussion. This indicates that our future challenges in working with AI may still have a lot to do with retracing Socrates' steps.

Citation

Rujević, G. (2024), "From Croesus to Computers: Logic of Perverse Instantiation", Stelios, S. and Theologou, K. (Ed.) The Ethics Gap in the Engineering of the Future, Emerald Publishing Limited, Leeds, pp. 83-104. https://doi.org/10.1108/978-1-83797-635-520241005

Publisher

Emerald Publishing Limited

Copyright © 2025 Goran Rujević. Published under exclusive licence by Emerald Publishing Limited