OpenAI is testing whether an earlier, weaker version of its GPT models can supervise the current one, part of an effort to guard against potential threats from superintelligent AI. (Source: OpenAI)

OpenAI’s ‘Superalignment Team’ Develops a Technique to Help Humans Prevent Threats from AI

According to a story on MIT’s technologyreview.com, OpenAI has introduced its superalignment team, which is dedicated to preventing a hypothetical superintelligence from becoming a threat. The team recently published a research paper describing a technique that lets a less powerful language model supervise a more powerful one, potentially paving the way for humans to oversee superhuman machines. Despite recent controversies, OpenAI remains focused on advancing AI and expects the field’s rapid progress to continue. The team anticipates models that match, and eventually exceed, human abilities, and it recognizes the oversight challenges such models would raise.

The team aims to address how to align superhuman models, ensuring they behave as desired. One common technique is reinforcement learning from human feedback (RLHF), in which human testers score a model’s responses so that it is trained toward desirable behavior. This approach assumes, however, that humans can reliably judge which behavior is desirable, an assumption that may break down once models exceed human abilities.
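
As a rough illustration, the reward-modeling step at the heart of RLHF can be sketched in a few lines of PyTorch. The toy version below assumes responses have already been encoded as fixed-size vectors and that human testers have supplied pairs of (preferred, rejected) responses; the model, dimensions, and data here are illustrative stand-ins, not OpenAI’s actual setup.

```python
# Minimal sketch of RLHF reward modeling on toy data.
# Assumes responses arrive as fixed-size embeddings and humans have
# labeled which of each pair they prefer. Purely illustrative.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores a response embedding; a higher score means 'more desirable'."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(model, preferred, rejected):
    # Bradley-Terry-style objective: push the preferred response's
    # score above the rejected response's score.
    return -torch.log(torch.sigmoid(model(preferred) - model(rejected))).mean()

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for human-labeled comparison data.
preferred = torch.randn(32, 128)
rejected = torch.randn(32, 128)

for step in range(100):
    opt.zero_grad()
    loss = preference_loss(model, preferred, rejected)
    loss.backward()
    opt.step()
```

In a full RLHF pipeline, the scored responses come from the language model itself, and the trained reward model is then used as the feedback signal for fine-tuning it with reinforcement learning. The key point for superalignment is that the quality of this signal is capped by what the human raters can actually judge.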

To study the problem, the team used GPT-2, a model OpenAI released in 2019, to supervise GPT-4, its latest and most powerful model. GPT-2 was trained on various tasks, and its answers were then used as the training signal for GPT-4. The results were mixed: the GPT-2-supervised GPT-4 outperformed its weak teacher on language tasks but did less well on chess puzzles. The approach shows promise, but further work is needed to improve its effectiveness.
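
The weak-to-strong setup itself can be sketched just as compactly. The toy experiment below is a simplified analogue, not a reproduction of the paper’s method: a small “weak” classifier is trained on ground-truth labels, its imperfect predictions become the only supervision for a much larger “strong” classifier, and the question is whether the student generalizes beyond its teacher. The task, model sizes, and training loop are all invented for illustration.

```python
# Toy weak-to-strong supervision experiment. All names and sizes are
# illustrative stand-ins, not the models or tasks from OpenAI's paper.
import torch
import torch.nn as nn

def make_mlp(width: int) -> nn.Module:
    # Tiny classifier; `width` stands in for model capacity.
    return nn.Sequential(nn.Linear(16, width), nn.ReLU(), nn.Linear(width, 2))

def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor, steps: int = 300):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> float:
    with torch.no_grad():
        return (model(x).argmax(dim=-1) == y).float().mean().item()

# Toy nonlinear task the weak model can only partially solve.
def labels(x: torch.Tensor) -> torch.Tensor:
    return (x[:, :8].norm(dim=-1) > x[:, 8:].norm(dim=-1)).long()

x_train, x_test = torch.randn(2048, 16), torch.randn(512, 16)
y_train, y_test = labels(x_train), labels(x_test)

weak = make_mlp(2)              # stand-in for GPT-2: low capacity
train(weak, x_train, y_train)   # the weak supervisor sees ground truth

with torch.no_grad():
    weak_labels = weak(x_train).argmax(dim=-1)  # imperfect teacher labels

strong = make_mlp(256)               # stand-in for GPT-4: high capacity
train(strong, x_train, weak_labels)  # the student never sees ground truth

# The question the paper studies: does the strong student merely copy
# its supervisor's mistakes, or generalize beyond them on held-out data?
print(f"weak supervisor accuracy: {accuracy(weak, x_test, y_test):.2f}")
print(f"strong student accuracy:  {accuracy(strong, x_test, y_test):.2f}")
```

If the strong student scores above its weak supervisor on the held-out set, that is the weak-to-strong generalization the team is after; the mixed results on chess puzzles suggest this transfer does not come for free on every task.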

Thilo Hagendorff, an AI researcher at the University of Stuttgart, finds the idea intriguing but questions whether GPT-2 is a suitable teacher, given its tendency to produce nonsensical responses to complex tasks. He also raises concerns about aligning a superintelligence that may possess emergent abilities unknown to us. Nonetheless, he commends OpenAI for putting the problem to experimental test.

OpenAI wants others to join its superalignment effort and has announced a $10 million funding program to support research in the area. The program offers grants to university labs, nonprofits, and individual researchers, as well as fellowships for graduate students. OpenAI believes new researchers can make significant contributions to the field.

Overall, the superalignment team’s research represents a step toward overseeing superhuman models and aligning them with human goals. There are limitations and uncertainties, but the team’s efforts and its commitment to AI safety are commendable.

read more at technologyreview.com