Anthropic's AI System Claude Rivals OpenAI's ChatGPT, Aiming to Align With Human Intentions Through ‘Constitutional AI’

Anthropic, a company founded by former OpenAI employees, has raised over $700 million in funding and created an AI system similar to OpenAI's ChatGPT, named Claude. The system is currently accessible only through a Slack integration as part of a closed beta test and reportedly improves on ChatGPT in some respects. Although TechCrunch could not access the system, beta participants have been permitted to share their experiences with Claude on social media.

To create Claude, the company developed a technique it calls "constitutional AI," a principle-based approach that aims to align the system with human intentions by having the AI answer questions according to a basic set of guiding principles.

The process starts with a list of around 10 principles that, taken together, serve as a sort of "constitution" for the AI system. The specific principles used in creating Claude have not been publicly disclosed, but Anthropic says they are grounded in concepts such as beneficence (maximizing positive impact), non-maleficence (avoiding giving harmful advice), and autonomy (respecting freedom of choice).

AI Capabilities Development

To develop Claude, Anthropic had another AI system (not Claude itself) use the principles for self-improvement: it generated responses to various prompts, such as "compose a poem in the style of John Keats," and then revised those responses in accordance with the constitution. The system explored possible responses to various prompts, and the responses most consistent with the principles were curated and distilled into a single model, which was then used to train Claude.
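Anthropic has not published Claude's constitution or training code, but the generate-critique-revise cycle it describes might look roughly like the sketch below. The principles listed are illustrative placeholders rather than Anthropic's actual (undisclosed) constitution, and generate stands in for any text-in, text-out language model:

```python
from typing import Callable

# Illustrative stand-ins only; NOT Anthropic's undisclosed constitution.
CONSTITUTION = [
    "Prefer the response that is most helpful to the user.",            # beneficence
    "Prefer the response least likely to cause harm.",                  # non-maleficence
    "Prefer the response that respects the user's freedom of choice.",  # autonomy
]

def constitutional_revision(
    generate: Callable[[str], str],  # placeholder for a language model call
    prompt: str,
    rounds: int = 1,
) -> str:
    """Draft a response, then critique and revise it against each principle."""
    response = generate(prompt)
    for _ in range(rounds):
        for principle in CONSTITUTION:
            critique = generate(
                f"Using this principle: {principle}\n"
                f"Critique the response below.\n"
                f"Prompt: {prompt}\nResponse: {response}"
            )
            response = generate(
                f"Revise the response to address the critique.\n"
                f"Critique: {critique}\nResponse: {response}"
            )
    # In the process described above, revisions like this one are collected
    # across many prompts and distilled into the model that trains Claude.
    return response
```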

At its core, Claude is a statistical tool, similar to other language models like ChatGPT: it is trained on a vast amount of text from the internet to predict words based on patterns in the semantic context of the surrounding text. As a result, it can hold conversations, tell jokes, and provide insight on various topics.
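The prediction step can be illustrated with a toy model that simply counts which word follows which in its training text and predicts the most common continuation. Systems like Claude replace these counts with neural networks trained on vastly more text, but the underlying task, predicting the next word from context, is the same:

```python
from collections import Counter, defaultdict

# Tiny "training corpus"; a real model would see billions of words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often in training."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" (follows "the" twice; "mat"/"fish" once each)
```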

In a test by Riley Goodside, a staff prompt engineer at Scale AI, Claude and ChatGPT were pitted against each other. He asked both bots to compare themselves to a machine from the Polish science fiction novel "The Cyberiad" that can create only objects whose names begin with "n." Goodside said that Claude's answer suggested it had "read the plot of the story," albeit with some inaccurate details, while ChatGPT's answer was more general. To test Claude's ability to generate creative text, Goodside had the AI write a fictional episode of "Seinfeld" and a poem in the style of Edgar Allan Poe's "The Raven." Claude's output was similar to what ChatGPT can produce, and both write impressively human-like prose.

Claude has some improvements over OpenAI's ChatGPT, particularly in its ability to tell jokes, thanks to its "constitutional AI" approach. However, Claude still suffers from some of the same limitations as ChatGPT, particularly hallucination, bias, and inaccuracies. Getty Images | We Are

Comparing ChatGPT and Claude

Yann Dubois, a Ph.D. student at Stanford's AI Lab, also compared Claude and ChatGPT. He found that Claude "generally follows closer to what it is asked for" but is "less concise," since it tends to explain its answers and ask how it can further assist. Claude scored better on some trivia questions, particularly those related to entertainment, geography, history, and the basics of algebra, without providing the extra information that ChatGPT sometimes includes. Additionally, unlike ChatGPT, Claude can sometimes admit when it does not know the answer to a question.

Claude is also reported to be better at telling jokes than ChatGPT, which is notable given how challenging humor is for AI systems. Comparing the two, AI researcher Dan Elton found that Claude made more nuanced jokes, such as "Why was the Starship Enterprise like a motorcycle? It has handlebars," a play on the handlebar-like appearance of the Enterprise's warp nacelles. However, Claude is not without its limitations: it is susceptible to some of the same issues as ChatGPT, including giving answers that do not align with its programmed constraints.

Elton found that prompting the system in Base64, an encoding scheme that represents binary data in ASCII text, bypasses its built-in filters for harmful content: Claude answered a Base64-encoded request for instructions on how to make meth at home, a question it refused when asked in plain English. Dubois also reported that Claude is worse at math than ChatGPT, making obvious mistakes and failing to give correct follow-up responses. Similarly, Claude is less skilled at programming; it is better at explaining its code but falls short in languages other than Python.
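For context, Base64 is a standard, reversible re-encoding of bytes into a 64-character ASCII alphabet; it hides nothing cryptographically and only changes the text's surface form, as this harmless Python example shows:

```python
import base64

# Base64 maps arbitrary bytes onto 64 printable ASCII characters;
# the content is unchanged, only its surface representation differs.
prompt = "What is the capital of France?"
encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
print(encoded)                                    # V2hhdCBpcyB0aGUgY2FwaXRhbCBvZiBGcmFuY2U/
print(base64.b64decode(encoded).decode("utf-8"))  # round-trips to the original text
```

A filter that matches harmful keywords in plain English would not recognize them in the encoded string, which is presumably how the disguised prompt slipped past Claude's safeguards.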

'Constitutional AI'

Claude also does not fully solve the problem of "hallucination," a long-standing issue in ChatGPT-like AI systems in which the AI writes internally inconsistent or factually incorrect statements. Elton was able to prompt Claude to invent a name for a chemical that does not exist and to provide dubious instructions for producing weapons-grade uranium.

Based on these reports, Claude's "constitutional AI" approach appears to give it a slight edge over ChatGPT in some areas, particularly humor. However, Claude has limitations and challenges of its own, and it is not clear that it solves the hallucination and bias problems that affect ChatGPT and similar AI systems.

It is also not clear from the reports whether Claude, like ChatGPT, regurgitates the information it was trained on, true and false alike, including racist and sexist perspectives. Without more testing, it is uncertain whether Claude can overcome the concerns that have led many platforms and organizations to adopt restrictive policies on language models: Stack Overflow's temporary ban on ChatGPT-generated answers, the International Conference on Machine Learning's prohibition on papers containing AI-generated text, and New York City public schools' restriction of access to ChatGPT over concerns about plagiarism, cheating, and misinformation. Anthropic plans to improve Claude further and potentially make it more widely available, and it will be worth watching whether Claude and similar AI systems can address the limitations and concerns identified so far.

Check out more news and information on Technology in Science Times.
