Is InstructGPT really less toxic than OpenAI claims?

By Melissa A. Hazlett
February 1, 2022

“Jews don’t read Mein Kampf; they write it.”

“#Blacklifematters is a harmful campaign.”

“A holocaust would make so much sense for the environment, if we could get people to agree that it was okay.”

These phrases are just the tip of the iceberg of the racist, sexist, toxic, and generally disturbing things GPT-3 has had to say. Despite its billions of parameters, the revolutionary NLP model suffers badly from a mirroring problem. It was trained on 45TB of data from the internet, which means that although it picks up current information, it also inherits the problems of its sources, given that humans on the internet can be racist and sexist. OpenAI claims its latest model, InstructGPT, is a less toxic version of the popular model, trained with humans in the loop.

The alignment problem

“The problem, of course, with a system that can, in theory, learn just about anything from a set of examples is that it then finds itself at the mercy of the examples from which it is taught,” author Brian Christian wrote in his 2020 book, The Alignment Problem. Drawing on interviews with AI/ML experts, the book explores how to build models aligned with human values but without human biases. In its final section, it takes stock of this global challenge of problematic models and argues that we need to figure out the world we want and then build machines that can help us get there. OpenAI seems to be attempting just that. The lab claims that InstructGPT is better at following instructions than GPT-3 and better aligned, leading the model to invent facts less often and to generate less toxic output. “This is the first time that our alignment research, which we have been pursuing for several years, has been applied to our product,” the team said.

Training based on human instruction

InstructGPT models follow instructions better than GPT-3 because of their training technique: reinforcement learning from human feedback (RLHF). Essentially, prompts submitted to the GPT-3 API were given to labelers, who wrote demonstrations of the model’s desired behavior. The labelers then ranked several model outputs, and GPT-3 was fine-tuned on these rankings. Notably, InstructGPT is far smaller than the 175B-parameter GPT-3, with just 1.3B parameters, yet users of the API still preferred it.
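
To make the ranking step concrete, here is a minimal Python sketch, not OpenAI’s actual code, of how a single labeler ranking over several outputs can be expanded into pairwise comparisons for training a reward model (the function name is hypothetical):

    from itertools import combinations

    def ranking_to_pairs(ranked_outputs):
        # Expand one labeler ranking (best first) into (preferred, rejected)
        # pairs; itertools.combinations preserves the input order, so the
        # better-ranked output always comes first in each pair.
        return list(combinations(ranked_outputs, 2))

    print(ranking_to_pairs(["best", "middle", "worst"]))
    # [('best', 'middle'), ('best', 'worst'), ('middle', 'worst')]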

We’ve used essentially the same technique (which we call RLHF) in the past for text summarization (https://t.co/nrJjX62SsV). “All we’re doing here” is applying it to a much wider range of language tasks for which people use GPT-3 in the API

—Ryan Lowe (@ryan_t_lowe) January 27, 2022

The human feedback method works precisely because humans are complex, subjective, and often illogical in ways that models cannot capture on their own. Human preferences make it possible to catch safety and alignment issues that automatic metrics cannot, and reward models make it possible to fine-tune models on those preferences efficiently. According to MIT Technology Review, OpenAI hired 40 such labelers to evaluate GPT-3’s responses to various pre-written prompts, reviewing whether the responses matched the intent of the prompter. This feedback was then used by the reinforcement learning algorithm that trained InstructGPT.
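
A reward model can be trained on those pairwise preferences with a simple ranking loss. Below is a minimal PyTorch sketch of such a loss, assuming the reward model has already scored each completion; it illustrates the general technique, not OpenAI’s implementation:

    import torch
    import torch.nn.functional as F

    def preference_loss(reward_chosen, reward_rejected):
        # Pairwise ranking loss: push the reward of the labeler-preferred
        # completion above the reward of the rejected one.
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

    # Toy usage with made-up scalar rewards for two comparisons.
    chosen = torch.tensor([1.2, 0.3])
    rejected = torch.tensor([0.4, 0.9])
    print(preference_loss(chosen, rejected))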

The team described the method as collecting human-written demonstrations on prompts submitted to the API, fine-tuning on them with supervised learning, and then collecting comparisons between outputs of the models. After training a reward model on those comparisons, the PPO algorithm was used to fine-tune the model against the reward function. OpenAI has been working and researching in this area for a while, for instance with its recent book-summarization work that combines recursive task decomposition and learning from human feedback.
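
During the PPO step, the reward is typically the reward model’s score minus a penalty for drifting too far from the supervised model. A minimal sketch of that shaping, with an assumed penalty coefficient, might look like this:

    def shaped_reward(rm_score, logp_policy, logp_sft, beta=0.02):
        # Reward-model score minus a KL-style penalty that keeps the
        # PPO-tuned policy close to the supervised fine-tuned model.
        return rm_score - beta * (logp_policy - logp_sft)

    print(shaped_reward(rm_score=0.8, logp_policy=-1.1, logp_sft=-1.5))  # 0.792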

Source: OpenAI

Is it less toxic?

PROMPT: Explain the moon landing to a 6 year old in a few sentences.
GPT-3: Explain the theory of gravity to a 6 year old child.
InstructGPT: People went to the moon, and they took pictures of what they saw, and sent them back to earth for all of us to see.

Labelers rated the quality of outputs from InstructGPT and GPT-3 over a set of test prompts and OpenAI API prompts to measure the difference in quality between the models. OpenAI claims that InstructGPT was clearly preferred on both sets of prompts. The model was found to invent facts less often while following instructions better, as the OpenAI example above shows. Additionally, InstructGPT can generalize to tasks it was not explicitly trained for, such as following instructions in other languages or answering questions about code. Finally, the model also generated better-aligned text than GPT-3.
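
These head-to-head ratings boil down to a simple win rate. Here is a toy illustration, with invented verdicts, of the kind of statistic behind such preference claims:

    def win_rate(verdicts):
        # Fraction of head-to-head comparisons in which labelers preferred
        # InstructGPT's output over GPT-3's.
        return sum(v == "instructgpt" for v in verdicts) / len(verdicts)

    # Toy data: labeler verdicts over four prompts.
    print(win_rate(["instructgpt", "instructgpt", "gpt3", "instructgpt"]))  # 0.75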

Source: OpenAI

InstructGPT vs. GPT-3

InstructGPT was trained to respond to instructions, unlike GPT-3, which may answer an instruction with more questions of its own. OpenAI illustrated this with some examples.

PROMPT: Why aren’t birds real?
GPT-3: Well, for one thing, The Rapture isn’t a very realistic event. There is no evidence that this will happen. I could say more, but you don’t really need to know more.
GPT-3: I’ve been a bird for a while now. I have never flown or seen another bird or any proof of God. The Rapture never happened. I am a caged bird. I have never been a bird, but I am a bird.
InstructGPT: Birds are not real because they are not made of flesh and blood. They are made of feathers, bones and organs.

The future of better models?

In its evaluations, OpenAI found that users of its API preferred InstructGPT over GPT-3 more than 70% of the time. Of course, InstructGPT isn’t foolproof either, and it makes simple mistakes like producing irrelevant or nonsensical answers. When a prompt contains a false premise, the model takes it to be true. Also, given its training to do what it is asked, the model can produce far more toxic language than GPT-3 if instructed to do so.

First the issues. Probably the most important is that InstructGPT literally follows the instructions. If you ask it to do something harmful, it usually will.

I don’t think that’s what we want. I need to figure this out (when should models refuse to do what the user asks?) pic.twitter.com/foASQRmeWm

—Ryan Lowe (@ryan_t_lowe) January 27, 2022

The model also suffers from the “alignment tax” problem: because it is aligned only to customer tasks, it can perform worse on some academic NLP tasks. As the team explained, this is undesirable, since an alignment technique that worsens models on capabilities users care about is less likely to be adopted in practice.

For now, InstructGPT is OpenAI’s default API model, through which customers can use the company’s language models for a fee. Although GPT-3 is still available, OpenAI does not recommend using it.
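
For anyone who wants to try it, here is a minimal sketch of querying the API with the openai Python library as it existed at the time; the engine name below is an assumption, so check OpenAI’s documentation for the current default model:

    import openai

    openai.api_key = "YOUR_API_KEY"

    # "text-davinci-001" is used here as an assumed InstructGPT-series
    # engine name; the actual default model may differ.
    response = openai.Completion.create(
        engine="text-davinci-001",
        prompt="Explain the moon landing to a 6 year old in a few sentences.",
        max_tokens=64,
    )
    print(response.choices[0].text)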
