OpenAI has introduced fine-tuning for GPT-3.5 Turbo, which can perform as well as GPT-4 on certain narrow tasks.

OpenAI, the company behind the widely popular ChatGPT AI chatbot, has finally introduced fine-tuning for GPT-3.5 Turbo. The American AI company said the update will enable developers to customise models that perform better for their use cases.

Aside from that, OpenAI said developers will be able to "run these custom models at scale." While it remains unclear whether GPT-3.5 Turbo will be able to surpass GPT-4, OpenAI believes the latest update will ensure that a fine-tuned GPT-3.5 Turbo can at least match GPT-4 "on certain narrow tasks."

In a blog post shared on its official website, OpenAI confirmed that fine-tuning for GPT-4 will arrive this fall. This is a big step forward, and it comes as the company's CEO Sam Altman recently admitted that Elon Musk's exit was tough on OpenAI.

Now, the company says customer data sent through the fine-tuning API is kept safe, assuring that the information will be used only for fine-tuning purposes. On top of that, OpenAI highlighted a myriad of fine-tuning use cases:

Improved steerability

Businesses can use fine-tuning to make the model follow instructions better, such as always responding in a given language or keeping outputs concise. For example, developers can use fine-tuning to make sure the model always responds in German when prompted in that language, as sketched below.
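To make that concrete, here is a minimal sketch of what steerability training data can look like, using the chat-style JSONL format OpenAI documents for GPT-3.5 Turbo fine-tuning. The file name and the examples themselves are illustrative, not taken from OpenAI.

```python
# Minimal sketch: building a JSONL training file that teaches the model
# to always answer in German. The examples and file name are illustrative.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You always reply in German."},
            {"role": "user", "content": "What is the capital of France?"},
            {"role": "assistant", "content": "Die Hauptstadt von Frankreich ist Paris."},
        ]
    },
    # ...more examples in the same chat format...
]

with open("steerability_train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```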

Reliable output formatting

Fine-tuning enhances the model's ability to format responses consistently. This is crucial for applications that require a specific response format, such as composing API calls or code completion. With fine-tuning, a developer can reliably convert user prompts into high-quality JSON snippets for use with their own systems, as in the sketch below.
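As a rough illustration, the snippet below uploads a training file and starts a fine-tuning job with the openai Python package (the 0.x series current at the time of the announcement), then calls the resulting custom model. The API key, file name, and fine-tuned model ID are placeholders.

```python
import openai

openai.api_key = "sk-..."  # placeholder; use your own key

# Upload the JSONL training file and start a fine-tuning job.
# (In practice, wait for the uploaded file to finish processing first.)
training_file = openai.File.create(
    file=open("formatting_train.jsonl", "rb"), purpose="fine-tune"
)
job = openai.FineTuningJob.create(
    training_file=training_file.id, model="gpt-3.5-turbo"
)

# Once the job finishes, OpenAI returns a custom model ID; the one below
# is only a placeholder showing the "ft:gpt-3.5-turbo..." naming scheme.
response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:my-org::abc123",
    messages=[{"role": "user", "content": "Order two lamps to 10 Downing Street."}],
)
print(response["choices"][0]["message"]["content"])  # ideally clean JSON
```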

Custom tone

Fine-tuning can come in handy for improving the qualitative feel of the model's output, such as its tone. The new feature will also allow organisations to shorten their prompts without losing performance.

According to OpenAI, "Early testers have reduced prompt size by up to 90% by fine-tuning instructions into the model itself, speeding up each API call and cutting costs."

It is also worth noting that fine-tuning with GPT-3.5 Turbo can handle 4K tokens, double the context length of previous fine-tuned models. OpenAI says it is ideal to combine fine-tuning with techniques like function calling, information retrieval, or prompt engineering for best results.

Here are the costs for fine-tuning (a worked example follows the list):

  • Training: $0.008 (about £0.0064) / 1K tokens
  • Usage input: $0.012 (about £0.0096) / 1K tokens
  • Usage output: $0.016 (about £0.013) / 1K tokens
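As a back-of-the-envelope illustration of those rates, the sketch below prices a hypothetical fine-tuning job and some hypothetical traffic; the token counts and epoch count are assumptions for the example, not OpenAI figures.

```python
# Rough cost sketch using the per-1K-token rates listed above.
# The token counts and traffic figures are hypothetical.
TRAIN = 0.008 / 1000    # $ per training token
INPUT = 0.012 / 1000    # $ per input token at inference
OUTPUT = 0.016 / 1000   # $ per output token at inference

training_tokens = 100_000   # tokens in the training file
epochs = 3                  # passes over the training data
print(f"Training cost: ${training_tokens * epochs * TRAIN:.2f}")  # $2.40

# 1,000 requests averaging 500 input and 200 output tokens each:
usage = 1000 * (500 * INPUT + 200 * OUTPUT)
print(f"Usage cost:    ${usage:.2f}")  # $9.20
```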

Meta's new AI-powered coding tool to compete with ChatGPT

It is no secret that OpenAI hasn't shifted its focus away from ChatGPT. In fact, the company is sparing no effort in a bid to improve the popular AI chatbot. In line with this, an earlier report indicated that OpenAI is gearing up to roll out new updates to make ChatGPT more useful.

The folks at ZDNet believe ChatGPT is, without an iota of doubt, the best original AI chatbot. However, there's no dearth of noteworthy alternatives. Now, Meta has announced its new LLM (large language model) dubbed Code Llama, which can generate and discuss code using text prompts.

"It has the potential to make workflows faster and more efficient for developers and lower the barrier to entry for developers and lower the barrier to entry for people who are learning to code," Meta said in a blog post.

How does Code Llama work?

According to Meta, Code Llama is a code-specialised version of Llama 2 that was developed by further training Llama 2 on code-specific datasets, giving it enhanced coding capabilities. Code Llama supports many of the popular programming languages used today, including PHP, C#, Bash, TypeScript (JavaScript), Java, C++, Python, and more.
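For readers who want to try it, here is a minimal sketch of running Code Llama locally via Hugging Face transformers. It assumes the 7B base checkpoint is published under codellama/CodeLlama-7b-hf, that you have accepted Meta's licence terms, and that a suitable GPU (plus the accelerate package for device_map) is available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "codellama/CodeLlama-7b-hf"  # assumed Hugging Face checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"
)

# Ask the model to complete a function from its signature and docstring.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```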

Code Llama comes in three sizes: 7B, 13B, and 34B parameters. Each model was trained on 500B tokens of code and code-related data, and the different sizes address different serving and latency requirements.