The most impressive thing about OpenAI's natural language processing (NLP) model, GPT-3, is its size. With over 175 billion parameters (the weighted connections between tokens that the model learns), the transformer-based model blows its 1.5-billion-parameter predecessor, GPT-2, out of the water. That scale allows the model to generate surprisingly human text after being fed just a few examples of the task you want it to do.
Its 2020 release hit the headlines, with people scrambling to get on the waiting list for access to its API, hosted on OpenAI's cloud service. Now, months later, as more users (including me) have gained access to the API, interesting apps and use cases have popped up almost daily. For example, Debuild.co has some really cool demos where you can build an application by giving the program a few simple instructions in plain English.
Despite the hype, questions persist as to whether GPT-3 will be the foundation on which an ecosystem of NLP applications rests, or whether newer, stronger NLP models will dethrone it. As businesses begin to imagine and design NLP applications, here's what they should know about GPT-3 and its potential ecosystem.
GPT-3 and the NLP arms race
As I have described in the past, there are actually two approaches to pre-training an NLP model: generalized and non-generalized.
A non-generalized approach has specific pre-training goals aligned with a known use case. These models essentially go deep on a smaller, more focused dataset rather than broad on a massive one. An example is Google's PEGASUS model, which is purpose-built for text summarization. PEGASUS is pre-trained on a dataset that closely resembles its end goal, then fine-tuned on text summarization datasets to deliver state-of-the-art results. The advantage of the non-generalized approach is that it can dramatically increase accuracy on specific tasks. However, it is also far less flexible than a generalized model, and it still requires a lot of training examples before that accuracy kicks in.
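To make that concrete, here is a minimal sketch of the non-generalized route: loading a summarization-tuned PEGASUS checkpoint through the Hugging Face transformers library. The tooling and checkpoint name are my choices for illustration; nothing above prescribes them.

```python
# A minimal sketch of a task-specific (non-generalized) model: PEGASUS
# fine-tuned for abstractive summarization, loaded via Hugging Face
# transformers (tooling and checkpoint chosen here for illustration).
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "google/pegasus-xsum"  # a publicly released PEGASUS checkpoint
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

document = (
    "OpenAI's GPT-3 has 175 billion parameters, dwarfing the 1.5 billion "
    "of its predecessor GPT-2, and can perform many NLP tasks from just "
    "a few examples."
)
inputs = tokenizer(document, truncation=True, return_tensors="pt")

# Beam search tends to give more fluent summaries than greedy decoding.
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```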
A generalized approach, on the other hand, goes broad. This is GPT-3's 175 billion parameters at work, pre-trained on essentially the entire public internet. That lets GPT-3 perform virtually any NLP task with only a handful of examples, though its accuracy is not always ideal. In fact, the OpenAI team itself points out the limitations of generalized pre-training and even concedes that GPT-3 has "notable weaknesses in text synthesis."
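The generalized route looks quite different: no task-specific weights, just a prompt containing a handful of worked examples. Below is a minimal sketch against OpenAI's completions API as the original `openai` Python client exposed it; the translation examples in the prompt are invented.

```python
# A minimal sketch of few-shot prompting with GPT-3 via the legacy
# `openai` Python client. No weights change; the model infers the task
# (here, English-to-French translation) from the examples in the prompt.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "English: Where is the library?\nFrench: Où est la bibliothèque ?\n\n"
    "English: I would like a coffee.\nFrench: Je voudrais un café.\n\n"
    "English: The meeting starts at noon.\nFrench:"
)

response = openai.Completion.create(
    engine="davinci",      # the largest GPT-3 engine at the time
    prompt=prompt,
    max_tokens=40,
    temperature=0,         # deterministic output suits a translation task
    stop="\n",             # stop at the end of the translated line
)
print(response["choices"][0]["text"].strip())
```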
When it came to those accuracy issues, OpenAI decided the answer was to go bigger, with each version of the model increasing the parameter count by orders of magnitude. Competitors took note. Google researchers recently published a paper on Switch Transformer, an NLP model with 1.6 trillion parameters. That is a frankly ridiculous number, and it could mean an arms race is coming among generalized models. GPT-3 and Switch Transformer are by far the two largest, but Microsoft, whose Turing-NLG weighs in at 17 billion parameters, could be looking to join the race as well. Considering that it cost OpenAI nearly $12 million to train GPT-3, such an arms race could get expensive.
Promising GPT-3 applications
The flexibility of GPT-3 is what makes it attractive from an application-ecosystem perspective. You can use it to do just about anything you can imagine with language. As you might expect, startups have started exploring how to use GPT-3 to power the next generation of NLP applications. Here is a list of interesting GPT-3 products compiled by Alex Schmitt at Cherry Ventures.
Many of these applications are consumer-facing, such as a "love letter generator," but there are also more technical ones, such as an "HTML generator." As companies contemplate how and where to integrate GPT-3 into their business processes, some of the most promising early use cases are in healthcare, finance, and video meetings.
For healthcare, financial services, and insurance companies, streamlining research is a huge need. The data in these fields is growing exponentially, and staying on top of one's area is becoming impossible in the face of that surge. GPT-3-based NLP applications could scan the latest reports, papers, and findings and contextually summarize the key points, saving researchers time.
And as video meetings and telehealth have become increasingly important during the pandemic, demand has grown for NLP tools that can be applied to them. What GPT-3 offers is the ability not only to transcribe and take notes on an individual meeting, but also to generate "too long; didn't read" (TL;DR) summaries.
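To illustrate, the widely shared trick for this is simply to append a "tl;dr:" cue to the transcript and let the model complete it. The helper below is a hypothetical sketch using the same legacy `openai` client as above.

```python
# A hypothetical helper that turns a meeting transcript into a TL;DR
# summary by appending a cue and letting GPT-3 complete it.
import openai

def tldr(transcript: str) -> str:
    """Return a short GPT-3-generated summary of a meeting transcript."""
    response = openai.Completion.create(
        engine="davinci",
        prompt=transcript + "\n\ntl;dr:",  # the summarization cue
        max_tokens=80,
        temperature=0.3,  # a little variety, but stay close to the source
    )
    return response["choices"][0]["text"].strip()

print(tldr("Alice: Q3 revenue is up 12%. Bob: Churn is flat quarter over quarter."))
```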
How companies and startups can build a moat
Despite these promising use cases, the main inhibitor of a GPT-3 application ecosystem is the ease with which a copycat can replicate the performance of any application developed on the GPT-3 API.
Everyone who uses the GPT-3 API gets the same NLP model, pre-trained on the same data, so the only differentiator is the fine-tuning data an organization brings to specialize its use case. The more fine-tuning data you use, the more differentiated and sophisticated the output.
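For illustration, that fine-tuning data boils down to proprietary prompt/completion pairs. The JSONL layout below matches the format OpenAI's fine-tuning tooling uses, but the insurance-routing records are invented, and whether fine-tuning is available on GPT-3 for a given account is an assumption here.

```python
# A sketch of assembling proprietary fine-tuning data as JSONL
# prompt/completion pairs. The records are invented for illustration;
# this data, not the shared base model, is the differentiator.
import json

examples = [
    {"prompt": "Claim: hail damage to roof ->",
     "completion": " route to property claims, priority normal"},
    {"prompt": "Claim: vehicle reported stolen ->",
     "completion": " route to auto claims, flag for fraud review"},
]

with open("finetune_data.jsonl", "w") as f:
    for record in examples:
        f.write(json.dumps(record) + "\n")
```

The point is that an organization's accumulated, labeled examples are what a copycat cannot download.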
What does that mean? Larger organizations, with more users or more data than their competitors, will be best positioned to capitalize on GPT-3's promise. GPT-3 will not fuel disruptive startups; it will allow enterprises and large organizations to optimize their offerings on the strength of their incumbent advantage.
What does this mean for businesses and startups in the future?
Applications built on GPT-3's API are just starting to scratch the surface of possible use cases, so the ecosystem has yet to grow beyond interesting proofs of concept. How such an ecosystem would monetize and mature also remains open.
Since differentiation in this context requires fine-tuning, I would expect companies to embrace GPT-3's generality for some NLP tasks while sticking with non-generalized models such as PEGASUS for more specialized ones.
Additionally, as parameter counts grow exponentially among the big players in NLP, we might see users switch between ecosystems depending on which one is in the lead at the moment.
Regardless of whether a GPT-3 application ecosystem matures or another NLP model displaces it, companies should be excited by the relative ease with which it is now possible to create highly articulate NLP applications. They need to explore use cases and think about how to leverage their market position to quickly create added value for their customers and their own business processes.
Dattaraj Rao is Innovation and R&D Architect at Persistent Systems and author of the book Keras to Kubernetes: The Journey from a Machine Learning Model to Production. At Persistent Systems, he heads the AI research lab. He has 11 patents in machine learning and computer vision.