While visual no-code tools help businesses get more out of computing without needing armies of in-house technicians to configure software on behalf of other staff, access to the most powerful tech tools – deep tech at the AI coal face – still requires expert help (and/or costly in-house expertise).
That’s where the bootstrapping French startup NLPCloud.io comes in: it trades in MLOps/AIOps – or “compute platform as a service” (since queries are executed on its own servers) – with a focus, as the name suggests, on natural language processing (NLP).
Developments in artificial intelligence in recent years have produced impressive advances in NLP – a technology that lets companies scale their capacity to intelligently engage with all sorts of communications by automating tasks such as named entity recognition, sentiment analysis, text classification, summarization, question answering and part-of-speech tagging, freeing up (human) staff to focus on more complex/nuanced work. (It’s worth noting, though, that most NLP research has focused on the English language – which means the technology is at its most mature there, so the AI advances attached to it are not universally distributed.)
Ready-made (pre-trained) NLP models for English are available for immediate use, and there are dedicated open source frameworks that help with training models. But businesses looking to leverage NLP still need the DevOps resources and chops to implement NLP models.
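To make the shape of a task like named entity recognition concrete, here is a minimal sketch. Note the extractor below is a deliberately trivial rule-based stand-in written for illustration – a real deployment would call a pre-trained model from spaCy or Hugging Face, which handles far more than capitalization patterns:

```python
import re

def extract_entities(text: str) -> list[str]:
    """Toy named-entity extractor: grabs runs of two or more
    capitalized words. A stand-in for a real pre-trained model,
    showing only the input/output shape of the task."""
    pattern = r"\b(?:[A-Z][a-z]+\s)+[A-Z][a-z]+\b"
    return re.findall(pattern, text)

print(extract_entities(
    "Julien Salinas founded the startup in Grenoble, near the French Alps."
))
# prints ['Julien Salinas', 'French Alps']
```

The point of a pre-trained model is precisely that teams don’t have to hand-craft rules like this – but serving the real model in production is where the DevOps burden appears.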
NLPCloud.io is aimed at companies that don’t feel up to that implementation challenge themselves: it offers a “production-ready NLP API” with the promise that no DevOps is required.
The API is built on open source models from Hugging Face and spaCy. Customers can either use ready-made, pre-trained models (it picks the “best” open source models; it doesn’t build its own), or they can upload custom models developed in-house by their own data scientists – a point of difference from SaaS services such as Google Natural Language (which runs on Google’s ML models), Amazon Comprehend and Monkey Learn.
NLPCloud.io says it wants to democratize NLP by helping developers and data scientists deliver such projects “in no time and at a fair price”. (It has a tiered pricing model based on requests per minute, starting at $39 and rising to $1,199 at the enterprise end for a custom model running on a GPU. There is also a free tier that lets users try out models at a low request rate without being charged.)
“The idea came from the fact that, as a software developer, I saw many AI projects fail because of the deployment-to-production phase,” says sole founder and CTO Julien Salinas. “Organizations often focus on building accurate and fast AI models, but today more and more great open source models are available that do a great job. The biggest challenge now is using these models efficiently in production. It takes AI skills, DevOps skills, coding skills… that’s why it’s challenging for so many companies, and that’s why I decided to start NLPCloud.io.”
The platform launched in January 2021 and now has around 500 users, some 30 of whom are paying for the service. The startup, based in Grenoble in the French Alps, is currently a three-person team plus a few independent contractors. (Salinas says he plans to hire five people by the end of the year.)
“Most of our users are tech startups, but we also have a few larger companies,” he told Biomedarticles. “The biggest demand I see comes from software engineers and data scientists alike. Sometimes it’s from teams who have data science skills but don’t have (or don’t want to spend time on) DevOps skills. Sometimes it’s technology teams who want to start using NLP right away without hiring an entire data science team.”
“We have very different customers, from founders of startups to larger companies like BBVA, Mintel, Senuto … in all kinds of industries (banking, public relations, market research),” he adds.
Customer use cases include lead generation from unstructured text (such as web pages) via named entity extraction, and sorting support tickets by urgency by running sentiment analysis.
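The ticket-sorting use case reduces to: score each ticket, then sort by score. The sketch below uses a toy negative-word lexicon as a stand-in for a real sentiment model (a production pipeline would call a pre-trained model instead of counting words):

```python
import re

# Toy negative lexicon -- a stand-in for a real sentiment model.
NEGATIVE_WORDS = {"broken", "urgent", "angry", "refund", "crash"}

def urgency_score(ticket: str) -> int:
    """Count negative-lexicon hits; higher means more urgent."""
    words = re.findall(r"[a-z]+", ticket.lower())
    return sum(word in NEGATIVE_WORDS for word in words)

def sort_by_urgency(tickets: list[str]) -> list[str]:
    """Most urgent tickets first."""
    return sorted(tickets, key=urgency_score, reverse=True)

tickets = [
    "How do I export my data?",
    "App keeps crashing, this is urgent, I want a refund!",
    "The search page looks broken on mobile.",
]
print(sort_by_urgency(tickets))
```

Swapping the lexicon scorer for a hosted model call changes only `urgency_score`; the sorting logic stays the same, which is the appeal of treating sentiment analysis as an API.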
Content marketers are also using the platform for headline generation (via summarization), while text classification capabilities are being used for economic intelligence and financial data extraction, according to Salinas.
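As a sketch of the summarization-to-headline idea, here is a naive extractive approach written for illustration: pick the sentence whose words are most representative of the whole text. Real headline generation would use an abstractive model (e.g. a Transformer), not this frequency heuristic:

```python
import re
from collections import Counter

def headline(text: str) -> str:
    """Naive extractive 'summary': return the sentence whose words
    are most frequent across the whole text, normalized by sentence
    length. A stand-in for a real summarization model."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    return max(sentences, key=score)
```

Extractive methods only ever reuse sentences from the input; the advantage of model-based abstractive summarization is that it can compose a genuinely new, shorter sentence.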
He says his own experience as a CTO and software engineer working on NLP projects at a number of tech companies led him to spot an opportunity in the challenge of AI implementation.
“I realized that, thanks to great open source frameworks like spaCy and Hugging Face Transformers, it was quite easy to build acceptable NLP models, but then I found it quite difficult to use those models in production,” he explains. “It takes programming skills to develop an API, strong DevOps skills to build a robust and fast infrastructure to serve NLP models (AI models generally consume a lot of resources), and of course data science skills.
“I tried looking for off-the-shelf cloud solutions to save weeks of work, but I couldn’t find anything satisfactory. My intuition was that such a platform would help tech teams save a lot of time – sometimes months of work for teams that don’t have strong DevOps profiles.”
“NLP has been around for decades, but until recently it took teams of data scientists to build acceptable NLP models. For several years now, we’ve made amazing advances in the accuracy and speed of NLP models. More and more experts who have worked in the NLP field for decades agree that NLP is becoming a ‘commodity’,” he continues. “Frameworks like spaCy make it extremely simple for developers to use NLP models without advanced data science knowledge. Hugging Face’s open source repository for NLP models is also a big step in this direction.
“But running these models in production is still difficult, and maybe even more difficult than before, as these brand new models are very resource-intensive.”
The models NLPCloud.io offers are selected for performance – where “best” means striking “the best compromise between accuracy and speed”. Salinas says they also pay attention to context, given that NLP can be used for many different use cases – hence proposing a number of models so they can adapt to a given use.
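One simple way to encode an accuracy/speed compromise is to filter a model catalog by a throughput floor, then take the most accurate survivor. The catalog below is entirely invented for illustration – the names and figures are not NLPCloud.io’s benchmarks:

```python
# Hypothetical model catalog -- names and numbers invented for
# illustration, not NLPCloud.io's actual benchmark figures.
MODELS = [
    {"name": "small-fast", "accuracy": 0.86, "requests_per_sec": 50},
    {"name": "medium", "accuracy": 0.91, "requests_per_sec": 12},
    {"name": "large-transformer", "accuracy": 0.95, "requests_per_sec": 2},
]

def pick_model(min_accuracy: float, min_throughput: float):
    """Most accurate model that still meets the throughput floor,
    or None if no model satisfies both constraints."""
    candidates = [
        m for m in MODELS
        if m["accuracy"] >= min_accuracy
        and m["requests_per_sec"] >= min_throughput
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m["accuracy"])["name"]
```

The trade-off is visible in the data itself: the large Transformer-style model tops accuracy but serves an order of magnitude fewer requests per second, which is why a latency-sensitive use case may be better served by a smaller model.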
“At first we started with models dedicated to entity extraction only, but most of our early customers asked for other use cases too, so we started adding other models,” he notes, adding that they will continue to add more models from the two chosen frameworks – “in order to cover more use cases, and more languages”.
SpaCy and Hugging Face were chosen as the sources of the models offered via its API because of their track records as companies, the NLP libraries they offer, and their focus on production-ready frameworks – a combination that, according to Salinas, lets NLPCloud.io offer a selection of models that are fast and accurate within the scope of their respective trade-offs.
“SpaCy is developed by a solid company in Germany called Explosion.ai. This library has become one of the most-used NLP libraries among companies who want to leverage NLP in production ‘for real’ (as opposed to academic research only). The reason is that it is very fast, has great accuracy in most scenarios, and is an ‘opinionated’ framework, which makes it very simple to use by non-data scientists (the tradeoff is that it offers fewer customization options),” he says.
“Hugging Face is an even more solid company that recently raised $40 million for a good reason: they created a disruptive NLP library called ‘Transformers’ that dramatically improves the accuracy of NLP models (the tradeoff is that it is very resource-intensive, though). It gives the opportunity to cover more use cases like sentiment analysis, classification, summarization, etc. And they also created an open source repository where it is easy to select the best model you need for your use case.”
While AI is advancing at a clip within certain tracks – such as NLP for English – there are still caveats and potential pitfalls attached to automating language processing and analysis, with the risk of getting things wrong or worse. AI models trained on human-generated data have been shown, for example, to reflect the embedded biases and prejudices of the people who produced the underlying data.
Salinas agrees that NLP can sometimes face “bias issues”, such as racism and misogyny. But he expresses confidence in the models they have selected.
“Most of the time it seems [the bias in NLP] is due to the underlying data used to train the models. It shows we should be more careful about the origin of this data,” he says. “In my opinion the best solution for mitigating this is for the community of NLP users to actively report anything inappropriate they encounter when using a specific model, so that the model can be paused and fixed.”
“Even if we doubt there are such biases in the models we propose, we do encourage our users to report such problems to us so that we can take measures,” he adds.