Named Entity Recognition (NER)


Named Entity Recognition (NER) is the task of identifying named entities in unstructured text and classifying them into predefined categories such as person, organisation, location, date, numeric, currency, time-quantity, and so on.

Why do we need such a service?

Before we can extract financial data from unstructured documents such as PDF, Word, Excel or email files, we first need to understand them. This involves locating and categorising every piece of information that is relevant to the data we are extracting.

NER lays the groundwork that allows other services to perform their functions much better. For example, it is very useful for our Table Detection service to know where all the numbers in a document are, since most (but not all!) tables contain a lot of numbers. Every time you begin to process a document with us, this service will be the first of many to look at it.
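To illustrate why the location of numbers is a useful signal, here is a minimal Python sketch that scores each line of text by how many of its tokens look numeric. It is only an illustration under our own assumptions, not the Table Detection service itself: the names `NUMERIC_TOKEN`, `numeric_density` and `likely_table_lines`, and the 0.5 threshold, are all hypothetical.

```python
import re

# Assumption: a token counts as "numeric" if it is a plain number, optionally
# with a currency symbol in front or a percent sign behind it.
NUMERIC_TOKEN = re.compile(r"^[$€£]?\d[\d,]*(?:\.\d+)?%?$")


def numeric_density(line):
    """Return the fraction of whitespace-separated tokens that look numeric."""
    tokens = line.split()
    if not tokens:
        return 0.0
    numeric = sum(1 for token in tokens if NUMERIC_TOKEN.match(token))
    return numeric / len(tokens)


def likely_table_lines(lines, threshold=0.5):
    """Return the indices of lines whose numeric density meets the threshold."""
    return [i for i, line in enumerate(lines) if numeric_density(line) >= threshold]


lines = [
    "Revenue grew strongly this year.",
    "Q1 1,200 1,350 12.5%",
    "Q2 1,400 1,500 7.1%",
]
print(likely_table_lines(lines))
```

A line-level score like this will miss tables that are mostly text, which is exactly the "but not all!" caveat above, so it is best read as one hint among many.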

How does it work?

Typical NER services rely on hand-written rules, on machine learning models trained on large amounts of real-world data, or on a combination of the two.
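As a rough picture of the rule-based approach, the Python sketch below tags a few entity types with regular expressions. The patterns, the `Entity` type and the `find_entities` function are hypothetical and deliberately tiny; a real rule set would be far larger, and this is not a description of any particular product.

```python
import re
from typing import NamedTuple


class Entity(NamedTuple):
    text: str
    label: str
    start: int
    end: int


# Illustrative rules only, listed from most to least specific so that a
# currency amount or a date is not also reported as a bare number.
RULES = [
    ("CURRENCY", re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d+)?")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")),
    ("NUMBER", re.compile(r"\b\d[\d,]*(?:\.\d+)?\b")),
]


def find_entities(text):
    """Apply each rule in order and skip matches that overlap an earlier one."""
    entities = []
    taken = []  # (start, end) spans already claimed by a more specific rule
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            if any(match.start() < end and start < match.end() for start, end in taken):
                continue
            taken.append(match.span())
            entities.append(Entity(match.group(), label, match.start(), match.end()))
    return sorted(entities, key=lambda entity: entity.start)


for entity in find_entities("Invoice dated 12/03/2021 for $1,250.50 covering 3 items."):
    print(entity.label, entity.text)
```

Rules like these are easy to read and audit, but they only cover the formats someone thought to write down, which is where models trained on real-world data come in.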

At Semantic Evolution, however, we combined the extensive financial experience of our founders with the knowledge of our whole team to build a custom NER service, powered by the latest advances in Machine Learning and Artificial Intelligence and geared towards the financial world.