Categories provide high-level classification of content into 8 main “news” or domain categories: Health, Politics, Law Enforcement, Sports, Entertainment, Business, Science, and Technology.
Klangoo's NLP engine automatically (no training required) returns the one or more Categories that best fit each content item, each with a score that indicates the relevance of the content to that Category.
Under each of these 8 Categories, Klangoo's NLP engine can return sub-Categories (one level deep) that are more specific. For example, under the Business Category, the sub-Categories are: Business_Financial, Oil-&-Gas, and Business_other.
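As an illustration only, a classification result can be pictured as a list of scored Categories, each with an optional sub-Category. The sketch below is not Klangoo's actual response format: the `CategoryScore` class, its field names, the sample scores, the 0-to-1 score range for Categories, and the 0.5 cutoff are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CategoryScore:
    """Hypothetical container for one classification result."""
    category: str                       # one of the 8 top-level Categories
    score: float                        # assumed to lie between 0 and 1
    sub_category: Optional[str] = None  # sub-Categories are one level deep


def best_categories(results: List[CategoryScore],
                    min_score: float = 0.5) -> List[CategoryScore]:
    """Keep results at or above min_score, highest score first."""
    kept = [r for r in results if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)


# Made-up scores for a single content item.
sample = [
    CategoryScore("Politics", 0.61),
    CategoryScore("Business", 0.82, "Oil-&-Gas"),
    CategoryScore("Sports", 0.07),
]

top = best_categories(sample)
print([(r.category, r.sub_category) for r in top])
```

A content item can legitimately keep more than one Category here, which mirrors the "one or more" behaviour described above.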
Topics are disambiguated nominal phrases that are extracted from the text and represent a merged meaning.
Klangoo's NLP engine extracts Topics automatically (no training required) from the content and returns them as an ordered list based on their relevancy (level of representation). The engine can also relate Topics that use different words or even different languages.
Examples of topics: presidential elections, Vice President Mike Pence, electoral base, climate change.
Topics are the key element used by Klangoo's NLP engine in Semantic Full Text Search (matching content), automated summarization, and categorization.
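To give a rough feel for how Topics can drive content matching, the sketch below compares two items by the overlap of their Topic lists (Jaccard similarity). This is a stand-in technique chosen for illustration, not Klangoo's matching algorithm. The function name and the sample Topic lists are hypothetical, and plain string comparison cannot relate Topics that use different words or languages the way the engine is described as doing.

```python
from typing import Iterable


def topic_overlap(topics_a: Iterable[str], topics_b: Iterable[str]) -> float:
    """Jaccard similarity of two Topic lists, ignoring letter case."""
    a = {t.lower() for t in topics_a}
    b = {t.lower() for t in topics_b}
    if not a and not b:
        return 0.0  # two empty lists share nothing
    return len(a & b) / len(a | b)


# Hypothetical Topic lists for two content items.
item_1 = ["presidential elections", "electoral base", "climate change"]
item_2 = ["climate change", "Presidential Elections", "Vice President Mike Pence"]

print(topic_overlap(item_1, item_2))  # 2 shared Topics out of 4 distinct -> 0.5
```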
Entities are names representing different types of objects and people.
The Semantic Network of Klangoo's NLP engine contains a large database of names (the Namebase) that covers the top names in different domains internationally. Alongside this international Namebase, the engine has many local Namebases that are country-, state-, region-, or even client-specific.
In addition to the Namebase, Klangoo's NLP engine uses various algorithms for Named Entity Identification and Recognition. In total, the engine identifies more than 40 high-level types of entities, including People, Organizations, Events, Named Characters, Awards, Brands, Diseases, and others. Under these types, the engine identifies several levels of sub-types, which add to the semantic significance of each Entity.
Example of Entity sub-types: People > Politician > President > Donald Trump.
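A type path written in this `A > B > C > Name` style can be split into its high-level type, its sub-types, and the Entity name. The helper below is a minimal sketch that assumes the path arrives as a single `>`-separated string. The function name and the returned dictionary keys are hypothetical, not part of any Klangoo interface.

```python
from typing import Dict, List, Union


def parse_entity_path(path: str) -> Dict[str, Union[str, List[str]]]:
    """Split 'Type > Sub-type > ... > Name' into its parts."""
    parts = [p.strip() for p in path.split(">")]
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected 'Type > ... > Name', got {path!r}")
    *type_chain, name = parts  # the last element is the Entity itself
    return {
        "type": type_chain[0],         # high-level type, e.g. People
        "sub_types": type_chain[1:],   # zero or more nested sub-types
        "name": name,
    }


entity = parse_entity_path("People > Politician > President > Donald Trump")
print(entity)
```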
Each identified Entity is scored based on its representation in the content. The score ranges between 0 and 1 and is also mapped to a qualitative score of: Very Related, Related, Somewhat Related, and Not Related or Mentioned.
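The mapping from numeric score to qualitative label can be sketched as a simple threshold lookup. The four labels come from the description above, but the cutoff values (0.75, 0.5, 0.25) are assumptions picked for illustration; the actual boundaries used by the engine are not stated here.

```python
def qualitative_label(score: float) -> str:
    """Map a 0-to-1 Entity score to a qualitative label.

    The thresholds are hypothetical, evenly spaced cutoffs.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.75:
        return "Very Related"
    if score >= 0.5:
        return "Related"
    if score >= 0.25:
        return "Somewhat Related"
    return "Not Related or Mentioned"


for s in (0.9, 0.6, 0.3, 0.0):
    print(s, qualitative_label(s))
```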