All of Magnet's services are built on the results of its NLP processing of full article text. NLP processing requires noise-free input. Thus, while Magnet's JavaScript communicates with Magnet's API for processing, Magnet's crawler is the tool that collects all the article information from the article's public URL.
The article information necessary for NLP processing includes: a unique ID or canonical URL, publication date, last-modified date, language, title, and body. The other properties listed below are secondary (optional), but are recommended or required in certain specific cases.
Schema.org and Magnet Tags:
The schema.org and special tags have the same format:
- Tag start: <meta or <p or <span or <div …
- Parameter (property) name: itemprop="[property name]"
- Parameter (property) value: content="[property value]"
- Tag end: /> or </p> or </span> or </div> …
Example: <meta itemprop="identifier" content="122" />
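The tag format above can be sketched as a small Python helper that renders one tag from a property name and value. This is only an illustration: `magnet_meta_tag` is a hypothetical name and not part of any Magnet tooling, and it covers only the `<meta>` form of the tag. The escaping step is an assumption about good practice, so that a value containing a double quote does not end the `content` attribute early.

```python
from html import escape


def magnet_meta_tag(name: str, value: str) -> str:
    """Render one <meta itemprop="..." content="..." /> tag (hypothetical helper)."""
    # escape(..., quote=True) converts &, <, > and both quote characters,
    # so the attribute value cannot break out of its surrounding quotes.
    safe_name = escape(name, quote=True)
    safe_value = escape(value, quote=True)
    return f'<meta itemprop="{safe_name}" content="{safe_value}" />'


print(magnet_meta_tag("identifier", "122"))
print(magnet_meta_tag("headline", 'He said "hello" & left'))
```

The first call prints the same tag as the example above; the second shows the quotes and ampersand in the headline being escaped.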
Property Name | Type | Section | Required | Description |
identifier | Text | <head> | Recommended | The page’s unique ID. If not specified, the article URL is used as the identifier. For home or section pages, use the page name (e.g. homepage or sports) as the identifier. |
pageType | Text | <head> | Required | Content: article, video, main, gallery, relart, entrelart, or deleted. |
url | URL | <head> | Required | The canonical URL of the page. |
datePublished | Date | <head> | Required | A date representing when the content was published.* |
dateModified | Date | <head> | Optional | A date representing when the content was last modified.* |
author | Text | <body> | Optional | The author(s) or source of the content (comma-separated if more than one). |
thumbnailUrl or image | URL | <head> | Recommended | The URL of a thumbnail image associated with the content. Please also provide us with the URL of an image (typically your logo) that our widgets will use when there is no thumbnail/still associated with an article/video/gallery. |
inLanguage | Text | <head> | Recommended | The language code of the page content. Default is ‘en’ (English). See the supported languages document. |
headline | Text | <body> | Required | The title of the content. |
articleBody | Text | <body> | Required | The actual text of the article, or the closed captions or transcript of the video. The articleBody property is optional if the pageType is “image” or “gallery”. |
description | Text | <body> | Recommended | The description of the content of the page. Highly recommended when the pageType is “image” or “gallery” and articleBody is not set. The description can also be displayed in newsletters and follow e-mails. |
expires | Date | <expires> | Optional | The date when the article expires and should be removed from Magnet. One reason for expiry might be a limited period of publication rights for certain content. |
magnet:custom | Text | <head> | Recommended | Custom fields can be used, for example, to identify the section the article is published in. They can also be used for biasing content or for follow-section functionality. |
*Dates must be in ISO 8601 format. Examples: 2014-09-30, 2014-09-30T18:58:45+00:00, 2014-09-30T18:58:45Z
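As a rough illustration of the date requirement, the following Python sketch formats a timestamp for `datePublished` or `dateModified` and checks whether a string parses as an ISO 8601 date. The function names `to_iso8601` and `is_iso8601` are hypothetical. The check relies on `datetime.fromisoformat`, which accepts only a subset of ISO 8601 (a narrower one before Python 3.11), so treat it as a sanity check rather than a full validator.

```python
from datetime import datetime, timezone


def to_iso8601(moment: datetime) -> str:
    """Format a timezone-aware datetime with second precision and a UTC offset."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.replace(microsecond=0).isoformat()


def is_iso8601(text: str) -> bool:
    """Return True if text parses as an ISO 8601 date or date-time."""
    # Python versions before 3.11 reject a trailing "Z" in fromisoformat,
    # so rewrite it as the equivalent +00:00 offset first.
    candidate = text[:-1] + "+00:00" if text.endswith("Z") else text
    try:
        datetime.fromisoformat(candidate)
        return True
    except ValueError:
        return False


published = datetime(2014, 9, 30, 18, 58, 45, tzinfo=timezone.utc)
print(to_iso8601(published))  # 2014-09-30T18:58:45+00:00
print(is_iso8601("2014-09-30"), is_iso8601("2014-09-30T18:58:45Z"))  # True True
```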
Once you're done adding your tags, make sure you validate them.
Example:
<meta itemprop="identifier" content="387250"/>
<meta itemprop="headline" content="Trump Revisits his Charlottesville Comments in Angry Speech"/>
<meta itemprop="description" content="President Donald Trump opened his political rally in Phoenix with calls for unity and an assertion that “our movement is about love.” Then he erupted in anger."/>
<meta itemprop="pageType" content="article"/>
<meta itemprop="datePublished" content="2017-08-23T06:49:37+00:00"/>
<meta itemprop="dateModified" content="2017-08-23T06:49:37+00:00"/>
<meta itemprop="url" content="http://www.xxx.com/2017/08/23/trump-revisits-charlottesville-comments-angry-speech/"/>
<meta itemprop="inLanguage" content="en"/>
<meta itemprop="author" content="AP"/>
<meta itemprop="thumbnailUrl" content="http://www.xxx.com/wp-content/uploads/2017/08/trumppheonix-300x200.jpeg"/>
<div itemprop="articleBody"><p>PHOENIX (AP) — President Donald Trump opened his political rally in Phoenix with calls for unity and an assertion that “our movement is about love.” Then he erupted in anger.</p></div>
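Before publishing, a page tagged like the example above can be checked locally for the properties marked Required in the table. The Python sketch below is not Magnet's validator; `ItempropCollector` and `missing_required` are hypothetical names, and the check only looks for the presence of each `itemprop`, not for valid values. It also ignores the exception that articleBody is optional for “image” or “gallery” pages.

```python
from html.parser import HTMLParser

# Properties marked "Required" in the table above.
REQUIRED = {"pageType", "url", "datePublished", "headline", "articleBody"}


class ItempropCollector(HTMLParser):
    """Collect every itemprop name that appears on a start tag."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also reach this method.
        for name, value in attrs:
            if name == "itemprop" and value:
                self.found.add(value)


def missing_required(html_text: str) -> set:
    """Return the required property names that are absent from html_text."""
    collector = ItempropCollector()
    collector.feed(html_text)
    collector.close()
    return REQUIRED - collector.found


sample = """<meta itemprop="pageType" content="article"/>
<meta itemprop="url" content="http://www.example.com/a"/>
<div itemprop="articleBody"><p>Text</p></div>"""
print(sorted(missing_required(sample)))  # ['datePublished', 'headline']
```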