Data Feed Spec

The SCE Starter Kit enables application authors to rapidly develop and deploy rich, occasionally connected content experiences. A critical component of these applications is a live data source, or feed.

Although many publishers already expose article summaries via RSS feeds, the rich experience of a compelling smart client application will require a correspondingly richer feed than standard RSS 2.0. This document gives an overview of the data feed structure used by the Syndicated Client Experiences (SCE) Starter Kit. Although this data schema lends itself extremely well to news reader scenarios, it can readily be applied to a broader set of applications.

The feeds described in this document will be based on RSS 2.0 extended with elements from a custom namespace. The XML namespace URI for the data format described in this specification is: http://schemas.microsoft.com/rss/2007/readerextensions

In this spec, the prefix "rx:" is used for the namespace URI identified above. This spec also uses elements from the Content Sync Extensions (CSX) spec. Within this document, elements from the CSX namespace bear the prefix “csx:”.

This spec describes three kinds of XML documents:

  1. Master Feed - This feed is based on RSS 2.0 and is the root of a content hierarchy. It points to edition feeds and provides a compact representation of the data required to synchronize content updates.
  2. Edition Feed - This feed is based on RSS 2.0 and encapsulates a set of related items. This feed contains metadata that allows you to group items into hierarchies (i.e. sections and subsections) as well as reference additional content (text, images, etc.) associated with each item.
  3. Story Article - This is a NITF based XML document representing a single piece of content. Although the starter kit supports NITF as a full text article format, it is relatively straightforward to use other data formats as well.

The Master Feed

All of the content to be displayed in the Syndicated Client Experiences application is discoverable from the information in the master feed. The master feed points to one or more edition feeds as well as an ad feed. Within the master feed, each edition feed is represented by a single RSS <item>.

The starter kit requires that the master feed point to at least one edition feed (which typically represents the current edition of the content). The master feed may optionally point to other edition feeds that represent different editions of the content (e.g. the edition from last Tuesday, or an edition in Japanese). Additionally, any edition feed may be flagged as “onDemand”, indicating that the feed should only be downloaded upon user request. The relationship of these feeds is depicted below:


Diagram: Feed Relationships

Example Master Feed

Below is an example of a master feed that references an edition feed, an onDemand edition feed, and an ad feed.

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:rx="http://schemas.microsoft.com/rss/2007/readerextensions" xmlns:csx="http://schemas.microsoft.com/rss/2007/contentsyncextensions" >
    <channel>
        <title>Master Feed</title>
        <link>http://msn.com</link>
        <description>Master Feed</description>
        <pubDate>Tue, 09 Oct 2007 10:41:44 GMT</pubDate>
        <lastBuildDate>Tue, 09 Oct 2007 10:41:44 GMT</lastBuildDate>
        <!--One edition feed is required-->
        <item csx:hiddenItem="True" rx:type="EditionFeed">
          <title>Top Level Feed</title>
          <link>http://msn.com</link>
          <description>Top Level Feed</description>
          <guid>toplevel.xml</guid>
          <pubDate>Tue, 09 Oct 2007 10:41:44 GMT</pubDate>
          <csx:link nestedFeed="True">toplevel.xml</csx:link>
        </item>
        <!--Additional edition feeds are optional -->
        <item csx:hiddenItem="True" rx:type="EditionFeed">
          <title>Archive Feed</title>
          <link>http://msn.com</link>
          <description>A top level feed containing yesterday's news</description>
          <guid>toplevelArchiveEdition.xml</guid>
          <pubDate>Mon, 08 Oct 2007 10:41:44 GMT</pubDate>
          <csx:link nestedFeed="True" onDemand="True">toplevelArchiveEdition.xml</csx:link>
        </item>
        <!--The ad feed is optional -->
        <item csx:hiddenItem="True" rx:type="AdFeed">
          <title>Ad Feed</title>
          <link>http://msn.com</link>
          <description>Ad Feed</description>
          <pubDate>Tue, 09 Oct 2007 10:41:44 GMT</pubDate>
          <guid>adfeed.xml</guid>
          <csx:link nestedFeed="True">adfeed.xml</csx:link>
        </item>
        <!--These items will show up in the subscription center summary-->
        <item>
          <title>Really Important Headline</title>
          <link>http://boguslink.com/stories/abcdef.htm</link>
          <description>Some really important thing happened</description>
        </item>
        <item>
          <title>Another Really Important Headline</title>
          <link>http://boguslink.com/stories/lmnop.htm</link>
          <description>Another really important thing happened</description>
        </item>
    </channel>
</rss>

For brevity, this spec describes only the elements and attributes that extend the RSS spec, plus the standard RSS elements that the starter kit relies on.

This spec recognizes the following <item> types in the master feed:

  1. Edition feed items - there must be at least one.
  2. Ad feed item - there must be zero or one.
  3. All other RSS items - there must be zero to many, though in practice, they should be limited in number to 15 or fewer.
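
The three item kinds listed above can be told apart mechanically. The following is a minimal Python sketch, not the starter kit's own implementation. It assumes the rx:type attribute form used in the example master feed; the function name classify_master_items and the trimmed sample feed are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the example master feed.
RX = "http://schemas.microsoft.com/rss/2007/readerextensions"
CSX = "http://schemas.microsoft.com/rss/2007/contentsyncextensions"

# Trimmed sample for illustration; not a complete feed.
SAMPLE_MASTER = f"""<rss version="2.0" xmlns:rx="{RX}" xmlns:csx="{CSX}">
  <channel>
    <title>Master Feed</title>
    <item csx:hiddenItem="True" rx:type="EditionFeed">
      <title>Top Level Feed</title>
      <guid>toplevel.xml</guid>
      <csx:link nestedFeed="True">toplevel.xml</csx:link>
    </item>
    <item csx:hiddenItem="True" rx:type="EditionFeed">
      <title>Archive Feed</title>
      <guid>toplevelArchiveEdition.xml</guid>
      <csx:link nestedFeed="True" onDemand="True">toplevelArchiveEdition.xml</csx:link>
    </item>
    <item csx:hiddenItem="True" rx:type="AdFeed">
      <title>Ad Feed</title>
      <guid>adfeed.xml</guid>
      <csx:link nestedFeed="True">adfeed.xml</csx:link>
    </item>
    <item>
      <title>Really Important Headline</title>
    </item>
  </channel>
</rss>"""


def classify_master_items(xml_text):
    """Split master feed items into edition feeds, ad feeds, and plain RSS items."""
    root = ET.fromstring(xml_text)
    editions, ads, others = [], [], []
    for item in root.findall("./channel/item"):
        kind = item.get(f"{{{RX}}}type")
        link = item.find(f"{{{CSX}}}link")
        entry = {
            "guid": item.findtext("guid"),
            "title": item.findtext("title"),
            "sync_url": link.text.strip() if link is not None and link.text else None,
            "on_demand": link is not None and link.get("onDemand", "").lower() == "true",
        }
        if kind == "EditionFeed":
            editions.append(entry)
        elif kind == "AdFeed":
            ads.append(entry)
        else:
            others.append(entry)
    # Enforce the counts stated in the list above.
    if not editions:
        raise ValueError("master feed must reference at least one edition feed")
    if len(ads) > 1:
        raise ValueError("master feed may reference at most one ad feed")
    return editions, ads, others


editions, ads, others = classify_master_items(SAMPLE_MASTER)
print([e["guid"] for e in editions], len(ads), len(others))
```

Items with a missing or unrecognized rx:type fall into the last group, which matches how the spec treats ordinary RSS headlines.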

Required elements for master feed <item>'s representing edition feeds

  • <rx:type> -- used by the starter kit to distinguish between item types. In the example above it is written as the rx:type attribute of <item>. For an edition feed, the only valid value is "EditionFeed".
  • <guid> -- globally unique ID, used as an ID during data sync.
  • <title> -- required by RSS 2.0.
  • <csx:link> -- this is the link used by the sync mechanism to download the referenced feed. It is recommended that this URL be relative to the master feed URL. The nestedFeed attribute MUST be set to True. If this is an on-demand edition, the onDemand attribute must be set to True (denoting that this feed should only be synced and cached on a user's request).

Optional elements for master feed <item>'s representing edition feeds

  • <pubDate> -- (HIGHLY RECOMMENDED) Although not required, the pubDate is frequently used in the UI to identify various edition feeds.
  • <csx:lastBuildDate> -- (HIGHLY RECOMMENDED) used to determine if a feed needs to be resynchronized. For example, if the EditionFeed changes (e.g. additional stories are added), an updated <csx:lastBuildDate> will cause the sync component to download the new content. If not present, the application assumes the <csx:lastBuildDate> is the same as the RSS <pubDate>. If neither <pubDate> nor <csx:lastBuildDate> is specified, the <csx:lastBuildDate> is assumed to be Jan 1, 1601. Note that this element is different from the RSS <lastBuildDate> element (the RSS version is only defined as a child of <channel>).
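
The fallback order described for <csx:lastBuildDate> can be sketched in Python as follows. This is an illustration rather than the sync component's actual code: the helper names effective_last_build_date and needs_resync are hypothetical, and treating dates without a time zone as UTC is an assumption made here, not something the spec states.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET

CSX = "http://schemas.microsoft.com/rss/2007/contentsyncextensions"

# Default used when neither date is present, per the text above.
EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)


def effective_last_build_date(item):
    """Return csx:lastBuildDate, else pubDate, else Jan 1, 1601."""
    for tag in (f"{{{CSX}}}lastBuildDate", "pubDate"):
        text = item.findtext(tag)
        if text and text.strip():
            parsed = parsedate_to_datetime(text.strip())
            if parsed.tzinfo is None:
                # Assumption: dates without a zone are interpreted as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed
    return EPOCH_1601


def needs_resync(item, cached_at):
    """True when the cached copy is older than the item's effective date."""
    return cached_at < effective_last_build_date(item)


item = ET.fromstring(
    f'<item xmlns:csx="{CSX}">'
    "<pubDate>Tue, 09 Oct 2007 10:41:44 GMT</pubDate>"
    "<csx:lastBuildDate>Wed, 10 Oct 2007 08:00:00 GMT</csx:lastBuildDate>"
    "</item>"
)
print(effective_last_build_date(item).isoformat())
```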

Optional attributes for master feed <item>'s representing edition feeds

  • csx:hiddenItem -- This attribute is used by CSX enabled RSS viewers to filter out any <item>’s that shouldn’t be displayed to an end user in a summary view. Thus, this attribute should be set to “True” for edition feed items.

Elements and attributes for master feed <item>'s representing an ad feed

The requirements for representing an ad feed are identical to those for an edition feed with the following exception(s):

  • <rx:type> -- (Required) used by the starter kit to distinguish between item types. For an ad feed, the only valid value is "AdFeed".

All other <item>'s in the master feed

Any <item> that has no rx:type, or an unrecognized one, has no special significance to the starter kit. Nevertheless, it is recommended to include 5-10 non-hidden RSS items (such as news headlines) so that CSX compliant sync engines (like the Microsoft Subscription Center) can surface these items in a summary view.

The Edition Feed

The edition feed is based on RSS 2.0 and contains elements for dividing a grouping of content into a hierarchy of sections and stories. In the context of this spec, a section is a grouping of stories and other sections. A story is a single piece of content. The above diagram shows an example hierarchy. Note the following principles:

  • Stories must have at least one parent
  • Stories can have multiple parents, and these parents can be at different depths
  • There can be an arbitrary depth of subsections
  • Each section may only have one parent
  • Only sections may exist at the root level of the hierarchy
  • There cannot be any circular references in the structure of sections
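
These principles lend themselves to an automated check before a feed is published. The sketch below is one possible validator, assuming the hierarchy has already been read into plain dictionaries keyed by guid; the function name validate_hierarchy and the input shape are hypothetical and not part of the starter kit.

```python
def validate_hierarchy(root_sections, sections, story_ids):
    """Check the structural rules listed above.

    root_sections: guids of the top level sections.
    sections: maps section guid -> {"sections": [...], "stories": [...]}.
    story_ids: guids of every story item in the feed.
    Returns a list of human-readable error strings (empty when valid).
    """
    errors = []
    section_parent = {}
    story_parent_count = {story: 0 for story in story_ids}

    for guid, node in sections.items():
        for child in node.get("sections", []):
            # A root section already has the channel as its single parent.
            if child in section_parent or child in root_sections:
                errors.append(f"section {child!r} has more than one parent")
            section_parent[child] = guid
        for story in node.get("stories", []):
            if story in story_parent_count:
                story_parent_count[story] += 1
            else:
                errors.append(f"section {guid!r} references unknown story {story!r}")

    for story, count in story_parent_count.items():
        if count == 0:
            errors.append(f"story {story!r} has no parent section")

    # Walk up the parent chain from each section; revisiting a node means a cycle.
    for guid in sections:
        seen = set()
        current = guid
        while current in section_parent:
            if current in seen:
                errors.append(f"circular reference involving section {guid!r}")
                break
            seen.add(current)
            current = section_parent[current]
    return errors


example = {
    "home": {"sections": [], "stories": ["story1", "story2"]},
    "world": {"sections": [], "stories": ["story2"]},
}
print(validate_hierarchy(["home", "world"], example, ["story1", "story2"]))
```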

Below is an example of an edition feed that has only two sections: Home and World. The Home section contains two stories, and the World section contains one of those same stories. This edition feed is given as an RSS <channel> containing children <item>'s that represent the complete edition. The children of the channel are 2 <item>'s which represent the 2 sections and 2 <item>'s which represent the 2 stories.

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:rx="http://schemas.microsoft.com/rss/2007/readerextensions" xmlns:csx="http://schemas.microsoft.com/rss/2007/contentsyncextensions">
    <channel>
        <title>Title for this Edition</title>
        <link>http://boguspaper.com</link>
        <description>Standard flat feed sample</description>
        <pubDate>Mon, 02 Oct 2006 07:35:00</pubDate>
        <lastBuildDate>Mon, 02 Oct 2006 06:31:00</lastBuildDate>
        <!--Order of top level sections-->
        <rx:sections>
          <rx:section>frontpage.xml</rx:section>
          <rx:section>world.xml</rx:section>
        </rx:sections>
        <item rx:type="Section">
          <link>http://boguslink.com/frontpage.htm</link>
          <guid>frontpage.xml</guid>
          <title>Home</title>
          <pubDate>Mon, 02 Oct 2006 06:00:00</pubDate>
          <csx:lastBuildDate>Mon, 02 Oct 2006 06:31:00</csx:lastBuildDate>
          <description>Leading page of the newspaper</description>
          <rx:stories>
            <rx:story>story1.xml</rx:story>
            <rx:story>story2.xml</rx:story>
          </rx:stories>
        </item>
        <item rx:type="Section">
          <link>http://boguslink.com/world.htm</link>
          <guid>world.xml</guid>
          <title>World</title>
          <pubDate>Mon, 02 Oct 2006 07:35:00</pubDate>
          <csx:lastBuildDate>Mon, 02 Oct 2006 06:31:00</csx:lastBuildDate>
          <description>News from around the world</description>
          <rx:stories>
            <rx:story>story2.xml</rx:story>
          </rx:stories>
        </item>
        <item rx:type="Story">
          <title>Story #1</title>
          <link>http://www.boguslink.com/articles/story1.htm</link>
          <guid>story1.xml</guid>
          <author>David Ortiz</author>
          <csx:lastBuildDate>Mon, 02 Oct 2006 12:52:06</csx:lastBuildDate>
          <pubDate>Mon, 02 Oct 2006 10:00:00</pubDate>
          <csx:link>articles/story1.xml</csx:link>
          <description>This is a description.</description>
        </item>
        <item rx:type="Story">
          <title>Story #2</title>
          <link>http://www.boguslink.com/articles/story2.htm</link>
          <guid>story2.xml</guid>
          <author>Jim Fairchild</author>
          <pubDate>Sun, 01 Oct 2006 12:00:00</pubDate>
          <csx:lastBuildDate>Mon, 02 Oct 2006 12:52:06</csx:lastBuildDate>
          <csx:link>articles/story2.xml</csx:link>
          <description>This is a description.</description>
        </item>
    </channel>
</rss>

Please note the fundamental difference between <link>’s and <csx:link>’s. The <link> elements are part of the RSS 2.0 spec. Content in these links will not be downloaded by the app. The inclusion of these elements does allow the app to drive users to the associated web address. The <csx:link>’s, however, are a vital part of the data sync process. Any content referenced by these links will be automatically synchronized and made available by the cache for use within the starter kit app. Conversely, any content not referenced by these links will not be available for use within the app. For more information, see the Content Sync Extensions spec.
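
To make the distinction concrete, the sketch below gathers only the <csx:link> targets from a feed, which is the set of content a sync pass would fetch, and leaves plain <link> elements alone. It is an illustration in Python rather than the starter kit's sync code. Resolving relative links against the URL of the containing feed is an assumption, and the function name sync_urls and the example.com address are placeholders.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

CSX = "http://schemas.microsoft.com/rss/2007/contentsyncextensions"

SAMPLE = f"""<rss version="2.0" xmlns:csx="{CSX}">
  <channel>
    <item>
      <link>http://www.boguslink.com/articles/story1.htm</link>
      <csx:link>articles/story1.xml</csx:link>
    </item>
    <item>
      <csx:link>/images/1a.jpg</csx:link>
    </item>
  </channel>
</rss>"""


def sync_urls(feed_xml, feed_url):
    """Return absolute URLs for every csx:link in the feed, in document order."""
    root = ET.fromstring(feed_xml)
    urls = []
    for link in root.iter(f"{{{CSX}}}link"):
        if link.text and link.text.strip():
            absolute = urljoin(feed_url, link.text.strip())
            if absolute not in urls:  # skip duplicates, keep first occurrence
                urls.append(absolute)
    return urls


for url in sync_urls(SAMPLE, "http://example.com/feeds/edition.xml"):
    print(url)
```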

Feeds created according to this spec remain compatible with existing RSS readers. Because standard RSS readers ignore elements from namespaces they do not recognize (such as those prefixed “rx”), the user experience in those readers will lack the richness made possible through the SCE Starter Kit.

Imposing Structure

Since the feed in the example above is a flat list of <item>'s, a feed creator must explicitly impose some structure onto the content. This is done via the <rx:sections> and <rx:stories> elements. These elements contain an ordered list of the child content of a particular section. Thus, in the above example, Story 1 and Story 2 appear in the “Home” section but only Story 2 appears in the “World” section. If any of these sections had contained subsections, those subsections would be listed in an <rx:sections> element inside that section's <item>.
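
As an illustration of how a consumer might rebuild the hierarchy from the flat list, the following Python sketch resolves the guids in <rx:sections> and <rx:stories> against the <item>'s of a trimmed edition feed. The function name build_edition and the nested dictionary output are assumptions made for this example, and the recursion relies on the feed obeying the rule that sections contain no circular references.

```python
import xml.etree.ElementTree as ET

RX = "http://schemas.microsoft.com/rss/2007/readerextensions"

SAMPLE_EDITION = f"""<rss version="2.0" xmlns:rx="{RX}">
  <channel>
    <title>Title for this Edition</title>
    <rx:sections>
      <rx:section>frontpage.xml</rx:section>
      <rx:section>world.xml</rx:section>
    </rx:sections>
    <item rx:type="Section">
      <guid>frontpage.xml</guid>
      <title>Home</title>
      <rx:stories>
        <rx:story>story1.xml</rx:story>
        <rx:story>story2.xml</rx:story>
      </rx:stories>
    </item>
    <item rx:type="Section">
      <guid>world.xml</guid>
      <title>World</title>
      <rx:stories>
        <rx:story>story2.xml</rx:story>
      </rx:stories>
    </item>
    <item rx:type="Story"><guid>story1.xml</guid><title>Story #1</title></item>
    <item rx:type="Story"><guid>story2.xml</guid><title>Story #2</title></item>
  </channel>
</rss>"""


def build_edition(xml_text):
    """Turn the flat item list into an ordered tree of sections and story titles."""
    channel = ET.fromstring(xml_text).find("channel")
    items = {item.findtext("guid"): item for item in channel.findall("item")}

    def refs(parent, container, child):
        # Ordered guids listed under <rx:sections> or <rx:stories>.
        node = parent.find(f"{{{RX}}}{container}")
        if node is None:
            return []
        return [c.text.strip() for c in node.findall(f"{{{RX}}}{child}") if c.text]

    def section(guid):
        item = items[guid]
        return {
            "title": item.findtext("title"),
            "sections": [section(g) for g in refs(item, "sections", "section")],
            "stories": [items[g].findtext("title") for g in refs(item, "stories", "story")],
        }

    return [section(g) for g in refs(channel, "sections", "section")]


for top in build_edition(SAMPLE_EDITION):
    print(top["title"], top["stories"])
```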

Associating resources with <item>'s

Through the use of the <csx:link> element, each <item> can have additional content associated with it. This content will be downloaded during the sync process. This spec introduces elements to associate images and full text documents with stories. Images can also be associated with sections. Application authors wishing to add additional content types (videos, podcasts) can easily extend the feed format to accommodate these elements.

Images associated with a story

Images are associated with an article via a collection of imageReferences. An imageReference is a set of one or more images representing the same scene, but at different resolutions or aspect ratios (e.g. an image of a baseball game and a thumbnail preview of the same image). The inclusion of multiple images allows the application to pick the most appropriate image for a particular context. A single image reference containing two images associated with a story would be represented in our feed as follows:

<item rx:type="Story">
    ... standard story elements ...
    <rx:imageReferences>
      <rx:imageReference>
        <rx:caption>This is my caption.</rx:caption>
        <rx:credit>Joe Photo</rx:credit>
        <rx:image height="480" width="640">
          <csx:link>/images/1a.jpg</csx:link>
        </rx:image>
        <rx:image height="320" width="200">
          <csx:link>/images/1b.jpg</csx:link>
        </rx:image>
      </rx:imageReference>
    </rx:imageReferences>
</item>

An article may have zero or one <imageReferences>. An <imageReferences> contains at least one <imageReference>. Each <imageReference> contains at least one <image>. If there is more than one <image>, then each <image> in the <imageReference> is a picture of the same scene, where each <image> has a different size or aspect ratio. Having multiple sizes and aspect ratios typically leads to a better user experience since the app can display the image that is optimal for a particular context (article preview, full text article, slideshow, search result, etc.).

<image>’s within a single <imageReference> derive from the same source image (via cropping, scaling, etc). Thus, the <imageReference> contains a single <caption> and <credit> element for all of its children <image>’s.

The order of the images is important, since the default implementation of the starter kit’s story viewer will preserve this order during full text viewing. Also, the story viewer will insert only one <image> per <imageReference>. The selection of a particular <image> within an <imageReference> is made to optimize the user experience.
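
The spec does not define how the story viewer chooses among the <image> elements of an <imageReference>. Purely as an illustration, the Python sketch below uses one plausible heuristic: prefer the closest aspect ratio to the display slot, then the closest pixel area. The function name pick_image and the heuristic itself are assumptions, not the starter kit's behavior.

```python
import xml.etree.ElementTree as ET

RX = "http://schemas.microsoft.com/rss/2007/readerextensions"
CSX = "http://schemas.microsoft.com/rss/2007/contentsyncextensions"

SAMPLE_REF = ET.fromstring(
    f"""<rx:imageReference xmlns:rx="{RX}" xmlns:csx="{CSX}">
  <rx:caption>This is my caption.</rx:caption>
  <rx:image height="480" width="640"><csx:link>/images/1a.jpg</csx:link></rx:image>
  <rx:image height="320" width="200"><csx:link>/images/1b.jpg</csx:link></rx:image>
</rx:imageReference>"""
)


def pick_image(image_reference, target_width, target_height):
    """Return the csx:link of the image that best fits the target slot.

    Hypothetical heuristic: smallest aspect ratio difference first,
    smallest pixel area difference as the tie-breaker.
    """
    target_ratio = target_width / target_height
    target_area = target_width * target_height

    def score(image):
        width = int(image.get("width"))
        height = int(image.get("height"))
        return (abs(width / height - target_ratio), abs(width * height - target_area))

    images = image_reference.findall(f"{{{RX}}}image")
    best = min(images, key=score)  # raises ValueError if there are no images
    return best.findtext(f"{{{CSX}}}link").strip()


print(pick_image(SAMPLE_REF, 320, 240))  # landscape slot
print(pick_image(SAMPLE_REF, 100, 160))  # portrait slot
```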

Images associated with a section

An <item> representing a section can optionally contain a single <sectionImageReference> with children <image> elements. If present, the default styling in the starter kit displays this image and its associated story in a prominent location within the corresponding section view. For this reason, the <sectionImageReference> must contain a reference to the <guid> of the associated story as indicated by the <story> element. Below is an example:

<item rx:type="Section">
  <title>Home</title>
  <link>http://somenewspaper.com/home.htm</link>
  <description>Major news headlines</description>
  <pubDate>Mon, 25 Dec 2006 11:00:00</pubDate>
  <rx:sectionImageReference>
    <rx:story>story1.xml</rx:story>
    <rx:caption>This is the caption.</rx:caption>
    <rx:credit>Joe Photo</rx:credit>
    <rx:image height="480" width="640">
      <csx:link>/images/1a.jpg</csx:link>
    </rx:image>
  </rx:sectionImageReference>
  ...

The story associated with this image must appear as an <item> in the feed.

Image suggestions

Although this spec does not restrict image sizes, experience suggests that a handful of image sizes can yield an excellent user experience. Consider the following guidelines:

  • Landscape images -- An aspect ratio of 2:1 with image sizes on the order of 600x300 works well
  • Portrait images -- An aspect ratio in the range of 2:3 to 3:4 with image sizes on the order of 300x450 works well
  • Thumbnail images -- Keep a 1:1 aspect ratio with an image size around 75x75; these are used for story previews

We have found these scenarios to work well:

  • Good -- a single landscape or portrait image, with a preference for landscape images
  • Better -- a landscape or portrait image + a thumbnail

Images with extreme aspect ratios usually cause poor results with respect to onscreen layout.

Element descriptions for Edition Feeds

<channel>

The channel contains a collection of <item>'s which represent the edition. An edition feed must have exactly one channel. This channel must have at least one <item>.

  • Required channel elements
    • <title> -- required by RSS 2.0
    • <link> -- required by RSS 2.0, link to web version of the publication (this is NOT a link to any part of the RSS feed)
    • <description> -- required by RSS 2.0
    • <rx:sections> -- contains an ordered list of <rx:section> elements. Each <rx:section> element must contain a single guid which refers to a section item in the feed. The order of these elements determines the order of the sections within the starter kit data model. Because all stories must be the children of at least one section, <rx:stories> cannot appear as a child of the <channel> element.
  • Optional channel elements
    • <pubDate> -- part of RSS 2.0, publication date for content in the channel
    • <lastBuildDate> -- part of RSS 2.0, HIGHLY RECOMMENDED for data sync purposes, last time content in the channel changed. In practice, this should be the most recent <csx:lastBuildDate> or <pubDate> for any <item> in the channel. Note that this element is part of RSS, and is NOT the <csx:lastBuildDate>.
    • <item> -- reference to an article or subsection. Typically a channel has several of these <item>'s.

<item>

This will represent a reference to either a story or a section.

  • Required item attributes
    • rx:type -- Section or Story. All other types are ignored.
  • Required item elements
    • <title> -- required by RSS 2.0
    • <guid> -- a string that uniquely identifies the content. The publisher is responsible for ensuring the uniqueness of this string.
    • <csx:link> -- This is a valid element only if the item represents a story. This is a link to the NITF full text story.
    • <rx:sections> -- This is a valid child only if the item represents a section and if the section contains subsections. It contains an ordered list of <rx:section> elements. The order of these elements determines the order of the subsections within the section.
    • <rx:stories> -- This is a valid child only if the item represents a section. It contains an ordered list of <rx:story> elements. The order of these elements determines the order of the stories within the section.
  • Optional item elements
    • <link> -- part of RSS 2.0, link to web version of referenced content (this is NOT a link to any part of the RSS feed)
    • <author> -- part of RSS 2.0
    • <description> -- part of RSS 2.0, if the item references a story, the <description> should contain an abstract of the story
    • <rx:imageReferences> -- represents a collection of images associated with a story. Only valid when the item represents a story.
    • <rx:sectionImageReference> -- represents an image associated with a section. Only valid when the item represents a section.
    • <pubDate> -- date item was published
    • <csx:lastBuildDate> -- date item was last changed. This element governs the behavior of the sync engine. If the copy of the data in the cache is older than the <csx:lastBuildDate>, the data is re-downloaded (this provides a mechanism to update stories after they're in cache). If <csx:lastBuildDate> is not present, it's assumed to be the <pubDate>. If <pubDate> is not present either, it's assumed to be Jan 1st, 1601.
    • <rx:properties> -- a collection of properties (key/value) pairs associated with the <item>. This allows you to add custom properties without extending the feed.

<rx:sectionImageReference>

This element can only be a child of <item>’s representing a section. Since a section <item> can only have one <rx:sectionImageReference>, no <rx:imageReferences> container element is needed.

  • Required child elements
    • At least one <rx:image> -- element representing an image at a particular size or aspect ratio
    • <rx:story> -- the GUID of the associated article.
  • Optional child elements
    • <rx:credit>
    • <rx:caption>

<image>

An <image> refers to a photo or figure of a specific size.

  • Required attributes
    • height -- describes the height of the image in pixels
    • width -- describes the width of the image in pixels
  • Required child elements
    • <csx:link> -- this is a link to the image which will be downloaded and cached

<rx:sections>

An <rx:sections> element denotes the order of subsections within a section. Alternatively, if this element is the child of the <channel> element, it denotes the order of the top level sections within the entire edition feed. Thus, <rx:sections> can only be the child of the <channel> element, or an <item> element representing a section with subsections.

  • Required child elements
    • <rx:section> -- <rx:sections> contains at least one <rx:section> element. The content of this element is the GUID of the subsection it represents.

<rx:stories>

An <rx:stories> element denotes the order of stories within a section. This element cannot be the child of the <channel> element, since we require all stories to be the children of some section. Thus, <rx:stories> can only be the child of an <item> element representing a section.

  • Required child elements
    • <rx:story> -- <rx:stories> contains at least one <rx:story> element. The content of this element is the GUID of the story it represents in the ordered list.

<rx:properties>

This element is a collection of <rx:property> elements.

  • Required child elements
    • <rx:property> -- an element representing a property
      • Required attributes
        • key - a string representing the name of the property

Example:

<rx:properties>
  <rx:property key="kicker">Editorial</rx:property>
  <rx:property key="templatetype">sportsRed</rx:property>
  <rx:property key="badge">Updated</rx:property>
</rx:properties>
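
Reading these key/value pairs back out of an <item> is straightforward. A minimal Python sketch follows; the helper name item_properties is hypothetical, and skipping properties that lack a key is a choice made for this example.

```python
import xml.etree.ElementTree as ET

RX = "http://schemas.microsoft.com/rss/2007/readerextensions"

SAMPLE_ITEM = ET.fromstring(
    f"""<item xmlns:rx="{RX}" rx:type="Story">
  <title>Story #1</title>
  <rx:properties>
    <rx:property key="kicker">Editorial</rx:property>
    <rx:property key="templatetype">sportsRed</rx:property>
    <rx:property key="badge">Updated</rx:property>
  </rx:properties>
</item>"""
)


def item_properties(item):
    """Return the item's rx:property entries as a dict of key -> text value."""
    container = item.find(f"{{{RX}}}properties")
    if container is None:
        return {}
    return {
        prop.get("key"): (prop.text or "").strip()
        for prop in container.findall(f"{{{RX}}}property")
        if prop.get("key")  # ignore entries without the required key attribute
    }


print(item_properties(SAMPLE_ITEM))
```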

Full Text Story Document

The <csx:link> of a story item points to a NITF full text news story. Although the NITF standard is very rich, the SDK will only use a subset of the available NITF elements. Any element that is not listed in the subset below (“Supported NITF elements”) will simply be ignored.

Except for the title, all elements within this subset are optional with respect to the starter kit. Application authors can add support for any elements they deem interesting. When considering which elements you will include in your NITF files, bear in mind that if you want some piece of information to be part of the user experience, it must be present in the feed.

Some application authors prefer to use other formats for representing textual content. The starter kit certainly supports this scenario, but it will require a document converter. These converters can be very simple or very complex, depending on the input format and the desired visual representation within the application.

Supported NITF elements

We give a concise description of the pertinent NITF elements here. The complete details of the NITF file format are available at http://www.nitf.org/ . See the accompanying example dataset for a concrete example. Unless noted otherwise, plain text is the only valid data for these elements (i.e. no HTML formatting).

<nitf>

  • Child elements
    • <body> -- required

<body>

  • Child elements
    • <headline>
      • <hl1> The main headline goes here
    • <byline> Byline goes here
    • <dateline> Dateline goes here
    • <body.content>
      • <block class="txt">
        • <p> text of the 1st paragraph
          • <a href="http://..."> -- hyperlinks are OK within a paragraph
          • <b>, <bold>, <i>, <italic>, <u>, <underline>, <em class/style="b,bold,i,italic,u,underline"> -- simple text formatting is OK
        • <p> text of the 2nd paragraph
        • ...
        • <p> text of last paragraph
      • <block class="blurb1">
        • <p> 1st paragraph of a pull quote
        • ... -- multiple paragraphs are permitted, though typically pull quotes have only one paragraph
        • <p> last paragraph of a pull quote
      • <block class="blurb2"> -- multiple pull quotes are allowed
        • <p> 1st paragraph of 2nd pull quote
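
To show how little of NITF an application needs to read, here is a Python sketch that extracts the elements outlined above from a small story document. It follows the simplified element layout given in this section rather than the full NITF DTD. The function name parse_story, the sample document, and the rule that any block whose class starts with "blurb" is a pull quote are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SAMPLE_STORY = """<nitf>
  <body>
    <headline><hl1>Main headline</hl1></headline>
    <byline>By A. Writer</byline>
    <dateline>SEATTLE</dateline>
    <body.content>
      <block class="txt">
        <p>First <b>bold</b> paragraph.</p>
        <p>Second paragraph.</p>
      </block>
      <block class="blurb1">
        <p>A pull quote.</p>
      </block>
    </body.content>
  </body>
</nitf>"""


def paragraph_text(p):
    """Flatten a <p>, dropping inline formatting tags but keeping their text."""
    return "".join(p.itertext()).strip()


def parse_story(nitf_xml):
    """Extract headline, byline, dateline, body paragraphs, and pull quotes."""
    body = ET.fromstring(nitf_xml).find("body")
    story = {
        "headline": body.findtext("headline/hl1"),
        "byline": body.findtext("byline"),
        "dateline": body.findtext("dateline"),
        "paragraphs": [],
        "pull_quotes": [],
    }
    # Locate <body.content> by tag comparison rather than a path expression.
    content = next((child for child in body if child.tag == "body.content"), None)
    if content is None:
        return story
    for block in content.findall("block"):
        paragraphs = [paragraph_text(p) for p in block.findall("p")]
        block_class = block.get("class", "")
        if block_class == "txt":
            story["paragraphs"].extend(paragraphs)
        elif block_class.startswith("blurb"):
            story["pull_quotes"].append(paragraphs)
    return story


parsed = parse_story(SAMPLE_STORY)
print(parsed["headline"], len(parsed["paragraphs"]), len(parsed["pull_quotes"]))
```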

Ad Feed

The ad feed referenced by the master feed is optional. To keep this spec short, the ad feed is specified in a separate document.

Rationale for multiple edition feeds

Frequently, application authors wish to provide consumers the ability to access different packages of content. For example, an application may make a new edition feed every day and expose the last seven days' worth of content. Alternatively, an application could make content available in several languages. The feed structure described in this document readily enables these scenarios.

Feed generators should take care not to force users to synchronize too much content. For this reason, the CSX spec includes the notion of the “onDemand” feed. Feeds denoted as onDemand are not downloaded by the background sync process. These feeds, however, can be downloaded by the app when a user wishes to see this content. The starter kit provides a simple method for syncing these on-demand feeds.

As a concrete example, consider a rich news feed that creates a new edition feed daily. Although this feed is updated throughout the day, a snapshot is taken at midnight which becomes an archive feed. The master feed could thus expose the current edition and two archive editions as standard edition feeds. These feeds would get synced by default. Additionally, four more archive editions could be denoted as onDemand edition feeds. When a user wishes to view this content, the feeds would then be downloaded by the app.

Complete Example Feed

For reference, the starter kit comes with an example feed. Developers can use this feed to immediately start customizing their own starter kit based application.
