TREC-BLOG

CONTENTS

  1. OVERVIEW
  2. MAILING-LIST
  3. DATASET
  4. TREC 2006
  5. TREC 2007 Tasks
  6. Provisional Timeline
  7. History of Document

OVERVIEW

The Blog track explores information-seeking behavior in the blogosphere. The track was first introduced in TREC 2006 and will run again in 2007.

This Wiki page provides the guidelines for participation in the 2007 edition of the TREC Blog track. Updates and new information will appear on this page.

MAILING-LIST

There is a mailing list for TREC-blog that is run by NIST. To subscribe to the trec-blog list, send an email message to listproc@nist.gov such that the body of the message consists of the line

  • subscribe trec-blog <FirstName> <LastName>

If you wish to contact the Blog track organisers, please email trecblog-organisers (at) dcs.gla.ac.uk

DATASET

A new test collection, called Blog06, was created for the TREC Blog track. License details and information on how to get access to the TREC Blog06 collection are provided in http://ir.dcs.gla.ac.uk/test_collections

The TREC Blog06 collection is a large sample of the blogosphere, and contains spam as well as possibly some non-blogs, e.g. RSS feeds from news broadcasters. It was crawled over an eleven-week period from 6th December 2005 until 21st February 2006. The collection is 148GB in size, consisting of:

  • 38.6GB of feeds

  • 88.8GB of permalink documents

  • 28.8GB of homepages

The collection contains over 3.2 million permalink documents. Further information on the Blog06 collection and how it was created can be found in the DCS Technical Report TR-2006-224, Department of Computing Science, University of Glasgow at http://www.dcs.gla.ac.uk/~craigm/publications/macdonald06creating.pdf

TREC 2006

In TREC 2006, we had two tasks: a main task (opinion retrieval) and an open task. The opinion retrieval task focused on a specific aspect of blogs: the opinionated nature of many blogs. The open task was introduced to give participants the opportunity to influence the choice of a suitable second task for 2007 on other aspects of blogs, such as the temporal/event-related nature of many blogs, or the severity of spam in the blogosphere.

Further detailed information about the TREC 2006 Blog track can be found at http://www.science.uva.nl/~mdr/Wikis/ The TREC 2006 Wiki is password protected. You will need to ask Maarten de Rijke for a login (mdr (at) science.uva.nl)

The TREC 2006 Blog track 'Overview paper' is in the Proceedings of TREC 2006, and is available from the TREC Web site at http://trec.nist.gov/pubs/trec15/papers/BLOG06.OVERVIEW.pdf. NB: You should cite this paper when you describe the opinion finding task in publications.

TREC 2007 Tasks

Last year, we ran an opinion retrieval task. This task will be run again in 2007, with a polarity subtask. Additionally, we propose a new task, called the Blog Distillation (Feed Search) task.

The Blog06 test collection will be used again in TREC 2007.

Please note that if you plan to take part in the TREC Blog track you need to respond to the TREC Call for Participation. Just to remind you, the CfP is at http://trec.nist.gov/call07.html

Opinion Retrieval Task

The opinion retrieval task involves locating blog posts that express an opinion about a given target. It can be summarised as "What do people think about <target>?". It is a subjective task. The target can be a "traditional" named entity -- a name of a person, location, or organization -- but also a concept (such as a type of technology), a product name, or an event. Note that the topic of the post does not necessarily have to be the target, but an opinion about the target must be present in the post or one of the comments to the post.

For example, for the target "skype":

Excerpt from relevant, opinionated post (permalink http://gigaom.com/2005/12/01/skype-20-eats-its-young/):

  • Skype 2.0 eats its young

    The elaborate press release and WSJ review while impressive don’t help mask the fact that, Skype is short on new ground breaking ideas. Personalization via avatars and ring-tones... big new idea? Not really. Phil Wolff over on Skype Journal puts it nicely when he writes, "If you’ve been using Skype, the Beta version of Skype 2.0 for Windows won’t give you a new Wow! experience." ...

Excerpt from unopinionated post (permalink http://www.slashphone.com/115/3152.html):

  • Skype Launches Skype 2.0 Features Skype Video

    Skype released the beta version of Skype 2.0, the newest version of its software that allows anyone with an Internet connection to make free Internet calls. The software is designed for greater ease of use, integrated video calling, and ...

Assessment

We will use the same assessment procedure as defined in 2006. The retrieval units are documents from the permalink component of the Blog06 test collection. The content of a blog post is defined as the content of the post itself and the contents of all comments to the post: if the relevant content is in a comment, then the permalink is declared to be relevant. Note that blogs and non-blogs will be treated equally in this task. Our objective is to run the opinion task again with 50 new topics, which we will again ask NIST to select from query logs provided by commercial blog search engines (see the TREC 2006 Blog track Overview paper for further details on the methodology).

The following scale will be used for the assessment:

 *[-1] i.e. Not judged.  The content of the post was not
    examined due to offensive URL or header (such documents do exist
    in the collection due to spam).  Although the content itself was not assessed,
    it is very likely, given the offensive header, that the post is
    irrelevant.

 *[0] i.e. Not relevant.  The post and its comments were
    examined, and do not contain any information about the target,
    or refer to it only in passing.

 *[1] i.e. Relevant.  The post or its comments contain
    information about the target, but do not express an opinion
    towards it.  To be assessed as "Relevant", the information given
    about the target should be substantial enough to be included in a
    report compiled about this entity.

If the post or its comments are not only on target, but also contain an explicit expression of opinion or sentiment about the target, showing some personal attitude of the writer(s), then judge the document using the labels below.

 *[2] i.e. Relevant, negative opinions. The post contains an explicit expression of opinion or sentiment about the target, showing some personal attitude of the writer(s), and the opinion expressed is explicitly negative about, or against, the target.

 *[3] i.e. Relevant, mixed positive and negative opinions. Same as [2], but contains both positive and negative opinions.

 *[4] i.e. Relevant, positive opinion. Same as [2], but the opinion expressed is explicitly positive about, or supporting, the target.

Evaluation

The number of test targets will be 50. These will be selected by NIST from a larger commercial query log, using the methodology described in the TREC 2006 Blog track Overview paper.

Metrics will be precision/recall based, with mean average precision (MAP) as the most important metric.
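
As a reminder of how MAP is computed, here is a minimal Python sketch. It assumes binary relevance and rankings without duplicate documents, and it averages over the topics that have at least one relevant document; the function names are ours and purely illustrative. Official figures should come from trec_eval, not from this sketch.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant docnos."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, docno in enumerate(ranking, start=1):
        if docno in relevant:
            hits += 1
            # Precision at the rank of each relevant document retrieved.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of average precision over topics with at least one relevant doc."""
    topics = [topic for topic, rel in qrels.items() if rel]
    if not topics:
        return 0.0
    total = sum(average_precision(run.get(topic, []), qrels[topic])
                for topic in topics)
    return total / len(topics)
```

A topic for which a run returns nothing scores zero in this sketch, which may differ from the default behaviour of the official evaluation tool.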

Polarity Subtask

We propose to add a related text classification subtask, requiring participants to determine the polarity (or orientation) of the opinions in the retrieved documents, namely whether the opinions are positive or negative. For training, participants could use last year’s 50 queries, with their associated relevance judgements, available from http://trec.nist.gov/data/blog06.html. Indeed, during the assessment procedure in 2006, the NIST assessors specified the polarity of each relevant document in the pool: relevant positive opinion; relevant mixed positive and negative; relevant negative opinion.

Evaluation will probably be by Accuracy and the F measure.
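
Since the exact evaluation has not been fixed, the following Python sketch is only one plausible reading: accuracy over all judged documents, and an F measure taken as F1 per polarity label and then macro-averaged over the labels 2, 3 and 4. The averaging scheme and the function names are our assumptions, not a definition of the official measure.

```python
from typing import List, Sequence


def accuracy(gold: List[int], pred: List[int]) -> float:
    """Fraction of documents whose predicted label equals the assessed label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_for_label(gold: List[int], pred: List[int], label: int) -> float:
    """F1 (harmonic mean of precision and recall) for a single label."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: List[int], pred: List[int],
             labels: Sequence[int] = (2, 3, 4)) -> float:
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)
```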

Submissions

Opinion Task

For the opinion task, the usual trec_eval format will be used. The submission file should contain lines of the format

  topic Q0 docno rank sim runtag

where

  topic is the topic number
  Q0 is a literal "Q0"  (a historical field...)
  docno is the permalink document number (BLOG06-200.....-...)
  rank is the rank at which the system returned the document (1 .. n)
  sim is the system's similarity score
  runtag is the run's identifier string
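
For illustration, a minimal Python sketch that writes lines in this format for one topic is given below. The topic number, document numbers, scores and run tag are placeholders, and the function name is hypothetical; real docnos come from the permalink component of Blog06.

```python
from typing import Iterable, List, Tuple


def format_run(topic: int, scored_docs: Iterable[Tuple[str, float]],
               runtag: str) -> List[str]:
    """Return 'topic Q0 docno rank sim runtag' lines, best score first."""
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    return [
        f"{topic} Q0 {docno} {rank} {sim:.4f} {runtag}"
        for rank, (docno, sim) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    # Placeholder document numbers and scores, for illustration only.
    docs = [("EXAMPLE-DOCNO-B", 1.5), ("EXAMPLE-DOCNO-A", 3.25)]
    for line in format_run(1, docs, "exampleRun1"):
        print(line)
```

Ranks start at 1 and follow decreasing similarity score, so the rank and sim columns stay consistent with each other.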

You may submit up to six runs for the opinion finding task, which must include the following compulsory runs:

  • An automatic run using the title field of the topics.

  • An automatic run, using the title field of the topics, with all opinion-finding features turned off (i.e. a topic-relevance run). The documentlabel field should be set to 0.

Aside from the required runs, we wholeheartedly encourage the submission of manual runs, which are invaluable in improving the quality of the collection. (An automatic run is one that involves no human interaction. In contrast, a manual run is one where (for example) you formulate queries, search manually, give relevance feedback, and/or rerank documents by hand.)

Polarity Subtask

Additionally, if you are participating in the polarity subtask, you should provide a separate file for each run submitted to the opinion finding task, detailing the predicted polarity of each retrieved document for each query. This file should include the same documents in the same order as the opinion finding run, but with an additional polarity label.

Format:

  topic docno documentlabel

where

  topic is the topic number
  docno is the permalink document number (BLOG06-200.....-...)
  documentlabel is the system's prediction of polarity - one of: [0,2,3,4]

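The documentlabel field states whether the document is predicted to be negatively opinionated (2), of mixed polarity (3), or positively opinionated (4). The label 0 is used for the topic-relevance run, which has all opinion-finding features turned off.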
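
One way to keep the polarity file aligned with its opinion finding run is to derive it from the run lines themselves. The Python sketch below does this under the assumption that predictions are held in a dictionary keyed by (topic, docno); the function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

ALLOWED_LABELS = {0, 2, 3, 4}  # the label set given in the format above


def polarity_lines(run_lines: List[str],
                   labels: Dict[Tuple[str, str], int]) -> List[str]:
    """Build 'topic docno documentlabel' lines in the same order as the run."""
    out = []
    for line in run_lines:
        topic, _q0, docno, _rank, _sim, _runtag = line.split()
        label = labels[(topic, docno)]
        if label not in ALLOWED_LABELS:
            raise ValueError(f"invalid label {label} for {topic} {docno}")
        out.append(f"{topic} {docno} {label}")
    return out
```

Because the output is generated by iterating over the run, it contains the same documents in the same order by construction.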

Blog Distillation (Feed Search) Task

Blog search users often wish to identify blogs about a given topic, which they can subscribe to and read on a regular basis. This user task is most often manifested in two scenarios:

  • Filtering: The user subscribes to a repeating search in their RSS reader.

  • Distillation: The user searches for blogs with a recurring central interest, and then adds these to their RSS reader.

For TREC 2007, we recommend that the TREC Blog track investigate the latter scenario: Blog Distillation. The Blog Distillation task can be summarised as "Find me a blog with a principal, recurring interest in X". For a given area X, systems should suggest feeds that are principally devoted to X over the timespan of the feed, and that a user would be recommended to subscribe to as an interesting feed about X (i.e. a user may be interested in adding it to their RSS reader).

This task is particularly interesting for the following reasons:

  • A similar (yet different) task has been investigated in the Enterprise track (Expert Search), in a smaller setting of around 1000 candidate experts. For Blog Distillation, the Blog06 corpus contains around 100k blogs and offers a Web-like setting (with anchor text, linkage, spam, etc).

  • A Topic Distillation task was run in the Web track. In Topic Distillation, a relevant site was required to be one that (i) is principally devoted to the topic, (ii) provides credible information on the topic, and (iii) is not part of a larger site also principally devoted to the topic.

While the definition of Blog Distillation as explained above is different, the idea is to provide the users with the key blogs about a given topic. Note that point (iii) is not applicable in a blog setting.

Operationality

This year, NIST cannot provide resources for a blog distillation task. Following the TREC Enterprise track, we suggest that the topics and assessments are provided by participating groups.

  • (24th June): Each participating group will initially provide 6 or 7 topics along with some relevant feeds.

  • (After submission): Relevant feeds will be pooled, and the groups which proposed topics will evaluate them.

The proposed assessment guidelines are at TREC-BLOG/BlogDistillationAssessmentGuidelines.

Topic Development Phase

We need each participating group to create 6 or 7 topics for this task. Your aim is to identify some topics 'X', each with a few (e.g. 2 or 3) relevant feeds (identified by their feedno), and send them in an email to ian.soboroff (AT SYMBOL) nist.gov. PLEASE DO NOT POST THEM TO THE MAILING LIST.

Format:

<top>
<title> a short query-ish title </title>

<desc> Description:
The desc is a sentence-length description of what you are looking for, and should include the title words.
</desc>

<narr> Narrative:
The narr is a paragraph-length description of what you are looking for.  Use it to give details on what feeds or blogs are relevant and what feeds or blogs are not.  If there are "gray areas", state them here.
</narr>

<feeds>
feedno
feedno
feedno
</feeds>

<comments>
Anything else you want to say.
</comments>

</top>

Example:

<top>
<title> solaris </title>

<desc> Description:
Blogs describing experiences administrating the Solaris operating system, or its new features or developments.
</desc>

<narr> Narrative:
Relevant blogs will post regularly about administrating or using the Solaris operating system from Sun, its latest features or developments. Blogs with posts about Solaris the movie are not relevant, nor are blogs which only have a few posts about Solaris.
</narr>

<feeds>
BLOG06-feed-053948
BLOG06-feed-078402
BLOG06-feed-018020
</feeds>

<comments>
None.
</comments>

</top>
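
If you want to check a topic before emailing it, the Python sketch below reads the fields back out of the format shown above. It is a rough regular-expression reader written for this page, not an official tool, and the function name is hypothetical. It assumes each field is closed by its matching tag and that feednos are separated by whitespace or newlines.

```python
import re
from typing import Dict, List, Union

FIELD_RE = re.compile(r"<(title|desc|narr|feeds|comments)>(.*?)</\1>", re.DOTALL)
# Label words that follow the opening tag in the topic format.
PREFIXES = {"desc": "Description:", "narr": "Narrative:"}


def parse_topic(text: str) -> Dict[str, Union[str, List[str]]]:
    """Extract the fields of one <top> block into a dictionary."""
    topic: Dict[str, Union[str, List[str]]] = {}
    for name, body in FIELD_RE.findall(text):
        body = body.strip()
        prefix = PREFIXES.get(name)
        if prefix and body.startswith(prefix):
            body = body[len(prefix):].strip()
        if name == "feeds":
            topic[name] = body.split()           # one feedno per token
        else:
            topic[name] = " ".join(body.split())  # collapse line breaks
    return topic
```

A quick sanity check is that the returned "feeds" list holds the two or three feednos you intended and that "title", "desc" and "narr" are all present.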

The topic development phase lasts strictly 2 weeks: all topics should be submitted to Ian by the end of Sunday 24th June.

Topic Development System

To help the participating groups create their blog distillation topics, we have provided a standard search system for *documents* on the Blog06 collection. It also displays the feed for each document, and you can view all the documents for a given feed. You can access it at: http://ir.dcs.gla.ac.uk/terrier/search_blogs06/

If you have your own search system for the Blog06 collection (say, from last year's track), feel free to use that.

You don't need to state all the relevant feeds for a topic, as there will be a separate assessment phase in September, after all runs have been submitted.

Evaluation

Participants can submit up to 4 runs. Each run ranks feeds by their likelihood of having a principal (recurring) interest in the topic. We suggest that up to 100 feeds are returned per topic. As usual, one automatic, title-only run is required.

As with the opinion task, runs submitted to the blog distillation task will follow the usual trec_eval format, but with feed numbers in place of permalink document numbers, i.e.

  topic Q0 feedno rank sim runtag

where

  topic is the topic number
  Q0 is a literal "Q0"  (a historical field...)
  feedno is the feed number of the blog (BLOG06-feed-......)
  rank is the rank at which the system returned the feed (1 .. n)
  sim is the system's similarity score
  runtag is the run's identifier string
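
Before submitting, a run file can be sanity-checked with something like the Python sketch below. The six-digit feedno pattern is an assumption inferred from the example feednos on this page, the limit of 100 feeds per topic is the suggested figure rather than a hard rule, and the function name is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

FEEDNO_RE = re.compile(r"^BLOG06-feed-\d{6}$")  # assumed from the example feednos
MAX_FEEDS_PER_TOPIC = 100                       # suggested limit, not a hard rule


def check_feed_run(lines: List[str]) -> List[str]:
    """Return a list of human-readable problems found in a feed run."""
    problems: List[str] = []
    per_topic: Dict[str, int] = defaultdict(int)
    for n, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != 6:
            problems.append(f"line {n}: expected 6 fields, found {len(fields)}")
            continue
        topic, q0, feedno, rank, sim, _runtag = fields
        if q0 != "Q0":
            problems.append(f"line {n}: second field should be Q0")
        if not FEEDNO_RE.match(feedno):
            problems.append(f"line {n}: unexpected feedno {feedno}")
        if not rank.isdigit() or int(rank) < 1:
            problems.append(f"line {n}: rank should be a positive integer")
        try:
            float(sim)
        except ValueError:
            problems.append(f"line {n}: sim is not a number")
        per_topic[topic] += 1
    for topic, count in per_topic.items():
        if count > MAX_FEEDS_PER_TOPIC:
            problems.append(f"topic {topic}: {count} feeds returned")
    return problems
```

An empty list means no problems were found by these checks; it does not guarantee that the run will be accepted by the official submission system.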

Provisional Timeline

  • 9th June: Blog Distillation Task Topic Development Starts

  • 24th June: Blog Distillation Task Topic Development Ends

  • 25th June: Opinion Task topics posted

  • 30th June: Blog Distillation task Topics posted

  • 6th August: Opinion Task runs due

  • 17th August: Blog Distillation runs due

  • September: Blog Distillation Task Assessment by participants.

  • October: Relevance judgements for both tasks sent to participants

  • November: TREC

History of Document

  • June 8, 2007: timeline updated for topics posted, both tasks

  • May 29, 2007: timeline updated for opinion task

  • March 9, 2007: updated for tasks

  • February 5, 2007: first draft

last edited 2007-06-09 16:13:00 by IadhOunis