Roadmap - renepickhardt/related-work.net GitHub Wiki

This is our roadmap for the development of related-work.net.

We plan to make a first release of a beta version of our website in early November 2012. The aim is to finish the basic functionality and get the first users on the site. We expect a major redesign, especially of the database backend, after learning our lessons from implementing and using the beta version.

Description of the website

The basic design of the website is already implemented at dev.related-work.net. For the first release we want a complete rewrite of the front end in Java using the Google Web Toolkit (GWT). We expect this to bring a huge speed advantage over the Python/WSGI technology currently used.

The basic site structure is as follows:

Front page
Sparse page with a big search box, some example searches and links to further information about the project.
Search results
Displays articles and authors matching the query.
Article pages
Display metadata, references and citations. In a sidebar, we also suggest related work that might interest the user.
Author pages
Display papers written by the author, and help find coauthors and other authors the user might be interested in.

Datasets

We are currently working with data extracted from www.arxiv.org in March 2012. In the future we plan to import further datasets, including:

PubMed Central (http://www.ncbi.nlm.nih.gov/pmc/)
DBLP (http://www.informatik.uni-trier.de/~ley/db/)
CiteULike

Work on the PubMed data is already in progress.

RDF Export and Ontologies

We want to make our data available in machine-readable form following an RDF standard. We will provide:

a simple interface to retrieve our data via URL
SPARQL interface (later on)
database dumps

We still have to decide which exact formats/ontologies to use.

http://vocab.org/review/terms.html#
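
Since the formats and ontologies are still undecided, the following is only a minimal sketch of what a per-article RDF export could look like. It serializes one article record as N-Triples using Dublin Core terms. The base URI, the record layout (`id`, `title`, `authors`, `references`) and the choice of Dublin Core are assumptions made for illustration, not decisions of the project.

```python
# Hypothetical base URI for article resources; not fixed by the roadmap.
BASE = "http://related-work.net/article/"
# Dublin Core namespaces, used here only as one possible vocabulary.
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape a string so it can be used inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def article_to_ntriples(article):
    """Serialize one article record (assumed dict layout) as N-Triples."""
    subject = "<{}{}>".format(BASE, article["id"])
    lines = ['{} <{}title> "{}" .'.format(
        subject, DC, escape_literal(article["title"]))]
    for name in article.get("authors", []):
        lines.append('{} <{}creator> "{}" .'.format(
            subject, DC, escape_literal(name)))
    for ref_id in article.get("references", []):
        # References point at other article resources, not at literals.
        lines.append('{} <{}references> <{}{}> .'.format(
            subject, DCTERMS, BASE, ref_id))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example record.
    example = {
        "id": "a1",
        "title": "An example article",
        "authors": ["A. Author"],
        "references": ["b2"],
    }
    print(article_to_ntriples(example))
```

A line-based format such as N-Triples would also suit the planned database dumps, because each statement stands on its own line and a dump can be streamed or split without a parser.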

Documentation

For the release we want to get our documentation up to date. People should be able to run the system on their own computers and understand the sources easily if they want to.

Todo:

consolidate source code into one repository
improve documentation of data import
update links to source code in old blog posts
write blog posts about the process

Licensing

We plan to publish all of our source code and data (!) under an open license. However, we still have to:

check the permissions/licenses for use of the arXiv data
decide on a good open-source license for the source code

User generated content

We want to enable users to

provide data for the recommender through their browsing behavior
provide links to resources
find duplicate entries (and clean up the database in general)
discuss the scientific work
upload their own work to our preprint archive
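
To support users in finding double entries, the system could pre-compute candidate groups for them to confirm. The sketch below is one simple way to do that, assuming article records are dicts with `id` and `title` fields (a hypothetical layout): it groups records whose titles are equal after normalization. It is not the project's method, and it will miss duplicates whose titles differ by more than case, accents or punctuation.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_title(title):
    """Lowercase, strip accents, and collapse punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed
                         if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def find_duplicate_candidates(articles):
    """Return lists of article ids that share the same normalized title."""
    groups = defaultdict(list)
    for article in articles:
        groups[normalize_title(article["title"])].append(article["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Made-up example records.
    records = [
        {"id": 1, "title": "Graph Databases: A Survey"},
        {"id": 2, "title": "graph databases - a survey "},
        {"id": 3, "title": "Another Paper"},
    ]
    print(find_duplicate_candidates(records))
```

The output is meant as a list of suggestions only. Merging would still be left to users, since distinct papers can share a title.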