Useful Resources

DATA

BioASQ Task 5a: Large-scale online biomedical semantic indexing

Existing PubMed documents can be used as training data for this task. Pre-processed PubMed documents are provided, but participants are encouraged to do their own pre-processing.

Evaluation will take place on new documents that are added to PubMed and have not been annotated by curators at the time of submission.

BioASQ Task 5b: Biomedical Semantic QA (involves IR, QA, summarization)

The test dataset of Task 5b will be released in five batches, each containing approximately 100 questions. The first batch will be released on March 8, 2017. Separate winners will be announced for each batch. Participation in the task can be partial; for example, it is acceptable to participate in only some of the batches, to return only relevant articles (and no concepts, triples, or article snippets), or to return only exact answers (or only 'ideal' answers). System responses will be evaluated both automatically and manually.

Sample data for both tasks can be downloaded from the BioASQ Participants Area (no registration required).

TOOLS

HEMKit software (zip), a collection of hierarchical evaluation measures.
Continuous space word vectors released by BioASQ, obtained by applying Word2Vec to PubMed abstracts.
BioASQ annotation and assessment tools (with tutorial)

BioASQ social network (with tutorial)