BiP! Finder: Assisting life scientists to identify valuable scientific publications

BiP! Finder (Biomedical Publication Finder) is a freely available tool providing search facilities for the identification of valuable open access publications in life sciences. The key feature of BiP! Finder’s search engine is that it supports ranking of the retrieved publications based on two different publication impact aspects: popularity (short-term impact) and influence (long-term impact). The user can select which one better fits her current needs. BiP! Finder also provides extra functionalities, such as comparing publications based on different impact aspects and other characteristics, infographics visualising useful data for each publication, and saving publications as bookmarks for later reading.

In the last decades, the growth rate of scientific publications has been increasing, a trend that is expected to continue[1,2]. This is not only due to the increase in the number of researchers worldwide[3], but also to the growing competition that pressures them to continuously produce publishable results, a trend known as “publish or perish”[4]. This trend has also been notoriously correlated with a significant drop in the average quality of scientific papers[5,6]. As a result, many important tasks during the researchers’ daily routine have become extremely tedious and time-consuming. More importantly, it has become difficult for researchers to identify valuable publications relevant to particular research topics.

Quantifying and measuring the impact of scientific publications could facilitate the above task. Publications’ impact, combined with keyword-based relevance, can be used to implement ranking schemes beyond the traditional content-based ranking (i.e., “most similar on the top”). However, measuring impact is not a trivial task, and we may easily stumble into pitfalls along the way. It is an oversimplification to assume that there is a single, one-size-fits-all publication impact measure. In fact, such an impact may have several aspects that should be captured and understood in many different ways[7].

For example, a researcher often needs to search for the most “popular” publications in a field, i.e., those which are currently the focal point of the scientific community. On the other hand, a researcher may be interested in collecting “influential” papers for a field, i.e., those which have strongly shaped the field. Each of these impact aspects of a publication, i.e., popularity and influence, can be estimated by performing a different type of link analysis on the underlying citation network. Note that there are also other impact aspects, such as social media attention, captured by alternative metrics which do not rely on citation data[8]. Depending on each researcher’s ongoing needs, a different impact aspect might be preferable. Moreover, knowing about all the different aspects creates a more complete picture of the impact of a publication.

The vision behind our tool, BiP! Finder (Biomedical Publication Finder), is to provide search facilities to identify valuable publications in life sciences based on several impact aspects. Currently, our tool supports popularity (short-term impact) and influence (long-term impact). In the near future, we also plan to support alternative impact metrics. All data used for the calculations needed to measure both impact aspects are currently gathered from NCBI’s PubMed Central database.

In BiP! Finder, influence is calculated using PageRank[9] (the algorithm introduced by Google to measure the importance of Web pages) on the underlying citation network. PageRank estimates the importance of a Web page based on the number and importance of all hyperlinks to it. In the context of citation networks, PageRank can estimate a publication’s influence by considering its citations.  
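To make the idea concrete, here is a minimal sketch of PageRank on a tiny citation network. It is only an illustration under stated assumptions, not BiP! Finder’s actual implementation: the function name `pagerank`, the damping value 0.85, the iteration settings and the toy paper ids are all chosen here for the example.

```python
def pagerank(citations, damping=0.85, iterations=100, tol=1e-10):
    """Plain PageRank by power iteration (illustrative sketch).

    citations: dict mapping each paper id to the list of papers it cites.
    The damping factor and stopping rule are assumptions for this example.
    """
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)
    scores = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers that cite nothing ("dangling" nodes) spread their score evenly.
        dangling = sum(scores[p] for p in papers if not citations.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        # Every citing paper passes an equal share of its score to each reference.
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * scores[citing] / len(cited_list)
                for cited in cited_list:
                    new[cited] += share
        delta = sum(abs(new[p] - scores[p]) for p in papers)
        scores = new
        if delta < tol:
            break
    return scores


# Toy network (hypothetical ids): B, C and D cite A; D also cites C.
toy = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "C"]}
scores = pagerank(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network, A collects citations from every other paper and ends up first in `ranking`, while C outranks B because it is cited once and B is not cited at all.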

Popularity is calculated by executing FutureRank[10] on the citation network. PageRank is not appropriate for estimating a publication’s short-term impact because it relies on the publication’s current centrality in the citation network, and any recent publication has low centrality simply because its first citations usually take months or even years to appear (a phenomenon known as “citation gap”[11-14]). As a result, PageRank is biased against recently published papers[15-17]. FutureRank is a PageRank variation that considers the publication’s metadata (in particular, its publication year and author list) to reduce this bias.
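The effect of taking publication year into account can be illustrated with a simplified sketch. This is not the published FutureRank algorithm (which also exploits the author list) and not BiP! Finder’s code: it only biases the random jump of PageRank towards recent papers using an exponential decay. The function name `recency_weighted_rank`, the parameters `damping` and `rho`, and the toy data are assumptions made for the example.

```python
import math


def recency_weighted_rank(citations, years, current_year,
                          damping=0.5, rho=0.6, iterations=100):
    """PageRank-style scores with a recency-biased random jump (sketch only).

    citations: dict mapping a paper id to the list of papers it cites.
    years: dict mapping every paper id (citing or cited) to its publication year.
    damping and rho are illustrative values, not tuned settings.
    """
    papers = sorted(years)
    # Newer papers get an exponentially larger share of the random jump.
    raw = {p: math.exp(-rho * (current_year - years[p])) for p in papers}
    total = sum(raw.values())
    jump = {p: raw[p] / total for p in papers}

    scores = dict(jump)
    for _ in range(iterations):
        # Score held by papers without references is redistributed via the jump.
        dangling = sum(scores[p] for p in papers if not citations.get(p))
        new = {p: (1.0 - damping + damping * dangling) * jump[p] for p in papers}
        for citing in papers:
            cited_list = citations.get(citing, [])
            if cited_list:
                share = damping * scores[citing] / len(cited_list)
                for cited in cited_list:
                    new[cited] += share
        scores = new
    return scores


# Toy data (hypothetical): "old" has two citations, "new" has only one so far.
years = {"old": 2005, "x": 2006, "y": 2007, "new": 2018, "z": 2018}
citations = {"x": ["old"], "y": ["old"], "z": ["new"], "old": [], "new": []}
scores = recency_weighted_rank(citations, years, current_year=2019)
```

Although "old" has more citations, the recency bias lets "new" score higher in this toy example, which is the kind of correction a popularity measure needs.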

BiP! Finder’s users can benefit from the measured popularity and influence scores, since these scores are incorporated (along with the keyword relevance scores) into the ranking mechanism of the keyword search feature. The user can select which of the two impact aspects is more important for her current needs by choosing the corresponding radio button below the search box (see Figure 1). Based on her choice, more popular or more influential publications will appear at the top of the list. Moreover, the user can narrow down the results using a set of filters located at the top left corner of the Web page.

Figure 1. A screenshot of the main search interface of BiP! Finder
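How relevance and impact are combined is not detailed here, so the following is only a hedged sketch of one possible scheme: both scores are min-max normalised and mixed with a weight. The function name `rank_results`, the `impact_weight` parameter and the sample records are hypothetical.

```python
def rank_results(results, aspect="popularity", impact_weight=0.5):
    """Order search hits by a mix of keyword relevance and one impact aspect.

    results: list of dicts with 'id', 'relevance', 'popularity', 'influence'.
    The linear mix and impact_weight are assumptions for illustration only.
    """
    if not results:
        return []

    def normalise(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    relevance = normalise([r["relevance"] for r in results])
    impact = normalise([r[aspect] for r in results])
    combined = [(1 - impact_weight) * rel + impact_weight * imp
                for rel, imp in zip(relevance, impact)]
    order = sorted(range(len(results)), key=lambda i: combined[i], reverse=True)
    return [results[i]["id"] for i in order]


# Hypothetical hits for one keyword query.
hits = [
    {"id": "p1", "relevance": 0.9, "popularity": 0.1, "influence": 0.8},
    {"id": "p2", "relevance": 0.8, "popularity": 0.9, "influence": 0.2},
    {"id": "p3", "relevance": 0.3, "popularity": 0.5, "influence": 0.5},
]
by_popularity = rank_results(hits, aspect="popularity")
by_influence = rank_results(hits, aspect="influence")
```

Switching `aspect` plays the role of the radio buttons: the same hits come back in a different order depending on which impact aspect is selected.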

Publications can be compared on all supported impact aspects and on some other characteristics (e.g., the readability of the abstract, measured by Flesch Reading Ease). To do so, click on the records to be compared and then on the orange button that appears at the top right corner of the Web page. The result is a new page featuring a radar chart that compares the selected publications (see Figure 2).

Figure 2. A screenshot of the publication comparison page
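For readers unfamiliar with the readability measure mentioned above, the standard Flesch Reading Ease formula can be sketched as follows. How BiP! Finder counts words, sentences and syllables is not described here, so this hypothetical helper simply takes those counts as inputs.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease score; higher values mean easier text.

    The three counts are supplied by the caller, since syllable counting
    heuristics vary between tools.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Hypothetical abstract: 100 words in 5 sentences with 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
```

Longer sentences and longer words both lower the score, so a dense abstract will typically score lower than a plainly written one.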

Finally, some features for registered users have also been implemented. After logging in, a user can save interesting publications in her list of bookmarks and access them by clicking on the corresponding menu item.

2019 will bring many improvements and new features for BiP! Finder users. First of all, we plan to incorporate some alternative impact aspects (e.g., social media attention, number of bookmarks or views in BiP! Finder). Moreover, we plan to make BiP! Finder a useful tool for researchers of all disciplines. To this end, we have started to build a larger, interdisciplinary citation network (with around 50M nodes) based on citations provided by OpenCitations. Finally, we plan to improve the functionality for registered users by letting them organise their bookmarks in folders and add private comments to their bookmarked papers.

Always keep in mind that BiP! Finder is (and will remain) a completely free tool, developed and hosted by IMSI, Athena RC. Our first priority is to provide open science services to the research and academic community.

References

[1] L. Bornmann and R. Mutz. Growth rates of modern science: a bibliometric analysis based on the number of publications and cited references. Journal of the Association for Information Science and Technology, 66(11):2215-2222, 2015
[2] P. Larsen and M. Von Ins. The rate of growth in scientific publication and the decline in coverage provided by science citation index. Scientometrics, 84(3):575-603, 2010
[3] UNESCO Science Report: towards 2030. UNESCO Publishing, 2015
[4] D. Fanelli. Do pressures to publish increase scientists’ bias? An empirical support from US states data. PLOS ONE, 5(4):1-7, 2010
[5] J. P. Ioannidis. Why most published research findings are false. PLOS Med, 2(8):e124, 2005
[6] D. Sarewitz. The pressure to publish pushes down quality. Nature, 533(7602):147, 2016
[7] J. Bollen, H. Van de Sompel, A. Hagberg, and R. Chute. A principal component analysis of 39 scientific impact measures. PLOS ONE, 4(6):e6022, 2009
[8] J. Lin and M. Fenner. Altmetrics in Evolution: Defining and Redefining the Ontology of Article-Level Metrics. Information Standards Quarterly, 25(2), 20, 2013
[9] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the Web. Stanford InfoLab (Technical Report), 1999
[10] H. Sayyadi and L. Getoor. FutureRank: Ranking scientific articles by predicting their future PageRank. SDM, p. 533-544, 2009
[11] E. V. Bernstam, J. R. Herskovic, Y. Aphinyanaphongs, C. F. Aliferis, M. G. Sriram, and W. R. Hersh. Using citation data to improve retrieval from MEDLINE. Journal of the American Medical Informatics Association, 13(1):96-105, 2006
[12] V. P. Diodato and P. Gellatly. Dictionary of Bibliometrics (Haworth Library and Information Science). Routledge, 1994
[13] P. Groth and T. Gurney. Studying scientific discourse on the web using bibliometrics: A chemistry blogging case study. 2010
[14] D. R. Smith. A 30-year citation analysis of bibliometric trends at the archives of environmental health, 1975-2004. Archives of environmental & occupational health, 64(sup1):43-54, 2009
[15] P. Chen, H. Xie, S. Maslov, and S. Redner. Finding scientific gems with Google’s PageRank algorithm. Journal of Informetrics, 1(1):8-15, 2007
[16] W.-S. Hwang, S.-M. Chae, S.-W. Kim, and G. Woo. Yet another paper ranking algorithm advocating recent publications. In Proceedings of the 19th international conference on World wide web, pages 1117-1118. ACM, 2010
[17] P. S. Yu, X. Li, and B. Liu. On the temporal dimension of search. In Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pages 448-449. ACM, 2004

Unless otherwise indicated, content hosted on OpenUP Hub is licensed under an Attribution 4.0 International (CC BY 4.0).