
Supporting online health queries by modeling patterns of creation, modification and retrieval of medical knowledge

L. Lahti, Aalto University School of Science, Finland

EdMedia + Innovate Learning, in Vancouver, BC, Canada. ISBN 978-1-939797-24-7. Publisher: Association for the Advancement of Computing in Education (AACE), Waynesville, NC


We evaluated properties of knowledge resources that can be used to build new semantically and behaviorally motivated resources for health guidance and clinical decision making by modeling patterns of creation, modification and retrieval of medical knowledge. We evaluated statistical properties of Wikipedia articles on general terminology and on medical terminology based on the 25 most common diagnosis names emerging in an electronic health records system. We also evaluated statistical properties of general terminology used in everyday life with respect to occurrence and importance, to enable adaptive perspectives on medical knowledge. Our experiments exploit a conceptual co-occurrence network containing 57 679 unique conceptual links, which we created from a set of 93 medical texts on healthcare guidelines provided by The Finnish Medical Society Duodecim. We provide supplementary statistics on an extended range of Wikipedia articles and an n-gram analysis of the set of medical texts.
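To make the two text-analysis ideas named in the abstract concrete, the following is a minimal Python sketch of counting co-occurrence links and n-grams over a small set of texts. It is not the paper's method: the abstract does not say how concepts were identified or what counts as a link, so this sketch assumes that every lowercase word is a concept and that two concepts are linked when they share a sentence. The function names `tokenize`, `cooccurrence_links` and `ngram_counts` and the sample texts are hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def tokenize(text):
    """Split text into lowercase alphabetic tokens (assumption: one token = one concept)."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurrence_links(texts):
    """Count unordered concept pairs that appear in the same sentence."""
    links = Counter()
    for text in texts:
        for sentence in re.split(r"[.!?]+", text):
            # Sort so that each pair has a single canonical (alphabetical) form.
            concepts = sorted(set(tokenize(sentence)))
            links.update(combinations(concepts, 2))
    return links


def ngram_counts(texts, n=2):
    """Count word n-grams per text (n-grams may span sentence boundaries)."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Illustrative sample texts, not taken from the Duodecim guidelines.
texts = [
    "Hypertension raises stroke risk. Diabetes raises stroke risk.",
    "Diabetes and hypertension often occur together.",
]

links = cooccurrence_links(texts)
bigrams = ngram_counts(texts, n=2)

print(len(links), "unique links")
print(bigrams.most_common(3))
```

Here `len(links)` plays the role of the "unique conceptual links" figure reported in the abstract, under the simplifying assumptions stated above. A real pipeline would likely restrict concepts to a controlled vocabulary rather than all words.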


Lahti, L. (2016). Supporting online health queries by modeling patterns of creation, modification and retrieval of medical knowledge. In Proceedings of EdMedia 2016--World Conference on Educational Media and Technology (pp. 711-718). Vancouver, BC, Canada: Association for the Advancement of Computing in Education (AACE). Retrieved February 16, 2019 from .


