Scientific and medical research has seen explosive growth in the past few decades. Since 1996, the United States National Library of Medicine (NLM) has maintained PubMed, a free portal providing access to references and abstracts on life sciences and biomedical topics. PubMed now has over 21 million citations going back to 1966, and continues to add a staggering amount (about 500,000 new records) each year. The chart below was adapted from a recently published journal article about PubMed.

Article by Zhiyong Lu

Today, clinical professionals have tools (like Ovid, ScienceDirect, UpToDate, Trip) that help answer complex questions and are connected to validated knowledge bases derived from sources like PubMed. But how does a patient with no access to, or expertise in, the domain find and use this information? Medify tries to solve that.

The value proposition of Medify is not easy to describe. In fact, the ‘What is Medify’ description on the site was banal enough to be dismissed, just like most other online social health startup marketing. They do a better (albeit prolix) job on the ‘How it works’ page. Medify will appeal to well-informed patients who are not afraid to sift through piles of academic articles burdened with medical jargon to understand and manage their own disease. Medify gives them a dashboard of existing literature, with which they can monitor things like which treatments are gaining traction in the provider community and which institutions are at the forefront of relevant research. Web 2.0 features like faceted search, social sharing, tracking, and annotation are bundled in to make it more personal.

Under the hood, it smartly leverages the public knowledge bases that are already out there. Citations and abstracts come free from PubMed. Interstitial phrases and terms in the content are further linked to sources like Wikipedia and MeSH. Brief outcomes or summaries are synthetically constructed from the article text.

Medify is not alone. There are other sites that try to help patients navigate the vast sea of research literature. PubMed’s parent NLM runs MedlinePlus, UpToDate has a patient-oriented version, and niche startups like MyDailyApple and PatientsLikeMe are also tackling this to some extent.

In 2001, Brian Haynes, MD, PhD, wrote an article describing the landscape of such ‘pre-appraised’ resources through a hierarchical structure with four layers (the “4S” model):

  • Original ‘Studies’ (what PubMed provides) at the base
  • ‘Syntheses’ (systematic review sources like The Cochrane Library) of evidence just above that
  • ‘Synopses’ (like EBM, EBN Online) of studies and syntheses next up, and
  • the most evolved evidence-based information ‘Decision Support Systems’ at the top.

He later expanded the model with two more layers (read about the “6S” paper here), but the basic argument remained the same: information seekers should begin looking at the highest-level resource available for the problem that prompted their search. That is a good framework for understanding why services like Medify are needed.

Skeptics would argue that offerings like Medify will do little more than empower hypochondriacs. But I believe that well-served health information only makes outcomes better. The lag between research being published and being implemented in real-world medical practice can be on the order of decades. As consumers, we are entrusted to make choices about other important topics like money, and the market provides personal finance tools and services to help. The same can apply to healthcare, without diminishing the role of experts.


With the regulatory push for EHR adoption, there is an impending avalanche of healthcare data coming in the next few years. Some believe it’s already here. But data can come in different flavors: from the frighteningly common free text to loosely categorized documents to well-structured messages. The less structure it has, the harder it becomes for a machine to understand the real meaning (semantics) of the content. The combined effect of increasing quantity and poor quality makes this a bigger problem than most anticipate.

Apixio is one of the few startups tackling this issue. Their analytics engine indexes the underlying data, processes queries and provides context-relevant results. The core technology is supposedly based on Apache’s Pig (a data-flow language and execution framework for parallel computation), Hadoop (a framework that allows for the distributed processing of large data sets across clusters of computers) and Cassandra (a scalable multi-master database).

There are a number of terminologies (read ontologies) in healthcare, trying to specify the concepts and relationships from a particular perspective. LOINC, ICD, SNOMED, CPT are common examples, but see a pretty comprehensive list of all human-related ontologies at BioPortal (filter by category ‘Health’).

So a medical-grade search service would need to traverse such terminologies and surface all relevant, normalized data related to the query. For example, a search for the keyword “breathlessness” in a patient with a long, complicated medical history would bring up documents and encounters that mention items like wheezing, PEFR, smoking, and asthma management. It’s a tall order to do all that analytical crunching.
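To make the “breathlessness” example concrete, here is a minimal sketch of terminology-driven query expansion. Everything in it is an assumption for illustration: the `RELATED` map is a hand-made toy, not an extract from SNOMED, ICD or any real ontology, and the function names are hypothetical. It only shows the general idea of walking related concepts before matching documents, not how Apixio or any other vendor actually does it.

```python
from collections import deque

# Hypothetical toy terminology: each concept points to related concepts.
# A real system would derive these links from SNOMED, ICD, LOINC, etc.
RELATED = {
    "breathlessness": ["wheezing", "asthma"],
    "asthma": ["pefr", "asthma management"],
    "wheezing": ["smoking"],
}

def expand_query(term, max_hops=2):
    """Breadth-first walk over related concepts, at most max_hops away."""
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        concept, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in RELATED.get(concept, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return seen

def search(documents, term, max_hops=2):
    """Return documents that mention the term or any related concept."""
    concepts = expand_query(term.lower(), max_hops)
    return [doc for doc in documents
            if any(concept in doc.lower() for concept in concepts)]

# Made-up chart snippets, purely for demonstration.
DOCUMENTS = [
    "2009 visit: patient reports wheezing at night",
    "PEFR measured at 320 L/min",
    "Fractured left wrist, cast applied",
    "Counselled on smoking cessation",
]

hits = search(DOCUMENTS, "breathlessness")
```

None of the three matching snippets contains the word “breathlessness” itself, which is the point: a plain keyword search would miss them. The naive substring matching here stands in for the much harder job of normalizing free text to coded concepts.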

Sophisticated data transformation and abstraction offerings are certainly needed for making sense of complex healthcare data. Niche efforts like Apixio and 360Fresh are signs of a growing market realization that the era of just trying to digitize healthcare data is ending. Now we start figuring out what the heck to do with all the incoming bytes.

PS: Advanced analytics offerings in healthcare are an interesting topic. See this wiki page for a living list of relevant companies in this space.



Archimedes Model

David M. Eddy, MD, PhD is a legend when it comes to Evidence-Based Medicine. He coined the term in the 1980s, actually. Given his exceptional skill in mathematics, it was perhaps natural for him to apply it to medicine. The result is the Archimedes Model, a mathematical simulation of human physiology and how it interacts with healthcare interventions.

At the heart of the model are a set of ordinary and differential equations that represent the physiological pathways relevant to diseases and their complications. The ‘variables’ in this model include signs, symptoms, patient behaviors (including adherence), provider behaviors, provider performance, encounters (e.g. ER visits, office visits, admissions), protocols, guidelines, tests, treatments, etc. Basically, it tries to incorporate all aspects of diseases and the healthcare system that are needed to analyze downstream clinical outcomes, utilization and costs.
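To give a feel for what “equations that represent physiological pathways” can mean, here is a deliberately tiny sketch. It is not taken from Archimedes: the single equation, the parameter names and all the numbers are made up for illustration. It models one hypothetical biomarker whose level rises through constant production and falls through proportional clearance, and compares an untreated scenario with one where a treatment is assumed to lower production.

```python
def simulate(initial_level, production, clearance, years, dt=0.01):
    """Euler integration of d(level)/dt = production - clearance * level.

    All parameters are illustrative assumptions, not clinical values.
    """
    level = initial_level
    steps = int(round(years / dt))
    for _ in range(steps):
        # Advance the biomarker level by one small time step.
        level += dt * (production - clearance * level)
    return level

# Hypothetical scenarios: the "treatment" simply lowers the production rate.
untreated = simulate(initial_level=5.0, production=4.0, clearance=0.5, years=20)
treated = simulate(initial_level=5.0, production=3.0, clearance=0.5, years=20)
```

In this toy, each scenario settles near production divided by clearance, so the treated level ends up lower. A full simulation like the one described above would couple many such equations and layer patient and provider behavior, encounters and costs on top.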

A more loaded one-line description of Archimedes (taken from his original paper in 2002): “It’s an object-oriented, continuous-time, full simulation model for addressing a wide range of clinical, procedural, administrative, and financial decisions in health care at a high level of biological, clinical, and administrative detail.” Phew. I’ll confess that I don’t know what exactly is under the hood. But I know enough about the informatics field to believe that this approach is viable and very exciting.

This YouTube video explains how the model can be used to run virtual clinical trials. Kaiser has already backed the findings of Archimedes to change their diabetes care delivery. I think there are fantastic, unlimited opportunities for applying such a fundamental model to medicine: personalized health predictions, public health, health policy, cost-effectiveness and what not. As a startup, they are doing fine. With an impressive list of partners/clients and a $15.6M RWJF grant (2007), they have a good runway and momentum. They have all the right ingredients to be a change agent for next-generation Healthcare IT.

Jan 2011 Update: The FDA and Archimedes entered into a research agreement to understand the benefits of weight loss compared to the long-term risks of cardiovascular outcomes in patients treated with weight loss drugs.


MyDailyApple is the consumer-oriented website started in 2006 by Praxeon, a life sciences company focusing on semantic search for healthcare information. It brings together medical news, research, blogs, and multimedia around a disease or condition for its users.

Besides searching for personalized health information in natural prose, MyDailyApple users can get relevant medical news and community opinions for their condition. It uses the same technology as Curbside.MD (Praxeon’s website for evidence-based clinical search for physician users).

MyDailyApple is a Google Health integrated service, which means Google Health users can share their information between the two services.


Curbside.MD is a search engine for finding evidence-based clinical information. The idea is to type in the search need as a natural language question that a clinician would normally ask a colleague, and get relevant answers from the literature (articles, images, guidelines, etc.).

I took it for a test drive with a moderately complex question (‘what is the indication for platelet transfusion in an 80 year old female with dengue fever?’) and got relevant results in terms of review articles and clinical trial outcomes. Pretty cool.

The logic behind Curbside.MD is semantic indexing using a controlled medical terminology (they call it “semantic fingerprinting”) with a bit of natural language processing. They provide a bunch of tools (search box, news, spellchecker, etc.) for partners and a browser search toolbar for users. The technology is also available as an API service from an alternate website called Fingerprint.MD.
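I don’t know how Praxeon implements its fingerprinting, so the following is only my guess at the general idea, sketched under stated assumptions: the vocabulary, the concept identifiers (`C001` and so on) and the function names are all hypothetical. Each text is reduced to the set of controlled-vocabulary concepts it mentions, and articles are ranked by how much their concept set overlaps with the question’s.

```python
# Hypothetical controlled vocabulary: surface phrase -> concept identifier.
# The IDs are invented for this sketch and do not come from any real terminology.
VOCABULARY = {
    "platelet transfusion": "C001",
    "thrombocyte transfusion": "C001",  # synonym mapped to the same concept
    "dengue fever": "C002",
    "dengue": "C002",
    "elderly": "C003",
    "80 year old": "C003",
}

def fingerprint(text):
    """Reduce free text to the set of concept IDs it mentions."""
    lowered = text.lower()
    return {cid for phrase, cid in VOCABULARY.items() if phrase in lowered}

def similarity(a, b):
    """Jaccard overlap between two fingerprints (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank(question, articles):
    """Order articles by fingerprint overlap with the question, dropping non-matches."""
    q = fingerprint(question)
    scored = [(similarity(q, fingerprint(title)), title) for title in articles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored if score > 0]

# Made-up article titles, purely for demonstration.
ARTICLES = [
    "Knee replacement outcomes",
    "Managing hypertension in elderly patients",
    "Thrombocyte transfusion thresholds in dengue",
]

QUESTION = ("what is the indication for platelet transfusion "
            "in an 80 year old female with dengue fever?")
results = rank(QUESTION, ARTICLES)
```

The transfusion article ranks first even though it shares almost no literal words with the question, because “thrombocyte” and “platelet” map to the same concept. That synonym handling is what a controlled terminology buys over plain keyword matching.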

Praxeon is the company that started Curbside.MD and MyDailyApple in 2006. Both websites are currently free for users, but the company admits to a future ad-based business model.