Sunday, January 31, 2021

The Actual History Of Explorys, Part 1 (2009 to 2015, The Startup Years)

 

© Doug Meil, 2021

 

Explorys Founding

 

Explorys was founded, officially, in October 2009 via a deal with Cleveland Clinic Innovations.  The skunkworks started much earlier with a variety of conversations with Cleveland Clinic.  I got involved in the Summer of 2009.

 

Naming a startup is hard.  It consists of multiple sessions generating name ideas, and 80% of them will be terrible.  15% of them will be mediocre.  And if you’re lucky 4% of them will be decent.  And if you’re really lucky 1% of them will be good.  And not just ‘good’, but also clear of trademark, service-mark, and copyright conflicts, with an available domain name.  The name Explorys came from an ad/marketing person that we were working with.  It was a good name.

 

One of the names I generated that had some traction in the ideation process was VennPoint.  Explorys is a much better name than that, but one of the name-themes I liked was intersecting sets as it represented the general idea of population exploration.  While this particular name-idea wasn’t used, the underlying notion wound up in another place:  if you look closely at the Explorys logo it was three intersecting sets in a Venn diagram.

 

 

Explorys Influence:  Everstream

 

A great deal of the initial technical direction of Explorys was influenced by experiences from Everstream (both in patterns and anti-patterns), though this aspect of the story isn’t well known.  As of this writing (2021) there is a company called “Everstream” that sells business fiber connections, but this is not the same company.

 

Everstream was a company founded in 1999 in Cleveland, Ohio that was originally a music streaming service, which targeted newspaper websites to place its music (and advert) player.  This was years before Pandora and Spotify.  Everstream might have been too far ahead of its time and the company wound up laying off most of its employees around 2002.  Everstream wandered around for a while performing odd consulting gigs and eventually found a niche in Video On Demand (VOD) data collection and reporting in early 2004.  This is when the only at-home streaming options were through the cable company (e.g., tune to channel 500 for the VOD application).  I joined Everstream at this time.  Everstream was acquired by Concurrent Computer Corporation in August 2005.  Concurrent eventually rebranded Everstream as something else, and then let the Everstream trademark lapse.  I was at Everstream/Concurrent until 2009.

 

Everstream experienced a number of interesting technical challenges:

 

·      Distributed Data Collection

o   Cable companies wanted an enterprise view of their video on demand services, but the service delivery platforms were deployed at a market level at that time due to technical capabilities of the day.  Larger customers could have dozens of markets.

o   Everstream had a remote collection device called an “IDG” (Interactive Data Gateway).

 

·      Multiple Data Sources

o   In addition to multiple markets, there were multiple source systems per market.

o   For example, for a single market there could be a streaming platform, a billing/back-office platform, and sometimes a different media asset management platform.  There was also typically another feed for provisioned settops and the associated subscriber.

o   All these data sources had to be merged together for a coherent picture.  This also raised the question of how to transform all this data into a common model, and still know where all the data came from.

 

·      Media Asset Metadata

o   There were some interesting reporting aspects around media asset metadata.

o   Streaming a movie preview was different from streaming the actual movie, but these were related assets.  Similarly, an SD version of a movie was technically different from the HD version of a movie, but was still functionally “the same movie”.

o   A movie could be categorized multiple ways:  comedy, drama, action, etc., or a combination of categories.  So one needs to be careful about interpreting aggregate functions like summing views and dollars across categories.  

o   While video metadata is obviously a lot less complicated than healthcare, it served as a reminder of the importance of categorization.
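
As a rough illustration of the multi-category caveat above, here is a minimal Python sketch.  The titles, categories, and view counts are made up for the example, and this is not Everstream’s actual reporting code.  It shows that adding up per-category totals overstates the overall total whenever an asset belongs to more than one category.

```python
from collections import defaultdict

# Hypothetical sample data: (title, categories, views).
# Titles, categories, and numbers are invented for illustration.
ASSETS = [
    ("Movie A", {"comedy", "drama"}, 100),
    ("Movie B", {"action"}, 50),
    ("Movie C", {"comedy", "action"}, 30),
]


def views_by_category(assets):
    """Total views per category; a multi-category asset counts in each one."""
    totals = defaultdict(int)
    for _title, categories, views in assets:
        for category in categories:
            totals[category] += views
    return dict(totals)


def total_views(assets):
    """Total views counted once per asset, regardless of categories."""
    return sum(views for _title, _categories, views in assets)


per_category = views_by_category(ASSETS)
naive_total = sum(per_category.values())  # double-counts Movie A and Movie C
true_total = total_views(ASSETS)

print(per_category)
print("sum of category totals:", naive_total)
print("actual total views:", true_total)
```

Each per-category number is correct on its own; it is only the sum across categories that misleads, which is why a report needs to compute the grand total from the assets rather than from the category rows.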

 

Everstream was certainly not the first company to experience these types of challenges, but the value of years of hands-on experience on these topics cannot be overstated.  I spent over 5-1/2 years at Everstream and there were a number of key lessons I learned:

 

·      Selling Software Is Hard

o   I’m not saying anything novel here, but if you’ve ever lived this you really understand it.  Especially for on-premise software deployments.

 

·      Supporting Software Is Hard, Part 1

o   Too often whatever hardware requirements we would specify (CPUs, memory, storage) for the IDG, customers would divide that specification in half to save money – especially since they had to be purchased per market (and there could be dozens).  But if anything went wrong with the IDG it was our fault.

 

·      Supporting Software Is Hard, Part 2

o   The data warehousing product was Oracle-based (typical of solutions during the period), and was often co-deployed on a sizable database server (also typical of deployments at the time) with other solutions (custom and other vendor solutions).

o   What we often found is that no matter how big the server was, there was always an inevitable clash over resources between the deployed applications.  We frequently hit “scale up” limits.  I half-jokingly called this “The 49th Processor Problem” (e.g., when a database on a 48-CPU SuperDome is tapped out, where do you go from there?).  That was a lot of computing power back in the day, and arguably still is.

o   Also, because our warehousing solution was deployed in the customer environment (as was typical of solutions at the time) we had no insight into the performance until something went wrong.  Then we were on the phone under emergency conditions, and entirely on our heels.

 

·      Oracle Is… Oracle

o   Oracle data warehousing tends to be on the pricey side, especially for enterprise features like table partitioning.

o   Everstream also had Oracle OEM pricing for the IDG (data collector) component, which had some technical and contractual requirements.

o   Oracle can be “enthusiastic” about patrolling its contracts, which can make things “exciting”.

 

 

Explorys

 

With that in mind, these were some of the core principles (or at least goals) when I started engineering Explorys in 2009:

 

·      Hosted

o   Avoid on-premise deployments

o   While being cloud-based is obvious in 2021, it was much less obvious to the market in 2009.

o   Healthcare organizations were extremely suspicious of public cloud vendors in 2009.

o   But based on the Everstream experiences, self-hosting was necessary (we used Expedient, a commercial hosting provider).  We needed to be able to move fast, and we needed to be able to control the environment and deployment process to do that. 

 

·      Data Collection

o   This was one aspect of on-premise deployments that was unavoidable.

o   However, this time we needed to own the collector device so we could make sure it was being managed appropriately.  Thus, the collector device became an appliance.

o   Fun fact:  Explorys called its collector device the HDG (Health Data Gateway) - we changed one letter.  The HDG software architecture was very different from Everstream’s IDG though.

 

·      Software Frameworks

o   Where possible, I wanted to leverage Open Source Software, and distributed frameworks.  Especially based on the Everstream experiences, I wanted to scale horizontally wherever possible.

 

Explorys started at a fortuitous time:  Hadoop had started in 2006, itself inspired by a number of Google papers, and was starting to go mainstream in 2009.

 

The forming of Explorys also coincided with the first Hadoop World in October 2009.  I was there with about 500 people in the Roosevelt Hotel ballroom in New York City and it was palpable that something special was happening.  People were talking about clusters of 1,000 nodes, and more - heady stuff.  No reasonable software engineer expected any of this to be easy; rather, it was about what was possible.  I managed to track down Christophe Bisciglia, one of the founders of Cloudera, the sponsor of the conference, and told him about our Big Huge Plans.  He decided right there that he wanted to work with us.  Keep in mind that Explorys had only been formally incorporated for a couple of days at that point (at most), not exactly the profile of “enterprise customer” that Cloudera was seeking, and a tiny, tiny startup in Cleveland at that.  Kudos and thanks to Cloudera for taking the chance; their support was critical.

 

Down the road, on top of my day-job, I wound up becoming a committer on Apache HBase, which was one of the distributed data storage frameworks Explorys utilized.  I consider myself lucky to have worked with such a smart collection of people, and Explorys benefited from many lessons learned from that open source community in particular and the Hadoop and big data communities in general.

 

Somehow during this period I also managed to form the Cleveland Big Data Meetup in 2010.  I had personally benefitted from the open source mindset of experience sharing, and I wanted to try to bring a slice of that back to Cleveland.  I am proud to say that Cleveland Big Data has had 10 great years, and also proud that Cleveland Big Data has been a small but consistent voice in the promotion of science.

 

This brings us to the Cleveland Clinic.  On paper, Explorys was a Cleveland Clinic Innovations spinoff.  Cleveland Clinic’s partnership with Explorys was critical in terms of business endorsement.  There also were several key Cleveland Clinic staff members who were test-users of Explorys’ early applications and gave us usability feedback, and a few staff members who provided early guidance in the Cleveland Clinic data environment.  Explorys benefited from this support.

 

That said, the technical contributions from this partnership have not been accurately described to date.  The intellectual property made available to Explorys was the eResearch/MyResearch application.  The application was effectively a prototype for web-based patient population searching, and the limited development on it had ceased in 2008. 

 

The eResearch/MyResearch application had the following attributes:

 

·      Limited to single machine

o   Microsoft stack

o   Microsoft SQL Server

o   .Net/C# web application

 

·      Limited data sources

o   The application was primarily centered around Epic (the EMR the Cleveland Clinic utilized), and Clarity (the Epic data warehouse)

 

·      Limited data size

o   Supported a subset of patients from the Cleveland Clinic population

 

·      Limited searchable features (about 900)

o   Each searchable feature was hand-created and hard-coded to the Epic Clarity data model

o   No data standardization existed for standard ontologies like SNOMED, LOINC and RxNorm

 

No code from the MyResearch application was ever used at Explorys. 

 

Explorys’ first application was Population Explorer (later renamed Explore), which was a complete re-imagining of MyResearch, and then some.  The Explore application supported: 

 

·      Data standardization

o   Healthcare ontologies like SNOMED, RxNorm, and LOINC had been around for over a decade before Explorys started in 2009, but at the time Explorys started those tended to be viewed more as ‘research’ frameworks, and operational reporting was more centered on ICD9 and CPT codesets. 

o   Explorys took a bit of a gamble investing in data standardization when we did, but after Meaningful Use required usage of those ontologies in 2011 to start laying the groundwork for standardized data representation and interchange, we had a head start.

o   Supporting SNOMED, RxNorm, and LOINC required us to support searching across many hundreds of thousands of searchable concepts per patient – which was a sizable technical challenge (especially to be able to search it quickly, and across large patient populations).   This is a non-trivial problem and took no small amount of innovation to address.  

o   MyResearch did not support standardized data, or patient-searching on those ontologies.

 

·      Type-ahead search

o   A very user-friendly (and demo-friendly) feature of Explore. 

o   People take this kind of feature for granted in 2021, but it was kind of a big deal to do in 2009-2010, especially with medical terminology (see the above data standardization item), not just to get a term, but the most appropriate term, and the most appropriate term that we had search results for.

o   This took a lot of iteration.

o   Nothing like this existed in MyResearch.

 

·      Browse The Crowd

o   A feature to allow users to “dive into” results and navigate up and down the ontologies we supported from data standardization (e.g., SNOMED is a directed acyclic graph) with population counts of selected criteria.  This was a really slick feature, and probably one of the features I was most proud of designing.  It always demoed really well.

o   Nothing like this existed in MyResearch.

 

·      PowerSearch

o   This was a powerful feature that allowed searching on custom ranges for lab results, temporal searching (e.g., this happened before that) with arbitrary date periods, and other features that otherwise defied summarization.

o   This kind of search obviously wouldn’t run sub-second, but the fact the query criteria could be defined through an application and run in the background (instead of writing a series of non-trivial programs) was huge for researchers.

o   Nothing like this existed in MyResearch, and this was only possible due to the big data frameworks we were leveraging at Explorys.

 

·      Large population searching

o   At peak Explore could search over 60 million de-duplicated and de-identified patients.

o   And per above, supporting many hundreds of thousands of concepts per patient.

o   This required extensive work in scalable data management and processing.

 

·      Complex data governance

o   MyResearch assumed that there was only a single organization’s data in the database.  Explore was built for multi-organizational support, as well as “Universe” support (the de-identified searchable cross-organizational construct).
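
To make the Browse The Crowd idea above a little more concrete, here is a minimal Python sketch of rolling patient counts up a directed acyclic graph.  This is an illustration only and not how Explore was implemented: the concept names, the hierarchy, and the patient IDs are all hypothetical (they are not real SNOMED content), and a toy in-memory dictionary stands in for what was a much larger distributed problem.  The point it shows is that in a DAG a concept can have more than one parent, so a parent’s population has to be the union of distinct patients beneath it rather than the sum of its children’s counts.

```python
# Hypothetical toy ontology: parent -> children.
# "viral pneumonia" has two parents, so this is a DAG rather than a tree.
CHILDREN = {
    "disease": ["lung disease", "infectious disease"],
    "lung disease": ["pneumonia"],
    "infectious disease": ["viral pneumonia"],
    "pneumonia": ["viral pneumonia"],
    "viral pneumonia": [],
}

# Hypothetical patient IDs recorded directly against each concept.
PATIENTS_BY_CONCEPT = {
    "lung disease": {1},
    "pneumonia": {2, 3},
    "viral pneumonia": {3, 4},
    "infectious disease": {5},
}


def patients_for(concept, memo=None):
    """Distinct patients with this concept or any concept beneath it."""
    if memo is None:
        memo = {}
    if concept in memo:
        return memo[concept]
    patients = set(PATIENTS_BY_CONCEPT.get(concept, set()))
    for child in CHILDREN.get(concept, []):
        patients |= patients_for(child, memo)
    memo[concept] = patients
    return patients


def population_count(concept):
    """Population count shown next to a concept while browsing."""
    return len(patients_for(concept))


for name in CHILDREN:
    print(name, population_count(name))

# Summing the children's counts would overstate the parent,
# because patients under "viral pneumonia" are reachable via both branches.
naive = sum(population_count(child) for child in CHILDREN["disease"])
print("naive sum for disease:", naive)
```

Navigating down the hierarchy is then just a matter of showing the count for each child of the selected concept; navigating up shows the (union-based) count for each parent.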

 

Cleveland Clinic’s backing was important for Explorys in many ways.  But with all due respect to Cleveland Clinic Innovations, Explorys did its own engineering and design.  Explore was a completely different application than MyResearch.

 

Right after the formation of Explorys we attended the American Medical Informatics Association (AMIA) annual conference in November 2009 as attendees.  One year later at the 2010 AMIA conference Explorys had a vendor booth and we were giving demos of the first version of Explore.  Kind of crazy, actually.  I remember doing a demo for somebody and they started asking “what if” questions that were off my demo-script, so I went with it.  After I answered their questions in Explore, they turned to a colleague and said “he just did in 20 seconds what our SAS programmer did in 3 days.”

 

In mid-2011 Explorys started developing analytic solutions for the provider sector with the Enterprise Performance Management (EPM) application with Clinical Measure and Registry functionality, which became the core solution.  This leveraged and extended items from the topics above such as data standardization, and also required a variety of other improvements in entity resolution (patient matching, provider matching, etc.).  SuperMart (customer-facing data marts) was released about 2012/2013.  These solutions were totally unrelated to the MyResearch prototype.

 

It took a sizable and dedicated engineering team at Explorys to make all this happen, with a lot of work, innovation, and iteration.  Hats off to the team.  We did some great work.

 

Explorys was acquired by IBM in April 2015. 

 

IBM Watson Health unfortunately blew up in mid-2018 in a very public way and Explorys was one of the casualties, but that’s another story.  The team deserved a better ending than that.  I left at the end of 2018.                  

 

https://spectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromised-and-underdelivered-on-ai-health-care

 

https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/layoffs-at-watson-health-reveal-ibms-problem-with-ai

 

https://www.beckershospitalreview.com/healthcare-information-technology/stat-ibm-watson-health-was-crumbling-long-before-layoff-announcements-10-things-to-know.html

 

For the record, I had no part in any of those articles, or any others.

 

 

Offices

 

Moving offices was a core competency of Explorys.

 

First Office

The first office (Fall 2009 to July 2010) was in the Triangle Apartment complex in Cleveland’s University Circle, right next to Case Western Reserve University.  It was in a storefront which at one time had been a Time Warner Cable billing office.  I know this because somebody once walked in and tried to return their old settop to me, explaining that he thought our storefront was still a Time Warner Cable billing office.

 

The Triangle was (and still is) an apartment complex, and another time a resident upstairs left their water running and it overflowed to our office below.  We were watching a bulge form in the ceiling tiles and it seemed to be growing in real time, and we were looking at each other thinking “is that really happening?”  We were able to get the desks out of the way just in time before it burst, and then were able to get in touch with building maintenance to get into the apartment and shut off the water.  I think we had somebody interviewing and a new-hire in the office at that moment as well.  “Welcome to Explorys!  Trust us, it’s going to be awesome!”  This happened late on a Friday afternoon, and had it happened just an hour or two later we would have been gone and it would have been flooding all weekend.  The joy of small offices. 

 

This office was my favorite in terms of walking, because within two minutes I could be strolling around the museums in Wade Oval.  I would go out for a lunch-walk most days to clear my head and think.  About a month after moving into this office I was out for a walk when I was stopped by an old couple in a car asking me how to get to the hospital.  In University Circle “hospital” is ambiguous so I had to then engage them in some conversation to find out whether they were going to Cleveland Clinic, University Hospitals, or the VA.  After determining the destination, I then provided directions and some parking tips.  This kept happening every 6 to 8 weeks.  I must have one of those faces.

 

The Triangle Apartment storefronts (and parking lot) were later converted into the Uptown district.  Mitchells Ice Cream is pretty much where the first office was.

 

Second Office

The second office (July 2010 to November 2012) was in the Global Cardiovascular Innovation Center (GCIC) at E. 100th and Cedar, on the south edge of the Cleveland Clinic main campus.  On top of seeing main campus right across the street, you could hear it – helicopters would land a block away on the roof of the emergency services building.  After a while it became tempting to tune it out as background noise and forget there was probably a patient being transported somewhere.

 

My favorite part of this office was the innovation room on the 2nd floor that had a great work-table with monitors on each end, with barstool-height swivel chairs – great for collaboration.  The walls were a giant whiteboard.  There were 2 frequently utilized treadmill desks.   One engineer got carried away on a project and walked 7 miles in one day, which doesn’t seem like that much, but try it on a treadmill - it’s painful.  The room was supposed to be a common area for all the tenants of the building, but we tended to hog it (sorry!) 

 

I enjoyed eating lunch at the main Clinic cafeteria.  I thought the food was good, and there was always a mix of staff, visitors, and patients – a constant motivating reminder of why we were doing this.  This was also back when the Clinic had already made the decision to only stock diet soda drinks and low-fat snacks, a small but noticeable improvement.  And this was also shortly after the Clinic banned smoking for employees in 2007, which was initially an unpopular edict, but it really does make sense as smoking is a major risk factor in a number of cancers, and like it or not the Clinic staff needed to set the example.

 

Speaking of setting the example, at this time there was still a McDonalds by the main cafeteria which was the only place on main campus where one could not only legally purchase a high-octane Coke but also a quarter pounder with cheese and supersize fry - at the #1 heart center in the United States.  I’m sure the Clinic wasn’t thrilled with this unintentionally hilarious unhealthy dichotomy, and the alleged backstory was that this McDonalds had a long-term lease that somehow couldn’t be revoked.  Eventually the McDonalds was replaced with something less medically ironic, but it took a few years. 

 

Third Office

The third office (November 2012 to August 2016) was in the old Cleveland Playhouse complex on Carnegie on the west edge of the Cleveland Clinic main campus.  The Clinic had purchased the complex after the Cleveland Playhouse theater company moved to Playhouse Square downtown.  Technically, we were in “the old Sears building” section of the complex, whose previous tenant was the Cleveland Museum of Contemporary Art.  Coincidentally, the museum moved into a new building down the street, near where Explorys’ first office had been.

 

The old Playhouse complex was the craziest office location I’ve ever worked in.  And it wasn’t “we have a ping pong table and bean bag chairs” crazy (we did have a ping pong table).  The complex was old and massive.  It wasn’t entirely clear to most people where the back fire-escape of our section actually led and how much of the team we would get back if anybody went out that way.  That particular door led to a warren of rooms and staircases, and good luck finding your way in the dark and/or under urgent conditions.  Cleveland-area SWAT teams would periodically practice room-clearing drills in the extensive basement, which was the former theater prop storage area and was horror-movie creepy.  You could find targets with spent stun gun wires lying about.  And then there were pallets of Clinic medical supplies that were either inventory overflow from next door or were stockpiled for regional disaster scenarios.  You know… normal office stuff.

 

After we moved in we received a great tour from the building supervisor Maurice, who knew the history of the complex because his father had been the building supervisor for 40 years back when the Playhouse was in operation at that location.  Amazing history.  Margaret Hamilton, Joel Grey, and Paul Newman once performed there.  We may or may not have explored the various stages occasionally on subsequent unsanctioned tours, but my memory is hazy on this point.  The oldest theater from the 1920’s is my favorite, allegedly.  I hope if Cleveland Clinic ever redevelops that complex that it can at least keep the original theater and as much of the Philip Johnson addition as possible. 

 

There was also a Hot Sauce Williams a block or so away which had an autographed picture of Robin Williams on the wall made out to the restaurant.  I imagine this would have horrified Robin’s cardiologist if he found out that Robin had flown a few thousand miles to Cleveland for heart surgery, and then apparently went for some post-op recovery spicy ribs. 

 

Fourth Office

The fourth office (August 2016 to Summer 2018) was at 1111 Superior, downtown Cleveland.  We were on the 26th to 28th floors - the former Eaton executive floors, refurbished for the proletariat.  Great views.  Being downtown for the 2016 World Series was exciting (and crushing).

 

I was a huge fan of the downtown Heinen’s.  The old Cleveland Trust Rotunda and salad bar made for a great lunch experience.  This building pre-Heinen’s was featured in The Avengers (as a bank that was being attacked by the Chitauri) and Captain America: The Winter Soldier (as a Hydra secret lair).

 

During our time in this office I was stopped twice by drivers asking me where the Justice Center was and how to get there.  I must have still had one of those faces.

 

Fifth Office

The fifth office (Summer 2018) was a new building at E. 105th and Cedar back next to the Cleveland Clinic.  This building was in planning for years, but by the time it was built the business had unfortunately caved.  It’s a nice office, though.

 

 

See this link for Explorys, Part 2 (2015 to 2018, the IBM Watson Health Years):  

https://themeildeal.blogspot.com/2021/02/the-actual-history-of-explorys-part-2.html

 

Appendix

 

Why was there such a disconnect in the Explorys origin story?  It’s complicated, but reviewing what happened after Explorys is insightful.  The first article below explains Cleveland’s Amazon HQ bid, and how the Unify Project somehow wrote itself into the city’s bid and described itself as being one of the most important assets in the region without actually having done anything.  Few cities had a realistic chance in this bid process so Clevelanders shouldn’t be that disappointed at not winning, but Cleveland at least deserved an honest entry.

 

https://www.clevescene.com/cleveland/an-essay-on-the-failed-amazon-bid-and-the-defective-philosophy-undermining-clevelands-progress/Content?oid=20653840

 

And the story continued…

 

https://www.clevescene.com/scene-and-heard/archives/2018/08/23/the-unify-project-star-of-clevelands-amazon-hq2-bid-for-some-reason-is-finally-hiring

 

https://www.clevescene.com/scene-and-heard/archives/2019/10/03/unify-project-is-now-unify-labs-new-partnership-with-united-way-raises-questions-eyebrows

 

https://www.clevescene.com/cleveland/a-legacy-cleveland-nonprofit-struggles-for-relevance-and-financial-survival-under-ceo-august-napoli/Content?oid=32490788

 

Can’t make this stuff up.

 

Also for the record, I had no part in any of those articles.