The Actual History Of Explorys, Part 1 (2009 to 2015, The
Startup Years)
© Doug Meil, 2021
Explorys Founding
Explorys was founded, officially, in October 2009 via a deal
with Cleveland Clinic Innovations. The
skunkworks started much earlier with a variety of conversations with Cleveland
Clinic. I got involved in the Summer of
2009.
Naming a startup is hard.
It consists of multiple sessions generating name ideas, and 80% of them
will be terrible. 15% of them will be
mediocre. And if you’re lucky 4% of them
will be decent. And if you’re really
lucky 1% of them will be good. And not
just ‘good’, but also clear of trademark, service-mark, and copyright conflicts, with the
domain name available. The name Explorys
came from an ad/marketing person that we were working with. It was a good name.
One of the names I generated that had some traction in the
ideation process was VennPoint. Explorys
is a much better name than that, but one of the name-themes I liked was
intersecting sets as it represented the general idea of population exploration. While this particular name-idea wasn’t used,
the underlying notion wound up in another place: if you look closely at the Explorys logo, it is
three intersecting sets in a Venn diagram.
Explorys Influence: Everstream
A great deal of the initial technical direction of Explorys
was influenced by experiences from Everstream (both in patterns and
anti-patterns), though this aspect of the story isn’t well known. As of this writing (2021) there is a company
called “Everstream” that sells business fiber connections, but this is not the
same company.
Everstream was a company founded in 1999 in Cleveland, Ohio that
was originally a music streaming service, which targeted newspaper websites to
place its music (and advert) player.
This was years before Pandora and Spotify. Everstream might have been too far
ahead of its time and the company wound up laying off most of its employees around
2002. Everstream wandered around for a
while performing odd consulting gigs and eventually found a niche in Video On Demand
(VOD) data collection and reporting in early 2004. This is when the only at-home streaming
options were through the cable company (e.g., tune to channel 500 for the VOD
application). I joined Everstream at
this time. Everstream was acquired by
Concurrent Computer Corporation in August 2005.
Concurrent eventually rebranded Everstream as something else, and then
let the Everstream trademark lapse. I
was at Everstream/Concurrent until 2009.
Everstream experienced a number of interesting technical
challenges:
· Distributed Data Collection
o Cable companies wanted an enterprise view of their video on demand services, but the service delivery platforms were deployed at a market level at that time due to technical capabilities of the day. Larger customers could have dozens of markets.
o Everstream had a remote collection device called an “IDG” (Interactive Data Gateway)
· Multiple Data Sources
o In addition to multiple markets, there were multiple source systems per market.
o For example, for a single market there could be a streaming platform, a billing/back-office platform, and sometimes a different media asset management platform. There was also typically another feed for provisioned settops and the associated subscriber.
o All these data sources had to be merged together for a coherent picture. This also raised the question of how to transform all this data into a common model, and still know where all the data came from.
· Media Asset Metadata
o There were some interesting reporting aspects around media asset metadata.
o Streaming a movie preview was different from the actual movie, but these were related assets. Similarly, an SD version of a movie was technically different than the HD version of a movie, but was still functionally “the same movie”.
o A movie could be categorized multiple ways: comedy, drama, action, etc., or a combination of categories. So one needs to be careful about interpreting aggregate functions like summing views and dollars across categories.
o While video metadata is obviously a lot less complicated than healthcare, it served as a reminder of the importance of categorization.
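To make the category caveat concrete, here is a minimal sketch in Python. It is not Everstream code; the titles, categories, and view counts are invented purely for illustration. It shows how adding up per-category totals overstates the real total when a movie belongs to more than one category.

```python
from collections import defaultdict

# Hypothetical view records: (title, categories, views). All values are made up.
views = [
    ("Movie A", {"comedy"}, 100),
    ("Movie B", {"comedy", "drama"}, 50),  # belongs to two categories
    ("Movie C", {"action"}, 30),
]


def views_by_category(records):
    """Sum views per category; a multi-category title counts once per category."""
    totals = defaultdict(int)
    for _title, categories, count in records:
        for category in categories:
            totals[category] += count
    return dict(totals)


def total_views(records):
    """Sum views once per title, regardless of how many categories it has."""
    return sum(count for _title, _categories, count in records)


by_category = views_by_category(views)
naive_total = sum(by_category.values())  # counts Movie B twice
true_total = total_views(views)

print(by_category)
print("sum of category totals:", naive_total)
print("actual total views:", true_total)
```

Each per-category number is fine on its own; the trouble only starts when the category rows are added together as if they were disjoint.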
Everstream was certainly not the first company to
experience these types of challenges, but the value of years of hands-on
experience on these topics cannot be overstated. I spent over 5-1/2 years at Everstream and
there were a number of key lessons I learned:
· Selling Software Is Hard
o I’m not saying anything novel here, but if you’ve ever lived this you really understand it. Especially for on-premise software deployments.
· Supporting Software Is Hard, Part 1
o Too often whatever hardware requirements we would specify (CPUs, memory, storage) for the IDG, customers would divide that specification in half to save money, especially since the hardware had to be purchased per market (and there could be dozens). But if anything went wrong with the IDG it was our fault.
· Supporting Software Is Hard, Part 2
o The data warehousing product was Oracle-based (typical of solutions during the period), and was often co-deployed on a sizable database server (also typical of deployments at the time) with other solutions (custom and other vendor solutions).
o What we often found was that no matter how big the server was, there was always an inevitable clash over resources between the deployed applications. We frequently hit “scale up” limits. I half-jokingly called this “The 49th Processor Problem” (e.g., when a database on a 48-CPU SuperDome is tapped out, where do you go from there?). That was a lot of computing power back in the day, and arguably still is.
o Also, because our warehousing solution was deployed in the customer environment (as was typical of solutions at the time) we had no insight into the performance until something went wrong. Then we were on the phone under emergency conditions, and entirely back on our heels.
· Oracle Is… Oracle
o Oracle data warehousing tends to be on the pricey side, especially for enterprise features like table partitioning.
o Everstream also had Oracle OEM pricing for the IDG (data collector) component, which had some technical and contractual requirements.
o Oracle can be “enthusiastic” about patrolling its contracts, which can make things “exciting”.
Explorys
With that in mind, these were some of the core principles
(or at least goals) when I started engineering Explorys in 2009:
· Hosted
o Avoid on-premise deployments
o While being cloud-based is obvious in 2021, it was much less obvious to the market in 2009.
o Healthcare organizations were extremely suspicious of public cloud vendors in 2009.
o But based on the Everstream experiences, self-hosting was necessary (we used Expedient, a commercial hosting provider). We needed to be able to move fast, and we needed to be able to control the environment and deployment process to do that.
· Data Collection
o This was one aspect of on-premise deployments that was unavoidable.
o However, this time we needed to own the collector device so we could make sure it was being managed appropriately. Thus, the collector device became an appliance.
o Fun fact: Explorys called its collector device the HDG (Health Data Gateway) - we changed one letter. The HDG software architecture was very different from Everstream’s IDG though.
· Software Frameworks
o Where possible, I wanted to leverage Open Source Software, and distributed frameworks. Especially based on the Everstream experiences, I wanted to scale horizontally wherever possible.
Explorys started at a fortuitous time: Hadoop had started in 2006, itself inspired
by a number of Google papers, but was starting to go mainstream in 2009.
The forming of Explorys also coincided with the first Hadoop
World in October 2009. I was there with
about 500 people in the Roosevelt Hotel ballroom in New York City and it was palpable
that something special was happening. People
were talking about clusters of 1,000 nodes, and more - heady stuff. No reasonable software engineer expected any
of this to be easy, but rather it was about what was possible. I managed to track down Christophe Bisciglia,
one of the founders of Cloudera, the sponsor of the conference, and told him about
our Big Huge Plans. He decided right
there that he wanted to work with us. Keep
in mind that Explorys had only been formally incorporated for a couple of days
at that point (at most), not exactly the profile of “enterprise customer” that Cloudera
was seeking, and a tiny, tiny startup in Cleveland at that. Kudos and thanks to Cloudera for taking the
chance; their support was critical.
Down the road, on top of my day-job, I wound up becoming a committer on Apache HBase, which
was one of the distributed data storage frameworks Explorys utilized.
I consider myself lucky to
have worked with such a smart collection of people, and Explorys benefited from
many lessons learned from that open source community in particular and the
Hadoop and big data communities in general.
Somehow during this period I also managed to form the
Cleveland Big Data Meetup in 2010. I had
personally benefitted from the open source mindset of experience sharing, and I
wanted to try to bring a slice of that back to Cleveland. I am proud to say that Cleveland Big Data has
had 10 great years, and also proud that Cleveland Big Data has been a small but
consistent voice in the promotion of science.
This brings us to the Cleveland Clinic. On paper, Explorys was a Cleveland Clinic
Innovations spinoff. Cleveland Clinic’s
partnership with Explorys was critical in terms of business endorsement. There also were several key Cleveland Clinic
staff members who were test-users of Explorys’ early applications and gave
us usability feedback, and a few staff members who provided early guidance on
the Cleveland Clinic data environment.
Explorys benefited from this support.
That said, the technical contributions from this partnership
have not been accurately described to date.
The intellectual property made available to Explorys was the
eResearch/MyResearch application. The
application was effectively a prototype for web-based patient population searching,
and the limited development on it had ceased in 2008.
The eResearch/MyResearch application had the following
attributes:
· Limited to single machine
o Microsoft stack
o Microsoft SQL Server
o .Net/C# web application
· Limited data sources
o The application was primarily centered around Epic (the EMR the Cleveland Clinic utilized), and Clarity (the Epic data warehouse)
· Limited data size
o Supported a subset of patients from the Cleveland Clinic population
· Limited searchable features (about 900)
o Each searchable feature was hand-created and hard-coded to the Epic Clarity data model
o No data standardization existed for standard ontologies like SNOMED, LOINC and RxNorm
No code from the MyResearch application was ever used at
Explorys.
Explorys’ first application was Population Explorer (later
renamed Explore), which was a complete re-imagining of MyResearch, and then
some. The Explore application supported:
· Data standardization
o Healthcare ontologies like SNOMED, RxNorm, and LOINC had been around for over a decade before Explorys started in 2009, but at the time Explorys started those tended to be viewed more as ‘research’ frameworks, and operational reporting was more centered on ICD9 and CPT codesets.
o Explorys took a bit of a gamble investing in data standardization when we did, but after Meaningful Use required usage of those ontologies in 2011 to start laying the groundwork for standardized data representation and interchange, we had a head start.
o Supporting SNOMED, RxNorm, and LOINC required us to support searching across many hundreds of thousands of searchable concepts per patient, which was a sizable technical challenge (especially to be able to search it quickly, and across large patient populations). This is a non-trivial problem and took no small amount of innovation to address.
o MyResearch did not support standardized data, or patient-searching on those ontologies.
· Type-ahead search
o A very user-friendly (and demo-friendly) feature of Explore.
o People take this kind of feature for granted in 2021, but it was kind of a big deal to do in 2009-2010, especially with medical terminology (see the above data standardization item), not just to get a term, but the most appropriate term, and the most appropriate term that we had search results for.
o This took a lot of iteration.
o Nothing like this existed in MyResearch.
· Browse The Crowd
o A feature to allow users to “dive into” results and navigate up and down the ontologies we supported from data standardization (e.g., SNOMED is a directed acyclic graph) with population counts of selected criteria. This was a really slick feature, and probably one of the features I was most proud of designing. It always demoed really well.
o Nothing like this existed in MyResearch.
· PowerSearch
o This was a powerful feature that allowed searching on custom ranges for lab results, temporal searching (e.g., this happened before that) with arbitrary date periods, and other features that otherwise defied summarization.
o This kind of search obviously wouldn’t run sub-second, but the fact the query criteria could be defined through an application and run in the background (instead of writing a series of non-trivial programs) was huge for researchers.
o Nothing like this existed in MyResearch, and this was only possible due to the big data frameworks we were leveraging at Explorys.
· Large population searching
o At peak Explore could search over 60 million de-duplicated and de-identified patients
o And per above, supporting many hundreds of thousands of concepts per patient
o This required extensive work in scalable data management and processing.
· Complex data governance
o MyResearch assumed that there was only a single organization’s data in the database. Explore was built for multi-organizational support, as well as “Universe” support (the de-identified searchable cross-organizational construct).
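For readers who want a feel for the “Browse The Crowd” idea described above, here is a toy sketch in Python. It is emphatically not Explorys code and says nothing about how Explore was actually implemented. The concept names, the hierarchy, and the patient ids are all hypothetical (this is not real SNOMED content). The one point it illustrates is that in a directed acyclic graph a concept can have more than one parent, so population counts have to be built from distinct patients rather than by adding child counts together.

```python
# Hypothetical concept hierarchy: concept -> list of parent concepts.
# "diabetic_nephropathy" has two parents, which makes this a DAG rather than a tree.
PARENTS = {
    "diabetes": [],
    "kidney_disease": [],
    "type_2_diabetes": ["diabetes"],
    "diabetic_nephropathy": ["diabetes", "kidney_disease"],
}

# Hypothetical patient ids recorded directly against each concept.
DIRECT_PATIENTS = {
    "type_2_diabetes": {1, 2, 3},
    "diabetic_nephropathy": {3, 4},
    "kidney_disease": {5},
}


def children_of(concept):
    """Return the concepts that list `concept` as a parent."""
    return [c for c, parents in PARENTS.items() if concept in parents]


def descendants(concept):
    """Collect every concept reachable below `concept`, visiting each once."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for child in children_of(current):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def population(concept):
    """Distinct patients recorded on `concept` or on any of its descendants."""
    patients = set(DIRECT_PATIENTS.get(concept, set()))
    for descendant in descendants(concept):
        patients |= DIRECT_PATIENTS.get(descendant, set())
    return patients


def browse(concept):
    """One 'dive in' step: each child concept with its distinct patient count."""
    return {child: len(population(child)) for child in children_of(concept)}


print(len(population("diabetes")))
print(browse("diabetes"))
```

In this toy data patient 3 appears under both children of "diabetes", so the parent count is smaller than the sum of the child counts. A real system also has to do this quickly across very large populations, which a sketch like this does not attempt.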
Cleveland Clinic’s backing was important for Explorys in many
ways. But with all due respect to
Cleveland Clinic Innovations, Explorys did its own engineering and design. Explore was a completely different
application than MyResearch.
Right after formation of Explorys we attended the American
Medical Informatics Association (AMIA) annual conference in November 2009 as
attendees. One year later at the 2010
AMIA conference Explorys had a vendor booth and we were giving demos of the
first version of Explore. Kind of crazy,
actually. I remember doing a demo for
somebody who started asking “what ifs” that were off my demo-script, so I
went with it. After I answered his
questions in Explore, he turned to a colleague and said “he just did in 20
seconds what our SAS programmer did in 3 days.”
In mid-2011 Explorys started developing analytic solutions
for the provider sector with the Enterprise Performance Management (EPM)
application with Clinical Measure and Registry functionality which became the
core solution. This leveraged and
extended items from above topics such as data standardization, and also
required a variety of other improvements in entity resolution (patient
matching, provider matching, etc.). SuperMart
(customer-facing data marts) was released around 2012-2013. These solutions were totally unrelated to the
MyResearch prototype.
It took a sizable and dedicated engineering team at Explorys
to make all this happen, with a lot of work, innovation, and iteration. Hats off to the team. We did some great work.
Explorys was acquired by IBM in April 2015.
IBM Watson Health unfortunately blew up in mid-2018 in a
very public way and Explorys was one of the casualties, but that’s another
story. The team deserved a better ending
than that. I left at the end of
2018.
https://spectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromised-and-underdelivered-on-ai-health-care
https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/layoffs-at-watson-health-reveal-ibms-problem-with-ai
https://www.beckershospitalreview.com/healthcare-information-technology/stat-ibm-watson-health-was-crumbling-long-before-layoff-announcements-10-things-to-know.html
For the record, I had no part in any of those articles, or
any others.
Offices
Moving offices was a core competency of Explorys.
First Office
The first office (Fall 2009 to July 2010) was in the
Triangle Apartment complex in Cleveland’s University Circle, right next to Case Western Reserve University. It was in a storefront which at one time had
been a Time Warner Cable billing office.
I know this because somebody once walked in and tried to return their
old settop to me, explaining that he thought our storefront was still a Time
Warner Cable billing office.
The Triangle was (and still is) an apartment complex, and another
time a resident upstairs left their water running and it overflowed to our
office below. We were watching a bulge
form in the ceiling tiles and it seemed to be growing in real time, and we were
looking at each other thinking “is that really happening?” We were able to get the desks out of the way
just in time before it burst, and then were able to get in touch with building
maintenance to get into the apartment and shut off the water. I think we had somebody interviewing and a
new-hire in the office at that moment as well.
“Welcome to Explorys! Trust
us, it’s going to be awesome!” This
happened late on a Friday afternoon, and had it happened just an hour or two later
we would have been gone and it would have been flooding all weekend. The joy of small offices.
This office was my favorite in terms of walking, because within
two minutes I could be strolling around the museums in Wade Oval. I would go out for a lunch-walk most days to
clear my head and think. About a month
after moving into this office I was out for a walk when I was stopped by an old
couple in a car asking me how to get to the hospital. In University Circle “hospital” is ambiguous
so I had to then engage them in some conversation to find out whether they were
going to Cleveland Clinic, University Hospitals, or the VA. After determining the destination, I then provided
directions and some parking tips. This
kept happening every 6 to 8 weeks. I
must have one of those faces.
The Triangle Apartment storefronts (and parking lot) were
later converted into the Uptown district.
Mitchell’s Ice Cream is pretty much where the first office was.
Second Office
The second office (July 2010 to November 2012) was in the Global
Cardiovascular Innovation Center (GCIC) at E. 100th and Cedar, on
the south edge of the Cleveland Clinic main campus. On top of seeing main campus right across the
street, you could hear it – helicopters would land a block away on the
roof of the emergency services building.
After a while it can become tempting to tune it out as background noise
and forget there was probably a patient being transported somewhere.
My favorite part of this office was the innovation room on
the 2nd floor that had a great work-table with monitors on each end,
with barstool-height swivel chairs – great for collaboration. The walls were a giant whiteboard. There were 2 frequently utilized treadmill
desks. One engineer got carried away on a project and
walked 7 miles in one day, which doesn’t seem like that much, but try it on a
treadmill - it’s painful. The room was
supposed to be a common area for all the tenants of the building, but we tended
to hog it (sorry!)
I enjoyed eating lunch at the main Clinic cafeteria. I thought the food was good, and there was
always a mix of staff, visitors, and patients – a constant motivating reminder
of why we were doing this. This is also back
when the Clinic had already made the decision to only stock diet soda drinks
and low-fat snacks, a small but noticeable improvement. And this was also shortly after the Clinic
banned smoking for employees in 2007, which was initially an unpopular edict
but it really does make sense as smoking is a major risk factor in a number of
cancers, and like it or not the Clinic staff needed to set the example.
Speaking of setting the example, at this time there was
still a McDonald’s by the main cafeteria which was the only place on main campus
where one could not only legally purchase a high-octane Coke but also a quarter
pounder with cheese and supersize fry - at the #1 heart center in the United
States. I’m sure the Clinic wasn’t
thrilled with this unintentionally hilarious unhealthy dichotomy, and the alleged
backstory was that this McDonald’s had a long-term lease that somehow couldn’t
be revoked. Eventually the McDonald’s was
replaced with something less medically ironic, but it took a few years.
Third Office
The third office (November 2012 to Aug 2016) was in the old Cleveland
Playhouse complex on Carnegie on the west edge of the Cleveland Clinic main
campus. The Clinic had purchased the complex
after the Cleveland Playhouse theater company moved to Playhouse Square
downtown. Technically, we were in “the old
Sears building” section of the complex, whose previous tenant was the Cleveland
Museum of Contemporary Art. Coincidentally,
the museum moved into a new building down the street, near where Explorys’ first
office had been.
The old Playhouse complex was the craziest office location I’ve
ever worked in. And it wasn’t “we
have a ping pong table and bean bag chairs” crazy (we did have a ping pong
table). The complex was old and massive. It wasn’t entirely clear to most people where
the back fire-escape of our section actually led and how much of the team we
would get back if anybody went out that way.
That particular door led to a warren of rooms and staircases and good
luck finding your way in the dark and/or urgent conditions. Cleveland-area SWAT teams would periodically practice
room-clearing drills in the extensive basement, which was the former theater prop
storage area and was horror-movie creepy. You could find targets with spent stun gun wires
lying about. And then there were pallets
of Clinic medical supplies that were either inventory overflow from next door
or were stockpiled for regional disaster scenarios. You know… normal office stuff.
After we moved in we received a great tour from the building
supervisor Maurice, who knew the history of the complex because his father had
been the building supervisor for 40 years back when the Playhouse was in
operation at that location. Amazing
history. Margaret Hamilton, Joel Grey,
and Paul Newman once performed there. We
may or may not have explored the various stages occasionally on subsequent unsanctioned
tours, but my memory is hazy on this point.
The oldest theater from the 1920’s is my favorite, allegedly. I hope if Cleveland Clinic ever redevelops
that complex that it can at least keep the original theater and as much of the
Philip Johnson addition as possible.
There was also a Hot Sauce Williams a block or so away which
had an autographed picture of Robin Williams on the wall made out to the
restaurant. I imagine this would have
horrified Robin’s cardiologist if he found out that Robin had flown a few thousand
miles to Cleveland for heart surgery, and then apparently went for some post-op
recovery spicy ribs.
Fourth Office
The fourth office (August 2016 to Summer 2018) was at 1111
Superior, downtown Cleveland. We were in
the 26th to 28th floors - the former Eaton executive
floors, refurbished for the proletariat. Great views.
Being downtown for the 2016 World Series was exciting (and crushing).
I was a huge fan of the downtown Heinen’s. The old Cleveland Trust Rotunda and salad bar
made for a great lunch experience. This
building pre-Heinen’s was featured in The Avengers (as a bank that was being
attacked by the Chitauri) and Captain America: The Winter Soldier (as a Hydra secret
lair).
During our time in this office I was stopped twice by drivers
asking me where the Justice Center was and how to get there. I must have still had one of those faces.
Fifth Office
The fifth office (Summer 2018) was a new building at E. 105th
and Cedar back next to the Cleveland Clinic.
This building was in planning for years, but by the time it was built
the business had unfortunately caved.
It’s a nice office, though.
See this link for Explorys, Part 2 (2015 to 2018, the IBM Watson Health Years):
https://themeildeal.blogspot.com/2021/02/the-actual-history-of-explorys-part-2.html
Appendix
Why was there such a disconnect in the Explorys origin story?
It’s complicated, but reviewing what
happened after Explorys is insightful.
The first article below explains Cleveland’s Amazon HQ bid, and how the Unify
Project somehow wrote itself into the city’s bid and described itself as being one
of the most important assets in the region without having actually having done
anything. Few cities had a realistic
chance in this bid process so Clevelanders shouldn’t be that
disappointed at not winning, but Cleveland at least deserved an honest entry.
https://www.clevescene.com/cleveland/an-essay-on-the-failed-amazon-bid-and-the-defective-philosophy-undermining-clevelands-progress/Content?oid=20653840
And the story continued…
https://www.clevescene.com/scene-and-heard/archives/2018/08/23/the-unify-project-star-of-clevelands-amazon-hq2-bid-for-some-reason-is-finally-hiring
https://www.clevescene.com/scene-and-heard/archives/2019/10/03/unify-project-is-now-unify-labs-new-partnership-with-united-way-raises-questions-eyebrows
https://www.clevescene.com/cleveland/a-legacy-cleveland-nonprofit-struggles-for-relevance-and-financial-survival-under-ceo-august-napoli/Content?oid=32490788
Can’t make this stuff up.
Also for the record, I had no part in any of those articles