Giving MiHIN eyes on every message it routes

MiHIN routes millions of healthcare messages a day and couldn't always tell when one went missing. We built the logging, tracing, and alerting that let them catch problems before patients felt them. It became the start of a multi-year partnership.

Hours → minutes to detect and resolve a message-routing incident

MiHIN (Michigan Health Information Network)

When a missing message means a missing record

The Michigan Health Information Network is how a huge share of Michigan’s healthcare data actually moves. When a patient is admitted or discharged, when lab results or radiology images need to reach the next provider, MiHIN’s Integrated Technology Platform is what routes that message to the right place. It handles 97% of the state’s admit and discharge messages. As one of their team put it, before MiHIN existed, hospitals lost time and lost critical information, and losing those things costs lives.

A platform like that is made of many interconnected pieces, and that complexity was also its blind spot. When a component failed somewhere in the chain, a message could quietly go unrouted. Too often the first sign of trouble was a participant on the receiving end calling in to ask where their expected messages were, which meant MiHIN was learning about problems only after the people who depended on those messages had already noticed. Tracking down which component had dropped the message was slow work, and in a system carrying this kind of data, every hour of delay has consequences downstream.

This was MiHIN’s first project with us, in 2021. They came to us as they were approaching a go-live, short on monitoring while their own developers were heads-down on launch features, and they wanted to be able to see which messages had made it through and which hadn’t.

What we actually did

We split the work into two parts: something that would help immediately, and something that would last.

The immediate piece was alerting. Using AWS serverless services, we built a system that watches the flow of messages to each participant and notices when one stops receiving them for longer than it should. Lambda functions check message flow on a schedule, record state in DynamoDB, and notify the right people the moment a feed goes quiet. That alone changed the dynamic, because MiHIN could now find a stalled feed on their own instead of waiting for the phone to ring.
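As a rough illustration of the idea, and not MiHIN's actual code, the quiet-feed check at the heart of such a system can be sketched in a few lines of Python. Everything named here is an assumption made for the example: the `FeedState` record, the per-participant `max_quiet` threshold, and the `alerted` flag that stands in for the state a scheduled Lambda function would keep in DynamoDB. The sketch leaves out the AWS calls entirely so the logic can be read and run on its own.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FeedState:
    """Hypothetical per-participant record (in practice, a DynamoDB item)."""
    last_message_at: datetime   # when this participant last received a message
    max_quiet: timedelta        # assumed threshold: how long silence is tolerated
    alerted: bool = False       # True once we have notified for the current quiet spell


def check_feeds(feeds: dict[str, FeedState], now: datetime) -> list[str]:
    """Return the participants whose feeds have newly gone quiet.

    A participant is reported once per quiet spell; the flag resets
    as soon as messages start flowing again.
    """
    newly_quiet = []
    for participant, state in feeds.items():
        is_quiet = now - state.last_message_at > state.max_quiet
        if is_quiet and not state.alerted:
            state.alerted = True
            newly_quiet.append(participant)
        elif not is_quiet:
            state.alerted = False
    return newly_quiet


if __name__ == "__main__":
    now = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
    feeds = {
        "hospital-a": FeedState(now - timedelta(minutes=5), timedelta(minutes=30)),
        "lab-b": FeedState(now - timedelta(hours=2), timedelta(minutes=30)),
    }
    print(check_feeds(feeds, now))  # only lab-b has been silent past its threshold
```

Tracking the `alerted` flag is one possible design choice: it means a scheduled check that runs every few minutes notifies people when a feed first goes quiet rather than on every run while it stays quiet.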

The longer-term piece was real observability across the whole platform. We built a uniform way to record custom metrics from every Lambda component using Lambda Layers, fed that high volume of data into Timestream, and used X-Ray to trace each message’s path through every step of the system. Then we pulled Timestream, CloudWatch, and X-Ray together into Managed Grafana so the whole thing could be seen in one place. We built two ways to look at it: an application view that follows a message through the system to show exactly where a problem started, and an infrastructure view that surfaces the component failures behind a processing problem.
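To give a sense of what a uniform way of recording metrics can look like, here is a minimal sketch, assuming a shared Python helper of the kind that could be packaged in a Lambda Layer. The decorator name `with_metrics`, the `component` dimension, the two measure names, and the injected `sink` callable are all hypothetical. The record dictionaries follow the general shape Timestream's write API expects, but nothing here calls AWS; in a real deployment the sink would be whatever function forwards records to the metrics store.

```python
import functools
import time
from typing import Any, Callable

Record = dict[str, Any]


def to_record(component: str, measure: str, value: float, at_ms: int) -> Record:
    """Build one metric record in a Timestream-style shape (illustrative)."""
    return {
        "Dimensions": [{"Name": "component", "Value": component}],
        "MeasureName": measure,
        "MeasureValue": str(value),
        "MeasureValueType": "DOUBLE",
        "Time": str(at_ms),
        "TimeUnit": "MILLISECONDS",
    }


def with_metrics(component: str, sink: Callable[[list[Record]], None]):
    """Wrap a Lambda-style handler so every call emits the same two metrics."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event, context):
            started = time.perf_counter()
            failed = 0.0
            try:
                return handler(event, context)
            except Exception:
                failed = 1.0
                raise  # the metric is recorded, but the error still propagates
            finally:
                duration_ms = (time.perf_counter() - started) * 1000
                at_ms = int(time.time() * 1000)
                sink([
                    to_record(component, "duration_ms", duration_ms, at_ms),
                    to_record(component, "failed", failed, at_ms),
                ])
        return wrapper
    return decorator


if __name__ == "__main__":
    collected: list[Record] = []

    @with_metrics("router", collected.extend)  # "router" is a made-up component name
    def handler(event, context):
        return {"ok": True}

    handler({}, None)
    print([r["MeasureName"] for r in collected])
```

Because every component would use the same wrapper, every record carries the same dimensions and measure names, which is what makes it practical to chart them side by side in a single dashboard.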

We did this working alongside MiHIN’s own architects rather than off on our own, which was as much the point as the software. The platform was theirs to run when we finished, and they needed to understand it well enough to keep extending it. MiHIN’s Executive Director, Dr. Tim Pletcher, later described what they look for in a partner as someone who stays ahead of them, knows more than they do, and is “willing to teach and train others and not have that be a threat.” That is the relationship we were trying to build, and it is the part of the work we care most about.

What changed

MiHIN went from finding out about routing problems from the people affected by them to catching those problems first. Detection and resolution that used to take hours now take minutes, because an engineer can follow a single message through the system and see exactly where it stopped. The dashboards became something the teams use every day, and they have kept adding new views as the platform grows.

The project also turned out to be the beginning of a long relationship. What started as a logging and monitoring engagement has continued through years of further work on MiHIN’s platform, which is the outcome we value most: not a deliverable handed over and forgotten, but a team that keeps asking us back.

Stack

AWS Lambda · DynamoDB · Timestream · X-Ray · Managed Grafana · CloudWatch
