Sunday, September 24, 2017

Libre: the “other” bytes (well, some of them at least)

Personal comment

Back in early 2015, when I started my “running the Libre as a full CGM” experiments, I quickly became aware that the core problem was much more complex than simply figuring out the translation of the so-called Dexcom “raw” signal to human readable values. There’s a reason: in the Libre FRAM, what we are seeing is a real “raw” signal. While the measure of the glucose signal itself is fairly reliable, it is heavily post-processed by the Libre firmware. Specifically - and in no particular order – temperature compensation, delay compensation, de-noising… all play a role. That understanding and, to some extent, my MD training, led me to extreme caution and prevented me from releasing my “solution”, which I knew to be both incomplete and unable to handle some error conditions.

The main driver behind my decision was the well known “first do no harm” (primum non nocere) motto, an essential part of the Hippocratic Oath which I symbolically took. I still stick by it today.
However, by the time I came to realize the full extent of the problem, I had already released enough pointers for developers to build partial solutions upon (and have no doubt my meagre contribution would have quickly been replicated anyway, had the Libre been more widely available back then).

Today, there are a lot of add-on devices that aim to transform the Libre into a full CGM. To be honest, in general, I do not like either the results they provide or their (in)convenience. None of those I have tried delivered results that would lead to an approval by a regulatory agency, and none of them were stable for long periods of time. But, apparently, patients still feel they are helpful and there is now a thriving community that aims at improving them.

That is the reason why I will release a bit more information about my own experiments. Keep in mind that I can be wrong.

Personal situation

Max is now a real teen (almost 17), with all the warts of that age. We have been running both the Dexcom G4 (with the 505 algorithm) and the Libre in parallel for more than a year now. This is both a “belt and suspenders” and an optimal results strategy. We mostly use the Dexcom at night, when it is most convenient and the Libre during activities, where we benefit from the added speed. We use what we have when one of them fails (rare) or becomes detached.

As far as we are concerned, the main conclusions shared in 2014/2015 and 2016 on this blog remain true. The Libre is faster and more accurate on the whole. An “anal” calibration strategy brings the Dexcom in the same overall accuracy range, but that strategy is now just a fond memory (teen warts…). The Libre sometimes has a mind of its own (predictive failure and poor temperature compensation). My subjective (and almost statistically significant) impression is that both systems have improved a lot in terms of reliability and post insertion period.

Let me stress that this is not gospel: the performance and the length of reliable operation of a CGM sensor has a lot to do with its eventual encapsulation as a foreign body: your mileage may vary as your macrophages fuse into giant cells and encapsulate the sensor. For some people, I am sure, the Dexcom wire will behave better.

Thermal compensation

As I have shown here before, the Libre and the Dexcom (like all enzyme based bio-sensors) are sensitive to temperature variations (and pH, and potentially other things). This is extremely basic bio-chemistry. You can see an example of this here (as a side note, since I am an ill tempered old fart, I quickly grew tired of arguing with non believers). Some info on a possible Dexcom temperature compensation strategy can be found here.

That means that the raw signal of a glucose oxidase based sensor has to be compensated for temperature (and ideally pH and pO2, especially for compressions). There are several methods to measure the temperature at the sensing site.

At this point, let me say that I do not know _precisely_ which method is used in which sensor; I can only make reasonable guesses based on patent parsing, probabilities and side indicators. The Dexcom could very well use its platinum electrode wire as an RTD, for example by driving it from time to time with excess current and measuring its resistance.

The Libre thermal compensation

Some of the things I will say in this paragraph are confirmed, some of them are best guesses. Some of them, I am sure, will be wrong. Bear with me. The thermal compensation of the Libre signal is described in this patent. Like all “good” patents, there is some obfuscation as many methods are described.

I have worked on the assumption that the Libre follows the 2 point calibration method given in the patent.

Very briefly, that means that the Libre relies on both a “skin thermistor” – that one and its small well in contact with the skin is clearly visible – and a board thermistor. Assuming a certain core temperature (say 34°C), you need to estimate the temperature of the sensing site (below but close to the skin, say 5mm) by measuring a skin temperature that is dependent on the core temperature and the external temperature. A second thermistor, the “board thermistor” located a bit above the skin thermistor adds another measure point that allows you to compute the gradient between the core temperature and the outside world (which can be quite close to the core temperature under clothes for example) if you know the exact distance and thermal conductivity between those two thermistors. In practice, you could also rely on a one point measure (which is what I am doing currently) but there are interesting pluses and minuses to 1, 2 and 3 points gradient estimations.
In a wider context, this method fits with
  • the Libre not having a metallic wire that can be used as an RTD, which could allow a lower cost electrode design.
  • Abbott having the Libre approved for the arm site only (core temperature is different from the abdomen, which is higher and stabler). (see Senseonics' troubles with the wrist for additional pointers)
  • Abbott discouraging covering the sensor and not replacing misbehaving sensors that were covered (covering potentially messes up the gradient temperature computations)
and a bunch of other anecdotal pieces of evidence.
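To make the two thermistor idea above a bit more concrete, here is a minimal Python sketch of a linear extrapolation from the two readings down to the sensing site. It is only an illustration, not Abbott's method and not the patent's formula. The function name, the parameters, the steady-state assumption and the `conductivity_ratio` fudge factor (sensor body conductivity divided by tissue conductivity) are all hypothetical.

```python
def estimate_site_temperature(skin_c, board_c, thermistor_gap_mm,
                              site_depth_mm, conductivity_ratio=1.0):
    """Toy two-point estimate of the temperature at the sensing site.

    Assumptions (not from the Libre or the patent): heat flows in a straight
    line from the body to the outside, in steady state, so the temperature
    changes linearly with distance in each material.
    """
    if thermistor_gap_mm <= 0:
        raise ValueError("thermistor_gap_mm must be positive")
    # Gradient measured between the two thermistors, in degrees C per mm,
    # positive when the skin side is warmer than the board side.
    gradient = (skin_c - board_c) / thermistor_gap_mm
    # Extrapolate below the skin; the ratio rescales the gradient because
    # tissue and sensor body do not conduct heat equally well.
    return skin_c + conductivity_ratio * gradient * site_depth_mm


# Skin thermistor at 32.0 C, board thermistor 2 mm above it at 31.0 C,
# sensing site 5 mm below the skin.
print(estimate_site_temperature(32.0, 31.0, 2.0, 5.0))
```

When both thermistors read the same value (for example under thick clothes), the gradient is zero and the estimate collapses to the skin reading, which is effectively the one point case.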

I can still be wrong, of course:

Abbott could be using one point compensation, but I believe they do not because I see data fitting a 2 thermistors scenario.

Abbott could be using three point compensation with a metallic wire I missed or some other fancy property of their sensing wire they could use as a RTD.


At this point you may begin to think “It sounds great, but where is the data to back this”? Let me show you. But first, let’s have a thought for our sensor 0M0000U0Q68 that suffered a quick and painful death thanks to an impromptu meeting with a door frame.

We’ll be assuming T2 is one of the temperatures of interest. Let’s pick the sensor up and put it in my jeans pocket. There will be no glucose measure as the sensor wire is now out of the body. Ah, it seems my jeans pocket is warmer than the air outside… (dataset: warmedinpocket)


Let’s drop it on my “lab” table. 21C ambient, should be about right. (dataset: postremovalsuite)

Oops, I forgot it for a while, but now comes a week-end. Let’s take a heating bed and put it at 40C (dataset: 41c)


And let’s slowly bring the temperature down to 36C (dataset 36c)

then 32C (dataset 32c)

then back to room temperature (around 20-21) (dataset 21c)

and finally outside (9C reported by an external thermometer, dataset 9c4)

Those measures and data sets clearly show there is some validity to the interpretation.
Things to keep in mind: amateur temperature measurements are a pain – breathing on the setup, the height of the table, etc… have an impact. The values are relative, they fit and track the circumstances, but the Libre doesn’t necessarily see them that way. We are not talking “skin”, “board” or even delta between skin and board here. Just ambient as a whole.


Let’s look again at what happens during the most drastic change, when the sensor was placed on the heat bed, this time with a bit of code

Mandatory disagreeable note: at this point, the reader is expected to know which 2 bytes I am talking about. If he doesn’t know, he just looks them up in the data dump.

t2, high first [12417, 12417, 12417, 12417, 7040, 4992, 4224, 4224, 4224, 4480, 4480, 4736, 4736, 4992, 4992, 4992]
for i in sortedimmediatevalues:
    r2 = hex_str_to_int(i[3])
    r2m = r2 & bitmask14
comment: interesting, values go down as temperature increases, as expected from resistance based thermistors.

t2 inverted [3966, 3966, 3966, 3966, 9343, 11391, 12159, 12159, 12159, 11903, 11903, 11647, 11647, 11391, 11391, 11391]
temperatures2inverted = [16383-x for x in t2h]
comment: but I am a human, and want them to go up. (dirty hack!)

Temperatures 2, TI [13.83, 13.83, 13.83, 13.83, 30.76, 36.77, 38.97, 38.97, 38.97, 38.24, 38.24, 37.51, 37.51, 36.77, 36.77, 36.77]
import cmath

def ConvertTemperatureTISpec(counts):
    # Solve T**2 + 273*T - counts = 0 and keep the positive root
    a = 1
    b = 273
    c = -counts
    d = (b**2)-(4*a*c)
    sol1 = (-b+cmath.sqrt(d))/(2*a)
    return round(abs(sol1), 2)
comment: the TI FRL thermistor formula gives reasonable looking results but my room is definitely not that cold

Temperatures 2 final [20.51, 20.51, 20.51, 20.51, 35.4, 41.07, 43.2, 43.2, 43.2, 42.49, 42.49, 41.78, 41.78, 41.07, 41.07, 41.07] 
def ConvertTemperature(counts):
    sol1 = counts*0.0027689+9.53
    return round(abs(sol1), 2)
comment: that works much better for my purpose… 
This works with my setup, in the temperature range I am interested in. I have exactly zero idea if that is how the FAL sees things. The result I am using could very well be totally off in terms of absolute values. I could be in the linear part of some complex spline or a dangerously exponential function and I would not know it. I am not an electrical engineer, just a tinkerer. I am particularly concerned about the bottom range of the temperatures: did I hit a hard limit? Not sure. At some point, but I don’t know precisely where, the Libre reader just reports “too cold”. And please note that, in order to avoid a shutdown on no decent glucose values, I could not use the official reader during the experiment.
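For convenience, the fragments above can be gathered into one self-contained Python sketch that reproduces the first few values of each list. Some of it is assumption rather than fact: `bitmask14` is taken to be 0x3FFF (the low 14 bits, consistent with the 16383 used in the inversion), the hex input format of `raw_counts` is hypothetical, and `math.sqrt` replaces `cmath.sqrt` since the discriminant is positive for non-negative counts. The function names are mine.

```python
import math

BITMASK_14 = 0x3FFF  # assumption: bitmask14 keeps the low 14 bits (16383)


def raw_counts(hex_word):
    """Parse two bytes given as a hex string (high byte first), keep 14 bits."""
    return int(hex_word, 16) & BITMASK_14


def invert_counts(counts):
    """Flip the scale so that values go up when the temperature goes up."""
    return BITMASK_14 - counts


def convert_ti(counts):
    """Positive root of T**2 + 273*T - counts = 0 (the TI-style formula)."""
    return round((-273 + math.sqrt(273 ** 2 + 4 * counts)) / 2, 2)


def convert_linear(counts):
    """Empirical linear fit from the post; only checked on this one setup."""
    return round(counts * 0.0027689 + 9.53, 2)


t2 = [12417, 7040, 4992, 4224, 4480, 4736]  # distinct values from the heat bed run
inverted = [invert_counts(c) for c in t2]
print("inverted:", inverted)
print("TI      :", [convert_ti(c) for c in inverted])
print("linear  :", [convert_linear(c) for c in inverted])
```

The same caveats apply as for the original snippets: the linear fit tracks this particular setup in this particular range and says nothing about how the sensor firmware itself converts the counts.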


Yes, I have ideas about the other two bytes. But they are noisy (as the patent hints they would be) and I am currently considering (but not using) them as a delta. I’d rather not talk about them in public. It is easy to see patterns when there are none.

Finally: I am not currently actively looking at the Libre anymore. I just decided to share past data in the hope more competent people could have a look at it.

 download dataset

Wednesday, August 2, 2017

Clean, but shorter, Dexcom G4 (505) Freestyle Libre comparison

Since my previous post triggered a few private reactions, here’s another comparison on a fairly standard situation, with clean data: clocks are in perfect synchronisation, there are climbs (pre-game carb loading) and falls, including a severe low (delayed hypo).

On the left, the data as downloaded. On the right, the data shifted for the best correlation (which basically means that the Dexcom data is rolled back in time to erase the delay). That post-mortem analysis is both realistic and a bit unfair to the Dexcom. Realistic because the Libre raw data matches historical data quite well. A bit unfair because the Libre only provides delayed and adjusted historical data. Adjusted relative to what? The spot checks. As I have shown many times on this blog, spot checks are typically even faster than the Dexcom in practice, with the drawback that they are really inaccurate at times, especially on the high side.


In this case, the best correlation is found with a shift of 5-6 minutes (Libre ahead of the Dexcom by 5-6 minutes). This is fairly typical of what we see with the Libre vs the 505, when everything works well for both sensors. That’s the tricky part in practice of course: adhesion issues and desynchronisation between insertions (i.e. comparing a fresh Dexcom to a Libre in its second week) all play a role.
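The "shift for the best correlation" step can be sketched as a brute-force search over lags. This is only a minimal illustration, not the code used for the charts. It assumes both series have already been resampled onto the same regular time grid (for example one value per minute, in which case the lag is in minutes), and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(leading, trailing, max_lag):
    """Return (lag, r): how many samples `trailing` lags behind `leading`."""
    if len(leading) != len(trailing) or max_lag >= len(leading) - 1:
        raise ValueError("series must have equal length, longer than max_lag + 1")
    best = (0, pearson(leading, trailing))
    for lag in range(1, max_lag + 1):
        # Roll the trailing series back in time by `lag` samples.
        r = pearson(leading[:-lag], trailing[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic check: the second series is the first one delayed by 6 samples.
leading = [100 + 50 * math.sin(i / 10) for i in range(120)]
trailing = [leading[0]] * 6 + leading[:-6]
print(best_lag(leading, trailing, max_lag=15))
```

On real data the correlation peak is much flatter than in this synthetic case, which is one reason why a range (5-6 minutes) is more honest than a single number.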

Broadly speaking, the sensors see the same thing. The 505 data is a bit more bumpy: that is a consequence of the adaptive 505 algorithm and, of course, of the smoothing introduced by the Libre historical data.

One important point: as you can see in the left Bland Altman plot, two well working sensors can show very significant differences based on timing and rate of change.

Regardless of the absolute magnitude of the differences, a consistent behavior emerges: the Libre overshoots highs compared to the Dexcom and undershoots lows to a lesser (absolute) extent. This type of behavior could be the consequence of the calibration slope of the BGM used to calibrate the Dexcom, but we have observed the same behaviors with different BGMs (Menarini Glucomen LX, Roche Accu-Chek Mobile, Abbott’s Libre BGM). If you are interested in that behavior, the 2014 and 2015 posts on this blog provide additional insight.

The third screen is a log/log plot privately suggested by L. and is basically a Bland Altman on steroids that amplifies the visualization of the differences in behavior in a way that is less dependent on absolute differences. (I am sure I will be corrected if I didn’t get that right).
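For readers who have not met Bland Altman plots before, the underlying numbers are simple to compute. The sketch below is a generic illustration with hypothetical names, not the code behind the screens above: for each pair of readings it returns the mean (x axis) and the difference (y axis), plus the overall bias and the usual 1.96 SD limits of agreement.

```python
from statistics import mean, stdev


def bland_altman(series_a, series_b):
    """Return (means, diffs, bias, (lower, upper)) for paired readings."""
    if len(series_a) != len(series_b) or len(series_a) < 2:
        raise ValueError("need two paired series with at least two points")
    means = [(a + b) / 2 for a, b in zip(series_a, series_b)]
    diffs = [a - b for a, b in zip(series_a, series_b)]
    bias = mean(diffs)      # systematic offset between the two devices
    spread = stdev(diffs)   # sample standard deviation of the differences
    return means, diffs, bias, (bias - 1.96 * spread, bias + 1.96 * spread)


# Made-up paired readings in mg/dL, for illustration only.
libre = [100, 150, 200]
dexcom = [90, 150, 190]
means, diffs, bias, limits = bland_altman(libre, dexcom)
print(means, diffs, round(bias, 2), limits)
```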

Beautifying the data

Now, let’s look at the old Clarke plot of the Dexcom vs the Libre. (yes, I know, Clarke plots are out of fashion, but I have had the function for ages, so why not…)

First the un-shifted data plot.


Quite decent match, you would not have killed yourself by relying on either device.

Now, the delay corrected data plot.

Isn’t that something? We have gained almost 8% in the A zone.

Now, this doesn’t mean anything in absolute terms. For all we know, the Dexcom could have been right and the Libre could have been overshooting. Only one thing is certain: the delay.
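As a rough illustration of how an "A zone" percentage is obtained, here is a minimal Python sketch. It only implements the commonly quoted zone A rule of the Clarke error grid (estimate within 20% of the reference, or both values below 70 mg/dL); it is not the plotting function used above, the names are hypothetical, and in a sensor vs sensor comparison the choice of which device plays the "reference" is arbitrary.

```python
def clarke_zone_a_percent(reference, estimate):
    """Percentage of pairs falling in zone A of the Clarke error grid."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("need two non-empty paired series")
    in_zone_a = 0
    for ref, est in zip(reference, estimate):
        both_low = ref < 70 and est < 70          # both in the hypo range
        within_20_percent = abs(est - ref) <= 0.2 * ref
        if both_low or within_20_percent:
            in_zone_a += 1
    return 100.0 * in_zone_a / len(reference)


# Made-up values in mg/dL: three of the four pairs fall in zone A.
print(clarke_zone_a_percent([100, 100, 60, 200], [115, 130, 65, 170]))
```

Running the same function before and after shifting one series in time is enough to see how much a pure delay correction can move the percentage.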

But this tells us something else: it is extremely easy to tweak test results to your liking. Something as simple as asking patients to test 2 hours after a meal vs asking them to test 1.5 hours after a meal, something seemingly as innocuous as using standard meals or standard sport sessions can have a drastic impact on the numbers. In a market where T1D fanboys love to argue about the 1% MARD advantage of their sensor (while at the same time losing 10% MARD or more through home made hacks), a couple of percent of difference can mean a huge amount of good publicity…

Tuesday, August 1, 2017

Non clean Dexcom vs Libre comparison

Real life has interfered – that would probably be a good “psychological burden of chronic disease” post if I was in the mood – and, while the blog hasn’t been updated, it isn’t dead yet.
Here’s a new comparison between the Libre and the Dexcom 505. Unlike one of the previous comparisons posted here, this one is utterly “unclean”. In short:
  • this was a tennis tournament week, with frequent games.
  • Max forgot to scan with the Libre, or simply forgot the Libre reader. The straight green lines are those no data periods.
  • both sensors were on the arms: we experienced several adhesion issues and patched as we went.
  • variability is much higher than usual because we were “pre-loading” a bit for games (not very useful, but better than starting too low anyway) and experienced severe delayed hypos on a couple of occasions, despite minimal levemir doses (5U / 24 hours)
In other words, ultra messy real life…

ERRATUM: G4 505 vs Libre - legend copy paste error. Thanks to KS for spotting it.
While I would not draw too many conclusions out of such an awful data set, here are some comments:

It is good to have backup. We lost a Dexcom sensor almost at once (not shown here) and the Libre started dangling after a few days. Interestingly, the Libre started to read a bit too low and sensing delay increased a lot. The yellow marker on the above chart marks the near sensor loss moment. When Max noticed (or paid attention), we used a bit of opsite to stabilize the sensor and normal operation resumed.

The Libre remains, in general, faster than the Dexcom 505 algorithm, and even more so if one looks at spot checks (with the drawback that those can be off when the trend changes suddenly). We now have a year or so of side by side data and experience and the result is always the same. Yes, on occasions the Dexcom will pick up a trend before the Libre does (as reflected in historical data) but I don’t remember seeing it picking up a trend before Libre spot checks. Depending on the data set, the optimal correlation between the two signals consistently gives a 6 to 10 minutes advantage to the Libre.

Note: I am not really that interested in collecting additional very clean data. In order to make a rigorous comparison, we need to sync the device clocks on a regular basis, we need precise reference points such as “timecode” BG tests, we need mechanically stable sensors, reminders to scan at least once every 8 hours, etc… All of this adds to the management burden of a teen T1D and that is something I don’t really need.

In practice, that speed advantage needs to be taken with some caution:
  • the Libre historical data is computed and corrected a posteriori (as shown here). It is not useful in real time.
  • the Libre spot checks are typically faster than historical data, but the delay compensation (combined with the so-so temperature compensation) often introduces overshoots.
Still, the Libre remains our favorite sensor for sports.
Excluding the excursions introduced by the interpolation, the Bland Altman plot is relatively flat. Still I wouldn’t draw any conclusion in terms of absolute slopes/biases because the G4 505 depends to a large extent on the calibration it receives (the nasty non linearity of the original G4 has been reduced in the current sensors/algorithm combo).

I realize quite a few issues I addressed here need a more detailed discussion, more data and detailed examples. Please treat this post as a simple keep-alive ping.

Thursday, May 11, 2017

Just a "standard" situation...

For some reason, even though we have a fairly strict rotation routine when it comes to Max's Levemir injection, we are now often confronted with situations where the slow acting insulin seems to fail to act... I do not have a clear explanation for that: Max doesn't seem to skip his injection and there's no site/situation/meal/physical activity that I can correlate the rises with.

Anyway, here's such a situation, but also an illustration of many of the practical issues we face.

green segment: flattish around 100 mg/dl with a couple of mild compressions, no big deal.

By the way, a word about compressions: I often read very specific descriptions of compressions (transient sensor attenuations) in the T1D forums and groups. The compression should be abrupt, deep, and should end with a rebound. That is partly true: a major compression may indeed unfold that way. But in practice, the compressions we detect and visually confirm can take almost any form. They can be partial and lead to fairly minor attenuations with no rebounds. They can be masked, as is almost the case here, by a simultaneous increase. Be open, observe and learn: you may encounter compression lows, but also compression steady states or even compression highs (where the compression attenuates the ongoing rise).

third compression: that one is a major PITA. While it is detected, it masks, in a plausible way, the rise that is happening at that moment.

compression exit: the trend starts to appear. But we need a few packets to make sure it is not one of those post-compression rebounds we see now and then. Unfortunately, another mild compression confuses the situation even more (and at that point, the compression detection algorithm, lacking a clear trend, has given up).

correction: the trend is now clear. Since we have seen such situations get out of hand quickly, the time has come for a quick Libre and blood check (see below): the Libre reports 230 mg/dl. The Roche Accu-Chek reports 225 mg/dl. The Dexcom still lingers at 160 mg/dl, one arrow up.

effect: as expected, around 6 packets later, the correction effect shows up.

Here's what the BG Meter and the Libre showed. Disregard time differences: both the BGM and the Libre are still running on winter time and both have drifting clocks. The actual time is 01:20 for everything.

A couple of comments on the sensors and accuracy.
  • the Dexcom is running the G4 share 505 algorithm. The sensor is 5 days old.
  • the Dexcom has been calibrated with the Roche Accu-Chek BGM used here.
  • the Dexcom is on the right arm.

  • the Libre is on day 12 of its life cycle.
  • that particular Libre sensor has been eerily accurate through the session.
  • the Libre is on the left arm.

I could be tempted to blame the Dexcom and praise the Libre and, to be honest, to some extent, I do.


  • this is the ideal situation for the Libre "delay compensation" algorithm. None of the fancy factors where it goes a bit crazy are present.
  • the Libre hasn't been compressed.
  • this Libre sensor has been noticeably better than average (MARD of 5% vs Accu-Chek over the whole period, but not enough data to be statistically significant).
  • that Dexcom sensor has been underperforming a bit for reasons that I can't be certain of.

And what about the correction?

I hit hard. Very hard. Based on our experience, when the Levemir injection seems to fail, EGP can spiral out of control (we did get our first ever 400 mg/dl on such an occasion). I used about 2.5 times more insulin than I would use to correct that trend in daytime.

There's always a bit of anxiety when using such a relatively high dose (8U) in the middle of the night. I do want to avoid the yo-yo situation where I have to correct a low later. And, at first, the huge drop after the plateau isn't reassuring. What if the fall accelerates? That is always a question that lingers.

As it turns out "insulin resistance", or EGP, or a mix of both is so high in those circumstances that the situation should evolve well. But that is an opinion based on our fuzzy experience and gut feeling, not a computable one, if only because the previous nights were OK and we have no definite idea about the current insulin sensitivity level.

As you can see, the trend settles quickly.

And even if I am usually very confident with my decisions, I will lose a few hours of sleep, keeping an eye on the situation just in case... and write this blog post to kill time.

Sunday, February 19, 2017

“Zero Carb” day on a non T1D person

I have already posted a few non diabetic CGM/FGM response patterns to food and exercise and even made my 2014 complete 14 days run available on this blog. In this very quick post, I will simply share the result of a full “strict zero carb” day on a non diabetic (your servant, now almost 54yo). A few BGM test strips were spent to ensure the CGM/FGM was working perfectly. The minimum of 62 mg/dL probably wasn't reached and came in what definitely looks like a prolonged compression.
I felt a bit dizzy around 15:00. My urinary ketones were positive at 16:00.

I will stubbornly avoid discussing my opinions on that type of diet, short term or long term, in adults or kids. A comprehensive review of its issues and merits can be found here (pdf) on the paleomom blog

Tuesday, February 7, 2017

Libre Clinical Study and discussion

The blog has slowed to a crawl, I apologize. The reasons behind my relative silence are

  1. Max has reached the tender age of 16. That means that teen issues and behaviors have become more common, impacting his control and our mood. I believe every T1D or T1D caregiver can relate to that situation, which means I will leave it at that. Our latest HbA1c, a week ago, was still 5.5% but I believe this will be one of the last times we’ll see values below 6%. I will try not to despair, as there definitely are trade-offs one must accept if a kid is to have a semi-normal adolescence.
  2. We are going through an extensive remodeling of our environment and that takes time.
  3. rant alert
    Finally, as much as I hate to write this, I have lost interest in most open-source, community driven projects. I think I need to qualify that statement a bit before I get a lot of flak. As far as making data accessible everywhere and anywhere, I am still extremely grateful to the community as a whole, and especially the core members of the Nightscout project who made that data conveniently and cheaply available. The open source, or semi-open source community is great at developing features that actual T1D and T1D caregivers need or want. What really deeply annoys me, however, is how little attention is paid to the delivery of accurate results. Adding a new display device, check. Adding new minor features or screens, check. Accuracy, not so much. Assuming one wants to deliver accurate results from raw data, there is a bit more to it than jumping from a single point calibration to another, or calculating an arbitrarily constrained slope. Occasionally, two open or semi-open source solutions are compared: they show a 50 mg/dL difference, eventually absurdly amplified by the lever effect of a bad slope, devices are rebooted, restarted and the community moves on. That is not to say that I would or that I privately do better, at least in a way that is applicable to a general population, but it is precisely because I am aware of the potential issues that I decided not to inflict my experiments on innocent bystanders. On top of that, in the Libre world, the “semi-open source” approach, consisting of an incomplete github source dump that often misses all the computation parts, irritates me. Don’t think for a minute that those effectively closed source solutions are hiding some miraculous sauce: they aren’t. The reason for the omission is often that they simply want to hide how they turn a very nice sensor like the Libre into something that behaves and performs like a second generation Medtronic sensor…
    end rant alert

The study

Let’s now have a look at the study, recently published in the British Medical Journal, “An alternative sensor-based method for glucose monitoring in children and young people with diabetes.” which you can download here.

The work was sponsored by Abbott: they were involved in the planning, the funding and the provision of devices used in the study. Except for the possible cherry picking of sensors used in the study and a slight cherry picking of the competitors' studies cited, I did not spot any obvious red flag. The population studied was a set of 4-17 yo children and teens that, according to the additional data (for example 7.6% mean HbA1c), seems to be a bit better controlled than the average population, since 75% of that normal population does not meet the 7.5% target. Such a small bias may have had some impact on the study (more on this below) but it is probably because the authors of the study deliver better than average care.

The conclusions of the study were, in short, MARD vs SMBG (capillary) 13.9% in that population (vs 11.4% in a previous adult study) and 99.4% in the AB zone of the CEG. That is in line with the reported accuracy of the Dexcom G4 505 in some studies, although Dexcom likes to focus on its best study exclusively.

The general conclusion was that the device could be trusted, was well accepted and, usual scientific caveat, could be beneficial long term. Well, there is nothing groundbreaking here, we all knew that, didn’t we? The benefit of that study is to be found elsewhere: respected researchers and clinicians, a fair number of cutaneous adverse effects (unlike in some previous studies), a protocol that does not smell of manipulation – that will drive acceptance and adds arguments for funding and full coverage.

Some personal comments

We do consistently get better accuracy than what the study reported on average. This is probably attributable to the fact that our “bad” weeks were 80% in range, our “good“ weeks were 90% in range while the population studied only stayed 50% in range. Incidentally, as a non T1D, when I ran sensors we had purchased in France on myself, I stayed at an 8% MARD for 12 days before the sensor started to drift. Variability, and the more frequent and usually rapid change of range it implies, definitely affect the CGM accuracy numbers.

The “acceptance” part of the study is very positive for Abbott. Again, we all know that. In fact, despite the overwhelming satisfaction expressed by the participants in the study, I believe the benefits to be understated. I always come back to our tennis experience on that issue: being able to play a full tennis tournament on a single daily SMBG check (as opposed to 10 to 15 checks per match) was just amazing. This was due both to the general accuracy of the device and to its delay, which was, in our carefully documented experience, 9 minutes shorter than the Dexcom G4 delay. For us, the Libre wasn’t merely a well accepted replacement, it changed our experience of T1D for the better.

On the delay side, the authors of the paper note “no delay”. This is really where I want to nitpick a bit. There definitely is a delay (quite visible in RAW data at stable temperature). It is simply partially compensated and partially obfuscated by the behavior of Abbott’s algorithm.

It is extremely visible in chart B of the paper

as you can see, the sensor is – on average, note this is MRD not MARD – essentially perfect in stable or near stable conditions. The most significant relative differences occur in dynamic conditions and in the same direction.

In other words, when you are falling quickly, the Libre trails the fall and reads higher (probably missing some hypos), almost as a non delay compensated CGM would do. When you are rising quickly, the Libre leads and overshoots the rise (overestimating some hypers).
This is a behavior we noticed immediately (see here, here and here for some of our 2014 reports) and have consistently observed since.
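Since the MRD vs MARD distinction matters for reading that chart, here is a minimal Python sketch of both metrics. The names are hypothetical and the readings are made up. The signed version (MRD) lets overshoots and undershoots cancel each other, which is why it can look "essentially perfect" on average while the absolute version (MARD) does not.

```python
def relative_differences(sensor, reference):
    """Signed relative differences in percent, one per paired reading."""
    if len(sensor) != len(reference) or not sensor:
        raise ValueError("need two non-empty paired series")
    return [100.0 * (s - r) / r for s, r in zip(sensor, reference)]


def mrd(sensor, reference):
    """Mean relative difference: keeps the sign, so errors can cancel out."""
    diffs = relative_differences(sensor, reference)
    return sum(diffs) / len(diffs)


def mard(sensor, reference):
    """Mean absolute relative difference: the usual accuracy figure."""
    diffs = relative_differences(sensor, reference)
    return sum(abs(d) for d in diffs) / len(diffs)


# One reading 10% too high, one 10% too low (made-up values, mg/dL).
sensor = [110, 90]
reference = [100, 100]
print(mrd(sensor, reference), mard(sensor, reference))
```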

I believe, just as I believed in 2014, that this is mostly the result of the Abbott delay compensation algorithm. It is not necessarily a failure of the algorithm (although, looking at the raw data, it appears it could be improved) but possibly a conscious decision by Abbott, either based on a technical issue such as an eventual lower signal to noise ratio in low ranges, or based on physiological issues they have identified in the BG to IG dynamics on falls.

I am of course quite happy and a bit proud to have identified the issue in 2014, while remaining aware our test population was n=2.

One last point on the delay issue is that the authors noted that the granularity of their time measurement was 5 mins. Timing issues are really critical as far as delay computations are concerned, which is why when we tested SMBGs vs Libre we always used immediate spot checks (because that is what matters to the patient) and I had to programmatically resynchronize the clocks on each check (both the Libre and our BG Meter had drifting internal clocks). I used the same constant resynchronization technique with the Libre vs Dexcom comparison in order to maximize accuracy. Ballpark figures give a 15 minutes delay on the Dexcom G4; with a 9 minutes advantage for the Libre, you end up with a six minutes average delay for the Libre vs SMBG (confirmed by our Libre vs SMBG tests in slow rises and slow drops), which would be hard to demonstrate with a 5 minutes granularity, especially if the comparison is not versus spot checks but versus inferred values from the 15 minutes averages.

Last comment: in absolute terms, you should keep in mind that the MARD given in that paper is most probably Libre CGM vs Libre BGM (or other Abbott BGM) and might be a bit biased as the same fundamental decisions have obviously driven the design of both devices. I do like that bias myself as the use of different BG meters would have muddied the algorithmic issue even further and would probably have required a set of Bland Altman plots to debias/detrend the data.

Apologies if I sound obsessed by speed issues, but as far as we are concerned, that was and probably remains (until a full CGM is available one way or another) the defining advantage of the Libre versus the Dexcom G4 or Dexcom G4 505.

Monday, January 9, 2017

It’s always compression, duh!

Let’s be honest, I’ve been waiting for a moment like this one, where algorithms trump my quick visual assessment.

Here’s the situation on a relatively standard scale: can you spot anything? Don't cheat and look below.


21:00 Max starts climbing slowly. Because he had a small hypo earlier, the light, mostly protein evening meal (salmon) he had is starting to show up.
21:30 recalibration: the Dexcom was a tad too high, the Libre (on the other arm) was spot on at 108 mg/dL. The decision is made to take no action because we know that the 20:30 injection (20:15 in this case) will start pushing BG back down a bit roughly 3-4 hours after it is given.
Sure enough, we seem to be on a small downtrend starting around 23:15, for a 155 mg/dL high. Yes, that is not ideal, but at some point we have to consider the trade-off between undisturbed sleep and perfect BG. Today, undisturbed sleep was the intent.
At first sight, this slope still looks like a mild downtrend, with a bit of noise.
However, this is what I get in another view: my compression detection algorithm has triggered!


Interesting… Time to look at the decision parameters

Parenthesis: my compression algorithm isn’t wildly different from what has been published in the literature. I developed it independently in 2015, as a toy project. In a nutshell, the algorithm examines the last few hours available (at least an hour, although I can conveniently fine-tune the parameters), assesses noise and overall trend, and builds “confidence” on those values. It’s a bit of a cookbook of hacks and rules. For example, the SD of the detrended signal gives a good indicator of the current “meta” noise level in the signal: a drop caused by a compression should, obviously, be larger than the SD of that detrended signal by some factor (one that I tuned based on experience and visual assessment). It also goes without saying that the delta must be negative. On top of that, a few rules have been added here and there for experimentally observed special cases.
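
A minimal Python sketch of the kind of rule described above, not the actual implementation: it fits a straight line to the recent readings, uses the SD of the residuals as the noise estimate, and flags a new reading only if it falls below the extrapolated trend by more than a chosen multiple of that SD. Everything named here is an assumption made for illustration: the function names, the linear detrending, the `factor`, `min_points` and `noise_floor` parameters and their default values. The "confidence" logic and the special-case rules are left out.

```python
from statistics import mean, pstdev


def fit_trend(values):
    """Least-squares line through equally spaced readings.

    Returns (slope per step, mean of x, mean of y, SD of the residuals).
    """
    xs = range(len(values))
    x_bar = mean(xs)
    y_bar = mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    slope = sxy / sxx
    residuals = [y - (y_bar + slope * (x - x_bar)) for x, y in zip(xs, values)]
    return slope, x_bar, y_bar, pstdev(residuals)


def looks_like_compression(history, new_value, factor=3.0,
                           min_points=12, noise_floor=1.0):
    """True if new_value drops below the recent trend by more than
    factor * (detrended SD). All thresholds are illustrative guesses."""
    if len(history) < min_points:
        return False  # not enough data to trust the trend
    slope, x_bar, y_bar, sd = fit_trend(history)
    sd = max(sd, noise_floor)  # avoid a zero threshold on a perfect line
    expected = y_bar + slope * (len(history) - x_bar)  # one step ahead
    delta = new_value - expected
    return delta < 0 and abs(delta) > factor * sd


# Hypothetical hour of 5-minute readings on a mild, clean downtrend (mg/dL).
hour = [150, 149, 150, 148, 149, 148, 147, 148, 146, 147, 146, 145]
print(looks_like_compression(hour, 135))  # sharp drop below the trend
print(looks_like_compression(hour, 145))  # in line with the trend
```

The cleaner the recent signal, the smaller the detrended SD and the smaller the drop needed to trigger the rule, which matches the behaviour described in this post: a mild dip stands out only when the preceding hour is smooth.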

At 00:53, the new value enters the “hour buffer”, which happens to have an extremely high level of confidence. Note that the algorithm did not have that level of confidence at 23:23, post peak, where the hourly trend was less clear and a bit of noise (maybe a transient compression) pushed the detrended SD a bit higher.

That being said, the case isn’t settled at this point, so I decide to zoom in on the chart and go up and check. Max is indeed leaning on his Dexcom, and not leaning on his Libre. The Dexcom, which had been tracking the Libre after recalibration, is actually 10 points below.

The acid test is of course to move Max a bit, from leaning on his Dexcom to not leaning on either device. Here is a zoom on what happened: the very mild compression recovered almost immediately.


  • Instead of being slightly down, the trend is actually either stable or very slightly trending up. Knowing this allows me to push a 1 U correction (being extremely conservative here in order to avoid any hypo risk).
  • The scale at which we are looking at our CGM signal impacts our perception and our assessment of a situation (which is one of the reasons I developed my own “in-the-cloud” visualization, which I can tweak and zoom to my liking).
At this point, I can already hear the dissenters saying “How in the world can you tell it is a compression on such a small variation? My Dex can be off by xx points or jump around.”

Good question: let’s answer this methodically
  • the custom “artificially intelligent” algorithm says so ;-)
  • the Libre says there was no drop.
  • I have confirmed visually that Max was sleeping on his Dexcom side.
  • I have confirmed that, by relieving the compression, the sensor recovers as expected and resumes cruising.
  • yes, I have no idea whether the “real” level is actually 137, 127 or 147 mg/dL, but it does not matter: the relative change does matter.
  • yes, there are situations where the Dexcom is too noisy, the trend is unclear and the decision is ambiguous, if possible at all.
But when the Dexcom (or Libre) is tracking smoothly, there is very little variation in the signal, or in the detrended signal if in a clear trend. It is that consistency, when the signal is good, that allows Dexcom executives to claim their technology is already much better than BG Meters. That is a statement I can totally agree with… until real life interferes (micro traumas, compressions, failing sensor, encapsulation…) and, of course, except for the fact that the baseline Dexcom values depend to a large extent on the performance of your BG Meter.

Anyway, this blog post is almost live because this must be the first time an algorithm has told me something I might not have noticed. Three years ago I had a quick look at Neural Networks and AI but, while I got them to tell me interesting things and issue decent predictions, they never told me anything I wouldn’t have noticed or predicted by myself. That one is a first!

Ah, and one more thing – let me reassure all the hydrophiles out there, no glass of water was harmed in this experiment.