Tuesday, November 21, 2017

Trying to navigate the MDI and adolescence maze (and failing…)

Since Max does not want to use a pump, we are on Multiple Daily Injections (MDI). On the quick-acting side, we have always used Novorapid pens. We are fairly happy with Novo, assuming it is pre-injected to match the carb absorption curve as closely as possible. On the slow-acting side, we switched from Lantus to Levemir because we couldn’t avoid a systematic low trend when Lantus picked up. Then, for a few months, Max expressed the desire to switch back to Lantus (I have absolutely no idea why he wanted to do that, but since it is HIS diabetes, he gets to choose…). After a few months, we switched back to Levemir, as the “pickup lows” resurfaced. Those aren’t completely avoided with Levemir, but they are less severe.
As far as the schema and doses are concerned, we have two peculiarities:
  • we use a single evening dose of Levemir (around 10 units), in contrast with the standard schema of two doses 12 hours apart.
  • we tend to have low total insulin requirements, in the 0.25 to 0.50 U/kg per day range. At times, after sports, Max can actually skip fast-acting doses and have a meal. We did get a few periods (months) where a very low (<0.25 U/kg per day) TDD was used. Max had an ACTH stimulation test to exclude Addison’s disease, which turned out to be normal (along with insulin dosing).
As far as dosing is concerned, Max typically guesstimates his doses (teens…) without severe consequences: only one trip to the hospital for a post-sport hypo that scared the school (a slight over-reaction, but better safe than sorry), zero glucagon injections required, and 4½ years of sub-6 HbA1c.

If you are beginning to think we have an easy ride, you are wrong…

“What hath night to do with sleep?”

(John Milton, Paradise Lost)

The biggest problem we have is that there is a huge chasm between basal insulin theory and practice, at least in teens. Regardless of the care we put into the injections, variability is king. Our site rotation plan is quite decent. So is Max’s injection technique. Storage is what it should be. Doses are adapted depending on the exercise of the day. Trends are corrected the next day, etc…

Still, too often, the results seem to be a lottery.

One of the questions on my mind was to double check if we could correlate poor outcomes with injection sites. (Well, to be honest, there are many questions on my mind but I will focus on that one in this post).

Site rotation

We use the standard “front of leg” site for basal insulin. We usually plan a site sequence for the week. Unless a mistake is made, we do not use sites two days in a row and, in the worst-case scenario, sites are re-used after a week. Now and then, we reset the site rotation we use. I am aware that some apps can help, but in most cases, apps are a burden and we tend to stay away from them. Max’s two legs are divided into three height zones (upper, middle, lower) and three sides (internal, middle and external), giving us a total of 18 injection sites to rotate through.
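For what it’s worth, a weekly sequence with the “no reuse within a week” constraint is trivial to generate in code. A minimal sketch (the site labels are made up for illustration; we actually plan ours by hand):

```python
import random

# Hypothetical site labels: 2 legs x 3 heights x 3 sides = 18 sites.
SITES = [f"{leg}/{height}/{side}"
         for leg in ("left", "right")
         for height in ("upper", "middle", "lower")
         for side in ("internal", "middle", "external")]

def weekly_rotation():
    """Pick 7 distinct sites for the week: no site is reused within 7 days."""
    return random.sample(SITES, 7)
```

Unlike a human picking sites from memory, `random.sample` guarantees both the randomness and the no-repeat constraint.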

Outcome classification and scoring

Outcomes are defined – subjectively – by examination of CGM traces, as follows.

STB is a stable night, regardless of the level (we of course would correct stable low or stable high levels). This is the usual archetype of a “good basal rate”.
TL and TH are nights with mild trends that you would typically associate with a slightly excessive or insufficient basal rate.

ER, MR and LR are the rises: early in the night (before 24:00), in the middle of the night (between 00:00 and 3:00) or late in the night (between 3:00 and 6:30). These rises are characterized by a sudden, relatively quick increase, steeper than trends but not as steep as meals; they typically run at roughly 30 mg/dL/hour. Obviously, as soon as the increase is confirmed (for example, a night that starts at 90 mg/dL and is at 150 mg/dL after two hours), corrective measures are taken (for example, a 30 mg/dL/hour trend would be corrected by the units needed to bring the 150 mg/dL back to around 100 mg/dL, plus the units needed to compensate the trend for the next 3 hours). That basic algorithm has never failed us on the low side (no hypo caused) but is sometimes insufficient, and it is typically reassessed after 2.5 to 3 hours. Late rises are a pain, especially if they are combined with a dawn phenomenon. As a caregiver, you either have to go to sleep extremely late or wake up extremely early. Setting up CGM alarms is a no-go; they aren’t flexible enough to help. 150 mg/dL at 6 AM is OK (we aren’t going to be able to correct it anyway), but it would be nice to have an alert at 1 AM for the same level when it is rising.
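That correction rule can be sketched in a few lines. The insulin sensitivity factor of 50 mg/dL per unit below is a placeholder, not our actual number:

```python
def rise_correction(current_bg, target=100, isf=50.0,
                    trend=30.0, horizon_hours=3.0):
    """Units to correct the current excess above target, plus units to
    compensate the projected rise over the next few hours.

    isf: insulin sensitivity factor in mg/dL per unit (hypothetical value).
    trend: observed rise rate in mg/dL per hour.
    """
    excess = max(current_bg - target, 0.0)
    projected_rise = trend * horizon_hours
    return (excess + projected_rise) / isf

# e.g. 150 mg/dL rising at 30 mg/dL/hour: (50 + 90) / 50 = 2.8 U
```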

HYP are the severe low trends which require multiple and frequent corrections. They typically rear their ugly heads after intense sport sessions. We do, of course, correct those actively. While we reduce basal in obvious situations and adapt the evening meals after such sport sessions, we never totally got rid of them.

That classification is therefore somewhat subjective and can’t be directly derived by nightly averages.

DAWN: the dawn phenomenon is treated separately. It shows up in random clusters. Its classification is again somewhat subjective. That being said, it is quite obvious when it shows up, kicking an LR into high gear above 300 mg/dL, pushing a TL situation into a TH one, etc…

Scoring is even more arbitrary (even though it is normalized for more sophisticated analysis). STB is worth 10 points; that is the ideal situation. Trends are given 6 points; they are extremely easy to assess. Early rises are not that bad and easy to correct at a decent hour, so they are worth 5 points. Middle-of-the-night rises are still manageable and worth 4 points. Late rises are basically uncorrectable: you would have to disrupt the kid’s sleep way too early for a correction, and you might end up with stacking issues at breakfast. They only get 3 points. Hypo is the worst-case scenario and is worth a single point. ND stands for the few nights the Dexcom wasn’t operational (we cover with the Libre); those are given a “neutral” score of 6.
# Map each nightly outcome to its (arbitrary) score.
OUTCOME_SCORES = {
    'STB': 10,  # stable night
    'TL': 6,    # mild low trend
    'TH': 6,    # mild high trend
    'ER': 5,    # early rise
    'MR': 4,    # middle-of-the-night rise
    'LR': 3,    # late rise
    'HYP': 1,   # severe low
    'ND': 6,    # no Dexcom data, neutral
}

def outcometoscore(outcome):
    return OUTCOME_SCORES.get(outcome)
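The “normalized for more sophisticated analysis” part mentioned above can be as simple as a min-max rescale to [0, 1] (only one possible choice; z-scores would work too):

```python
def normalize(scores):
    """Min-max normalize a list of nightly scores to the [0, 1] range."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]
```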


Here is a visualization of our actual rotation (approx. 300 shots); the diameter of each circle is proportional to the number of shots (OK, the area would be better, but the differences wouldn’t be as visible).

Here is the numerical view
Comment: the human mind is not very good at generating truly random series, even when it intends to. Remember, though, that there is always a week before a site is reused.

We favor the lower zones, possibly because they are easier to reach.

Our leg distribution is quite good.
So is our side randomization.
Now, let’s look at outcomes.
As you can see, we have stable nights less than a third of the time, quite a few “trending low” nights (easy to fix with a few dextrose tablets once the trend is established), about 40% of sudden unexplained increases, and a relatively low number of severe hypos (more about them later). Regardless of what you know, and of how careful you are, the variability of nights is a pain.
We observe a fair number of dawn phenomena, in around 25% of the cases.


Is any site really bad?
It seems there are some differences indeed. However, given the arbitrary scoring system, the small sample size for some sites and confounding factors (more below), a random distribution can’t be excluded.


Height doesn’t seem to be much of a factor either. The lower part of the leg seems to lead to very slightly better outcomes, but that is, strictly speaking, only borderline statistically significant.
Legs are equivalent. That’s a certainty.
The external side of the leg seems to lead to better outcomes (which are heavily weighed towards stable nights).
Another interesting thing to look at is the average score per week day. Saturday is clearly the worst day and that can be explained by the fact that it is often Max’s most intensive tennis day (2.5 hours) where delayed hypos are more frequent. Wednesday is also a “sport day”, with PE in the morning and tennis in the afternoon. That Saturday turns out to also be a significant confounding factor as far as the injection site is concerned: given our weekly rotation basis, the same site may end up being used frequently on Saturdays…
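Computing those per-weekday averages is straightforward once each night is tagged with its outcome. A sketch, reusing the arbitrary scores defined earlier (the sample records below are invented):

```python
from collections import defaultdict
from datetime import date

# The same arbitrary outcome scores as in the classification above.
SCORES = {'STB': 10, 'TL': 6, 'TH': 6, 'ER': 5, 'MR': 4,
          'LR': 3, 'HYP': 1, 'ND': 6}

def average_score_by_weekday(records):
    """records: iterable of (date, outcome) pairs -> {weekday: mean score}."""
    totals = defaultdict(lambda: [0, 0])  # weekday name -> [sum, count]
    for day, outcome in records:
        bucket = totals[day.strftime("%A")]
        bucket[0] += SCORES[outcome]
        bucket[1] += 1
    return {name: s / n for name, (s, n) in totals.items()}
```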


  • The impact of the exact injection site is limited in the absence of lipodystrophy.
  • A good site rotation prevents the appearance of lipodystrophies.
  • Non-computer-generated site rotation sequences are not optimal.
That type of basic analysis is interesting, but not tremendously helpful in the absence of lipodystrophies. I suspect that those would, however, be revealed by careful site tracking.
Other approaches yield slightly more interesting results, at the risk of lacking statistical significance. Some more informed approaches are quite interesting (post-sport night sequences). We may get to that in another blog post.

Even with the best efforts, in a patient who has been able to maintain sub-6 HbA1c values for 4½ years and who benefits from a favorable environment (dual CGM, reasonably informed dad on duty), T1D teen nights are highly variable.

What? no dosing info?

You may be surprised by the lack of dosing information/analysis. In practice, insulin sensitivity varies tremendously from one individual to another. We dose “as best as we can”, following all the standard guidelines. The point of this post was simply to look at the possible impact of sites and assess our site rotation.

Wednesday, October 11, 2017

Libre: the “other” bytes (part 3)

Reminder: Max is currently wearing both the Libre and a Dexcom G4 (505 algorithm, “Share” US reader). We take advantage of the Libre speed during the day and for sports. We love the trouble free Dexcom remote transmission at night and, of course, the alerts. We are not running any third party add-on for the Libre as I remain unsatisfied with them. I am still not convinced that a reliable, independent and non infringing third party solution can emerge. I simply treat the “Libre Problem” as a hobby I come back to now and then.
The time has come for a few corrections and additional information.

First Correction

As I said in a previous post, the Libre can potentially use several temperature compensation methods, as described in their patent (1, 2 or 3 points). To summarize the options:
1 point: at least one point (skin) is certain. There’s the huge thermistor sitting in its nice well, and I can interpret its data in relative terms.
2 points: the “skin and board” option, used to estimate the temperature at the sensing site. That option was my favorite for a couple of reasons: other TI FRL processors offer thermistors and I systematically saw two temperature values move as I heated and cooled the sensor. One of those two values was always a bit ahead of the other, which fit nicely with a diffusion gradient. But there is a catch.

3 points: with skin, board and in-situ thermistors. After having looked at microscopic images of the sensing wire, I think that option can be excluded but, as usual, I can still be wrong.

The catch

It is simple and stupid at the same time. Back in 2014-2015, I immediately split the 6-byte immediate Libre record into 3 words: one for BG, and two that moved on temperature changes, with some flags. I was aware of the existence of the flags, and kept them while masking the observed values. That left me with one temperature (thermistor) value that I could interpret directly, based on experimental correlation, and another that I had no choice but to consider as a delta.

On the plus side
  • I really liked the 2 points compensation design.
  • I had decently working code that I could reliably use for thermal compensation.
On the minus side
  • I was forced to use one value as LSB and the other as MSB, and to mask and keep the flags.
  • my temperature data had a resolution of 0.7°C
Blog reader Robert Gras (r/o/b/e/r/t.gras./3/3at/gmail) pointed out that I would get rid of the 0.7°C quantization issue and have a simpler solution by assembling the bytes differently: the well-known G value, some flags, the thermistor value, some flags. This feels much cleaner – thank you, Robert!
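Under that reading, splitting one 6-byte immediate record becomes a one-liner per word. A sketch of the assembly – the exact layout (little-endian 16-bit words, 14-bit values with 2 flag bits each) is my assumption, not anything documented by Abbott:

```python
BITMASK14 = 0x3FFF  # keep the low 14 bits, leaving 2 flag bits per word

def split_record(record6):
    """Split the 6 raw bytes of one immediate record into three
    (value, flags) pairs, assuming little-endian 16-bit words."""
    words = [record6[i] | (record6[i + 1] << 8) for i in (0, 2, 4)]
    return [(w & BITMASK14, w >> 14) for w in words]
```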
Here’s an example of where I stand right now, with both the immediate values and the historical data.
A few comments
  • that situation doesn’t warrant temperature compensation, at least from my own algorithm’s point of view, therefore G and temperature-compensated G match. It could require a slight tweak if the temperature remained at a lower (relative) level, but I know from experience that a small temperature change needs a few minutes to impact the glucose oxidase activity.
  • the scan and my interpretation match closely (176 vs 180) but that isn’t always the case. While a lot of my sensors match my algorithm nicely, some sensors may appear “off” if one uses a constant interpretation.
  • I have noticed significant behavior differences between my 2014-2015 sensors and the 2016-2017 ones.
  • I am aware that the condition flags would benefit from being displayed in binary form and correlated with events and situations – I have another visualization for that… Maybe later.
  • the interpretation of historical data (the eight hours stored in the FRAM of the reader) is a bit tricky, in the sense that it is delayed, post-processed (mostly smoothed) by the reader, and ends up being a bit different when exported by the PC software.


Blog reader R.Z. brought to my attention an additional thermistor that I had missed on the skin temperature sensing circuit. I am told that this type of circuit is a standard, but I will probably have to redo my experimental temperature measurements at some point.
In a way, one can say that I lost a thermistor (correction 1, much to my dismay) and that I gained another one….


I’ll keep logging data and situations as I enjoy the hobby.
But, to be honest, I am still depressed when I see open or semi open source solutions, sometimes even commercial add-on products, use a basic approach that I rejected in late 2014 or 2015…
A concerted strategy to make progress (legal progress that is…) would be to collect and document as many situations as possible, too hot, too warm, trending up, trending down, not available, noisy, clean etc… completely. By completely I mean
  • complete data dump (NXPTagInfo for example).
  • simultaneous official scan
  • simultaneous error log if any
However, one should keep in mind that the Libre system is actually quite flexible and that Abbott can and has changed things whenever it wants.

Monday, September 25, 2017

Libre: the "other" bytes (part 2)

Let's now have a look at the behavior of the Libre thermistor derived data in my archetypal case.

Here is the April 2015 view of the incident.

Quick reminder: in a stable BG situation, the impact of a temperature change (bath), compounded by the Libre predictive algorithm, led to a very significant error by the "official" Libre reader. Outside of outright malfunction, this is the only situation where I felt the Libre could have been dangerous.

And here is the "thermistor" informed version.
From a stable condition, jumping into a warm bath, here is what happened to my son's sensor.

T2: a thermistor value I am fairly sure of, at least in relative terms (see previous post), starts to climb. That is expected.

T1: a thermistor related value also shows a marked increase, at least in my interpretation, it is also noisier, as expected based on my understanding. Please note that it is not shown to scale in this chart. My interpretation of it is somewhat arbitrary, and so is the scaling.

RAW measured IG: the green line starts to rise. This is most likely due to the sensing site becoming warmer as my kid lies in his bath. Since the mass of his body is huge compared to the mass of the sensor, and the enzymatic reaction has its own inertia, the measured IG rises more slowly. In fact, it is still rising as my son steps out of the bath.

The official Libre scan gives a reading of 194 mg/dL, which almost perfectly fits a basic linear prediction based on the few past minutes (incidentally, the behavior of that prediction algorithm matches almost exactly one prediction algorithm previously documented by Abbott for the calibration of its "full" Navigator CGM. But then, many prediction algorithms would match).

The actual BG was stable. It was double checked with our BGMeter and fit perfectly with the Libre scans prior to the bath and my own interpretation of raw data.

The key question, at least for me, was what could I do with that data.

Well, I could tell when the temperature was rising, what it seemed to add to the raw BG measures, and when and how the delay compensation algorithm kicked in. That may seem like a lot, but it can also be summarized as "Max, please do not trust the Libre after sudden temperature changes or when the predictive algorithm shows its ugly side." In practice, that is all you need to know.

As far as "standard users" are concerned, I could have come up with an algorithm that reflected what I thought an appropriate correction would be and that could have been used in third party implementations. But let's be real for a minute here:

  • I am not foolish enough to believe my algorithm wouldn't be shaky at times. I can't run clinical tests.
  • the Abbott teams are not fools. I would be delusional if I thought I could better them based on incomplete, guessed information.
  • my algorithms (I tried a few) were of course inspired a bit by my own thoughts and a lot by the literature. Even if they had worked flawlessly, I probably would have knowingly and unknowingly trampled a few patents.
  • as I understand things, covering the sensor would not have been a good thing in general.
  • I did not know, in depth, what I was doing. (and I still don't :) )
  • standard users don't care.
That being said, for a while, running a Dexcom/U. Padova inspired smart sensor algorithm on thermal compensated Libre raw data was fun, if very inconvenient. 

Sunday, September 24, 2017

Libre: the “other” bytes (well, some of them at least)

Personal comment

Back in early 2015, when I started my “running the Libre as a full CGM” experiments, I quickly became aware that the core problem was much more complex than simply figuring out the translation of the so-called Dexcom “raw” signal to human readable values. There’s a reason: in the Libre FRAM, what we are seeing is a real “raw” signal. While the measure of the glucose signal itself is fairly reliable, it is heavily post-processed by the Libre firmware. Specifically - and in no particular order – temperature compensation, delay compensation, de-noising… all play a role. That understanding and, to some extent, my MD training, led me to extreme caution and prevented me from releasing my “solution”, which I knew to be both incomplete and unable to handle some error conditions.

The main driver behind my decision was the well known “first do no harm” (primum non nocere) motto, an essential part of the Hippocratic Oath which I symbolically took. I still stick by it today.
However, by the time I came to realize the full extent of the problem, I had already released enough pointers for developers to build partial solutions upon (and have no doubt my meagre contribution would have quickly been replicated anyway, had the Libre been more widely available back then).

Today, there are a lot of add-on devices that aim to transform the Libre into a full CGM. To be honest, in general, I do not like either the results they provide or their (in)convenience. None of those I have tried delivered results that would lead to an approval by a regulatory agency, none of them were stable for long periods of time. But, apparently, patients still feel they are helpful and there is now a thriving community that aims at improving them.

That is the reason why I will release a bit more information about my own experiments. Keep in mind that I can be wrong.

Personal situation

Max is now a real teen (almost 17), with all the warts of that age. We have been running both the Dexcom G4 (with the 505 algorithm) and the Libre in parallel for more than a year now. This is both a “belt and suspenders” and an optimal results strategy. We mostly use the Dexcom at night, when it is most convenient and the Libre during activities, where we benefit from the added speed. We use what we have when one of them fails (rare) or becomes detached.

As far as we are concerned, the main conclusions shared in 2014/2015 and 2016 on this blog remain true. The Libre is faster and more accurate on the whole. An “anal” calibration strategy brings the Dexcom into the same overall accuracy range, but that strategy is now just a fond memory (teen warts…). The Libre sometimes has a mind of its own (predictive failure and poor temperature compensation). My subjective (and almost statistically significant) impression is that both systems have improved a lot in terms of reliability and the post-insertion period.

Let me stress that this is not gospel: the performance and the length of reliable operation of a CGM sensor has a lot to do with its eventual encapsulation as a foreign body: your mileage may vary as your macrophages fuse into giant cells and encapsulate the sensor. For some people, I am sure, the Dexcom wire will behave better.

Thermal compensation

As I have shown here before, the Libre and the Dexcom (like all enzyme-based bio-sensors) are sensitive to temperature variations (and pH, and potentially other things). This is extremely basic bio-chemistry. You can see an example of this here (as a side note, since I am an ill-tempered old fart, I quickly grew tired of arguing with non-believers). Some info on a possible Dexcom temperature compensation strategy can be found here.

That means that the raw signal of a glucose oxidase based sensor has to be compensated for temperature (and ideally pH and pO2, especially for compressions). There are several methods to measure the temperature at the sensing site.

At that point, let me say that I do not know _precisely_ which method is used in which sensor; I can only make reasonable guesses based on patent parsing, probabilities and side indicators. The Dexcom could very well use its platinum electrode wire as an RTD, for example by driving it from time to time with excess current and measuring its resistance.

The Libre thermal compensation

Some of the things I will say in this paragraph are confirmed, some of them are best guesses. Some of them, I am sure will be wrong. Bear with me. The thermal compensation of the Libre signal is described in this patent. Like all “good” patents, there is some obfuscation as many methods are described.

I have worked on the assumption that the Libre follows the 2 point calibration method given in the patent.

Very briefly, that means that the Libre relies on both a “skin thermistor” – that one, with its small well in contact with the skin, is clearly visible – and a board thermistor. Assuming a certain core temperature (say 34°C), you need to estimate the temperature of the sensing site (below but close to the skin, say 5 mm) by measuring a skin temperature that depends on both the core temperature and the external temperature. A second thermistor, the “board thermistor”, located a bit above the skin thermistor, adds another measure point that allows you to compute the gradient between the core temperature and the outside world (which can be quite close to the core temperature under clothes, for example) if you know the exact distance and thermal conductivity between those two thermistors. In practice, you could also rely on a one-point measure (which is what I am doing currently), but there are interesting pluses and minuses to 1, 2 and 3 point gradient estimations.
In a wider context, this method fits with
  • the Libre not having a metallic wire that could be used as an RTD, which allows a lower-cost electrode design.
  • Abbott having the Libre approved for the arm site only (core temperature there is different from the abdomen, where it is higher and more stable). (see Senseonics’ troubles with the wrist for additional pointers)
  • Abbott discouraging covering the sensor and not replacing misbehaving sensors that were covered (covering potentially messes up the temperature gradient computations)
and a bunch of other anecdotal pieces of evidence.
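The two-point idea can be sketched numerically. Assuming a linear thermal gradient between the board and skin thermistors that continues a few millimetres below the skin, the sensing-site temperature is a simple extrapolation (the factor k below is a made-up geometry/conductivity constant, not Abbott’s):

```python
def estimate_site_temperature(t_skin, t_board, k=0.25):
    """Extrapolate the subcutaneous sensing-site temperature from the
    skin and board thermistors, assuming a linear thermal gradient.

    k encodes the (hypothetical) distance/conductivity ratio between the
    skin-to-site path and the board-to-skin path.
    """
    return t_skin + k * (t_skin - t_board)
```

With a warm board (under clothes), the estimate stays close to the skin reading; with a cold board, the site is estimated warmer than the skin, which is the whole point of the second measurement.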

I can still be wrong, of course:

Abbott could be using one point compensation, but I believe they do not because I see data fitting a 2 thermistors scenario.

Abbott could be using three point compensation with a metallic wire I missed or some other fancy property of their sensing wire they could use as a RTD.


At this point you may begin to think: “It sounds great, but where is the data to back this?” Let me show you. But first, let’s have a thought for our sensor 0M0000U0Q68, which suffered a quick and painful death thanks to an impromptu meeting with a door frame.

We’ll be assuming T2 is one of the temperatures of interest. Let’s pick the sensor up and put it in my jeans pocket. There will be no glucose measure, as the sensor wire is now out of the body. Ah, it seems my jeans pocket is warmer than the air outside… (dataset: warmedinpocket)


Let’s drop it on my “lab” table. 21C ambient, should be about right. (dataset: postremovalsuite)

Oops, I forgot it for a while, but now comes a week-end. Let’s take a heating bed and put it at 40C (dataset: 41c)


And let’s slowly bring the temperature down to 36C (dataset 36c)

then 32C (dataset 32c)

then back to room temperature (around 20-21) (dataset 21c)

and finally outside (9C reported by external thermometer dataset 9c4)

Those measures and data sets clearly show there is some validity to the interpretation.
Things to keep in mind: amateur temperature measurements are a pain – breathing on the setup, the height of the table, etc… all have an impact. The values are relative; they fit and track the circumstances, but the Libre doesn’t necessarily see them that way. We are not talking “skin”, “board” or even delta between skin and board here. Just ambient as a whole.


Let’s look again at what happens during the most drastic change, when the sensor was placed on the heat bed, this time with a bit of code

Mandatory disagreeable note: at this point, the reader is expected to know which 2 bytes I am talking about. If he doesn’t, he can just look them up in the data dump.

t2, high first [12417, 12417, 12417, 12417, 7040, 4992, 4224, 4224, 4224, 4480, 4480, 4736, 4736, 4992, 4992, 4992]
bitmask14 = 0x3FFF  # keep the low 14 bits, drop the flags
for i in sortedimmediatevalues:
    r2 = hex_str_to_int(i[3])  # raw word from the dump
    r2m = r2 & bitmask14       # masked thermistor counts
comment: interesting, values go down as temperature increases, as expected from resistance-based thermistors.

t2 inverted [3966, 3966, 3966, 3966, 9343, 11391, 12159, 12159, 12159, 11903, 11903, 11647, 11647, 11391, 11391, 11391]
temperatures2inverted = [16383 - x for x in t2h]  # t2h: the masked t2 counts
comment: but I am a human, and want them to go up (dirty hack!)

Temperatures 2, TI [13.83, 13.83, 13.83, 13.83, 30.76, 36.77, 38.97, 38.97, 38.97, 38.24, 38.24, 37.51, 37.51, 36.77, 36.77, 36.77]
import cmath

def ConvertTemperatureTISpec(counts):
    # solve x**2 + 273*x - counts = 0 (TI FRL thermistor formula)
    a = 1
    b = 273
    c = -counts
    d = (b**2) - (4*a*c)
    sol1 = (-b + cmath.sqrt(d)) / (2*a)
    return round(abs(sol1), 2)
comment: the TI FRL thermistor formula gives reasonable-looking results, but my room is definitely not that cold

Temperatures 2 final [20.51, 20.51, 20.51, 20.51, 35.4, 41.07, 43.2, 43.2, 43.2, 42.49, 42.49, 41.78, 41.78, 41.07, 41.07, 41.07] 
def ConvertTemperature(counts):
    sol1 = counts*0.0027689+9.53
    return round(abs(sol1), 2)
comment: that works much better for my purpose… 
This works with my setup, in the temperature range I am interested in. I have exactly zero idea if that is how the FRL sees things. The result I am using could very well be totally off in terms of absolute values. I could be in the linear part of some complex spline, or on a dangerously exponential function that I would not know about. I am not an electrical engineer, just a tinkerer. I am particularly concerned about the bottom range of the temperatures: did I hit a hard limit? Not sure. At some point, but I don’t know precisely where, the Libre reader just reports “too cold”. And please note that, in order to avoid a shutdown on no decent glucose values, I could not use the official reader during the experiment.


Yes, I have ideas about the other two bytes. But they are noisy (as the patent hints they would be) and I am currently considering (but not using) them as a delta. I’d rather not talk about them in public. It is easy to see patterns when there are none.

Finally: I am not currently actively looking at the Libre anymore. I just decided to share past data in the hope more competent people could have a look at it.

 download dataset

Wednesday, August 2, 2017

Clean, but shorter, Dexcom G4 (505) Freestyle Libre comparison

Since my previous post triggered a few private reactions, here’s another comparison of a fairly standard situation, with clean data: clocks are in perfect synchronisation, and there are climbs (pre-game carb loading) and falls, including a severe low (delayed hypo).

On the left, the data as downloaded. On the right, the data shifted for the best correlation (which basically means that the Dexcom data is rolled back in time to erase the delay). That post-mortem analysis is both realistic and a bit unfair to the Dexcom. Realistic because the Libre raw data matches historical data quite well. A bit unfair because the Libre only provides delayed and adjusted historical data. Adjusted relative to what? The spot checks. As I have shown many times on this blog, spot checks are typically even faster than the Dexcom in practice, with the drawback that they are really inaccurate at times, especially on the high side.


In this case, the best correlation is found with a shift of 5-6 minutes (Libre ahead of the Dexcom by 5-6 minutes). This is fairly typical of what we see with the Libre vs the 505, when everything works well for both sensors. That’s the tricky part in practice of course: adhesion issues, desynchronisation between insertions (ie comparing a fresh Dexcom to a Libre in its second week) all play a role.
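The “best correlation” shift itself can be found by brute force: slide the Dexcom series back sample by sample and keep the lag that maximizes the Pearson correlation with the Libre series. A self-contained sketch (equal-length, evenly sampled series assumed):

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def best_shift(libre, dexcom, max_shift=6):
    """Lag (in samples) by which rolling the Dexcom back best matches the Libre."""
    return max(range(max_shift + 1),
               key=lambda s: pearson(libre[:len(libre) - s], dexcom[s:]))
```

With 5-minute samples, a best shift of 1 sample corresponds to the 5-6 minute lead reported above.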

Broadly speaking, the sensors see the same thing. The 505 data is a bit more bumpy: that is a consequence of the adaptive 505 algorithm and, of course, of the smoothing introduced by the Libre historical data.

One important point: as you can see in the left Bland Altman plot, two well working sensors can show very significant differences based on timing and rate of change.

Regardless of the absolute magnitude of the differences, a consistent behavior emerges: the Libre overshoots highs compared to the Dexcom and undershoots lows to a lesser (absolute) extent. This type of behavior could be the consequence of the calibration slope of the BGM used to calibrate the Dexcom, but we have observed the same behaviors with different BGMs (Menarini Glucomen LX, Roche Accucheck Mobile, Abbott’s Libre BGM). If you are interested in that behavior, the 2014 and 2015 posts on this blog provide additional insight.

The third screen is a log/log plot privately suggested by L. and is basically a Bland Altman on steroids that amplifies the visualization of the differences in behavior in a way that is less dependent on absolute differences. (I am sure I will be corrected if I didn’t get that right).

Beautifying the data

Now, let’s look at the old Clarke plot of the Dexcom vs the Libre (yes, I know, Clarke plots are out of fashion, but I have had the function for ages, so why not…).

First the un-shifted data plot.


Quite a decent match: you would not have killed yourself by relying on either device.

Now, the delay corrected data plot.

Isn’t that something? We have gained almost 8% in the A zone.

Now, this doesn’t mean anything in absolute terms. For all we know, the Dexcom could have been right and the Libre could have been overshooting. Only one thing is certain: the delay.

But this tells us something else: it is extremely easy to tweak test results to your liking. Something as simple as asking patients to test 2 hours after a meal vs asking them to test 1.5 hours after a meal, something seemingly as innocuous as using standard meals or standard sport sessions, can have a drastic impact on the numbers. In a market where T1D fanboys love to argue about the 1% MARD advantage of their sensor (while at the same time losing 10% MARD or more through home made hacks), a couple of percent of difference can mean a huge amount of good publicity…
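To make the point concrete, here is a toy example (made-up numbers) of how much a simple delay correction alone can move the MARD:

```python
import numpy as np

def mard(est, ref):
    """Mean Absolute Relative Difference, in percent."""
    est, ref = np.asarray(est, float), np.asarray(ref, float)
    return 100 * np.mean(np.abs(est - ref) / ref)

# A rising trace sampled every 5 minutes: the "sensor" tracks the
# reference perfectly, one sample late.
ref = np.array([100, 120, 140, 160, 180, 200])
sensor = np.array([90, 100, 120, 140, 160, 180])

raw = mard(sensor, ref)               # delay left in: ~12% MARD
shifted = mard(sensor[1:], ref[:-1])  # delay removed: 0% MARD
```

Same device, same data: a protocol choice alone can be worth far more than the 1% everyone argues over.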

Tuesday, August 1, 2017

Non clean Dexcom vs Libre comparison

Real life has interfered – that would probably make a good “psychological burden of chronic disease” post if I were in the mood – and, while the blog hasn’t been updated, it isn’t dead yet.
Here’s a new comparison between the Libre and the Dexcom 505. Unlike one of the previous comparisons posted here, this one is utterly “unclean”. In short:
  • this was a tennis tournament week, with frequent games.
  • Max forgot to scan with the Libre, or simply forgot the Libre reader. The straight green lines are those no-data periods.
  • both sensors were on the arms: we experienced several adhesion issues and patched as we went.
  • variability is much higher than usual because we were “pre-loading” a bit for games (not very useful, but better than starting too low anyway) and experienced severe delayed hypos on a couple of occasions, despite minimal Levemir doses (5 U / 24 hours)
In other words, ultra messy real life…

ERRATUM: G4 505 vs Libre - legend copy paste error. Thanks to KS for spotting it.
While I would not draw too many conclusions out of such an awful data set, some comments

It is good to have backup. We lost a Dexcom sensor almost at once (not shown here) and the Libre started dangling after a few days. Interestingly, the Libre started to read a bit too low and sensing delay increased a lot. The yellow marker on the above chart marks the near sensor loss moment. When Max noticed (or paid attention), we used a bit of opsite to stabilize the sensor and normal operation resumed.

The Libre remains, in general, faster than the Dexcom 505 algorithm, and even more so if one looks at spot checks (with the drawback that those can be off when the trend changes suddenly). We now have a year or so of side by side data and experience and the result is always the same. Yes, on occasion the Dexcom will pick up a trend before the Libre does (as reflected in historical data) but I don’t remember seeing it pick up a trend before Libre spot checks. Depending on the data set, the optimal correlation between the two signals consistently gives a 6 to 10 minutes advantage to the Libre.

Note: I am not really that interested in collecting additional very clean data. In order to make a rigorous comparison, we need to sync the device clocks on a regular basis, we need precise reference points such as “timecode” BG tests, we need mechanically stable sensors, reminders to scan at least once every 8 hours, etc… All of this adds to the management burden of a teen T1D and that is something I don’t really need.

In practice, that speed advantage needs to be taken with some caution:
  • the Libre historical data is computed and corrected a posteriori (as shown here). It is not useful in real time.
  • the Libre spot checks are typically faster than historical data, but the delay compensation (combined with the so-so temperature compensation) often introduces overshoots.
Still, the Libre remains our favorite sensor for sports.
Excluding the excursions introduced by the interpolation, the Bland Altman plot is relatively flat. Still I wouldn’t draw any conclusion in terms of absolute slopes/biases because the G4 505 depends to a large extent on the calibration it receives (the nasty non linearity of the original G4 has been reduced in the current sensors/algorithm combo).

I realize quite a few issues I addressed here need a more detailed discussion, more data and detailed examples. Please treat this post as a simple keep-alive ping.

Thursday, May 11, 2017

Just a "standard" situation...

For some reason, even though we have a fairly strict rotation routine when it comes to Max's Levemir injection, we are now frequently confronted with situations where the slow acting insulin seems to fail to act... I do not have a clear explanation for that: Max doesn't seem to skip his injection and there's no site/situation/meal/physical activity that I can correlate the rises with.

Anyway, here's such a situation, but also an illustration of many of the practical issues we face.

green segment: flattish around 100 mg/dl with a couple of mild compressions, no big deal.

By the way, a word about compressions: I often read very specific descriptions of compressions (transient sensor attenuations) in the T1D forums and groups. The compression should be abrupt, deep, and should end with a rebound. That is partly true: a major compression may indeed unfold that way. But in practice, the compressions we detect and visually confirm can take almost any form. They can be partial, leading to fairly minor attenuations with no rebounds. They can be masked, as is almost the case here, by a simultaneous increase. Be open, observe and learn: you may encounter compression lows, but also compression steady states or even compression highs (where the compression attenuates an ongoing rise).

third compression: that one is a major PITA. While it is detected, it masks – in a plausible way – the rise that is happening at that moment.

compression exit: the trend starts to appear. But we need a few packets to make sure it is not one of those post-compression rebounds we see now and then. Unfortunately, another mild compression confuses the situation even more (and at that point, the compression detection algorithm, lacking a clear trend, has given up).

correction: the trend is now clear. Since we have seen such situations get out of hand quickly, the time has come for a quick Libre and blood check (see below): the Libre reports 230 mg/dl. The Roche Accu-Chek reports 225 mg/dl. The Dexcom still lingers at 160 mg/dl, one arrow up.

effect: as expected, around 6 packets later, the correction effect shows up.

Here's what the BG Meter and the Libre showed. Disregard time differences: both the BGM and the Libre are still running on winter time and both have drifting clocks. The actual time is 01:20 for everything.

A couple of comments on the sensors and accuracy.
  • the Dexcom is running the G4 Share 505 algorithm. The sensor is 5 days old.
  • the Dexcom has been calibrated with the Roche Accu-Chek BGM used here.
  • the Dexcom is on the right arm.

  • the Libre is on day 12 of its life cycle.
  • that particular Libre sensor has been eerily accurate through the session.
  • the Libre is on the left arm.

I could be tempted to blame the Dexcom and praise the Libre and, to be honest, to some extent, I do.


  • this is the ideal situation for the Libre "delay compensation" algorithm. None of the fancy factors where it goes a bit crazy are present.
  • the Libre hasn't been compressed.
  • this Libre sensor has been noticeably better than average (MARD of 5% vs Accu-Chek over the whole period, but not enough data to be statistically significant).
  • that Dexcom sensor has been underperforming a bit for reasons that I can't be certain of.

And what about the correction?

I hit hard. Very hard. Based on our experience, when the Levemir injection seems to fail, EGP can spiral out of control (we did get our first ever 400 mg/dl on such an occasion). I used about 2.5 times more insulin than I would use to correct that trend in daytime.

There's always a bit of anxiety when using such a relatively high dose (8U) in the middle of the night. I do want to avoid the yo-yo situation where I have to correct a low later. And, at first, the huge drop after the plateau isn't reassuring. What if the fall accelerates? That is always a question that lingers.

As it turns out "insulin resistance", or EGP, or a mix of both is so high in those circumstances that the situation should evolve well. But that is an opinion based on our fuzzy experience and gut feeling, not a computable one, if only because the previous nights were OK and we have no definite idea about the current insulin sensitivity level.

As you can see, the trend settles quickly.

And even if I am usually very confident with my decisions, I will lose a few hours of sleep, keeping an eye on the situation just in case... and write this blog post to kill time.

Sunday, February 19, 2017

“Zero Carb” day on a non T1D person

I have already posted a few non diabetic CGM/FGM response patterns to food and exercise and even made my 2014 complete 14 days run available on this blog. In this very quick post, I will simply share the result of a full “strict zero carb” day on a non diabetic (your servant, now almost 54yo). A few BGM test strips were spent to ensure the CGM/FGM was working perfectly. The minimum of 62 mg/dL probably wasn't reached and came in what definitely looks like a prolonged compression.
I felt a bit dizzy around 15:00. My urinary ketones were positive at 16:00.

I will stubbornly avoid discussing my opinions on that type of diet, short term or long term, in adults or kids. A comprehensive review of its issues and merits can be found here (pdf) on the paleomom blog.

Tuesday, February 7, 2017

Libre Clinical Study and discussion

The blog has slowed to a crawl, I apologize. The reasons behind my relative silence are

  1. Max has reached the tender age of 16. That means that teen issues and behaviors have become more common, impacting his control and our mood. I believe every T1D or T1D caregiver can relate to that situation, which means I will leave it at that. Our latest HbA1c, a week ago, was still 5.5% but I believe this will be one of the last times we’ll see values below 6%. I will try not to despair, as there definitely are trade-offs one must accept if a kid is to have a semi-normal adolescence.
  2. We are going through an extensive remodeling of our environment and that takes time.
  3. rant alert
    Finally, as much as I hate to write this, I have lost interest in most open-source, community driven projects. I think I need to qualify that statement a bit before I get a lot of flak. As far as making data accessible everywhere and anywhere is concerned, I am still extremely grateful to the community as a whole, and especially to the core members of the Nightscout project who made that data conveniently and cheaply available. The open source, or semi-open source, community is great at developing features that actual T1Ds and T1D caregivers need or want. What really deeply annoys me, however, is how little attention is paid to the delivery of accurate results. Adding a new display device, check. Adding new minor features or screens, check. Accuracy, not so much. Assuming one wants to deliver accurate results from raw data, there is a bit more to it than jumping from one single point calibration to another, or calculating an arbitrarily constrained slope. Occasionally, two open or semi-open source solutions are compared: they show a 50 mg/dL difference, sometimes absurdly amplified by the lever effect of a bad slope, devices are rebooted, restarted, and the community moves on. That is not to say that I would, or privately do, better, at least in a way that is applicable to a general population; it is precisely because I am aware of the potential issues that I decided not to inflict my experiments on innocent bystanders. On top of that, in the Libre world, the “semi-open source” approach, consisting of an incomplete github source dump that often misses all the computation parts, irritates me. Don’t think for a minute that those effectively closed source solutions are hiding some miraculous sauce: they aren’t. The reason for the omission is often that they simply want to hide how they turn a very nice sensor like the Libre into something that behaves and performs like a second generation Medtronic sensor…
    end rant alert

The study

Let’s now have a look at the study, recently published in the British Medical Journal, “An alternative sensor-based method for glucose monitoring in children and young people with diabetes”, which you can download here.

The work was sponsored by Abbott: they were involved in the planning, the funding and the provision of devices used in the study. Apart from the possible cherry picking of the sensors used in the study, and a slight cherry picking of the competitors’ studies cited, I did not spot any obvious red flag. The population studied was a set of 4-17 yo children and teens who, according to the additional data (for example a 7.6% mean HbA1c), seem to be a bit better controlled than the average population, since 75% of that normal population does not meet the 7.5% target. Such a small bias may have had some impact on the study (more on this below) but it is probably because the authors of the study deliver better than average care.

The conclusions of the study were, in short: a MARD vs SMBG (capillary) of 13.9% in that population (vs 11.4% in a previous adult study) and 99.4% of points in the AB zone of the CEG. That is in line with the reported accuracy of the Dexcom G4 505 in some studies, although Dexcom likes to focus on its best study exclusively.

The general conclusion was that the device could be trusted, was well accepted and, usual scientific caveat, could be beneficial long term. Well, there is nothing groundbreaking here, we all knew that, didn’t we? The benefit of that study is to be found elsewhere: respected researchers and clinicians, a fair number of cutaneous adverse effects reported (unlike in some previous studies), a protocol that does not smell of manipulation – all of which will drive acceptance and add arguments for funding and full coverage.

Some personal comments

We do consistently get better accuracy than what the study reported on average. This is probably attributable to the fact that our “bad” weeks were 80% in range and our “good” weeks were 90% in range, while the population studied only stayed 50% in range. Incidentally, as a non T1D, when I ran sensors we had purchased in France on myself, I stayed at an 8% MARD for 12 days before the sensor started to drift. Variability, and the more frequent and usually rapid changes of range it implies, definitely affects the CGM accuracy numbers.

The “acceptance” part of the study is very positive for Abbott. Again, we all know that. In fact, despite the overwhelming satisfaction expressed by the participants in the study, I believe the benefits to be understated. I always come back to our tennis experience on that issue: being able to play a full tennis tournament on a single daily SMBG check (as opposed to 10 to 15 checks per match) was just amazing. This was due both to the general accuracy of the device and to its delay, which was, in our carefully documented experience, 9 minutes shorter than the Dexcom G4 delay. For us, the Libre wasn’t merely a well accepted replacement, it changed our experience of T1D for the better.

On the delay side, the authors of the paper note “no delay”. This is really where I want to nitpick a bit. There definitely is a delay (quite visible in RAW data at stable temperature). It is simply partially compensated and partially obfuscated by the behavior of Abbott’s algorithm.

It is extremely visible in chart B of the paper

As you can see, the sensor is, on average (note this is MRD, not MARD), essentially perfect in stable or near stable conditions. The most significant relative differences occur in dynamic conditions, and always in the same direction.

In other words, when you are falling quickly, the Libre trails the fall and reads higher (probably missing some hypos), almost as a non delay compensated CGM would do. When you are rising quickly, the Libre leads and overshoots the rise (overestimating some hypers).
This is a behavior we noticed immediately (see here, here and here for some of our 2014 reports) and have consistently observed since.

I believe, just as I believed in 2014, that this is mostly the result of the Abbott delay compensation algorithm. It is not necessarily a failure of the algorithm (although, looking at the raw data, it appears it could be improved) but possibly a conscious decision by Abbott, either based on a technical issue such as a possibly lower signal to noise ratio in low ranges, or based on physiological issues they have identified in the BG to IG dynamics on falls.

I am of course quite happy and a bit proud to have identified the issue in 2014, while remaining aware that our test population was n=2.

One last point on the delay issue: the authors noted that the granularity of their time measurement was 5 minutes. Timing issues are really critical as far as delay computations are concerned, which is why, when we tested SMBGs vs the Libre, we always used immediate spot checks (because that is what matters to the patient) and I had to programmatically resynchronize the clocks on each check (both the Libre and our BG Meter had drifting internal clocks). I used the same constant resynchronization technique in the Libre vs Dexcom comparison in order to maximize accuracy. Ballpark figures give a 15-minute delay on the Dexcom G4; with a 9-minute advantage for the Libre, you end up with a 6-minute average delay for the Libre vs SMBG (confirmed by our Libre vs SMBG tests in slow rises and slow drops), which would be hard to demonstrate with a 5-minute granularity, especially if the comparison is not versus spot checks but versus values inferred from the 15-minute averages.
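The resynchronization itself is nothing fancy. A sketch of the idea (the linear-drift assumption and all names are mine): collect (device clock, reference clock) anchor pairs at each timecoded BG test and fit a line.

```python
def fit_clock(anchors):
    """anchors: list of (device_time, true_time) pairs in seconds.
    Returns a function remapping device timestamps to true time,
    using a least-squares straight line (constant drift + offset)."""
    n = len(anchors)
    sx = sum(d for d, _ in anchors)
    sy = sum(t for _, t in anchors)
    sxx = sum(d * d for d, _ in anchors)
    sxy = sum(d * t for d, t in anchors)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return lambda device_time: slope * device_time + intercept

# A device clock that runs 1% fast and started 120 s ahead of real time.
to_true = fit_clock([(120, 0), (3756, 3600), (7392, 7200)])
```

With two or more anchors per run, every timestamp in the export can then be remapped before computing delays.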

Last comment: in absolute terms, you should keep in mind that the MARD given in that paper is most probably Libre CGM vs Libre BGM (or another Abbott BGM) and might be a bit biased, as the same fundamental decisions have obviously driven the design of both devices. I do like that bias myself, as the use of different BG meters would have muddied the algorithmic issue even further and would probably have required a set of Bland Altman plots to debias/detrend the data.

Apologies if I sound obsessed by speed issues, but as far as we are concerned, that was and probably remains (until a full CGM is available one way or another) the defining advantage of the Libre versus the Dexcom G4 or Dexcom G4 505.

Monday, January 9, 2017

It’s always compression, duh!

Let’s be honest, I’ve been waiting for a moment like this one, where algorithms trump my quick visual assessment.

Here’s the situation on a relatively standard scale: can you spot anything? Don't cheat and look below.


21:00 Max starts climbing slowly. Because he had a small hypo earlier, the light, mostly protein, evening meal he took (salmon) starts showing up.
21:30 recalibration: the Dexcom was a tad too high, the Libre (on the other arm) was spot on at 108 mg/dL. The decision is made to take no action because we know the evening injection will start pushing BG back down a bit roughly 3-4 hours after injection time (20:15 in this case).
Sure enough, we seem to be on a small downtrend starting around 23:15, for a 155 mg/dL high. Yes, that is not ideal, but at some point we have to consider the trade-off between undisturbed sleep and perfect BG. Today, undisturbed sleep was the intent.
At first sight, this slope still looks like a mild downtrend, with a bit of noise.
However, this is what I get in another view: my compression detection algorithm has triggered!


Interesting… Time to look at the decision parameters

Parenthesis: my compression algorithm isn’t wildly different from what has been published in the literature. I developed it independently in 2015, as a toy project. In a nutshell, the algorithm examines the last few hours available (at least an hour, although I can fine tune the parameters conveniently), assesses noise and overall trend, and builds “confidence” on those values. It’s a bit of a cookbook of hacks and rules. For example, the SD of the detrended signal gives a good indicator of the current “meta” noise level: a drop caused by a compression should, obviously, be larger than the SD of that detrended signal by some factor (one that I tuned based on experience and visual assessment). It also goes without saying that the delta must be negative. On top of that, a few rules have been added here and there for experimentally observed special cases.
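A stripped-down sketch of that core rule (detrend the recent window, estimate the ambient noise, flag a drop that clearly exceeds it). The threshold factor here is illustrative; the real thing carries many more special-case rules:

```python
import numpy as np

def compression_suspected(window, drop_factor=3.0):
    """Flag the newest packet as a possible compression if it sits
    well below the recent trend line. `window` is the last hour or so
    of glucose values (oldest first)."""
    y = np.asarray(window, float)
    x = np.arange(len(y))
    # Fit the trend on everything but the newest packet.
    slope, intercept = np.polyfit(x[:-1], y[:-1], 1)
    residuals = y[:-1] - (slope * x[:-1] + intercept)
    noise = residuals.std(ddof=1)  # the "meta" noise level
    delta = y[-1] - (slope * x[-1] + intercept)
    # Must be a drop, and one clearly larger than the ambient noise.
    return delta < 0 and abs(delta) > drop_factor * max(noise, 1.0)

steady = [140, 141, 139, 140, 142, 141, 140, 139, 141, 140]
quiet = compression_suspected(steady)                 # no flag
dropped = compression_suspected(steady[:-1] + [118])  # flagged
```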

At 00:53, the new value enters the “hour buffer”, which happens to have an extremely high level of confidence. Note that the algorithm did not have that level of confidence at 23:23, post peak, where the hourly trend was less clear and a bit of noise (maybe a transient compression) pushed the detrended SD a bit higher.

That being said, the case isn’t settled at this point, so I decide to zoom in on the chart and go up and check. Max is indeed leaning on his Dexcom, and not leaning on his Libre. The Dexcom, which had been tracking the Libre after recalibration, is actually 10 points below.

The acid test is of course to move Max a bit, from leaning on his Dexcom to not leaning on either device. Here is a zoom on what happened: the very mild compression recovered almost immediately.


  • Instead of being slightly down, the trend is actually either stable or very slightly trending up. Knowing this allows me to push a 1 U correction (being extremely conservative here in order to avoid any hypo risk).
  • the scale at which we are looking at our CGM signal impacts our perception and our assessment of a situation (which is one of the reasons I developed my own “in-the-cloud” visualization, which I can tweak and zoom to my liking).
At this point, I can already hear the dissenters saying “How in the world can you tell it is a compression on such a small variation? My Dex can be off by xx points or jump around”.

Good question: let’s answer this methodically
  • the custom “artificially intelligent” algorithm says so ;-)
  • the Libre says there was no drop.
  • I have confirmed visually that Max was sleeping on his Dexcom side.
  • I have confirmed that, by relieving the compression, the sensor recovers as expected and resumes cruising.
  • yes, I have no idea whether the “real” level is actually 137, 127 or 147 mg/dL, but that does not matter: it is the relative change that matters.
  • yes, there are situations where the Dexcom is too noisy, the trend is unclear and the decision is ambiguous, if possible at all.
But when the Dexcom (or Libre) is tracking smoothly, there is very little variation in the signal (or in the detrended signal, if in a clear trend). It is that consistency, when the signal is good, that allows Dexcom executives to claim their technology is already much better than BG Meters. That is a statement I can totally agree with… until real life interferes (micro traumas, compressions, failing sensors, encapsulation…) and, of course, except for the fact that the baseline Dexcom values depend to a large extent on the performance of your BG Meter.

Anyway, this blog post is almost live, because this must be the first time an algorithm tells me something I might not have noticed. Three years ago, I had a quick look at Neural Networks and AI but, while I got them to tell me interesting things and issue decent predictions, they never told me anything I wouldn’t have noticed or predicted by myself. That one is a first!

Ah, and one more thing – let me reassure all the hydrophiles out there, no glass of water was harmed in this experiment.

Wednesday, January 4, 2017

Pre-bolus: rationale and examples


I am always surprised at the number of patients or caregivers who are either unfamiliar with or afraid of the “pre-bolus” technique. “Pre-bolusing” means injecting insulin 25, 20, 15, 10 minutes before a meal rich in carbohydrates. While I am sure this post will be very basic for most of the readers of this blog, I feel it could be useful for occasional readers.

What is the reason behind pre-bolusing?

It is very simple. Ideally, you want your insulin injection to match, as closely as possible, the insulin secretion of a non T1D person. Unfortunately, this is not possible with insulin that is injected subcutaneously. When you are not diabetic, a meal will trigger an immediate insulin secretion into the bloodstream (in some studies, the mere thought of a meal was enough to trigger an insulin secretion). As a T1D, the insulin you inject (or push through your pump) lands in the peripheral subcutaneous tissue and needs to be picked up. That takes a while. This is not speculation, not something “one needs to consider” – it is a fact and it has been extensively studied.

In the graph below, you can see the three essential differences between a physiological response and an injection (source: recovered data from https://www.ncbi.nlm.nih.gov/pubmed/26041603). Note: these activity curves are always a bit approximate in terms of absolute activity as, even in healthy volunteers, clamp studies are a bit imprecise. Model curves don’t take some practical parameters into account. What matters is the notion of delay (due to absorption and transport) of the injected insulin peak and, later, the residual tail, which is often ignored.

After an injection, even if you have matched your insulin dose and meal perfectly

  • you start by having a relative lack of insulin
  • after an hour or so and for the next two hours, you have a relative excess of insulin
  • your short acting insulin typically has a longer “tail” than a physiological response
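The timing effect is easy to reproduce with even a crude model. Below, a toy simulation (triangular activity shapes, purely illustrative parameters, not dosing advice) of the net glucose-raising pressure during the first hour of a meal, with and without a 15-minute pre-bolus:

```python
def tri(t, peak, end):
    """Toy triangular activity curve on [0, end], peaking at `peak`."""
    if t <= 0 or t >= end:
        return 0.0
    return t / peak if t <= peak else (end - t) / (end - peak)

def early_pressure(prebolus_min, horizon=60):
    """Net 'carb minus insulin' activity over the first hour of the
    meal, with insulin injected `prebolus_min` before eating.
    Illustrative shapes: carbs peak ~30 min, injected insulin ~75 min."""
    return sum(
        tri(t, 30, 120) - tri(t + prebolus_min, 75, 240)
        for t in range(horizon)
    )

no_prebolus = early_pressure(0)
with_prebolus = early_pressure(15)  # noticeably lower early mismatch
```

Even this crude model reproduces what the CGM traces below show: injecting earlier shrinks the early carb/insulin mismatch, hence the smaller peak.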

And here is a video showing what happens when the timing of the insulin injection is adjusted

Effect of pre-bolusing on relative insulin activity.

Here are a few real life examples, all of them are high carbs breakfasts.

Injection at the start of the meal: the relative lack of insulin at the start of the digestion process leads to an excessive peak at 200 mg/dL. The relative excess of insulin after the meal has been digested slowly but surely leads to a hypo that will require a correction.



Late bolus: in this situation, the injection came during the meal (as in “f***, I forgot my insulin”). The relative lack of insulin leads to a higher peak, that is to be expected. But the late relative excess is also more pronounced. It leads even more quickly to a potentially severe hypo that requires a couple of corrections.


Pre bolus: here, the insulin was taken some 10 minutes before the meal. The difference is drastic. The initial peak is greatly reduced and the relative excess of insulin has a minor impact. Yes, the timing possibly could have been a bit better, maybe 15 minutes. And, yes, there is still what could be considered a mild hypo. But in this case, a couple of dextrose tablets was all we needed to get back on track.


20 mins prebolus:  In this case, Max woke up a bit late (holidays…) with a dawn phenomenon already significant. This gave us the opportunity for a longer wait. It ended up so well that the insulin action tail (and another small prebolus) took care of the light 2PM lunch.


Important note: we typically don’t prebolus if we are trending down or already below 80 mg/dL. We obviously don’t want any additional insulin activity pre-meal in those cases. Common sense, as usual, applies.