Taking care of the data

The most worrying words one can read at present are probably: "Preliminary data from a study".

If there's one industry that has grown dramatically during the COVID-19 crisis, it is that of speculation around the causes, effects and susceptibilities to this disease. And alongside this are the direct attempts to hijack public opinion with scare stories that rely on precisely no meaningful evidence to promote political extremism.

So I am feeling a little uneasy at having put a particular Facebook post and a Tweet out there this morning. The source is impeccable, or so I would like to believe: the Financial Times.

The story contains those worrying words that will, if necessary, get the FT off the hook if the content of a story originating from the DNA testing firm who are revealing their results turns out to be rubbish. But I have made myself party to this through my social media activity.

The story is interesting. It suggests that a study of 750,000 people shows that COVID-19 impacts people differentially depending on their blood group. For Celts the conclusions are particularly interesting because we have a majority of us with blood group O. Anglo-Saxons are more group A than any other. The pluralities in each case are modest and so should not be overplayed. But the study suggests we group O people are nine to eighteen per cent less likely to have tested positive for COVID-19.

One might, therefore, conclude that the disease will impact us less than our Southern neighbours. But the evidence of that seems scant to non-existent.

So where are the data upon which this is all based? They are in a commercial DNA testing company's database, one constructed mainly to match segments in one person's DNA with those in another's so that common ancestry can be demonstrated.

As a long-time amateur genealogist, I have found such a service an absolute boon. For my spouse and me, it has identified ninety living people, and one who died after taking their test, to whom we are related and of whose existence we would otherwise have been unaware. It's not that either of us feels a desperate need for more relatives. Rather, it proves paper-based research to be correct, or wrong, and may break through omissions in the historical record. For me, it now means I know more about the Stevensons in the eighteenth century and before. Previously I was "stuck".

I have taken all three variants of the commercial DNA test which are useful for family tree research.

The Y-Chromosome test helps track the male line a long way back. Because females do not carry a "Y", it does not help at all in tracking any other line. That test is what enabled me to find a fifth cousin once removed. That means I share my great-great-great-great-grandfather, John Stevenson, with him. John was born in 1744 in Bannockburn. For my newly found relative, there is another "great". He's one generation more distant.
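The arithmetic behind "fifth cousin once removed" can be sketched in a few lines. This is only an illustration of the usual counting convention; the function names are invented for this example and are not taken from any genealogy tool.

```python
def generations_to_ancestor(greats: int) -> int:
    """Generations from a person up to a 'great x N grandfather'.

    A parent is 1 generation up, a grandparent 2, and each "great" adds one.
    """
    return greats + 2


def cousin_relationship(gen_a: int, gen_b: int) -> tuple[int, int]:
    """Return (degree, times_removed) for two people who sit gen_a and gen_b
    generations below their most recent common ancestor."""
    if gen_a < 2 or gen_b < 2:
        raise ValueError("cousins share a grandparent or a more distant ancestor")
    degree = min(gen_a, gen_b) - 1   # the closer person sets the cousin degree
    removed = abs(gen_a - gen_b)     # the generation gap is the "removed" count
    return degree, removed


me = generations_to_ancestor(4)        # four "greats": 6 generations up
relative = generations_to_ancestor(5)  # one more "great": 7 generations up
print(cousin_relationship(me, relative))  # (5, 1)
```

Six generations on one side and seven on the other give degree five with one generation of difference, which matches the relationship described above.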

I have also done a mitochondrial DNA test. That DNA can only be passed on from one's mother. It, therefore, allows research of the female line of ancestry. As mothers can pass it on to their sons, from whence it goes no further, males can also use it for research. In my case, the test has yet to give me any new discoveries but has confirmed previous research.

But the most common test is the autosomal test. It tests DNA which is transferred to a child from both their parents, and it has led to most of my new discoveries. I have placed my results for this test in a number of databases. The biggest of them states that it has over 50,000 matches for my DNA. So the work of a lifetime, and more, beckons. The DNA test, whichever one it is, does not magically build a family tree for you.

But it is quite big business. I think I have spent about £300 on my various tests. So it would be a reasonable assumption that people who have had their DNA tested have higher disposable income than the general population. And we know that life expectancy is higher for that group. So if we are to rely on this database, we should only do so if it has been normalised, adding weight to people with lower salaries, reducing that of higher earners—the first of a number of issues to think about.
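One common way of normalising a skewed sample is post-stratification: each group is weighted by its share of the population divided by its share of the sample. The sketch below is a minimal illustration only. The two income bands and every number in it are invented, and nothing here describes what the company actually did.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (sample_counts[group] / total)
        for group in sample_counts
    }


def weighted_rate(sample_counts, positives, weights):
    """Positive rate after re-weighting each group."""
    numerator = sum(weights[g] * positives[g] for g in sample_counts)
    denominator = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return numerator / denominator


# Hypothetical figures: higher earners are over-represented in the sample.
sample_counts = {"lower_income": 200, "higher_income": 800}
population_shares = {"lower_income": 0.6, "higher_income": 0.4}
positives = {"lower_income": 20, "higher_income": 40}

weights = poststratification_weights(sample_counts, population_shares)
raw = sum(positives.values()) / sum(sample_counts.values())
adjusted = weighted_rate(sample_counts, positives, weights)
print(round(raw, 3), round(adjusted, 3))  # 0.06 0.08
```

With these made-up numbers the raw positive rate of 6% becomes 8% once the under-represented group is given its proper weight, which is the sort of shift that an unadjusted database could hide.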

Let's pick up something from the FT story:

"[..company name..] is combining the database it has collected from selling direct-to-consumer genetic tests with surveys from customers. This includes information on whether they had been tested for or hospitalised by the virus and the nature of their symptoms."

So to add to the skewed nature of the database, we now have two factors which might further influence conclusions. It's been done by a survey of people in the database. Some of those approached will have responded with information about the effect COVID-19 has had on their health, and many will not have.

Will those upon whom the greatest impact of the disease has fallen be more likely to respond? If I were to guess, and yes, it is a guess, I would say yes.
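A small piece of arithmetic shows why that guess matters. The figures below are entirely hypothetical and the function name is invented for this illustration; the point is only that unequal response rates change what a survey appears to show.

```python
def observed_share(population, affected, response_pct_affected, response_pct_unaffected):
    """Share of survey respondents who were affected, given different
    response rates (in per cent) for affected and unaffected people."""
    responders_affected = affected * response_pct_affected / 100
    responders_unaffected = (population - affected) * response_pct_unaffected / 100
    return responders_affected / (responders_affected + responders_unaffected)


# Hypothetical: 100 of 1,000 people were badly affected (a true share of 10%).
equal = observed_share(1000, 100, 10, 10)   # everyone equally likely to reply
skewed = observed_share(1000, 100, 50, 10)  # the affected reply five times as often
print(round(equal, 3), round(skewed, 3))    # 0.1 0.357
```

If the badly affected are five times as likely to answer, a true share of 10% looks like almost 36% among those who replied.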

It is also asking about testing. We know that a huge proportion of the population of South Korea has been tested, and substantially lower numbers in Europe and the USA. Are South Koreans, who are much more likely to be able to say something about testing, a disproportionately high share of the people who make up the 750,000 upon whom this story is based?

None of these issues I raise is intended to question the integrity of the company whose database is at the heart of this report. It is interesting and suggests further enquiry would be a good idea.

But in our rush to find out as much as we can about this nasty disease, are we bypassing the normal checks and balances that are so valuable in giving high confidence in the results derived from scientific endeavour?

The peer-review process has been speeded up, a bit, by the deployment of new electronic platforms. But it remains a neutral, measured process which is not much constrained by time.

If a lay-person like myself can identify a number of what I regard as significant questions, I need to be careful in any conclusions I draw.

It may be that every question I raise has been addressed before this story was published. But there seems to be no academic paper, not even a pre-peer-reviewed publication of one, that I can go and read to answer that question.

In our desperate search for good news out of this crisis, it's simply too dangerous to abandon proper, external processes as part of academic checks and balances.

After this argument with myself, I have concluded that it would not be responsible to leave my social media posts in place. So they are now deleted.

Even when it is the FT that's the source of my news, I shall err on the side of caution.
