Comparability of raw data from different accelerometer brands and configurations

A new article from the project I did with Annelinde Lettink and colleagues at Amsterdam University Medical Center was recently published.

The gap in knowledge we aim to fill

For many years the field has focused on finding useful algorithms to extract insight from raw accelerometry data. People, myself included, either assume that an algorithm is applicable across sensor brands and configurations given that they all store data in the same gravitational units, or we investigate comparability at the algorithm output level. However, comparability at the most fundamental level, the raw data itself, had never been studied.

How to assess comparability?

For comparing raw data between accelerometers we need to be sure that the movement is 100% identical. It is impossible to attach a set of accelerometers to the exact same location on a human body, nor can we expect an individual to repeatedly make the exact same body movement while wearing a different accelerometer at each repetition. Therefore, the only way to investigate comparability at the raw data level is with mechanical movements. For this we dusted off the mechanical shaker table that I previously used during my McRoberts days in 2005-2008.

The next challenge was to get access to a sufficiently large pool of accelerometers from various brands. The pandemic lockdowns offered a solution, as many research groups were not using their accelerometers. So, we asked colleagues in the field to lend us a sample of their accelerometers.

What makes this study valuable?

The results of our study reveal that raw data differs between brands in both the time and frequency domain.

However, this is not some geek project with no relevance to the real world! Awareness of differences in raw sensor data allows us to better anticipate how the output of any algorithm applied to it will be affected. And yes, this applies to any algorithm, including both machine learning techniques and domain-knowledge-driven approaches.

Further, understanding raw data comparability helps us to be more confident when using these algorithms across sensor brands and configurations.

Open access data and open source code

All data, code, and more detailed documentation of the experiments have been shared publicly to enable reproducibility of our findings and to facilitate future research. On that note, we even share additional experiments that we did not use in the published article. We did these additional experiments in the hope of maximising the potential value of the dataset.

For example, I was curious to know whether it is possible to develop an experiment that anyone can do in any research context, one that would be as informative as a mechanical shaker experiment but does not require an actual mechanical shaker. So, I attached a series of accelerometers to the edge of a door and moved the door repeatedly. I know this may sound ridiculous, but what if this little experiment could help studies identify problematic sensors? I have not had time yet to look at the data and investigate this, but maybe you do?!

Further, it may be worth highlighting that the documentation includes a video summarising the experiments (credits to Annelinde for creating the video!).

If you plan to use any of the data or code and run into difficulties then do not hesitate to reach out!

Federated Data Analysis in ProPASS

The aim of this project is to help implement Federated Data Analysis in ProPASS, the international research consortium for Prospective Physical Activity, Sitting and Sleep. The consortium aims to facilitate pooled analyses of data from members around the world to boost statistical power. These data include accelerometer data and various other typical epidemiological data types. ProPASS actively works on addressing the challenges of pooled analysis, such as data harmonization.

Pooled data analysis when data cannot be shared

Privacy concerns and research regulations can prohibit the sharing of research data for a pooled analysis. Study-level meta-analysis (SLMA) is the next best option. However, we know that SLMA becomes inefficient for explorative analyses: a local person at each data site needs to run the analysis script and share the results for every iteration of the analysis done by every researcher in the consortium.

Individual-level meta-analysis (ILMA), often referred to as Federated Data Analysis or Federated Learning, circumvents this problem. Here, secure multi-party computation is used to perform statistical analyses on individual data points across multiple data sites without disclosing identifiable information. In other words, a single person can run an analysis across all the data in the consortium without the need for action from staff at the data sites.

Accelting’s involvement

The ProPASS working group has asked me to help coordinate the implementation of Federated Data Analysis in ProPASS. We identified DataSHIELD and the OBiBa software stack as the most promising technologies to enable Federated Data Analysis. Several other consortia in the life sciences use DataSHIELD, and it has an active developer community.
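For a flavour of what this looks like for the analyst, below is a minimal sketch of a DataSHIELD session using the DSI and dsBaseClient R packages; the server URLs, credentials, and table names are placeholders and do not refer to actual ProPASS infrastructure.

```r
# Minimal sketch of a DataSHIELD session; all URLs, credentials, and
# table names below are placeholders.
library(DSI)
library(DSOpal)        # connector for Opal servers
library(dsBaseClient)  # client-side analysis functions

# Register the participating data sites
builder <- DSI::newDSLoginBuilder()
builder$append(server = "siteA", url = "https://opal.site-a.example",
               user = "analyst", password = "******",
               table = "propass.core", driver = "OpalDriver")
builder$append(server = "siteB", url = "https://opal.site-b.example",
               user = "analyst", password = "******",
               table = "propass.core", driver = "OpalDriver")

# Log in and assign each site's table to the symbol 'D' server-side
connections <- DSI::datashield.login(logins = builder$build(),
                                     assign = TRUE, symbol = "D")

# Compute a pooled mean; individual data never leave the sites
ds.mean("D$bmi", type = "combine", datasources = connections)

DSI::datashield.logout(connections)
```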

I am not doing this project on my own, but work closely with a task group of early career researchers: Doua El-Fatouhi, Jairo Migueles, Jonah Thomas, and Esther Smits. Also, Joel Nothman from the Sydney Informatics Hub has been making valuable contributions to our discussions.

So far, we have piloted the installation and usage of DataSHIELD on machines that hold non-confidential dummy data. Further, we will soon roll out a survey among a sample of ProPASS members to gain better insight into their statistical needs. For the upcoming year the goal is to set up the infrastructure and to find additional partners to help with components such as a user management system and tooling to aid process management.

Update August 2023:

Following this post I helped ProPASS establish a working agreement with the DataSHIELD consortium to develop the infrastructure for ProPASS. As a result, my involvement in ProPASS could be concluded.


Classifying behaviour in Covid-19 patients

Chronic fatigue is a condition in which patients experience extreme fatigue for long periods of time. Mental or physical activity worsens the fatigue, while paradoxically taking rest is not always a solution. The Dutch Expert Center for Chronic Fatigue explores how patients with chronic fatigue can best be supported. The issue of chronic fatigue is especially timely now that many Covid-19 patients also experience chronic fatigue during their recovery, which is the focus of the Center's new ReCOVer study. If you live in The Netherlands and have had Covid-19, please check out the study's participant recruitment website.

Physical behaviour assessment

A good understanding of fatigue requires insight into a patient's lifestyle in terms of physical behaviour. To quantify physical behaviour the Expert Center has used an ankle-worn accelerometer, named Actometer, since 1997 (see image). In short, the Actometer and its accompanying software classify an individual into one of three classes: pervasively passive, fluctuating active, and pervasively active. Clinicians use these classifications together with other information about the patient to inform a treatment program. Unfortunately, the manufacturing of Actometers has stopped, so a transition to modern accelerometers is needed. Ideally, we would somehow preserve the informative value of the Actometer software output, which clinicians are familiar with and which is supported by literature.

Can we replicate the old technology?

Around 2014 I explored whether Actometer output can be mimicked with modern accelerometer data. For this I relied on the descriptions provided in the original publications by Vercoulen et al. 1997 and van der Werf et al. 2000. Unfortunately, an accurate replication proved difficult at the time, possibly explained by unknown differences in hardware properties between old and new sensors and by the software not being open source.

In 2019, Prof. Dr. Hans Knoop from the Dutch Center for Chronic Fatigue asked me to pick up the work again, this time with an additional question: can we replicate ankle-worn Actometer output based on a wrist-worn accelerometer? Wrist attachment opens the door to better comparability with many modern studies, especially for sleep assessment. In this blog post I provide an overview of the work I have done to help the Dutch Center for Chronic Fatigue with this.

Classification model

Given the known challenges with replicating Actometer counts, and 1990s sensor output in general, I decided to go for a simple, interpretable indicator of body movement. The indicator roughly reflects the Actometer approach in that it applies a band-pass filter to the three acceleration signals, using the same filter settings as described for the Actometer. Next, I calculate the average vector magnitude during waking hours per day. To identify waking hours, I use the sleep detection algorithm I previously developed for UK Biobank, aided by a sleep diary (if available) as described in a separate study. After this I calculate the 91.66th percentile of the distribution of this variable across the days; the 91.66th percentile matches the required 11 out of 12 days of inactivity in the original Actometer algorithm described by van der Werf. Finally, I use the resulting acceleration value to classify a person as pervasively passive with logistic regression. Note that I merge the classes 'fluctuating active' and 'pervasively active' into one active class, as discriminating between those was considered less relevant.
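To make these steps concrete, below is a minimal sketch in R. The sample rate, filter order and cut-offs, and the data layout (an acc data frame with columns x, y, z, and day) are placeholders for illustration; the actual settings follow the Actometer descriptions in the original publications.

```r
# Illustrative sketch; sample rate, filter order/cut-offs, and the
# 'acc' data layout are placeholders, not the actual Actometer settings.
library(signal)  # for Butterworth band-pass filtering

sf <- 30  # sample rate in Hz (placeholder)
bp <- signal::butter(n = 4, W = c(0.5, 10) / (sf / 2), type = "pass")

# acc: data.frame with columns x, y, z (in g) and day, restricted to
# waking hours as identified by the sleep detection
filtered <- sapply(acc[, c("x", "y", "z")],
                   function(a) signal::filtfilt(bp, a))

# Average vector magnitude of the filtered signals per day
vecmag <- sqrt(rowSums(filtered^2))
daily  <- tapply(vecmag, acc$day, mean)

# 91.66th percentile across days (~11 out of 12 days, matching the
# original Actometer criterion)
indicator <- quantile(daily, probs = 11 / 12)

# 'fit' is the trained logistic regression model (see next section)
predict(fit, newdata = data.frame(indicator = indicator),
        type = "response")
```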

Training the model

The logistic regression model was trained with a dataset in which patients simultaneously wore one Actometer on the ankle and one accelerometer (ActiGraph) on the wrist. The performance of the classification model was assessed with leave-one-out cross-validation. Given that there are no hyperparameters to optimize and given the moderate sample size (N=50), I decided not to create an additional independent test set. However, when more data become available later this year, we could use them to improve the model or as an independent test set.
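For illustration, such a leave-one-out cross-validation can be sketched as follows, assuming a hypothetical data frame d with one row per patient, the indicator value, and the Actometer-based class label.

```r
# Sketch of leave-one-out cross-validation; 'd' is a hypothetical
# data.frame with columns 'indicator' and 'passive' (1 = pervasively
# passive, 0 = active), one row per patient.
loo_pred <- vapply(seq_len(nrow(d)), function(i) {
  fit <- glm(passive ~ indicator, family = binomial, data = d[-i, ])
  as.numeric(predict(fit, newdata = d[i, ], type = "response"))
}, numeric(1))

# Compare predictions (0.5 probability cut-off) against the labels
table(predicted = loo_pred > 0.5, actual = d$passive == 1)
```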

Reflections on model performance

The model currently classifies 42 out of the 50 individuals correctly: 6 active individuals are misclassified as pervasively passive, and 2 pervasively passive individuals are misclassified as active. However, the model's classification of behaviour appears consistent with what we see in the data: patients classified as active truly move a lot, and patients classified as pervasively passive truly move very little. One plausible explanation for the misclassified patients is the difference in sensor placement. For example, it is not hard to imagine that an ankle-worn sensor picks up cycling better than a wrist-worn sensor, while an ankle sensor will miss arm movements made while sitting down. We did not encounter any major issues with the quality of the data.

Clinical research implementation

Aside from developing a classification model I worked on its implementation in the clinical research setting. The intended users are research nurses and psychologists who have limited time to be involved in the data handling and are not computer savvy.

In a previous project, named Optimistic, this was addressed with a cloud service, but in the current project I decided to initially work on a local offline solution. This allowed me to focus on the model development first. The development of a cloud service could of course be a future enhancement to facilitate scaling it up.

Software design

I used the R package GGIR to pre-process the accelerometer data. Therefore, I developed the rest of the data analysis in the R programming language as well and organized it as a complementary open-source R package. For now, the code is in a pragmatic location, with a temporary name and limited documentation. I aim to improve those aspects later this year.

Software interface

Before using the model for the first time, the user (research nurse or psychologist) has to install R and RStudio on their laptop. Next, the user must download a specific R script, open it in RStudio, and press the Source button. This triggers the installation of all relevant software dependencies.

Initially I explored RShiny as a more attractive user interface. However, I decided not to pursue this yet, as the added software complexity worried me: complex software is harder to maintain and more bug-prone. I also wanted to prioritize my time towards the model itself.

After the installation, the software asks the user to specify the input and output directories, after which the data processing starts automatically.
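To illustrate, the sketch below shows what such a bootstrap script could look like when sourced. The package list (other than GGIR) and the run_analysis() entry point are hypothetical.

```r
# Hypothetical sketch of what the bootstrap script does when sourced;
# the package list and run_analysis() entry point are illustrative.
needed  <- c("GGIR", "signal")   # actual dependency list may differ
missing <- needed[!needed %in% rownames(installed.packages())]
if (length(missing) > 0) install.packages(missing)

# Ask the user for the input and output directories
inputdir  <- readline("Path to folder with accelerometer files: ")
outputdir <- readline("Path to folder for the results: ")

# Hypothetical entry point of the analysis package described above
# run_analysis(inputdir = inputdir, outputdir = outputdir)
```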

Software output and speed

At the end of the process the software creates a PDF report, which shows the classification of the patient as pervasively passive or active, a Z-score of the average acceleration relative to a reference population, a visual overview of day-by-day physical behaviour levels, sleep characteristics, and accelerometer wear time. The software processes a 14-day accelerometer recording within 10 minutes on a standard laptop.
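This post does not describe how the report is generated internally, but one plausible way to produce such a PDF report is to render a parameterised R Markdown template, as sketched below; the template file and parameter names are hypothetical.

```r
# One plausible way to generate the PDF report (hypothetical template
# and parameter names; illustrative values).
rmarkdown::render(
  input         = "report_template.Rmd",
  output_format = "pdf_document",
  output_file   = "patient_report.pdf",
  params        = list(
    classification = "pervasively passive",  # model output
    zscore         = -1.2,                   # illustrative value
    results_dir    = outputdir               # as chosen during setup
  )
)
```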

Do you want the same for your study?

Recently the ReCOVer study started using the solution described in this blog post. If you have feedback, want to know more, or would like to hire me to work on similar wearable data challenges, then let me know!

HABITUS

Merging and processing accelerometer and GPS data

HABITUS is a system for merging and processing accelerometer and GPS data hosted by the University of Southern Denmark (SDU), and the successor to the PALMS system that used to be hosted at the University of California, San Diego. HABITUS, which stands for Human Activity Behavior Identification Tool and data Unification System, is part of the SDU cloud environment built and maintained by SDU eScience. The development of HABITUS is led by Dr. Jasper Schipperijn from the Institute of Sports Science and Clinical Biomechanics at SDU.

Implementing GGIR

I became involved in HABITUS during the summer of 2019. My initial task was to make the R package GGIR available on the platform. To this end, I modified GGIR to process multiple data files in parallel via the R package foreach. Further, I expanded GGIR such that it accepts a configuration file as an alternative to specifying function arguments explicitly. Next, I put the new version of GGIR in a Docker container to facilitate reproducible research. Our partners at SDU eScience then wrapped this Docker container in an app within the SDU cloud environment.
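As an illustration, calling GGIR with a configuration file looks roughly like the sketch below; the paths are placeholders, and argument names reflect the GGIR version current at the time of writing and may differ in later releases.

```r
# Minimal sketch of calling GGIR with a configuration file;
# paths are placeholders.
library(GGIR)
g.shell.GGIR(
  datadir     = "/data/accelerometer_files",
  outputdir   = "/data/ggir_output",
  configfile  = "/data/config.csv",  # other parameters come from here
  do.parallel = TRUE                 # process files in parallel (foreach)
)
```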

Facilitating other algorithms

Making GGIR available in HABITUS helped us to identify potential bottlenecks for other HABITUS users. The next goal was to explore how we can facilitate more algorithms than just the ones implemented in GGIR. The challenge here is that algorithms developed by others do not always come with extensive data reading or report generation functionality. To address this I made another enhancement to GGIR such that it can embed external functions, which means that we can now run any algorithm for modern accelerometer data on HABITUS via GGIR. As a first test case we used the ActiGraph count imitation algorithm developed by Dr. Jan Brønd and implemented by Dr. Ruben Brondeel in the R package activityCounts. In short, ActiGraph counts were the output of one commercial accelerometer brand, named ActiGraph, in the 1990s and early 2000s. Replicating this data metric could facilitate historical comparisons of human behavior. The imitated counts can then serve as input for the PALMSplus software by Dr. Tom Stewart, who is also part of the project team. PALMSplus facilitates the combined analysis of GPS, GIS, and ActiGraph count data.
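Below is a simplified sketch of how an external algorithm such as the count imitation can be embedded. The required fields of the external function object are described in GGIR's documentation and may differ from this outline, and the exact activityCounts::counts() arguments may differ between package versions, so treat this as a sketch under those assumptions.

```r
# Simplified sketch of embedding an external algorithm in GGIR via the
# 'myfun' mechanism; field names follow GGIR's documentation on
# external functions and may differ between versions.
library(GGIR)
library(activityCounts)

count_wrapper <- function(data, parameters) {
  # 'data' holds raw tri-axial acceleration; compute imitation
  # ActiGraph counts. Exact counts() arguments and output format may
  # differ between activityCounts versions; adapt as needed.
  activityCounts::counts(data = data, hertz = parameters[1])
}

myfun <- list(
  FUN                  = count_wrapper,
  parameters           = 30,   # sample rate passed to the wrapper
  expected_sample_rate = 30,   # Hz expected by the algorithm
  expected_unit        = "g",  # unit of the incoming signal
  colnames             = c("counts_x", "counts_y", "counts_z"),
  outputres            = 1,    # output resolution in seconds
  minlength            = 1,
  outputtype           = "numeric",
  aggfunction          = sum   # how to aggregate over epochs
)

g.shell.GGIR(datadir = "/data/files", outputdir = "/data/output",
             myfun = myfun)
```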

Next steps

The project team is applying for funding to support the further development of HABITUS. For more information and specific questions about HABITUS, see https://www.habitus.eu.

Sleepsight analytics pipeline

Sleepsight

People living with psychosis often experience problems with their sleep, particularly when symptoms worsen. Sleepsight is an innovative research study taking place in South London that uses wearable and mobile technologies to study the links between sleep, activity, and symptom levels in psychosis. The study is led by Nick Meyer, a psychiatrist based at the Institute of Psychiatry, Psychology & Neuroscience at King’s College London, and South London and Maudsley NHS Foundation Trust.

Data collection

For the duration of one year, patients used a study smartphone and wore a consumer wearable. The smartphone collected data on GPS, app usage, battery status, acceleration, and screen status. Additionally, patients filled in a daily sleep and mood questionnaire.

Analytics pipeline

Consultant Chris Karr developed a phone app and server-side software to collect the data. Next, I developed complementary software (GitHub link) to assess data quality, omit data segments as required, and fuse the data types into a single activity score. Further, the software stores its output in a research-friendly format. As a final step, it generates a heatmap visualisation with ggplot2 to aid quick exploration of the data.
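The heatmap itself is conceptually simple; the sketch below shows the general idea with a hypothetical data frame scores holding one activity score per day and hour.

```r
# Illustrative heatmap; 'scores' is a hypothetical data.frame with
# columns 'day', 'hour', and 'activity_score'.
library(ggplot2)

ggplot(scores, aes(x = hour, y = day, fill = activity_score)) +
  geom_tile() +
  scale_fill_viridis_c(name = "Activity") +
  labs(x = "Hour of day", y = "Study day") +
  theme_minimal()
```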

Project status

The data collection phase has ended and data analysis is in progress.


Physical Activity descriptors for Cardiometabolic Health

Earlier this year I started a new project with Dr. Séverine Sabia at the French National Institute for Health Research (INSERM), with whom I have successfully collaborated in the past. Our new project explores how novel descriptors of accelerometer data relate to cardiometabolic health.

Time series extraction

My involvement is to help implement these novel descriptors in the R package GGIR. A first step was to revise the existing software code to extract cleaned time series data. For example, we did not want to rely on sleep detection for the first recording night, but we considered it valuable to have some estimate of the waking-up time on the second recording day, so we had to decide which information about the first night to trust and use for that. Further, we want to exclude nights from the sleep analysis when the accelerometer is not worn, but still include those nights in the 24-hour time-use analysis if the sleep diary indicates that the accelerometer was only not worn during sleep, as sketched below.
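Expressed as code, that night-level inclusion logic could look as follows, with a hypothetical data frame nights holding per-night non-wear and diary flags.

```r
# Hypothetical per-night inclusion logic; 'nights' is an illustrative
# data.frame with logical columns describing non-wear and diary input.
nights$use_for_sleep <- !nights$nonwear_during_night

# Keep a night for 24-hour time-use analysis if it was fully worn, or
# if the diary indicates the device was only taken off during sleep
nights$use_for_timeuse <- !nights$nonwear_during_night |
  (nights$diary_available & nights$nonwear_only_during_sleep)
```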

Next steps

The next step in the project will be to use these time series as input for various behavioural descriptors. The technique I will mainly focus on is behavioural fragmentation analysis, as most recently implemented in the R package ActFrag by Junrui Di and colleagues (https://doi.org/10.1101/182337). The plan is to implement these metrics in GGIR and explore opportunities for improvement.
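As a taster of what fragmentation analysis computes, the sketch below derives two common metrics from a toy binary activity sequence; this mirrors the idea behind ActFrag rather than calling its API.

```r
# Two common fragmentation metrics from a toy binary sequence
# (1 = active minute, 0 = rest minute)
x <- c(1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0)

bouts        <- rle(x)
active_bouts <- bouts$lengths[bouts$values == 1]

mean_active_bout  <- mean(active_bouts)  # average active bout duration
# Active-to-rest transition probability: the reciprocal of the mean
# active bout duration (as in Di et al.'s fragmentation framework)
tp_active_to_rest <- 1 / mean_active_bout
```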


Data Quality in German National Health Cohort

Accelerometer data in NAKO

The German National Health Cohort (NAKO) has collected high-resolution accelerometer data in a large sample of the German population. After visiting the study centre, each participant wears the accelerometer on their hip for the next seven days.

Data quality assessment

The University of Regensburg, which is one of the 18 study centres, has asked me to help build and implement a data quality assessment pipeline. It is expected that the R package GGIR will provide most of the required functionality. In this project I will work closely with Prof. Dr. Leitzmann’s group to implement GGIR and will also develop new functionality needed for this specific dataset. Another aspect of this project is to educate project partners about accelerometer data analysis. Further, we will create detailed documentation of the data collection and data quality assessment process.
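As a rough illustration of what such quality checks could look like on top of GGIR output, see the sketch below; the file path, column names, and thresholds are indicative only and depend on the GGIR version and study decisions.

```r
# Indicative sketch of a quality check on GGIR part 2 output; the file
# path, column names, and thresholds are placeholders.
qc <- read.csv("/data/ggir_output/results/part2_summary.csv")

# Flag recordings with poor calibration or excessive non-wear
flagged <- qc[qc$calib_err > 0.01 |          # calibration error in g
              qc$nonwear_perc_day > 33, ]    # % non-wear (indicative)
```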

Status and next steps

As a result of COVID-19 we started the project digitally. Nonetheless, we expect to perform a first pilot analysis in the upcoming months. Data quality and processing decisions can have a major impact on the research done with the data. Therefore, we will seek consensus with other experts in the field.