COVID-19 Research Methods Resources

This is a place where researchers can post links to resources under development, or the materials themselves, that would help other researchers, and where those posting can obtain feedback on their material. This topic is a wiki, meaning those with adequate datamethods privileges can edit this top section to provide more general information.

Here are some of the types of information we seek (both drafts and final documents):

General Methods Discussion Boards

Clinical Trial Protocols

Clinical Trial Information Resources

Data Collection Instruments

  • including case report forms and ready-to-load electronic data capture metadata such as that from REDCap

Statistical Design and Analysis Plans for Treatment Studies

Statistical Design and Analysis Plans for Observational Studies

Predictive Model Development

Data sources


Great idea @f2harrell

A newly established COVID-19 disease registry called LEOSS has started to collect cases. It will make data available publicly, with a richer dataset for scientists who register:

I believe they will also accept data from countries outside of Europe.
They are also looking for country leads and site leads.


Hi all -

I have a new Bayesian/Stan model of a latent infectious process (COVID-19) that I am planning to use with an ongoing data collection project on international government responses to the coronavirus epidemic. Compared to SEIR/SIR models, this simpler approach uses observed cases and tests to model COVID-19 as a partially identified latent variable. I show in the paper that it is possible to rank-identify suppression policies targeted at the virus even if the true infection rate is unknown. Data analysis of the 50 US states shows that states that declared a state of emergency earlier have lower total infection rates (although the rate of increase has not yet slowed).

Furthermore, by incorporating informative priors based on SEIR/SIR papers, it is possible to convert the estimates to “true” numbers. The model predicts that there were approximately 700,000 infected cases in the US yesterday versus 80,000 reported cases.
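As a quick sense of scale for those two figures, the arithmetic can be sketched as below. This is only the ratio implied by the approximate numbers quoted above, not the Stan model or its output; the variable names are illustrative.

```python
# Approximate figures quoted in the post.
estimated_infected = 700_000  # model-based estimate of infected cases in the US
reported_cases = 80_000       # officially reported cases on the same day

# Implied number of estimated infections per reported case.
underreporting_factor = estimated_infected / reported_cases

# Implied share of estimated infections that show up as reported cases.
ascertainment_rate = reported_cases / estimated_infected

print(f"Estimated infections per reported case: {underreporting_factor:.2f}")
print(f"Implied ascertainment rate: {ascertainment_rate:.1%}")
```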

Here is a teaser pic from the results:

I would really appreciate feedback on the model, as we are collecting data now and the model will be used later. My hope is that it will also be useful for others engaged in modeling various aspects of this public health crisis.





A new center is opening for COVID-19 research and training. Please contact us if you are interested in participating.

Pandemic Clinical Analysis Center (PCAC) Now Open in Central Ohio

Our research team develops hospital EMR processing software for research and training. The software converts each EMR dataset (of one case or tens of thousands of cases) from each sending hospital into a massive time series matrix for time series imaging, reanimation, and machine learning.

We have just completed the “Pandemic Clinical Analysis Center” (PCAC) and have opened in our 8,000 sq ft research and training space (at 1251 Dublin Rd., Columbus, Ohio). We were in a unique position to be ready for this pandemic because, during the 2014 White House Ebola response, our research team was asked to provide pandemic clinical analysis software to integrate diverse time series signals and aggregate cases from different hospitals. Although the ebolavirus epidemic did not progress to a pandemic, we continued to build the pandemic-capable software over the past 5 years.

PCAC will be receiving de-identified datasets of patients with COVID-19 from EMR warehouses under an IRB. All services are free and funded by a private grant. The goal is to develop a massive observational COVID-19 training and research cohort providing embedded ML analysis of every case and feature (pharm, diagnoses, de-identified notes, lab, vitals, ventilator pressures and flow, procedures, fluid balance, etc.) in fine relational detail and in aggregation.

We presently have the capability for secure remote access, at scale, for researchers and clinicians around the world. Clinicians with varying levels of expertise will be able to see the COVID-19 cases and reanimate them with all the timed procedure and treatment data (e.g. ventilator, pharma, and lab) integrated and displayed together (for example in time lapse) for training. This is the type of training and real-case data experience that journals cannot provide.

Our mission: Analyze every Ohio COVID-19 case in fine relational detail and all cases in aggregation. We are already funded and only seek expert assistance and promulgation. Please connect us or forward to anyone who might like to help or participate. These are not usual times.


Kudos to you and your team for performing this service, Lawrence!

In this post on a parallel discussion board, I highlight a recent paper that suggests opportunities for serial quantification of viral load and IgG & IgM serology. Notably, the viral RNA specimens were obtained from posterior oropharyngeal saliva rather than by the more invasive and hazardous swabs. What is your sense of the feasibility and clinical utility of (and thus motivation for) collecting such serial samples routinely? Is there any prospect for such data to appear in your PCAC warehouse? What time series would you expect to prove most useful?

We don’t have those data for COVID-19, but here you see parallel time series of viral load and antibodies with biomarkers in Zaire ebolavirus. Note that the rise in IL6 and CRP develops as viral load is falling, and in this case corresponds temporally to a secondary infection with a resistant enteric GNR, which was almost fatal even as the patient was recovering from the virus. (Original raw TS matrix data from Kreuels et al., NEJM 2014.)

On clinical utility: we need to determine the relationships among viral load, antibody response, myeloid or lymphoid inflammatory response, biomarkers, and treatment. The data we have now from China etc. cannot provide this.

Detection of recovery or worsening in relation to treatment is important, as is detection of secondary infection.

We hope to have these data. Surprisingly, there are still many silos, but we are hopeful.

As for the most useful time series: those you mentioned, in addition to biomarkers, fluid balance, medications, and the rest of the lab and vitals (basically the full TS matrix itself, augmented with the viral load and response time series).


Kreuels B, Wichmann D, Emmerich P, et al. A case of severe Ebola virus infection complicated by gram-negative septicemia. N Engl J Med. 2014;371(25):2394-2401. doi:10.1056/NEJMoa1411677

Here, in contrast, you see the relationship of AST with viral load. AST is improving while (as the previous slide demonstrated) myeloid augmentation, IL6, and CRP were worsening.

This is compelling because we see what we would expect (a viral sepsis phenotype in contrast to a GNR phenotype). Of course, determining these TS relationships requires a large cohort (and a plurality of statisticians). That determination is pivotal to managing these cases optimally and to detecting secondary infection in a timely way. Secondary infection is likely to be the primary killer in the US, where sophisticated ventilators will reduce the early death rate from ARDS.

Here is the experiment panel in the clinical analysis system. Note the optional ML and/or regression.
The sets (training, validation, testing) are built by the system with user discretion. AUC as a function of time and probability are outputs. The confusion matrix is auto-generated, and the user can see the false positives, false negatives, etc. in relation to the time series matrix at the time of the miss, for algorithm improvement…
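
For readers less familiar with the outputs listed above, here is a minimal sketch of how a confusion matrix and an AUC can be computed from true labels and predicted probabilities. It is not the clinical analysis system's code; the function names, the 0.5 threshold, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event occurred, 0 = no event.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

tp, fp, tn, fn = confusion_counts(y_true, y_score)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"AUC={auc_score(y_true, y_score):.3f}")
```

The false positives and false negatives counted here are the "misses" that would be traced back to the time series matrix for review.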

So once we have a sufficient de-identified cohort in the cloud, outside researchers and clinicians from all over the world can use our data visualization tools to reanimate the cases for learning and hypothesis generation, apply the embedded ML (which includes regression), insert their own methods for study into the embedded Jupyter Notebook, or otherwise study the cohort (as selected by the search engine shown) on their own.

If no one cares about hoarding their own data for publication, we are going to save many more lives.
As the Ohio Governor’s team said, “These are not usual times.”

I’m working with some colleagues to develop a platform for sharing RCT protocols in the service of getting doctors the evidence they need to effectively treat COVID-19 as quickly as possible. COVID-19 Collaboration Platform aims to:

  1. connect PIs from different institutions who are interested in the same clinical question, in the hope that they will operate their trials under the same protocol and combine evidence across institutions. This is just not happening right now, with the consequence that incredibly important RCTs are underpowered, may never complete enrollment (because the epidemic may overwhelm capacity to run trials or may taper off before enough patients can be enrolled), or at the very least will not deliver answers to doctors nearly as quickly as they could with some coordination.

  2. facilitate analyses that combine evidence across institutions within the bounds of IRB and HIPAA constraints.
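
To illustrate the point about underpowered trials, here is a minimal sketch comparing the approximate power of one small single-site trial with that of several sites pooled under a common protocol. It uses a textbook normal approximation for a two-sided two-proportion test. The event rates, enrollment per arm, and number of sites are hypothetical, and the pooled figure assumes identical protocols with no between-site heterogeneity.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(
    p_control: float, p_treatment: float, n_per_arm: int, alpha: float = 0.05
) -> float:
    """Approximate power of a two-sided two-proportion z-test.

    Normal approximation with unpooled variance; the negligible chance of
    rejecting in the wrong direction is ignored.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treatment * (1 - p_treatment) / n_per_arm
    )
    effect = abs(p_control - p_treatment)
    return std_normal.cdf(effect / se - z_crit)


# Hypothetical scenario, for illustration only.
p_control = 0.30      # assumed event rate under standard care
p_treatment = 0.20    # assumed event rate under the treatment
n_single_site = 50    # patients per arm at one institution
n_sites = 6           # institutions running the same protocol

single_power = approx_power(p_control, p_treatment, n_single_site)
pooled_power = approx_power(p_control, p_treatment, n_single_site * n_sites)

print(f"Approximate power, one site:      {single_power:.2f}")
print(f"Approximate power, {n_sites} sites pooled: {pooled_power:.2f}")
```

Under these assumed numbers a single site falls well short of conventional power targets, while the pooled sample does much better. A real design would need to account for site effects and the analysis plan actually used.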

Please take a look at the website and send us your feedback and your protocols!

Just thought I’d contribute the following concept, which I hope a research organization may consider putting forward for clinical trials. As you may all know, with so many infections out there, the human immune system remains the vital key to our survival. Over the years, many people naturally develop stronger immunity to certain infections. Since the symptoms of COVID-19 are very similar to those of tuberculosis (TB), a closer look at the antibodies of individuals who have had active TB and are now healthy might lead researchers on to something. If those individuals do show some resilience because they have been exposed to infections similar to COVID-19, that stronger defense might be passed on to COVID-19 patients. This could be done in line with current blood plasma trials, instead of relying only on the limited supply of convalescent plasma from recovered COVID-19 patients. It might even help to alleviate challenges related to antibody titre in current blood plasma trials. Some countries are considering the BCG vaccine but, as we all know, this is only a hopeful preventive measure and may not necessarily help patients already infected with COVID-19. Further research into the findings might even help with development of a vaccine to end this pandemic.

Contributed by:
Marlene Beier
BSc. in Biology
