# Project for the study of Respiratory Syncytial Virus (RSV)

## RSV Problems

Respiratory Syncytial Virus (RSV) is the most frequent cause of respiratory tract infections in children younger than 2 years old and the major cause of hospitalizations, especially for bronchiolitis and pneumonia. RSV causes annual seasonal epidemics, with minor variations each year, that coincide with other widespread viral infections such as influenza or rotavirus. Together, these infections produce a high number of hospitalizations that saturate national health systems. Moreover, RSV is transmitted very easily, and nosocomial infections are frequent.

In Spain, there are 15,000 to 20,000 primary care visits due to RSV each year. In the Spanish region of Valencia, 1,500 children younger than five years old are hospitalized annually, with an average of 6 days of hospitalization per case. The cost of pediatric hospitalization for the Valencian Health System is about 3.5 million euros per year. To estimate the total pediatric cost of the illness, it would be necessary to add the costs of primary care and the social (indirect) costs.

Therefore, it is important to have tools that allow health systems to adapt to demand, and to establish epidemic markers and predictors that help health organizations design control strategies against RSV and prepare the health system reliably. Researchers are also studying vaccines to protect individuals at early ages, when the immune system is not completely developed, and, more importantly, to modulate the immune response responsible for the severity of the disease in young children.

## Virus Modelling

These facts lead us to propose the development of a dynamic model of RSV transmission and infection, fitted to real hospitalization data for children younger than one year in the Spanish region of Valencia. Once the model is developed, our goal is to design several prevention strategies, vaccination included, study their effectiveness, and perform a pharmaco-economic analysis in order to find more efficient strategies to reduce RSV incidence. Finally, we intend to propose public health guidelines and to extend the model to Spain.

The team working on this disease has developed a network model able to simulate person-to-person contacts and RSV transmission better than existing models. The starting point is the segmentation of the population into the states an individual can be in with respect to the disease: susceptible (healthy), sick and recovered, with transitions among these states as detailed in the following figure:

Then we build a network, or graph, where each node is an individual with specific characteristics independent of the other nodes (age, health status, sex, etc.). The links between nodes represent the relationships among individuals that the disease uses for transmission. If the relationships are chosen randomly when building the network, we have a so-called "random network". Furthermore, depending on the probability distribution chosen to set up the links among nodes, we can distinguish different kinds of random networks (Poisson, exponential, power-law). Once the network has been built and the evolution rules have been defined, we can simulate the model by studying, for each individual, the relationships he or she has and how he or she is affected by them. With this approach it is easy to study situations such as the behaviour of the disease when a vaccine is applied to a selected group of the population (e.g. only children, only the elderly), or when some treatment is applied to a specific group of sick persons. This is represented by the following figure.
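The construction described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names, default parameter values and the choice of a Poisson-like random network are assumptions made for illustration, and node attributes such as age or sex, as well as deaths, are left out.

```python
import random
from enum import Enum


class Status(Enum):
    SUSCEPTIBLE = "S"
    INFECTED = "I"
    RECOVERED = "R"


def build_random_network(n_nodes, mean_contacts, rng):
    """Link every pair of nodes with the same probability (Poisson-like degrees)."""
    p = mean_contacts / (n_nodes - 1)
    neighbours = {node: set() for node in range(n_nodes)}
    for a in range(n_nodes):
        for b in range(a + 1, n_nodes):
            if rng.random() < p:
                neighbours[a].add(b)
                neighbours[b].add(a)
    return neighbours


def step(status, days_left, neighbours, infection_rate,
         recovery_days, immunity_days, rng):
    """Advance every individual by one day and return the new statuses."""
    new_status = dict(status)
    for node, state in status.items():
        if state is Status.SUSCEPTIBLE:
            # Each infected contact may transmit the virus with a fixed probability.
            for other in neighbours[node]:
                if status[other] is Status.INFECTED and rng.random() < infection_rate:
                    new_status[node] = Status.INFECTED
                    days_left[node] = recovery_days
                    break
        else:
            days_left[node] -= 1
            if days_left[node] <= 0:
                if state is Status.INFECTED:
                    new_status[node] = Status.RECOVERED
                    days_left[node] = immunity_days
                else:
                    # Immunity has expired: the individual is susceptible again.
                    new_status[node] = Status.SUSCEPTIBLE
    return new_status


def simulate(n_nodes=200, mean_contacts=4, infection_rate=0.05,
             recovery_days=7, immunity_days=200, n_days=60, seed=1):
    """Run the day-by-day evolution and return the number of infected per day."""
    rng = random.Random(seed)
    neighbours = build_random_network(n_nodes, mean_contacts, rng)
    status = {node: Status.SUSCEPTIBLE for node in range(n_nodes)}
    days_left = {node: 0 for node in range(n_nodes)}
    for node in rng.sample(range(n_nodes), 5):  # five initial cases (assumed)
        status[node] = Status.INFECTED
        days_left[node] = recovery_days
    infected_per_day = []
    for _ in range(n_days):
        status = step(status, days_left, neighbours, infection_rate,
                      recovery_days, immunity_days, rng)
        infected_per_day.append(
            sum(s is Status.INFECTED for s in status.values()))
    return infected_per_day
```

Calling `simulate()` returns one infected count per simulated day. A vaccination strategy could be mimicked in this sketch by marking a chosen group of nodes as recovered before the first day.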

The drawback of this model is that parameter estimation (what we call "model fitting") is highly CPU demanding. Apart from a few specific exceptions, the fitting process implies a brute-force search. This means that we are forced to test every possible parameter combination (in our case the parameters are the infection rate, the number of relationships and the recovery time after infection), each combination being a separate model. The test, or "evolution", of a model consists of analysing, on a day-by-day basis over a period of several years, what happens to each individual: whether he or she becomes ill, recovers, dies, etc. After that, the simulation results must be inspected to check how well they fit the previously known data.
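The brute-force search can be sketched as follows. This is only an illustration under stated assumptions: `toy_model` is a hypothetical, deterministic stand-in for one full network simulation, and the sum of squared errors is an assumed measure of fit, not necessarily the one used in the project.

```python
import itertools


def toy_model(infection_rate, n_contacts, recovery_days, n_days=30):
    """Hypothetical stand-in for one full simulation: a simple growth curve."""
    growth = 1.0 + infection_rate * n_contacts - 1.0 / recovery_days
    cases = 10.0
    series = []
    for _ in range(n_days):
        series.append(cases)
        cases *= growth
    return series


def squared_error(simulated, observed):
    """Sum of squared differences between simulated and observed series."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def brute_force_fit(observed, rates, contacts, recoveries, model=toy_model):
    """Evaluate every parameter combination and keep the best-fitting one."""
    best_params, best_error = None, float("inf")
    for params in itertools.product(rates, contacts, recoveries):
        error = squared_error(model(*params, n_days=len(observed)), observed)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Synthetic "known data" generated from parameters the search should recover.
observed = toy_model(0.03, 5, 7)
best_params, best_error = brute_force_fit(
    observed,
    rates=[0.01, 0.02, 0.03, 0.04],
    contacts=[3, 4, 5, 6],
    recoveries=[5, 7, 9],
)
```

Even this tiny grid requires 4 × 4 × 3 = 48 model evaluations. Because every combination is independent of the others, the evaluations can be handed out to different computers, which is what makes the problem suitable for distributed computation.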

In practice, this means that network models are used with great restrictions, such as reducing the size of the network (in number of nodes, relationships or both) or the range of parameter values to be tried. Consider a model for the whole population of Spain (about 45,000,000 nodes). The network would be so huge that we usually cut it down to, say, 10,000 or 100,000 nodes and then extrapolate the results. This is sometimes valid, but in many other cases it is, at the very least, questionable.
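To make the extrapolation step concrete, here is a minimal sketch that assumes results scale linearly with network size, which is exactly the assumption that may be questionable. The function name and the figure of 20 cases are hypothetical and are used only for illustration.

```python
def extrapolate(sample_cases, sample_nodes, population):
    """Scale a count from the reduced network linearly to the full population."""
    return sample_cases * population / sample_nodes


# A 100,000-node network stands in for about 45,000,000 people,
# so every simulated case is multiplied by a factor of 450.
scale_factor = 45_000_000 / 100_000
estimated_cases = extrapolate(20, 100_000, 45_000_000)  # 20 is a made-up count
```

With a factor this large, small random fluctuations in the reduced network are amplified in the extrapolated result.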

In the end, once the model has been validated and improved, the main aim is to test different vaccination strategies and to perform a cost-benefit analysis of each strategy.

## Project’s Phases

In this project we have used distributed computation to carry out the model calculations in a reasonable period of time, splitting the work into three phases.

### Phase I: Primary fitting

In this phase we made a coarse adjustment of the infection rates and numbers of relationships that best approximate the real data. We used a proprietary distributed computation system called "SISIFO", suitable for small-scale distributed computation in local networks. The number of computers used varied (usually over 20, with peaks of about 100 during weekends), and we computed over 60,000 models of 1,000,000 nodes each, for a total of 3 years of CPU time. We obtained interesting results and found a model able to reproduce the evolution of the virus reasonably well.
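As a rough back-of-the-envelope check of what these figures imply, the average CPU time per model can be estimated as below. The calculation treats "over 60,000 models" as exactly 60,000 and a year as 365 days, so the result is only an approximate upper bound.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

cpu_minutes = 3 * MINUTES_PER_YEAR  # about 3 years of CPU time in Phase I
n_models = 60_000                   # "over 60,000 models" taken as 60,000

minutes_per_model = cpu_minutes / n_models  # roughly 26 minutes per model
```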

### Phase II: Secondary fitting

In this phase we fine-tuned the infection rate, number of relationships and immunity time of the model. The aim is to determine how the infection is affected by the time it takes individuals to become susceptible again once they have recovered. This implies the computation of about 140,000,000 models of 1,000,000 nodes each. This is more than our SISIFO system can handle, so we switched to BOINC and built our own BOINC server. The following figure shows the evolution:

From 18/05 to 26/06 we ran computations using the computers of CES Felipe II and IMM, but the computing power was not enough: the computation would have needed about 8 months to finish. On 27/06 we made the project public among the BOINC community, and word of mouth did the rest. In the following 3 weeks we reached over 850 active hosts and were able to compute all the models still pending, reaching 21 years of CPU time and more than 500 gigabytes of results.

### Phase III: Tertiary fitting

In this phase we will use the best model found in Phase II to evaluate different vaccination strategies. We hope to do this around September. It has yet to be decided whether we will use distributed computation again.

## Thanks

We wish to thank the BOINC community for its support and, among the English-speaking teams, Team Starfire and SETI.USA in particular. The full list is far longer (it is available in the statistics area of the project), but we show here the teams with over 10,000 credits:

1. Team Starfire World BOINC
2. SETI.USA
3. CANAL@Boinc
4. Crunchers@Freiburg
5. BOINC@Poland
6. SeriousCrunchers
7. TitanesDC
8. SETI.Germany
9. BOINC@MIXI
10. AMD Users
11. Team 2ch
12. SaveTheWorld
13. L’Alliance Francophone
14. BOINC@Heidelberg
15. Free-DC
16. The Knights Who Say Ni!
17. Team Norway
18. Team England (Boinc)