What is a data pattern that repeats itself after a period of days, weeks, months, or quarters?

Time Series: Seasonal Adjustment

E.B. Dagum, in International Encyclopedia of the Social & Behavioral Sciences, 2001

2 Seasonal Models

The most often discussed seasonal models can be classified as deterministic or stochastic according to the assumptions made concerning the time evolution of the seasonal pattern.

The simplest and often studied seasonal model assumes that the generating process of seasonality can be represented by strictly periodic functions of time of annual periodicity. The problem is to estimate s seasonal coefficients (s being the seasonal periodicity, e.g., 12 for monthly series, 4 for quarterly series) subject to the seasonal constraint that they sum to zero.

This regression seasonal model can be written as

(1) $Y_t = S_t + \varepsilon_t,\quad t = 1, \dots, T; \qquad S_t = \sum_{j=1}^{s} \gamma_j d_{jt} \quad \text{subject to} \quad \sum_{j=1}^{s} \gamma_j = 0$

where the $d_{jt}$ are dummy variables taking a value of unity in season $j$ and a value of zero otherwise; $\{\varepsilon_t\} \sim WN(0, \sigma_\varepsilon^2)$ is a white noise process with zero mean, homoscedastic, and nonautocorrelated. The seasonal parameters $\gamma_j$ can be interpreted as the expectation of $Y_t$ in each season and represent the seasonal effects of the series under question. Since the parameters are constant, the seasonal pattern repeats exactly over the years, that is,

(2) $S_t = S_{t-s}$, for all $t > s$
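As a rough illustration of the deterministic model, the sketch below estimates the s seasonal coefficients from a series. It assumes the series covers a whole number of seasonal cycles, in which case centring the per-season means gives the least-squares coefficients under the sum-to-zero constraint. The function name `seasonal_effects` and the quarterly data are hypothetical.

```python
def seasonal_effects(y, s):
    """Estimate s seasonal coefficients that sum to zero.

    Assumes len(y) is a multiple of s (complete cycles), so that centring
    the per-season means is the constrained least-squares solution.
    """
    if len(y) % s != 0:
        raise ValueError("expected a whole number of seasonal cycles")
    # Mean of the observations falling in each season j = 0, ..., s-1.
    season_means = [sum(y[j::s]) / len(y[j::s]) for j in range(s)]
    # Centre the means so the coefficients satisfy the seasonal constraint.
    grand_mean = sum(season_means) / s
    return [m - grand_mean for m in season_means]


# Hypothetical quarterly series: two years of an exactly repeating pattern.
y = [3.0, -1.0, -4.0, 2.0, 3.0, -1.0, -4.0, 2.0]
gamma = seasonal_effects(y, s=4)
```

Because the example pattern repeats exactly, the estimated coefficients reproduce it, which is the situation described by Eqn. (2).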

Eqn. (1), often included in econometric models, assumes that the effects of the seasonal variations are deterministic because they can be predicted with no error. However, for most economic and social time series, seasonality is not deterministic but changes gradually in a stochastic manner. One representation of stochastic seasonality is to assume in regression Eqn. (1) that the seasonal parameters are random variables rather than constants. Hence, the relationship, Eqn. (2), becomes

(3) $S_t = S_{t-s} + \omega_t$, for all $t > s$

where $\{\omega_t\} \sim WN(0, \sigma_\omega^2)$ and $E(\omega_t \varepsilon_t) = 0$ for all $t$. The stochastic seasonal balance constraint is here given by $\sum_{j=0}^{s-1} S_{t-j} = \omega_t$, with expected value equal to zero.
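To make the seasonal random walk of Eqn. (3) concrete, the following sketch simulates it and then applies the seasonal difference, which returns the white noise shocks. The starting pattern, the noise scale, and the function names are illustrative assumptions, not values from the source.

```python
import random


def simulate_seasonal_random_walk(start, n, sigma, seed=0):
    """Simulate S_t = S_{t-s} + omega_t, with s = len(start)."""
    rng = random.Random(seed)
    s = len(start)
    path = list(start)          # the first s values are given
    shocks = []
    for t in range(s, n):
        omega = rng.gauss(0.0, sigma)   # white noise shock
        shocks.append(omega)
        path.append(path[t - s] + omega)
    return path, shocks


def seasonal_difference(x, s):
    """Return x_t - x_{t-s} for every t with a value s periods earlier."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


# Hypothetical quarterly starting pattern and noise level.
start = [3.0, -1.0, -4.0, 2.0]
path, shocks = simulate_seasonal_random_walk(start, n=40, sigma=0.5)
recovered = seasonal_difference(path, s=4)
```

Up to floating-point rounding, `recovered` equals `shocks`: seasonal differencing removes the nonstationary part of the simulated pattern.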

Eqn. (3) assumes seasonality to be generated by a nonstationary stochastic process. In fact, it is a random walk process and can be made stationary by seasonal differencing $\nabla_s = (1 - B^s)$, where $B$ denotes the backward shift operator. In reduced form, the corresponding regression model is a linear combination of white noise processes of maximum lag $s$. Since $(1 - B^s) = (1 - B)(1 + B + \dots + B^{s-1})$, model-based seasonal adjustment methods attribute only $S(B) = \sum_{j=0}^{s-1} B^j$ to the seasonal component, leaving $(1 - B)$ as part of a nonstationary trend. Thus, the corresponding seasonal stochastic model can be written as

(4) $S(B) S_t = \omega_t$.

Eqn. (4) represents a seasonality that evolves in an erratic manner, and it is often included in seasonal adjustment methods based on structural model decomposition (see, e.g., Harvey 1981, Kitagawa and Gersch 1984). This seasonal behavior is seldom found in observed time series, where seasonality evolves smoothly through time. To represent the latter, Hillmer and Tiao (1982) introduced a major modification

(5) $S(B) S_t = \eta_s(B) b_t$

where the left-hand side of Eqn. (5) follows an invertible moving average process of maximum order s−1, denoted by MA(s−1). Eqn. (5) is discussed extensively in Bell and Hillmer (1984) and can be generalized easily by replacing the MA part with stationary invertible autoregressive moving average (ARMA) processes of the Box and Jenkins (1970) type. Depending on the order of the ARMA process a large variety of evolving stochastic seasonal variations can be modeled.

Other seasonal models used in regression methods, where the components are assumed to be local polynomials of time, are based on a smoothness criterion. For example, given $Y_t = S_t + \varepsilon_t$, $t = 1, \dots, T$, restrictions are imposed on $S_t$ such that $\sum_t (S_t - S_{t-s})^2$ and $\sum_t [S(B) S_t]^2$ are small. The solution is given by minimizing a weighted linear combination of both criteria as follows

(6) $\sum_{t=1}^{T} (Y_t - S_t)^2 + \alpha \sum_{t=1}^{T} (S_t - S_{t-s})^2 + \beta \sum_{t=1}^{T} [S(B) S_t]^2$

where the ill-posed problem of choosing the α and β parameters was solved by Akaike (1980) using a Bayesian model.
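For fixed weights, the criterion in Eqn. (6) is quadratic in the seasonal component, so its minimizer solves a linear system. The sketch below is one way to set this up with NumPy. It assumes that penalty terms involving values before the first observation are simply dropped, and it takes α and β as given rather than choosing them by Akaike's Bayesian procedure. The function name and the example values are hypothetical.

```python
import numpy as np


def smooth_seasonal(y, s, alpha, beta):
    """Minimize ||y - S||^2 + alpha*||D S||^2 + beta*||A S||^2 over S.

    D S stacks the seasonal differences S_t - S_{t-s}; A S stacks the
    sums of s consecutive values, i.e. S(B) S_t. Only rows whose lags
    fall inside the sample are used (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    T = y.size
    D = np.zeros((T - s, T))
    for r in range(T - s):
        D[r, r + s] = 1.0
        D[r, r] = -1.0
    A = np.zeros((T - s + 1, T))
    for r in range(T - s + 1):
        A[r, r:r + s] = 1.0
    # Setting the gradient to zero gives (I + alpha D'D + beta A'A) S = y.
    lhs = np.eye(T) + alpha * D.T @ D + beta * A.T @ A
    return np.linalg.solve(lhs, y)


# Hypothetical check: an exactly periodic, zero-sum pattern incurs no
# penalty, so it should be returned unchanged.
y_exact = np.tile([3.0, -1.0, -4.0, 2.0], 3)
fitted_exact = smooth_seasonal(y_exact, s=4, alpha=10.0, beta=10.0)
```

Larger α pushes the estimate towards an exactly repeating pattern, while larger β pushes each run of s consecutive seasonal values towards a zero sum.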

Finally, mixed seasonal models with deterministic and stochastic effects are given in Pierce (1978).

URL: https://www.sciencedirect.com/science/article/pii/B0080430767005222

Time Series: ARIMA Methods

G.C. Tiao, in International Encyclopedia of the Social & Behavioral Sciences, 2001

7 Seasonal ARIMA Models

Business, economic, and atmospheric time series data often exhibit a strong, but not necessarily deterministic, seasonal behavior. It seems natural to think that the seasonal pattern arises simply because observations at the same time point of the year are dynamically related across the years. For monthly data, this means that observations 12, 24,… periods apart are correlated. Just as the class of ARIMA (p, d, q) models represents the dependence of observations in successive periods (months), we may construct similar models to account for dynamic dependence of observations in the same month of successive years. As an illustration, corresponding to the regular ARIMA (0, 1, 1) model in Eqn. (5), we may consider a seasonal model of the form $\nabla_{12} y_t = e_t - \Theta_{12} e_{t-12}$, where $\nabla_{12} y_t = y_t - y_{t-12}$ is the seasonal difference and $e_t$ is the noise term. To account for both seasonal dependence and month-to-month dependence, we may let $e_t$ follow the ARIMA (0, 1, 1) model $\nabla e_t = a_t - \theta_1 a_{t-1}$. Together, we have that

(7) $\nabla \nabla_{12} y_t = a_t - \theta_1 a_{t-1} - \Theta_{12} a_{t-12} + \theta_1 \Theta_{12} a_{t-13}$

where $\nabla \nabla_{12} y_t = y_t - y_{t-1} - (y_{t-12} - y_{t-13})$ is the double (regular and seasonal) difference of $y_t$. This ‘doubly’ exponential smoothing model, proposed by Box and Jenkins and motivated by their analysis of airline passenger data, has been widely used to represent a variety of seasonal time series in practice. It is known as the ‘airline model.’ Similar extensions are readily obtainable for other ARIMA models.
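The moving average side of the airline model is the product of the regular factor (1 − θ1 B) and the seasonal factor (1 − Θ12 B^12). The sketch below multiplies the two polynomials numerically, so the coefficient at lag 13 can be read off as the product θ1Θ12, and it also computes the double difference of a series. The parameter values and helper names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists (index = power)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def double_difference(y):
    """Return y_t - y_{t-1} - (y_{t-12} - y_{t-13}) for t >= 13."""
    return [y[t] - y[t - 1] - (y[t - 12] - y[t - 13]) for t in range(13, len(y))]


theta1, Theta12 = 0.4, 0.6                    # hypothetical parameter values
regular = [1.0, -theta1]                      # 1 - theta1 * B
seasonal = [1.0] + [0.0] * 11 + [-Theta12]    # 1 - Theta12 * B^12
ma = poly_mul(regular, seasonal)              # coefficients of B^0 ... B^13

# A linear trend plus an exactly repeating monthly pattern is removed
# completely by the double difference.
series = [0.5 * t + (t % 12) for t in range(40)]
residual = double_difference(series)
```

Here `ma[1]` and `ma[12]` carry the negative signs of the two individual factors, while `ma[13]` is positive because it is the product of two negative coefficients.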

URL: https://www.sciencedirect.com/science/article/pii/B0080430767005209

11th International Symposium on Process Systems Engineering

Daniel P. Word, ... Carl D. Laird, in Computer Aided Chemical Engineering, 2012

2 Model Formulation

As early as 1929, researchers proposed that measles transmission is correlated with school terms (Soper, 1929). Measles data from England and Wales is used to estimate the seasonal pattern of the transmission parameter across all 60 cities. The SIR model used in this work is shown below in Eq. (3). The corresponding dynamic optimization problem is converted to a large-scale nonlinear programming problem (through the simultaneous discretization approach) using a three-point Radau collocation on finite elements (Zavala, 2008). The number of finite elements was chosen to yield one finite element per reporting interval of the data giving 26 finite elements per year.

The optimization model for a single city is as follows:

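(3)
$$
\begin{aligned}
\min \quad & \sum_{t \in T} w_M \varepsilon_{M,t}^2 + w_\varphi \varepsilon_{\varphi,t}^2 \\
\text{s.t.} \quad & \frac{dS}{dt} = -\frac{\beta(y(t))\, S(t)\, I(t)}{N} + \varepsilon_M(t) + B(t) \\
& \frac{dI}{dt} = \frac{\beta(y(t))\, S(t)\, I(t)}{N} + \varepsilon_M(t) - \gamma I(t) \\
& \frac{d\varphi}{dt} = \frac{\beta(y(t))\, S(t)\, I(t)}{N} + \varepsilon_M(t) \\
& R_i^* = \eta_i (\varphi_i - \varphi_{i-1}) + \varepsilon_{\varphi,i} \\
& \bar{\beta} = \frac{\sum_{i \in \tau} \beta_i}{|\tau|} = 2.05 \\
& 0.05 \le \beta(y(t)) \le 5.0 \\
& 0 \le I(t),\ S(t) \le N, \quad 0 \le \varphi(t)
\end{aligned}
$$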

Here, $S$ denotes individuals with no immunity to the disease, $I$ denotes individuals that are infected with the disease and are infectious, $N$ denotes the total population, and $\beta(t)$ denotes the time-varying transmission parameter. The function $y(t)$ maps the overall time horizon into the elapsed time within the year, making $\beta(y(t))$ a seasonal transmission parameter with a periodicity of one year. $B$ denotes the reported births, and the recovery rate ($\gamma = 1/14$) is given as a known scalar input. $\varepsilon_M$ represents the dynamic model noise, which is assumed to be normally distributed. The index $i$ denotes a point in time within the set of data reporting intervals $T$, $\eta$ denotes a reporting factor accounting for under-reporting, and $R_i^*$ denotes the actual reported incidence over a given time interval. The cumulative new incidence is $\varphi$, and $\varepsilon_\varphi$ represents the measurement noise. $w_M$ and $w_\varphi$ represent weights for the model and measurement noise terms, respectively. These weights are set to be proportional to the inverse of the assumed variance of the error terms. The $\bar{\beta}$ term denotes the average $\beta$ across the yearly set of discretizations $\tau$ and is constrained here due to difficulties in simultaneously estimating $S$ and $\beta$. $|\tau|$ is the cardinality of the set $\tau$.
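To show how a seasonal transmission parameter enters the dynamics, here is a minimal forward-Euler simulation of the noise-free model (all noise terms set to zero). It is only a sketch, not the collocation-based estimation used in the chapter. It assumes time is measured in days, a 364-day year split into 26 reporting intervals, and made-up values for the population, initial state, births, and transmission profile.

```python
def simulate_seasonal_sir(beta_by_interval, N, S0, I0, births_per_day,
                          gamma=1.0 / 14.0, days=364, dt=0.1):
    """Integrate the noise-free seasonal SIR equations with forward Euler.

    beta_by_interval holds one transmission value per reporting interval
    of the year; y(t) is implemented as the day of the (364-day) year.
    """
    n_intervals = len(beta_by_interval)
    interval_length = 364.0 / n_intervals
    S, I, phi = float(S0), float(I0), 0.0
    history = []
    for k in range(int(round(days / dt))):
        t = k * dt
        day_of_year = t % 364.0
        idx = min(int(day_of_year // interval_length), n_intervals - 1)
        incidence = beta_by_interval[idx] * S * I / N   # new infections per day
        S += dt * (-incidence + births_per_day)
        I += dt * (incidence - gamma * I)
        phi += dt * incidence                           # cumulative incidence
        history.append((t + dt, S, I, phi))
    return history


# Hypothetical profile: lower transmission during four mid-year intervals.
beta = [0.9 if 14 <= i < 18 else 1.3 for i in range(26)]
history = simulate_seasonal_sir(beta, N=100_000, S0=5_000, I0=50,
                                births_per_day=5.0)
```

Each entry of `history` is a tuple of time, susceptibles, infectious individuals, and cumulative incidence. Differencing the last column at the ends of consecutive reporting intervals and scaling by a reporting factor gives the modelled reported incidence.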

The England and Wales measles data contains biweekly reported measles case counts by city for the years 1944 through 1963. The number of births per year is also reported by city. We assume births are uniform throughout the year. For all cities, population is assumed to be constant throughout the time horizon. The reporting fraction was estimated for London using a susceptible reconstruction technique that is similar to those described elsewhere (Bobashev et al., 2000; Finkenstädt and Grenfell, 2000), with the exception that we restrict the reporting factor to vary linearly with time.

URL: https://www.sciencedirect.com/science/article/pii/B9780444595065501322

Time Series: Cycles

M.W. Watson, in International Encyclopedia of the Social & Behavioral Sciences, 2001

1 Cycles in a Typical Time Series

Figure 1 shows a time series plot of new housing authorizations (‘building permits’) issued by communities in the USA, monthly, from 1960 through 1999. This plot has characteristics that are typical of many economic time series. First, the plot shows a clear seasonal pattern: permits are low in the late fall and early winter, and rise markedly in the spring and summer. This seasonal pattern is persistent throughout the sample period, but does not repeat itself exactly year-after-year. In addition to a seasonal pattern, there is a slow moving change in the level of the series associated with the business cycle. For example, local minima of the series are evident in 1967, the mid-1970s, the early 1980s, and in 1990, which correspond to periods of macroeconomic slowdown or recession in the US economy.

Figure 1. Building permits in the USA

How can the cyclical variability in the series be represented? One approach is to use a periodic function like a sine or cosine function. But this is inadequate in at least two ways. First, a deterministic periodic function doesn't capture the randomness in the series. Second, since several different periodicities (seasonal, business cycle) are evident in the plot, several periodic functions will be required. The next section presents a representation of a stochastic process that has these ingredients.

URL: https://www.sciencedirect.com/science/article/pii/B0080430767005210

Streamflow

A.I. McKerchar, in Encyclopedia of Physical Science and Technology (Third Edition), 2003

IV.A.2 Seasonal Variation

Seasonal variation is evident in most streamflow records and occurs because of seasonal variations in the precipitation, evaporation and transpiration, and soil moisture and groundwater levels. Snow and ice accumulation in winter and snowmelt in spring and summer contribute to the seasonal variation in cooler regions. Where rivers are sources of water for irrigation, the abstractions typically have a strong seasonal pattern, which impacts on the flow remaining in the river. Other human impacts are regulating reservoirs that store water at times of high flows for release when natural flows are lower.

All these factors contribute to the seasonal variations. For the Clutha River, seasonal variation is displayed as the distributions of monthly mean flows in Fig. 5. This figure is a boxplot of the monthly mean flows for each month from 1955–1999. For each month it shows the minimum monthly mean streamflow, the value exceeded in 75, 50, and 25% of years, and the maximum monthly mean streamflow. (The 50th percentile value is also known as the median.) The seasonal variation is modest, but the austral winter values (in June, July, and August) are typically lower than the austral spring and summer values (in September, October, November, December, January, and February). The lower winter flows and the higher spring and summer flows are due primarily to winter snow accumulation and spring and summer snowmelt.

FIGURE 5. Seasonal variation in streamflow illustrated using boxplots of monthly mean flows for the Clutha River at Balclutha, New Zealand, for 1954–1999. For each month, the plots show the minimum monthly streamflow, the value exceeded in 75, 50, and 25% of years, and the maximum monthly streamflow.

URL: https://www.sciencedirect.com/science/article/pii/B0122274105007419

Process Models for Data Mining and Predictive Analysis

Colleen McCue, in Data Mining and Predictive Analysis (Second Edition), 2015

An iterative process also is important because crime and criminals change. As the players change, so do the underlying patterns and trends in crime. For example, preferences for illegal drugs often cycle between stimulants and depressants. Markets associated with distribution of cocaine frequently differ from those associated with heroin. Therefore, as the drug of choice changes, so will the associated markets and related crime. Similarly, seasonal patterns and even weather can change crime patterns and trends, particularly if they affect the availability of potential victims or other targets. For example, people are less likely to stroll city streets during a torrential downpour, which limits the availability of potential “targets” for street robberies; temperature extremes might be associated with an increased prevalence of vehicles left running to keep either the air conditioning or heat on, which increases the number of available targets for auto theft. Successful police work also will require periodic “refreshing” of the model. The models will change subtly as offenders are apprehended and removed from the streets. These revised models will reflect the absence of known offenders, as well as the emergence of any new players. For example, illegal drug markets frequently experience changes in operation and function associated with changes in players. Similarly, serial killers may have a unique “signature” associated with their crimes that can be used to link several crimes and segment them into a separate and distinct series.

URL: https://www.sciencedirect.com/science/article/pii/B9780128002292000043

29th European Symposium on Computer Aided Process Engineering

Haile Woldesellasse, ... Tareq Al-Ansari, in Computer Aided Chemical Engineering, 2019

3 Methodology

The first part of the methodology consists of analysing satellite images acquired from Landsat 7 for the year 2002. NDVI and NDWI are calculated to assess the crop vegetation and water stress of the crop over the given time period. NDVI is calculated from the Red and Near Infrared channels, which are Bands 3 and 4 respectively, using the GIS tool. The vegetation phase displays a seasonal pattern in terms of WF which depends on its water intake and temperature. Under ideal conditions, where there is an abundance of water available with minimal evapotranspiration (ET), crops display green vegetation characteristics. Therefore, NDVI and NDWI are high because the corresponding water content of the crop is high. Alternatively, high temperature increases the ET and crop water stress, thereby lowering the NDVI and NDWI. The correlation of NDVI with crop water demand (m3/ha) and temperature is also assessed to check the interaction between them. The total water demand of the fields, D, is then calculated by multiplying the crop water demand by the area of the fields. This study considers the optimisation of water supplied from the treated sewage effluent (TSE) plants to meet the water demand of the alfalfa fields, D, considered in Woldesellasse, Govindan and Al-Ansari (2018). The water is pumped from the TSE plants and is stored at optimum locations (x, y) prior to its distribution to the alfalfa fields. The objective of the nonlinear optimisation is to minimize the cost of water distribution. The nonlinear programming formulation used to allocate irrigation water to the fields is detailed below:

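Constraints:

$$
\begin{aligned}
q_{11} + q_{12} &= 1 \\
q_{21} + q_{22} &= 1 \\
C_1 q_{11} + C_2 q_{21} &\ge D_1 \\
C_1 q_{12} + C_2 q_{22} &\ge D_2 \\
q_{11}, q_{12}, q_{21}, q_{22} &\ge 0
\end{aligned}
$$

Objective - Minimise:

$$
C_1 q_{11} r_{11}\,\mathrm{Cost}_{11} + C_2 q_{21} r_{21}\,\mathrm{Cost}_{21} + C_1 q_{12} r_{12}\,\mathrm{Cost}_{12} + C_2 q_{22} r_{22}\,\mathrm{Cost}_{22}
$$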

Where, Ci is the capacity of the two TSE, Di is the demand of the two fields, qij indicates the fraction of water transported from source plants to alfalfa fields, Costij is the cost of water distribution, and rij is the length of transportation from the TSE to the alfalfa fields.
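As a small numerical illustration of this allocation problem, the sketch below searches a grid of fractions for the cheapest feasible split. It assumes the transport lengths and unit costs are fixed, which makes the objective linear in the fractions; in the study the storage locations are also optimised, which this sketch ignores. All capacities, demands, lengths, and costs are made-up values, and a grid search stands in for a proper solver.

```python
def cheapest_allocation(C, D, r, cost, steps=100, tol=1e-9):
    """Grid search over q11 and q21, with q12 = 1 - q11 and q22 = 1 - q21.

    C[i] is the capacity of plant i, D[j] the demand of field j, and
    r[i][j], cost[i][j] the length and unit cost from plant i to field j.
    Returns (total cost, q11, q21) for the best feasible grid point,
    or None if no grid point satisfies both demands.
    """
    best = None
    for a in range(steps + 1):
        for b in range(steps + 1):
            q11, q21 = a / steps, b / steps
            q12, q22 = 1.0 - q11, 1.0 - q21
            meets_d1 = C[0] * q11 + C[1] * q21 >= D[0] - tol
            meets_d2 = C[0] * q12 + C[1] * q22 >= D[1] - tol
            if not (meets_d1 and meets_d2):
                continue
            total = (C[0] * q11 * r[0][0] * cost[0][0]
                     + C[1] * q21 * r[1][0] * cost[1][0]
                     + C[0] * q12 * r[0][1] * cost[0][1]
                     + C[1] * q22 * r[1][1] * cost[1][1])
            if best is None or total < best[0]:
                best = (total, q11, q21)
    return best


# Hypothetical data: two plants, two fields.
C = [100.0, 80.0]
D = [90.0, 85.0]
r = [[2.0, 5.0], [4.0, 1.0]]
cost = [[1.0, 1.0], [1.0, 1.0]]
best = cheapest_allocation(C, D, r, cost)
```

With these numbers the search sends 95% of the first plant's water to the first field and all of the second plant's water to the second field, since those are the two short routes and the second field's demand caps how much the first plant can divert.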

URL: https://www.sciencedirect.com/science/article/pii/B9780128186343502496

Smart restaurants: survey on customer demand and sales forecasting

A. Lasek, ... J. Saunders, in Smart Cities and Homes, 2016

3.3 Box–Jenkins models (ARIMA)

Time series models differ from Multiple and Poisson Regression models in that they do not contain a cause–effect relationship. They use mathematical equations to find time patterns in a series of historical data. These equations are then used to project the historical time patterns into the future. There are three types of time series patterns: trend, seasonal, and cyclic. A trend pattern exists when there is a long-term increase or decrease in the series. The trend can be linear, exponential, or of another form, and can change direction over time. Seasonality exists when data are influenced by seasonal factors, such as the day of the week, the month, or the quarter of the year. A seasonal pattern has a fixed, known period. A cyclic pattern occurs when the data rise and fall without a fixed period; the duration of these fluctuations is usually at least 2 years [33].

The AR model specifies that the output variable depends linearly on its own previous values. The notation AR(p) refers to an AR model of order p. The AR(p) model for time series Xt is defined as follows:

$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t$

where $\varphi_1, \dots, \varphi_p$ are the parameters of the model, $c$ is a constant, and $\varepsilon_t$ is white noise. The $\varepsilon_t$ are typically assumed to be independent and identically distributed (IID) random variables sampled from a normal distribution with zero mean: $\varepsilon_t \sim N(0, \sigma^2)$, where $\sigma^2$ is the variance [10].
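The AR(p) recursion can be sketched in a few lines: a one-step forecast is the constant plus the weighted recent values, and a simulation adds a normal shock to that forecast at each step. The coefficient values, the zero start-up values, and the function names are illustrative assumptions.

```python
import random


def ar_forecast(history, c, phi):
    """One-step AR(p) forecast: c + sum_i phi_i * X_{t-i}."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # history[-1] is X_{t-1}, history[-2] is X_{t-2}, and so on.
    return c + sum(phi[i] * history[-1 - i] for i in range(p))


def simulate_ar(c, phi, n, sigma=1.0, seed=0):
    """Simulate n values of an AR(p) process with normal white noise."""
    rng = random.Random(seed)
    x = [0.0] * len(phi)          # hypothetical zero start-up values
    for _ in range(n):
        x.append(ar_forecast(x, c, phi) + rng.gauss(0.0, sigma))
    return x[len(phi):]


# Hypothetical AR(2) coefficients.
c, phi = 0.5, [0.6, -0.2]
next_value = ar_forecast([1.0, 2.0], c, phi)   # 0.5 + 0.6*2.0 - 0.2*1.0
sample = simulate_ar(c, phi, n=100)
```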

Another common approach in time series analysis is a MA model. The notation MA(q) indicates the MA model of order q:

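$X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$

where $\mu$ is the mean of the series, $\theta_1, \dots, \theta_q$ are the parameters of the model, and $\varepsilon_{t-1}, \dots, \varepsilon_{t-q}$ are white noise error terms [10].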

In other words, an MA model is a linear regression of the current value of the series against current and previous unobserved white noise error terms, or random shocks. These random shocks at each point are assumed to be mutually independent and to come from the same distribution, usually a normal one.

The moving-average forecasting method, which shares its name with the MA(q) model but is a smoothing technique, is very simple. It is based on the idea that the most recent observations serve as better predictors of future demand than older data. Therefore, instead of taking the forecast as the average of all data, it uses the average of a window of only the q previous observations.

A moving average reacts faster to underlying shifts in demand if q is small, but a small span makes the forecast more sensitive to noise in the data.

If the data show an upward or downward trend, the moving-average forecast is systematically too low or too high, respectively. To handle such cases, improvements such as a double or triple MA have been developed, but for this kind of data the exponential smoothing methods described in the next section are usually preferred.
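The windowed average and its lag behind a trend can be seen in a short sketch. The demand figures and the function name are hypothetical.

```python
def moving_average_forecast(series, q):
    """Forecast the next value as the mean of the last q observations."""
    if q <= 0 or len(series) < q:
        raise ValueError("q must be between 1 and len(series)")
    return sum(series[-q:]) / q


# Hypothetical demand rising by 2 units per period.
demand = [10, 12, 14, 16, 18, 20]
wide = moving_average_forecast(demand, q=3)    # mean of 16, 18, 20
narrow = moving_average_forecast(demand, q=1)  # last observation only
```

If the trend continued, the next value would be 22. Both forecasts fall short of it, and the wider window falls short by more, which is the systematic bias described above.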

AR and MA models have been used to make predictions for many different kinds of time series data. One example is presented in Ref. [34], the first study of casino buffet restaurants. The authors examined eight simple forecasting models. The results suggest that the most accurate model, with the smallest Mean Absolute Percentage Error (MAPE) and root mean square percentage error (RMSPE), was a double MA.

Another tool for understanding and predicting future values in time series data is the ARMA(p, q) model, which combines an AR part of order p with an MA part of order q. The general autoregressive moving average (ARMA) model was described in 1951 in the thesis of Whittle [35]. Given a time series of data $X_t$, the ARMA model is given by the following formula:

$X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$

where the terms in the equation have the same meaning as earlier.

An autoregressive integrated moving average (ARIMA) model is a generalization of an ARMA model. ARIMA models (Box–Jenkins models) are applied in some cases where data show evidence of nonstationarity (a stationary process is a stochastic process whose joint probability distribution does not change over time and whose parameters, e.g., the mean and variance, consequently do not change over time) [36].

Most real time series data turn out to be nonstationary. In such cases, stationary time series models may not fit the data well and can produce poor forecasts. Techniques for dealing with nonstationary data try to make such data stationary by applying suitable transformations, so that stationary time series models can be used to analyze the transformed data. The resulting stationary predictions are then converted back to their original nonstationary form. One such technique is differencing successive points in the time series. An autoregressive integrated moving-average process, ARIMA(p, d, q), is one whose dth differenced series is an ARMA(p, q) process.
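Differencing and its inverse can be sketched as follows. The example uses a quadratic trend, which becomes constant after two rounds of differencing, and shows how the original series is rebuilt from the differenced one. The function names are hypothetical, and the inverse step assumes the first value at each level of differencing has been kept.

```python
def difference(x, d=1):
    """Apply the first difference d times."""
    x = list(x)
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


def integrate(dx, first_values):
    """Undo difference().

    first_values[k] is the first element of the series after k rounds
    of differencing, for k = 0, ..., d-1.
    """
    x = list(dx)
    for first in reversed(first_values):
        restored = [first]
        for step in x:
            restored.append(restored[-1] + step)   # cumulative sum
        x = restored
    return x


# Hypothetical series with a quadratic trend.
series = [t * t for t in range(8)]
once = difference(series, d=1)     # still trending
twice = difference(series, d=2)    # constant
rebuilt = integrate(twice, [series[0], once[0]])
```

In a forecasting setting, predictions made on the differenced scale would be appended to `twice` before calling `integrate`, which converts them back to the original scale.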

An interesting case of a big data mining project for one of the world’s largest multibrand fast-food restaurant chains, with more than 30,000 stores worldwide, is illustrated in Ref. [2]. Time series data mining is discussed at both the store level and the corporate level. To analyze and forecast the large volume of data, the researchers used Box–Jenkins seasonal ARIMA models. An automatic outlier detection and adjustment procedure was also used for both model estimation and prediction.

A system designed to generate statistical predictions of menu-item demand in hospitals, 1–28 days prior to patient meal service, is described in Ref. [37]. The authors used 18 weeks of supper data to analyze menu-item preferences and to evaluate the performance of the forecasting system. There were three interdependent levels in the system: (1) forecasting patient census, (2) predicting diet category census, and (3) forecasting menu-item demand. A cost function was used to assess the effectiveness of the mathematical forecasting system and of manual techniques. The costs of menu-item prediction errors resulting from the use of the exponential smoothing method and the Box–Jenkins model were about 40% less than the costs associated with the manual system.

A paper considering the unique seasonal pattern in university dining environments is given in Ref. [7]. The study determines how much the accuracy of each tested forecasting model improves when the data are seasonally adjusted. The researchers compare seasonally adjusted data with raw data and check whether seasonal adjustment improves accuracy. The data were collected at a dining facility at a Southern university during two consecutive spring semesters. The customer count data of the 2000 spring semester were used as a base for forecasting guest counts in the 2001 spring semester. The data include guest counts for dinner meals from Monday to Saturday, since the dining facility is closed on Sundays. The researchers selected six forecasting methods: the naïve model, the MA method, Simple Exponential Smoothing, Holt’s Method, Winters’ Method, and Linear Regression. More sophisticated forecasting techniques, such as Box–Jenkins or neural networks, were not tested. The accuracy of these models was assessed by Mean Squared Error (MSE), Mean Percentage Error (MPE), and MAPE. The results show that Winters’ method outperforms the other five methods when raw data are used. Seasonally adjusted data turned out to be much more effective for forecasting customer counts and significantly improved accuracy for most of the methods. All five other forecasting methods outperform the naïve model when seasonally adjusted data are used, and the MA method is the most accurate.

Overall, the main result of this study indicates that the use of seasonally adjusted data is critical for better forecasting accuracy in the case of university dining operations, where a seasonal pattern certainly occurs. Thus the researchers strongly recommend employing the MA model with seasonally adjusted data to predict the number of customers in this kind of setting.

Note, however, that the prediction of abnormal, extremely high or low demands is not considered in this study. In real situations, forecasting may be more complicated due to unexpected variables such as special events or unusual weather. Thus, the authors recommend that food service managers combine techniques such as MA with judgment from their own experience to get better forecasting results in their particular environment.
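The three accuracy measures named in this study can be computed as in the sketch below. It assumes the common convention in which the error is the actual value minus the forecast and the percentage measures divide by the actual value; the cited study may define them slightly differently. The guest counts are made-up numbers.

```python
def mse(actual, forecast):
    """Mean Squared Error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mpe(actual, forecast):
    """Mean Percentage Error; the sign shows the direction of the bias."""
    return 100.0 * sum((a - f) / a for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean Absolute Percentage Error."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical guest counts and forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
scores = (mse(actual, forecast), mpe(actual, forecast), mape(actual, forecast))
```

MPE lets positive and negative errors cancel, so it mainly indicates bias, whereas MAPE and MSE measure the overall size of the errors.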

URL: https://www.sciencedirect.com/science/article/pii/B9780128034545000171

26th European Symposium on Computer Aided Process Engineering

Grégoire Léonard, ... Diego Villarreal-Singer, in Computer Aided Chemical Engineering, 2016

1 Introduction

In the context of climate change, the European Commission has set goals to cut CO2 emissions by 80-95 % in 2050 compared to 1990 levels and up to 96-99 % in the electricity sector (European Commission, 2012). In consequence, challenges appear due to the higher share of variable renewables in the electricity grid which conflicts with the low flexibility of most conventional generators. Besides demand-response and better grid interconnections, a third solution to match demand and supply at any time consists in developing efficient means to store and release electrical energy. In a previous work (Léonard et al., 2015a), we showed that seasonal patterns appear in the storage level when electricity is produced from variable renewables only. Thus, storage technologies playing on time-scales varying from seconds to seasons are required. In the present paper, we study a power-to-fuel process in which electricity (presumably from renewable sources) is co-electrolysing water and CO2 (presumably from flue gas capture) to produce syngas that is then reacted into methanol as liquid energy vector. Liquid fuels like methanol have a high energy density, both gravimetric and volumetric (22.4 MJ/kg for methanol vs. less than 1 MJ/kg for batteries or pumped hydro storage; 17.8 MJ/L vs. 0.01 and 0.03 MJ/L for H2 or CH4 respectively). This high energy density and the stability of methanol at ambient conditions make its long-term storage and transport easy and very cheap. Moreover, methanol can be converted back to electricity (via fuel cell or combustion) or used as a fuel substitute for transportation thanks to its interesting properties compared to gasoline. Finally, if produced with renewable energies and captured CO2, methanol may also be considered as a decarbonized energy carrier, opening the way to the “Methanol Economy” discussed by Olah (2005).

Although all sub-processes of the power-to-methanol technology have been extensively studied, very few modelling works of the whole process were found. A preliminary evaluation of the combined process (Sayah et al., 2010) identified the price of water electrolysis as the limiting factor for the technology's economic viability. However, no simulation was performed in this theoretical study and no interaction between the three sub-processes was considered. Galindo Cifre and Badr (2007) performed a cost evaluation of different methanol synthesis routes, stating that using syngas from biomass gasification leads to lower methanol cost than using syngas from electrolysis and CO2 capture. Again, no process modelling was performed in this study. More recently, a study has investigated the flexibility of methanol synthesis at varying hydrogen flow rates resulting from intermittent energy sources (Fournel and Wagner, 2013). This study was performed in the framework of the French project VItESSE2, and it led to an optimized design of the methanol synthesis reactor. However, the CO2 capture and water electrolysis were not modelled in detail, so that there is still a knowledge gap about the process integration that may be achievable. Moreover, the main drawback of the power-to-fuel process is the low conversion efficiency, reaching about 50 % (LHV) in industrial conditions (Lübbehüsen and Becker, 2014). Thus, the present work proposes a model of the power-to-methanol process with focus on improving the integration of the water/CO2 co-electrolysis and methanol synthesis sub-processes. The main modelling assumptions are discussed and the results of a heat integration using the pinch method are presented in the next sections.

URL: https://www.sciencedirect.com/science/article/pii/B9780444634283503040

Modeling the data

Claudio Cobelli, Ewart Carson, in Introduction to Modeling in Physiology and Medicine (Second Edition), 2019

4.6 Modeling a single variable in response to a perturbation

The second type of situation is when we are again interested in modeling a single variable. However, this time we are viewing it as the response of a physiological system to a specific stimulus or perturbation that has been applied to that system. Examples of such situations are presented below.

4.6.1 Glucose home monitoring data

With advances in telecare, it is becoming increasingly common for patients with chronic diseases to monitor their clinical and health status at home. The data can then be transmitted to a clinical center where they can be analyzed, and appropriate advice fed back to the patient. This is particularly relevant in the case of patients with type 1 diabetes who home-monitor their blood glucose concentration.

In many cases the diabetic patient will measure their blood glucose concentration up to four times a day. One approach to analyzing these data is to make use of structured time series modeling, an approach that has been widely adopted in the analysis of economic time series data (Harvey, 1989). This involves identifying patterns in these blood glucose time series as comprising four elements:

(4.12) $G_i = f(D_i, C_i, T_i, R_i)$

where $G_i$ refers to the observed blood glucose value at time $t_i$, whilst $D_i$, $C_i$, $T_i$, and $R_i$ are the daily pattern, cyclical component, trend component, and random noise, respectively, at time $t_i$. The explicit functional relationship, $f$, used to relate these four subpatterns can take a variety of forms. The most straightforward, adopted in this example, is additive: simply adding the elements (Deutsch et al., 1994).

Trend patterns representing long-term behavior exist when there is a general increase in the blood glucose values over time. Daily patterns (which in time series analysis are generally known as seasonality) exist when the blood glucose data observed on different days fluctuate according to some daily rhythm (in other words, periodic fluctuations of constant length which repeat themselves at fixed intervals). This may be due to an internal diurnal rhythm or inadequacy in the current insulin therapy.

A cyclic pattern is similar to a seasonal pattern, but the length of a single cycle is generally longer. In the context of the diabetic patient, this could arise as a consequence of the difference between workdays and weekends, or from the menstrual cycle in women. Although randomness, by definition, cannot be predicted, once it has been isolated, its magnitude can be estimated and used to determine the extent of likely variation between actual and predicted blood glucose levels (Deutsch et al., 1994).

Fig. 4–16 depicts blood glucose data obtained from an adult patient over a 21-day period by home monitoring. During this time the patient recorded four blood glucose measurements per day, together with details of his diet and insulin regimen. In addition, he provided a brief commentary on his social activities and general well-being, information that could be helpful in interpreting the results of the time series analysis. The results of the time series analysis are shown in Fig. 4–17 (Deutsch et al., 1994).

Figure 4–16. Home-monitored blood glucose measurements collected by a diabetic patient over a period of 21 days.

Adapted from Deutsch, T., Lehmann, E. D., Carson, E. R., Roudsari, A. V., Hopkins, K. D., & Sönksen, P. H. (1994). Time series analysis and control of blood glucose levels in diabetic patients. Computer Methods and Programs in Biomedicine, 41, 167–182.

Figure 4–17. Analysis of the time series home-monitored blood glucose data in terms of (A) trend, (B) cyclical, (C) random, and (D) seasonal components.

Adapted from Deutsch, T., Lehmann, E. D., Carson, E. R., Roudsari, A. V., Hopkins, K. D., & Sönksen, P. H. (1994). Time series analysis and control of blood glucose levels in diabetic patients. Computer Methods and Programs in Biomedicine, 41, 167–182.

4.6.2 Response to drug therapy—prediction of bronchodilator response

The clinical focus of this example is chronic obstructive pulmonary disease in which a bronchodilator drug such as theophylline is administered in order to attempt to open obstructed airways. The aim is to be able to predict the response to theophylline using a simple data-driven model (Whiting, Kelman, & Struthers, 1984). This then enables the drug dosage to be adjusted in order to achieve the target therapeutic concentration.

The model used to relate the drug effect on respiratory function to concentration in the steady state was of the form:

$$\mathrm{FVC} = m\,Cp_{ss} + i \tag{4.13}$$

where FVC is the forced vital capacity, $Cp_{ss}$ is the steady-state plasma concentration of theophylline, m (l µg⁻¹ mL) represents the sensitivity of an individual to theophylline, and the intercept i is the untreated, baseline FVC. FVC values were used to assess ventilatory response since they closely reflected the extent to which the small airways were unobstructed. This linear (straight line) model of the data was assumed to provide an adequate representation over the range of concentrations encountered in the study.

Using Bayesian probability theory and maximum likelihood estimation, Whiting and colleagues were able to predict the response of an individual to any steady-state concentration of theophylline, provided that the mean values of the parameters m and i were known in a representative population of patients with chronic bronchitis.

Assuming that all the parameters are normally distributed and independent, and all measurements are also independent, the most likely set of values of m and i for an individual is obtained by minimizing the function M where:

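$$M = \frac{(m - m_{\mathrm{mean}})^2}{\sigma_m^2} + \frac{(i - i_{\mathrm{mean}})^2}{\sigma_i^2} + \sum_{j=1}^{n} \frac{(\mathrm{FVC}_j - \mathrm{FVC}_j^{\mathrm{est}})^2}{\sigma_{\mathrm{FVC}}^2} \tag{4.14}$$

and where $\sigma_m^2$ and $\sigma_i^2$ are the variances of the parameters m and i, $\sigma_{\mathrm{FVC}}^2$ is the estimated variance of FVC, and $\mathrm{FVC}_j^{\mathrm{est}}$ is the expected value of $\mathrm{FVC}_j$, given by the equation:

$$\mathrm{FVC}_j^{\mathrm{est}} = m\,Cp_{ss} + i \tag{4.15}$$

and n is the number of paired FVC–$Cp_{ss}$ measurements. The estimates of the parameters m, $\sigma_m^2$, i, and $\sigma_i^2$ were subsequently obtained using the program NONMEM.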
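Because M in Eq. (4.14) is quadratic in m and i, its minimum can be found by setting the two partial derivatives to zero and solving the resulting 2×2 linear system. The Python sketch below does this directly. It is an illustration under stated assumptions, not the NONMEM procedure used in the study: the steady-state concentration in Eq. (4.15) is read as the concentration paired with the j-th FVC measurement, and all population values and measurements in the block are hypothetical numbers invented for the example.

```python
def objective(m, i, conc, fvc, m_mean, i_mean, var_m, var_i, var_fvc):
    """Evaluate M of Eq. (4.14) for candidate individual parameters m and i."""
    prior = (m - m_mean) ** 2 / var_m + (i - i_mean) ** 2 / var_i
    data = sum((f - (m * c + i)) ** 2 for c, f in zip(conc, fvc)) / var_fvc
    return prior + data


def map_estimate(conc, fvc, m_mean, i_mean, var_m, var_i, var_fvc):
    """Return the (m, i) pair that minimizes M, from dM/dm = 0 and dM/di = 0."""
    n = len(conc)
    # Coefficients of the symmetric 2x2 system A @ [m, i] = b.
    a11 = 1 / var_m + sum(c * c for c in conc) / var_fvc
    a12 = sum(conc) / var_fvc
    a22 = 1 / var_i + n / var_fvc
    b1 = m_mean / var_m + sum(c * f for c, f in zip(conc, fvc)) / var_fvc
    b2 = i_mean / var_i + sum(fvc) / var_fvc
    det = a11 * a22 - a12 * a12  # positive whenever the variances are positive
    m = (b1 * a22 - a12 * b2) / det
    i = (a11 * b2 - a12 * b1) / det
    return m, i


# Hypothetical population values (assumptions for illustration only).
M_MEAN, I_MEAN = 0.05, 2.5          # l per (ug/mL), and l
VAR_M, VAR_I = 0.02 ** 2, 0.5 ** 2  # variances of m and i
VAR_FVC = 0.2 ** 2                  # measurement variance of FVC

# Hypothetical paired measurements for one patient.
conc = [5.0, 10.0, 15.0]            # steady-state concentration, ug/mL
fvc = [2.9, 3.3, 3.6]               # forced vital capacity, l

m_hat, i_hat = map_estimate(conc, fvc, M_MEAN, I_MEAN, VAR_M, VAR_I, VAR_FVC)
predicted_fvc = m_hat * 12.0 + i_hat  # Eq. (4.13) at a concentration of 12 ug/mL
print("m =", round(m_hat, 4), " i =", round(i_hat, 3))
print("predicted FVC at 12 ug/mL:", round(predicted_fvc, 2))
```

With no measurements the estimate falls back to the population means, and as more pairs are added the individual's own data increasingly dominate the two prior terms.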

URL: https://www.sciencedirect.com/science/article/pii/B9780128157565000047

What is a data pattern that repeats itself after a period of days, weeks, months, or quarters?

Seasonality repeats itself after a period of days, weeks, months, or quarters. The following table illustrates the six common seasonality patterns.
The method that considers several variables related to the variable being predicted is multiple regression. A weighted moving average, by contrast, uses only past values of the series itself.

What is trend and seasonality?

Trend: A long-term increase or decrease in the data. The trend can follow any function, such as linear or exponential, and can change direction over time. Seasonality: A repeating cycle in the series with a fixed frequency (hour of the day, week, month, year, etc.). A seasonal pattern has a fixed, known period.
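One simple way to check for a fixed, known period is to compare the autocorrelation of the series at the candidate seasonal lag with the autocorrelation at other lags. The Python sketch below is a minimal illustration on a noise-free synthetic series; the pattern values and the period of 4 are hypothetical choices, and real data would normally be detrended first.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


PERIOD = 4                                  # hypothetical seasonal period
pattern = [2.0, -1.0, 0.0, -1.0]            # hypothetical seasonal effects, summing to zero
series = [pattern[t % PERIOD] for t in range(21 * PERIOD)]

acf = {lag: autocorrelation(series, lag) for lag in range(1, 2 * PERIOD + 1)}
for lag, value in acf.items():
    print(lag, round(value, 3))
```

For a seasonal series the autocorrelation peaks at the seasonal lag and its multiples, whereas a pure trend produces autocorrelations that decay slowly across all lags.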

What are patterns such as growth or decline in data over time?

Trend. The trend shows the general tendency of the data to increase or decrease during a long period of time. A trend is a smooth, general, long-term, average tendency.