Modeling Temperature and Precipitation in Hyderabad and Medak Using Copula

Objective: To analyze the dependence between Temperature and Precipitation in Hyderabad and Medak and to forecast Temperature. Hyderabad and Medak are districts of Telangana State, which lies in a Plateau region. Hyderabad is chosen because it is densely populated, whereas Medak is the district nearest to Hyderabad and has a relatively small population. Methods and Statistical analysis: The data, covering 1901 to 2019, were collected from the Indian Meteorological Department, Pune. Data from 1901 to 1996 were used for training and data from 1997 to 2019 for testing. Modeling is done through Copula analysis, keeping in view the strong association between Temperature and Precipitation. The Mean Absolute Percentage Error (MAPE) of the best model is found to be 0.04 for Hyderabad and 0.09 for Medak. R software and IBM SPSS Statistics version 25 were used to analyze the data and interpret the results. Findings: The best Copula is not necessarily the same for different data sets. Based on the AIC and BIC criteria, the best Copula is found to be the Rotated Gumbel 270 Copula for Hyderabad and the Frank Copula for Medak. The simulated Temperature data showed very close agreement with the testing data, as can be seen in [Figure 5 (a) & (b)]. Novelty: This type of analysis and model fitting is not found in the literature for Hyderabad and Medak, which are districts of Telangana in the Plateau region. Our fitted models can give a good prediction of Temperature in this region.


Introduction
Over the past few decades, global warming has been regarded as one of the primary reasons for irregularities in climatic conditions around the world. The increase and the erratic changes in temperature in many parts of the world are attributed to global warming, which is the main reason for disasters such as droughts, heat waves, floods and storms. Advance prediction of weather conditions, coupled with the adoption of necessary regulations, can improve people's lives. In climate studies, the association between Temperature and Precipitation is of great importance. In order to determine the interdependence between Temperature and Precipitation, identifying the best possible joint distribution of these two factors is essential.
Some studies have been carried out in this direction. Choubin et al. (1) focused on the application and evaluation of the classification and regression trees (CART) model in predicting seasonal precipitation and compared it with adaptive neuro-fuzzy inference system (ANFIS) and autoregressive integrated moving average (ARIMA) models. Choubin et al. (2) applied linear regression with two nonlinear models, the adaptive neuro-fuzzy inference system (ANFIS) and the multi-layer perceptron, to produce an ensemble forecast of seasonal precipitation for a semi-arid catchment in Iran. However, these papers deal with precipitation prediction. Pandey et al. (3) modeled the interdependency between rainfall and Temperature using Copula. Their study was carried out in Agartala (humid region) and Bikaner (arid region). Lazoglou and Anagnostopoulou (4) developed a joint distribution for the above two factors using Copula in the Mediterranean region. The studies carried out by Bezak et al. (5) in Slovenia, Yu et al. (6) in China and Shaukat et al. (7) in Pakistan also used Copula models, with Rainfall as the main factor. Dzupire et al. (8) used the same Copula analysis to identify the interdependency pattern between Temperature and Rainfall. They also stated that it is difficult to model bivariate data. Mesbahzadeh et al. (9) also used Copula analysis to model Temperature and Rainfall for an arid region.
These studies were aimed at predicting Rainfall and concentrated on Humid, Arid and Mediterranean regions. From the review of literature, it is observed that there were hardly any studies on a Plateau region and that no prediction of Temperature was reported. Our study aims at developing a bi-variate model for monthly mean Temperature and Precipitation processes which can be used to simulate the Temperature of the selected regions (Hyderabad and Medak districts of Telangana State, which is a Plateau region), taking into consideration data on a closely related variable, namely Precipitation. Copula analysis is found to be the most appropriate methodology for this purpose. The objective of the study is to develop, if possible, a single Copula model to estimate Temperature in the selected region.
The present study is organized into four sections, starting with the introduction in Section 1. Section 2 covers materials and methods: the collection of data for the selected regions, the description of the univariate distributions of Temperature and Precipitation, and the identification of the bi-variate model for Temperature and Precipitation. Section 3 covers parameter estimation for the univariate and bi-variate distribution functions using the Maximum Likelihood technique, followed by prediction. Section 4 presents conclusions on the findings, limitations and future work.

Study Area
This study is confined to the Hyderabad and Medak districts.
Hyderabad is the capital city of Telangana state. It covers an area of 625 Sq. Kms on the bank of the River Musi. Hyderabad was known as a city of Gardens and Lakes. It is a densely populated region, as it is one of the country's IT hubs, and over time migrants from all over the country have arrived in search of jobs. The resulting growth in population and pollution has caused erratic changes in Temperature and Rainfall conditions. During the recent floods, Temperature and Rainfall were not predicted accurately, which resulted in loss of life.
In the twentieth century, the Medak region was part of the Nizam State until 1948, was then merged into Hyderabad State in independent India, and is at present a district of Telangana. The district is spread over an area of 2,740.89 Sq. Kms. Although Medak is a neighboring district, it stands in contrast to Hyderabad district.

Source of Data and Descriptive Analysis
Month-wise Precipitation and Temperature data for the Hyderabad and Medak districts were collected from the Indian Meteorological Department, Pune, for the past 119 years (1901 to 2019).

Temperature and Precipitation of Hyderabad
Hyderabad has a tropical wet and dry climate bordering on a hot semi-arid climate (10). Its yearly average Temperature is 26.6 °C; monthly mean Temperatures range from 22.1 to 33.5 °C. Summer lasts from March to June, when the weather is hot, with mean Temperatures of 26 °C to 35 °C; maximum Temperatures frequently exceed 35 °C between April and June. The coolest Temperatures occur in November, December and January, when the lowest Temperature occasionally dips to 10 °C. May is the hottest month, when the daily maximum Temperature ranges from 26 to 43 °C. An oscillating cycle can be observed in the monthly mean Temperature of Hyderabad from 1901 to 2019.
The monthly average Precipitation of Hyderabad also exhibited similar cyclical behavior from 1901 to 2019. The monthly average Precipitation is found to be at its maximum from June to October every year. The mean Precipitation in the rainy season (June to October) is 217.56 mm.

Medak Temperature and Precipitation
Medak is situated at a considerable distance from the sea coast. Its climate is tropical, characterized by very warm summers and dry conditions except during the South-West monsoon season. The yearly mean Temperature is 26.74 °C; monthly mean Temperatures range from 22.2 to 32.5 °C. From March to June the climate of Medak is hot, with mean Temperatures of 28 °C to 33 °C; maximum Temperatures regularly exceed 35 °C between April and June. The hottest Temperatures occur in May and the coolest in December and January. A clear seasonal cycle can be observed in the monthly mean Temperature of this region from 1901 to 2019. The monthly average Precipitation in Medak does not exhibit a clear seasonal cycle. From June to October, the monthly average Precipitation is found to be at its maximum during the period of study. The mean Precipitation in the rainy season (June to October) is 147.32 mm.

The association between Precipitation and Temperature in Hyderabad and Medak
Precipitation and Temperature are associated because Precipitation influences soil moisture, which in turn influences the surface Temperature. This effect arises from the indirect control of soil moisture on the partitioning of latent and sensible heat fluxes (11). As the sample data show a non-Gaussian distribution, Kendall's tau correlation coefficient is used to measure the relationship between monthly Temperature and Precipitation. A negative association has been observed between Precipitation and Temperature during February to April and June to October (at the 5% significance level), which is given in [
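The rank correlation used above can be sketched in a few lines of Python. This is a minimal illustration of Kendall's tau (the tau-a form, which ignores tie corrections), not the SPSS or R routine used in the study; the function name `kendall_tau` and the sample values are hypothetical and are not the IMD data.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical September values (illustration only): warmer years are drier.
temperature = [26.1, 26.8, 25.9, 27.2, 26.5, 27.0]
precipitation = [190.0, 140.0, 210.0, 95.0, 150.0, 150.5]
tau = kendall_tau(temperature, precipitation)
print(f"Kendall's tau = {tau:.3f}")
```

A negative value, as in this toy sample, indicates that higher Temperature tends to accompany lower Precipitation.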

Analytical Methods
Copula functions are used to develop a joint probability distribution of Temperature and Precipitation for selected months to represent their association. Let X and Y denote Temperature and Precipitation, which are continuous in nature, with cumulative distribution functions $F_X(x) = \Pr(X \le x)$ and $G_Y(y) = \Pr(Y \le y)$ respectively. By Sklar's theorem (12), the joint distribution function is given by $H(x, y) = C(F_X(x), G_Y(y))$, where C is a unique function known as the Copula. As contended by Joe (13) and Nelsen (14), C characterizes the dependence between X and Y. Many Copula families are available in the literature, whose parameters control the strength of dependence between the variables (X, Y). Once the parameters of the different Copulas are estimated, it is very important to select the Copula that best represents the dependence structure between the variables of interest. Several criteria, such as the Akaike (15) and Bayesian (16) Information Criteria, are available in the literature to identify the best Copula. Information criteria are adopted here because they capture the tradeoff between bias (accuracy) and variance (complexity) in model construction. The Akaike information criterion (AIC) measures the relative goodness of fit of a statistical model. It is defined as $\mathrm{AIC} = -2\ln(L) + 2k$. Here k is the number of Copula parameters and L is the maximized value of the likelihood function of the Copula. The Bayesian information criterion (BIC) was developed by Schwarz using Bayesian formalism. It is defined as $\mathrm{BIC} = -2\ln(L) + k\ln(N)$; here N represents the sample size.
The September Temperature and Precipitation data of Hyderabad and Medak for 1901 to 2019 are taken to demonstrate the modeling. A significant negative association can be seen between Temperature and Precipitation in September for Hyderabad and Medak (Kendall's tau is −0.429 and −0.391 respectively, P-value = 0.000). Temperature has positive skewness (1.01 and 0.99) and Precipitation has kurtosis (0.93 and 1.26) for Hyderabad and Medak respectively, which indicates that the given data follow a normal distribution for Hyderabad and a non-normal distribution for Medak.
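The two selection criteria can be computed directly from a fitted model's maximized log-likelihood. The sketch below follows the definitions in the text; the candidate names and log-likelihood values are hypothetical placeholders, not results from the study, and each candidate is assumed to have a single Copula parameter (k = 1).

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood, k, n):
    """Bayesian information criterion: -2 ln(L) + k ln(N)."""
    return -2.0 * log_likelihood + k * math.log(n)


# Hypothetical maximized log-likelihoods of three one-parameter Copulas.
candidates = {"Frank": 12.4, "Gumbel 270": 13.1, "Gaussian": 11.8}
n_obs = 96  # one September value per training year, 1901 to 1996

scores = {
    name: (aic(loglik, 1), bic(loglik, 1, n_obs))
    for name, loglik in candidates.items()
}
# The preferred Copula is the one with the smallest criterion value.
best = min(scores, key=lambda name: scores[name][0])
print(best, scores[best])
```

Because all candidates here have the same number of parameters, AIC and BIC rank them identically; the two criteria can disagree only when k differs between models.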
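For readers who want to reproduce the shape statistics, the following sketch computes moment-based skewness and excess kurtosis. It is an assumption that the reported kurtosis is excess kurtosis; statistical packages such as SPSS apply small-sample corrections, so their values differ slightly from these simple moment estimators. The function name is hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment estimators: m3 / m2^1.5 and m4 / m2^2 - 3 (no bias correction)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# A symmetric toy sample has zero skewness; a Normal sample would have
# skewness and excess kurtosis both close to zero.
skew, kurt = skewness_and_excess_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(skew, kurt)
```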

Probability Distribution of Temperature
Temperature follows a predictable cycle, fluctuating about its mean, and is affected by global warming. Considering these characteristics, the Temperature process is modeled with a Normal (Gaussian) distribution for Hyderabad and a Sinh-Arcsinh (SHASH) distribution for Medak.
The Normal distribution is a two-parameter distribution whose probability density function is $$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$ Here µ and σ are the mean and standard deviation of the distribution respectively. The Sinh-Arcsinh distribution is a four-parameter distribution whose probability density function is $$f(x \mid \mu, \sigma, \vartheta, \tau) = \frac{c}{\sigma\sqrt{2\pi}\,(1+z^2)^{1/2}} \exp\left(-\frac{r^2}{2}\right),$$ where $r = \tfrac{1}{2}\{\exp[\tau \sinh^{-1}(z)] - \exp[-\vartheta \sinh^{-1}(z)]\}$, $c = \tfrac{1}{2}\{\tau \exp[\tau \sinh^{-1}(z)] + \vartheta \exp[-\vartheta \sinh^{-1}(z)]\}$ and $z = (x-\mu)/\sigma$; for $-\infty < x < \infty$, $-\infty < \mu < \infty$, $\sigma > 0$, $\vartheta > 0$, $\tau > 0$. Here µ and σ are the location and scale of the distribution. The parameters ϑ and τ determine the left-hand tail and right-hand tail of the distribution respectively.
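The two Temperature densities can be evaluated with the standard library alone. The sketch below assumes the four-parameter Sinh-Arcsinh form with two positive tail parameters as parameterized in the R package gamlss.dist (an assumption: the text names R but not the package); `nu` stands for ϑ, and the parameter values are hypothetical. A useful sanity check is that SHASH with ϑ = τ = 1 reduces to the Normal density.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def shash_pdf(x, mu, sigma, nu, tau):
    """Sinh-Arcsinh density; nu shapes the left tail, tau the right tail."""
    z = (x - mu) / sigma
    s = math.asinh(z)
    r = 0.5 * (math.exp(tau * s) - math.exp(-nu * s))
    c = 0.5 * (tau * math.exp(tau * s) + nu * math.exp(-nu * s))
    scale = sigma * math.sqrt(2.0 * math.pi) * math.sqrt(1.0 + z * z)
    return c * math.exp(-0.5 * r * r) / scale


def integrate(f, a, b, n=20000):
    """Composite trapezoid rule, used only to check that a density sums to 1."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Hypothetical parameters on a degrees-Celsius scale.
mu, sigma = 26.0, 1.0
area = integrate(lambda x: shash_pdf(x, mu, sigma, 0.8, 1.5), mu - 40.0, mu + 40.0)
print(f"SHASH density integrates to {area:.6f}")
```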

Distribution of Precipitation
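The Precipitation process is modeled with a Skew Normal type 2 distribution for Hyderabad and a Reverse Gumbel distribution for Medak.
The Skew Normal type 2 distribution is a three-parameter distribution whose probability density function is $$f(x \mid \mu, \sigma, \vartheta) = \frac{2\vartheta}{\sigma\sqrt{2\pi}\,(1+\vartheta^2)} \left\{ \exp\left[-\tfrac{1}{2}(\vartheta z)^2\right] I(x < \mu) + \exp\left[-\tfrac{1}{2}(z/\vartheta)^2\right] I(x \ge \mu) \right\},$$ where $z = (x-\mu)/\sigma$, $\sigma > 0$ and $\vartheta > 0$. Here µ and σ are the location and scale of the distribution. The parameter ϑ determines the tail of the distribution. The Reverse Gumbel distribution is a two-parameter distribution whose probability density function is $$f(x \mid \mu, \sigma) = \frac{1}{\sigma} \exp\left\{-\frac{x-\mu}{\sigma} - \exp\left[-\frac{x-\mu}{\sigma}\right]\right\},$$ where $-\infty < x < \infty$; here µ is the location ($-\infty < \mu < \infty$) and σ is the scale ($\sigma > 0$) of the distribution.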
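The two Precipitation densities can be sketched in the same way. The forms below assume the Skew Normal type 2 and Reverse Gumbel parameterizations of the R package gamlss.dist (an assumption, as above); `nu` stands for ϑ and all parameter values are hypothetical. The numerical checks illustrate that µ in the Reverse Gumbel is a location rather than the mean: the mean is µ plus about 0.5772 σ.

```python
import math


def sn2_pdf(x, mu, sigma, nu):
    """Skew Normal type 2: two half-normal pieces joined at mu."""
    z = (x - mu) / sigma
    c = 2.0 * nu / (math.sqrt(2.0 * math.pi) * (1.0 + nu * nu))
    scaled = nu * z if x < mu else z / nu
    return (c / sigma) * math.exp(-0.5 * scaled * scaled)


def reverse_gumbel_pdf(x, mu, sigma):
    """Reverse Gumbel (maximum extreme value) density."""
    z = (x - mu) / sigma
    if z < -700.0:  # exp(-z) would overflow; the density is effectively zero
        return 0.0
    return math.exp(-z - math.exp(-z)) / sigma


def integrate(f, a, b, n=20000):
    """Composite trapezoid rule for quick numerical checks."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Hypothetical parameters on a millimetre scale.
sn2_area = integrate(lambda x: sn2_pdf(x, 150.0, 60.0, 2.0), -450.0, 1350.0)
rg_mean = integrate(lambda x: x * reverse_gumbel_pdf(x, 150.0, 50.0), -350.0, 2150.0)
print(f"SN2 area = {sn2_area:.5f}, Reverse Gumbel mean = {rg_mean:.3f}")
```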
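The Archimedean Gumbel Copula is defined as $$C(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\},$$ and $\varphi(t) = (-\ln t)^{\theta}$ is its generator, where t varies from 0 to 1 and $\theta \ge 1$. Here, the Gumbel parameter θ is given by $\theta = \frac{1}{1-\tau}$, where τ is the Kendall's tau correlation between the variables. The Archimedean Frank Copula is defined as $$C(u, v) = -\frac{1}{\theta} \ln\left[1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1}\right],$$ and its generator is $\varphi(t) = -\ln\left[\frac{e^{-\theta t} - 1}{e^{-\theta} - 1}\right]$, where t varies from 0 to 1 and $-\infty < \theta < \infty$, $\theta \ne 0$. Here, the Frank Copula parameter θ is given implicitly by $\tau = 1 - \frac{4}{\theta}\left[1 - D_1(\theta)\right]$, where $D_1(\theta) = \frac{1}{\theta}\int_0^{\theta} \frac{t}{e^{t} - 1}\,dt$ is the first-order Debye function.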
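The Copula families named in this section, and the link between their parameter and Kendall's tau, can be sketched as follows. Two points are assumptions rather than statements from the text: the 270-degree rotation is written in one common convention, C270(u, v) = u − C(u, 1 − v), which turns the Gumbel family's positive dependence into negative dependence (software packages differ in rotation and sign conventions), and the Frank parameter is recovered from tau by numerically inverting the standard Debye-function relation.

```python
import math


def gumbel_cdf(u, v, theta):
    """Gumbel Copula, theta >= 1 (theta = 1 is independence)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = abs(math.log(u)) ** theta + abs(math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))


def gumbel_270_cdf(u, v, theta):
    """Assumed rotation convention: C270(u, v) = u - C(u, 1 - v)."""
    return u - gumbel_cdf(u, 1.0 - v, theta)


def frank_cdf(u, v, theta):
    """Frank Copula, theta != 0; negative theta gives negative dependence."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def gumbel_theta_from_tau(tau):
    """theta = 1 / (1 - tau), valid for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)


def frank_tau(theta, steps=2000):
    """tau = 1 - (4 / theta) * (1 - D1(theta)), D1 by the midpoint rule."""
    if abs(theta) < 1e-8:
        return 0.0
    h = theta / steps
    integral = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        integral += t / math.expm1(t)
    d1 = integral * h / theta
    return 1.0 - 4.0 / theta * (1.0 - d1)


def frank_theta_from_tau(tau, lo=-50.0, hi=50.0):
    """Bisection on frank_tau, which increases monotonically in theta."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Kendall's tau reported in the text for Medak in September.
theta_medak = frank_theta_from_tau(-0.391)
print(f"Frank theta implied by tau = -0.391: {theta_medak:.3f}")
```

This moment-style inversion only gives a starting value; the parameter estimates reported in the study come from maximum likelihood.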

Analysis
In this study, the parameters of the model are estimated using the Inference Functions for Margins (IFM) method. This estimation uses the likelihood functions of the marginal distributions and the Copula density function. Given
$$H(x, y) = C\big(F(x; t_1), G(y; t_2); \theta\big),$$
where F is the Normal (Hyderabad) or Sinh-Arcsinh (Medak) distribution of Temperature and G is the Skew Normal type 2 (Hyderabad) or Reverse Gumbel (Medak) distribution of Precipitation.
Here, $t_1 = (\mu_1, \sigma_1)$ and $t_2 = (\mu_3, \sigma_3, \vartheta_3)$ for Hyderabad, and $t_1 = (\mu_2, \sigma_2, \vartheta_2, \tau_2)$ and $t_2 = (\mu_4, \sigma_4)$ for Medak. C is the Rotated Gumbel 270 degree Copula for Hyderabad and the Frank Copula for Medak. The density function of the Copula is defined as $c(u, v) = \frac{\partial^2 C(u, v)}{\partial u\, \partial v}$; then the bi-variate joint probability density function is $h(x, y) = c\big(F(x; t_1), G(y; t_2); \theta\big)\, f(x; t_1)\, g(y; t_2)$, where f and g are the marginal densities. If $z_{j1}, z_{j2}, \ldots, z_{jn}$ is a sample of size n from margin j (j = 1, 2), the log likelihood of margin j is a univariate function given as $L_j(t_j) = \sum_{i=1}^{n} \ln f_j(z_{ji}; t_j)$, with $f_1 = f$ and $f_2 = g$. The log likelihood function of the Copula density reduces to $L_c(\theta) = \sum_{i=1}^{n} \ln c\big(F(x_i; \hat{t}_1), G(y_i; \hat{t}_2); \theta\big)$. To get the estimates $(\hat{t}_1, \hat{t}_2)$, the log likelihood functions $L_j$ are maximized independently. The sample data of Hyderabad and Medak show a negative correlation (−0.429 and −0.391) between Temperature and Precipitation.
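The two-stage logic of the Inference Functions for Margins method can be illustrated end to end on synthetic data. This is a sketch under simplifying assumptions, not the study's R workflow: both margins are taken to be Normal so that stage 1 has a closed-form maximum likelihood solution, the Copula is Frank, the data are simulated with a hypothetical true parameter of −4, and stage 2 uses a simple golden-section search.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def sample_frank(n, theta, rng):
    """Draw (u, v) pairs from a Frank Copula by conditional inversion."""
    d = math.expm1(-theta)
    eps = 1e-12
    pairs = []
    for _ in range(n):
        u = min(max(rng.random(), eps), 1.0 - eps)
        w = min(max(rng.random(), eps), 1.0 - eps)
        a = math.exp(-theta * u)
        v = -math.log1p(w * d / (w + a * (1.0 - w))) / theta
        pairs.append((u, min(max(v, eps), 1.0 - eps)))
    return pairs


def frank_log_density(u, v, theta):
    """Log of the Frank Copula density c(u, v; theta)."""
    if abs(theta) < 1e-9:
        return 0.0  # independence limit
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    return math.log(num / den)


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


# Synthetic "Temperature" and "Precipitation" with hypothetical parameters.
rng = random.Random(2019)
true_theta = -4.0
pairs = sample_frank(2000, true_theta, rng)
temperature = [NormalDist(26.5, 0.8).inv_cdf(u) for u, _ in pairs]
precipitation = [NormalDist(150.0, 50.0).inv_cdf(v) for _, v in pairs]

# Stage 1: maximize each marginal log likelihood separately (closed form here).
fit_t = NormalDist(fmean(temperature), pstdev(temperature))
fit_p = NormalDist(fmean(precipitation), pstdev(precipitation))

# Stage 2: plug the fitted margins into the Copula log likelihood.
u_hat = [fit_t.cdf(x) for x in temperature]
v_hat = [fit_p.cdf(y) for y in precipitation]


def copula_loglik(theta):
    return sum(frank_log_density(u, v, theta) for u, v in zip(u_hat, v_hat))


theta_hat = golden_max(copula_loglik, -20.0, 20.0)
print(f"estimated theta = {theta_hat:.3f} (true value {true_theta})")
```

Separating the stages keeps each optimization low-dimensional, which is the main practical appeal of the method over maximizing the full joint likelihood at once.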

Estimation of Parameters
To estimate the parameters of the marginal distributions of Temperature and Precipitation in both the Hyderabad and Medak districts, we used the Maximum Likelihood technique; the results are presented in [Tables 2 and 3, Table 4]. The probability density function (pdf) and cumulative distribution function (cdf) of the Copula distributions can be seen in [Figure 3]. Further, for the identified best Copula model for Temperature, the MAPE is found to be 0.04 for Hyderabad and 0.09 for Medak. This implies that these models can simulate Temperature values with 96% and 91% accuracy respectively. Hence, the simulated values of Temperature for the testing data of both districts are computed. The observed and simulated values for the testing period are presented in [Figure 5 (a) & (b)]. The variation in the Hyderabad data is relatively higher than in the Medak data, and the higher the variation in the observed data, the greater the accuracy in simulating future data. Hence, for Medak the agreement between simulated and observed data is less close.
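The error measure quoted above can be computed as follows. The sketch treats MAPE as a fraction (so 0.04 corresponds to 4%) and reads "accuracy" as 1 − MAPE, which matches the 96% and 91% figures in the text; the observed and simulated values are hypothetical.

```python
def mape(observed, simulated):
    """Mean absolute percentage error, expressed as a fraction."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need two non-empty samples of equal length")
    errors = [abs((o - s) / o) for o, s in zip(observed, simulated)]
    return sum(errors) / len(errors)


# Hypothetical monthly mean Temperatures in degrees Celsius.
observed = [26.0, 27.0, 25.0, 28.0]
simulated = [27.3, 25.92, 26.0, 26.6]

error = mape(observed, simulated)
accuracy = 1.0 - error
print(f"MAPE = {error:.3f}, accuracy = {accuracy:.1%}")
```

Note that MAPE is undefined when an observed value is zero, which is not a concern for Temperatures in this range but would be for months with no Precipitation.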

Conclusion
Rotated Gumbel 270 Copula for Hyderabad and Frank Copula for Medak are found to be the best models to identify the type of dependency between Temperature and Precipitation. The simulated data for testing period of Temperature has shown good agreement with observed data.
The variation in the Hyderabad data is relatively higher than in the Medak data, and the higher the variation in the observed data, the greater the accuracy in simulating future data. Hence, for Medak the agreement between simulated and observed data is less close.
Though the best joint Copula is identified for both Temperature and Precipitation, only Temperature can be simulated using these models. The associated variable Precipitation cannot be simulated.
In the present study, homoscedasticity is assumed. The literature survey found no such study for Hyderabad and Medak. Further, no model for the situation considered in this study was found in the literature for comparison. As the model is able to give predictions with 96% accuracy and clear agreement is seen with the testing data, we felt that a comparison with any other model may be redundant.
Interdependent Temperature and Precipitation can be modeled most appropriately using marginals through the Copula method. In our analysis, the dependency between Temperature and Precipitation is computed using Kendall's tau. The relation between these two variables is found to be negative in both districts.
The best joint model of Temperature and Precipitation is found to be the bi-variate Rotated Gumbel 270 distribution for Hyderabad and Frank Copula for Medak, based on AIC and BIC criteria. The models are able to explain 96% of the variation hidden in the observed bi-variate data. Hence, it is concluded that it can be used for predicting the future Temperature.
In a country like India, climatic conditions vary even between the southern and northern regions, so it is not possible to use the same model in both regions. Similarly, as climatic conditions vary across regions of the globe, it is very difficult to develop a single model that can be used to predict temperature for any region. Although we analyze bivariate data and identify the best bi-variate Copula distribution, prediction can be carried out only for the variable Temperature, which acts independently, and not for the closely associated variable Precipitation.
A similar analysis can be carried out for the remaining Temperature and Precipitation observatory centers in the two Telugu states (Telangana and Andhra Pradesh).