https://tanmiyat.uomosul.edu.iq/index.php/stats/issue/feed IRAQI JOURNAL OF STATISTICAL SCIENCES 2025-08-11T10:04:35+00:00 Assistant Professor Dr. Heyam A. Hayawi [email protected] Open Journal Systems <p>Iraqi Journal of Statistical Sciences (<strong>IQJOSS</strong>) is a scientific, open-access journal published twice a year by the College of Computer Science and Mathematics, University of Mosul, Iraq. iThenticate is used to prevent plagiarism and to ensure the originality of submitted manuscripts. A double-blind peer-review system is also used to assure the quality of the publication. The Iraqi Journal of Statistical Sciences was established in 2005 and publishes original research and review papers in the fields of statistical science, mathematics, and computing.</p> https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29247 Using Wavelets to Identify Linear Dynamic Models 2025-08-11T10:04:07+00:00 Youns M. Th. Al.Obeady - Heyam Abd Al-Majeed Hayawi [email protected] Mohamed Ahmed Elkhouli - <![CDATA[The forecasting of water purification in the city of Mosul was studied using input and output variables represented by tests performed on raw water before filtration, which is then treated through multiple filtration stages. To determine the safety of the water for human consumption, the filtration process was studied by forecasting with dynamic models, including the autoregressive model with additional inputs, the autoregressive moving average model with additional inputs, the output error model, and the Box-Jenkins model. 
The best model obtained from the data was identified using statistical criteria, and the models were then compared using forecasting criteria applied to the water data.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29249 Parallel Algorithm for Calculating the Integration 2025-08-11T10:04:11+00:00 Ehab abdulrazak Alasadi [email protected] <![CDATA[This paper analyzes and implements a parallel algorithm to calculate the integral of the function y = 1/e^x over a specified interval. The design and implementation use C/C++ and are based on sharing memory among threads, using the Pthread library. An output file is used to print information. The purpose of using the POSIX library is to run the program faster than on a single core by involving a set of threads, where each thread is treated as a processor. This accelerates the solution of complex problems that need a large amount of memory, where time sharing is managed with mutex lock and unlock operations. The research examines the use of parallel programs and their memory-sharing technique to solve large, complex problems that take a long time to execute. 
With parallel programs, each thread takes a particular part of the problem and solves it, and the results are combined, reducing the execution time and increasing the speedup of the system according to the speedup equation S = T1/Tn.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29248 Nonparametric Estimation Method for the Distribution Function Using Various Types of Ranked Set Sampling 2025-08-11T10:04:08+00:00 Ramy Saad Ghareeb [email protected] Rikan Abd AL Azeez AL khalidi - <![CDATA[The purpose of this research is to estimate the cumulative distribution function using local polynomial regression and to compare it with parametric estimation by the method of moments and the maximum likelihood method, calculating both the mean square error and the bias under ranked set sampling and median ranked set sampling. Ranked set sampling frequently produces more exact estimates than simple random sampling for the same sample size. By ranking samples based on some easily measurable characteristic, the variability within each set is decreased, resulting in more accurate estimates. We investigated three different degrees of local polynomial regression: the first, second, and third. The simulation analysis demonstrated that the second degree outperforms the other degrees. Also, when is used to analyze data, it takes advantage of the reduced variability within each ranked set, resulting in more precise and reliable regression function estimates. Following that, we investigated several bandwidth values (0.1, 0.2, and 0.9) and found, based on a simulation study, that the bandwidth 0.8 is superior to the others. Finally, we analyzed the relative efficiency of each of the three approaches: , , and , and we found that is more efficient than the other methods for estimating the in different kernels (normal (Gaussian) and Epanechnikov). 
The numerical results show that the suggested estimator based on is more efficient than the other methods, as predicted by the simulation analysis.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29261 g_*^*-I-Closed Sets and Their Properties in Ideal Topological Space 2025-08-11T10:04:35+00:00 Rughzai Shwan Mahamood - Darwesh Halgwrd Mohammed - <![CDATA[Many research papers deal with different types of generalized closed sets. Levine [4] introduced generalized closed (briefly, -closed) sets and studied their basic properties, and Veera Kumar [5] introduced -closed sets in topological spaces. The purpose of the present paper is to define a new class of generalized ideal closed sets, called -closed sets, by using the -open set. In this paper, we introduce the -closed sets and give characterizations and properties of -closed sets, their complements, and other related sets. We prove that the class of closed sets lies between the class of -closed sets and the class of -closed sets. We also find some relations between -closed sets and already existing closed sets. The -open neighborhood is introduced and its properties are investigated.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29254 A Comparative Study of K-means Clustering Algorithms Using Euclidean and Manhattan Distance for Climate Data. 2025-08-11T10:04:22+00:00 Bakhshan Ahmed Hamad [email protected] <![CDATA[The K-means clustering algorithms (Random, K-means++, Canopy, and Farthest First) are unsupervised machine learning techniques designed to group data points based on their similarities. The study examined the effects of clustering algorithms and distance metrics on the analysis of climate data from meteorological stations in the Kurdistan Region of Iraq (2020-2022). 
An 8-attribute dataset with 1,095 cases was clustered using the Random, K-means++, Canopy, and Farthest First methods and evaluated with the Euclidean and Manhattan distance metrics in WEKA, a versatile and accessible open-source tool for machine learning and data mining. It features a user-friendly interface, a wide range of algorithms, robust pre-processing and visualization tools, and cross-platform compatibility. Focusing on efficiency and on reducing variation within clusters, the results revealed that with Euclidean distance, all algorithms formed two clusters. Canopy required the most iterations, Farthest First the fewest. K-means++ was the fastest, Canopy the slowest. WCSS values were similar, with Random and Canopy scoring lowest. With Manhattan distance, all algorithms again formed two clusters. Canopy required the most iterations, Farthest First required the fewest and was the fastest, while Random was the slowest. WCSS differences were negligible, with Random, Canopy, and Farthest First performing best. Graphs illustrate the differences in cluster distribution, iterations, execution time, and WCSS. Euclidean distance yielded lower WCSS, while interactive maps revealed clearer cluster distributions for most attributes compared to Manhattan distance. Overall, Euclidean distance produced the lowest within-cluster sum of squared errors compared to the Manhattan distance.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29250 Determining Climate Extremes of Mosul Weather Using Robust Noise Clustering Strategy. 2025-08-11T10:04:12+00:00 Marwan Moysar AL-Hyali [email protected] Bashar A. AL-Talib - <![CDATA[The paper aims to analyse the climatic data of the city of Mosul during the summer season from 2013 to 2022, focusing on the maximum temperature variable. 
Modern methods that had not previously been used to detect climate fluctuations were applied and adapted to the study data in order to explore general and extreme climate patterns that previous studies have not addressed. Variable clustering techniques were used to discover the latent components according to the local groups model. The "K+1" noise group strategy was used to identify high-noise variables. The researcher proposed a wide format for ordering the data: P > N, which means that the number of variables is greater than the number of observations. The observations represented the study years; the variables were the summer days of three months (June, July, and August). This arrangement proved suitable for the variable aggregation technique for high-dimensional data. The results showed six groups, five of which were almost homogeneous. The five clusters indicate different patterns of maximum temperature increases during the summer. The first cluster highlights heat waves in mid-summer (July and August), while the second cluster focuses on the hot ends of summer (late June and August). The third cluster refers to early and continuous heat waves in June and July, while the fourth cluster reflects persistent heat in late July and August. The fifth cluster shows a variation in temperatures between the beginning and the end of summer. The excluded noise variables represent inconsistent data or outliers that did not belong to any cluster. Excluding them contributes to improving the accuracy of climate models. The results highlight characteristic climatic patterns and provide recommendations for strengthening environmental and agricultural planning.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29259 Using the State Space Model based on ARIMA Model for Air Temperature Forecasting. 
2025-08-11T10:04:31+00:00 Suha Saleem Mahjoob - Osamah Basheer Shukur [email protected] <![CDATA[Highly accurate temperature forecasts are very important for controlling environmental damage such as desertification and drought of water resources, and for managing the use of renewable and clean energy. Using the multiplicative seasonal autoregressive integrated moving average (SARIMA) model to forecast under uncertainty in the modeling process, especially with nonlinear data such as minimum temperatures, lowers the quality of the forecasts because ARIMA is a linear model. The main aim of this study is to improve the quality of minimum temperature forecasts by using methods better suited to modeling data with uncertainty. In this study, the minimum temperature data for Mosul and Baghdad are used as a case study. The state space (SS) model is built on the ARIMA model, giving what can be called the hybrid ARIMA-SS model, which is used to solve the uncertainty problem caused by the nonlinearity of the temperature data, a problem that may otherwise make the forecasts inaccurate. Also, climate data often suffer from heterogeneity, especially in non-tropical regions, due to the large difference between the hot and cold seasons. Time stratification (TS) is used to solve the problem of data heterogeneity. In the ARIMA-SS hybrid method, ARIMA is used only to specify the input of the SS model. In this study, the SS model was used as a statistical method for estimating and forecasting the state space. The SS method combines observations with current forecast values using weights that reduce biases and errors. The ARIMA-SS hybrid model was used to handle uncertainty and improve minimum temperature forecasting. 
The performance of the ARIMA model and the ARIMA-SS hybrid model is compared to determine which of them produces more accurate forecasts. The results showed that the ARIMA-SS hybrid model outperformed the ARIMA model and produced more accurate forecasts. Therefore, it is possible to conclude that the ARIMA-SS hybrid model can yield better forecasting accuracy for the minimum temperature than the traditional ARIMA model.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29256 Estimating Outliers Using the Iterative Method in Partial Least Squares Regression Analysis for Linear Models. 2025-08-11T10:04:26+00:00 Taha H Ali [email protected] Mahammad Mahmoud Bazid - <![CDATA[Outliers affect the accuracy of the estimated parameters of the partial least squares regression model and give unacceptably large residual values. Traditional robust methods (used in ordinary least squares) cannot be used to treat outliers when estimating the partial least squares regression model because the number of independent variables is greater than the sample size. Therefore, an iterative method is proposed to treat outliers and estimate the parameters of the partial least squares regression model. The iterative method relies on identifying outliers, estimating them using the initial estimated values and the residuals, and determining the optimal value that gives the smallest sum of squared errors for the partial least squares regression model. To illustrate the proposed method, simulated and real data were used with a MATLAB program designed for this purpose. 
The proposed method provided accurate results for the partial least squares regression model parameters according to the MSE criterion and addressed the problem of outliers.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29255 Comparison of Estimation Methods for the Parameters of the Frechet Distribution - using Simulation. 2025-08-11T10:04:24+00:00 Ahmed Husham Mohammed [email protected] Montadher Jumaa Mahdi - <![CDATA[Probability distributions are mathematical functions that describe the likelihood of different outcomes in a random process, and the estimates of their scale and shape parameters depend on the type of data, which determines the appropriate probability distribution. In this paper, an experimental study is presented to compare a number of estimation methods for the parameters of the Frechet distribution, which is one of the most important probability distributions for modeling failure times. The estimation methods adopted were the maximum likelihood, moments, and Bayesian methods. The comparison was carried out by simulation, with sample sizes n = 15, 25, 50, 75, 100 and four assumed values for each of the shape parameter (1.1, 1.5, 2, 2.5) and the scale parameter (1.4, 1.8, 2.3, 3). Through this method, the paper was able to determine the appropriate estimation method by adopting the mean square error criterion. The experimental results showed the superiority of the Bayes method, followed by the maximum likelihood method.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29257 Group Variable Selection Methods with Quantile Regression: A Simulation Study. 2025-08-11T10:04:27+00:00 Hussein A. 
Hashem [email protected] <![CDATA[In many cases, covariates have a grouping structure that can be used in the analysis to identify important groups and the significant members of those groups. This paper reviews some group variable selection methods that utilize quantile regression. The study compares seven previously proposed group variable selection methods, namely the group Lasso estimate, the quantile group Lasso (median group Lasso) estimate, the quantile group adaptive Lasso estimate, the sparse group Lasso estimate, the group SCAD estimate, the group MCP estimate, and the group GEL estimate, through a simulation study. The simulation study helps determine which methods perform best in all linear regression scenarios.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29252 Proposed Quality Control Charts Using Haar Wavelet Coefficients for Enhanced Production Monitoring 2025-08-11T10:04:17+00:00 Taha H Ali [email protected] Sarah Bahrooz Ameen - <![CDATA[One main problem of traditional quality control charts, such as the Individual Observations Chart and the Moving Average Chart, is that they do not focus on monitoring the differences in the produced materials. To address this issue, the researchers propose new charts based on the Haar wavelet that could sharpen this focus and better handle the data noise affecting the accuracy of traditional charts. The proposed charts are based on the Haar wavelet transform. One chart records the average of individual observations (approximation coefficients, or low-pass filter) while the other monitors the variations among these observations (detail coefficients, or high-pass filter). For the first time, the universal threshold method for treating data noise was used to create the control limits in the proposed charts. 
The researchers used both simulated and real data to develop these charts using MATLAB software. The study demonstrated the accuracy and efficiency of the proposed charts, their success in handling data noise, and their sensitivity in detecting minor changes that may occur in the production process.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29258 Estimation of Delay Time in Linear Dynamic Systems Using Wavelets 2025-08-11T10:04:29+00:00 Heyam A.A Hayawi - Reem Talal Taha [email protected] Mohamed Ahmed Elkhouli - <![CDATA[This research explores the use of wavelets in estimating delay time in stochastic linear dynamic systems, as delay time plays a crucial role in diagnosing the system by determining the time interval between inputs and outputs. Several simulation experiments were conducted, utilizing one type of wavelet, the Haar wavelet, for data processing. Subsequently, various methods for estimating delay time were applied, and the results were compared. The findings indicate that estimating delay time using the Haar wavelet yielded better results when applied to an autoregressive model with additional inputs compared to the unprocessed data. The research aims to employ the Haar wavelet in estimating the delay time in stochastic linear dynamic models using several estimation methods and to compare the results based on simulation experiments.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29251 Comparison between Logistic Regression and K-Nearest Neighbour Techniques with Application on Thalassemia Patients in Mosul 2025-08-11T10:04:14+00:00 Mohammed Faris Al jbory [email protected] Hutheyfa Hazam Taha - <![CDATA[Thalassemia is a genetic disease that is transmitted from parents to children when both parents are carriers of the genetic mutation. 
This mutation leads to a decrease in the number, quality, and condition of red blood cells and an increase in the rate of red blood cell damage, which leads to iron accumulation in the body and a decrease in hemoglobin in the blood. This project aims to develop a model to predict thalassemia using the K-nearest neighbor technique and the logistic regression model, compared on the model evaluation criteria of accuracy, recall, precision, F1-score, and AUC. The data were obtained from Al-Hadbaa Specialized Hospital in Mosul. The data set included 280 observations, of which 149 (53.21%) were thalassemia intermedia and 131 (46.79%) were thalassemia major. The data were divided into 70% for training and 30% for testing. The experimental results showed that the logistic regression model performed better than the K-nearest neighbor algorithm, with a precision of 96%, recall of 98%, and F1-score of 97% in the thalassemia intermedia category, and a precision of 97%, recall of 95%, and F1-score of 96% in the thalassemia major category, indicating that logistic regression performed well in distinguishing between these two categories. Logistic regression was shown to be more effective than the K-nearest neighbor algorithm in classifying thalassemia patients, especially those with thalassemia major. The study showed that the type of distance used in the K-nearest neighbor algorithm, whether "Manhattan" or "Chebyshev", has a significant impact on the accuracy of predictions, with the highest accuracy reaching 95% when K = 4. It was also shown that the choice of distance calculation method and of the K value plays a major role in improving the classification results, with K = 4 identified as the optimal value. The researcher suggests increasing the data size, since this may improve the accuracy of the models. 
In addition, the researcher recommends using other artificial intelligence techniques, especially neural networks, to check for any additional improvements.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29260 Data Envelopment Analysis Models to Measure the Relative and Scale Efficiency of Educational Institutions (Tikrit University as A Model) 2025-08-11T10:04:33+00:00 Omar Ibrahem AL-Sultany [email protected] Oday Abdulrahman Jarjies - <![CDATA[The research measured the relative efficiency and scale efficiency of the colleges of Tikrit University for the academic year 2019-2020 using the data envelopment analysis (DEA) method, which is one of the linear programming methods for measuring the productive efficiency of institutions and economic units. The constant returns to scale (CCR) model and the variable returns to scale (BCC) model were used under both input-oriented and output-oriented measures. To achieve the objectives of the study, the 21 colleges of Tikrit University were selected, and three inputs were identified (the number of registered students, the number of teaching staff, and the number of employees) along with two outputs (the number of graduates and the number of published research papers, seminars, and conferences). The research reached several results, the most important of which is that 9 colleges achieved relative efficiency in the CCR model and 11 colleges in the BCC model under both input and output orientations. 
The research also addressed the procedures and reforms that inefficient colleges need in order to reach efficiency, and identified the reference colleges that each inefficient college should emulate in order to reach full efficiency.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES https://tanmiyat.uomosul.edu.iq/index.php/stats/article/view/29253 Bayesian Time Series Modelling with Wavelet Analysis for Forecasting Monthly Inflation 2025-08-11T10:04:19+00:00 Taha H Ali [email protected] Heyam A.A Hayawi - Hunar Adam Hamza - <![CDATA[This article treats data noise and outliers in Bayesian ARIMA models through wavelet analysis. The discrete wavelet transform is applied using Daubechies and Symlets wavelets of orders 10 and 15 to decompose the data of the Bayesian ARIMA models into their frequency components. The wavelet coefficients are then thresholded using soft thresholding, with the threshold selected via Stein's unbiased risk estimate. Simulation experiments were used together with real data representing the monthly inflation in the Kurdistan Region of Iraq (2009-2024), with a forecast for the next ten months. The proposed wavelet-based Bayesian ARIMA method provides a robust framework for handling noisy time series data and offers significant improvements over classical methods, making it an appealing choice for practical applications in time series forecasting, particularly when dealing with outliers and noise.]]> 2025-05-01T00:00:00+00:00 Copyright (c) 2025 IRAQI JOURNAL OF STATISTICAL SCIENCES