Search published articles


Showing 4 results for Classification

Nasser Behnampour, Ebrahim Hajizadeh, Shahriar Semnani, Farid Zayeri,
Volume 1, Issue 2 (10-2013)
Abstract

Background & objective:

A common aim of medical research is to determine the factors that affect the occurrence of an event. Because risk factors interact, regression models, discriminant analysis and classification procedures are used. These models rely on assumptions that medical data usually do not satisfy, so alternative methods are needed. Given the diversity of risk factors for esophageal cancer, this article introduces the classification and regression tree and applies it to determine risk factors for esophageal cancer in Golestan province.

Methods:

The data were gathered from a case-control study. The case group comprised all confirmed cases of esophageal cancer in Golestan province during one year: 90 male and 60 female subjects. Two control groups were considered for each case. Controls were selected from the patients' families and neighbors and matched for age, sex, ethnicity and place of residence. The data were analyzed with a classification and regression tree (CRT) model in R. The Gini criterion was used to select the best split at each node, and ROC analysis was used to assess the accuracy of the CRT model.
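
As a rough illustration of split selection with the Gini criterion, the sketch below computes the Gini impurity of a node and picks the threshold that minimizes the weighted impurity of the two child nodes. It is a minimal sketch, not the authors' R analysis: the function names and the toy data (a binary smoking indicator and case/control labels) are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Find the threshold t (rule: value <= t) with the lowest weighted child impurity.

    Returns (threshold, weighted_gini). If no split improves on the parent
    node, the threshold is None and the parent impurity is returned.
    """
    n = len(labels)
    best = (None, gini(labels))
    # The largest value is skipped because it would leave the right child empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data, not taken from the study.
smoker = [0, 0, 0, 1, 1, 1]
status = ["control", "control", "control", "case", "case", "control"]
threshold, impurity = best_binary_split(smoker, status)
```

A tree-growing procedure would repeat this search over every candidate variable at each node and recurse into the two children until a stopping rule is met.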

Results:

Results of the classification tree model showed that exposure to CT and X-ray dye (socio-environmental factors), unwashed hands after defecation, history of smoking (lifestyle factors) and family history of cancer (ethnic factors) can be effective in the occurrence of esophageal cancer.

Conclusion:

Tree models do not require prior assumptions to be met in order to build the model, and their results are easy to interpret. These are two essential benefits that make them useful in the medical sciences.


Alireza Abadi, Bagher Pahlavanzade, Keramat Nourijelyani, Seyed Mostafa Hosseini,
Volume 3, Issue 1 (5-2015)
Abstract

Background & Objective: The inability to measure exposure exactly is a common problem in epidemiological studies, especially cross-sectional studies. Depending on the extent of misclassification, results may be affected. Existing methods for solving this problem require a lot of time and money and are not practical for some exposures. Recently, new methods have been proposed for 1:1 matched case–control studies that solve these problems to some extent. The present study aimed to extend the existing Bayesian method to adjust for misclassification in matched case–control studies with 1:2 matching.

Methods: The standard Dirichlet prior distribution for a multinomial model was extended so that prior information on the exposure–disease odds ratio (OR) parameter could be brought into the model separately from the other parameters. Information in the literature on the association between exposure and disease was used as prior information about the OR. To correct for misclassification, a sensitivity analysis was performed and results were obtained under three Bayesian models.
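
For orientation, the sketch below shows only the standard, unextended Dirichlet-multinomial update that the method starts from: with a Dirichlet(alpha) prior and observed cell counts n, the posterior is Dirichlet(alpha + n). It does not implement the authors' extended prior or their misclassification adjustment. The four cells, the flat prior and the counts are hypothetical.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior + multinomial counts -> Dirichlet(alpha + n)."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior and counts must have the same number of cells")
    return [a + n for a, n in zip(prior_alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: each parameter divided by their sum."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example: four multinomial cells with a flat Dirichlet(1, 1, 1, 1) prior.
prior = [1.0, 1.0, 1.0, 1.0]
counts = [30, 10, 15, 45]  # made-up cell counts, not data from the study
posterior = dirichlet_posterior(prior, counts)
posterior_mean = dirichlet_mean(posterior)
```

Under a flat prior the posterior means stay close to the observed proportions, whereas an informative prior pulls the estimates toward the prior.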

Results: The results of the naïve Bayesian model were similar to those of the classic model. The second Bayesian model, which employed prior information about the OR, was heavily affected by this information.

The third, proposed model provided the maximum bias adjustment for the risk of heavy metals, smoking and drug abuse. This model showed that heavy metals are not an important risk factor, although the raw model (classic logistic regression) detected this exposure as a factor influencing the incidence of lung cancer. Sensitivity analysis showed that the third model is robust to different levels of sensitivity and specificity.

Conclusion: The present study showed that, although the results of the second and third models were similar for most exposures, the proposed model was able to correct the misclassification to some extent.


Arezoo Bagheri, Mahsa Saadati,
Volume 3, Issue 2 (10-2015)
Abstract

Background and Objective: Discriminant analysis and logistic regression are classical methods for classifying data in many studies. However, these models do not yield valid results when their necessary assumptions are not met. The purpose of this study was to classify the number of Children Ever Born (CEB) using a decision tree model in order to present an efficient method for classifying demographic data.

Methods: In the present study, a CART tree model with the Gini splitting rule was fitted to classify the number of CEB in the fertility behavior of ever-married women aged 15-49 in Semnan in 2012. The survey sample comprised 405 women aged 15-49.

Results: Women in the first and second birth cohorts who had married at an early age had 3 CEB, while those who had married at an older age had 2 CEB. Women in the third birth cohort who had married at an early age and were employed had 2 CEB, while unemployed women in this cohort whose marriages were familial and non-familial had 0 and 1 CEB, respectively. Women in the third birth cohort who had married at an older age had 1 CEB.

Conclusion: Important advantages of the CART model include simplicity of interpretation, the use of distribution-free measures, and the handling of missing data and outliers when constructing trees, all of which have increased the usage of this method. Therefore, this method is a suitable way of classifying demographic data compared with classical modeling methods when the necessary assumptions are not met.


Fatemeh Bagheri, , ,
Volume 3, Issue 2 (10-2015)
Abstract

Background and objectives: Investigating mortality in a population is considered one of the appropriate methods for assessing health. However, there are problems such as a lack of confidence in the accuracy of measurement and the quality of data collection. The establishment of death registration systems, the use of international classification codes of diseases, and the integration of mortality data by the responsible organizations have solved a large part of these problems. In this study, based on a set of parameters, the study population was divided into two groups: deceased under one year (infants) and over one year (adults). Both groups were then clustered using the K-means method to identify different groups. Hidden models and useful patterns were also discovered using decision tree algorithms. Finally, a neural network algorithm was used to rank the attributes in order of importance.

Methods: In this research, data on 12,865 deceased individuals in Golestan province from 2007 to 2009 were studied. The data were obtained from the Health Center of Golestan province. The main characteristics used were the age of the deceased, gender, cause of death, place of residence and place of death. The K-means algorithm was used to cluster the data. Decision tree algorithms and a neural network algorithm were used for classification. Finally, results and rules were extracted. Because the causes of death differ in nature between infants and adults, the two groups were studied separately.

Results: In the clustering phase, the optimal number of clusters was determined using the Dunn index: eight clusters for infants and seven clusters for adults. Among four decision tree algorithms (C5.0, QUEST, CHAID and CART), C5.0 was the best classifier, with correct classification rates of 77.37% on the infant data and 96.86% on the adult data. Age, gender and place of death were the most important variables detected by the neural network algorithm.
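
To make the role of the Dunn index concrete, the sketch below computes one common variant: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, where higher values indicate more compact, better separated clusters. This is a minimal sketch under assumptions. The abstract does not say which variant of the index or which distance measure was used, and the points and labels are hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Dunn index: minimum between-cluster distance / maximum within-cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest pairwise distance inside any single cluster (0.0 if all are singletons).
    max_diameter = max(
        (dist(a, b) for members in clusters.values() for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest pairwise distance between points belonging to different clusters.
    min_separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0.0:
        return float("inf")
    return min_separation / max_diameter


# Hypothetical 2-D points with two alternative labelings of the same data.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
good = dunn_index(points, [0, 0, 1, 1])  # compact, well separated clusters
bad = dunn_index(points, [0, 1, 0, 1])   # stretched, interleaved clusters
```

To choose the number of clusters, one would run K-means for each candidate k, compute the index for each resulting labeling, and keep the k with the highest value.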

Conclusion: In the present study, the collected mortality data was clustered by considering the effective factors and the standard of International Classification of Diseases. The hidden patterns of mortality for infants and adults were extracted. Due to the explicit nature and the intelligibility of the decision tree algorithms, the results and extracted rules are very useful for specialists in this field.




© 2024 CC BY-NC 4.0 | Jorjani Biomedicine Journal
