<2nd Digital Universe Colloquium>
- Title: Introduction to Clustering in Machine Learning
- Speaker: Behrouz Fathi Vajargah
(NRF Research Professor at Mokpo National Maritime University)
- Date/Time: September 19, 2024 (Thursday) 16:00–17:00
- Location: Building A(N4), Room 227
Below is an abstract of the talk:
Clustering is an unsupervised machine learning technique that divides the given data into different clusters based on the distances between data points. This distance also expresses how similar or dissimilar a data point is to a given cluster. Unsupervised clustering can be performed by hard algorithms such as K-means, which give each point a membership value of either 0 or 1 in a particular cluster, i.e., they assign each data point to a single cluster only. In contrast, soft clustering algorithms give each point membership values in the interval (0,1) across different clusters. This means that a particular data point of a given data set may belong to two or more clusters at the same time. Accordingly, fuzzy clustering assigns each data point a membership degree between 0 and 1 for each cluster.
Fuzzy clustering also opens up the possibility of clustering under uncertainty, which can in turn be analyzed using probability theory and uncertainty theory.
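The contrast between hard and fuzzy membership can be illustrated with a minimal sketch. It is not taken from the talk: the cluster centers are fixed by hand rather than learned, the function names are hypothetical, and the fuzzy degrees use the standard fuzzy c-means membership formula with fuzzifier m = 2, which the abstract does not specify.

```python
import math

def distances(point, centers):
    """Euclidean distance from one point to every cluster center."""
    return [math.dist(point, c) for c in centers]

def hard_membership(point, centers):
    """Hard (K-means style) assignment: 1 for the nearest center, 0 elsewhere."""
    d = distances(point, centers)
    nearest = d.index(min(d))
    return [1 if j == nearest else 0 for j in range(len(centers))]

def fuzzy_membership(point, centers, m=2.0):
    """Fuzzy c-means style membership degrees (assumed formula, fuzzifier m > 1).

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)); the degrees sum to 1.
    """
    d = distances(point, centers)
    zeros = d.count(0.0)
    if zeros:
        # The point coincides with one or more centers: share full membership
        # among them to avoid dividing by zero.
        return [1.0 / zeros if dj == 0.0 else 0.0 for dj in d]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in d) for dj in d]

# Illustrative data only: two hand-picked centers and one point.
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)

print(hard_membership(point, centers))   # all-or-nothing assignment
print(fuzzy_membership(point, centers))  # graded membership in both clusters
```

In this example the point lies at distance 1 from the first center and 3 from the second, so the hard assignment places it entirely in the first cluster, while the fuzzy degrees work out to about 0.9 and 0.1.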
Large amounts of data are collected every day from many sources, and cluster analysis is applied to them in fields such as maritime applications, satellite imaging, biomedicine, marketing, security, web search, geospatial analysis, cancer research, traffic flow, risk assessment and city planning. For example, in cancer research, clustering is used to classify patients into subgroups according to their gene expression profiles. This can be useful for identifying the molecular profile of patients with a good or bad prognosis, as well as for understanding the disease. In marketing, it is used for market segmentation, identifying subgroups of customers who have similar profiles and might be receptive to a particular form of advertising. In city planning, it is used to identify groups of houses according to their type, value and location.
In this talk, we present the clustering of data using hierarchical, K-means and fuzzy clustering algorithms. Finally, we present the performance of Monte Carlo simulation in fuzzy clustering, based on our recent research.
**Students interested in the lecture can attend even if they haven't registered, provided there are available seats in the classroom.