Hadoop with Intuitionistic Fuzzy C-Means for Clustering in Big Data

doi:10.1007/978-981-10-0767-5_62

Book Chapter

- pp 599-610

TLDR

This paper proposes a data clustering technique that combines Intuitionistic Fuzzy C-Means (IFCM) with Hadoop to produce high-quality clusters, thereby making clustering of very large data sets more efficient.

Abstract:

Industry and academia have recently been trying to address the data handling issues raised by big data. This has led to the development of new computing approaches in data mining and data analysis, which are the need of the hour. One way to handle large data is to group similar data into clusters, but clustering at this scale is itself complex. This paper proposes a data clustering technique in which Intuitionistic Fuzzy C-Means (IFCM) is used along with Hadoop to produce high-quality clusters, thereby making clustering of very large data sets more efficient. The results of the proposed algorithm are demonstrated on UCI data sets. Performance metrics such as Accuracy and the SSW, SSB, DB, DD, and SC indices are used to compare the results with those of Parallel K-means (PKM) and modified Parallel K-means (MPKM).
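The abstract does not give the update equations, so the following is only a minimal sketch of how one IFCM iteration might be split into a map step and a reduce step. It assumes the standard fuzzy c-means membership formula, a Yager-type hesitation degree with a hypothetical parameter `alpha`, and cluster-center weights raised to the fuzzifier `m`; all function names are illustrative, and plain Python functions stand in for Hadoop mapper and reducer tasks.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster."""
    dists = [math.dist(point, c) for c in centers]
    zero = [d == 0.0 for d in dists]
    if any(zero):
        # A point lying exactly on a center belongs fully to that center.
        return [1.0 / sum(zero) if z else 0.0 for z in zero]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]

def hesitation(mu, alpha=0.85):
    """Assumed Yager-type hesitation degree; non-negative for 0 < alpha <= 1."""
    return 1.0 - mu - (1.0 - mu ** alpha) ** (1.0 / alpha)

def map_point(point, centers, m=2.0, alpha=0.85):
    """Map step: emit (cluster_id, (weighted_point, weight)) for one record."""
    pairs = []
    for cluster_id, mu in enumerate(fcm_memberships(point, centers, m)):
        mu_star = mu + hesitation(mu, alpha)  # intuitionistic membership
        weight = mu_star ** m                 # assumption: FCM-style exponent
        pairs.append((cluster_id, ([weight * x for x in point], weight)))
    return pairs

def reduce_centers(pairs, old_centers):
    """Reduce step: combine the emitted pairs into new cluster centers."""
    dim = len(old_centers[0])
    sums = [[0.0] * dim for _ in old_centers]
    totals = [0.0] * len(old_centers)
    for cluster_id, (weighted_point, weight) in pairs:
        totals[cluster_id] += weight
        for j, value in enumerate(weighted_point):
            sums[cluster_id][j] += value
    # Keep the old center if a cluster received no weight at all.
    return [
        [s / totals[i] for s in sums[i]] if totals[i] > 0.0 else list(old_centers[i])
        for i in range(len(old_centers))
    ]

def ifcm_iteration(points, centers, m=2.0, alpha=0.85):
    """One full iteration: map every point, then reduce to new centers."""
    pairs = [pair for p in points for pair in map_point(p, centers, m, alpha)]
    return reduce_centers(pairs, centers)

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = ifcm_iteration(points, [(0.0, 0.2), (10.0, 10.5)])
```

Because each map call depends only on one record and the current centers, the per-point work can be distributed across nodes, while the reduce step needs only the per-cluster sums. This is the general reason a fuzzy clustering iteration fits the MapReduce model; the chapter's actual job layout may differ.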

Citations

Journal Article

Soft and Declarative Fishing of Information in Big Data Lake

Bożena Małysiak-Mrozek, +2 more

- 12 Mar 2018 -

IEEE Transactions on Fuzzy Systems

TL;DR: It is shown how fuzzy techniques can be incorporated into big data analytics carried out with the declarative U-SQL language over a big data lake located in the cloud. The solution directly addresses three characteristics of big data (volume, variety, and velocity) and indirectly addresses veracity and value.

Journal Article

Informational Paradigm, management of uncertainty and theoretical formalisms in the clustering framework

Pierpaolo D'Urso

- 01 Aug 2017 -

Information Sciences

TL;DR: It is shown how all these clustering approaches can manage, in different ways, the uncertainty associated with the two components of the Informational Paradigm, i.e., the Empirical and the Theoretical Information.

Posted Content

Informational Paradigm, Management of Uncertainty and Theoretical Formalisms in the Clustering Framework: a Review

Pierpaolo D'Urso

- 01 Nov 2017 -

viXra

TL;DR: L. A. Zadeh published "Fuzzy Sets" in 1965; this review looks back on the roughly 50 years of clustering based on fuzzy set theory that followed.

Journal Article

A Hopping Umbrella for Fuzzy Joining Data Streams From IoT Devices in the Cloud and on the Edge

Dariusz Mrozek, +3 more

- 01 May 2020 -

IEEE Transactions on Fuzzy Systems

TL;DR: A hopping umbrella is presented that fuzzifies timestamps from sensor readings while flexibly joining data streams from asynchronous IoT devices. It is able to properly join the best-matching sensor readings and, in some scenarios, to reduce the amount of data transferred to the Cloud data center without significant overhead in the resource utilization of stream processing units.

Book Chapter

Uncertainty-Based Clustering Algorithms for Large Data Sets

B. K. Tripathy, +2 more

TL;DR: This chapter presents the uncertainty-based clustering algorithms developed so far and proposes a few new algorithms that can be developed further.

References

Book

Fuzzy sets

Lotfi A. Zadeh

TL;DR: A separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.

Journal Article

Intuitionistic fuzzy sets

Krassimir T. Atanassov

- 01 Aug 1986 -

Fuzzy Sets and Systems

TL;DR: The concept of an intuitionistic fuzzy set (IFS) is defined, and various properties are proved concerning the operations and relations over sets and the modal and topological operators defined over the set of IFSs.

Book

Finding Groups in Data: An Introduction to Cluster Analysis

Leonard Kaufman, +1 more

TL;DR: An introductory text on cluster analysis that covers partitioning, fuzzy, and hierarchical clustering methods.

Book

Finding Groups in Data

Leonard Kaufman, +1 more

TL;DR: A book on finding groups in data by means of cluster analysis, including methods suited to large data sets.

Journal Article

A Cluster Separation Measure

David L. Davies, +1 more

- 01 Feb 1979 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A measure is presented that indicates the similarity of clusters, each assumed to have a data density that decreases with distance from a vector characteristic of the cluster. The measure can be used to infer the appropriateness of data partitions.
