Big Data Analytic: From Accuracy to Interpretability

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289).

Deadline for manuscript submissions: closed (15 February 2018) | Viewed by 22130

Special Issue Information

Dear Colleagues,

The primary disadvantage of Big Data analytics using high-performance classifiers and Deep Learning is that they have no clear declarative representation of knowledge. In addition, current Big Data analytics has considerable difficulty generating the necessary explanation structures, which limits its full potential, because the ability to provide detailed characterizations of classification strategies would promote its acceptance. Expert systems benefit from a clear declarative representation of knowledge about the problem domain; therefore, a natural means to elucidate the knowledge embedded within neural networks (NNs), support vector machines (SVMs), evolutionary computation (EC) and their hybrids is to extract symbolic rules. Surprisingly, however, very little such work has been conducted in relation to Big Data analytics. Bridging this gap could be expected to contribute to the real-world utility of Big Data analytics.

Rule extraction from NNs, SVMs, EC and their hybrids can also be considered an optimization problem because it involves a clear trade-off between accuracy and interpretability: although a larger number of rules typically provides better accuracy, it also reduces interpretability. Rule extraction from NNs, SVMs, EC and their hybrids therefore remains an area in need of further innovation.
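The accuracy-interpretability trade-off described above can be made concrete with a toy sketch. In pedagogical rule extraction, a trained classifier is treated as a black box, and rule sets of increasing size are scored by their fidelity, i.e., how often they agree with the black box. The decision function and rules below are invented for illustration and are not taken from any paper in this issue:

```python
# Toy sketch of the accuracy-interpretability trade-off in rule
# extraction (illustrative only; the decision function and rules
# below are invented, not drawn from any specific method).
import itertools

def black_box(x0, x1):
    """Stand-in for a trained classifier's decision function."""
    return 1 if 0.7 * x0 + 0.3 * x1 > 0.5 else 0

# Axis-parallel IF-THEN rule sets of increasing size.
rule_sets = {
    "1 rule":  lambda x0, x1: 1 if x0 > 0.7 else 0,
    "2 rules": lambda x0, x1: 1 if x0 > 0.7 or (x0 > 0.4 and x1 > 0.6) else 0,
}

def fidelity(rules):
    """Fraction of grid points on which the rules agree with the black box."""
    grid = [(i / 20, j / 20) for i, j in itertools.product(range(21), repeat=2)]
    agree = sum(rules(x0, x1) == black_box(x0, x1) for x0, x1 in grid)
    return agree / len(grid)

fid = {name: fidelity(rules) for name, rules in rule_sets.items()}
print(fid)  # the larger rule set tracks the black box more closely
```

The larger rule set achieves higher fidelity to the black box at the cost of being harder to read, which is exactly the dilemma the special issue addresses.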

Potential topics include, but are not limited to, the following:

  • Big Data analytics using machine learning and computational intelligence
  • Machine learning and computational intelligence applied to transparency of deep learning networks
  • Big Data analytics for medical, financial, and industrial big data
  • Rule extraction from decision tree ensembles and forests
  • Accuracy-interpretability dilemma: high performance classifiers versus rule extraction

Prof. Dr. Yoichi Hayashi
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (3 papers)


Research

19 pages, 797 KiB  
Article
A Rule Extraction Study from SVM on Sentiment Analysis
by Guido Bologna and Yoichi Hayashi
Big Data Cogn. Comput. 2018, 2(1), 6; https://doi.org/10.3390/bdcc2010006 - 02 Mar 2018
Cited by 15 | Viewed by 4355
Abstract
A natural way to determine the knowledge embedded within connectionist models is to generate symbolic rules. Nevertheless, extracting rules from Multi Layer Perceptrons (MLPs) is NP-hard. With the advent of social networks, techniques for Sentiment Analysis have attracted growing interest, but rule extraction from connectionist models in this context has rarely been performed because of the very high dimensionality of the input space. To fill this gap, we present a case study on rule extraction from ensembles of Neural Networks and Support Vector Machines (SVMs), the purpose being the characterization of the complexity of the rules on two particular Sentiment Analysis problems. Our rule extraction method is based on a special Multi Layer Perceptron architecture for which axis-parallel hyperplanes are precisely located. Two datasets representing movie reviews are transformed into Bag-of-Words vectors and learned by ensembles of neural networks and SVMs. Rules generated from ensembles of MLPs are less accurate and less complex than those extracted from SVMs. Moreover, a clear trade-off appears between rules’ accuracy, complexity and covering. For instance, if rules are too complex, less complex rules can be re-extracted by sacrificing some of their accuracy. Finally, rules can be viewed as feature detectors in which very often only one word must be present and a longer list of words must be absent. Full article
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)
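The Bag-of-Words step described in the abstract can be sketched in a few lines (the reviews below are toy data invented for illustration; the paper's actual corpora are movie-review datasets):

```python
# Minimal Bag-of-Words sketch: each review becomes a vector of word
# counts over a shared vocabulary (toy data, for illustration only).
reviews = ["a great great film", "a dull film", "not a great film"]

# Vocabulary: every distinct word across the corpus, in sorted order.
vocab = sorted({word for review in reviews for word in review.split()})

def bag_of_words(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = [0] * len(vocab)
    for word in text.split():
        counts[vocab.index(word)] += 1
    return counts

vectors = [bag_of_words(r) for r in reviews]
print(vocab)    # shared vocabulary
print(vectors)  # one count vector per review
```

On real review corpora the vocabulary runs to tens of thousands of words, which is the high input dimensionality that makes rule extraction difficult in this setting.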

13 pages, 854 KiB  
Article
Reimaging Research Methodology as Data Science
by Ben Kei Daniel
Big Data Cogn. Comput. 2018, 2(1), 4; https://doi.org/10.3390/bdcc2010004 - 12 Feb 2018
Cited by 14 | Viewed by 8236
Abstract
The growing volume of data generated by machines, humans, software applications, sensors and networks, together with the associated complexity of the research environment, requires immediate pedagogical innovations in academic programs on research methodology. This article draws insights from a large-scale research project examining the current conceptions and practices of academics (n = 144) involved in the teaching of research methods in research-intensive universities in 17 countries. The data were obtained through an online questionnaire. The main findings reveal that many academics involved in teaching research methods courses tend to teach the same classes for many years, in the same way, despite the changing nature of data and the complexity of the environment in which research is conducted. Furthermore, those involved in teaching research methods courses are predominantly volunteer academics, who tend to view the subject only as an “add-on” to their other teaching duties. It was also noted that universities mainly approach the teaching of research methods courses as a “service” to students and departments, not as part of the core curriculum. To deal with the growing changes in data structures and the technology-driven research environment, the study recommends that institutions reimage research methodology programs to enable students to develop the competences needed to work with complex and large amounts of data and the associated analytics. Full article
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)

15 pages, 1111 KiB  
Article
Big Data Processing and Analytics Platform Architecture for Process Industry Factories
by Martin Sarnovsky, Peter Bednar and Miroslav Smatana
Big Data Cogn. Comput. 2018, 2(1), 3; https://doi.org/10.3390/bdcc2010003 - 26 Jan 2018
Cited by 25 | Viewed by 8641
Abstract
This paper describes the architecture of a cross-sectorial Big Data platform for the process industry domain. The main objective was to design a scalable analytical platform that will support the collection, storage and processing of data from multiple industry domains. Such a platform should be able to connect to the existing environment in the plant and use the data gathered to build predictive functions to optimize the production processes. The analytical platform will contain a development environment with which to build these functions, and a simulation environment to evaluate the models. The platform will be shared among multiple sites from different industry sectors. Cross-sectorial sharing will enable the transfer of knowledge across different domains. During the development, we adopted a user-centered approach to gather requirements from different stakeholders, which were used to design architectural models from different viewpoints, from contextual to deployment. The deployed architecture was tested in two process industry domains, one from aluminium production and the other from the plastic-molding industry. Full article
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)
