Tran Khanh Dang, Duc Minh Chau Pham and Duc Dan Ho
Abstract
Purpose
Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.
Design/methodology/approach
The data modification problem requires careful examination, in which suspicious data are re-collected and the two datasets are overlapped to verify their reliability. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that combines anomaly detection with weights of importance. The target is to significantly reduce the amount of data in need of verification while still guaranteeing their high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.
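The selection-and-overlap idea described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the authors' implementation: the record fields (`id`, `price`), the scoring rule and the budget parameter are all assumptions made for the example.

```python
# Hypothetical sketch of the re-collection scheme: records are ranked by
# an anomaly score combined with an importance weight, the top fraction is
# re-crawled, and the two copies are overlapped to flag modifications.
# Field names and the scoring rule are illustrative only.

def select_for_recollection(records, score, weight, budget):
    """Return the ids of the `budget` records with the highest
    combined anomaly-score x importance-weight ranking."""
    ranked = sorted(records, key=lambda r: score(r) * weight(r), reverse=True)
    return [r["id"] for r in ranked[:budget]]

def verify_by_overlap(original, recollected):
    """Overlap the original and re-collected copies of each record and
    report the ids whose payload no longer matches (possible modification)."""
    recollected_by_id = {r["id"]: r for r in recollected}
    tampered = []
    for r in original:
        fresh = recollected_by_id.get(r["id"])
        if fresh is not None and fresh["price"] != r["price"]:
            tampered.append(r["id"])
    return tampered
```

Records with the highest combined anomaly score and importance weight are re-crawled first, so the verification budget is spent where fraud is most likely and most damaging.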
Findings
The authors examine several techniques for detecting anomalies in user and product data, which achieve an accuracy of approximately 80 per cent. The integration with the weight selection model is also shown to detect more than 80 per cent of the existing fraudulent records while being careful not to accidentally flag legitimate ones, especially when the proportion of frauds is high.
Originality/value
With the rapid development of e-commerce, fraud detection on e-commerce data, as well as in Web crawling systems, is a new and necessary research topic. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.
Abstract
Purpose
In the digital age, organizations want to build a more powerful machine learning model that can serve the increasing needs of people. However, enhancing privacy and data security is one of the challenges for machine learning models, especially in federated learning. Parties want to collaborate with each other to build a better model, but they do not want to reveal their own data. This study aims to introduce threats and defenses to privacy leaks in the collaborative learning model.
Design/methodology/approach
In the collaborative model, the attacker can be the central server or a participant. In this study, the attacker is on the side of a participant who is “honest but curious.” The attack experiments are on the participant's side, who performs two tasks: the first is to train the collaborative learning model; the second is to build a generative adversarial network (GAN) model, which performs the attack to infer more information from the parameters received from the central server. There are three typical types of attacks: white box, black box without auxiliary information and black box with auxiliary information. The experimental environment is set up with PyTorch on the Google Colab platform running on a graphics processing unit, using the Labeled Faces in the Wild and CIFAR-10 (Canadian Institute For Advanced Research) data sets.
Findings
The paper assumes that the privacy leakage attack resides on the participant's side and that the information in the parameter server contains more knowledge than is needed to train the collaborative machine learning model. This study compares the success level of inference attacks from model parameters based on GAN models. Three GAN models are used in this method: conditional GAN, controllable GAN and Wasserstein GAN (WGAN). Of these three models, the WGAN model has proven to obtain the highest stability.
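The stability result for WGAN is commonly attributed to its critic objective, which the following minimal sketch illustrates. This is not the authors' code: it is a plain-Python rendering of the Wasserstein critic loss and the weight-clipping step that approximates the Lipschitz constraint, with toy scalar weights standing in for network parameters.

```python
# WGAN differs from a standard GAN in two ways sketched here:
# (1) the critic minimizes E[score(fake)] - E[score(real)] over
#     unbounded real-valued scores, and
# (2) the 1-Lipschitz constraint on the critic is approximated by
#     clipping every weight into [-c, c] after each update.

def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: mean fake score minus mean real score."""
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

def clip_weights(weights, c=0.01):
    """Clip each (toy scalar) weight into [-c, c] after a critic update."""
    return [max(-c, min(c, w)) for w in weights]
```

Because the loss is a difference of means rather than a saturating log-likelihood, its gradient stays informative even when the critic easily separates real from generated samples, which is one common explanation for WGAN's training stability.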
Originality/value
Concerns about privacy and security for machine learning models are increasingly important, especially for collaborative learning. The paper contributes experimental results on a privacy attack from the participant's side in the collaborative learning model.
Tran Khanh Dang and Thu Anh Duong
Abstract
Purpose
In the open data context, the shared data could come through many transformation processes and originate from many sources, which exposes the risk of non-authentic data. Moreover, each data set has different properties and is shared under various licenses, which means the updated data could change its characteristics and related policies. This paper aims to introduce an effective and elastic solution to keep track of data changes and manage their characteristics within the open data platform. These changes have to be immutable to avoid unauthorized modification and could be used as certified provenance to improve the quality of data.
Design/methodology/approach
This paper proposes a pragmatic solution that combines the Comprehensive Knowledge Archive Network (CKAN), the most widely used open data platform, with the Hyperledger Fabric blockchain to ensure that all the changes are immutable and transparent. By using smart contracts plus a standard provenance data format, all processes run automatically and could be extended to integrate with other provenance systems, so the introduced solution is flexible enough to be used in different open data ecosystems and real-world application domains.
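The immutability property that the design delegates to Hyperledger Fabric can be illustrated with a simplified hash chain. This is a hypothetical sketch of the idea only, not Fabric chaincode; the record fields mirror a generic provenance triple of dataset, agent and activity, and are assumptions made for the example.

```python
import hashlib
import json

# Simplified sketch of ledger immutability: each provenance record (who
# changed which dataset, and how) is chained to the previous record by a
# hash, so any later modification breaks the chain. The real system
# delegates this to Hyperledger Fabric smart contracts.

def append_record(chain, dataset_id, agent, activity):
    """Append a provenance record linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"dataset": dataset_id, "agent": agent,
            "activity": activity, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def chain_is_valid(chain):
    """Recompute every hash; any tampered record invalidates the chain."""
    prev = "0" * 64
    for rec in chain:
        body = {k: rec[k] for k in ("dataset", "agent", "activity", "prev")}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True
```

Any later edit to an archived record changes its recomputed hash and breaks every link after it, which is how a ledger of this kind exposes unauthorized modifications.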
Findings
The research surveys related studies about provenance systems. This study finds that most of them focus on the commercial sector or apply to a specific domain, and are not relevant for the open data sector. To show that the proposed solution is a logical and feasible direction, this paper conducts an experiment to validate the result. The testing model runs successfully with an elastic system architecture and promising overall performance.
Originality/value
Open data is the future of many businesses but still does not receive enough attention from the research community. The paper contributes a novel approach to protect the provenance of open data.
Vaclav Snasel, Tran Khanh Dang, Josef Kueng and Lingping Kong
Abstract
Purpose
This paper aims to review in-memory computing (IMC) for machine learning (ML) applications from the history, architectures and options aspects. In this review, the authors investigate different architectural aspects and collect and provide comparative evaluations.
Design/methodology/approach
The authors collect over 40 recent IMC papers related to hardware design and optimization techniques, then classify them into three optimization categories: optimization through the graphics processing unit (GPU), optimization through reduced precision and optimization through hardware accelerators. The authors then summarize those techniques in aspects such as which data sets they are applied to, how they are designed and what the contribution of each design is.
Findings
ML algorithms are potent tools accommodated on IMC architecture. Although general-purpose hardware (central processing units and GPUs) can supply explicit solutions, their energy efficiency is limited because of their excessive flexibility support. On the other hand, hardware accelerators (field-programmable gate arrays and application-specific integrated circuits) win on the energy efficiency aspect, but an individual accelerator often adapts exclusively to a single ML approach (family). From a long hardware evolution perspective, hardware/software co-design on heterogeneous hybrid platforms is an option for researchers.
Originality/value
IMC’s optimization enables high-speed processing, increases performance and supports the analysis of massive volumes of data in real time. This work reviews IMC and its evolution. The authors then categorize three optimization paths for the IMC architecture to improve performance metrics.
Tran Khanh Dang, Tuyen Thi Kim Le, Anh Tuan Dang and Ha Duc Son Van
Abstract
Purpose
The paper aims to propose a flexible framework to support the X-STROWL model. The eXtensible Access Control Markup Language (XACML) is an international standard used for access control in distributed systems. However, XACML and its existing extensions are not sufficient to fulfill sophisticated security requirements (e.g. access control based on users’ roles, context-aware authorizations and the ability of reasoning). Remarkably, X-STROWL, a generalized extension of XACML for the spatiotemporal role-based access control (RBAC) model with reasoning ability, is a comprehensive model that overcomes these shortcomings. This paper mainly focuses on the architecture design, as well as the implementation and evaluation of the proposed framework and its comparison with others.
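To make the spatiotemporal RBAC idea concrete, the following is a minimal hypothetical sketch in Python. It is not the X-STROWL engine, which extends XACML policies; the roles, permissions, time windows and zones here are invented for illustration.

```python
from datetime import time

# Hypothetical spatiotemporal RBAC decision: a role grants a permission
# only when the request falls inside the role's allowed time window and
# location zone. All entries below are illustrative assumptions.
ROLE_CONSTRAINTS = {
    # role: (permissions, (start, end) time window, allowed zones)
    "nurse": ({"read_record"}, (time(8, 0), time(18, 0)), {"ward_A"}),
    "doctor": ({"read_record", "write_record"},
               (time(0, 0), time(23, 59)), {"ward_A", "ward_B"}),
}

def is_permitted(role, permission, at, zone):
    """Permit only if role, permission, time and location all match."""
    if role not in ROLE_CONSTRAINTS:
        return False
    perms, (start, end), zones = ROLE_CONSTRAINTS[role]
    return permission in perms and start <= at <= end and zone in zones
```

A full XACML engine expresses the same checks as policies, rules and conditions evaluated by a policy decision point, with the spatiotemporal attributes carried in the access request.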
Design/methodology/approach
Based on the concept of the X-STROWL model, the paper reviews a large number of open-source XACML implementations against defined criteria and chooses the most suitable framework to extend for the implementation. The paper also presents a case study used to evaluate the research result.
Findings
The Holistic Enterprise-Ready Application Security Architecture Framework (HERAS-AF) is chosen as the most suitable framework to extend to implement the X-STROWL model. Extending HERAS-AF to support the spatiotemporal aspect and other contextual conditions, as well as integrating security in the access request, together with the ability of reasoning for hierarchical roles, are striking features that enable the proposed framework to meet more sophisticated security requirements than others.
Research limitations/implications
Due to the research content, the performance of the proposed framework is not the focus of this work.
Originality/value
The proposed framework is a crucial contribution of the authors’ research toward a holistic, extensible and intelligent authorization decision engine.
Tran Khanh Dang and Tran Tri Dang
Abstract
Purpose
By reviewing different information visualization techniques for securing web information systems, this paper aims to provide a foundation for further studies of the same topic. Another purpose of the paper is to discover directions in which there is a lack of extensive research, thereby encouraging more investigations.
Design/methodology/approach
The related techniques are classified first by their locations in the web information systems architecture: client side, server side, and application side. Then the techniques in each category are further classified based on attributes specific to that category.
Findings
Although there is much research on information visualization for securing the web browser user interface and server-side systems, there are very few studies of the same techniques on the web application side.
Originality/value
This paper is the first published paper to extensively review information visualization techniques for securing web information systems. The classification used here offers a framework for further studies as well as in-depth investigations.
Tran Tri Dang and Tran Khanh Dang
Abstract
Purpose
The purpose of this paper is to propose novel information visualization and interaction techniques to help security administrators analyze past web form submissions, with the goals of searching, inspecting, verifying, and understanding about malicious submissions.
Design/methodology/approach
The authors utilize well-known visual design principles in the techniques to support the analysis process. They also implement a prototype and use it to investigate simulated normal and malicious web submissions.
Findings
The techniques can increase analysts' efficiency by displaying large amounts of information at a time, help analysts detect certain kinds of anomalies, and support the analyzing process via provided interaction capabilities.
Research limitations/implications
Due to resource constraints, the authors experimented on simulated data only, not real data.
Practical implications
The techniques can be used to investigate past web form submissions, which is a first step in analysing and understanding the current security situation and attackers' skills. The knowledge gained from this process can be used to plan an effective future defence strategy, e.g. by improving/fine-tuning the attack signatures of an automatic intrusion detection system.
Originality/value
The visualization and interaction designs are the first visual analysis technique for security investigation of web form submissions.
Khanh Tran Dang, Nhan Trong Phan and Nam Chan Ngo
Abstract
Purpose
The paper aims to resolve three major issues in location-based applications (LBA) known as heterogeneity, user privacy, and context-awareness by proposing an elastic and open design platform named OpenLS privacy-aware middleware (OPM) for LBA.
Design/methodology/approach
The paper analyzes relevant approaches from both academia and the mobile industry community and emphasizes the importance of heterogeneity, user privacy, and context-awareness to the development of LBA.
Findings
The paper proposes the OPM by design. The OPM consists of two main components, named the application middleware and the location middleware, which function cooperatively to achieve the above goals. In addition, the paper presents the implementation of the OPM as well as its experiments. Notably, two privacy-preserving techniques at two different levels are integrated into the OPM: the Memorizing algorithm at the application level and the Bob-tree at the database level. Last but not least, the paper discusses other problems and improvements that might be needed for the OPM.
Research limitations/implications
Each issue has sub-problems that further influence the OPM. Besides, each of the issues requires a more in-depth investigation to obtain better detailed solutions. Therefore, more comprehensive experiments should be conducted to assure the OPM's scalability and effectiveness.
Practical implications
The paper is expected to promote and speed up the development of LBA by providing the OPM with suitable application programming interfaces and conforming to the OpenLS standard.
Originality/value
This paper is valuable to location-based service (LBS) providers in developing their applications and proposes the OPM as a unified solution dealing with heterogeneity, user privacy, and context-awareness in the world of LBS.