Search results

1 – 10 of over 40,000

Article
Publication date: 2 July 2018

Daniil Igorevich Mikhalchenko, Arseniy Ivin and Dmitrii Malov

Abstract

Purpose

Single image depth prediction allows depth information to be extracted from an ordinary 2D image without the use of special sensors such as laser sensors, stereo cameras, etc. The purpose of this paper is to solve the problem of obtaining depth information from a 2D image by applying deep neural networks (DNNs).

Design/methodology/approach

Several experiments and topologies are presented: a DNN that uses three inputs, namely a sequence of 2D images from a video stream, and a DNN that uses only one input. However, no data set exists that contains a video stream together with the corresponding depth map for every frame, so a technique for creating such data sets using the Blender software is presented in this work.
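
As a rough illustration of the single-input case, the sketch below builds a toy encoder-decoder depth network in PyTorch; the topology, layer sizes and input resolution are illustrative assumptions, not the authors' architecture.

```python
# Minimal single-image depth prediction sketch (PyTorch).
# The topology below is an illustrative assumption, not the paper's network.
import torch
import torch.nn as nn

class DepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: downsample the RGB image into a feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: upsample back to a one-channel depth map.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DepthNet()
rgb = torch.randn(1, 3, 128, 128)   # stand-in for a rendered Blender frame
depth = model(rgb)                  # predicted depth map, shape (1, 1, 128, 128)
```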

Findings

Owing to the insufficient amount of available data sets, the problem of overfitting was encountered. Although the created models work on the data sets, they remain overfitted and cannot predict a correct depth map for random images that were not included in the data sets.

Originality/value

Existing techniques for creating depth images are tested using DNNs.

Details

International Journal of Intelligent Unmanned Systems, vol. 6 no. 3
Type: Research Article
ISSN: 2049-6427

Article
Publication date: 23 November 2020

Chengjun Chen, Zhongke Tian, Dongnian Li, Lieyong Pang, Tiannuo Wang and Jun Hong

Abstract

Purpose

This study aims to monitor and guide the assembly process. Operators need to change the assembly process according to product specifications during the manual assembly of mass-customized products. Traditional information inquiry and display methods, such as manually looking up assembly drawings or electronic manuals, are inefficient and error-prone.

Design/methodology/approach

This paper proposes a projection-based augmented reality system (PBARS) for assembly guidance and monitoring. The system includes a projection method based on viewpoint tracking, in which the position of the operator's head is tracked and the projected images are changed correspondingly. The assembly monitoring phase applies a method for parts recognition. First, the pixel local binary pattern (PX-LBP) operator is obtained by merging the classical LBP operator with a pixel classification process. Afterward, the PX-LBP features of the depth images are extracted, and a randomized decision forest classifier is used to obtain the pixel classification prediction image (PCPI). Parts recognition and assembly monitoring are then performed by PCPI analysis.
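
The sketch below illustrates the general pixel-classification pattern using the classical LBP operator (scikit-image) and a random forest (scikit-learn); the paper's own PX-LBP operator and trained forests are not reproduced here, and the data are toy stand-ins.

```python
# Pixel classification sketch: classical LBP features on a depth image fed to
# a random forest, as a stand-in for the paper's PX-LBP operator (assumption).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
depth = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # toy depth image
labels = (depth > 127).astype(int)                           # toy per-pixel part labels

# Classical LBP with 8 neighbours at radius 1; the paper's PX-LBP additionally
# merges the pixel classification process into the operator.
lbp = local_binary_pattern(depth, P=8, R=1, method="uniform")

X = lbp.reshape(-1, 1)   # one LBP code per pixel as the feature
y = labels.ravel()

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# The per-pixel predictions form the pixel classification prediction image (PCPI).
pcpi = clf.predict(X).reshape(depth.shape)
```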

Findings

The projected image changes with the viewpoint of the operator; hence, the operators always perceive the three-dimensional guiding scene correctly from different viewpoints, improving human-computer interaction. Parts recognition and assembly monitoring were achieved by comparing the PCPIs, whereby missing and erroneous assembly steps can be detected online.

Originality/value

This paper designs a PBARS to monitor and guide the assembly process simultaneously, with potential applications in mass-customized production. Parts recognition and assembly monitoring based on pixel classification provide a novel method for assembly monitoring.

Article
Publication date: 15 February 2022

Xiaojun Wu, Peng Li, Jinghui Zhou and Yunhui Liu

Abstract

Purpose

Scattered parts are laid randomly during the manufacturing process and are difficult to recognize and manipulate. This study aims to grasp such scattered parts with a manipulator using a camera and a learning method.

Design/methodology/approach

In this paper, a cascaded convolutional neural network (CNN) method for robotic grasping based on monocular vision and a small data set of scattered parts is proposed. This method can be divided into three steps: object detection, monocular depth estimation and keypoint estimation. In the first stage, an object detection network is improved to effectively locate the candidate parts. The second stage comprises a neural network structure and a corresponding training method that learn and reason over high-resolution input images to obtain a depth estimate. In the third step, keypoint estimation is expressed as a cumulative multi-scale prediction from a network that uses the red green blue depth (RGB-D) map acquired from the object detection and depth estimation stages. Finally, a grasping strategy is studied to achieve successful and continuous grasping. In the experiments, different workpieces are used to validate the proposed method; the best grasping success rate is more than 80%.
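
A minimal data-flow sketch of such a three-stage cascade follows; every stage is a stub standing in for a trained network, and the box coordinates, tensor shapes and keypoint count are assumptions.

```python
# Data-flow sketch of the three-stage cascade (detection -> depth -> keypoints).
# Every function is a placeholder; the real models are the paper's own.
import torch

def detect_parts(rgb):
    # Stage 1 (assumption): return one candidate box (x1, y1, x2, y2) per part.
    return [(16, 16, 112, 112)]

def estimate_depth(rgb):
    # Stage 2 (assumption): monocular depth with the same spatial size as rgb.
    return torch.rand(1, 1, *rgb.shape[-2:])

def estimate_keypoints(rgbd_crop):
    # Stage 3 (assumption): accumulate multi-scale heatmaps into keypoints.
    return torch.rand(4, 2)  # e.g. four grasp-relevant keypoints (x, y)

rgb = torch.rand(1, 3, 128, 128)
depth = estimate_depth(rgb)
rgbd = torch.cat([rgb, depth], dim=1)   # RGB-D map used by the keypoint stage

for x1, y1, x2, y2 in detect_parts(rgb):
    crop = rgbd[..., y1:y2, x1:x2]
    kps = estimate_keypoints(crop)
    # A grasp pose would be scored from kps; the >80% success rate reported in
    # the paper refers to the authors' trained models, not this skeleton.
```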

Findings

By using the CNN-based method to extract the keypoints of the scattered parts and calculating the probability of a successful grasp, the success rate is increased.

Practical implications

This method and robotic system can be used for picking and placing in most automated industrial manufacturing or assembly processes.

Originality/value

Unlike standard parts, scattered parts are laid randomly and are difficult for the robot to recognize and grasp. This study uses a cascaded CNN to extract the keypoints of the scattered parts, which are also labeled with the probability of a successful grasp. Experiments are conducted to demonstrate the grasping of such scattered parts.

Details

Industrial Robot: the international journal of robotics research and application, vol. 49 no. 4
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 6 August 2024

Yingjie Yu, Shuai Chen, Xinpeng Yang, Changzhen Xu, Sen Zhang and Wendong Xiao

Abstract

Purpose

This paper proposes a self-supervised monocular depth estimation algorithm under multiple constraints, which can generate the corresponding depth map end-to-end from RGB images. Building on the traditional visual simultaneous localisation and mapping (VSLAM) framework, a dynamic object detection framework based on deep learning is introduced, and dynamic objects in the scene are culled during mapping.

Design/methodology/approach

Typical SLAM algorithms and data sets assume a static environment and do not consider the potential consequences of accidentally adding dynamic objects to a 3D map. This shortcoming limits the applicability of VSLAM in many practical cases, such as long-term mapping. In light of these considerations, this paper presents a self-supervised monocular depth estimation algorithm based on deep learning. Furthermore, this paper introduces the YOLOv5 dynamic detection framework into the traditional ORB-SLAM2 algorithm for the purpose of removing dynamic objects.
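
The masking idea can be sketched as follows using the public YOLOv5 hub model and OpenCV's ORB detector; the actual system integrates the detector into ORB-SLAM2's C++ pipeline, and the dynamic class list here is an assumption.

```python
# Sketch: cull ORB features that fall inside YOLOv5 detections of dynamic
# classes before they reach the SLAM back end.
import cv2
import torch

# Pretrained YOLOv5 detector via torch.hub (downloads the ultralytics/yolov5 repo).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
DYNAMIC = {"person", "car", "bicycle", "motorcycle", "bus", "truck"}  # assumption

frame = cv2.imread("frame.png")                 # hypothetical input frame
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
detections = model(rgb).pandas().xyxy[0]        # one row per detected box

orb = cv2.ORB_create(nfeatures=1000)
keypoints = orb.detect(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), None)

def is_static(kp):
    """True if the keypoint lies outside every dynamic-class detection."""
    x, y = kp.pt
    for _, d in detections.iterrows():
        if d["name"] in DYNAMIC and d.xmin <= x <= d.xmax and d.ymin <= y <= d.ymax:
            return False
    return True

static_kps = [kp for kp in keypoints if is_static(kp)]  # fed to the SLAM back end
```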

Findings

Compared with Dyna-SLAM, the algorithm proposed in this paper reduces the error by about 13%, and compared with ORB-SLAM2, by about 54.9%. In addition, the algorithm can process a single frame at 15–20 FPS on a GeForce RTX 2080s, far exceeding Dyna-SLAM in real-time performance.

Originality/value

This paper proposes a VSLAM algorithm that can be applied to dynamic environments. The algorithm combines a self-supervised monocular depth estimation component under multiple constraints with a dynamic object detection framework based on YOLOv5.

Details

Industrial Robot: the international journal of robotics research and application, vol. 52 no. 1
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 13 August 2024

Yan Kan, Hao Li, Zhengtao Chen, Changjiang Sun, Hao Wang and Joachim Seidelmann

Abstract

Purpose

This paper aims to propose a stable and precise recognition and pose estimation method to deal with the difficulties that industrial parts often present, such as incomplete point cloud data due to surface reflections, lack of color texture features and limited availability of effective three-dimensional geometric information. These challenges lead to less-than-ideal performance of existing object recognition and pose estimation methods based on two-dimensional images or three-dimensional point cloud features.

Design/methodology/approach

In this paper, an image-guided depth map completion method is proposed to improve the algorithm's adaptability to noise and incomplete point cloud scenes. Furthermore, a pose estimation method based on contour feature matching is proposed.
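
A generic stand-in for image-guided depth completion is sketched below, combining OpenCV inpainting with a joint bilateral filter guided by the intensity image (requires opencv-contrib-python); the paper's actual completion algorithm and its contour-matching stage are not reproduced.

```python
# Sketch of image-guided depth completion: inpaint the depth holes, then
# smooth with a joint bilateral filter guided by the grayscale image so that
# filled depth edges follow image edges. Generic stand-in, not the paper's method.
import cv2
import numpy as np

depth = cv2.imread("depth.png", cv2.IMREAD_GRAYSCALE)  # 8-bit depth for the sketch
guide = cv2.imread("rgb.png", cv2.IMREAD_GRAYSCALE)    # registered intensity image

hole_mask = (depth == 0).astype(np.uint8)              # zeros mark missing depth
filled = cv2.inpaint(depth, hole_mask, 5, cv2.INPAINT_TELEA)

# Joint bilateral filtering (ximgproc module): smooth the filled depth while
# letting its edges follow the guide image's edges.
completed = cv2.ximgproc.jointBilateralFilter(guide, filled,
                                              d=9, sigmaColor=25, sigmaSpace=9)
```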

Findings

Through experimental testing on real-world and virtual scene data sets, it was verified that the image-guided depth map completion method estimates depth values for depth map hole pixels with higher accuracy. The proposed pose estimation method was applied in pose estimation experiments on various parts. The average recognition accuracy in real-world scenes was 88.17%, whereas in virtual scenes it reached 95%.

Originality/value

The proposed recognition and pose estimation method deals stably and precisely with the difficulties that industrial parts present and improves the algorithm's adaptability to noise and incomplete point cloud scenes.

Details

Robotic Intelligence and Automation, vol. 44 no. 5
Type: Research Article
ISSN: 2754-6969

Article
Publication date: 7 August 2017

Shenglan Liu, Muxin Sun, Xiaodong Huang, Wei Wang and Feilong Wang

Abstract

Purpose

Robot vision is fundamental to human–robot interaction and complex robot tasks. In this paper, the authors aim to use Kinect and propose a feature graph fusion (FGF) method for robot recognition.

Design/methodology/approach

The feature fusion utilizes red green blue (RGB) and depth information from Kinect to construct a fused feature. FGF involves multi-Jaccard similarity to compute a robust graph and a word embedding method to enhance the recognition results.
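
The underlying Jaccard-graph construction can be sketched as below over toy binarized descriptors; the paper's multi-Jaccard similarity, Extended Jaccard Graph and word-embedding step are its own contributions and are not shown.

```python
# Sketch: build a pair-wise Jaccard similarity graph over binarized features.
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((5, 256)) > 0.5   # 5 images, toy binarized descriptors

def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two binary feature vectors."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

n = len(features)
graph = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        graph[i, j] = jaccard(features[i], features[j])
# graph[i, j] encodes the similarity of image pair (i, j); the fused RGB-D
# feature would then be derived from such a graph via word embedding.
```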

Findings

The authors also collect a DUT RGB-Depth (RGB-D) face data set and a benchmark data set to evaluate the effectiveness and efficiency of this method. The experimental results illustrate that FGF is robust and effective on face and object data sets in robot applications.

Originality/value

The authors first utilize Jaccard similarity to construct a graph of RGB and depth images, which indicates the similarity of pair-wise images. The fused feature of the RGB and depth images is then computed from the Extended Jaccard Graph using a word embedding method. FGF achieves better performance and efficiency with RGB-D sensors for robots.

Details

Assembly Automation, vol. 37 no. 3
Type: Research Article
ISSN: 0144-5154

Article
Publication date: 10 June 2014

Du-Ming Tsai, Hao Hsu and Wei-Yao Chiu

Abstract

Purpose

This study aims to propose a door detection method based on the door properties in both depth and gray-level images. It can further help blind people (or mobile robots) find the doorway to their destination.

Design/methodology/approach

The proposed method uses a hierarchical point–line–region principle with majority voting to encode surface features pixel by pixel, then dominant scene entities line by line, and finally the prioritized scene entities in the center, left and right of the observed scene.
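
A toy sketch of this pixel-to-line-to-region majority-vote reduction follows; the label set and the left/center/right split are illustrative assumptions.

```python
# Sketch of the hierarchical majority-vote idea: per-pixel surface labels are
# reduced to one dominant label per line, then one per region.
import numpy as np

rng = np.random.default_rng(0)
# Toy per-pixel classification: 0=ground, 1=obstacle, 2=door (assumed labels).
pixel_labels = rng.integers(0, 3, size=(48, 90))

def majority(a):
    """Most frequent label in an array (the majority vote)."""
    return np.bincount(a.ravel()).argmax()

# Line level: dominant scene entity of each image row.
line_labels = np.array([majority(row) for row in pixel_labels])

# Region level: dominant entity in the left, center and right thirds.
left, center, right = np.array_split(pixel_labels, 3, axis=1)
region_labels = [majority(r) for r in (left, center, right)]
```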

Findings

This approach is very robust to noise and random misclassification at the pixel, line and region levels, and provides sufficient information about the pathway in front and to the left and right of a scene. The proposed robot vision-assist system can be worn by visually impaired people or mounted on mobile robots. It provides more complete information about the surrounding environment to guide the user safely and effectively to the destination.

Originality/value

In this study, the proposed robot vision scheme provides detailed configurations of the environment encountered in daily life, including stairs (up and down), curbs/steps (up and down), obstacles, overheads, potholes/gutters, hazards and accessible ground. All the scene entities detected in the environment give blind people (or mobile robots) more complete information for better independent decision-making. This paper also proposes, in particular, a door detection method based on the door's features in both depth and gray-level images, which can further help blind people find the doorway to their destination in an unfamiliar environment.

Details

Industrial Robot: An International Journal, vol. 41 no. 4
Type: Research Article
ISSN: 0143-991X

Article
Publication date: 28 January 2014

Swarnalatha Purushotham and Balakrishna Tripathy

Abstract

Purpose

The purpose of this paper is to provide a way to analyze satellite images using various clustering algorithms and refined bitplane methods, together with other supporting techniques, to prove the superiority of rough intuitionistic fuzzy c-means (RIFCM).

Design/methodology/approach

A comparative study has been carried out using RIFCM and related algorithms to assess their suitability for the analysis of satellite images, together with other supporting techniques that segment the images for further processing toward societal problems. Four satellite images were selected, depicting hills, freshwater, a freshwater valley and drought.
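
Classical bitplane slicing, the preprocessing step the paper refines, can be sketched as follows on a stand-in 8-bit image; the refinement itself and the RIFCM clustering are the paper's own contributions.

```python
# Sketch of bitplane slicing: an 8-bit image is split into its eight binary
# planes before clustering.
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in satellite band

# Plane k holds bit k of every pixel; higher planes carry most of the structure.
planes = [((img >> k) & 1).astype(np.uint8) for k in range(8)]

# A refined-bitplane scheme would weight or select planes before feeding the
# result to the RIFCM clustering step; that refinement is the paper's own.
reconstructed = sum(p << k for k, p in enumerate(planes))
assert np.array_equal(reconstructed, img)
```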

Findings

The proposed RIFCM algorithm with refined bitplane was found to be superior to other clustering techniques with other supporting methods. The comparison was made by applying four metrics: Otsu (Max-Min); PSNR and RMSE (40%-60%, Min-Max); histogram analysis (Max-Max); and DB index and D index (Max-Min). It proved that the RIFCM algorithm with refined bitplane yields robust results with efficient performance, a reduction in the metrics and a lower time complexity of depth computation of satellite images for further processing.

Practical implications

The method achieves better clustering of satellite images of scenes such as land, hills, freshwater, freshwater valleys and drought.

Originality/value

The work extends the existing framework to provide a more explicit way to analyze an image, removing distortions with refined bitplane slicing and the proposed rough intuitionistic fuzzy c-means algorithm, thereby showing the superiority of RIFCM.

Open Access
Article
Publication date: 18 January 2016

Hui-Feng Wang, Gui-ping Wang, Xiao-Yan Wang, Chi Ruan and Shi-qin Chen

Abstract

Purpose

Bad weather tremendously reduces the driver's visual range and thus has a serious impact on transport safety. This study aims to consider active vision in low-visibility environments, to reveal the optical factors that affect visibility and to explore a method of obtaining different depths of field by multimode imaging.

Design/methodology/approach

A new mechanism and a core algorithm for obtaining an excellent large depth-of-field image to aid safe driving are designed and implemented. In this mechanism, the atmospheric extinction principle and a field expansion system are researched as the basis, followed by an image registration and fusion algorithm for the Infrared Extended Depth of Field (IR-EDOF) sensor.
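
The registration-and-fusion step can be sketched generically with OpenCV's ECC alignment and a Laplacian focus measure; the parameter values and the fusion rule are assumptions, not the IR-EDOF algorithm itself.

```python
# Sketch: align two exposures with ECC, then keep whichever pixel is sharper
# by Laplacian magnitude to extend the depth of field.
import cv2
import numpy as np

near = cv2.imread("near_focus.png", cv2.IMREAD_GRAYSCALE)  # hypothetical inputs
far = cv2.imread("far_focus.png", cv2.IMREAD_GRAYSCALE)

# Register `far` onto `near` with a translation-only ECC warp.
warp = np.eye(2, 3, dtype=np.float32)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
_, warp = cv2.findTransformECC(near, far, warp, cv2.MOTION_TRANSLATION, criteria)
far_reg = cv2.warpAffine(far, warp, (near.shape[1], near.shape[0]),
                         flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)

# Per-pixel focus measure: absolute Laplacian response; fuse the sharper pixel.
sharp_near = np.abs(cv2.Laplacian(near, cv2.CV_32F))
sharp_far = np.abs(cv2.Laplacian(far_reg, cv2.CV_32F))
fused = np.where(sharp_near >= sharp_far, near, far_reg)  # extended depth of field
```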

Findings

The experimental results show that the proposed idea works well to expand the depth of field in a low-visibility road environment as a new aided safe-driving sensor.

Originality/value

The paper presents a new kind of active optical extension and enhanced driving aid, which is an effective solution to the problem of weakened visual ability. It is a practical engineering sensor scheme for safe driving in low-visibility road environments.

Details

Sensor Review, vol. 36 no. 1
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 9 October 2018

F. Li, M. Soleimani and J. Abascal

Abstract

Purpose

Magnetic induction tomography (MIT) is a tomographic imaging technique with a wide range of potential industrial applications. A planar array is a convenient MIT setup but cannot access the object freely from the entire periphery, as it collects measurements from only one surface, so imaging remains challenging given the limited data. This study aims to assess the use of sparse regularization methods for accurate position and depth detection in planar array MIT.

Design/methodology/approach

The most difficult challenges in MIT are solving the forward and inverse problems. The inversion of planar MIT is severely ill-posed owing to the limited-access data. This paper therefore poses a total variation (TV) problem and solves it efficiently with the Split Bregman formulation to overcome this difficulty. Both isotropic and anisotropic TV formulations are compared with Tikhonov regularization on experimental MIT data.
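
The isotropic/anisotropic TV comparison can be sketched with scikit-image's Split Bregman solver on a toy denoising problem; the full planar-MIT inversion also involves the sensor forward model, which is not reproduced here.

```python
# Sketch: isotropic vs anisotropic TV via the Split Bregman solver in
# scikit-image, applied to a noisy image as a small analogue of the
# regularized reconstruction.
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_bregman

img = img_as_float(data.camera())
noisy = img + 0.1 * np.random.default_rng(0).standard_normal(img.shape)

tv_iso = denoise_tv_bregman(noisy, weight=5.0, isotropic=True)
tv_aniso = denoise_tv_bregman(noisy, weight=5.0, isotropic=False)
# Smaller `weight` means stronger smoothing. TV preserves sharp inclusion
# boundaries where a Tikhonov (L2) penalty would blur them.
```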

Findings

The results show that the Tikhonov method failed to locate, or underestimated, the object position and depth, whereas both isotropic and anisotropic TV led to accurate recovery of depth and position.

Originality/value

There are numerous potential applications for planar array MIT where access to the material under testing is restricted. Sparse regularization methods are a promising approach to improving depth detection from limited MIT data.

Details

Sensor Review, vol. 39 no. 2
Type: Research Article
ISSN: 0260-2288
