Topic

Object (computer science)

About: Object (computer science) is a research topic. Over the lifetime, 106,024 publications have been published within this topic, receiving 1,360,115 citations. The topic is also known as: obj & Rq.


Papers
Posted Content
TL;DR: In this article, an Aggregate View Object Detection Network (AVOD) is proposed for autonomous driving scenarios using LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network.
Abstract: We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state of the art results on the KITTI 3D object detection benchmark while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles. Code is at: this https URL
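
As a rough illustration of the two-view design described above, the sketch below shows a minimal fusion detection head that averages per-proposal feature crops from a BEV (LIDAR) branch and an image branch before predicting class scores and a 3D box. It is not the authors' implementation; the layer sizes, the 7x7 crop resolution, and the mean-fusion choice are assumptions for the example.

```python
# Hedged sketch of an AVOD-style two-view fusion head (illustrative, not the authors' code).
# Assumes per-proposal feature crops from a BEV (LIDAR) branch and an image (RGB) branch
# have already been pooled to a fixed 7x7 size; all dimensions are made up for the example.
import torch
import torch.nn as nn

class FusionDetectionHead(nn.Module):
    def __init__(self, feat_dim=256, num_classes=3):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim * 7 * 7, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
        )
        self.cls_head = nn.Linear(2048, num_classes + 1)  # object classes + background
        self.box_head = nn.Linear(2048, 7)                # x, y, z, l, w, h, yaw offsets

    def forward(self, bev_crop, img_crop):
        fused = 0.5 * (bev_crop + img_crop)  # simple mean fusion of the two views
        x = self.fc(fused)
        return self.cls_head(x), self.box_head(x)

# Example usage with dummy per-proposal crops of shape (N, C, 7, 7).
bev = torch.randn(8, 256, 7, 7)
img = torch.randn(8, 256, 7, 7)
cls_logits, box_deltas = FusionDetectionHead()(bev, img)
```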

371 citations

Posted Content
Christopher Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, Silvio Savarese
TL;DR: The 3D-R2N2 reconstruction framework outperforms the state-of-the-art methods for single view reconstruction, and enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).
Abstract: Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2). The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data. Our network takes in one or more images of an object instance from arbitrary viewpoints and outputs a reconstruction of the object in the form of a 3D occupancy grid. Unlike most of the previous works, our network does not require any image annotations or object class labels for training or testing. Our extensive experimental analysis shows that our reconstruction framework i) outperforms the state-of-the-art methods for single view reconstruction, and ii) enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).
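
The recurrent idea described above can be sketched as follows: a 2D encoder summarizes each input view, a GRU accumulates evidence across an arbitrary number of views, and a decoder maps the final hidden state to a voxel occupancy grid. This is an illustrative simplification, not the 3D-R2N2 code; the layer sizes and the 32^3 grid resolution are assumed for the example.

```python
# Hedged sketch of a 3D-R2N2-style multi-view reconstruction pipeline (not the authors' code).
import torch
import torch.nn as nn

class MultiViewReconstructor(nn.Module):
    def __init__(self, hidden=256, grid=32):
        super().__init__()
        self.grid = grid
        self.encoder = nn.Sequential(              # per-view 2D feature extractor
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRUCell(64, hidden)          # fuses any number of views in sequence
        self.decoder = nn.Linear(hidden, grid ** 3)  # logits for each voxel

    def forward(self, views):                      # views: (num_views, 3, H, W)
        h = torch.zeros(1, self.rnn.hidden_size)
        for v in views:
            h = self.rnn(self.encoder(v.unsqueeze(0)), h)
        occupancy = torch.sigmoid(self.decoder(h))
        return occupancy.view(self.grid, self.grid, self.grid)

# One or more viewpoints of the same object can be fed in, in any order.
grid = MultiViewReconstructor()(torch.randn(3, 3, 127, 127))
```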

370 citations

Patent
Paul Edward Showering
25 Jan 2013
TL;DR: In this article, the authors describe a system for determining dimensions of a physical object using a mobile computer equipped with a motion sensing device, which includes a microprocessor, a memory, a user interface, a motion sensor, and a dimensioning program executable by the microprocessor.
Abstract: Devices, methods, and software are disclosed for determining dimensions of a physical object using a mobile computer equipped with a motion sensing device. In an illustrative embodiment, the mobile computer can comprise a microprocessor, a memory, a user interface, a motion sensing device, and a dimensioning program executable by the microprocessor. The processor can be communicatively coupled to executable instructions that enable it to perform several steps. One step includes initiating a trajectory tracking mode responsive to receiving a first user interface action. Another step includes tracking the mobile computer's trajectory along a surface of a physical object by storing in the memory a plurality of motion sensing data items output by the motion sensing device. Another step includes exiting the trajectory tracking mode responsive to receiving a second user interface action. Another step includes calculating three dimensions of a minimum bounding box corresponding to the physical object.
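
The final step in the abstract, computing the dimensions of a bounding box from the tracked trajectory, can be illustrated with a small sketch. It assumes an axis-aligned box and treats the recorded motion-sensing positions as 3D points; the sample data is made up.

```python
# Hedged sketch of the dimensioning step: once the mobile computer's trajectory along the
# object's surface has been recorded, the three box dimensions can be approximated as the
# extent of the tracked points along each axis (axis-aligned simplification).
import numpy as np

def bounding_box_dimensions(trajectory_m: np.ndarray) -> np.ndarray:
    """trajectory_m: (N, 3) array of positions in metres from the motion sensing device."""
    return trajectory_m.max(axis=0) - trajectory_m.min(axis=0)

# Example: points traced around a box roughly 0.6 x 0.4 x 0.3 m.
points = np.array([[0.0, 0.0, 0.0], [0.6, 0.0, 0.0],
                   [0.6, 0.4, 0.0], [0.6, 0.4, 0.3]])
print(bounding_box_dimensions(points))  # -> [0.6 0.4 0.3]
```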

370 citations

Proceedings ArticleDOI
28 Jun 2017
TL;DR: Online Adaptive Video Object Segmentation (OnAVOS) is proposed, which updates the network online using training examples selected based on the confidence of the network and the spatial configuration, and adds a pretraining step based on objectness, learned on PASCAL.
Abstract: We tackle the task of semi-supervised video object segmentation, i.e. segmenting the pixels belonging to an object in the video using the ground truth pixel mask for the first frame. We build on the recently introduced one-shot video object segmentation (OSVOS) approach which uses a pretrained network and fine-tunes it on the first frame. While achieving impressive performance, at test time OSVOS uses the fine-tuned network in unchanged form and is not able to adapt to large changes in object appearance. To overcome this limitation, we propose Online Adaptive Video Object Segmentation (OnAVOS) which updates the network online using training examples selected based on the confidence of the network and the spatial configuration. Additionally, we add a pretraining step based on objectness, which is learned on PASCAL. Our experiments show that both extensions are highly effective and improve the state of the art on DAVIS to an intersection-over-union score of 85.7%.
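
The online adaptation idea can be sketched as follows: on a new frame, pixels the network predicts with high confidence become pseudo-labels, and the network is updated for a few steps on those pixels only. This is a simplified illustration, not the OnAVOS implementation; the model interface, confidence threshold, and step count are assumptions.

```python
# Hedged sketch of online adaptation for video object segmentation (not the OnAVOS code).
# `model` is assumed to map a (1, 3, H, W) frame to per-pixel foreground logits (1, 1, H, W).
import torch
import torch.nn.functional as F

def adapt_online(model, frame, optimizer, conf_thresh=0.97, steps=3):
    model.eval()
    with torch.no_grad():
        prob = torch.sigmoid(model(frame))
    # Confident foreground/background pixels become training targets; the rest are ignored.
    pseudo = (prob > 0.5).float()
    mask = ((prob > conf_thresh) | (prob < 1.0 - conf_thresh)).float()

    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(frame)
        loss = (F.binary_cross_entropy_with_logits(logits, pseudo, reduction="none") * mask).mean()
        loss.backward()
        optimizer.step()
```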

370 citations

Patent
29 Sep 1995
TL;DR: In this article, the authors describe a system for allowing media content to be used in an interactive digital media (IDM) program, which has Frame Data for the media content and object mapping data (N Data) representing the frame addresses and display location coordinates for objects appearing in the media content.
Abstract: A system for allowing media content to be used in an interactive digital media (IDM) program has Frame Data for the media content and object mapping data (N Data) representing the frame addresses and display location coordinates for objects appearing in the media content. The N Data are maintained separately from the Frame Data for the media content, so that the media content can be kept intact without embedded codes and can be played back on any system. The IDM program has established linkages connecting the objects mapped by the N Data to other functions to be performed in conjunction with display of the media content. Selection of an object appearing in the media content with a pointer results in initiation of the interactive function. A broad base of existing non-interactive media content, such as movies, videos, advertising, and television programming can be converted to interactive digital media use. An authoring system for creating IDM programs has an object outlining tool and an object motion tracking tool for facilitating the generation of N Data. On a data storage disk, the Frame Data and the N Data are stored on separate sectors. In a network system, the object mapping data and IDM program are downloaded to a subscriber terminal and used in conjunction with presentation of the media content.
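
An illustrative sketch of the separation the abstract describes: the Frame Data (the media itself) stays untouched, while the N Data records, per frame address, the display regions occupied by selectable objects and the interactive function linked to each, so a pointer selection can be resolved against it. The field names and rectangle representation below are assumptions, not the patent's format.

```python
# Hedged sketch of object mapping data (N Data) kept separate from the media content.
from dataclasses import dataclass

@dataclass
class ObjectRegion:
    object_id: str
    rect: tuple   # (x, y, width, height) in display coordinates
    action: str   # linked interactive function, e.g. a URL or command name

# N Data keyed by frame address; stored entirely outside the Frame Data.
n_data = {
    1024: [ObjectRegion("car", (120, 80, 200, 90), "show_car_info")],
    1025: [ObjectRegion("car", (124, 80, 200, 90), "show_car_info")],
}

def resolve_selection(frame_address: int, x: int, y: int):
    """Return the action for the object under the pointer, if any."""
    for region in n_data.get(frame_address, []):
        rx, ry, rw, rh = region.rect
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return region.action
    return None

print(resolve_selection(1024, 150, 100))  # -> "show_car_info"
```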

369 citations


Network Information
Related Topics (5)
Query optimization: 17.6K papers, 474.4K citations, 84% related
Programming paradigm: 18.7K papers, 467.9K citations, 84% related
Software development: 73.8K papers, 1.4M citations, 83% related
Compiler: 26.3K papers, 578.5K citations, 83% related
Software system: 50.7K papers, 935K citations, 82% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2022    38
2021    3,087
2020    5,900
2019    6,540
2018    5,940
2017    5,046