
Showing papers on "Distributed object published in 2021"


Journal ArticleDOI
TL;DR: In this article, the authors propose an end-to-end global-local self-adaptive network (GLSAN) for drone-view object detection, which includes a global-local detection network (GLDN), a simple yet efficient self-adaptive region selecting algorithm (SARSA), and a local super-resolution network (LSRN).
Abstract: Directly benefiting from deep learning methods, object detection has witnessed a great performance boost in recent years. However, drone-view object detection remains challenging for two main reasons: (1) tiny-scale objects, which are blurrier than their ground-view counterparts, offer less valuable information towards accurate and robust detection; (2) unevenly distributed objects make detection inefficient, especially in regions occupied by crowded objects. Confronting such challenges, we propose an end-to-end global-local self-adaptive network (GLSAN) in this paper. The key components of our GLSAN include a global-local detection network (GLDN), a simple yet efficient self-adaptive region selecting algorithm (SARSA), and a local super-resolution network (LSRN). We integrate a global-local fusion strategy into a progressive scale-varying network to perform more precise detection, where the local fine detector adaptively refines the bounding boxes detected by the global coarse detector by cropping the original images for higher-resolution detection. The SARSA dynamically crops the crowded regions in the input images; it is unsupervised and can be easily plugged into the networks. Additionally, we train the LSRN to enlarge the cropped images, providing more detailed information for finer-scale feature extraction and helping the detector distinguish foreground from background more easily. The SARSA and LSRN also contribute to data augmentation during network training, which makes the detector more robust. Extensive experiments and comprehensive evaluations on the VisDrone2019-DET benchmark dataset and the UAVDT dataset demonstrate the effectiveness and adaptivity of our method. Towards industrial application, our network is also applied to a DroneBolts dataset with demonstrated advantages. Our source code is available at https://github.com/dengsutao/glsan .
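
The SARSA step can be pictured as density-based cropping over the coarse detector's output. The sketch below is a minimal illustration of that idea, not the authors' released code (see the repository above for that); the grid heuristic and all names here are our own simplification.

```python
import numpy as np

def crop_crowded_region(boxes, img_w, img_h, grid=8, pad=0.1):
    """Pick the densest grid cell of detection centers and return a padded
    crop rectangle around it (a stand-in for GLSAN's SARSA cropping)."""
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    # Histogram the detection centers over a coarse grid.
    hist, xedges, yedges = np.histogram2d(
        centers[:, 0], centers[:, 1], bins=grid,
        range=[[0, img_w], [0, img_h]])
    gx, gy = np.unravel_index(np.argmax(hist), hist.shape)
    # Expand the densest cell by `pad` of the image size on each side;
    # the crop would then be super-resolved and re-detected.
    x0 = max(0.0, xedges[gx] - pad * img_w)
    x1 = min(float(img_w), xedges[gx + 1] + pad * img_w)
    y0 = max(0.0, yedges[gy] - pad * img_h)
    y1 = min(float(img_h), yedges[gy + 1] + pad * img_h)
    return int(x0), int(y0), int(x1), int(y1)

# Coarse global detections as (x1, y1, x2, y2); three cluster near the top-left.
boxes = np.array([[100, 120, 140, 160], [110, 130, 150, 170],
                  [105, 125, 145, 165], [900, 800, 940, 860]])
print(crop_crowded_region(boxes, img_w=1920, img_h=1080))
```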

39 citations


Journal ArticleDOI
TL;DR: Anveshak as mentioned in this paper is a runtime platform for composing and coordinating distributed tracking applications, which provides a domain-specific dataflow programming model to intuitively compose a tracking application, supporting contemporary CV advances like query fusion and re-identification, and enabling dynamic scoping of the camera network's search space to avoid wasted computation.
Abstract: Advances in deep neural networks (DNN) and computer vision (CV) algorithms have made it feasible to extract meaningful insights from large-scale deployments of urban cameras. Tracking an object of interest across the camera network in near real-time is a canonical problem. However, current tracking platforms have two key limitations: 1) They are monolithic, proprietary, and lack the ability to rapidly incorporate sophisticated tracking models, and 2) They are less responsive to dynamism across wide-area computing resources that include edge, fog, and cloud abstractions. We address these gaps using Anveshak, a runtime platform for composing and coordinating distributed tracking applications. It provides a domain-specific dataflow programming model to intuitively compose a tracking application, supporting contemporary CV advances like query fusion and re-identification, and enabling dynamic scoping of the camera network's search space to avoid wasted computation. We also offer tunable batching and data-dropping strategies for dataflow blocks deployed on distributed resources to respond to network and compute variability. These balance the tracking accuracy, its real-time performance, and the active camera-set size. We illustrate the concise expressiveness of the programming model for four tracking applications. Our detailed experiments for a network of 1000 camera feeds on modest resources exhibit the tunable scalability, performance, and quality trade-offs enabled by our dynamic tracking, batching, and dropping strategies.
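
The tunable batching and dropping knobs can be made concrete with a toy dataflow block. This is only a sketch of the trade-off the paper describes, not Anveshak's programming model or API; all names are illustrative.

```python
from collections import deque

class TunableBlock:
    """Sketch of a dataflow block with Anveshak-style knobs: process up to
    `max_batch` items at a time, but drop the oldest queued items once the
    backlog exceeds `max_queue` (trading tracking accuracy for latency)."""
    def __init__(self, fn, max_batch=8, max_queue=64):
        self.fn, self.max_batch, self.max_queue = fn, max_batch, max_queue
        self.queue = deque()
        self.dropped = 0

    def push(self, item):
        self.queue.append(item)
        while len(self.queue) > self.max_queue:  # backlog too deep under load:
            self.queue.popleft()                 # shed the oldest frames
            self.dropped += 1

    def step(self):
        batch = [self.queue.popleft()
                 for _ in range(min(self.max_batch, len(self.queue)))]
        return self.fn(batch) if batch else None

# A stand-in "detector" that just reports its batch size.
block = TunableBlock(lambda b: f"processed {len(b)} frames",
                     max_batch=4, max_queue=6)
for frame in range(20):
    block.push(frame)
print(block.step(), "| dropped:", block.dropped)
```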

8 citations


Proceedings ArticleDOI
27 Jul 2021
TL;DR: In this article, the authors propose a hinting mechanism for near data processing (NDP) in distributed storage systems to reduce data movement by up to 99% when querying CSV data with NDP co-located with the stored data.
Abstract: Most general-purpose distributed storage systems are not designed with near data processing (NDP) in mind. They do not respect semantic data boundaries when writing data, for example splitting a record across servers. This reduces NDP effectiveness by requiring data collation before computation. While semantic data awareness and NDP functions can be retroactively added to existing distributed storage, doing so is often complex and difficult to accomplish in practice. We propose sharing storage system layout information with data writers so they can adjust data layouts to prevent data alignment issues regardless of the underlying architecture. By doing so, we can simplify NDP implementation by reducing the need for data reassembly, and reduce the need for complex storage system or application extensions. We demonstrate a hinting mechanism on both HDFS with computational block storage and an erasure-coded MinIO deployment, reducing data movement by up to 99% when querying CSV data with NDP co-located with the stored data. This was accomplished purely with client-side data alignment, with no modifications to the server-side write paths and no inter-node collation of data.
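
The core of the hinting idea is that a writer who knows the storage layout (here, just a block size) can pad records so none straddles a boundary. A minimal client-side sketch under that assumption follows; the actual hint exchange with HDFS or MinIO is not shown, and the function is hypothetical.

```python
def align_records(records, block_size):
    """Pack newline-terminated records into fixed-size blocks without ever
    splitting a record across a block boundary, padding each block's tail;
    the real hint exchange with HDFS/MinIO is not shown here."""
    blocks, current = [], b""
    for rec in records:
        if len(rec) > block_size:
            raise ValueError("record larger than a block")
        if len(current) + len(rec) > block_size:
            blocks.append(current.ljust(block_size, b" "))  # pad, don't split
            current = b""
        current += rec
    if current:
        blocks.append(current.ljust(block_size, b" "))
    return blocks

rows = [f"{i},sensor-{i},{i * 1.5}\n".encode() for i in range(100)]
blocks = align_records(rows, block_size=512)
print(f"{len(blocks)} blocks; every CSV record is whole within its block")
```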

5 citations


Book ChapterDOI
20 Sep 2021
TL;DR: In this paper, the authors investigated the concept of proactive monitoring of critical events at distributed objects of the city's engineering communications and in the urban road environment, and proposed a multi-agent approach that uses software agents deployed directly at distributed data-collection sources, robots that collect data from open sources on the Internet, brokers that consolidate and protect the transmitted data, and a distributed information warehouse component.
Abstract: The article proposes and investigates the concept of proactive monitoring of critical events at distributed objects of the city's engineering communications and in the urban road environment. The purpose of monitoring is to determine, assess, and predict the dynamics of the risks of critical events, depending on changes in indicators and the factors correlating with them. The main characteristics of distributed monitoring objects, a classification and analysis of critical events, the reasons for their occurrence, and possible influencing factors are given. For predictive analysis and assessment of the risks of critical events, big sensor and social data are collected, consolidated, and analyzed, reflecting the dynamics of changes in the indicators of monitored objects and external influencing factors. The big data include indicators of monitored objects, information about events and the possible causes of their occurrence, and factors influencing the risks of their development. Cyber-physical (sensor) data is collected from spatially distributed photo and video recording complexes, video surveillance cameras, weather stations, and the measuring instruments and sensors of objects and pipelines of engineering networks. Cyber-social (social) data is collected from open sources of information on the Internet and from mobile communications of the civilian population. The monitoring system uses a multi-agent approach, which involves software agents deployed directly at distributed data-collection sources, robots that collect data from open sources on the Internet, brokers that consolidate and protect the transmitted data, and a distributed information warehouse component.
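
The collection pipeline the article describes (agents at the data sources, a broker for consolidation, warehouse-side analysis) can be sketched with in-process stand-ins. Everything below is illustrative; the names and the risk rule are not from the article.

```python
import json
import queue

broker = queue.Queue()  # stand-in for the data-consolidation broker

def sensor_agent(object_id, reading):
    """Agent deployed at a distributed data source: it tags an indicator
    reading and forwards it to the broker."""
    broker.put(json.dumps({"src": object_id, "kind": "sensor", "value": reading}))

def risk_assessor(threshold):
    """Warehouse-side consumer: flags readings whose risk indicator
    exceeds a threshold (a stand-in for the predictive analysis)."""
    alerts = []
    while not broker.empty():
        msg = json.loads(broker.get())
        if msg["value"] > threshold:
            alerts.append(msg)
    return alerts

sensor_agent("pipeline-17", reading=0.92)   # e.g. a pressure-anomaly score
sensor_agent("crossing-3", reading=0.41)
print(risk_assessor(threshold=0.8))         # -> the pipeline-17 alert
```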

4 citations


Journal ArticleDOI
14 Mar 2021
TL;DR: A basic concept of the types of traditional SIP, covering File-Based, Common Database, Remote Procedure Call, Distributed Objects, and Messaging, is included, and an overview of three Service Interface Design (SID) approaches for systems interoperability is discussed.
Abstract: One of the major issues in system integration is dealing with the interoperability of legacy systems that use traditional System Integration Patterns (SIP). Information cannot be exchanged effectively when the systems involved come from developers whose products were not designed to interoperate, which leads to interoperability problems in heterogeneous system integration. To address these issues, interfacing must be made easier by defining the components, processes, and interfaces that affect the system integration architecture at the initial design stage. This paper includes a basic concept of the types of traditional SIP, covering File-Based, Common Database, Remote Procedure Call (RPC), Distributed Objects, and Messaging. An overview of three Service Interface Design (SID) approaches for systems interoperability is discussed. The discussion of these approaches serves as a basis for solving the interoperability of heterogeneous systems that use traditional SIP.

3 citations


Journal ArticleDOI
TL;DR: A new communication abstraction, called Set-Constrained Delivery Broadcast (SCD-broadcast), whose aim is to provide its users with an appropriate abstraction level when they have to implement objects or distributed tasks in an asynchronous message-passing system prone to process crash failures is introduced.

2 citations


Journal ArticleDOI
Javier López-Gómez1, Jakob Blomer1
TL;DR: The ROOT RNTuple I/O system aims at overcoming TTree's limitations and at providing improved efficiency for modern storage systems, such as NVMe devices and distributed object stores as mentioned in this paper.
Abstract: Over the last two decades, ROOT TTree has been used for storing over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has proved ideal for analyses of HEP data, which typically require access to many events but only a subset of the information stored for each of them. Future colliders, and particularly the HL-LHC, will bring an increase of at least one order of magnitude in the volume of generated data. Therefore, the use of modern storage hardware, such as low-latency, high-bandwidth NVMe devices and distributed object stores, becomes more important. However, TTree was not designed to optimally exploit modern hardware and may become a bottleneck for data retrieval. The ROOT RNTuple I/O system aims at overcoming TTree's limitations and providing improved efficiency on modern storage systems. In this paper, we extend RNTuple with a backend that uses Intel DAOS as the underlying storage, demonstrating that the RNTuple architecture can accommodate high-performance object stores. From the user's perspective, data can be accessed with minimal changes to the code, that is, by replacing a filesystem path with a DAOS URI. Our performance evaluation shows that the new backend can be used for realistic analyses while outperforming the compatibility solution provided by the DAOS project.
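
The "minimal changes" claim boils down to swapping the storage locator string. Below is a hedged PyROOT sketch, assuming a ROOT build where the experimental RNTuple classes and the DAOS backend are available; the ntuple name and the pool/container identifiers are placeholders, and the exact URI scheme may differ between ROOT versions.

```python
import ROOT  # assumes a ROOT build with RNTuple (ROOT::Experimental) and DAOS support

RNTupleReader = ROOT.Experimental.RNTupleReader

# Reading from a local file:
reader = RNTupleReader.Open("Events", "data.root")

# The same analysis code against DAOS: only the storage locator changes
# (pool/container identifiers below are placeholders).
# reader = RNTupleReader.Open("Events", "daos://<pool-uuid>/<container-uuid>")

print(reader.GetNEntries())
```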

2 citations


Journal ArticleDOI
TL;DR: Based on mimic defense theory, the authors construct a principled framework for the distributed object storage system and introduce dynamic redundancy and heterogeneous functions into the architecture, which increases the attack cost and greatly improves the security and availability of data.
Abstract: With the advent of the big data era, cloud computing, the Internet of Things, and other information industries continue to develop, and there is an increasing amount of unstructured data such as pictures, audio, and video on the Internet. The distributed object storage system has become the mainstream cloud storage solution, and with the increasing number of distributed applications, data security in such systems has become a focal concern. Traditional defenses for the distributed object storage system either patch discovered vulnerabilities and backdoors or modify and upgrade the corresponding structures. However, both kinds of measures are reactive and can hardly deal with unknown security threats. Based on mimic defense theory, this paper constructs a principled framework for the distributed object storage system and introduces dynamic redundancy and heterogeneous functions into the architecture, which increases the attack cost and greatly improves the security and availability of data.
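
The dynamic-redundancy idea can be illustrated by dispatching each request to a randomly chosen set of heterogeneous executors and majority-voting their answers, so a minority of compromised variants cannot alter the result. A minimal sketch, not the paper's framework; all names are illustrative.

```python
import random
from collections import Counter

def handle_on_variant(variant, request):
    """Stand-in for heterogeneous executors (e.g. different OS/storage
    stacks). A compromised variant may return a tampered answer."""
    if variant == "compromised":
        return "tampered:" + request
    return "ok:" + request

def mimic_dispatch(request, pool, k=3):
    """Pick k executors at random (dynamic scheduling), run the request on
    each, and return the majority answer, masking a compromised minority."""
    chosen = random.sample(pool, k)
    answers = [handle_on_variant(v, request) for v in chosen]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner if votes > k // 2 else None  # no majority: reject

pool = ["linux-ext4", "bsd-zfs", "compromised", "linux-xfs"]
print(mimic_dispatch("read:object-42", pool))
```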

2 citations


Journal ArticleDOI
15 Oct 2021
TL;DR: In this paper, the authors propose a compositional, atomic distributed object (ADO) model for strongly consistent distributed systems that combines the best of both options, which abstracts over protocol-specific details and decouples high-level correctness reasoning from implementation choices.
Abstract: Despite recent advances, guaranteeing the correctness of large-scale distributed applications without compromising performance remains a challenging problem. Network and node failures are inevitable and, for some applications, careful control over how they are handled is essential. Unfortunately, existing approaches either completely hide these failures behind an atomic state machine replication (SMR) interface, or expose all of the network-level details, sacrificing atomicity. We propose a novel, compositional, atomic distributed object (ADO) model for strongly consistent distributed systems that combines the best of both options. The object-oriented API abstracts over protocol-specific details and decouples high-level correctness reasoning from implementation choices. At the same time, it intentionally exposes an abstract view of certain key distributed failure cases, thus allowing for more fine-grained control over them than SMR-like models. We demonstrate that proving properties even of composite distributed systems can be straightforward with our Coq verification framework, Advert, thanks to the ADO model. We also show that a variety of common protocols including multi-Paxos and Chain Replication refine the ADO semantics, which allows one to freely choose among them for an application's implementation without modifying ADO-level correctness proofs.
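
To make the "abstract failure cases" point concrete: an ADO-style API can report an operation as committed, failed, or uncertain instead of leaking raw network errors or hiding them entirely. The sketch below is our own toy rendering of that interface idea, not the Advert framework's actual API.

```python
class Replica:
    """Toy replica that may be down; apply() reports whether it acked."""
    def __init__(self, alive=True):
        self.alive, self.log = alive, []

    def apply(self, method, *args):
        if self.alive:
            self.log.append((method, args))
        return self.alive

class AtomicDistributedObject:
    """Toy ADO-style wrapper: callers see one object, and failures surface
    as an abstract outcome ('committed' / 'uncertain' / 'failed') rather
    than raw network errors (names are illustrative, not Advert's API)."""
    def __init__(self, replicas):
        self.replicas = replicas

    def invoke(self, method, *args):
        acks = sum(1 for r in self.replicas if r.apply(method, *args))
        if acks > len(self.replicas) // 2:
            return "committed"   # a majority acked: the call took effect
        if acks > 0:
            return "uncertain"   # exposed failure case: may or may not commit
        return "failed"

obj = AtomicDistributedObject([Replica(), Replica(), Replica(alive=False)])
print(obj.invoke("push", 42))  # -> committed (2 of 3 replicas acked)
```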

1 citation


Journal ArticleDOI
TL;DR: In this paper, a new approach based on an object-based storage method was designed and implemented, taking into account the lessons learned and leveraging the ATLAS experience with this kind of system.
Abstract: The Large Hadron Collider (LHC) is about to enter its third run at unprecedented energies. The experiments at the LHC face computational challenges with enormous data volumes that need to be analysed by thousands of physics users. The ATLAS EventIndex project, currently running in production, builds a complete catalogue of particle collisions, or events, for the ATLAS experiment at the LHC. The distributed nature of the experiment's data model is exploited by running jobs at over one hundred Grid data centers worldwide. Millions of files with petabytes of data are indexed, extracting a small quantity of metadata per event, which is conveyed by a data collection system in real time to a central Hadoop instance at CERN. After a successful first implementation based on a messaging system, several issues pointed to performance bottlenecks at the challenging higher rates of the experiment's next runs. In this work we characterize the weaknesses of the previous messaging system regarding complexity, scalability, performance, and resource consumption. A new approach based on an object-based storage method was designed and implemented, taking into account the lessons learned and leveraging the ATLAS experience with this kind of system. We present an experiment that we ran for three months in the real worldwide production scenario in order to evaluate the messaging and object-store approaches. The results show that the new object-based storage method can efficiently support large-scale data collection for big-data environments like the next runs of the ATLAS experiment at the LHC.
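
The object-store approach can be sketched as producers packing a job's worth of small per-event metadata records into a single object, rather than streaming one message per event through a broker. The example below uses the boto3 S3 client purely as a stand-in for the actual object store; the endpoint, bucket, key layout, and record fields are invented for illustration.

```python
import json
import boto3  # S3-compatible client used here only as a stand-in object store

# Placeholder endpoint and region; a real deployment would point at the
# experiment's object store.
s3 = boto3.client("s3", endpoint_url="https://objectstore.example.org",
                  region_name="us-east-1")

def publish_event_metadata(dataset, job_id, events):
    """Instead of streaming one message per event through a broker, pack a
    Grid job's worth of small metadata records into a single object."""
    body = "\n".join(json.dumps(e) for e in events).encode()
    key = f"{dataset}/{job_id}.jsonl"  # consumers list and read per dataset
    s3.put_object(Bucket="eventindex", Key=key, Body=body)
    return key

# Hypothetical per-event metadata records.
events = [{"run": 358031, "event": i, "guid": f"F-{i:08d}"} for i in range(1000)]
# publish_event_metadata("data18_13TeV", "grid-job-0001", events)
```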

1 citation


Patent
25 May 2021
TL;DR: In this article, a distributed object store can expose object metadata, in addition to object data, to distributed processing systems such as Hadoop and Apache Spark; it may act as an HCFS, exposing object metadata as a collection of records that can be efficiently processed by MapReduce (MR) and other distributed processing frameworks.
Abstract: A distributed object store can expose object metadata, in addition to object data, to distributed processing systems such as Hadoop and Apache Spark. The distributed object store may act as a Hadoop Compatible File System (HCFS), exposing object metadata as a collection of records that can be efficiently processed by MapReduce (MR) and other distributed processing frameworks. Various metadata record formats are supported. Related methods are also described.
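
The value of exposing metadata as records is that analytics can run over metadata without touching object payloads. Below is a toy illustration of such an aggregation; the records and fields are hypothetical, and a real deployment would run this as a MapReduce or Spark job over the HCFS view.

```python
from functools import reduce

# Hypothetical object-metadata records of the kind the store would expose
# as an HCFS record collection (fields invented for illustration).
metadata_records = [
    {"key": "img/001.jpg", "size": 2_048_576, "owner": "alice"},
    {"key": "img/002.jpg", "size": 1_310_720, "owner": "bob"},
    {"key": "log/a.txt",   "size": 4_096,     "owner": "alice"},
]

# MapReduce-style aggregation over metadata only: total bytes per owner.
mapped = [(r["owner"], r["size"]) for r in metadata_records]

def reducer(acc, kv):
    owner, size = kv
    acc[owner] = acc.get(owner, 0) + size
    return acc

print(reduce(reducer, mapped, {}))  # {'alice': 2052672, 'bob': 1310720}
```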

Posted Content
TL;DR: In this article, the authors developed a simple, user-friendly platform built for academic and scientific research collaboration, which consists of a metadata quality control based on blockchain technologies, and the data is stored separately in a distributed object storage that functions as a cloud.
Abstract: The Hybrid Technology Hub and many other research centers work in cross-functional teams whose workflow is not necessarily linear and where, in many cases, technology advances are made through parallel work. The lack of proper tools and platforms for a collaborative environment can create time lags in coordination and limit the sharing of research findings. To solve this, we have developed a simple, user-friendly platform built for academic and scientific research collaboration. To ensure FAIRness compliance, the platform includes metadata quality control based on blockchain technologies. The data is stored separately in a distributed object storage that functions as a cloud. The platform also implements a version control system: it keeps a history of the project and allows its development to be reviewed. This platform aims to be a standardized tool within the Hybrid Technology Hub to ease collaboration, speed up research workflows, and improve research quality.

Proceedings ArticleDOI
26 May 2021
TL;DR: In this article, the authors present the results of research in the field of forming a unified geo-information environment, which is a development of the concept of the cyber environment of virtual enterprises.
Abstract: The article presents the results of research on forming a unified geo-information environment. The concept of the geo-information environment develops the concept of the cyber environment of virtual enterprises, which is formed from agents of three types (individuals, legal entities, and man-made objects), by introducing geo-information into it. The geo-information cyber environment provides tools that are best suited to automating the management of spatially distributed objects.

DOI
11 Aug 2021
TL;DR: In this paper, the authors present a simple program handling an auction sale in a CORBA application, written in the Java programming language, whose method is invoked by a client from another machine that also runs Windows.
Abstract: The use of the Common Object Request Broker Architecture (CORBA) has become one of the answers to the requirement for interoperability among the rapidly increasing number of hardware and software products available nowadays. CORBA has been introduced as a mechanism in distributed computing environments to overcome this interoperability issue. The mechanism allows distributed objects to communicate with each other, whether they operate on remote or local devices, are written in different languages or for different platforms, or sit at different locations on a network. In this paper, the concept of a CORBA application as middleware is presented. To illustrate this concept, a simple program handling an auction sale was developed using the Java programming language. The application is implemented on the Windows Operating System (OS), and its method is invoked by a client from another machine that also runs Windows. The benefits of CORBA and its limitations are also discussed. Keywords—CORBA concept; client-server; benefits of CORBA; application example
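
CORBA itself needs an ORB and IDL-generated stubs, which would not fit in a short listing, but the programming model the paper demonstrates (a client invoking a method on a remote auction object as if it were local) can be shown with Python's standard-library RPC as a stand-in. This is explicitly not CORBA; it only mirrors the remote-invocation pattern.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

class Auction:
    """Remote object: clients on other machines invoke place_bid()."""
    def __init__(self):
        self.highest = 0

    def place_bid(self, amount):
        self.highest = max(self.highest, amount)
        return self.highest

# Server side (the "object implementation" an ORB would host).
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_instance(Auction())
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the remote invocation reads like a local method call,
# which is the same programming model CORBA stubs provide.
client = ServerProxy("http://localhost:8000")
print(client.place_bid(150))  # -> 150
```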

Journal ArticleDOI
01 Mar 2021
TL;DR: Domestic products, including Phytium chips, Starblaze open-channel flash memory devices, and the Kylin operating system, are adopted as components of a new domestic software and hardware platform for the distributed storage of meteorological data.
Abstract: Meteorological data are important information assets in China and have made tremendous contributions to disaster prevention and reduction. However, the current meteorological system has long depended on foreign hardware and software platforms, which poses obvious security problems; completing the domestication of the meteorological system and improving its performance is an urgent task. At the same time, as the black-box architecture of flash memory devices greatly hinders the co-optimization of software and hardware, open-channel flash memory has begun to attract attention as a new type of flash memory architecture. In this article, we adopt excellent new domestic products, including Phytium chips, Starblaze open-channel flash memory devices, and the Kylin operating system, as the components of a new domestic software and hardware platform. An object storage system for meteorological data has been implemented in a real environment with Ceph. We provide a build solution and optimize its parameters. Based on a comparison with a commercial platform's storage system, the problems and challenges faced by an autonomous meteorological-data distributed object storage system are discussed. Finally, this article looks ahead to the distributed storage of meteorological data on domestic platforms.
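
Ceph object storage is typically driven through librados; a hedged sketch using the python-rados binding is shown below, assuming a reachable cluster. The pool name, object key, and ceph.conf path are placeholders.

```python
import rados  # python-rados binding shipped with Ceph

def store_observation(pool, key, payload):
    """Write one meteorological record as a RADOS object and read it back;
    the pool name and ceph.conf path below are placeholders."""
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            ioctx.write_full(key, payload)  # replace object contents atomically
            return ioctx.read(key)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

# store_observation("meteo", "station-042/2021-03-01T00Z", b"t=3.2C;p=1013hPa")
```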

Journal ArticleDOI
30 Aug 2021
TL;DR: It is shown that modern wireless sensor networks can be considered distributed information-measuring and information-control systems.
Abstract: The problems of control and management of geographically distributed objects are considered, with a focus on sensor networks operating over ZigBee technology. The characteristics of the IEEE 802.15.4 ZigBee standard are given, and the advantages of this technology are shown when building networks that are not very critical to traffic delays. The elements of such a network are described, including the primary converters used and their energy characteristics, together with approaches to reducing and compensating for delays in control loops. It is shown that modern wireless sensor networks can be considered distributed information-measuring and information-control systems.

Proceedings ArticleDOI
30 Jun 2021
TL;DR: In this paper, the authors propose a method for replacing a volume-distributed object (a moisture target) with two-dimensional four-point partially coherent matrix-simulator models, and explore the synthesis of the starting multipoint model of the moisture target based on the required power and wind-velocity distributions over the object.
Abstract: A method for replacing a volume-distributed object (a moisture target) with two-dimensional four-point partially coherent matrix-simulator models is proposed in this paper. The synthesis of the starting multipoint model of the moisture target, based on the required power and wind-velocity distributions over the object, is explored. Expressions are derived that allow the moisture target to be replaced with a low-point partially coherent model built from the multipoint starting model.


DOI
09 Nov 2021
TL;DR: In this article, the authors evaluate the performance of CephFS on cost-optimized hardware when it is combined with EOS to supply missing functionalities, including third-party copy, SciTokens, and high-level user and quota management.
Abstract: CephFS is a network filesystem built upon the Reliable Autonomic Distributed Object Store (RADOS). At CERN we have demonstrated its reliability and elasticity while operating several 100-to-1000TB clusters that provide NFS-like storage to infrastructure applications and services. At the same time, our lab developed EOS to offer high-performance 100PB-scale storage for the LHC at extremely low cost while also supporting the complete set of security and functional APIs required by the particle-physics user community. This work evaluates the performance of CephFS on this cost-optimized hardware when it is combined with EOS to supply the missing functionalities. To this end, we have set up a proof-of-concept Ceph Octopus cluster on high-density JBOD servers (840 TB each) with 100Gig-E networking. The system uses EOS to provide an overlaid namespace and protocol gateways for HTTP(S) and XROOTD, and uses CephFS as an erasure-coded object storage backend. The solution also enables operators to aggregate several CephFS instances and adds features such as third-party copy, SciTokens, and high-level user and quota management. Using simple benchmarks, we measure the cost/performance trade-offs of different erasure-coding layouts, as well as the network overheads of these coding schemes. We demonstrate some relevant limitations of the CephFS metadata server and offer improved tunings that can be generally applicable. To conclude, we reflect on the advantages and drawbacks of this architecture, such as RADOS-level free-space requirements and double-network penalties, and offer ideas for future improvements.
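
The space side of the erasure-coding trade-off reduces to simple ratios: an EC(k, m) layout stores (k+m)/k raw bytes per logical byte and sends roughly that much extra data on writes. Below is a small helper to compare layouts; this is illustrative arithmetic only, not the paper's measured numbers.

```python
def ec_costs(k, m, object_mb=64):
    """Raw-space overhead and write network traffic for an EC(k, m) layout,
    where an object is split into k data chunks plus m parity chunks."""
    overhead = (k + m) / k                # raw bytes stored per logical byte
    write_traffic = object_mb * overhead  # MB sent over the network per write
    return overhead, write_traffic

for k, m in [(4, 2), (8, 3), (16, 4)]:
    oh, wt = ec_costs(k, m)
    print(f"EC({k},{m}): {oh:.2f}x raw space, {wt:.0f} MB network per 64 MB write")
```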

Journal ArticleDOI
TL;DR: GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data, developed on an OpenStack-based private cloud with services such as Keystone for authentication, Cinder for block storage, Neutron for networking, and Nova for managing compute instances.
Abstract: GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is developed on an OpenStack-based private cloud with services such as Keystone for authentication, Cinder for block storage, Neutron for networking, and Nova for managing compute instances. GenoVault uses object-based storage, which stores data as objects instead of files or blocks for faster retrieval from different distributed object nodes. Along with a web-based interface, a JavaFX-based desktop client has also been developed to meet the requirements of the large file uploads typical of NGS datasets. Users store files in their respective object-storage areas, and the metadata provided during file upload is used for querying the database. The GenoVault repository is designed with future needs in mind and can scale both vertically and horizontally using OpenStack cloud features. Users have the option to make their data public or keep access private. Data security is ensured because every container is a separate entity in the object-based storage architecture, which also supports the Secure File Transfer Protocol (SFTP) for data upload and download. Data is uploaded by users into individual containers that hold raw read files (fastq), processed alignment files (bam, sam, bed), and the output of variation detection (vcf). The GenoVault architecture allows verification of data integrity and authenticity before making data available to collaborators according to the owner's permissions. GenoVault is useful for maintaining organization-wide NGS data generated in various labs that has not yet been published and submitted to public repositories such as NCBI, and it supports sharing NGS data among collaborating institutions. GenoVault can thus manage vast volumes of NGS data on any OpenStack-based private cloud.
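
An upload into per-user containers of the kind GenoVault describes can be sketched with the python-swiftclient library, assuming the object store exposes a Swift API; the auth URL, credentials, container name, and metadata header are placeholders.

```python
from swiftclient import client as swift  # pip install python-swiftclient

def upload_reads(container, path):
    """Upload one NGS file into a per-user container on OpenStack object
    storage; auth URL, credentials, and metadata header are placeholders."""
    conn = swift.Connection(authurl="https://cloud.example.org/auth/v1.0",
                            user="lab:researcher", key="secret")
    with open(path, "rb") as fh:
        # Custom object metadata (x-object-meta-*) stays queryable later.
        conn.put_object(container, path.split("/")[-1], contents=fh,
                        headers={"x-object-meta-assay": "wgs"})

# upload_reads("alice-private", "sample01.fastq")
```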