Topic

Degree of parallelism

About: Degree of parallelism is a research topic. Over its lifetime, 1,515 publications have been published within this topic, receiving 25,546 citations.


Papers
Patent
04 Jun 2014
TL;DR: This patent presents a system and method for multi-core parallel assembly-line (pipeline) signal processing in a 4G broadband communication system based on a general-purpose processor (GPP), in which a scheduler divides large volumes of data and computing tasks into reasonable granularities and distributes them to the assembly lines at each level for processing.
Abstract: The invention discloses a system and method for multi-core parallel assembly-line signal processing in a 4G broadband communication system based on a GPP. To meet the strict real-time requirements of the 4G communication system, the cloud computing idea is adopted: the GPP serves as the computing resource, communication data are processed in a GPP-based assembly-line mode, and a scheduler divides large volumes of data and computing tasks into reasonable granularities and distributes them to the assembly lines at each level for processing. When hardware performance is limited, the assembly-line mode meets the real-time requirement more easily; in addition, a time margin is introduced so that the system can tolerate large delay variation. Reasonable scheduling makes full use of the computing resources. Three kinds of assembly line are designed in total: one is suited to large data volumes and is more reliable, another is suited to small data volumes and is fast and flexible, and the third is a composite assembly line that combines the two application scenarios and has a high degree of parallelism, with markedly improved performance.

23 citations

Journal Article
TL;DR: It is shown that, although increased parallelism increases radiation sensitivity, the performance gains generally outweigh it in terms of global failure rate, and that an 8-bit integer design can deliver over six times more fault-free executions than a 32-bit floating-point implementation.
Abstract: Convolutional neural networks (CNNs) are becoming attractive alternatives to traditional image-processing algorithms in self-driving vehicles for automotive, military, and aerospace applications. The high computational demand of state-of-the-art CNN architectures requires the use of hardware acceleration on parallel devices. Field-programmable gate arrays (FPGAs) offer a great level of design flexibility, low power consumption, and are relatively low cost, which make them very good candidates for efficiently accelerating neural networks. Unfortunately, the configuration memories of SRAM-based FPGAs are sensitive to radiation-induced errors, which can compromise the circuit implemented on the programmable fabric and the overall reliability of the system. Through neutron beam experiments, we evaluate how lossless quantization processes and subsequent data precision reduction impact the area, performance, radiation sensitivity, and failure rate of neural networks on FPGAs. Our results show that an 8-bit integer design can deliver over six times more fault-free executions than a 32-bit floating-point implementation. Moreover, we discuss the tradeoffs associated with varying degrees of parallelism in a neural network accelerator. We show that, although increased parallelism increases radiation sensitivity, the performance gains generally outweigh it in terms of global failure rate.

23 citations
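
The tradeoff described in the abstract above can be illustrated with a first-order model. This is a sketch under stated assumptions, not the authors' method: it assumes the expected number of radiation-induced errors per execution is the product of the design's cross-section, the particle flux, and the execution time. The function name and all numbers are hypothetical and chosen only to show the arithmetic.

```python
def errors_per_execution(cross_section_cm2, flux_per_cm2_s, exec_time_s):
    """Expected errors in one execution under a first-order model (assumption):
    sensitivity (cross-section) times particle flux times exposure time."""
    return cross_section_cm2 * flux_per_cm2_s * exec_time_s


# Hypothetical values, not taken from the paper.
FLUX = 1e5  # particles per cm^2 per second

# Baseline design: smaller sensitive area, but a slower execution.
baseline = errors_per_execution(2e-9, FLUX, 0.10)

# More parallel design: 1.5x the cross-section, but 2.5x faster.
parallel = errors_per_execution(3e-9, FLUX, 0.04)

# A ratio below 1 means the parallel design suffers fewer errors per execution
# even though it is more sensitive to radiation per unit time.
ratio = parallel / baseline
```

With these illustrative numbers the more parallel design is more sensitive per unit time, yet each execution is exposed for less time, so the expected errors per execution drop. Whether that holds for a real design depends on how cross-section and execution time actually scale.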

Journal Article
TL;DR: The authors propose DiagRSMarch, a march algorithm for built-in self-diagnosis that can be proved to identify all stuck-at, transition, state coupling, and dynamic coupling faults in all memory arrays; its test efficiency is highly dependent on memory topology, defect-type distribution, and degree of parallelism.
Abstract: In this paper, the authors propose a new built-in self-diagnosis method to simultaneously diagnose spatially distributed memory modules with different sizes. Based on the serial interfacing technique, the serial fault masking effect is observed and a bidirectional serial interfacing technique is proposed to deal with such an issue. By tolerating redundant read/write operations, they develop a new march algorithm called DiagRSMarch to achieve the goals of low test signal routing overhead, tolerable diagnostic time, and high diagnostic coverage. It can be proved that DiagRSMarch can identify all stuck-at, transition, state coupling, and dynamic coupling faults occurring in all memory arrays. Experimental results also demonstrate that the test efficiency of DiagRSMarch is highly dependent on memory topology, defect-type distribution, and degree of parallelism.

23 citations

Proceedings Article
TL;DR: This paper presents NXgraph, which stores a graph in the Destination-Sorted Sub-Shard (DSSS) structure: vertices and edges are divided into intervals and sub-shards to ensure graph data access locality and enable fine-grained scheduling.
Abstract: Recent studies show that graph processing systems on a single machine can achieve competitive performance compared with cluster-based graph processing systems. In this paper, we present NXgraph, an efficient graph processing system on a single machine. With the abstraction of vertex intervals and edge sub-shards, we propose the Destination-Sorted Sub-Shard (DSSS) structure to store a graph. By dividing vertices and edges into intervals and sub-shards, NXgraph ensures graph data access locality and enables fine-grained scheduling. By sorting edges within each sub-shard according to their destination vertices, NXgraph reduces write conflicts among different threads and achieves a high degree of parallelism. Then, three updating strategies, i.e., Single-Phase Update (SPU), Double-Phase Update (DPU), and Mixed-Phase Update (MPU), are proposed in this paper. NXgraph can adaptively choose the fastest strategy for different graph problems according to the graph size and the available memory resources to fully utilize the memory space and reduce the amount of data transfer. All three strategies exploit a streamlined disk access pattern. Extensive experiments on three real-world graphs and five synthetic graphs show that NXgraph can outperform GraphChi, TurboGraph, VENUS, and GridGraph in various situations. Moreover, NXgraph, running on a single commodity PC, can finish an iteration of PageRank on the Twitter graph with 1.5 billion edges in 2.05 seconds, while PowerGraph, a distributed graph processing system, needs 3.6 seconds to finish the same task.

23 citations
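
The partitioning idea in the NXgraph abstract above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not NXgraph's actual implementation or API: the function names `build_sub_shards` and `push_values` are hypothetical, vertices are assumed to be integers from 0 to num_vertices - 1, and intervals are equal-sized contiguous ranges.

```python
from collections import defaultdict


def build_sub_shards(edges, num_vertices, num_intervals):
    """Group edges into sub-shards keyed by (source interval, destination
    interval), then sort each sub-shard by destination vertex."""
    size = -(-num_vertices // num_intervals)  # ceiling division
    shards = defaultdict(list)
    for src, dst in edges:
        shards[(src // size, dst // size)].append((src, dst))
    for key in shards:
        shards[key].sort(key=lambda edge: (edge[1], edge[0]))
    return dict(shards), size


def push_values(shards, values, num_intervals):
    """Add each source vertex's value to its destination vertices.

    All sub-shards with destination interval j write only to vertices in
    interval j, so each iteration of the outer loop could be handed to a
    separate worker without write conflicts. It runs sequentially here."""
    result = [0.0] * len(values)
    for j in range(num_intervals):
        for i in range(num_intervals):
            for src, dst in shards.get((i, j), []):
                result[dst] += values[src]
    return result


edges = [(0, 3), (2, 1), (1, 3), (3, 0), (0, 1)]
shards, size = build_sub_shards(edges, num_vertices=4, num_intervals=2)
totals = push_values(shards, [1.0, 2.0, 3.0, 4.0], num_intervals=2)
```

Grouping by destination interval is what makes the writes disjoint in this sketch; sorting within a sub-shard additionally keeps writes to the same destination vertex adjacent.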

Journal Article
23 Oct 2009-Entropy
TL;DR: An investigation of how morphology, i.e., the shape of components, affects a self-assembly process shows that the assembly processes were affected by the aggregation sequence in their early stages, where shape induces different behaviors and thus results in variations in aggregation speeds.
Abstract: Self-assembly is a key phenomenon whereby vast numbers of individual components passively interact and form organized structures, as can be seen, for example, in the morphogenesis of a virus. Generally speaking, the process can be viewed as a spatial placement of attractive and repulsive components. In this paper, we report on an investigation of how morphology, i.e., the shape of components, affects a self-assembly process. The experiments were conducted with 3 differently shaped floating tiles equipped with magnets in an agitated water tank. We propose a novel measure involving clustering coefficients, which qualifies the degree of parallelism of the assembly process. The results showed that the assembly processes were affected by the aggregation sequence in their early stages, where shape induces different behaviors and thus results in variations in aggregation speeds.

22 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 85% related
Scheduling (computing): 78.6K papers, 1.3M citations, 83% related
Network packet: 159.7K papers, 2.2M citations, 80% related
Web service: 57.6K papers, 989K citations, 80% related
Quality of service: 77.1K papers, 996.6K citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    47
2020    48
2019    52
2018    70
2017    75