Topic

# Significand

About: Significand is a research topic. Over its lifetime, 1,104 publications have been published within this topic, receiving 9,953 citations. The topic is also known as: coefficient & mantissa.
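For concreteness, a small Python sketch (not drawn from any paper below) showing how the significand and exponent of an IEEE 754 double can be inspected, and how the 53-bit significand width bounds exact integer arithmetic:

```python
import math

# Every finite float x can be written x = m * 2**e, where m is the
# significand (historically "mantissa") and e is the exponent.
# math.frexp returns m normalized to [0.5, 1) plus the integer exponent.
m, e = math.frexp(6.5)
print(m, e)  # 0.8125 3  (6.5 == 0.8125 * 2**3)

# An IEEE 754 double carries a 53-bit significand, so integers above
# 2**53 can no longer all be represented exactly:
print(2.0**53 + 1 == 2.0**53)  # True: the +1 is rounded away
```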

##### Papers published on a yearly basis

##### Papers



[...]

TL;DR: Algorithms for summation and dot product of floating-point numbers are presented which are fast in terms of measured computing time, and the computed results are shown to be as accurate as if computed in twice or K-fold working precision.

Abstract: Algorithms for summation and dot product of floating-point numbers are presented which are fast in terms of measured computing time. We show that the computed results are as accurate as if computed in twice or K-fold working precision, $K\ge 3$. For twice the working precision our algorithms for summation and dot product are some 40% faster than the corresponding XBLAS routines while sharing similar error estimates. Our algorithms are widely applicable because they require only addition, subtraction, and multiplication of floating-point numbers in the same working precision as the given data. Higher precision is unnecessary, algorithms are straight loops without branch, and no access to mantissa or exponent is necessary.

403 citations
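The error-free transformations these algorithms build on use only working-precision addition, subtraction, and multiplication, as the abstract notes. A minimal Python sketch of that idea (Knuth's branch-free TwoSum plus a cascaded summation in the spirit of Sum2; a simplified illustration, not the paper's full algorithm):

```python
def two_sum(a, b):
    # Knuth's error-free transformation: returns (s, e) with
    # s = fl(a + b) and a + b == s + e exactly. No branches, no
    # access to mantissa or exponent -- only add and subtract.
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e

def sum2(p):
    # Cascaded summation: accumulate the rounding errors of each
    # addition and fold them back in at the end, giving a result as
    # accurate as if computed in twice the working precision.
    s, sigma = 0.0, 0.0
    for x in p:
        s, e = two_sum(s, x)
        sigma += e
    return s + sigma
```

For the ill-conditioned input `[1.0, 1e16, -1e16]`, naive left-to-right summation returns `0.0` (the 1.0 is rounded away when added to 1e16), while `sum2` recovers `1.0`.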


[...]

TL;DR: In this paper, algorithms for various arithmetic operations (including the four basic operations and various algebraic and transcendental operations) on quad-double numbers are presented and implemented in C++.

Abstract: A quad-double number is an unevaluated sum of four IEEE double precision numbers, capable of representing at least 212 bits of significand. We present the algorithms for various arithmetic operations (including the four basic operations and various algebraic and transcendental operations) on quad-double numbers. The performance of the algorithms, implemented in C++, is also presented.

260 citations
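The "unevaluated sum" representation is easiest to sketch with two components rather than four. The following double-double addition (a simplified two-component illustration; the paper's quad-double algorithms extend the same expansion-and-renormalization idea to four doubles) is built from the standard TwoSum error-free transformation:

```python
def two_sum(a, b):
    # Error-free transformation (Knuth): a + b == s + e exactly.
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e

def dd_add(xh, xl, yh, yl):
    # Add two double-double numbers (xh + xl) + (yh + yl), where the
    # value is the unevaluated sum of a high and a low component.
    sh, sl = two_sum(xh, yh)   # exact sum of the high parts
    sl += xl + yl              # fold in the low parts
    sh, sl = two_sum(sh, sl)   # renormalize: |sl| small vs ulp(sh)
    return sh, sl
```

For example, adding `(1.0 + 1e-17)` to itself yields the pair `(2.0, 2e-17)`; the low word preserves the part that a single double would round away.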


[...]

Hitachi

TL;DR: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different group of data input signal lines and performing mutual multiplication of the supplied mantissa parts, as discussed by the authors.

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit. The arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different group of data input signal lines and performing mutual multiplication of the supplied mantissa parts; an aligner receiving the outputs of the respective multipliers and performing an alignment shift; an exponent processing portion generating the aligner's alignment-shift amount and an exponent before normalization on the basis of the exponent parts of the floating point numbers; and a multi-input adder. This reduces the scale of the circuit and performs inner product operations and the like on floating point numbers at high speed and high accuracy.

172 citations


[...]

TL;DR: This paper presents an algorithm for calculating a faithful rounding of a vector of floating-point numbers, which adapts to the condition number of the sum, and proves certain constants used in the algorithm to be optimal.

Abstract: Given a vector of floating-point numbers with exact sum $s$, we present an algorithm for calculating a faithful rounding of $s$, i.e., the result is one of the immediate floating-point neighbors of $s$. If the sum $s$ is a floating-point number, we prove that this is the result of our algorithm. The algorithm adapts to the condition number of the sum, i.e., it is fast for mildly conditioned sums with slowly increasing computing time proportional to the logarithm of the condition number. All statements are also true in the presence of underflow. The algorithm does not depend on the exponent range. Our algorithm is fast in terms of measured computing time because it allows good instruction-level parallelism, it neither requires special operations such as access to mantissa or exponent, it contains no branch in the inner loop, nor does it require some extra precision: The only operations used are standard floating-point addition, subtraction, and multiplication in one working precision, for example, double precision. Certain constants used in the algorithm are proved to be optimal.

172 citations
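Faithful rounding means the computed result is one of the two floating-point neighbors of the exact sum $s$. Python's `math.fsum` (a different, correctly rounded summation shown here only to illustrate the property, not the paper's algorithm) makes the contrast with naive recursive summation visible on an ill-conditioned input:

```python
import math

def naive_sum(p):
    # Plain left-to-right recursive summation in working precision.
    s = 0.0
    for x in p:
        s += x
    return s

# Massive cancellation: the 1e100 terms cancel exactly, but naive
# summation has already rounded the 1.0 away by then.
p = [1e100, 1.0, -1e100, 0.1]

print(naive_sum(p))  # 0.1 -- the 1.0 was lost to rounding
print(math.fsum(p))  # 1.1 -- correctly rounded, hence also faithful
```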


[...]

Honeywell

TL;DR: In this article, a scientific processing unit includes a microprogrammable arithmetic processing apparatus for performing floating point arithmetic operations with operands in long and short form, including a microprogrammable control section and a plurality of microprocessor arithmetic and logic unit chip stages organized into two sections with carry look-ahead circuits coupled thereto.

Abstract: A scientific processing unit includes a microprogrammable arithmetic processing apparatus for performing floating point arithmetic operations with operands in long and short form. The apparatus includes a microprogrammable control section and a plurality of microprocessor arithmetic and logic unit chip stages organized into two sections, with carry look-ahead circuits coupled thereto. One section includes a predetermined number of series-coupled stages connected to process exponent values or long mantissa values. The other section includes another predetermined number of series-coupled stages connected to process short mantissa values. Control circuits interconnect the stages of both sections and connect to the carry look-ahead circuits and to the microprogrammed control section. During the performance of an arithmetic operation, the control circuits, in response to signals from the control section, selectively split the two sections and inhibit the propagation of carries generated by the carry look-ahead circuits, so that the two sections can operate independently or as a single unit as desired for efficient execution of the arithmetic operation.

167 citations