% Copyright The Numerical Algorithms Group Limited 1992-94. All rights reserved. % !! DO NOT MODIFY THIS FILE BY HAND !! Created by ht.awk. \newcommand{\FloatXmpTitle}{Float} \newcommand{\FloatXmpNumber}{9.27} % % ===================================================================== \begin{page}{FloatXmpPage}{9.27 Float} % ===================================================================== \beginscroll % \Language{} provides two kinds of floating point numbers. The domain \spadtype{Float} (abbreviation \spadtype{FLOAT}) implements a model of arbitrary %-% \HDindex{floating point!arbitrary precision}{FloatXmpPage}{9.27}{Float} precision floating point numbers. The domain \spadtype{DoubleFloat} (abbreviation \spadtype{DFLOAT}) is intended to make available %-% \HDindex{floating point!hardware}{FloatXmpPage}{9.27}{Float} hardware floating point arithmetic in \Language{}. %-% \HDexptypeindex{DoubleFloat}{FloatXmpPage}{9.27}{Float} The actual model of floating point that \spadtype{DoubleFloat} provides is system-dependent. For example, on the IBM system 370 \Language{} uses IBM double precision which has fourteen hexadecimal digits of precision or roughly sixteen decimal digits. Arbitrary precision floats allow the user to specify the precision at which arithmetic operations are computed. Although this is an attractive facility, it comes at a cost. Arbitrary-precision floating-point arithmetic typically takes twenty to two hundred times more time than hardware floating point. For more information about \Language{}'s numeric and graphic facilities, see \downlink{``\ugGraphTitle''}{ugGraphPage} in Section \ugGraphNumber\ignore{ugGraph}, \downlink{``\ugProblemNumericTitle''}{ugProblemNumericPage} in Section \ugProblemNumericNumber\ignore{ugProblemNumeric}, and \downlink{`DoubleFloat'}{DoubleFloatXmpPage}\ignore{DoubleFloat}. \beginmenu \menudownlink{{9.27.1. Introduction to Float}}{ugxFloatIntroPage} \menudownlink{{9.27.2. Conversion Functions}}{ugxFloatConvertPage} \menudownlink{{9.27.3. Output Functions}}{ugxFloatOutputPage} \menudownlink{{9.27.4. An Example: Determinant of a Hilbert Matrix}}{ugxFloatHilbertPage} \endmenu \endscroll \autobuttons \end{page} % % \newcommand{\ugxFloatIntroTitle}{Introduction to Float} \newcommand{\ugxFloatIntroNumber}{9.27.1.} % % ===================================================================== \begin{page}{ugxFloatIntroPage}{9.27.1. Introduction to Float} % ===================================================================== \beginscroll Scientific notation is supported for input and output %-% \HDindex{floating point!input}{ugxFloatIntroPage}{9.27.1.}{Introduction to Float} of floating point numbers. %-% \HDindex{scientific notation}{ugxFloatIntroPage}{9.27.1.}{Introduction to Float} A floating point number is written as a string of digits containing a decimal point optionally followed by the letter ``{\tt E}'', and then the exponent. \xtc{ We begin by doing some calculations using arbitrary precision floats. The default precision is twenty decimal digits. }{ \spadpaste{1.234} } \xtc{ A decimal base for the exponent is assumed, so the number \spad{1.234E2} denotes \texht{$1.234 \cdot 10^2$}{\spad{1.234 * 10**2}}. }{ \spadpaste{1.234E2} } \xtc{ The normal arithmetic operations are available for floating point numbers. }{ \spadpaste{sqrt(1.2 + 2.3 / 3.4 ** 4.5)} } \endscroll \autobuttons \end{page} % % \newcommand{\ugxFloatConvertTitle}{Conversion Functions} \newcommand{\ugxFloatConvertNumber}{9.27.2.} % % ===================================================================== \begin{page}{ugxFloatConvertPage}{9.27.2. Conversion Functions} % ===================================================================== \beginscroll \labelSpace{3pc} \xtc{ You can use conversion (\downlink{``\ugTypesConvertTitle''}{ugTypesConvertPage} in Section \ugTypesConvertNumber\ignore{ugTypesConvert}) to go back and forth between \spadtype{Integer}, \spadtype{Fraction Integer} and \spadtype{Float}, as appropriate. }{ \spadpaste{i := 3 :: Float \bound{i}} } \xtc{ }{ \spadpaste{i :: Integer \free{i}} } \xtc{ }{ \spadpaste{i :: Fraction Integer \free{i}} } \xtc{ Since you are explicitly asking for a conversion, you must take responsibility for any loss of exactness. }{ \spadpaste{r := 3/7 :: Float \bound{r}} } \xtc{ }{ \spadpaste{r :: Fraction Integer \free{r}} } \xtc{ This conversion cannot be performed: use \spadfunFrom{truncate}{Float} or \spadfunFrom{round}{Float} if that is what you intend. }{ \spadpaste{r :: Integer \free{r}} } \xtc{ The operations \spadfunFrom{truncate}{Float} and \spadfunFrom{round}{Float} truncate \ldots }{ \spadpaste{truncate 3.6} } \xtc{ and round to the nearest integral \spadtype{Float} respectively. }{ \spadpaste{round 3.6} } \xtc{ }{ \spadpaste{truncate(-3.6)} } \xtc{ }{ \spadpaste{round(-3.6)} } \xtc{ The operation \spadfunFrom{fractionPart}{Float} computes the fractional part of \spad{x}, that is, \spad{x - truncate x}. }{ \spadpaste{fractionPart 3.6} } \xtc{ The operation \spadfunFrom{digits}{Float} allows the user to set the precision. It returns the previous value it was using. }{ \spadpaste{digits 40 \bound{d40}} } \xtc{ }{ \spadpaste{sqrt 0.2} } \xtc{ }{ \spadpaste{pi()\$Float \free{d40}} } \xtc{ The precision is only limited by the computer memory available. Calculations at 500 or more digits of precision are not difficult. }{ \spadpaste{digits 500 \bound{d1000}} } \xtc{ }{ \spadpaste{pi()\$Float \free{d1000}} } \xtc{ Reset \spadfunFrom{digits}{Float} to its default value. }{ \spadpaste{digits 20} } Numbers of type \spadtype{Float} are represented as a record of two integers, namely, the mantissa and the exponent where the base of the exponent is binary. That is, the floating point number \spad{(m,e)} represents the number \texht{$m \cdot 2^e$}{\spad{m * 2**e}}. A consequence of using a binary base is that decimal numbers can not, in general, be represented exactly. \endscroll \autobuttons \end{page} % % \newcommand{\ugxFloatOutputTitle}{Output Functions} \newcommand{\ugxFloatOutputNumber}{9.27.3.} % % ===================================================================== \begin{page}{ugxFloatOutputPage}{9.27.3. Output Functions} % ===================================================================== \beginscroll A number of operations exist for specifying how numbers of type \spadtype{Float} are to be %-% \HDindex{floating point!output}{ugxFloatOutputPage}{9.27.3.}{Output Functions} displayed. By default, spaces are inserted every ten digits in the output for readability.\footnote{Note that you cannot include spaces in the input form of a floating point number, though you can use underscores.} \xtc{ Output spacing can be modified with the \spadfunFrom{outputSpacing}{Float} operation. This inserts no spaces and then displays the value of \spad{x}. }{ \spadpaste{outputSpacing 0; x := sqrt 0.2 \bound{x}\bound{os0}} } \xtc{ Issue this to have the spaces inserted every \spad{5} digits. }{ \spadpaste{outputSpacing 5; x \bound{os5}\free{x}} } \xtc{ By default, the system displays floats in either fixed format or scientific format, depending on the magnitude of the number. }{ \spadpaste{y := x/10**10 \bound{y}\free{x os5}} } \xtc{ A particular format may be requested with the operations \spadfunFrom{outputFloating}{Float} and \spadfunFrom{outputFixed}{Float}. }{ \spadpaste{outputFloating(); x \bound{of} \free{os5 x}} } \xtc{ }{ \spadpaste{outputFixed(); y \bound{ox} \free{os5 y}} } \xtc{ Additionally, you can ask for \spad{n} digits to be displayed after the decimal point. }{ \spadpaste{outputFloating 2; y \bound{of2} \free{os5 y}} } \xtc{ }{ \spadpaste{outputFixed 2; x \bound{ox2} \free{os5 x}} } \xtc{ This resets the output printing to the default behavior. }{ \spadpaste{outputGeneral()} } % \endscroll \autobuttons \end{page} % % \newcommand{\ugxFloatHilbertTitle}{An Example: Determinant of a Hilbert Matrix} \newcommand{\ugxFloatHilbertNumber}{9.27.4.} % % ===================================================================== \begin{page}{ugxFloatHilbertPage}{9.27.4. An Example: Determinant of a Hilbert Matrix} % ===================================================================== \beginscroll Consider the problem of computing the determinant of a \spad{10} by \spad{10} %-% \HDindex{Hilbert matrix}{ugxFloatHilbertPage}{9.27.4.}{An Example: Determinant of a Hilbert Matrix} Hilbert matrix. The \eth{\spad{(i,j)}} entry of a Hilbert matrix is given by \spad{1/(i+j+1)}. \labelSpace{2pc} \xtc{ First do the computation using rational numbers to obtain the exact result. }{ \spadpaste{a: Matrix Fraction Integer := matrix [[1/(i+j+1) for j in 0..9] for i in 0..9] \bound{a}} } \xtc{ This version of \spadfunFrom{determinant}{Matrix} uses Gaussian elimination. }{ \spadpaste{d:= determinant a \free{a}\bound{d}} } \xtc{ }{ \spadpaste{d :: Float \free{d}} } \xtc{ Now use hardware floats. Note that a semicolon (;) is used to prevent the display of the matrix. }{ \spadpaste{b: Matrix DoubleFloat := matrix [[1/(i+j+1\$DoubleFloat) for j in 0..9] for i in 0..9]; \bound{b}} } \xtc{ The result given by hardware floats is correct only to four significant digits of precision. In the jargon of numerical analysis, the Hilbert matrix is said to be ``ill-conditioned.'' %-% \HDindex{matrix!ill-conditioned}{ugxFloatHilbertPage}{9.27.4.}{An Example: Determinant of a Hilbert Matrix} }{ \spadpaste{determinant b \free{b}} } \xtc{ Now repeat the computation at a higher precision using \spadtype{Float}. }{ \spadpaste{digits 40 \bound{d40}} } \xtc{ }{ \spadpaste{c: Matrix Float := matrix [[1/(i+j+1\$Float) for j in 0..9] for i in 0..9]; \free{d40} \bound{c}} } \xtc{ }{ \spadpaste{determinant c \free{c}} } \xtc{ Reset \spadfunFrom{digits}{Float} to its default value. }{ \spadpaste{digits 20} } \endscroll \autobuttons \end{page} %