The HP Fortran Programmer Guide describes how to use the different features of HP Fortran to develop, compile, debug, and
optimize programs on HP 9000 systems and Integrity systems. It also describes how to migrate HP FORTRAN 77 programs to
the current HP Fortran compiler and how to use the different compiler features for porting programs written for other vendors’
Fortran to HP Fortran. If you have any problems with the software, please contact your local Hewlett-Packard Sales Office or
Customer Service Center.
You need not be familiar with the HP parallel architecture, programming models, or optimization concepts to understand the
concepts introduced in this book.
HP Fortran for Integrity systems is a part of the HP-UX PA-RISC Fortran product. HP Fortran is source compatible between the
HP 9000 and Integrity systems. However, HP Fortran for Integrity systems will not run on PA-RISC based systems.
HP Part Number: 766160-001
Published: March 2014
Edition: 12
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial
Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
The information contained in this document is subject to change without notice. The only warranties for HP products and services are set forth in the
express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty.
HP shall not be liable for technical or editorial errors or omissions contained herein.
Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group.
Contents
HP secure development lifecycle
1 An overview of HP Fortran
The HP Fortran compiler environment
HP secure development lifecycle
Starting with the HP-UX 11i v3 March 2013 update release, the HP secure development lifecycle provides
the ability to authenticate HP-UX software. Software delivered through this release has been digitally
signed using HP's private key. You can now verify the authenticity of the software before installing
the products delivered through this release.
To verify the software signatures in a signed depot, the following products must be installed on your
system:
• B.11.31.1303 or later version of SD (Software Distributor)
• A.01.01.07 or later version of HP-UX Whitelisting (WhiteListInf)
To verify the signatures, run: /usr/sbin/swsign -v -s <depot_path>. For more information,
see Software Distributor documentation at http://www.hp.com/go/sd-docs.
NOTE: Ignite-UX software delivered with the HP-UX 11i v3 March 2014 release or later supports
verification of the software signatures in a signed depot or media during cold installation.
For more information, see Ignite-UX documentation at http://www.hp.com/go/ignite-ux-docs.
1 An overview of HP Fortran
When you use the f90 command to compile a Fortran program, the command invokes a number
of components—and not just the compiler—to create the executable. By default, f90 invokes
different components to parse the source files for syntax errors, produce an intermediate code,
optimize the intermediate code, produce object code, search a set of library files for any additional
object code that may be required, and link all of the object code into an executable file that you
run without further processing.
For example, consider a program that consists of three source files: x.f90, y.f90, and z.f90.
The following command line will process the source files and, if they are syntactically correct,
produce an executable file with the default name a.out:
$ f90 x.f90 y.f90 z.f90
After compilation is complete, you can execute the program by invoking the name of the executable,
as follows:
$ a.out
However, it is likely that you’ll want to control what components act on your program and what
they do to it. For example, you may want to give the executable a name other than a.out or to
link in other libraries than the default ones. The HP Fortran compiler supports a variety of
command-line options that enable you to control the compilation process. This chapter provides
an overview of the process and of the options that enable you to control the different components
invoked by the f90 command.
NOTE: To get a summary listing of all f90 options, refer to the f90(1) manpage or use the
command, as shown here:
$ f90 +usage
For a full description of the options, see the most current version of the Parallel Programming Guide
for HP-UX Systems.
The HP Fortran compiler environment
Figure 1 illustrates the compilation process, showing the different
components of the HP Fortran compiler environment; active processes are unshaded and data
elements are shaded. With the exception of the performance analysis tools and the debugger
(WDB), all components are invoked by the f90 command. The C preprocessor and linker can also
be separately invoked by the cpp and ld commands; see the cpp(1) and ld(1) manpages,
respectively, for more information. The remaining sections in this chapter briefly describe the
different components and the command-line options that control them. Included in each section
are references to other parts of this manual for more detailed information.
Figure 1 HP Fortran compiler environment
Driver
The driver parses the f90 command line by which you invoke the compiler, calls each subprocess
as needed to complete the compilation, and retains control throughout the compilation process.
Command-line options that control driver functions enable you to do the following:
• Call subprocesses that you want to substitute for those that f90 calls by default
• Pass arguments to a subprocess
• Get a summary listing of all options supported by the compiler
• Display information about the version of f90 you are using
• Control the level of information that the driver displays about the compilation process
Table 1 (page 9) lists and briefly describes the options that control the driver.
Table 1 Options for controlling the f90 driver
Option
    Function
-c
    Suppress the link phase and produce an object file (.o) from each source file on the command line.
+dryrun
    Generate subprocess information for a given f90 command line without running the subprocesses.
-o outfile
    Name the output file outfile instead of the default file name (a.out or filename.o). If linking has been suppressed, the front end uses this option to name the object file.
+pre_include=file
    Process contents of file before all source files specified on the command line. The command line can have multiple occurrences of this option, each specifying a different file; they are processed in the specified order.
-tx,name
    Substitute a private version (name) of one or more subprocesses (x) of the compilation. The values for x are:
        a  Assembler
        c  Compiler
        l  Linker
        p  C preprocessor
        s  Startup file (crt0.o library)
        e  Debug file (end.o library)
    If you compile and link separately and specify -tl on the compile line, you must also specify it on the link line.
+usage
    List and briefly describe all f90 options.
-v
    Print verbose information to standard output as the program is compiled.
+version
    Write compiler version information to standard output, without compiling.
-Wx,arg1[,arg2,...,argN]
    Pass arg1 through argN to a subprocess of the compilation, identified by x. The values for x are:
        a  Assembler
        c  Compiler
        l  Linker
        p  C preprocessor
    If you compile and link separately and specify -Wl on the compile line, you must also specify it on the link line.
C preprocessor
HP Fortran source files can include directives that the C preprocessor (cpp) reads and expands
before the program is passed to the compiler. Among other things, cpp directives enable you to
code parts of your program for conditional compilation. By default, the f90 command passes
source files ending in the .F extension to the C preprocessor.
Table 2 (page 10) lists and briefly describes the options for controlling the preprocessor, including
the +cpp option, which overrides the default behavior and passes all source files on the command
line to the preprocessor. For additional information, see the cpp(1) manpage.
Table 2 Options for controlling the C preprocessor
Option
    Function
+cpp={yes|no|default}
    Invoke the C preprocessor. +cpp=yes passes all source files to the preprocessor. +cpp=default passes only files ending in the .F extension. +cpp=no suppresses passing any files. The default is +cpp=default.
+[no]cpp_keep
    Retain [discard] output from the C preprocessor. If the source filename is file.f or file.F, output is stored in file.i; if the source filename is file.f90, the output filename is file.i90. The default, +nocpp_keep, is to discard the output.
-Dname[=def]
    Define the symbol name to the preprocessor. If def is specified, name is defined to that value.
-Idirectory
    Add directory to the list of directories searched for files specified in include directives. The command line can have multiple occurrences of this option, each specifying a different directory.
-M
    Generate the make rule corresponding to the compilation; the output is sent to stdout or to a user-specified file. The rule describes the dependencies of the source file, containing the object file name for that source file, a colon, and the names of all the included files.
-Uname
    Remove any initial definition of name, a reserved symbol that is predefined by the preprocessor.
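As a rough illustration of the conditional compilation that cpp directives make possible, the following sketch guards a diagnostic PRINT statement with a preprocessor symbol. The file name debug_demo.F and the symbol DEBUG are hypothetical, chosen only for this example. Because the name ends in .F, the file is written in fixed format and would be passed to cpp by default.

```fortran
! debug_demo.F (hypothetical file name). Fixed-format source:
! statements start in column 7, cpp directives start in column 1.
      PROGRAM debug_demo
      INTEGER :: i, total
      total = 0
      DO i = 1, 10
         total = total + i
#ifdef DEBUG
! The next statement is compiled only when DEBUG is defined.
         PRINT *, 'DEBUG: i =', i, ' total =', total
#endif
      END DO
      PRINT *, 'Sum of 1 through 10 is', total
      END PROGRAM debug_demo
```

Given the options in Table 2, a command line such as f90 -DDEBUG debug_demo.F would be expected to compile the diagnostic statement, while f90 debug_demo.F would leave it out. In both cases the final line reports a sum of 55.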
Front-end
The front-end is responsible for parsing the source code and issuing warning and error messages
when the parse fails. Command-line options enable you to control the front end’s assumptions
about the source code, including whether the source is in fixed or free format, uses implicit or
explicit typing, and contains extensions. Other front-end options control the level of error messages
and their language (Native Language Support), default data sizes, and search rules for .mod files.
For a list of the options that control the front end, see Table 3 (page 11).
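To make the idea of implicit versus explicit typing concrete, here is a minimal sketch of what implicit typing does when no declarations are present. The program and variable names are hypothetical and are not taken from this manual.

```fortran
! implicit_demo.f90 (hypothetical file name), free-format source.
PROGRAM implicit_demo
  ! There is no IMPLICIT NONE statement, so undeclared names take a
  ! type from their first letter: I through N are INTEGER, all other
  ! letters are REAL.
  total = 7        ! implicitly REAL
  n = 7            ! implicitly INTEGER
  PRINT *, 'total / 2 =', total / 2   ! real division
  PRINT *, 'n / 2     =', n / 2       ! integer division truncates
END PROGRAM implicit_demo
```

The first result is 3.5 and the second is 3, even though both variables were assigned the same literal. Compiling with the +implicit_none option (or adding an IMPLICIT NONE statement) would be expected to reject both names as undeclared, which is usually the safer choice for new code.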
Table 3 Options for controlling the front end
Option
    Function
+[no]autodbl
    Promote [do not promote] all integer, logical, and real items to 8 bytes, and all double-precision and complex items to 16 bytes. The default is +noautodbl.
    For information about using this option, see “Increasing default data sizes” (page 70).
+[no]autodbl4
    Promote [do not promote] all integer, logical, and real items to 8 bytes, and complex items to 16 bytes. The +autodbl4 option does not promote the size of double-precision and double-complex items. The default is +noautodbl4.
    For information about using this option, see “Increasing default data sizes” (page 70).
+check={all|none}
    Enable (+check=all) or disable (+check=none) compile-time range checking of array subscripts and uninitialized stack variables at runtime. The default is +check=none.
    For information about using this option, see “Segmentation violation exception” (page 79).
+[no]dlines
    Compile debug lines as source statements [comments]. Source lines must be in fixed format. The default, +nodlines, is to treat source lines with a D or d in column 1 as comments.
    For information on using this option, see “Using debugging lines” (page 80).
+[no]escape
    Treat the backslash character (\) as a C-like escape [literal] character. The default is +noescape.
    For information on using this option when porting, see “Escape sequences” (page 148).
+[no]extend_source
    Allow [do not allow] up to 254 characters on a single source line. The default, +noextend_source, is 72 characters for fixed format and 132 for free format.
    For information on using this option when porting, see “Source formats” (page 147).
+f2003
    Enable compiling of Fortran 2003 standard code.
+f2003allocassign
    Enable automatic allocation or reallocation when an assignment is made to an allocatable array, a feature that is part of the Fortran 2003 standard.
-Idirectory
    Add directory to the list of directories searched for files specified in INCLUDE lines and include directives, and for .mod files. The command line can have multiple instances of this option, each specifying a different directory.
    For information about using this option, see “Managing .mod files” (page 58).
+[no]implicit_none
    Cause the types of identifiers to be implicitly undefined [defined]. The default is implicit typing (+noimplicit_none).
    For information about using this option, see “Disabling implicit typing” (page 67).
+langlvl={90|default}
    Issue warnings for all extensions to the Fortran standard (+langlvl=90). The default, +langlvl=default, allows extensions.
    For information about using this option, see Chapter 11, “Porting to HP Fortran” (page 141).
+[no]list
    Write [suppress] a program listing to standard output during compilation. The default is +nolist.
+moddir=directory
    Write .mod files to directory. The default is to write .mod files to the current directory.
    For information about using this option, see “The HP Fortran compiler environment” (page 8).
+moduleoptimize
    Creates optimized module files by discarding redundant information. This reduces the compilation time and memory requirement of the compiler for nested or hierarchical modules.
+nls=lang
    Enable 16-bit Native Language Support (NLS) in strings and comments in the language specified by lang.
+[no]onetrip
    Execute any counter-controlled DO loop at least once (+onetrip). The default is +noonetrip.
    For information about using this option when porting, see “One-trip DO loops” (page 145).
+[no]ppu
    Postpend [do not postpend] underscores at the end of definitions of and references to externally visible symbols.
    For Itanium architecture, the default is +ppu. For PA-RISC 64-bit architecture, the default is +ppu. For PA-RISC 32-bit architecture, the default is +noppu.
    For information about using this option when porting, see “Names with appended underscores” (page 147).
+real_constant={single|double}
    Treat all single-precision real and complex constants as either single-precision (+real_constant=single) or double-precision (+real_constant=double). The default is +real_constant=single. This option has no effect on constants that are explicitly sized or when the +autodbl or +autodbl4 option is specified.
    For information about using this option, see “Increasing the precision of constants” (page 69).
+source={fixed|free|default}
    Accept source files in fixed format (+source=fixed) or free format (+source=free). The default, +source=default, is free for .f90 files and fixed for .f and .F source files.
    For information about using this option, see “Source formats” (page 147).
+[no]uppercase
    Use uppercase [lowercase] for all external names. The default is +nouppercase.
    For information about using this option, see “Case sensitivity” (page 114).
-w
    Suppress warning messages.
+w
    Warn about all questionable constructs and issue remarks about coding styles and performance. Without the +w option, the compiler issues warnings only about constructs that are almost certainly problems.
Back-end
The two main functions of the back-end are:
• To optimize your program for faster performance
• To generate the code that goes into the object file
Optimization is performed by two subcomponents of the compiler’s back end:
• The High-Level Optimizer (HLO), which performs large-scale, high-semantic-level analyses and transformations to increase performance.
• The low-level optimizer, which performs traditional optimizations (such as common subexpression elimination and dead-code removal) as well as machine-specific optimizations.
Options for controlling optimization form the largest group of the command-line options. These
options enable you to do the following:
• To set the level of optimization that is applied to your program
• To apply a package of optimizations that meet certain requirements of your application (for example, optimizations that favor compile-time speed over performance)
• To apply specific optimization technologies to your program, or to specific parts of your program, for fine-tuning performance
Table 4 (page 90) lists (in summary form) the options that control optimization.
NOTE: If you use the f90 command to compile and link on separate command lines, many of
the optimization options must appear on both the compile line and the link line; see “Performance
and optimization” (page 87). For information about using f90 to compile and link, see “Linking
with f90 vs. ld” (page 50).
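The traditional optimizations named in this section, common subexpression elimination and dead-code removal, can be pictured with a small hand-written sketch. The program below is hypothetical and only shows, at source level, the kind of rewrite an optimizer might perform internally; it is not a description of what the HP Fortran back end actually emits.

```fortran
PROGRAM cse_demo
  IMPLICIT NONE
  REAL :: a, b, c, x, y, unused
  a = 1.5
  b = 2.5
  c = 4.0

  ! As written: (a + b) is computed twice, and "unused" is never read.
  x = (a + b) * c
  y = (a + b) / c
  unused = a * b        ! dead code: the result is never used
  PRINT *, 'as written: ', x, y

  ! Hand-transformed equivalent of the same computation.
  CALL transformed(a, b, c, x, y)
  PRINT *, 'transformed:', x, y

CONTAINS

  SUBROUTINE transformed(a, b, c, x, y)
    REAL, INTENT(IN)  :: a, b, c
    REAL, INTENT(OUT) :: x, y
    REAL :: t
    t = a + b           ! common subexpression computed once
    x = t * c
    y = t / c           ! the dead assignment has been dropped
  END SUBROUTINE transformed

END PROGRAM cse_demo
```

Both PRINT statements report the same values (16.0 and 1.0), which is the point: these optimizations change how much work is done, not the result.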
Table 4 Options for controlling optimization
Option
    Function
+check=uninit
    Check at runtime for use of stack variables before they are defined.
+DC7200
    Perform memory hierarchy optimizations for the PA7200 processor.
-O[optlevel]
    Optimize program, where optlevel is 0 (no optimization), 1, 2, 3, or 4 (the highest level). If optlevel is not specified, the program is optimized at level 2 (-O2).
+Ooptlevel
    This option has the same meaning as the -O[optlevel] option, except that optlevel must be specified. It is provided for compatibility with makefiles.
+O[no]info
    Provide [do not provide] feedback information about the optimization process. This option is most useful at optimization level 3 and higher. The default is +Onoinfo.
+O[no]optimization
    Enable [disable] optimization, a predefined string that indicates a category of optimizations (for example, those that do not increase code size) or a specific optimization technology (for example, inlining). See the HP Fortran Programmer’s Reference for the different values for optimization.
The other component of the back end is the code generator (CodeGen), which you can control by
using the command-line options in Table 5. These options allow you to specify (among other
things) that the output file include debugging or profiling information or that local variables be
saved in static memory.
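To show what it means for a local variable to be saved, the following sketch uses the standard SAVE attribute on a single variable. The program and names are hypothetical. The +save option listed in Table 5 applies this behavior to all local variables; the sketch spells it out explicitly for one of them.

```fortran
PROGRAM save_demo
  IMPLICIT NONE
  INTEGER :: k
  DO k = 1, 3
    CALL count_calls()
  END DO

CONTAINS

  SUBROUTINE count_calls()
    ! The SAVE attribute keeps ncalls alive between invocations, so
    ! the count accumulates instead of starting over on each call.
    INTEGER, SAVE :: ncalls = 0
    ncalls = ncalls + 1
    PRINT *, 'count_calls has been called', ncalls, 'time(s)'
  END SUBROUTINE count_calls

END PROGRAM save_demo
```

The three calls report counts of 1, 2, and 3. Older programs sometimes rely on this retention without declaring it, which is the situation the +save option is meant to accommodate when porting.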
Table 5 Options for controlling code generation
Option
    Function
+[no]asm
    Compile the named source files and leave [do not leave] the assembly language output in corresponding files whose names are suffixed with .s. The default is +noasm.
+DAmodel
    Generate code for a specific version of the PA-RISC architecture. model can be one of the following:
        • PA-RISC version number (1.1 or 2.0).
        • A model number (for example, 750 or 870).
        • One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000).
        • The word portable to generate code compatible across all PA-RISC 1.1 and 2.0 workstations and servers.
    For information about using this option, see “Compiling for different PA-RISC machines” (page 58).
+DSmodel
    Perform instruction scheduling appropriate for a specific implementation of the PA-RISC architecture. model can be one of the following:
        • PA-RISC version number (1.1 or 2.0).
        • A model number (for example, 750 or 870).
        • One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000).
    For information about using this option, see “Compiling for different PA-RISC machines” (page 58).
    On Itanium processor family architecture, the defined values for model are:
        • blended: tune for best performance on a combination of processors (i.e., Itanium or Itanium 2 processor). This is the Itanium architecture default.
        • itanium: tune for best performance on an Itanium processor.
        • itanium2: tune for best performance on an Itanium 2 processor.
        • native: tune for best performance on the processor on which the compiler is running.
-g
    Generate debugging information needed by the debugger. This option is compatible with optimization levels 0, 1, and 2. If you compile and link separately and specify -g on the compile line, you must also specify it on the link line.
    For information about using this option to prepare programs for the debugger, see “Using the HP WDB debugger” (page 76).
+[no]gprof
    Prepare [do not prepare] object files for profiling with gprof; see the gprof(1) manpage. The default is +nogprof. If you compile and link separately and specify +gprof on the compile line, you must also specify it on the link line.
    For information about using this option to profile programs with gprof, see “gprof” (page 89).
+k
    Generate code for programs that reference a very large number of shared data items. The linker will issue a diagnostic message in the rare cases when this option is needed.
    NOTE: This option is valid only on the PA-RISC systems.
+pic={short|long|no}
    Generate Position Independent Code (PIC) with short displacements (+pic=short) or long displacements (+pic=long) for use in shared libraries. The default is +pic=no.
    For information about using this option when creating shared libraries, see “Compiling with +pic” (page 59).
+[no]prof
    Prepare [do not prepare] object files for profiling with prof; see the prof(1) manpage. The default is +noprof. If you compile and link separately and specify +prof on the compile line, you must also specify it on the link line.
    For information about using this option to profile programs with prof, see “gprof” (page 89).
+[no]save
    Save [do not save] all local variables in all subprograms.
    For information about using this option when porting, see “Uninitialized variables” (page 144).
Linker
The linker (ld) builds an executable program from the object files produced by the back end and
the libraries. An important group of options for controlling the linker specify what libraries the
linker should search and where it should search for them. Other options control the type of
information that the linker should or should not include in its output file, such as symbol table
information used by the debugger or marks identifying the output file as shareable or demand
loadable. Table 6 lists and briefly describes options that control the linker.
NOTE: If you use the f90 command to compile and link on separate command lines and compile
with any of the options (except -c) listed in Table 6, you must specify the same options on the
link line as well.
Table 6 Options for controlling the Linker
Option
    Function
-c
    Suppress linking; produce object files only.
+[no]demand_load
    Mark [do not mark] the output file from the linker demand load. If you compile and link separately and specify +demand_load on the compile line, you must also specify it on the link line. The default is +nodemand_load.
    For information about using this option, see “Creating demand-loadable executables” (page 62).
+FPflags
    Specify how the runtime environment for trapping floating-point exceptions should be initialized at program startup. If you compile and link separately and specify +FP on the compile line, you must also specify it on the link line with the identical set of flags. The default is that all traps are disabled. See the ld(1) manpage for specific values for flags.
    For information about using this option, see “Floating-point exceptions” (page 78).
+[no]fp_exceptions
    Enable [disable] floating-point exceptions. Enabling floating-point exceptions also causes the running program to issue a procedure traceback for runtime errors. The default is +nofp_exceptions.
    For information about using this option, see “Floating-point exceptions” (page 78).
-Ldirectory
    Add directory to the front of the library search path. This option applies only to libraries specified by the -l option (see below). If you compile and link separately and specify -L on the compile line, you must also specify it on the link line.
    For information about using this option, see “Library search rules” (page 54).
-lx
    Link the library libx.a or libx.sl to the executable program. If you compile and link separately and specify -l on the compile line, you must also specify it on the link line.
    For information about using this option, see “Linking to nondefault libraries” (page 52) and the ld(1) manpage.
-o outfile
    Name the output file outfile instead of the default a.out. If linking is suppressed (-c), this option is used instead to name the object files.
+[no]shared
    Cause the output file from the linker to be marked shared [unshared]. If you compile and link separately and specify +shared on the compile line, you must also specify it on the link line. The default is +shared.
    For information about using this option, see “Creating shared executables” (page 62).
+[no]strip
    Strip [do not strip] symbol table information from the linker output. For more information, see the ld(1) and strip(1) manpages. This option is not compatible with -g. If you compile and link separately and specify +strip on the compile line, you must also specify it on the link line. The default is +nostrip.
    For information about using this option, see “Stripping debugging information” (page 76).
+[no]ttybuf
    Use buffered [unbuffered] output to the terminal. The default is +ttybuf.
+[no]U77
    Invoke [do not invoke] support for the libU77 library (BSD 3f). If you compile and link separately and specify +U77 on the compile line, you must also specify it on the link line. The default is +noU77.
    For information about the libU77 library, see “Additional HP Fortran libraries” (page 53) and the HP Fortran Programmer’s Reference.
-Wl,options
    Pass a comma-separated list of options to the linker.
    For information about options supported by the linker, see the ld(1) manpage.
Tools
The HP Fortran compiler environment includes a high-level language debugger and performance
analysis tools. The debugger is HP WDB, which includes a graphical user interface. To prepare
a program for debugging, you must compile it with the -g option. For information about this
option, see “Using the HP WDB debugger” (page 76).
The performance analysis tools include the standard UNIX utilities, prof and gprof. To use prof
and gprof, you must compile with the +prof and +gprof options, respectively. For more
information about all of the performance analysis tools, see “Using profilers” (page 87) and the CXperf(1), prof(1), gprof(1), and ttv(1) manpages.
HP-UX operating system
Although the HP-UX operating system does not appear in Figure 1 (page 9), it provides a variety
of resources for programs executing within HP-UX. For example, HP-UX captures the command line
you use to invoke an executable program, breaks it up into arguments, and makes them available
to your program.
HP-UX also has many callable system routines that provide low-level access to kernel-level resources.
For example, your program can call HP-UX file-processing routines as alternatives to Fortran I/O.
“Writing HP-UX applications” (page 107) discusses how HP Fortran programs can take advantage
of HP-UX resources. For a full description of HP-UX system routines, see the HP-UX Reference.
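As a minimal sketch of a program picking up the command-line arguments that the operating system makes available, the example below uses the Fortran 2003 intrinsics COMMAND_ARGUMENT_COUNT and GET_COMMAND_ARGUMENT. Whether a particular HP Fortran release accepts these intrinsics (possibly only with the +f2003 option) is an assumption here; the program name show_args is hypothetical.

```fortran
PROGRAM show_args
  IMPLICIT NONE
  INTEGER :: i, nargs, length, status
  CHARACTER(LEN=256) :: arg

  ! Number of arguments, not counting the command name itself.
  nargs = COMMAND_ARGUMENT_COUNT()
  PRINT *, 'Number of arguments:', nargs

  ! Argument 0 is the command name; 1 through nargs are the arguments.
  DO i = 0, nargs
    CALL GET_COMMAND_ARGUMENT(i, arg, length, status)
    IF (status > 0) THEN
      PRINT *, 'argument', i, 'could not be retrieved'
    ELSE
      ! A negative status means the value was truncated to fit arg.
      PRINT *, 'argument', i, ':', TRIM(arg)
    END IF
  END DO
END PROGRAM show_args
```

Running the resulting executable as a.out one two would be expected to report two arguments and list the command name followed by one and two.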
2 Compiling and linking
This chapter discusses how to compile and link HP Fortran programs and covers the following
topics:
• Compiling with the f90 command
• Linking HP Fortran programs
• Special-purpose compilations
• Using environment variables
Compiling with the f90 command
The default behavior of the f90 command is to compile source files listed on the command line
and, if the compilation is successful, to pass the resulting object files to the linker. If the link stage
is successful, the linker produces an executable program with the default name a.out.
Consider, for example, the program hello.f90:
Example 1 hello.f90
PROGRAM main
  CALL hello()
END PROGRAM main

SUBROUTINE hello()
  PRINT *, 'Hello, I must be going.'
END SUBROUTINE hello
When compiled with the command line:
$ f90 hello.f90
f90 produces two files, hello.o (object code) and a.out (the executable program).
If the command line contains only an object file, as in the following:
$ f90 hello.o
f90 passes the object file to the linker, which (if successful) produces the executable program a.out.
Here is a sample run of the executable program:
$ a.out
Hello, I must be going.
This section provides more detailed information about using the f90 command, including:
• Command-line syntax
• Command-line options
• Filenames recognized by f90
f90 command syntax
The syntax for using the f90 command is:
f90 [options] [files]
where options is a list of one or more command-line options and files is a list of one or more files
containing HP Fortran source code to be compiled or object code to be linked. Items in options
and files can be interspersed on the command line, separated by spaces. However, some options
are order-sensitive. For example, the -l option, which is used to specify a library for linking, must
follow the program file to which it will be linked.
For information about using the -l option, see “Linking to nondefault libraries” (page 52). For more
information about the f90 command line, see the HP Fortran Programmer’s Reference.
Command-line options
Command-line options enable you to override the default behavior of the f90 command. Some
options affect how files are compiled or linked; for example, the -O option requests optimization.
Other options may cause the f90 command to skip a process entirely; for example, the -c option
compiles without linking. And still others invoke processes other than the default ones; for example,
the +cpp=yes option causes the f90 command to send source files to the C preprocessor (cpp)
before compiling. (For information about using cpp, see “Using the C preprocessor” (page 60).)
Many options are of the form +[no]option, where +option enables the option’s functionality
and +nooption disables it. Other options have more than just an on or off state; these are of the
form +option=arg. You can cause f90 to list the values for arg on stderr by specifying just
the option name without an argument. For example, given the command line:
$ f90 +langlvl= prog.f90
f90 will issue the following message:
f90: The '+langlvl=' option requires
one of
s
Still other options take a name as an argument. For example, the -oname option specifies the
name you want to give to the output file. If you misspell an option on the f90 command line, the
driver looks for options that are similar to the one you entered and lists them as possible alternatives
on stderr. It meanwhile compiles the program without the option in question.
For detailed information about the syntax of all the options, see the HP Fortran Programmer’s
Reference. For a brief descriptive list of the options, use the command line:
$ f90 +usage
Commonly-used options
Table 7 identifies commonly-used command-line options for when you want to change the
compiler’s default behavior. For a full description of all HP Fortran command-line
options, see “Option descriptions” (page 21).
Table 7 Commonly-used options
Option         Function
-c             Compile without linking. Use this option to compile and
               link in separate steps.
-g             Prepare your program for debugging.
-Ldirectory    Specify where to look for libraries; applies only to
               succeeding -l options. Be careful about using this option
               if the LPATH environment variable is set.
-lx            Specify a library. Use this option to link in library routines.
-O             Optimize. Use this option to optimize your program at the
               default level of optimization.
+save          Give the SAVE attribute to local variables. Use this option
               when porting older Fortran programs that may contain
               uninitialized variables.
-ooutfile      Name the output file outfile. Use this option to name
               the executable or object file.
+usage         List all compile-line options currently supported by f90.
-v             Enable verbose mode. Use this option to get a report on
               the compilation process.
-Idirectory    Set include file and module search path.
-Wl,arg        Pass arg to the linker as an option.
Command-line options by category
Table 2-2 categorizes the f90 command-line options. For detailed information about each of the
options, see “Option descriptions” (page 21).
+asm, +dryrun, +langlvl=90, +list, -v,
+version, +what, and -w
+nls           Native language support
+cat, +DA, +DC, +DO, +DS, +fastallocatable,
+O, and -O +Oparallel_intrinsics
The following alphabetical list describes each of the command-line options recognized by HP
Fortran. The +usage option also lists and briefly describes all of the currently supported options.
+A
Set the alignment of data items within FORTRAN STRUCTUREs, COMMON blocks, and
EQUIVALENCE classes. Specifying +A without any arguments aligns data on 2-byte boundaries;
therefore, it is slightly different from the f77 +A.
+noalign64bitpointers
Disables the correct alignment of pointers in derived types when compiling for wide mode
(+DA2.0W).
NOTE: This option is only valid on the PA-RISC systems.
+allow_ivdep
Allows the IVDEP directive.
The IVDEP directive assists the compiler’s dependence analysis. It can only be applied to
iterative DO loops and is only honored when the +allow_ivdep compiler flag is used.
The IVDEP directive takes the following form:
C$DIR IVDEP
Rules and behavior:
The IVDEP directive is an assertion to the compiler’s optimizer about the order of memory
references inside a DO loop.
The IVDEP directive tells the compiler to begin dependence analysis by assuming all
dependences occur in the same forward direction as their appearance in the normal scalar
execution order. This contrasts with normal compiler behavior, which is for the dependence
analysis to make no initial assumptions about the direction of a dependence.
The IVDEP directive must precede the DO statement for each DO loop it affects. No source code
lines, other than the following.
The IVDEP directive is applied to a DO loop in which the user knows that dependences are
in lexical order. For example, if two memory references in the loop touch the same memory
location and one of them modifies the memory location, then the first reference to touch the
location has to be the one that appears earlier lexically in the program source code. This
assumes that the right-hand side of an assignment statement is earlier than the left-hand side.
The IVDEP directive informs the compiler that the program would behave correctly if the
statements were executed in certain orders other than the sequential execution order, such
as executing the first statement or block to completion of all iterations, then the next statement
or block for all iterations, and so forth. The optimizer can use this information, along with
whatever else it can prove about the dependences, to choose other execution orders.
Example 3 EXAMPLE:
In the following example, the IVDEP directive provides more information about the
dependences within the loop, which may enable loop transformations to occur:
C$DIR IVDEP
DO I = 1, N
A(INDARR(I)) = A(INDARR(I)) + B(I)
END DO
In this case, the scalar execution order follows:
1. Retrieve INDARR(I).
2. Use the result from Step 1 to retrieve A(INDARR(I)).
3. Retrieve B(I).
4. Add the results from Steps 2 and 3.
5. Store the results from Step 4 into the location indicated by A(INDARR(I)) from Step 1.
IVDEP directs the compiler to initially assume that when Steps 1 and 5 access a common
memory location, Step 1 always accesses the location first because Step 1 occurs earlier in
the execution sequence. This approach lets the compiler reorder instructions, as long as it
chooses an instruction schedule that maintains the relative order of the array references.
+allow_unaligned
Relaxes the natural data type rules for alignment.
+[no]asm
+asm compiles the named programs and leaves the assembler-language output in
corresponding files whose names have the extension. The assembler-language output produced
by this option is not supported as input to the assembler. The default is +noasm. The -S option
can be used to perform the same function as +asm.
+[no]autodbl
+autodbl increases the default size of integer, logical, and real items to 8 bytes; see Table
9. It also increases the default size of double precision and complex items to 16 bytes.
This option does not increase the size of the following:
• Items of character type
• Items declared with the BYTE statement
• Items declared with the DOUBLE COMPLEX statement
• Explicitly sized items
For example, the following are unaffected by +autodbl:
INTEGER(KIND=4)
INTEGER(4) J
REAL*8 D
3.1416_4, 113_4
Note, however, that constants specified with an exponent (for example, 4.0E0 and
2.3D0) are doubled.
Items promoted include constants, scalar variables, arrays, components of derived types,
and record fields. This option also promotes intrinsics as well as the results and arguments
of user-defined functions to the proper precision. Types specified in ON statements are also
promoted.
The entire program should be compiled with this option, not just selected files.
This option is useful when porting programs that depend on the increased precision of 8 and
16 bytes. If you want to promote only single-precision items, use the +autodbl4 option.
(REAL(KIND=16) arithmetic is slow.)
The default is +noautodbl.
Table 9 Data type sizes and +autodbl[4]
Sizes in bytes of intrinsic types
                 Integer, logical, and real   Double precision   Complex
Default sizes    4                            8                  8
+autodbl         8                            16                 16
+autodbl4        8                            8                  16
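One way to confirm which sizes are in effect for a particular build is to print the kind of each
default type. The following is a minimal sketch rather than an example from the HP documentation:
the program name show_default_kinds and its variable names are illustrative, and it assumes that
kind values correspond to sizes in bytes, as the KIND=4 and KIND=16 forms used in this section
suggest.

```fortran
! Report the kinds of the default intrinsic types so that builds made with
! and without +autodbl or +autodbl4 can be compared.
PROGRAM show_default_kinds
  IMPLICIT NONE
  INTEGER          :: i
  LOGICAL          :: l
  REAL             :: r
  DOUBLE PRECISION :: d
  COMPLEX          :: c

  ! KIND() is an inquiry function, so the variables need not be defined.
  ! Kind values are processor-dependent; this sketch assumes they equal the
  ! size in bytes of one numeric component, which the standard does not
  ! guarantee.
  PRINT *, 'default INTEGER kind:          ', KIND(i)
  PRINT *, 'default LOGICAL kind:          ', KIND(l)
  PRINT *, 'default REAL kind:             ', KIND(r)
  PRINT *, 'default DOUBLE PRECISION kind: ', KIND(d)
  PRINT *, 'default COMPLEX kind:          ', KIND(c)

  ! BIT_SIZE is standard and reports the integer width in bits directly.
  PRINT *, 'default INTEGER bits:          ', BIT_SIZE(i)
END PROGRAM show_default_kinds
```

Compiling this program once with no options and once with +autodbl or +autodbl4 should show the
differences summarized in Table 9. For COMPLEX, KIND reports the kind of each component, so the
storage size is twice that value.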
+[no]autodbl4
Like +autodbl, +autodbl4 increases the default size of integer, logical, and real items to 8 bytes,
and the default size of complex items to 16 bytes; see Table 9. Unlike +autodbl, it does not
increase the default size of double precision.
This option does not increase the size of the following:
INTEGER(KIND=4)
INTEGER(4) J
REAL*8 D
3.1416_4, 113_4
Note, however, that constants specified with an exponent (for example, 4.0E0 and 2.3D0) are
doubled.
Items promoted include constants, scalar variables, arrays, components of derived types, and
record fields. This option also promotes intrinsics as well as the results and arguments of user-defined
functions to the proper precision. Types specified in ON statements are also promoted.
The entire program should be compiled with this option, not just selected files. Use this option when
you want to promote only the single-precision items.
The default is +noautodbl4.
NOTE: The +autodbl4 option causes items declared as REAL, INTEGER, and DOUBLE
PRECISION all to have the same size. This violates the Fortran Standard.
-b
This option can be specified to invoke ld with -b to create a shared library.
-Boption
The -B options are recognized by HP Fortran to specify whether references to global symbols may
be resolved to symbols defined in the current translation unit, or whether they must be assumed to
be potentially resolved to symbols defined in another load module. The -B options are only supported
on HP-UX version 11i and later. See “Symbol binding options” (page 37).
-c
-c compiles the specified source files but does not link them. The compiler produces a relocatable
file (.o) for each file in the files list (these may include .f90, .f, .F, .i, .i90, and .s files). When
using -c and -o together, you may specify only one source file on the command line; the resulting
object file is renamed.
-C
Performs run-time error checking of array subscripts. This option is deprecated and replaced by
the +check=all option.
+charlit77
+charlit77 causes character literals to be placed in writable static storage. This allows character
strings passed as actual arguments to be modified by the called routine.
+check={all|none|uninit|bounds|bounds:array|bounds:all|bounds:none}
+check=all
Enables compile-time range checking for array subscripts. The +check=all option will also
cause an executing program to halt with a runtime error if any of the following is detected:
• Integer overflow
• Use of uninitialized stack variable
• Out-of-bounds subscripts
• Out-of-bounds substrings
+check=uninit
Enables run-time checks for use of uninitialized stack variables.
+check=bounds
Enables run-time checks for out-of-bounds references to array variables.
+check=bounds:array
Enables run-time checks for out-of-bounds references to array variables.
+check=bounds:all
Enables run-time checks for both +check=bounds:array and
+check=uninit.
-check_bounds
Performs run-time error checking; promotes compile-time bound violations
to errors.
+[no]cfc
Enable or disable Compaq/DEC F90 language mode.
+cpp={yes|no|default}
+cpp=yes tells the compiler to pass the source files specified on the command line to the C preprocessor
before passing them on to the compiler. This option does not apply to .i and .i90 files.
The default, +cpp=default, is to apply the C preprocessor to files that end in the .F extension but not to files
that end in .f or .f90.
Specifying +cpp=no tells the compiler not to invoke the C preprocessor for all files on the command line,
including those ending in .F.
If you want to keep the output from the C preprocessor, use the +cpp_keep option.
+[no]cpp_keep
+cpp_keep causes the compiler to retain the output from the C preprocessor. If the
source filename is file.f or file.F, the output filename is file.i; if the source
filename is file.f90, the output filename is file.i90. The compiler will accept source
files with the .i and .i90 extensions.
The default, +nocpp_keep, is to discard the output file.
Note that this option does not pass source files to the C preprocessor. To do that, you
must also specify the +cpp=yes option.
-D name[=def]
-D defines a symbol name (name) to the C preprocessor. If you do not provide a
definition (def) for the symbol name, name is defined as 1. This option applies only
to files that are passed to the C preprocessor.
+DAmodel
+DA generates object code for a particular version of the PA-RISC architecture. By
default, the compiler generates code for the machine model you are compiling on.
With this option, you can override the default, causing the compiler to generate code
for the machine on which the program will execute rather than for the machine on
which it is compiled.
model can be one of the following:
• A PA-RISC version number (1.1 or 2.0™)
• A model number (for example, 750 or 870)
• One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000)
• The word portable, which causes the compiler to generate code that is compatible
across all PA-RISC 1.1 and 2.0 systems
For best performance, use +DA with the model number or architecture of the machine
on which you plan to execute the program.
The +DA option also specifies the appropriate search path for HP-UX math libraries.
If your program calls mathematical functions, +DA2.0 links in the PA2.0 version of
the math library, while +DA1.1 links in the PA1.1 library version. (For more information
about using math libraries, see the HP-UX Floating-Point Guide.)
With +DA2.0W, memory addresses are 64-bit values. This allows common blocks
and dynamically allocated memory to exceed 32-bit address limits. This feature is
restricted by the available virtual memory on the system where the application is run.
NOTE: You must specify +DA2.0W to generate 64-bit code. At PA64, all data types
remain the same size as at PA32 except for pointers. Fortran pointers are hidden
from the user and cannot be directly manipulated.
+DA2.0W
Changes the default size of a Cray pointer. It becomes the larger of a default-sized
integer and a C pointer (8 bytes), rather than the 4-byte default.
+DC
Specifies cache values to the compiler. The syntax is +DC <a>:<b>:<c>,
where <a> is the total cache size in bytes, <b> is the cache line size in
bytes, and <c> is the cache associativity.
+DC7200
+DC7200 performs memory hierarchy optimizations for the PA7200 processor.
+DDdatamodel
Generate code using either the ILP32 or LP64 data model. Defined values for
datamodel are:
32    Use the ILP32 data model. The sizes of the int, long, and
      pointer data types are 32 bits.
64    Use the LP64 data model. The size of the int data type is
      32 bits, and the sizes of the long and pointer data types are
      64 bits. Defines LP64 to the preprocessor.
The default is +DD32.
+[no]demand_load
+demand_load causes the output file from the linker to be marked demand load.
When a process is marked demand load, its pages are brought into physical memory
only when they are accessed. The default, +nodemand_load, causes the output file
from the linker not to be marked demand load. The -q option performs the same
function as +demand_load, and the -Q option performs the same function as
+nodemand_load.
+dfname
Specify profile database file name for profile-based optimizations.
+df requires the specification of +P.
This option is a synonym for +Oprofile=use: filename.
This option is only valid on the Itanium® processor family architecture.
The default is flow.data if a name is not specified.
+[no]dlines
+dlines treats source lines with a “D” or “d” in column 1 as statements to be
compiled. The default, +nodlines, treats lines beginning with “D” or “d” in column
1 as comments.
The +dlines option must be used only with source files in fixed-format.
+DOosname
+DOosname sets the target operating system for the compiler, and is intended for
enabling optimizations that are not backward compatible.
On PA-RISC-based systems, the recognized values for osname are 11.0 and 11.0EP9806.
On Itanium®-based systems, the recognized values for osname are 11.20, 11.22, and
11.23.
+DOosname can be used at any level of optimization. The default value for osname
is the operating system version of the computer on which the compiler is invoked.
+dryrun
The +dryrun option generates sub-process information for a given f90 command line
without running the sub-processes.
It is useful in the development process to obtain command lines of compiler
sub-processes to run the commands manually or to use them with other tools.
+DSmodel
+DS specifies an instruction scheduling algorithm for a particular implementation of
the PA-RISC architecture, as specified by model.
• A PA-RISC version number (1.1 or 2.0)
• A model number (for example, 750 or 870)
• One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000)
• Itanium® values as defined below.
For example, specifying +DS750 performs instruction scheduling tuned for one
implementation of PA-RISC 1.1. Specifying +DS 2.0 or +DS 1.1 performs scheduling
for a representative PA-RISC 2.0™ or 1.1 system, respectively. To improve performance
on a particular model, use +DS with that model number.
Object code with scheduling tuned for a particular model will execute on other systems,
although possibly less efficiently.
If you do not use this option, the compiler uses the argument specified with the
+DA option. If you use neither +DS nor +DA, the default instruction scheduling is for the
system on which you are compiling.
On Itanium®-based architectures, the defined values for model are:
blended     Tune for best performance on a combination of
            processors (i.e., Itanium or Itanium® 2 processor).
itanium     Tune for best performance on an Itanium®
            processor.
itanium2    Tune for best performance on an Itanium® 2
            processor.
native      Tune for best performance on the processor on
            which the compiler is running.
-dynamic
-dynamic is used to generate dynamically-bound executables.
+[no]escape
+escape treats the backslash character (\) as a C-like escape character. The default,
+noescape, treats the backslash character as a literal character.
+[no]es
+[no]es is similar to +[no]extend_source except that character literals and
hollerith constants continued across a line boundary are not padded. This option
provides compatibility with FORTRAN 77’s +es option.
-exec
Indicates that any object files created will be used to create an executable file.
Constants with a protected or hidden export class are placed in the read-only data
section. This option also implies -Bprotected_def.
+[no]extend_source
+extend_source allows extended source lines, which may contain up to 254
characters. The default, +noextend_source, restricts fixed-format source lines to
72 characters and free-format source lines to 132 characters.
Programs that depend on the compiler’s ignoring characters past column 72 will not
compile correctly with the +extend_source option.
+externals=file
Specifies a file that contains a list of procedure names to be considered external as
opposed to intrinsic. Using this option causes the routines whose names appear in
file to be treated as if they were declared with the EXTERNAL attribute.
+f2003
+f2003 enables Fortran 2003 features supported by the compiler.
+fastallocatable
+fastallocatable enables a different representation for allocatable arrays in the
object code produced by the compiler. This alternate representation avoids problems
in optimizing code containing allocatable array references. Additionally, this alternate
representation for allocatable arrays is binary compatible with the old representation.
+fltconst_strict
Uses a strict power table for converting floating constants with exponents. Use of strict
power table provides a more accurate floating point representation.
+FPflags
+FP initializes the flags that specify how runtime floating-point exceptions should
be trapped; uppercase flags enable traps, lowercase flags disable traps. flags
can be concatenated to specify a desired behavior and may not contain spaces or
tabs. Valid values for flags are identified in Table 10.
By default, all traps are disabled. However, you can specifically disable a behavior
either by excluding the upper-case letter from flags or by including the equivalent
lower-case letter (v,z,o,u,i,d) in flags. For example, the following command
lines are equivalent:
$ f90 +FPvZI test.f90
$ f90 +FPZI test.f90
If you are using PA1.1 libraries, you can dynamically change these settings at run
time by using the fpsetdefaults or fpsetmask routines. For more information
about these routines, see the fpgetround(3M) man page and the HP-UX Floating-Point Guide.
Enabling sudden underflow may cause the same program to compute different results
on different implementations of the PA-RISC 1.1 and 2.0 architectures. This is because
some hardware implementations have sudden underflow available, while others do
not. The +FPD option enables the hardware to flush denormalized values to zero, but
it does not require that it do so.
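The effect of the trap flags can be observed with a small test program. The sketch below is
illustrative only; the program name fp_trap_demo and its variable names are not from the HP
documentation. It divides by a zero that is read at run time, so the division is not a
constant expression in the source.

```fortran
! Divide by a run-time zero to exercise the divide-by-zero exception.
PROGRAM fp_trap_demo
  IMPLICIT NONE
  CHARACTER(LEN=8) :: text
  REAL :: zero, quotient

  ! Obtain the divisor through an internal read rather than a literal.
  text = '0.0'
  READ (text, *) zero

  ! This division raises the IEEE divide-by-zero exception. Whether the
  ! program traps or continues depends on how the trap flags were set.
  quotient = 1.0 / zero
  PRINT *, 'quotient = ', quotient
END PROGRAM fp_trap_demo
```

Built with the default settings, where all traps are disabled, the program is expected to print
an infinity and finish normally. Built with +FPZ, the division should trap instead. The exact
text printed for an infinity depends on the run-time library.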
Table 10 Values for the +FP option
Value  Meaning
V      Trap on invalid floating-point operations. Examples of invalid floating-point operations include the following:
       • Arithmetic operation on NaNs
       • Operations such as (+inf) + (-inf) and (+inf) - (+inf)
       • Multiplication of 0 and infinity
       • Division operations 0/0 and inf/inf
       • Certain floating-point remainder operations
       • Square root of a negative value
       • Certain kinds of comparisons of unordered values
Z      Trap on floating-point divide by zero.
O      Trap on floating-point overflow.
U      Trap on floating-point underflow.
I      Trap on floating-point operations that produce inexact results. Inexact result traps may occur whenever
       roundoff is necessary to produce the result. For example, the fraction 1.0/3.0 produces an inexact trap
       because there is no exact floating-point representation for this fraction.
D      Enable sudden underflow (flush to zero) of denormalized values on those PA-RISC systems greater than version
       1.0 that have implemented sudden underflow. (That is, +FPD enables sudden underflow only if it is available
       on the processor that is used at run time.) Denormalized values are those values whose absolute value is very
       close to zero. For IEEE single precision data types, the largest denormalized value is approximately equal
       to 2^-126. For IEEE double precision data types, such values are approximately equal to 2^-1022. Sudden
       underflow will cause some floating-point applications to run faster, with a possible loss of numerical accuracy
       on numbers very close to zero.
+[no]fp_exception
+fp_exception is a compile time flag. It causes a descriptive message and a procedure
traceback to be issued to standard error when the HP-UX signals listed in Table 11 are generated.
For a description of these signals, see signal(2) and signal(5) in the HP-UX Reference. For
information about floating-point exceptions and error handling, see the HP-UX Floating-Point Guide.
You can also use the ON statement to write your own trap procedures. For information about
the syntax of the ON statement, see “Using the ON statement” (page 81).
The default, +nofp_exception, disables traceback information.
Table 11 Signals recognized by the +fp_exception option
Signal     Meaning
SIGILL     Illegal instruction
SIGFPE     Floating-point exception
SIGBUS     Bus error instruction
SIGSEGV    Segmentation violation
SIGSYS     Bad argument to system call
-g
-g causes the compiler to generate information for use by the HP WDB debugger. The -g option
can be used to prepare code for debugging that has been compiled with optimization
options -O, -O1/+O1, , but not O3/+O3 or higher.
-G
The -G option performs the same function as +gprof.
+getarg0, +getarg1
+getarg0 and +getarg1 control the behavior of the getarg intrinsic subroutine. +getarg0
requests the industry standard behavior for getarg, where an index value of zero causes the
program name to be returned. HP’s FORTRAN 77 getarg intrinsic also implements this industry
standard convention. +getarg1 is used to request non-standard behavior, where an index value
of one causes the program name to be returned (older releases of HP Fortran behaved in this
manner). The default is +getarg0.
gformat77
gformat77 requests the FORTRAN 77 style of formatting a value of zero with the G edit
descriptor. Fortran 90 uses an F edit descriptor when the value being written is zero, while
FORTRAN 77 uses an E edit descriptor.
+[no]gprof
+gprof prepares object code files for profiling with gprof. The default is +nogprof. gprof is
provided as part of the “HP-UX General Programming Tools” product; see gprof(1). The
-G option can be used to perform the same function as +gprof.
+hugecommon
+hugecommon instructs the compiler to place the specified COMMON block into a huge data
segment. The format for this option is: +hugecommon=name
where name is the name of a COMMON block. By default, only COMMON blocks larger than
2 gigabytes are placed into huge data segments.
For example:
Example 4 % f90 +hugecommon=results pcvals.f90
places the COMMON block named results into a huge data segment. +hugecommon is especially
useful when a program contains several different COMMON blocks that together occupy more
than two gigabytes but individually occupy less than two gigabytes. In this situation, the largest
COMMON blocks could be placed in a huge data segment when the program is compiled by
specifying their names in multiple +hugecommon options.
If a common block is specified as huge in one object file, it must be specified huge in all object
files. If it is not, the program will fail to link.
NOTE: PA2.0W objects cannot be combined with 32-bit object files. 64-bit applications will
only execute on PA8000-based systems.
+hugesize
+hugesize instructs the compiler to place COMMON blocks that are
larger than the specified size into a huge data segment. The format for
this option is: +hugesize=n
where n is the size in kilobytes (1024 bytes).
The default is to place COMMON blocks larger than two gigabytes
(2147483648 bytes) into huge data segments; that is,
+hugesize=2097152 is the default.
For example: % f90 +hugesize=1024 hello.f90
specifies that COMMON blocks larger than 1048576 bytes (1
megabyte) should be placed into a huge data segment.
If a common block is specified as huge in one object file, it must be
specified huge in all object files. If it is not, the program will fail to link.
PA2.0W objects cannot be combined with 32-bit object files. 64-bit
applications will only execute on PA8000-based systems.
NOTE: +hugesize and +hugecommon options are not valid on HP
Itanium®-based systems.
+i2
+i2 changes 4-byte integer and logical constants, intrinsics, and user
variables to 2-byte integers (rather than the 4-byte default).
+i8
+i8 changes 4-byte integer and logical constants, intrinsics, and user
variables to 8-byte integers (rather than the 4-byte default).
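Because +i2 and +i8 change the range of default integers, a calculation that fits at one
setting can overflow at another. The following sketch is an illustration, not an HP-supplied
example; the names default_integer_range, wide, kbytes, and nbytes are hypothetical. It
reports the largest default integer and checks whether the +hugesize default of 2097152
kilobytes, expressed in bytes, can be held in a default integer.

```fortran
! Check whether a byte count fits in the default INTEGER of this build.
PROGRAM default_integer_range
  IMPLICIT NONE
  ! wide: an integer kind with at least 18 decimal digits, requested
  ! explicitly so that it does not depend on the default integer size.
  INTEGER, PARAMETER :: wide = SELECTED_INT_KIND(18)
  INTEGER(KIND=wide), PARAMETER :: kbytes = 2097152_wide
  INTEGER(KIND=wide) :: nbytes

  ! Convert kilobytes to bytes in the wide kind to avoid overflow.
  nbytes = kbytes * 1024_wide

  PRINT *, 'largest default INTEGER: ', HUGE(0)
  PRINT *, 'bytes required:          ', nbytes

  ! Compare in the wide kind so the test itself cannot overflow.
  IF (nbytes > INT(HUGE(0), KIND=wide)) THEN
     PRINT *, 'value does not fit in a default INTEGER'
  ELSE
     PRINT *, 'value fits in a default INTEGER'
  END IF
END PROGRAM default_integer_range
```

With the 4-byte default, the byte count of 2147483648 is one more than the largest default
integer, so the program should report that the value does not fit. A build that makes default
integers 8 bytes wide should report that it fits.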
+ild
Specifies incremental linking. This option is valid only for 64-bit links.
If the output file does not exist, or if it was created without the +ild option,
the linker performs an initial incremental link. The output file produced
is suitable for subsequent incremental links. The incremental link option
is valid for both executable and shared library links. It is not valid for
relocatable links, options (or tools) that strip the output module, and
certain optimization options. See ld(1) for more information.
+ildrelink
Perform an initial incremental link, regardless of the output load module.
This option is valid only for 64-bit links.
In certain situations during incremental linking (for example, internal
padding space is exhausted), the incremental linker is forced to perform
an initial incremental link. The +ildrelink option allows you to avoid
such unexpected initial incremental links by periodically rebuilding the
output file.
-Idirectory
-I specifies a directory where .mod files and files named in the INCLUDE
line or in #include directives may be found if their name is a relative
pathname (that is, does not begin with a slash (/)). Directories are
searched in the following order:
• The current source directory (that is, the directory containing the file
with the INCLUDE line or #include)
• Directories specified by the -I option, in the order specified
• The current working directory
• The /usr/include directory
+[no]implicit_none
+implicit_none forces the types of identifiers to be implicitly undefined.
This is equivalent to specifying IMPLICIT NONE for each program unit
in each file in the files list. The source code that is to be compiled with
this option may contain other IMPLICIT statements; the statements will
be honored. The default, +noimplicit_none, allows identifiers to be
implicitly defined.
+indirectcommonlist=file
The common blocks listed in file (one per line, no enclosing slashes) are
treated as shared common blocks, but are not attached. The user must
attach or otherwise allocate storage for such common blocks before they
are referenced.
A C language program would typically be used to either attach a shared
memory segment, or malloc a block of memory, and store that address
into the external symbol for the common block. All Fortran code that
references such a common block will indirect through the address in the
external symbol for that indirect common block.
All source files that reference variables in such a common block must be
compiled with the +indirectcommonlist flag, and that common
block name must appear in the named file.
+initheap_complex=rval:ival
Default initialization of heap variables (mentioned in the +initheap_set option)
of type complex. rval is the real part and ival is the imaginary part of the complex
number. Valid values for rval and ival are normal real numbers, nan, and
snan.
The numbers represented by rval and ival are listed in the table below.
Table 12
Val     Num
nan     Quiet NaN
snan    Signaling NaN
snan is supported at default optimization level only.
+initheap_integer=ival
Default initialization of heap variables (mentioned in the +initheap_set option)
of type integer. Valid values of ival are integer values of the form decimal,
octal (starting with 0) and hexadecimal (starting with 0x).
+initheap_real=rval
Default initialization of heap variables (mentioned in the +initheap_set option)
of type real. Valid values for rval are normal real numbers, nan, and snan.
Table 13
Val     Num
nan     Quiet NaN
snan    Signaling NaN
snan is supported at default optimization level only.
+initheap_set=var
Default initialization of heap variables is done for the variables specified
by var.
The defined values for var are:
Table 14
all            all heap variables
allocatable    variables having allocatable attribute
common         variables in common block
module         variables in module
none           none of the heap variables
save           variables having save attribute
A combination of different types of heap variables can be specified by
colon-separated values, like +initheap_set=common:module:save
The default is +initheap_set=none
+io77
Suppresses the generation of the optional leading “0” before the decimal
point for real numbers printed with the E and F edit descriptors. Fortran
77 suppressed these leading zeros unless the NOSTANDARDIO flag was
used.
-ipo
This option is only valid for Itanium processor family architectures.
Enables interprocedural optimizations across files. Object files produced
using this option contain intermediate code in IELF format. At link time,
ld automatically invokes the interprocedural optimizer u2comp if any
of the input object files is an IELF file. This option is ignored at
optimization levels +O1 and +O2. It is enabled by default when +O4
or +Ofaster are used.
+k
+k generates code for programs that reference a very large number of
shared data items. The linker will issue a diagnostic message in the rare
case when this option is needed. By default, the compiler generates
short-displacement code sequences for programs that reference global
data in shared libraries. For nearly all programs, this is sufficient.
NOTE: The +k option is not valid on HP Itanium®-based systems.
+nocheckuf
+nocheckuf disables the OPEN statement error check for opening text files with
ACCESS="sequential", FORM="unformatted". This option is useful only when
BUFFERIN/BUFFEROUT statements will be used to access the opened unit. The main program
must be compiled with this option for it to have any effect, and all OPEN statements will then
skip this error check.
+nolibs
Disables the passing of libraries to the linker. This is useful when symbols are resolved at runtime
through dynamically loaded libraries.
+nopadsharedcommon
Do not pad shared common blocks to a multiple of 8 bytes. This option is useful when sharing
shared common blocks between f77 programs and f90-generated programs. All source files
referencing the same shared common block must be compiled with the same setting of this flag.
-O[n]
-O invokes the optimizer, where n is the level of optimization, 0 - 4. The default is optimization
level 2.
+On
+O invokes the optimizer, where n is the level of optimization, 0 - 4. The -g option is compatible
with the +O0, +O1, and +O2 options.
Table 15 lists and describes the different levels of optimization.
NOTE: See the Parallel Programming Guide for HP-UX Systems for a detailed description of
optimization levels and methods.
+O[no]optimization
+O[no] options enable or disable specific optimizations or classes of optimizations
(for example, optimizations that affect compilation time). For detailed information
about +O[no]optimization, see “Using optimization options” (page 38).
Table 15 Levels of optimization
Level  Optimizations
0      Local optimizations, including constant folding and partial evaluation of test conditions.
1      Peephole optimizations, including:
       • Basic block optimizations
       • Branch optimizations
       • Instruction scheduling
2      Optimizations performed at level 1, plus the following:
       • Coloring register allocation
       • Induction variables and strength reduction
       • Common subexpression elimination
       • Loop invariant code motion
       • Store/copy optimization
       • Unused definition elimination
       • Dataflow analysis
       • Software pipelining
       • Scalar replacement
       • Sum reduction optimization
3      Optimizations performed at levels 1 and 2, plus the following:
       • Interprocedural optimizations, including cloning and inlining
       • Loop transformations to improve memory performance, including fusion and interchange
4      Level 4 optimizations are not currently supported by the PA-RISC compiler. If +O4 is specified, the
       compiler will issue a warning message and compile at optimization level 3.
       For Itanium® applications, performs level 3 optimizations as well as interprocedural optimizations
       across translation units (link time optimizations). Object files generated at this level contain an
       intermediate representation of the user code and are intended to be temporary files. These
       intermediate object files are not guaranteed to be compatible from one version of the compiler to
       the next. This requires concurrent use of the +Oprofile=use option.
Decrease the optimization to optlevel for one or more specified routines. This option can
occur at optimization levels 1, 2, 3, or 4. The option cannot raise the optimization level above
what was specified in the +Ooptlevel option. This option can occur multiple times, possibly
with different values of optlevel.
-ooutfile
-o names the executable file outfile rather than the default name of a.out. If not specified,
a.out will be overwritten if it exists, or created if it does not. The outfile name must not end
with .f, .f90, .F, .i, or .i90. Also, it must not begin with + or -. When using -c and -o together,
you may specify only one source file on the command line; the resulting object file is renamed.
+[no]objdebug
This option, when used with any of the -g options, will cause the debug information to be left in
the object files instead of being placed in the a.out. This will bypass the pxdb step and the
links will be faster. This is available only for wdb(1) or gdb(1). +objdebug is the default at
compile time. +noobjdebug can be used at link time to continue to place the debug info into
the a.out, even if some objects were compiled with +objdebug.
NOTE: Object files and archive libraries must be accessible to the debugger if the +objdebug
method of storing debug information is used.
+[no]onetrip
-p
+P
+onetrip generates code that executes any DO loop at least once. In accordance with the
language standard, HP Fortran will not execute a DO loop if either of the following conditions
is true:
• The increment value is greater than zero, and the initial value is greater than the limit.
• The increment value is less than zero, and the initial value is less than the limit.
However, older implementations of Fortran (for example, some FORTRAN 66 processors) always
execute a DO loop at least once. The +onetrip option provides compatibility with those
nonstandard implementations.
The default is +noonetrip.
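The two conditions above follow from the standard iteration-count formula for a DO loop. The sketch below is only an illustration: the module and function names are hypothetical, and trip_count_onetrip is an assumed model of the +onetrip behavior (never fewer than one iteration), not the compiler's actual implementation.

```fortran
module do_trip_count
  implicit none
contains
  ! Iteration count of "DO i = first, last, step" under the standard rule:
  ! MAX(0, (last - first + step) / step), using integer division.
  function trip_count(first, last, step) result(n)
    integer, intent(in) :: first, last, step   ! step must be nonzero
    integer :: n
    n = max(0, (last - first + step) / step)
  end function trip_count

  ! Assumed model of the count when +onetrip is in effect:
  ! the loop body always runs at least once.
  function trip_count_onetrip(first, last, step) result(n)
    integer, intent(in) :: first, last, step
    integer :: n
    n = max(1, trip_count(first, last, step))
  end function trip_count_onetrip
end module do_trip_count
```

For example, trip_count(5, 1, 1) is 0 because the increment is positive and the initial value exceeds the limit, whereas trip_count_onetrip(5, 1, 1) is 1.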
-p generates code for prof(1) profiling. This has the same functionality as the +prof option. The
default is +noprof.
+P optimizes the application based on profile data found in the database file flow.data, produced
by compilation with +I. See ld(1), +I, and +df for more details. The +P option is incompatible
with the +I and -S options. It is incompatible with the -g option only during compile time.
NOTE: This option is only valid on Itanium®-based systems.
+pa1
+pa1 compiles an application for routine-level profiling.
NOTE: +pa is ignored when the HP Fortran compiler generates position-independent
code (PIC). The following options cause +pa to be ignored: +pic=short, +pic=long,
+z and +Z.
+[no]parallel_intrinsics
Link with the parallel version of some intrinsics in the libF90_parallel library. The default
is +Onoparallel_intrinsics.
+pic={short|long|no}
+pic generates object code that can be added to a shared library. Object code
generated with this option is position-independent code (PIC). All addresses are either
pc-relative or indirect references.
The argument (short or long) specifies the allocated size of the data linkage
table. Normally you would specify +pic=short to generate PIC. Use +pic=long
when the linker issues an error message indicating data linkage table overflow.
Specifying +pic=long causes the compiler to allocate additional space for more
imported symbols.
+[no]ppu
+ppu appends underscores to external names, including subroutines, functions, and
common blocks (for example, int_sum_ rather than the default int_sum).
For Itanium® and PA-RISC 64-bit architectures, the default is +ppu. For PA-RISC 32-bit
architecture, the default is +noppu.
NOTE: Mixed-language programs are affected by the +ppu option. C language
references to Fortran routines and COMMON blocks require a trailing underscore
when the Fortran code is compiled with +ppu. +noppu may be used in 64-bit mode
to avoid trailing underscores.
+pre_include=file
+pre_include causes the compiler to prepend the code in file before any
compilation occurs. This option can appear more than once on the same command
line, each time specifying a different file.
+[no]prof
+prof prepares object files for profiling with prof. The default is +noprof. The -p
option can be used to perform the same function as +prof. prof is provided as part
of the “HP-UX General Programming Tools” product (see prof(1)).
-q
Mark output from the linker demand load. This has the same functionality as the
+demand_load option. The default is +nodemand_load.
-Q
Mark output from the linker no demand load. This has the same functionality as the
+nodemand_load option (the default).
+r8
+r8 changes 4-byte real constants, intrinsics, and user variables to 8-byte reals (rather
than the 4-byte default).
-R4
Set the precision of real and complex constants to single-precision. This has the same
functionality as the +real_constant=single option (the default).
-R8
Set the precision of real and complex constants to double-precision. This has the same
functionality as the +real_constant=double option. The default is
+real_constant=single.
+real_constant={single|double}
-s
-S
+[no]save
+[no]shared
+real_constant=single treats all single-precision numerical constants as
single-precision, and the +real_constant=double option treats all single-precision
numerical constants as double-precision. The default is +real_constant=single.
The -R4 and -R8 options can be used to perform the same function.
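The effect of constant precision can be seen when a single-precision constant is assigned to a double-precision variable. The following sketch is illustrative only; the module name, the function name, and the choice of 0.1 as the test value are hypothetical.

```fortran
module real_constant_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Returns the difference between the constant 0.1 written in single
  ! precision (then widened on assignment) and written in double precision.
  function widening_error() result(err)
    real(dp) :: err
    real(dp) :: from_single, from_double
    from_single = 0.1      ! single-precision constant under the default setting
    from_double = 0.1d0    ! explicitly double-precision constant
    err = abs(from_single - from_double)
  end function widening_error
end module real_constant_demo
```

With the default +real_constant=single, the result is a small nonzero value, because 0.1 is rounded to single precision before it is widened. Based on the description above, compiling with +real_constant=double would be expected to make both constants identical.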
-s strips symbol table information from linker output. This has the same functionality as the
+strip option. The default is +nostrip.
-S generates assembly listings for the specified source files. This has the same functionality
as the +asm option. The default is +noasm.
+save forces static storage for all local variables. This option provides a convenient
path for porting older Fortran programs that may depend on static allocation of memory.
(Variables in static storage retain their values between invocations of the program units
in which they are declared). The +save option causes all uninitialized variables to be
initialized to zero. The default is +nosave.
If you explicitly declare a variable with the AUTOMATIC attribute, the attribute overrides
the +save option.
The +save command-line option inhibits many of the optimizations performed by the
compiler. Generally, you will get better performance with the +Oinitcheck option,
which also sets uninitialized variables to zero but is more selective than +save; see
“Using optimization options” (page 38).
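As an illustration of what static storage means for a single variable, the sketch below uses the standard SAVE attribute explicitly instead of relying on +save. The module and function names are hypothetical.

```fortran
module save_demo
  implicit none
contains
  ! Counts its own invocations. The explicit SAVE attribute keeps "calls"
  ! in static storage, so its value survives from one call to the next.
  function next_call_number() result(n)
    integer :: n
    integer, save :: calls = 0
    calls = calls + 1
    n = calls
  end function next_call_number
end module save_demo
```

Successive calls return 1, 2, 3, and so on. Declaring SAVE only on the variables that need it is one way to port such code without applying +save to every local variable.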
+noshared causes the output file from the linker to be marked unshared. The default,
+shared, is to mark the output file as shared.
The -n option performs the same function as +shared, and the -N option performs
the same function as +noshared.
+[no]sharedlibF90
+sharedlibF90 allows users to link the shared version of libF90 or libF90_parallel
from /usr/lib. This resolves potential issues with the Fortran 90 driver trying to link
with the shared versions of libF90. The default is +nosharedlibF90.
+[no]sharedlibIO77
Link with the shared [archive] version of the libIO77 library. The default is to link with
the archive version of libIO77. This option is only supported on the Itanium® platform.
+[no]sharedlibU77
Link with the shared [archive] version of the libU77 library, when combined with +U77.
The default is to link with the archive version of libU77. This option is only supported
on the Itanium® platform.
+[no]signedzero
+[no]signedzero enables [disables] signed-zero support. This option forces a floating point
value of negative zero that appears as a formatted output list item to be represented
in the output record with a leading “-”. This option also changes the behavior of the
SIGN intrinsic. The default is +signedzero.
+source={fixed|free|default}
+source tells the compiler that source files are in either fixed or free form. The default
(+source=default) is free form for .f90 source files and fixed form for .f and
.F source files.
+[no]srcpos
Control the generation of source position information for the Caliper analysis tool.
When +srcpos is in effect (the default), the compiler generates this information.
When +nosrcpos is in effect, the compiler does not generate this information, and
the compiler will instruct the linker to discard any of this information encountered in
object files. This option is only supported on Itanium® platforms.
+[no]strip
+strip causes the linker to strip symbol table information from the executable
program. This option is incompatible with the -g option. The default is +nostrip.
The -s option can be used to perform the same function as +strip.
-tx,path
-t looks in path for the subprocess identified by x and substitutes it for the default
subprocess. x can be one or more identifiers indicating the subprocesses.
This option works in two modes:
• If x is a single identifier and path ends with a slash (/), path represents the
directory with the new subprocess, and the name of the subprocess is the standard
name. If path ends in a filename, it is the name of the subprocess.
• If x is a set of identifiers, path is a directory that holds the subprocesses identified in
x. The subprocesses in path have their standard names.
Table 16 lists the identifiers for x, the subprocesses each indicates, and the standard
subprocess name.
The following example of the -t option tells the compiler to pass the source files to the
K&R version of the C preprocessor for preprocessing:
-tp,/usr/ccs/lbin/cpp
Table 16
Value  Subprocess      Standard name
a      Assembler       as
c      Compiler        f90com
e      Debug file      end.o
l      Linker          ld
p      C preprocessor  cpp
       Start-up files  crt0.o, gcrt0.o, mcrt0.o
+trigdacc
The compiler will use more accurate cosd, sind, and tand intrinsics than the standard
ones. While the standard ones have some accuracy problems with the boundary cases
(for example, cosd(90) needs to be 0.0), these new intrinsics get these cases right.
NOTE: The +trigdacc option is not valid on HP Itanium®-based systems.
+[no]ttybuf
+ttybuf controls tty buffering, using buffered output. +nottybuf uses unbuffered output.
The default is buffered output (+ttybuf). The +ttybuf option forces buffered output even
on systems whose default is unbuffered output.
The +[no]ttybuf option is recognized only when the main program is a Fortran program.
If the main program is written in another language, use the TTYUNBUF environment
variable (see f90(1)).
-Uname
-U undefines or removes any initial definition of name in the C preprocessor (cpp). See
cpp(1) in the HP-UX Reference for details.
+[no]U77
The +U77 option invokes support for the BSD 3F library, libU77, which provides an HP
Fortran interface to some of the libc system routines. To call routines in this library,
you must compile and link with +U77. For information about these routines, see the
HP Fortran Programmer’s Reference. of an unsatisfied symbol. If the libU77 name
is the same as a libc name, the name might resolve to a libc name. This situation does
not cause an error at compile time, but can produce unpredictable results.
+[no]uppercase
+uppercase uses uppercase for external names. The default, +nouppercase, is to convert
external names to lowercase.
If you need to control the case of specific names, use the $HP$ ALIAS directive, as
described in “$HP$ ALIAS” (page 124).
+usage
+usage lists and briefly describes all of the command-line options currently supported
by the HP Fortran compiler. No compile occurs.
-v
-v enables the verbose mode, producing a step-by-step description of the compilation
process on the standard error output.
+version
+version displays compiler version information only; no compilation occurs.
-w
-w suppresses warning messages. If this option is omitted, warnings are sent to
standard error.
+what
+what prints the what string for the Fortran 90 driver, providing version and patch
numbers.
-Wx,arg1,arg2,...,argN
-W causes arg1 through argN to be handed off to subprocess x. Each arg takes the form:
-option[,value]
where option is the name of an option recognized by the subprocess and value is a
separate argument to option, where necessary. The values that x can assume are listed
in Table 17.
For example, the following option tells the linker to print a trace of each input file as
ld processes it:
-Wl,-t
The next example passes the -a shared option to the linker, causing it to select shared
libraries for linking.
-Wl,-a,shared
Table 17
Value  Meaning
a      Assembler
c      Compiler
l      Linker
p      C preprocessor
-Y=lang
Enable Native Language Support in the given language (lang). This has the same
functionality as the +nls option.
+z
See +pic=short in this chapter. If +z is specified when creating 64-bit code, it instead
maps to +Z.
+Z
See +pic=long in this chapter for a description. Note that when creating 64-bit shared
executables (such as when +DA2.0W is specified), the +Z option is on by default. This
is the only PIC option supported for 64-bit executables.
NOTE: To not generate position-independent code for 64-bit executables, specify the
-Wl,-noshared option.
Symbol binding options
The following -B options are recognized by HP Fortran to specify whether references to global
symbols may be resolved to symbols defined in the current translation unit, or whether they must
be assumed to be potentially resolved to symbols defined in another load module.
All -B options are applied in a ‘left to right’ order so that options at the end of a string may
supersede those that appear earlier in the string.
-Bdefault=symbol[,symbol...]
The named symbols are assigned the default export class. These symbols may be imported
or exported outside of the current load module. The compiler will access tentative
symbols through the linkage table. Any symbol that is not assigned to another export class
through use of another -B option (or the deprecated +O[no]extern option) will have the
default export class. -Bdefault can be used on a per-symbol basis to specify exceptions
to global -Bprotected, -Bhidden, or -Bextern options.
-Bdefault:filename
The file indicated by filename contains a list of symbols, separated by spaces or newlines.
These symbols are assigned the default export class.
-Bextern[=symbol[,symbol...]]
The named symbols, or all undefined symbols if no list is provided, are assigned the
default export class. Additionally, the compiler will inline the import stub for calls to
these symbols. No compile-time binding of these symbols will be done. All references to
these symbols will be through the linkage table, so an unnecessary performance penalty
will occur if -Bextern is applied to a listed symbol that is resolved in the same load module.
-Bextern:filename
The file indicated by filename contains a list of symbols, separated by spaces or newlines.
These symbols are assigned the default export class. Additionally, the compiler will inline
the import stub for calls to these symbols. No compile-time binding of these symbols will
be done. All references to these symbols will be through the linkage table, so an
unnecessary performance penalty will occur if -Bextern is applied to a symbol that is
resolved in the same load module.
-Bhidden[=symbol[,symbol...]]
The named symbols, or all symbols if no symbols are specified, are assigned the hidden
export class. The hidden export class is similar to the protected export class. These
symbols will not be preempted by symbols from other load modules, so the compiler may
bypass the linkage table for both code and data references and bind them to
locally-defined code and data symbols. In addition, hidden symbols will not be exported
outside the current load module. The linker may eliminate them from a shared library,
but in an executable, they remain accessible to the debugger unless +Oprocelim is also
specified.
When used with no symbol list, -Bhidden implies -Wl,-aarchive_shared, causing the
linker to prefer an archive library over a shared library if one is available. This can be
overridden by following the -Bhidden option with a subsequent -Wl,-a option.
-Bhidden:filename
The file indicated by filename contains a list of symbols, separated by spaces or newlines.
These symbols are assigned the hidden export class.
-Bprotected_data
Marks only data symbols as having the protected export class.
-Bprotected_def
This is the same as -Bprotected, but only locally-defined (non-tentative) symbols are
assigned the protected export class.
-Bsymbolic
All symbols are assigned the protected export class. This is equivalent to -Bprotected
with no symbol list.
Using optimization options
The options described in this section allow you to control the different optimizations that the compiler
can apply to your program. These options fall into two categories:
• Options that control classes of optimization (for example, optimizations that affect code size)
• Options that control specific optimizations (for example, inlining)
The following subsections describe the options in both categories. For information about the options
that control levels of optimization, see the description of the +On option (page 27). The
+O[no]info option, which provides compile-time information about the optimization process, is
described in the same section.
NOTE: You can insert (or remove) underscore characters in the names of any of the optimization
options to improve their readability. The compiler will recognize the option name with or without
underscores.
Reviewing general optimization options
The following options allow you to control how optimization affects code size, compilation time,
runtime performance, and other user-visible effects. The syntax for using these options is:
+O[no]optimization
where optimization is a parameter that specifies the class of optimization to apply to
your program. The different parameters are described below. The prefix no negates the
effect of optimization.
Except for +Oall, +Ofast, and +Ofaster, the options do not override a specified level of
optimization, nor do they imply a particular level. (The +Oall option automatically invokes the
highest level of optimization. +Ofast invokes the +O2 level of optimization in addition to a few
other specific flags described later, whereas +Ofaster invokes the +O3 level of optimization.) To
use any of these options you must also include the +On option on the same command line, where n
specifies the level at which the type of optimization is effective. Thus, if you wish to apply all
optimizations available at level 3 except those that might significantly increase code size, you
would use the command line:
f90 +O3 +Osize my_prog.f90
If an option is mistakenly used at a level at which the corresponding optimization is not performed,
the compiler will issue a warning message.
The defaults specified in the following descriptions are in effect only at the specified optimization
levels, unless stated otherwise.
+O[no]aggressive
+Oaggressive enables optimizations that can result in
significant performance improvement but can also change
a program’s behavior. This option is only effective at
optimization level 2 or higher.
The +Oaggressive option performs optimizations invoked
by the following options:
• +Oentrysched
• +Olibcalls
• +Onofltacc
• +Onoinitcheck
• +FPD
+FPD is enabled only if +Oaggressive is used on the
link line.
NOTE: The +Oaggressive option is incompatible with
+Oconservative.
The default is +Onoaggressive.
NOTE: This option is only valid on the PA-RISC systems.
+O[no]all
+Oall performs maximum optimization, including aggressive optimizations and
optimizations that can significantly increase compile time and memory usage.
The +Oall option automatically invokes the highest level of optimization.
The default is +Onoall.
This option has been deprecated on HP-UX version 11i and later.
+O[no]autopar
+O[no]autopar controls automatic parallelization of loops that are deemed safe and
profitable by the loop parallelizer.
The auto-parallelization feature enables applications to use the idle resources on
multicore or multiprocessor systems by automatically parallelizing eligible loops.
The +Oautopar option can be used for programs at optimization levels +O3 and above.
If both +Oopenmp and +Oautopar are specified on the command line, the OpenMP
directives take precedence over the +Oautopar option. The compiler auto-parallelizes
only the loops that are not controlled by OpenMP directives.
On HP 9000 systems, when +Oautopar is used with the +Oparallel option, +O4 is
inhibited. If automatic parallelization is desired along with +O4, then use +Oautopar
and omit +Oparallel.
To compile a program with +Oautopar support, the libcps, libomp, and libpthreads
runtime support libraries must be present at compile time and runtime. This feature is
available on Itanium®-based systems and HP 9000 systems.
+O[no]conservative
The +Oconservative option sets the following options:
• +Ofltacc
• +Onofltacc
• +Onomoveflops
• +Oparmsoverlap
Use +Oconservative when conservative assumptions are necessary due to the coding
style, as with nonstandard-conforming programs. Note that it is incompatible with
+Oaggressive.
The +Onoconservative option relaxes the optimizer’s assumptions about the target
program.
The default is +Onoconservative. This option has been deprecated starting with HP-UX
version 11i.
+O[no]limit
+Olimit suppresses optimizations that significantly increase compilation time or that
can consume large amounts of memory at compile time. This option is only effective at
optimization level 2 or higher.
The +Onolimit option allows optimizations to be performed regardless of their effect on
compilation time or memory usage. The default is +Olimit.
+O[no]loop_transform
Enables [disables] the following transformations: loop unroll and jam, loop distribution,
loop interchange, loop blocking, loop fusion, and loop unroll. The default is
+Oloop_transform.
+O[no]size
+Osize suppresses optimizations that significantly increase code size. This option is only
effective at optimization level 2 or higher. The +Onosize option permits optimizations
that can increase code size. The default is +Onosize.
Fine-tuning optimization options
The following options allow you to fine-tune the optimization process by providing control over the
specific techniques that the optimizer applies to your program. The syntax for using these options
is
+O[no]optimization
where optimization is a parameter that specifies an optimization technique to apply to
your program. The different parameters are described below. The prefix no negates the
effect of optimization.
The options do not override a specified level of optimization, nor do they imply a particular level.
To use any of these options you must also include the +On option on the same command line,
where n specifies the level at which the type of optimization can be performed.
For example, if you find that the optimizer is causing your program to produce different floating-point
results from those produced by the unoptimized program, you could use the following command
line to suppress optimizations that affect floating-point calculations:
f90 +O3 +Onomoveflops +Ofltacc my_prog.f90
If an option is mistakenly used at a level for which the corresponding optimization is not performed,
the compiler will issue a warning message.
The defaults given in the following descriptions are in effect only at the specified optimization
levels, unless stated otherwise.
+O[no]cache_pad_common
+Ocache_pad_common can improve program
performance by padding common blocks to avoid
cache collisions. Cache-line collisions occur when the
difference between the addresses of two data points
is a multiple of the cache size. By inserting empty
space between large variables (for example, arrays),
the optimizer ensures that they do not start at nearby
addresses, where the possibility of a cache collision
is greater. This option is only effective at optimization
level 3 or higher.
Note the following precautions when using this option:
• All program modules that reference the common
block must be compiled with the
+Ocache_pad_common option.
• Each common block in the program should have
the same layout in all program units within which
it is declared. If the layouts are different, they must
be fully independent—that is, they must not pass
values between them.
The default, +Onocache_pad_common, disables
padding.
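The collision condition described above can be expressed as a simple check on two addresses. The sketch below is a model for illustration only: the module and function names are hypothetical, and the cache size and padding amount used in the example values are assumptions, not properties of any particular HP system or of the padding the compiler actually inserts.

```fortran
module cache_collision_demo
  implicit none
contains
  ! True when two byte addresses differ by a whole multiple of the cache
  ! size, which is the collision condition stated in the text.
  function collides(addr_a, addr_b, cache_bytes) result(hit)
    integer, intent(in) :: addr_a, addr_b, cache_bytes
    logical :: hit
    hit = mod(abs(addr_a - addr_b), cache_bytes) == 0
  end function collides
end module cache_collision_demo
```

For example, with an assumed 32768-byte cache, two arrays that start 32768 bytes apart collide (collides(0, 32768, 32768) is true), while inserting 64 bytes of padding between them removes the collision (collides(0, 32832, 32768) is false).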
+O[no]cxlimitedrange
Enables [disables] the use of floating point math in the compilation unit.
The default is +Onocxlimitedrange.
+O[no]cross_region_addressing
Enable [disable] the use of cross-region addressing. Cross-region addressing is
required if a pointer (such as an array base) points to a different region than the
data being addressed. This is usually due to an offset which results in a cross-over
into another region. Standard-conforming applications do not require the use of
cross-region addressing. The default is +Onocross_region_addressing.
+O[no]dataprefetch
+Odataprefetch causes the optimizer to insert instructions within innermost loops to
explicitly prefetch data from memory into the data cache. Data prefetch instructions
will be inserted only for data structures referenced within innermost loops using
simple loop varying addresses, that is, in a simple arithmetic progression. It is not
available for PA-RISC 1.1 targets.
Use this option for applications that have high data cache miss overhead. The default
is +Onodataprefetch. On HP-UX version 11i and later, +Odataprefetch is the same as
+Odataprefetch=indirect and +Onodataprefetch is the same as +Odataprefetch=none.
At +O2 and higher, the default is +Odataprefetch.
+O[no]dataprefetch=[direct|indirect|none]
Control generation of data prefetch instructions for data structures referenced within
innermost loops. The defined values for kind are:
direct: enable generation of data prefetch instructions for the benefit of direct
memory accesses, but not indirect memory accesses.
indirect: enable generation of data prefetch instructions for the benefit of both
direct and indirect memory accesses. This is the default at optimization levels +O2
and above.
none: disable generation of data prefetch instructions. This is the default at
optimization levels +O1 and below.
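The guide does not define direct and indirect memory accesses at this point. The sketch below assumes the usual meaning: a direct access indexes an array with the loop variable, so addresses form an arithmetic progression, while an indirect access goes through an index array. The module and function names are hypothetical.

```fortran
module prefetch_access_demo
  implicit none
contains
  ! Direct access: the address of a(i) advances by a fixed stride.
  function sum_direct(a) result(total)
    real, intent(in) :: a(:)
    real :: total
    integer :: i
    total = 0.0
    do i = 1, size(a)
      total = total + a(i)
    end do
  end function sum_direct

  ! Indirect access: the address of a(idx(i)) depends on the contents
  ! of idx, so it is not a simple arithmetic progression.
  function sum_indirect(a, idx) result(total)
    real, intent(in) :: a(:)
    integer, intent(in) :: idx(:)   ! every entry must be a valid index into a
    real :: total
    integer :: i
    total = 0.0
    do i = 1, size(idx)
      total = total + a(idx(i))
    end do
  end function sum_indirect
end module prefetch_access_demo
```

Under this assumption, the loop in sum_direct is the kind of loop that +Odataprefetch=direct targets, and the loop in sum_indirect would additionally need +Odataprefetch=indirect.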
+O[no]dynopt
Enable [disable] dynamic optimization for the output
file, if the run-time environment supports this feature.
Both forms of this option change the default setting,
which allows the run-time environment to enable or
disable dynamic optimization according to
a system-wide default. This option applies only to
executable files and shared libraries. chatr(1) can
be used to change this setting, including restoration
of the default setting, after the output file has been
created.
+O[no]entrysched
+Oentrysched allows the optimizer to perform
instruction scheduling on a subprogram’s entry and
exit code sequences. This option is only effective at
optimization level 1 or higher.
The option can change the behavior of programs that
perform exception-handling or that handle
asynchronous interrupts.
The default is +Onoentrysched.
NOTE:This option is valid only on the PA-RISC
systems.
+O[no]failsafe
Enable [disable] failsafe optimization. When a compilation fails at the current
optimization level, +Ofailsafe will automatically restart the compilation at +O2 (for
specific high level optimizer errors +O3/+O4) or +O1 for Itanium® processor family
architecture and +O0 for PA-RISC architecture. The default is +Ofailsafe.
+Ofast
Select a combination of compilation options for optimum execution speed of the
compiled program. Those options include: +O2, +Olibcalls, +Onolimit,
+Ofltacc=relaxed, and +FPD. +Ofast also causes the +pi4M and +pd4M options to be
passed to the linker. In addition to these options, on the Itanium processor family
architecture, +DSnative instruction scheduling is selected and +mergeseg is also passed
to the linker.
+Ofaster
Selects the +Ofast option at optimization level +O3 for PA-RISC based architectures.
For Itanium processor family architectures, +Ofaster runs at +O4 and implies -ipo.
+O[no]fenvaccess
Inform the compiler that the program accesses [does not access] the floating point
environment to test flags or run under non-default modes. If it knows that a program
does not access the floating point environment, the compiler is allowed to perform
certain optimizations that it otherwise may not perform, such as global common
subexpression elimination, code motion, or constant folding.
The default is +Onofenvaccess.
+O[no]fastaccess
+Ofastaccess improves execution time by speeding up access to global data items. You
can use this option at optimization level 1 or higher. Note that the +Ofastaccess
option may increase link time. The default is +Onofastaccess at optimization levels 1,
2, and 3; and +Ofastaccess at optimization level 4.
NOTE: This option is valid only on
+O[no]fltacc
+Onofltacc enables optimizations that follow the
rules of algebra but may change the order of
expression evaluation. For example, if a, b, and c are
floating-point variables, the expressions (a + b) + c
and a + (b + c) may give slightly different results
due to roundoff.
The +Onofltacc option also enables the fusion of
adjacent multiply and add operations—resulting in
Fused Multiply-Add (FMA). (At optimization level 2 or
higher, FMA occurs by default.) FMA improves
performance but occasionally produces results that
may differ in accuracy from results produced by code
where fusion has not occurred. In general, the
differences are slight.
Table 18 Optimizations performed by +O[no]fltacc
+Ofltacc disables optimizations that change the
order of expression evaluation and therefore may affect
the accuracy of the result. The +Ofltaccoption also
disables fusing.
NOTE:+Ofltacc is the same as
+Ofltacc=strict. +Onofltacc is the same as
+Ofltacc=relaxed.
Table 2-9 identifies the different actions taken by the
optimizer, according to whether you specify
+Ofltacc, +Onofltacc, or neither option. In all
cases, the table assumes that you are compiling at
optimization level 2 (+O2) or higher.
FMA?Expression reordering?+O[no]fltacc
YesNo
NoNo+Ofltacc
YesYes+Onofltacc
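The roundoff effect that +Ofltacc guards against can be seen without a compiler. The following is a minimal sketch that uses awk, which computes in double precision, purely as a calculator; the function name sum_both_ways and the values of a, b, and c are chosen for this illustration and do not come from HP Fortran.

```shell
#!/bin/sh
# Illustration only: awk evaluates in double precision, so it shows the same
# roundoff effect that expression reordering can cause in compiled code.
sum_both_ways() {
    awk 'BEGIN {
        a = 1e16; b = -1e16; c = 1
        left  = (a + b) + c    # a and b cancel first, so c survives
        right = a + (b + c)    # c is absorbed into b before the cancellation
        printf "%g %g\n", left, right
    }'
}
sum_both_ways
```

With IEEE double arithmetic the two printed sums are expected to differ, which is the kind of change that reordering under +Onofltacc can introduce.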
+O[no]fltacc=[strict|default|limited|relaxed]
Control the level of floating point
optimizations that the compiler may
perform. The defined values are:
strict: disallow any floating point
optimization that can result in numerical
differences. This is the same as
+Ofltacc.
default: allow contractions, such as
fused multiply-add (FMA), but disallow
any other floating point optimization that
can result in numerical differences.
limited: like default, but also allow
floating point optimizations which may
affect the generation and propagation of
infinities, NaNs, and the sign of zero.
Also implies +Ocxlimitedrange.
relaxed: in addition to the
optimizations allowed by limited, permit
optimizations, such as reordering of
expressions, even if parenthesized, that
may affect a rounding error. This is the
same as +Onofltacc. Also implies
+Ocxlimitedrange.
+Ofrequently_called=function1[,function2...]
The named functions are assumed to be
frequently called. This option overrides
any information in a profile database.
+Ofrequently_called:filename
The file indicated by filename contains a
list of function names, separated by
spaces or newlines. These functions are
assumed to be frequently called. This
option overrides any information in a
profile database.
+O[no]info
+Oinfo causes the compiler to display
informational messages about the
optimization process. The +Oinfo option
provides feedback that can help you to
determine whether the compiler optimized
time-critical sections of your program. It
can be used at any level of optimization
but is most useful at level 3.
Currently, this option provides feedback
for the following optimizations:
• Cloning, the replacement of a call to
a routine by a call to a clone, which
is a copy of the routine with changes
specific to that call site.
• Inlining.
• Loop transformations to improve cache
performance.
• Vectorization.
The default, +Onoinfo, disables the
display of informational messages about
optimization.
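As an illustration of the +Ofrequently_called:filename form described above, the following sketch writes a function-list file and prints a command line that would use it. The file name hot_funcs.txt, the function names, and prog.f90 are hypothetical, and the f90 command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical names throughout: hot_funcs.txt, solve_step, update_grid,
# flux, and prog.f90 are placeholders for this sketch.
list=hot_funcs.txt

# One function name per line; spaces are equally valid separators.
printf '%s\n' solve_step update_grid flux > "$list"

# Command line that would hand the list to the compiler (printed, not run).
cmd="f90 +Ofrequently_called:$list prog.f90"
echo "$cmd"
```

The same file layout applies to the +Orarely_called:filename form, which takes a list of infrequently called functions.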
+O[no]initcheck
The initialization checking feature of the
optimizer has three possible states: on,
off, or unspecified. When this option is
specified in the on state (+Oinitcheck),
the optimizer initializes to zero any local,
nonarray, nonstatic variables that are
uninitialized with respect to at least one
path leading to a use of the variable.
When +Onoinitcheck is specified, the
optimizer issues warning messages when
it discovers definitely uninitialized
variables, but does not initialize them.
When this option is unspecified, the
optimizer initializes to zero any local,
scalar, nonstatic variables that are
definitely uninitialized with respect to all
paths leading to a use of the variable.
This option is only effective at optimization
level 2 or higher.
+O[no]inline
+Oinline makes all subprograms
eligible for inlining. This option is only
effective at optimization level 3 or higher.
The +Onoinline option disables inlining
for all subprograms in your program.
The default is +Oinline at optimization
level 3 and +Onoinline at the lower levels.
+O[no]inline:filename
The file indicated by filename contains a
list of function names, separated by
spaces or newlines. This option enables
[disables] optimizer inlining for the named
functions. This optimization can occur at
optimization levels 3 and 4.
The default is +Oinline.
+O[no]inline=function1[,function2...]
Enable [disable] optimizer inlining for the
named functions. This optimization can
occur at optimization levels 3 and 4.
The default is +Oinline.
+Oinline_budget=n
+Oinline_budget enables the
optimizer to perform more aggressive
inlining. This option has the following
syntax:
+Oinline_budget=n
where n is an integer in the range 1 -
1000000 that specifies the level of
aggressiveness, as listed in Table 19.
The +Onolimit and +Osize options
also affect inlining. Specifying the
+Onolimit option has the same effect as
specifying +Oinline_budget=200. The
+Osize option has the same effect as
+Oinline_budget=1.
Note, however, that the
+Oinline_budget option takes
precedence over both of these options.
This means that you can override the effect
of the +Onolimit or +Osize option on
inlining by specifying the
+Oinline_budget option on the same
command line.
This option is only effective at optimization
level 3 or higher.
Table 19 Values for the +Oinline_budget option
Values for n    Meaning
= 100           Default level of inlining.
> 100           More aggressive inlining. The optimizer is less restricted by
                compilation time and code size when searching for eligible
                routines to inline.
2 - 99          Less aggressive inlining. The optimizer gives more weight to
                compilation time and code size when determining whether to inline.
= 1             Only inline if it reduces code size.
+O[no]libcalls
+Olibcalls invokes millicode versions of a number of frequently called intrinsic functions; see
Table 20. Millicode routines have very low call overhead and provide no error-handling. Use this
option to improve the performance of selected library routines only when your program does
not depend upon exception-handling.
The default is +Onolibcalls at optimization levels 0 and 1; at optimization level 2 or higher,
the default is +Olibcalls.
Table 20 Millicode versions of intrinsic functions
acos     cos      pow
asin     exp      sin
atan     log      tan
atan2    log10
+O[no]libmerrno
Enable [disable] support for errno in libm functions.
The default is +Onolibmerrno.
+O[no]loop_block
+O[no]loop_block enables or disables blocking of
eligible loops for improved cache performance. The
+Onoloop_block option disables both automatic and
directive-specified loop blocking.
+O[no]loop_transform
+Oloop_transform enables transformation of eligible
loops for improved cache performance. The most
important transformation is the interchange of nested
loops to make the inner loop unit stride, resulting in fewer
cache misses. +Onoloop_transform disables
transformation of eligible loops. The default is
+Oloop_transform.
NOTE: This option is deprecated and not supported in
future releases. Use +inline_level.
+inline_level num
This option controls inlining in Fortran. The format for num
is N[.n], where num is either an integral value from 0
to 9 or a value with a single decimal place from 0.0 to
9.0. For more information on this option, see the
f90(1) man page.
+O[no]loop_unroll[=factor]
+Oloop_unroll turns on loop unrolling. factor is the
unroll factor that controls the code expansion. The default
unroll factor is 4; that is, four copies of the loop body.
By experimenting with different factors, you may improve
the performance of your program. This option is only
effective at optimization level 2 or higher. The default is
+Oloop_unroll=4.
+O[no]loop_unroll_jam
+Oloop_unroll_jam enables loop unrolling and
jamming. +Onoloop_unroll_jam (the default) disables
both automatic and directive-specified unroll and jam.
Loop unrolling and jamming increases register
exploitation.
+O[no]moveflops
+Omoveflops allows the optimizer to move conditional
floating-point instructions, enabling other optimizations
to occur. This option is only effective at optimization level
2 or higher.
The behavior of floating-point exception handling may
be altered by this option.
Using +Onomoveflops is recommended if floating-point
traps are enabled and you do not want the behavior of
floating-point exceptions to be altered by the relocation
of floating-point instructions, as when your program uses
the ON statement. The default is +Omoveflops.
+Oprefetch
+O[no]multiprocessor
+Omultiprocessor tells the compiler to appropriately
optimize several different processes on multiprocessor
machines. The optimizations are those appropriate for
executables and shared libraries.
+Onomultiprocessor, the default, disables the
optimization of more than one process running on a
multiprocessor machine.
NOTE: This option is valid only on the PA-RISC systems.
+O[no]openmp
+Oopenmp allows users to enable the OpenMP directives.
+Onoopenmp will disable the OpenMP directives.
+O[no]openmp is accepted at all optimization levels.
The default is +Onoopenmp.
+O[no]parallel
+Oparallel causes the compiler to transform eligible
loops for parallel execution on multiprocessor machines.
This option is effective only at optimization level 3 or
higher. This option is available only on HP 9000 systems.
If you link separately from the command line and compile
the program with the +Oparallel option, you must link
with the f90 command and specify the
+Oparallel option to link in the correct runtime support.
The +Onoparallel option disables parallelization for
the target program. It is the default at all levels of
optimization. The +Oparallel option is not available
on Itanium®-based systems for HP Fortran Version 3.2
and later. You can use the +Oautopar option instead
of +Oparallel on Itanium®-based systems.
NOTE: The +Oparallel option should not be used
for programs that make explicit calls to the kernel threads
library.
+Oparallel_intrinsics
+Oparallel_intrinsics links in the parallel version
of many of the Fortran intrinsics located in
libF90_parallel.
+O[no]parmsoverlap
+Oparmsoverlap causes the optimizer to assume that
the actual arguments of function calls overlap in memory,
thus preventing any optimizations that violate this
assumption. This option is only effective at optimization
level 2 or higher.
Use the +Onoparmsoverlap option with programs that
conform to the standard requirement that parameters
must not overlap.
The default is +Onoparmsoverlap.
+O[no]pipeline
+Opipeline enables software pipelining. This option is
only effective at optimization level 2 or higher. Use
+Onopipeline (disable software pipelining) to conserve
code space. The default is +Opipeline.
NOTE: This option is valid only on the PA-RISC systems.
+Oprefetch_latency=num
Applies to loops for which the compiler generates data
prefetch instructions. For a given loop, the compiler
divides num (the latency in cycles) by the estimated loop
length to arrive at the number of loop iterations for which
to generate advance prefetches. The default value is 480
cycles for loops containing floating-point accesses and 150
cycles for loops that do not contain any floating-point accesses.
For tuning purposes, it is recommended that users
measure their application’s performance using a few
different prefetch latency settings to determine the optimal
value. Some floating-point codes may benefit by
increasing the distance to 960. Parallel applications
frequently benefit from a shorter prefetch distance of 150.
+O[no]procelim
When +Oprocelim is specified, procedures that are not
referenced by the application are eliminated from the
output executable file. When +Onoprocelim is
specified, procedures that are not referenced by the
application are not eliminated from the output executable
file. You can use this option at any level of optimization.
Use +Oprocelim to reduce the size of the executable
file, especially when optimizing at levels 3 and 4, when
inlining can remove all calls to some routines.
The default is +Onoprocelim at levels 0-3, and
+Oprocelim at level 4.
+Oprofile=use:filename
Specify filename as the name of the profile database file.
This option is a synonym for the +P, +df filename options.
The :filename argument is optional (the default
filename is flow.data). See ld(1), +P, and +df for
more details. This option is only valid for Itanium®-based
applications.
+Oprofile=collect<qualifiers>
+Oprofile=collect instruments the application for
profile-based optimization.
In the absence of qualifiers, +Oprofile=collect is
the same as +I.
<qualifiers> are a comma-separated list of profile
collection qualifiers. Supported profile collection qualifiers
are:
arc: collect arc counts (synonym to
+Oprofile=collect). This is the default value.
stride: collect stride data.
all: collect all types of profile data. This is a synonym
for +Oprofile=collect:arc,stride.
This option is only valid for Itanium®-based applications.
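The two +Oprofile forms are typically used together in a collect, run, recompile cycle. The sketch below prints one plausible sequence rather than executing it; the helper name pbo_steps, the source and program names, and the assumption that running the instrumented program writes flow.data in the current directory are illustrative and should be checked against your installation.

```shell
#!/bin/sh
# Dry run: each step is printed, not executed, so the sketch works on a
# machine without HP Fortran. pbo_steps is a hypothetical helper.
pbo_steps() {
    src=$1
    exe=$2
    echo "f90 +Oprofile=collect:arc,stride -o $exe $src"   # 1. build instrumented program
    echo "./$exe"                                          # 2. run it on representative input
    echo "f90 +Oprofile=use:flow.data -o $exe $src"        # 3. rebuild using the collected profile
}
pbo_steps prog.f90 prog
```

Because flow.data is the documented default database name, the :flow.data suffix in the last step could be omitted; it is spelled out here to show where a different file name would go.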
+O[no]promote_indirect_calls
Enable [disable] the promotion of indirect calls to direct
calls. Indirect calls occur with pointers to functions. This
option can be used at optimization levels 3 and 4. The
default is +Onopromote_indirect_calls.
+O[no]ptrs_to_globals[=name1,name2,...,nameN]
Tells the optimizer whether global variables are modified
[are not modified] through pointers. This optimization
can occur at levels 2, 3, and 4. The default is
+Optrs_to_globals.
+Orarely_called=function1[,function2...]
The named functions are assumed to be infrequently
called. This option overrides any information in a profile
database.
+Orarely_called:filename
The file indicated by filename contains a list of function
names, separated by spaces or newlines. These functions
are assumed to be infrequently called. This option
overrides any information in a profile database.
+O[no]recovery
Generate [do not generate] recovery code for control
speculation. The default is +Onorecovery.
+O[no]regreassoc
+Onoregreassoc disables register reassociation. This
option is only effective at optimization level 2 or higher.
Use +Onoregreassoc to disable register reassociation
in the rare case that this optimization degrades
performance.
+Oregreassoc is the default.
NOTE: This option is valid only on the PA-RISC systems.
+O[no]report
+Oreport specifies the contents of the Optimization
Report.
+Oshortdata[=size]
All objects of size bytes or smaller will be placed in the
short data area, and references to such data will assume
it resides in the short data area. Valid values of size are 0,
or a decimal number between 8 and 4,194,304 (4MB).
If no size is specified, all data is placed in the short data
area. If size is 0, no data will be placed in the short data
area, and all data references will use long offsets. The
default is +Oshortdata=8.
NOTE: Using a value that is too big, or omitting the
optional size (possibly through +Ofast), may give various
linker fixup errors if there is more than 4MB of short
data.
NOTE: All object files participating in a link need to
be consistently compiled with respect to +Oshortdata.
Since the +Ofaster and +Ofast options imply
+Oshortdata, other Fortran object files which reference
common blocks should also be explicitly compiled with
+Oshortdata. Alternatively, the +Ofaster and
+Ofast compilations can be compiled with an additional
+Oshortdata=8 appended to the command line.
+O[no]sideeffects=name1, name2,...nameN
Optimize with the assumption that the specified
subprograms do [do not] modify global variables. This
option can be used at optimization level 2 or higher. The
default is to assume that all subprograms have side effects
unless the optimizer can determine that there are none.
+O[no]static_prediction
Enables [disables] the use of static branch prediction for
decision on conditional branches. This is more applicable
to large programs with poor locality. This option is
available at optimization level 3 and above.
NOTE: This option is valid only on the PA-RISC systems.
+O[no]store_ordering
Preserve [do not preserve] the original program order
for stores to memory that is potentially visible to multiple
threads. This does not imply strong ordering. The default
is +Onostore_ordering.
+O[no]vectorize
+Ovectorize causes the compiler to replace certain
loops with calls to the math library. This option is only
effective at optimization level 3 or higher.
If you link separately from the command line and you
compiled with the +Ovectorize option, you must ensure
that the link line causes the math library to be searched.
+Onovectorize is the default.
NOTE: This option is valid only on the PA-RISC systems.
+O[no]whole_program_mode
Assert [do not assert] that only files compiled with this
option directly reference any global variables and
procedures that are defined in these files. This option
enables optimizations that assume that there are no
unseen accesses to the globals. This option can be used
at optimization level 4.
The default is +Onowhole_program_mode.
Filenames
The f90 command accepts files with any of the filename extensions listed in Table 21. The table
also describes the meaning each name has for the f90 command. Files with names other than
those listed in the table are passed to the linker.
Table 21 Filenames recognized by f90
Filenames    Meaning
file.f90     Free-form Fortran source code; processed by the compiler.
file.f       Fixed-form Fortran source code; processed by the compiler.
file.F       Fixed-form Fortran source code; first processed by the C preprocessor (cpp), then
             by the compiler.
file.i90     Free-form output from the C preprocessor (if the source file ends in .f90); processed
             by the compiler.
file.i       Fixed-form output from the C preprocessor (if the source file ends in .F or .f);
             processed by the compiler.
file.o       Object code; passed to the linker (ld).
file.s       Assembly language code; passed to the assembler (as).
NOTE: The compiler generates a .mod file for each file that defines a Fortran module. It also reads
the .mod files when compiling source files that use modules. Do not specify .mod files on the
command line. If you do, the compiler will pass them to the linker, which will try (and fail) to link
them into the executable. For more information about .mod files, see “Compiling programs with
modules” (page 54).
Linking HP Fortran programs
This section discusses how to link object files and covers the following topics:
• The advantages of using the f90 command for linking as well as for compiling
• How to link libraries, including shared libraries
• How to establish the search rules used by the linker when it looks for libraries
For more information about the linker, see Programming on HP-UX and the ld(1) man page.
Linking with f90 vs. ld
By default, the f90 command both compiles and links, producing an executable program. You
can modify this behavior with the -c option, which causes f90 to compile only, writing the object
files (if the compilation is successful) in the current working directory. If the command line contains
object files only, f90 passes them to the linker (ld) for linking into the executable program. In other
words, you can use the f90 command to compile and link in one command line or in separate
command lines. You do not need to invoke the ld command separately.
In fact, we recommend that you use the f90 command whenever you link HP Fortran object files
and that you use the same command line for linking as for compiling.
When you use the f90 command to compile and link in the same command line, the driver passes
certain information (search paths, library names, and options) to the linker. If you use the ld
command to link separately, you must specify this same information on the ld command line. Not
doing so can cause the link to fail. Using the same f90 command line to link as you use to compile
avoids the problem of passing insufficient or incorrect information to the linker.
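A minimal sketch of the two-step form, with f90 used for both compiling and linking, might look like the following. The commands are printed rather than executed so that the sketch runs anywhere; build_cmds, the OPTS value, and the file name hello are placeholders chosen for this illustration.

```shell
#!/bin/sh
# Dry run: the commands are printed, not executed. OPTS and the base name
# are placeholders; build_cmds is a hypothetical helper.
OPTS="+O2"
build_cmds() {
    base=$1
    echo "f90 $OPTS -c $base.f90"    # compile only; writes $base.o
    echo "f90 $OPTS $base.o"         # link with the same driver and options; f90 calls ld
}
build_cmds hello
```

Keeping the options in one variable is one way to make sure the link step sees the same settings as the compile step.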
To see what information f90 passes to the linker, compile with the -v option (verbose mode). Here
is the hello.f90 program (listed in “Compiling with the f90 command” (page 18)) compiled in
verbose mode. The lines are numbered for the convenience of referencing:
Line 1 is the f90 command line.
Line 2 is the information f90 passes to the compiler, including the full pathname of the compiler,
the name of the source file (hello.f90), and the internal names of the option settings as determined
by the defaults and the f90 command line.
Lines 3 - 6 show the progress of the compilation; line 6 indicates that the compilation was successful.
Line 7 displays the value to which f90 has defined the LPATH environment variable. If you use
the ld command to link hello.f90, you must define LPATH on the command line before invoking
the linker. See “LPATH environment variable” (page 65).
Line 8 is the command line that f90 passes to the linker (ld). If you use the ld command to link
hello.f90, the command line should be similar to the one shown here.
As noted in the comments on lines 7 and 8, compiling and linking hello.f90 successfully using
both the f90 and ld commands requires three command lines:
$ f90 -c hello.f90 # compile
$ export LPATH=/opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:\
/opt/fortran90/lib:/usr/lib:/opt/langtools/lib # set LPATH
-lcl -lc -lisamstub # link
The command line to set LPATH in the csh is:
$ setenv LPATH /opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:\
/opt/fortran90/lib:/usr/lib:/opt/langtools/lib # set LPATH
For more information about the linker, see the ld(1) man page. For a list of f90 options that you
can use to control the linker, see Table on page 14. To pass linker options from the f90 command
line to the linker, use the -Wl option (for an example, see “Linking to shared libraries” (page 53)).
The HP Fortran Programmer’s Reference fully describes the -Wl option.
Linking to libraries
When you use the f90 command to create an executable program, the linker looks in the libraries
listed in Table 22 (PA-RISC) and Table 23 (Itanium) to resolve references. By default, the linker
uses the shared libraries, if available.
For information about shared libraries, see “Linking to shared libraries” (page 53).
The libisamstub library is provided as a tool for migrating HP FORTRAN 77 programs that
call ISAM routines. The ISAM library is not available with HP Fortran, but the stub library allows
the linker to resolve references to ISAM routines in HP FORTRAN 77 programs.
Table 22 Libraries linked by default on PA-RISC
Library                                 Contents
/usr/lib/libcl.a                        Archive version of HP Fortran runtime library
/usr/lib/libcl.sl                       Shared version of HP Fortran runtime library
/opt/fortran90/lib/libF90.a             Archive library of array intrinsic procedures
/usr/lib/libc.sl                        Shared library of intrinsic procedures and system routines
/opt/fortran90/lib/libisamstub.a        Archive libraries of stubs to satisfy ISAM references
/opt/fortran90/lib/libisamstubs.a
/usr/lib/libisamstub.sl                 Shared library of stubs to satisfy ISAM references
Table 23 Libraries linked by default on Itanium
Library                                       Contents
/usr/lib/hpux[32|64]/libIO77.a                Archive version of HP Fortran runtime library
/usr/lib/hpux[32|64]/libIO77.so               Shared version of HP Fortran runtime library
/usr/lib/hpux[32|64]/libF90_parallel.a        Archive library for Fortran intrinsic procedures on parallel mode
/usr/lib/hpux[32|64]/libF90_parallel.so       Shared library for Fortran intrinsic procedures on parallel mode
/usr/lib/hpux[32|64]/libF90.a                 Archive library of array intrinsic procedures
/usr/lib/hpux[32|64]/libF90.so                Shared library for Fortran intrinsic procedures
/usr/lib/hpux[32|64]/libc.so                  Shared library for intrinsic procedures and system routines
/opt/fortran90/lib/hpux[32|64]/libF2003.a     Archive library for Fortran 2003 standard features
Linking to nondefault libraries
The -l option enables you to specify other libraries for linking, in addition to the default libraries
listed in Table 22 and Table 23. The syntax for this option is:
-lx
where x is a sequence of characters that completes a library name of the form /lib/libx.a or
/usr/lib/libx.a. For example, -lm specifies the math library, /usr/lib/libm.a. (The
.a extension indicates an archive library. You can also link to shared libraries, which have the
.sl extension; see “Linking to shared libraries” (page 53).)
The -l option is order-sensitive: when the linker finds a reference in an object file that it cannot
resolve in the default libraries, it searches the libraries (if any) specified after the file on the command
line. For example, the following command line tells the linker to look for unresolved references in
the math library as well as the default libraries:
$ f90 prog.f90 -lm
You can also link a library to your program by specifying its name after the name of the source
file that references it, as follows:
$ f90 prog.f90 /usr/lib/libm.a
This form of the command line is useful for linking libraries that do not conform to the naming
convention required by the -l option or that reside in a directory other than /lib or /usr/lib.
As with the -l option, the library name must follow the name of the source file that references it.
For example, the following command line links prog.f90 with the my_routines library, both
of which reside in the current working directory:
$ f90 prog.f90 my_routines
If your program calls routines in a library but the linker is unable to resolve the references, compile
with the -Wl,-v option. The f90 command passes -v to the linker, causing it to process in verbose
mode. The verbose information includes:
• The names of the libraries that the linker is searching. This information can confirm that the
linker is searching the correct libraries.
• The names of the object files selected by the linker to resolve the references. The linker may
have found the same name in another library and resolved the reference there.
Many library-related problems are caused by a misplaced -l option on the command line. The -L option
(discussed in “Library search rules” (page 54)) is also order-sensitive and can cause similar problems.
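Because a misplaced -l is easy to overlook, a small pre-check on the argument list can help. The following sketch is a hypothetical helper, not part of HP Fortran: it reports any -l option that appears before the first source or object file, which is the ordering problem described above. The set of file extensions it recognizes is a simplifying assumption.

```shell
#!/bin/sh
# check_l_order: hypothetical helper. Flags a -l option that appears before
# any source or object file on a proposed f90 command line.
check_l_order() {
    seen_file=no
    for arg in "$@"; do
        case $arg in
            -l*)
                if [ "$seen_file" = no ]; then
                    echo "misplaced: $arg precedes every input file"
                    return 1
                fi
                ;;
            *.f90|*.f|*.F|*.o|*.s)
                seen_file=yes
                ;;
        esac
    done
    echo "ok"
}

check_l_order prog.f90 -lm            # library follows the file that needs it
check_l_order -lm prog.f90 || true    # library named too early
```

The check is deliberately coarse: it does not verify that each library follows the specific file that references it, only that no library precedes all input files.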
Linking HP Fortran 90 routines
When calling HP Fortran 90 routines on the HP 9000 Series 800, you must include the appropriate
run-time libraries by adding certain arguments to the aCC command when linking your program.
These arguments depend on how the Fortran 90 routines were compiled:
32-bit PA-RISC 1.1 code
    -L/opt/fortran90/lib/ -lF90 -lisamstub
32-bit PA-RISC 2.0 code
    -L/opt/fortran90/lib/pa2.0/ -lF90 -lisamstub
64-bit PA-RISC 2.0 code
    -L/opt/fortran90/lib/pa20_64/ -lF90 -lisamstub
NOTE: For more information on Itanium library paths corresponding to the above examples, see
Table 23 (page 51).
When the linker finds a reference in your program to a name that is not defined in the program
(for example, the DOT_PRODUCT intrinsic), it looks to resolve it in the default libraries. If it cannot
find the name in the default libraries, the link will fail unless the command line specifies additional,
nondefault libraries. This section discusses how to link to nondefault libraries (including shared
libraries) and library search rules.
Additional HP Fortran libraries
HP Fortran provides the following two libraries you can link with Fortran programs:
• /opt/fortran90/lib/libU77.a on PA-RISC or /usr/lib/hpux[32|64]/libU77.a on
Itanium: The BSD 3f (libU77) library, which provides a Fortran interface to some of the
libc system routines. Programs that reference routines in this library must be compiled with
the +U77 option. For information about porting Fortran programs that reference libU77 routines,
see “Migrating to HP Fortran” (page 131).
• /opt/fortran90/lib/libblas.a: The Basic Linear Algebra Subroutine (BLAS) library,
which provides routines that perform common vector and matrix operations. Programs that
reference routines in this library must be compiled with the +lblas option. For more information,
see “Calling BLAS library routines” (page 105). (PA-RISC only)
Both the libU77 and BLAS libraries are described in the HP Fortran Programmer’s Reference.
Linking to shared libraries
Many HP Fortran libraries as well as HP-UX libraries exist in both shared and archive versions, as
indicated by the library extension name (.sl or .a, or .so on Itanium). For example, there are
both shared and archive versions of the HP Fortran runtime library, /usr/lib/libcl.sl and
/usr/lib/libcl.a.
The difference between a shared library and an archive library is that the linker does not actually
link the code in a shared library with your program. Instead, any references that your program
makes to entities in the shared library are resolved at load-time, when the library is loaded into
the executable program’s address space. By contrast, code in the archive library is copied to the
executable program file.
The advantages of linking shared libraries are:
• The executable is smaller than it would be if linked with an archive file because the executable
file is incomplete: it doesn’t include code from the library.
• Using shared libraries ensures that you always get the most recent version of the library. If
you link with an archive version, you get the version that was available at link-time. If, later
on, you want a more recent version of the library, you must re-link your program with that
library.
The disadvantage of linking with a shared library is that it creates a dependency between the
library and the program; both the shared library and the program must always be installed together
on the same system. By contrast, linking with an archive library makes the executable program
independent of the library to which it was linked. Also, programs that make frequent calls to library
routines may run more slowly when linked to shared libraries.
By default, the linker selects the shared version of a library, if one is available; otherwise, it selects
the archive version.
NOTE: For libF90, libU77, and libIO77 (Itanium® only), archive libraries are selected by
default (see the +sharedlibF90, +sharedlibU77, and +sharedlibIO77 options).
To force the linker to select archive libraries, specify the -Wl,-a,archive option on the
f90 command line. f90 passes the arguments to the -Wl option (-a and archive) to the linker.
This option must appear before the names of any libraries also specified on the command line.
The following command line compiles prog.f90 and links it with the archive versions of the default
libraries as well as with the math library (as specified by the -lm option):
$ f90 -Wl,-a,archive prog.f90 -lm
For information about the linker’s -a option, see the ld(1) man page. For more information about
shared libraries, see “Creating shared libraries” (page 59).
Library search rules
When you use the -l option to specify the name of a library, the linker searches for the library in
the directories specified by the LPATH environment variable. The f90 command sets this variable
so that the linker looks first in /opt/fortran90/lib, then in /usr/lib. You can specify another
directory to search by setting LPATH yourself; see “LPATH environment variable” (page 65).
Alternatively, you can use the -Ldirectory option to direct the linker to search directory before
it looks anywhere else to resolve references. For example, the following command line:
$ f90 -L/my_libs prog.f90 -lstuff
causes the linker to search for libraries (including libstuff.sl and libstuff.a), starting
with the directory /my_libs and then looking in /opt/fortran90/lib and /usr/lib.
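The search order described above can be mimicked with a short shell function, which is sometimes handy for checking which copy of a library a link would pick up. This is a sketch only: find_lib is a hypothetical helper, the temporary directories stand in for /my_libs and /usr/lib, and the preference for the shared (.sl) version within each directory is an assumption based on the default described in “Linking to shared libraries”.

```shell
#!/bin/sh
# find_lib: hypothetical helper that walks directories in the given order.
# Usage: find_lib NAME DIR...  (prints the first libNAME.sl or libNAME.a found)
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in sl a; do    # assumption: shared version preferred in each directory
            if [ -f "$dir/lib$name.$ext" ]; then
                echo "$dir/lib$name.$ext"
                return 0
            fi
        done
    done
    return 1
}

# Demonstration with throwaway directories standing in for /my_libs and /usr/lib.
tmp=$(mktemp -d)
mkdir "$tmp/my_libs" "$tmp/usr_lib"
: > "$tmp/my_libs/libstuff.a"
: > "$tmp/usr_lib/libstuff.sl"
found=$(find_lib stuff "$tmp/my_libs" "$tmp/usr_lib")
echo "${found#"$tmp"/}"
rm -rf "$tmp"
```

In the demonstration the archive in the first directory is reported, because the directory named first (the role played by -L) is searched before the later ones.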
Special-purpose compilations
The default behavior of the HP Fortran compiler has been designed to handle typical compilations.
Most applications should require no more than a few of the f90 options to compile successfully
(see Table on page 25 for a list of commonly used options).
However, the compiler can also meet the needs of more specialized compilations. This section
explains how to use the f90 command for the following purposes:
• To compile programs that contain Fortran modules.
• To compile programs that will execute on different PA-RISC machines.
• To create object files for shared libraries.
• To process source files that contain C preprocessor directives.
• To create demand-loadable programs.
• To create shareable executable programs.
• To compile 32-bit programs in 64-bit mode.
Compiling programs with modules
One of the features of standard Fortran is the module, a program unit that facilitates shared access
to data and procedures. Modules are fully described in the HP Fortran Programmer’s Reference.
A benefit to using modules is that they provide interface information to the compiler, allowing it to
catch mismatch errors between (for example) dummy arguments and actual arguments. When the
HP Fortran compiler processes a file that defines a module, it generates a .mod file with the interface
information. Later, when the compiler processes a file that uses the module, it reads the .mod file
and checks that module entities that are referenced in the using file correctly match the information
in the .mod file.
To make the .mod files available to the compiler, you must therefore compile the files that define
modules before the files that use modules. Likewise, if you make changes to a file that defines a
module, you must recompile that file as well as any files that use the module, in that order.
Also, if a module is defined and used in the same file, the definition must lexically precede
any USE statements that reference the module. This requirement allows the compiler to generate the
.mod file first, so that it can resolve the references in any USE statements.
This section discusses the following topics:
•How to compile a program that uses modules
•How to design makefiles to work with modules
•How to use the -I and +moddir options to manage .mod files
Examples
Example 5
Consider, for example, a program that consists of three files: main.f90, code.f90, and data.f90.
The main program unit is in main.f90, as follows.
Example 6 Example 2-2 main.f90
PROGRAM keep_stats
  ! stats_code contains module procedures for operating
  ! on statistical database
  USE stats_code
  INTEGER :: n
  ! print prompt, using nonadvancing I/O
  WRITE (*, FMT='(A)', ADVANCE='NO') 'Enter an integer '// &
       '(hint: 77 is current average): '
  READ *, n
  IF (n == 0) THEN
    PRINT *, 'But not that one.'
  ELSE
    CALL update_db(n)
    IF (n >= get_avg()) THEN ! get_avg is in stats_code
      PRINT *, 'Average or better.'
    ELSE
      PRINT *, 'Below average.'
    END IF
  END IF
END PROGRAM keep_stats
The first specification statement (USE) in the main program indicates that it uses the module
stats_code. This module is defined in code.f90, as follows:
Example 7 Example 2-3 code.f90
! stats_code: a (partial!) package of module procedures for
! performing statistical operations
MODULE stats_code
  ! shared data to be used by procedures declared below
  USE stats_db
CONTAINS ! module procedures
  ! update_db: updates shared variables in module stats_db
  SUBROUTINE update_db (new_item)
    INTEGER :: new_item
    n_items = n_items + 1
    item(n_items) = new_item
    sum = sum + new_item
  END SUBROUTINE update_db
  ! get_avg: returns arithmetic mean
  INTEGER FUNCTION get_avg ()
    get_avg = sum / n_items
  END FUNCTION get_avg
END MODULE stats_code
This program unit also begins with a USE statement, which identifies the module it uses as stats_db.
This module is defined in data.f90, as follows:
Example 8 Example 2-4 data.f90
! stats_db: shared data declared here
MODULE stats_db
  INTEGER, PARAMETER :: size = 100 ! max number of items in array
  ! n_items, sum, and item hold the data for statistical analysis
  INTEGER :: n_items, sum
  INTEGER, DIMENSION(size) :: item
  ! the initializations are just to start the program going
  DATA n_items, sum, item/3, 233, 97, 22, 114, 97*0/
END MODULE stats_db
The use of modules in this program creates dependencies between the files because a file that uses
a module that is defined in another file is dependent on that other file. These dependencies affect
the order in which the program files must be compiled. The dependencies in the example program
are:
•main.f90 is dependent upon code.f90.
•code.f90 is dependent upon data.f90.
These dependencies require that data.f90 be compiled before code.f90, and that code.f90 be
compiled before main.f90. This order ensures that the compiler will have created each of the
.mod files before it needs to read them.
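For larger programs, a valid compile order can be derived from the dependencies mechanically. The following is a minimal sketch and not an HP Fortran feature: it assumes that the standard UNIX tsort utility is available, lists each dependency as a "defining file, using file" pair, and prints (rather than runs) the f90 command line. The variable name order is illustrative.

```shell
#!/bin/sh
# Each pair is "prerequisite dependent": the file that defines a module,
# followed by the file that uses it.
order=$(printf '%s\n' 'data.f90 code.f90' 'code.f90 main.f90' | tsort | tr '\n' ' ')
order=${order% }   # drop the trailing blank left by tr

# Print (do not run) the resulting compile command.
echo "f90 -o do_stats $order"
```

For the example program, the files come out in the order data.f90, code.f90, main.f90. When more than one order satisfies the dependencies, tsort may choose any of them.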
The order of the source files listed in the following command line ensures that they will compile
and link successfully:
$ f90 -o do_stats data.f90 code.f90 main.f90
During compilation, f90 will create two .mod files, STATS_CODE.mod and STATS_DB.mod.
These will be written to the current working directory, along with the object files and the executable
program, do_stats. Following is a sample run of the executable program:
$ do_stats
Enter an integer (hint: 77 is current average): 77
Average or better.
If instead of the preceding command line, the program had been compiled as follows:
$ f90 -o do_stats main.f90 data.f90 code.f90
the compilation would fail and f90 would print the error message:
Error FCE37 : Module STATS_CODE not found
The compilation would fail because the compiler cannot process main.f90 without
STATS_CODE.mod. But the order in which the program files appear on the command line prevents
the compiler from processing code.f90 (and thereby creating STATS_CODE.mod) until after it
has processed main.f90.
Compiling with make
If you use the make utility to compile Fortran programs, the description file should take into account
the dependencies created by modules. For example, to compile the do_stats program using
the make utility, the description file should express the dependencies as follows:
Example 9 Example 2-5 makefile
# description file for building do_stats
do_stats : main.o code.o data.o
	f90 -o do_stats main.o code.o data.o
# main.o is dependent on main.f90 and code.f90
main.o : main.f90 code.o
	f90 -c main.f90
# code.o is dependent on code.f90 and data.f90
code.o : code.f90 data.o
	f90 -c code.f90
# data.o is dependent only on its source, data.f90
data.o : data.f90
	f90 -c data.f90
Note that the dependencies correspond to the order in which the source files are specified in the
following f90 command line:
$ f90 -o do_stats data.f90 code.f90 main.f90
Assuming that you name the description file makefile, the command line to compile the program
with make is:
$ make
Managing .mod files
By default, the compiler writes .mod files to the current working directory and looks there when it
has to read them. The +moddir=directory and -I directory options enable you to specify
different directories. The +moddir option causes the compiler to write .mod files in directory, and
the -I option causes the compiler to search directory for .mod files to read. (The space character
between -I and directory is optional.)
Using the example of the do_stats program, the following command line compiles (without
linking) data.f90 and writes a .mod file to the subdirectory mod_files:
$ f90 -c +moddir=mod_files data.f90
The command line:
$ f90 -c +moddir=mod_files -I mod_files code.f90
uses both the +moddir and -I options, as follows:
•The +moddir option causes f90 to write the .mod file for code.f90 in the subdirectory mod_files.
•The -I option causes f90 to look in the same subdirectory for the .mod file to read when compiling
code.f90.
The command line:
$ f90 -o do_stats -I mod_files main.f90 code.o data.o
causes f90 to compile main.f90, look for the .mod file in the subdirectory mod_files, and link all
of the object files into an executable program named do_stats.
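The separate compilation steps for do_stats can be collected into a script. The sketch below is a dry run: it only builds and prints the f90 command lines, so it can be tried on a system without the compiler. It assumes that the mod_files directory already exists and, as a simplification, passes both +moddir and -I to every module file; the variable names are illustrative.

```shell
#!/bin/sh
moddir=mod_files   # directory that holds the .mod files (assumed to exist)
objs=""
cmds=""

# Compile the module files without linking, in dependency order.
for src in data.f90 code.f90; do
  cmds="${cmds}f90 -c +moddir=$moddir -I $moddir $src
"
  objs="$objs ${src%.f90}.o"
done

# Compile the main program and link it with the module object files.
cmds="${cmds}f90 -o do_stats -I $moddir main.f90$objs"

# Print the command lines instead of executing them.
printf '%s\n' "$cmds"
```

To run the commands for real, the accumulated lines could be passed to the shell once the output has been checked.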
Compiling for different PA-RISC machines
When you compile an HP Fortran 90 program, the object code that the compiler generates by
default is based on the PA-RISC model of the machine that is running the compiler. If your program
will execute on a different PA-RISC model machine, the code may run less efficiently or (in the case
of PA2.0 code that attempts to run on a PA1.1 machine) may not run at all.
Also, some libraries (for example, the math library) are available in different PA-RISC versions. By
default, the compiler selects the version that is based on the PA-RISC model of the compiling
machine. If your program will execute on a different model machine, it may not be linked with the
appropriate libraries.
Compiling with the +DAmodel option ensures that the compiler generates code that is based on
the architecture specified by model and that the linker selects libraries that are compatible with
model. model must be one of the following:
•A PA-RISC version number: 1.1, 2.0, or 2.0W. Use +DA2.0W to compile in 64-bit mode;
see “Compiling in 64-bit mode” (page 63).
•A model number, for example, 750 or 870.
•A PA-RISC processor name, for example, PA7100 or PA8000.
•portable: code that is compatible across all models. Use +DAportable only if you want
to ensure that your program will run on different models.
Use the uname -m command to learn the model of your machine, as follows:
$ uname -m
9000/879
Alternatively, you can use the grep command to look up the model number in the file
/opt/langtools/lib/sched.models and find its architecture type, as follows:
$ grep 879 /opt/langtools/lib/sched.models
879 2.0 PA8000
You can also use the +DSmodel option to specify an architecture-specific instruction scheduler,
where model has the same meaning as it does for the +DA option. Like the +DA option, the +DS option
is unnecessary if the program will run on the same machine as you use to compile it. Also, if you
compile with +DAmodel, the compiler will select the scheduling algorithm based on the same
architecture, unless you use the +DS option to specify a different architecture.
NOTE: Code generated for PA1.1 systems will execute on PA2.0 systems, but the reverse is not
true: the loader will not allow PA2.0 code to run on a PA1.1 system.
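When one executable must run on several PA-RISC machines, the compatibility rule in the note suggests building for the lowest version among them. The following sketch illustrates that choice; the list of target versions is a hypothetical input, and the script prints the f90 command line rather than running it. Compiling with +DAportable is the alternative described earlier in this section.

```shell
#!/bin/sh
# Hypothetical list of the PA-RISC versions of the machines to support
# (32-bit versions only).
targets="2.0 1.1 2.0"

# PA1.1 code runs on PA2.0 systems but not the reverse, so build for
# the lowest version in the list.
lowest=$(printf '%s\n' $targets | LC_ALL=C sort -n | head -n 1)

# Print (do not run) the resulting compile command.
echo "f90 +DA$lowest prog.f90"
```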
Creating shared libraries
As mentioned in “Linking to shared libraries” (page 53), many of the HP-UX as well as HP Fortran
libraries are available in shared as well as archive versions. Linking with shared libraries can make
the executable program smaller and can ensure that it always has the most current version of the
library.
You can make shared versions of your own libraries, using the +pic command-line option and the
-b linker option. The following sections describe how to use these options and show an example
of how to create a shared library.
Compiling with +pic
The +pic option causes the compiler to generate Position-Independent Code (PIC) for use in a
shared library. PIC contains no absolute addresses and can therefore be placed anywhere in a
process’s address space without addresses having to be relocated. This characteristic of PIC makes
it shareable by multiple processes.
The syntax of the +pic option is:
+pic={short|long|no}
Although compiling with either +pic=short or +pic=long will generate PIC, in general you
should use the +pic=short option. If the linker issues an error message saying that the number
of referenced symbols in the shared library exceeds its limit, recompile with +pic=long, which
will cause the compiler to allocate space for a longer symbol table.
+pic=no is the default; it causes the compiler to generate absolute code, such as you
would want for executable programs.
The following command line creates three object files—x.o, y.o, and z.o; the code in each file
will be PIC:
$ f90 -c +pic=short x.f90 y.f90 z.f90
For more information about the +pic option, see the HP Fortran Programmer’s Reference.
Linking with -b
The -b option is a linker option. It causes the linker to bind PIC object files into a shared library,
instead of creating a normal executable file. The -b option must be used with the ld command;
you cannot use the f90 command to create a shared library. Also, the object files specified on
the ld command line must consist of PIC; that is, they must have been created with either
+pic=short or +pic=long.
The following command line links the object files x.o, y.o, and z.o into a shared library, named
my_lib.sl:
$ ld -b -o my_lib.sl x.o y.o z.o
Note that this ld command line is much simpler than the ld command line required to link an
executable file (for example, see “Linking with f90 vs. ld” (page 50)).
Examples
Example 10 Example 2-6 hi.f90
SUBROUTINE say_hi()
  PRINT *, 'Hi!'
END SUBROUTINE say_hi
Example 11 Example 2-7 bye.f90
SUBROUTINE say_bye()
  PRINT *, 'Bye!'
END SUBROUTINE say_bye
Example 12 Example 2-8 greet.f90
PROGRAM main
  CALL say_hi()
  CALL say_bye()
END PROGRAM main
The following command line creates the PIC object files (the -c option suppresses linking):
$ f90 -c +pic=short bye.f90 hi.f90
The next command line links the object files into the shared library:
$ ld -b -o my_lib.sl bye.o hi.o
The last command line compiles the source file greet.f90 and links the object code with the
shared library to produce the executable program a.out:
$ f90 greet.f90 my_lib.sl
The following is the output from a sample run of the executable program:
$ a.out
Hi!
Bye!
Using the C preprocessor
You can use the f90 command to pass source files to the C preprocessor (cpp) before they are
compiled. If the source files contain C preprocessor directives, cpp will act on the directives,
modifying the source text accordingly. The f90 driver will then pass the preprocessed source text
to the compiler. Adding cpp directives to program source files and having the cpp command
preprocess them is a convenient way to maintain multiple versions of a program—for example, a
debugging version and a production version—in one set of files.
cpp directives are similar to debugging lines, a feature of many Fortran implementations (see “Using
debugging lines” (page 80)). Like cpp directives, debugging lines enable the compiler to treat
source lines as either compilable statements or comments to be removed before compilation. But
debugging lines are nonstandard, available only in fixed-form source, and not nearly as powerful
as the cpp directives. Although cpp directives are not a standard feature of Fortran, cpp is a de
facto standard feature of UNIX systems.
This section discusses how to do the following:
•Invoke cpp from the f90 command line.
•Use the -D option to define cpp macros.
•Save the preprocessed output generated by cpp.
For more information about the cpp command and the directives it supports, see the cpp(1) man
page.
Processing cpp directives
By default, the f90 command passes source files ending in the .F extension to cpp. Compiling
with the +cpp=yes option enables you to override this default and cause the f90 driver to pass
all source files to cpp. If you do not compile with the +cpp=yes option and if the source file does
not have the .F extension, the compiler treats any cpp directives (but not any embedded Fortran
statements) as comments and ignores them. (As a language extension, HP Fortran allows comments
to begin with the # character, which is also the prefix character for all cpp directives.)
Consider the following program:
Example 13 Example 2-9 cpp_direct.f90
PROGRAM main
  REAL :: x
  WRITE (6, FMT='(A)', ADVANCE='NO') 'Enter a real number: '
  READ *, x
#ifdef DEBUG
  PRINT *, 'The value of x in main: ', x
#endif
  PRINT *, 'x =', double_it(x)
END PROGRAM main
REAL FUNCTION double_it(arg)
  REAL :: arg
#ifdef DEBUG
  PRINT *, 'The value of x in double_it: ', arg
#endif
  double_it = 2.0 * arg
END FUNCTION double_it
The program uses the #ifdef and #endif directives around PRINT statements. If the macro DEBUG
is defined, cpp will leave the PRINT statements in the source text that is passed to the compiler; if
it is not defined, cpp will remove the statements. You can define the macro in the source text, using
the #define directive; or you can define it on the command line, using the -D command-line
option. The advantage of the option is that it does not require editing the source file to define or
undefine a macro.
$ f90 +cpp=yes -D DEBUG cpp_direct.f90
Here is the output from a sample run of the executable program created by the preceding command
line:
$ a.out
Enter a real number: 3
The value of x in main: 3.0
The value of x in double_it: 3.0
x = 6.0
The next command line does not use the -D option, so that DEBUG is undefined, causing cpp to
remove the PRINT statements from the source text that is passed to the compiler:
$ f90 +cpp=yes cpp_direct.f90
Here is the output from the nondebugging version of the program:
$ a.out
Enter a real number: 3.3
x = 6.6
Saving the cpp output file
By default, the f90 command discards the source text as processed by cpp after compilation.
However, you can preserve this text by compiling with the +cpp_keep option. If the source file
has the .F or .f extension, the output from cpp is written to a file with the same name but with the
.i extension. If the source file extension is .f90, the output file has the .i90 extension.
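The naming rule can be expressed as a small helper, for example in a script that cleans up preprocessed files. This is only a sketch: cpp_keep_name is a hypothetical function, not part of HP Fortran, and it handles only the three extensions named in the preceding paragraph.

```shell
#!/bin/sh
# Print the name of the file that +cpp_keep would produce for a source file.
cpp_keep_name() {
  case "$1" in
    *.F|*.f) echo "${1%.*}.i" ;;     # .F or .f source gives a .i file
    *.f90)   echo "${1%.*}.i90" ;;   # .f90 source gives a .i90 file
    *)       return 1 ;;             # extension not covered by the rule
  esac
}

cpp_keep_name cpp_direct.f90
cpp_keep_name prog.F
```

The function returns a nonzero status for any other extension, so a calling script can skip files that the rule does not cover.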
Here is the previous command line to preprocess and compile cpp_direct.f90, with the addition
of the +cpp_keep option:
$ f90 +cpp_keep +cpp=yes cpp_direct.f90
After the PRINT statements have been removed, the resulting output file looks like this:
$ cat cpp_direct.i90
# 1 "cpp_direct.f90"
PROGRAM main
REAL :: x
WRITE (6, FMT='(A)', ADVANCE='NO') 'Enter a real number:'
READ *, x
PRINT *, 'x =', double_it(x)
END PROGRAM main
REAL FUNCTION double_it(arg)
REAL :: arg
double_it = 2.0 * arg
END FUNCTION double_it
Creating demand-loadable executables
By default, the loader loads the entire code for an executable program into virtual memory. For
very large programs, this can increase startup time. You can override this default by causing the
linker to mark your program demand load. A demand-loadable program is loaded into memory
a page at a time, as it is accessed.
Use the +demand_load option to make your program demand loadable, as follows:
$ f90 +demand_load prog.f90
The f90 command passes this option to the linker, which marks the executable program demand
load.
Demand loading allows a program to start up faster because page loading can be spread across
the execution of the program. The disadvantage of demand loading is that it can degrade
performance throughout execution.
Creating shared executables
By default, the linker marks an executable program as shared. A shared executable is shareable
by all processes that use the program. The first process to run the program loads its code into
virtual memory. If the program is already loaded by another process, then a process shares the
code with the other process.
You can override this default with the +noshared option, which causes the linker to mark the
executable as unshared, making the program’s code nonshareable. The following command line
causes the linker to mark the executable built from prog.f90 as unshared:
$ f90 +noshared prog.f90
In some circumstances, it may help to debug a program or to improve its runtime performance by
making it nonshareable. In general, however, it is not desirable because nonshareable executables
place greater demands on memory resources.
Compiling in 64-bit mode
Compiling HP Fortran programs with the +DA2.0W option (PA-RISC only; for Itanium, use +DD64)
causes f90 to produce 64-bit executable programs. You should consider compiling in 64-bit mode
if your program does any of the following:
•Accesses a large shared memory (greater than 1.75 gigabytes) or large data spaces (greater
than 1 gigabyte or, if using EXEC_MAGIC, greater than 1.9 gigabytes)
•Uses large data elements—greater than 32-bit words
•Provides objects or libraries that might be used in a 64-bit application
There are no HP Fortran language differences between 32-bit and 64-bit programs. Recompiling
should suffice to convert a 32-bit Fortran program to run as a 64-bit program.
However, the C language has some differences in data type sizes. If your Fortran program calls
functions written in C and is compiled in 64-bit mode, the size differences may require promoting
the data items that are passed to or from the C functions. See Table on page 181 and Table on
page 181 for the size differences between Fortran and C data types when compiled in 64-bit
mode.
NOTE: If your program does not need to run in 64-bit mode, there is no benefit to compiling it
in 64-bit mode. In fact, the executable program may run slower than if compiled in 32-bit mode.
Using environment variables
Environment variables are variables that are defined in the operating environment of the system
and are available to various system components. For example, when you run a program, the shell
looks at the PATH variable to determine where the program is located. Table 24 lists and briefly
describes the environment variables that control the way HP Fortran programs are compiled, linked,
and run.
Table 24 HP Fortran environment variables
Environment variable    Description
F90ROOT                 Specifies the alternate root for the Fortran compiler and its libraries.
ST_F90COM[64]           Specifies the path of the 32-bit or 64-bit f90 compiler, respectively, when
                        the compiler is in a non-standard path.
HP_F90OPTS              Specifies a list of command-line options that f90 inserts in the command
                        line that invokes the HP Fortran compiler.
LPATH                   Specifies a list of directories that the linker is to search for libraries.
MP_NUMBER_OF_THREADS    Specifies the desired number of processors to be used to run HP Fortran
                        programs that have been compiled for parallel execution.
TMPDIR                  Specifies a directory for temporary files; used in place of the default
                        directory /var/tmp.
TTYUNBUF                Controls tty buffering. To enable tty buffering, set TTYUNBUF to 0; to
                        disable tty buffering, set it to a nonzero value.
The following sections describe how to use the HP_F90OPTS, LPATH, and
MP_NUMBER_OF_THREADS environment variables. See the environ(5) man page for information
about system-level environment variables.
F90ROOT environment variable
The F90ROOT environment variable causes the f90 driver to invoke all subprocesses from an alternate
F90 root directory, rather than from their default directory. The default F90 root directory is
/opt/fortran90/. For example,
•Install the F90 compiler in an alternate root using the '-r' option in swinstall or copy /opt/fortran90/
•f90com32/f90com64 is invoked from the <F90ROOT>/opt/fortran90/lbin/ directory.
Only files in /opt/fortran90 are affected by F90ROOT environment variable. Regardless of the
HP Fortran driver you use, the compiler still uses the libraries, linker, and other files located in
/usr/lib and /usr/ccs.
ST_F90COM[64] environment variable
The ST_F90COM and ST_F90COM64 environment variables specify the path of the f90com32 and
f90com64 compilers, respectively, when they are installed in a non-standard location.
HP_F90OPTS environment variable
The HP_F90OPTS environment variable is read by the f90 driver for options to insert in the command
line. This variable is useful when you want the same options and arguments each time you invoke
the f90 command. For example, if HP_F90OPTS is set to the -v option, the following command
line:
$ f90 +list hello.f90
is equivalent to:
$ f90 -v +list hello.f90
The syntax of the HP_F90OPTS variable allows the bar (|) character to be used to specify that
options appearing before | are to be recognized before any options on the command line and
that options appearing after | are to be recognized after any options on the command line. For
example, the commands:
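As an illustration of the bar syntax, the following sketch splits a setting at the | character and shows where each part would fall relative to the options typed on the command line. The value -v|-lm is a hypothetical setting chosen for this example, and the script only imitates the ordering described in this section; it is not how the f90 driver is implemented.

```shell
#!/bin/sh
# Hypothetical setting: -v is to come before the command-line options,
# -lm after them.
HP_F90OPTS='-v|-lm'

case "$HP_F90OPTS" in
  *'|'*)
    before=${HP_F90OPTS%%'|'*}   # text before the first bar
    after=${HP_F90OPTS#*'|'}     # text after the first bar
    ;;
  *)
    before=$HP_F90OPTS           # no bar: everything is inserted first
    after=""
    ;;
esac

# Show how "f90 +list hello.f90" would be treated under this setting.
effective="f90${before:+ $before} +list hello.f90${after:+ $after}"
echo "$effective"
```

With this setting the script prints f90 -v +list hello.f90 -lm, that is, -v ahead of the typed options and -lm behind them.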
LPATH environment variable
The LPATH environment variable is read by the linker to determine where to look for libraries to
link with a program’s object file. Depending on whether LPATH is set or not, one of the following
actions occurs:
•If LPATH is already set, only the directories listed in LPATH are searched. This happens, for
example, when LPATH is set in a user’s .kshrc or .cshrc file, or after LPATH is defined from
the command line.
•If LPATH is not set, the f90 command sets default LPATH settings that are used when linking
the object files listed on the f90 command line.
Because the f90 command sets LPATH before calling the linker, it should not be necessary to set
this variable for most compilations. However, if you do need to set it (for example, you use the ld
command to link), the following directories should be the first items in LPATH:
•/opt/fortran90/lib
•/usr/lib
•/opt/langtools/lib
The following command lines set LPATH to include these directories, using (respectively) the
ksh and csh syntax:
$ export LPATH=/opt/fortran90/lib:/usr/lib:/opt/langtools/lib # ksh syntax
% setenv LPATH /opt/fortran90/lib:/usr/lib:/opt/langtools/lib # csh syntax
To see how f90 sets LPATH before calling the linker, compile with the -v option for verbose output.
For an example, see “Linking with f90 vs. ld” (page 50).
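If LPATH already has a value that you want to keep, the required directories can be placed in front of it. The following is a minimal sketch in ksh-compatible (POSIX shell) syntax; /my_libs stands in for a hypothetical existing setting, and required is an illustrative variable name.

```shell
#!/bin/sh
# Hypothetical existing value; normally this comes from your environment.
LPATH=/my_libs

# Directories that should be the first items in LPATH.
required=/opt/fortran90/lib:/usr/lib:/opt/langtools/lib

# Prepend them; the ":$LPATH" suffix is added only if LPATH is non-empty.
LPATH="${required}${LPATH:+:$LPATH}"
export LPATH
echo "$LPATH"
```

Because the suffix is conditional, the same line also works when LPATH is unset or empty, in which case the result is just the three required directories.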
MP_NUMBER_OF_THREADS environment variable
The MP_NUMBER_OF_THREADS environment variable sets the number of processors that are to
execute a program that has been compiled for parallel execution. If you do not set this variable,
it defaults to the number of processors on the executing machine.
The following command lines set MP_NUMBER_OF_THREADS to specify that programs compiled
for parallel execution can execute on two processors:
$ export MP_NUMBER_OF_THREADS=2 # ksh syntax
% setenv MP_NUMBER_OF_THREADS 2 # csh syntax
For information about parallel execution, see “Compiling for parallel execution” (page 100).
Floating installation
More than one version of the HP Fortran compiler can be installed on one system at the same time.
The floating installation feature allows you to install the compiler in any location. You can install
as many compiler versions as required, depending on your system’s resources.
By default, HP Fortran is installed under the /opt/fortran90 directory. In earlier releases, the
compiler driver (f90) looked for related files in subdirectories of the /opt/fortran90 directory. This
prevented installation of more than one version of HP Fortran on the same system at the same time.
Only files in /opt/fortran90 are affected by floating installation. Regardless of the HP Fortran
driver you use, the compiler still uses the libraries, linker, and other files located in /usr/lib and
/usr/ccs.
Floating installation is designed to help facilitate in-house development. You must not ship libraries
in non-standard places, because explicit runtime library specifications and linker options are
required.
Setting up floating installation
You may want to install the most recent compiler version and keep the prior version on one system.
If there are problems with the most recent version, you can easily switch to the prior one. Following
is an example of how to set up the floating installation feature for this purpose. Assume that your
system will have two versions of the compiler, both floating install enabled. In this case, 3.6.1 is
the prior version, and 3.6.2 or 3.7 is the more recent version.
To set up floating installation, complete the following steps:
1.Copy the prior version to another directory.
$ cp -rp /opt/fortran90 /opt/fortran90.3.6.1
2.Use swinstall to install the new version (3.6.2 or 3.7 in this case).
3.To use 3.6.1, invoke the driver by its full path:
$ /opt/fortran90.3.6.1/bin/f90
In this case, the compiler is picked up from /opt/fortran90.3.6.1/lbin/.
To use 3.6.2 or 3.7, whichever was installed most recently, invoke the default driver:
$ f90
or
$ /opt/fortran90/bin/f90
Floating Installation can also work with:
•Alternate path, for example,
<alternate-path>/opt/fortran90.3.6.1/
•Extended float path, that is, if the Fortran driver is invoked from <float-path>/bin/f90,
the compiler is picked up from the relative <float-path>/lbin/ path. As this is not a
recommended usage, the compiler uses Fortran and other libraries, linker and other files
located in default paths.
The precedence of various environment variables that control the Fortran driver to pick up the
Fortran compiler is in the following order:
•F90ROOT
•SDKROOT
•ST_F90COM[64]
NOTE: To use floating installation, do not set the following environment variables:
•F90ROOT
•SDKROOT
•ST_F90COM[64]
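A shell session or build script can guard against stray settings before choosing a compiler version. The sketch below clears the variables listed in the note and maps a version to a driver path, following the directory naming used in the setup steps; f90_driver is a hypothetical helper, and the script prints the paths rather than invoking the compiler.

```shell
#!/bin/sh
# Clear the variables that would otherwise take precedence over the
# driver's own search for the compiler.
unset F90ROOT SDKROOT ST_F90COM ST_F90COM64

# Hypothetical helper: map a version (or no argument) to a driver path.
f90_driver() {
  if [ -n "${1:-}" ]; then
    echo "/opt/fortran90.$1/bin/f90"   # copied prior version
  else
    echo "/opt/fortran90/bin/f90"      # default installation
  fi
}

f90_driver 3.6.1   # prior version
f90_driver         # most recently installed version
```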
3 Controlling data storage
This chapter describes the use of command-line options, directives, and other language features
to control data in HP Fortran programs. In particular, it discusses the following topics:
•Disabling implicit typing
•Automatic and static variables
•Increasing the precision of constants
•Sharing data among programs
•Modules vs. common blocks
NOTE: For information about how HP Fortran aligns data, see HP Fortran Programmer's Reference.
Disabling implicit typing
By default, HP Fortran uses implicit typing to determine the type of a variable or function
that has not been declared with a type declaration statement. That is, the type of an undeclared
entity is determined by the first letter of its name: if the letter is in the range I-N, the entity is of
type integer; otherwise, it is of type real.
Although implicit typing is mandated by the Standard, its use can become a source of runtime
bugs because implicit typing allows the inadvertent use of undeclared variables or functions. For
the sake of illustration, consider a program that calls a nonintrinsic library function named foo.
Assume that:
•The default typing rules are in effect.
•foo returns an integer.
•The programmer has not declared the return type of foo and has assigned its return value to
a variable of type real.
Experience has shown that this is not an unlikely scenario and that it can produce unexpected
results: because the name foo begins with the letter f, the compiler treats the function as returning
a real value, which does not match what the function actually returns.
The Standard provides the IMPLICIT NONE statement to override implicit typing. But the
IMPLICIT NONE statement is limited in scope to the program unit in which it appears. To force explicit typing
for all files specified on the command line, use the +implicit_none option. This option disables
implicit typing; that is, all variables, arrays, named constants, function subprograms, ENTRY names,
and statement functions (but not intrinsic functions) must be explicitly declared.
Using this option is equivalent to specifying IMPLICIT NONE for each program unit in each file
specified on the f90 command line. However, the +implicit_none option does not override any
IMPLICIT statements in the source file. The HP Fortran Programmer's Reference describes the
implicit typing rules, the IMPLICIT NONE statement, and the +implicit_none option.
Automatic and static variables
By default, HP Fortran allocates stack storage for program variables. Such variables are called
automatic variables because they are allocated at each invocation of the program unit in which
they are declared.
Static variables are allocated storage from static memory when the program is first loaded into
memory. They remain allocated for the life of the program.
HP Fortran allocates static storage for the following variables:
•Variables specified in a COMMON or EQUIVALENCE statement.
•Variables initialized in a type declaration statement or in a DATA statement.
•Variables specified in a SAVE or STATIC statement. A SAVE statement without a variable list
specifies static storage for all variables in the scoping unit.
•Variables in program files that have been compiled with the +save or
+Oinitcheck command-line option. See “Uninitialized variables” (page 144) for information
about using these options when porting.
Static variables have two characteristics that are of special interest:
•They are set to 0 or null value at load-time.
•They do not require re-initialization at each invocation of their program unit.
Static variables have several disadvantages. In Fortran programs that use recursion, static variables
can defeat one purpose of recursion—to provide a fresh set of local variables at each recursive
call. Also, the widespread use of static variables in a program can slow its performance: static
variables are ineligible for such fundamental optimizations as register allocation, and they can
limit the optimization of program units that use them.
The following example program illustrates the difference between automatic and static variables.
The program consists of a main program unit that calls a recursive internal subroutine. The subroutine
increments two variables (stat_val and auto_val), prints the updated variables, and then calls itself
recursively. Neither of the two variables is explicitly initialized, but stat_val is declared with the
SAVE attribute, which means that it is allocated static storage and is pre-initialized to 0 by the
compiler.
Example 3-1 recursive.f90
PROGRAM main
  ! This program calls a recursive internal subroutine.
  CALL recurse
CONTAINS
  ! This subroutine is invoked four times: once from the main
  ! program and three times recursively.
  ! Each time it is called, it adds 1 to the values in
  ! stat_val and auto_val and displays the result.
  ! stat_val has the SAVE attribute and therefore is
  ! pre-initialized and retains its value between calls.
  ! auto_val is an automatic variable and therefore has
  ! an unpredictable value (plus 1) at each call.
  RECURSIVE SUBROUTINE recurse
    INTEGER(KIND=1), SAVE :: stat_val
    INTEGER(KIND=1) :: auto_val
    stat_val = stat_val + 1
    auto_val = auto_val + 1
    PRINT *, 'stat_val = ', stat_val
    PRINT *, 'auto_val = ', auto_val
    IF (stat_val < 4) THEN
      CALL recurse()
    END IF
  END SUBROUTINE recurse
END PROGRAM main
Following are the command lines to compile and execute this program, along with sample output.
Notice that stat_val regularly increments at each call. The reason is that it is a static variable and
therefore retains its value between calls. But auto_val is not actually incremented; it is an automatic
variable and is given a fresh (and uninitialized) memory location at each call. In other words, the
subroutine adds 1 to whatever value happened to be in the memory location that was allocated
to auto_val at the start of the call:
NOTE: HP Fortran provides the AUTOMATIC and STATIC statements as porting extensions. The
STATIC statement is functionally the same as the SAVE statement, and the AUTOMATIC statement
may be used to declare a variable as automatic. However, such a declaration is generally pointless
because variables compiled under HP Fortran are automatic by default.
The HP Fortran Programmer's Reference provides detailed information about the AUTOMATIC,
SAVE, and STATIC statements.
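The following is a minimal sketch, not part of the original example set, of how the two storage classes can be used so that neither variable is read before it is defined. The names depth, call_count, and local_val are illustrative assumptions. The static variable is given an explicit initial value in its declaration, and the automatic variable is assigned at each invocation before it is used:

```fortran
PROGRAM main
  IMPLICIT NONE
  CALL recurse(1)
CONTAINS
  RECURSIVE SUBROUTINE recurse(depth)
    INTEGER, INTENT(IN) :: depth
    ! Static: initialized once, keeps its value across invocations.
    INTEGER, SAVE :: call_count = 0
    ! Automatic: each invocation gets its own copy, so it is
    ! assigned below before it is used.
    INTEGER :: local_val

    call_count = call_count + 1
    local_val = 10 * depth
    IF (depth < 4) CALL recurse(depth + 1)
    ! After the deeper calls return, local_val still holds this
    ! invocation's value, while call_count reflects all four calls.
    PRINT *, 'depth =', depth, ' local_val =', local_val, &
             ' call_count =', call_count
  END SUBROUTINE recurse
END PROGRAM main
```

Because the PRINT statement runs after the recursive call returns, each invocation reports its own local_val (10 times its depth), whereas every invocation reports the same final value of call_count.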
Increasing the precision of constants
By default, HP Fortran evaluates all floating-point constants as single-precision. For example, the
compiler treats the following constant
3.14159265358979323846
as though you had specified:
3.1415927
Although the loss of precision might be acceptable when assigning to single-precision variables,
it might be less acceptable when assigning to double-precision variables or when using
floating-point constants in expressions where the loss in precision might result in significant round-off
differences.
NOTE: HP Fortran provides two ways to override the default precision of individual constants:
the kind parameter and the exponent form. The kind parameter indicates the precision of
floating-point constants: 4 for single-precision, 8 for double-precision, and 16 for quad-precision.
In the following example, the kind parameter _8 specifies that the constant is to be evaluated as
double-precision:
3.14159265358979323846_8
To change the precision of all floating-point constants (except those having a kind parameter), you
can use the +real_constant option. This option takes two forms, +real_constant=double
and +real_constant=single, which specify (respectively) double-precision and single-precision
for floating-point constants in the files compiled with this option. The
+real_constant=single form is the default. Neither form of the option has any effect on
constants that have the kind parameter.
To promote all floating-point constants in the source files x.f, y.f, and z.f, compile with the
command line:
$ f90 +real_constant=double x.f y.f z.f
The +real_constant=single option specifies that all floating-point constants in a file are to be
treated as single-precision (the default). The following command line specifies single-precision for
all floating-point constants in the files a.f, b.f, and c.f:
$ f90 +real_constant=single a.f b.f c.f
Note that +real_constant=single does not demote constants that use either the kind parameter
or the exponent form (for example, 4.0D0).
For information about increasing the precision of variables, see “Increasing default data sizes”
(page 70). The HP Fortran Programmer’s Reference describes the syntax of the kind parameter
and the exponent form and the +real_constant option. For detailed information about how
floating-point arithmetic is implemented on HP 9000 computers and how floating-point behavior
affects the programmer, see the HP-UX Floating-Point Guide.
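The following sketch, which is an illustration rather than an example from the original guide, assigns the same digits to three double-precision variables: once as a default constant, once with the kind parameter, and once in the exponent form. The program and variable names are assumptions made for this sketch, and the kind value 8 follows the HP Fortran convention described above:

```fortran
PROGRAM constant_precision
  IMPLICIT NONE
  DOUBLE PRECISION :: from_default, from_kind, from_exponent

  ! No kind parameter or exponent: evaluated as single-precision
  ! (unless the file is compiled with +real_constant=double).
  from_default = 3.14159265358979323846
  ! Kind parameter _8: evaluated as double-precision.
  from_kind = 3.14159265358979323846_8
  ! Exponent form D0: also evaluated as double-precision.
  from_exponent = 3.14159265358979323846D0

  PRINT 10, 'default constant: ', from_default
  PRINT 10, 'kind parameter:   ', from_kind
  PRINT 10, 'exponent form:    ', from_exponent
10 FORMAT (A, F20.17)
END PROGRAM constant_precision
```

When compiled without any option, the first value should show the round-off introduced by single-precision evaluation, even though the variable that receives it is double-precision. The other two values should agree with each other.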
Increasing default data sizes
The +autodbl and +autodbl4 options enable you to increase the default sizes (that is, the number
of storage bytes) for both constants and variables of default numeric and logical types. Unlike the
+real_constant option, the +autodbl and +autodbl4 options affect both constants and
variables of both real and integer types. (For information about using the +real_constant option,
see “Increasing the precision of constants” (page 69).)
When compiled with the +autodbl or +autodbl4 option, constants are treated as though they
had twice the default number of bytes (4) available for evaluating them. The effect of these options
is to increase the range of default integers and the precision of default reals.
The +autodbl and +autodbl4 options have no effect on the size of entities declared with the
CHARACTER, BYTE, or DOUBLE COMPLEX statements, nor on entities that are explicitly sized. That
is, if a variable is declared with a kind parameter or if a constant has a kind parameter, it is
unchanged by +autodbl or +autodbl4.
NOTE: HP Fortran interprets the kind parameter as indicating the number of storage bytes to
allocate for a variable. When used with variables and constants of type real, the kind parameter
also indicates the precision: 4 for single-precision, 8 for double-precision, and 16 for quad-precision.
Promoting double-precision variables to quad-precision can have a severe impact on performance
because the instructions to perform quad-precision operations are implemented in software. If you
are concerned about performance and want to increase default data sizes, consider using the
+autodbl4 option, which does not promote variables declared with the DOUBLE PRECISION
statement. There is no other difference between +autodbl and +autodbl4.
These options affect all files on the command line. To increase the size or precision of selected
variables and constants, use the kind parameter.
Figure 2 “Increasing default data sizes” (page 70) shows the default data types whose sizes are changed
by the +autodbl and +autodbl4 options.
Figure 2 Increasing default data sizes
The following program illustrates the different effects of the +autodbl and +autodbl4 options. The
program assigns the same quad-precision constant to three variables:
•x, a default (that is, single-precision) real
•y, a real that is declared as quad-precision with the kind parameter
•z, a double-precision real that is declared with the DOUBLE PRECISION statement
Example 3-2 precision2.f90
PROGRAM main
  REAL x
  REAL(KIND=16) y
  DOUBLE PRECISION z
  ! Assign a quad-precision constant to a default real:
  x = 3.14159265358979323846_16
  PRINT 10, 'Stored in x: ', x
  ! Assign a quad-precision constant to a variable that
  ! has been explicitly sized for quad-precision:
  y = 3.14159265358979323846_16
  PRINT 10, 'Stored in y: ', y
  ! Assign a quad-precision constant to a variable
  ! declared with the DOUBLE PRECISION statement:
  z = 3.14159265358979323846_16
  PRINT 10, 'Stored in z: ', z
10 FORMAT (A, F22.20)
END PROGRAM main
Following are three different sets of command lines to compile and execute this program, including
sample output from each compilation. Note that variable y remains the same for each compilation:
the compiler does not promote variables that are sized with the kind parameter.
First, the program is compiled without any option:
$ f90 precision2.f90
$ a.out
Stored in x: 3.14159274101257320000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979310000
Next, the program is compiled with the +autodbl option. As shown in the output, x is promoted to
double-precision and z to quad-precision:
$ f90 +autodbl precision2.f90
$ a.out
Stored in x: 3.14159265358979310000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979323846
Finally, the program is compiled with the +autodbl4 option. As shown in the output, x is promoted,
but z is not:
$ f90 +autodbl4 precision2.f90
$ a.out
Stored in x: 3.14159265358979310000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979310000
Though useful for increasing the range and precision of numerical data, the +autodbl and +autodbl4
options are especially useful when porting; see “Large word size” (page 145). For detailed
information about these options, see the HP Fortran Programmer’s Reference. For detailed information
about how floating-point arithmetic is implemented on HP 9000 computers and how floating-point
behavior affects the programmer, see the HP-UX Floating-Point Guide.
Sharing data among programs
If you are designing an application that requires multiple threads of control that share the same
data, the design can take either of two forms:
•The program makes calls to the threads library:
/usr/lib/libpthread.sl
which creates multiple threads executing in a single process and therefore all sharing the same
address space.
•The application consists of several programs that run simultaneously in separate processes
and that access an HP-UX shared memory segment.
The first approach is beyond the scope of this manual and requires that you have an understanding
of how to call the threads library.1 The second approach is described here.
To share data among several HP Fortran programs that are executing simultaneously in separate
processes, use the $HP$ SHARED_COMMON directive. This directive enables you to create a common
block that is accessible by HP Fortran programs executing in different processes.
The $HP$ SHARED_COMMON directive causes the compiler to insert HP-UX system calls to perform
shared memory operations. To the programmer, the programs sharing the memory segment appear
as though they were program units in the same program, accessing a set of common block variables.
Following are two programs to illustrate how the $HP$ SHARED_COMMON directive works:
•The first program, go_to_sleep.f90, must execute first. Because it executes first, it creates
the shared memory segment and then enters a DO loop, where it waits until the second program
starts to execute. You can use the ipcs -m command to confirm that a shared memory segment
has been created.
•When the second program, wake_up.f90, starts to execute, it writes to the shared common
block variables, one of which causes go_to_sleep.f90 to break out of the DO loop and run
to completion.
The $HP$ SHARED_COMMON directive must appear at the beginning of the specification part of
the main program unit of each program sharing the memory segment. Also, the common block
specified for sharing must have the same layout in all files in which it is declared.
You can use the ipcs -m command both to determine that HP-UX has created a shared memory
segment and, after the programs complete execution, to confirm that it has been released.
•Specifying the +Oparallel option causes the compiler to transform eligible loops in an HP
Fortran program for parallel execution on HP 9000 systems. For information about compiling
for parallel execution, see “Compiling for parallel execution” (page 100).
The following two examples illustrate these concepts.
Example 3-3 go_to_sleep.f90
PROGRAM main
  ! This program, go_to_sleep.f90, and its companion, wake_up.f90,
  ! share data in a common block, using the $HP$ SHARED_COMMON
  ! directive. Execute this program first. After it starts to
  ! execute, use ipcs(1) to confirm that a shared memory segment
  ! has been created. In a separate process, run wake_up.f90.
  ! When it executes, it assigns to alarm, ending this program.
  LOGICAL :: alarm
  CHARACTER(LEN=8) :: message
  ! Declare a common block, shared_data, for sharing among
  ! multiple, simultaneously executing programs. Each program
  ! that shares the common block must reference it by the same
  ! key, 'scb1'.
!$HP$ SHARED_COMMON KEY='scb1' /shared_data/
  ! Declare a common block with two variables: alarm and message.
  ! When alarm is set by wake_up.f90, this program breaks out
  ! of the DO loop, prints message (which wake_up.f90 has
  ! written to), and exits.
  COMMON /shared_data/ alarm, message
  alarm = .FALSE.
  ! Wait for alarm to be set...
  DO WHILE (.NOT. alarm)
    ! sleep(1) is an HP-UX system call that suspends a process
    ! for the number of seconds specified by the argument.
    ! The %VAL function tells Fortran that sleep expects its
    ! argument to be passed by value.
    CALL sleep(%VAL(1))
  END DO
  ! Message from wake_up.f90:
  PRINT *, message
  ! The shared memory segment is destroyed when this program halts.
END
IMPORTANT: In the example above, you must use +U77 to access the correct sleep in the Fortran
library. If you use +U77, the line above:
CALL sleep (%VAL(1))
should instead read:
CALL sleep (1)
Example 3-4 wake_up.f90
PROGRAM main
  ! This program, wake_up.f90, should be run just after its
  ! companion, go_to_sleep.f90, starts to execute but in a
  ! separate process. The $HP$ SHARED_COMMON directive
  ! enables both programs to share the same memory.
  ! Directive puts the common block in shared memory.
!$HP$ SHARED_COMMON KEY='scb1' /shared_common/
  LOGICAL :: alarm
  CHARACTER(LEN=8) :: message
  ! Declare a named common block for shared memory. It must
  ! be laid out in exactly the same way in both programs.
  COMMON /shared_common/ alarm, message
  ! Write to message; go_to_sleep reads it.
  message = "I'm up!"
  ! Set alarm to wake up go_to_sleep.
  alarm = .TRUE.
  ! The shared memory segment will now be detached.
  ! However, because go_to_sleep is still running,
  ! the segment will still be present in memory until
  ! it stops executing, too.
END
Following are the command lines to compile each program:
Run the first program in any process by doing the following:
$ go_to_sleep
In another process, use the following command line to confirm that a shared memory segment has
been created for the program (the last in the list is the newly created one):
$ ipcs -m
IPC status from /dev/kmem as of Fri Mar 21 15:55:29 1997
T ID KEY MODE OWNER GROUP
Shared Memory:
m 0 0x4119c72b --rw-rw-rw- root root
m 1 0x4e180002 --rw-rw-rw- root root
m 2 0x41187bf4 --rw-rw-rw- root root
m 3 0x00000000 --rw------- root sys
m 7004 0x43186ea0 --rw-rw-rw- daemon daemon
m 6005 0x73636231 --rw-rw-rw- ed lang
Now run the second program in the second process:
$ wake_up
At this point, the program executing in the first process outputs the following and completes
execution:
I'm up!
The following command line confirms that the shared memory segment was released:
$ ipcs -m
IPC status from /dev/kmem as of Fri Mar 21 15:55:29 1997
T ID KEY MODE OWNER GROUP
Shared Memory:
m 0 0x4119c72b --rw-rw-rw- root root
m 1 0x4e180002 --rw-rw-rw- root root
m 2 0x41187bf4 --rw-rw-rw- root root
m 3 0x00000000 --rw------- root sys
m 7004 0x43186ea0 --rw-rw-rw- daemon daemon
For information about sharing data between Fortran program units and C functions within the same
program, see “Sharing data” (page 122). The HP Fortran Programmer's Reference provides detailed
information about the COMMON statement and about the $HP$ SHARED_COMMON directive. Refer to
the shmop(2) man page for information about HP-UX shared memory operations.
Modules vs. common blocks
The common block has been a mainstay of Fortran programs throughout the evolution of the
language, and it continues to be a part of Fortran. The common block provides a convenient means
to share data among program units, especially when the program units sharing data do not
otherwise communicate with each other. The common block can also be used to share data between
simultaneously executing Fortran programs (see “Sharing data among programs” (page 72)) and
between Fortran program units and C functions linked together in the same program (see “Sharing
data” (page 122)).
One of the problems with the common block, however, is that the programmer must replicate the
COMMON declaration in each of the sharing program units. If any of the common variables are out
of order or have a different type or size, the program units may not access the same data. The
compiler gives no indication of this discrepancy because it assumes that the programmer is giving
one program unit a different view of the shared storage—even when the discrepancy is owing to
oversight.
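The kind of oversight described above can be sketched as follows. This is an illustrative program, not one taken from the original guide; the names common_mismatch, show_dims, dims, nrows, and ncols are assumptions. The subroutine lists the two common variables in the opposite order from the main program, and because both layouts have the same size, the discrepancy is legal Fortran that a compiler is not required to diagnose:

```fortran
PROGRAM common_mismatch
  IMPLICIT NONE
  INTEGER :: nrows, ncols
  COMMON /dims/ nrows, ncols
  nrows = 3
  ncols = 7
  CALL show_dims
END PROGRAM common_mismatch

SUBROUTINE show_dims
  IMPLICIT NONE
  INTEGER :: nrows, ncols
  ! Oversight: the variables are listed in the opposite order,
  ! so each name refers to the other variable's storage.
  COMMON /dims/ ncols, nrows
  PRINT *, 'nrows =', nrows, ' ncols =', ncols
END SUBROUTINE show_dims
```

The subroutine reports nrows as 7 and ncols as 3, because the names are associated with storage by position in the common block and not by name.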
To deal with this problem, many implementations of FORTRAN 77 have provided the
INCLUDE extension. This extension enables the user to centralize common block definitions in one
file. At compile-time, the compiler reads the file into program units that have the INCLUDE line.
While this approach eliminates the problem of discrepant common blocks, it introduces another
problem: the INCLUDE facility is nonstandard FORTRAN 77, and its use is nonportable.
To deal with the portability issue, Standard Fortran defines the INCLUDE line. Unfortunately, the
definition in the Standard leaves many of the details up to the implementation, so that use of the
INCLUDE line in Fortran programs still runs the risk of nonportability.
Another problem with the common block—especially when used with equivalencing—is that it can
inhibit optimization. Common block variables are generally ineligible for register allocation, and
aliasing variables in common can prevent the optimization of the program units that use the aliased
variables.
The module program unit is the Fortran answer to the common block. The programmer declares
shareable variables in a module. Any program unit that wants to access them references the name
of the module in a USE statement. The concept of the module eliminates the need to re-declare the
common variables, without requiring the INCLUDE line.
In addition, the module provides the following controls on access to module data:
•The PUBLIC and PRIVATE statements declare which module variables are accessible outside
the module and which are not.
•The USE statement has an ONLY clause that specifies which module variables are accessible
to a particular program unit.
•The USE statement also has a renaming feature to resolve name clashes between local variables
and module variables.
Another feature of the module is that it can include procedures. This feature provides a way to
package data with the procedures needed to operate on the data. A program unit accesses module
procedures in the same way it does module data, with the USE statement. The interface of module
procedures is available to the compiler, which can perform compile-time checks on the actual
arguments that are passed to a module procedure.
Although the module does not completely replace the common block (see, for example, “Sharing
data among programs” (page 72)), it does provide a safer and more flexible alternative to the
more common uses—and abuses—of the common block.
For an example of a program that uses the module to share data, see “Compiling programs with
modules” (page 54). The HP Fortran Programmer's Reference provides detailed information about
the module program unit and the MODULE and USE statements.
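As a brief illustration of the access controls and module procedures discussed in this section, the following sketch packages two shared variables with the procedures that operate on them. It is not taken from the original guide, and the names grid_data, set_dims, cell_count, times_set, and rows are assumptions chosen for the sketch:

```fortran
MODULE grid_data
  IMPLICIT NONE
  PRIVATE                          ! hide everything by default
  PUBLIC :: nrows, ncols, set_dims, cell_count
  INTEGER :: nrows = 0, ncols = 0
  INTEGER :: times_set = 0         ! stays private to the module
CONTAINS
  SUBROUTINE set_dims(r, c)
    INTEGER, INTENT(IN) :: r, c
    nrows = r
    ncols = c
    times_set = times_set + 1
  END SUBROUTINE set_dims

  INTEGER FUNCTION cell_count()
    cell_count = nrows * ncols
  END FUNCTION cell_count
END MODULE grid_data

PROGRAM use_grid
  ! ONLY limits what is imported; "rows => nrows" gives the
  ! module variable nrows the local name rows.
  USE grid_data, ONLY: rows => nrows, ncols, set_dims, cell_count
  IMPLICIT NONE
  CALL set_dims(3, 7)
  PRINT *, 'rows =', rows, ' ncols =', ncols, ' cells =', cell_count()
END PROGRAM use_grid
```

Because the interface of set_dims is known to the compiler through the module, a call such as CALL set_dims(3.0, 7) would be rejected at compile time rather than failing at run time.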
4 Debugging
This chapter describes different HP Fortran features for debugging programs. These features include
compile-line options, compiler directives, and programming tools that are useful for locating errors
in your program. More specifically, this chapter discusses the following topics:
•Using the HP WDB debugger
•Stripping debugging information
•Handling runtime exceptions
•Using debugging lines
Using the HP WDB debugger
The HP WDB debugger is the primary tool for debugging HP Fortran programs. The debugger
provides such basic debugging functions as program control, process control, program and data
monitoring, and expression evaluation. The debugger has both a graphical interface and a
line-mode interface.
The debugger software includes different managers that enable it to handle different source
languages, target machines, object file formats, and user formats. The Fortran language manager
allows you to use Fortran syntax when entering expressions on the debugger command line.
Before beginning a debugging session, you must compile the program with the -g compile-line
option. If you compile and link separately, you must use the -g option on both command lines. The
option causes the compiler to generate additional information needed by the debugger and to
insert it into the output code.
After compiling your program with the -g option, invoke the debugger with the wdb command,
supplying the name of the executable as an argument. For example, the following command
compiles prog.f90 for debugging:
$ f90 -g prog.f90 -o db_prog
Here is the command to start debugging the executable program:
$ wdb db_prog
You can use the debugger to debug code that has been optimized at levels 0, 1, and 2. To debug
optimized code, compile the program with both the -g and +Oopt-level options, where opt-level
is 0, 1, or 2. The following command line compiles prog.f90 at optimization level 2 and prepares it for
debugging:
$ f90 +O2 -g prog.f90 -o db_prog
Compiling with the -g option increases the size of both the object file and the executable file. After
you have debugged your program and are ready to build the production version, you may want
to recompile without the -g option.
For complete information about the HP WDB debugger, see http://www.hp.com/go/wdb. Printed
and online documentation are available at this site.
Stripping debugging information
Programs compiled with HP Fortran include minimal debugging information in the executable
program. This information consists of a symbol table—a list of all the symbols in your program
and their offset addresses. The symbol table provides the information needed to produce a procedure
traceback. It is also used by the debugger and by the CXperf performance analysis tool.
However, the symbol table is not the same as the debugging information that is added to your
program when you compile with the -g option. The symbol table is added to an executable even
if the program is not compiled with the -g option.
If the size of the executable is critical to your application, you can use the +strip option to remove
symbol table information from the production version of your program. If you compile and link on
separate command lines, you must use the +strip option on both command lines. Instead of
recompiling with +strip, you can use the strip utility, which removes all debugging information,
including the symbol table.
If the size of your executable is not important, you may want to retain the symbol table in the
production version of your program. This table can be used by the debugger to provide minimal
debugging. If a program has not been compiled with -g and does not include a symbol table, it
is unusable by the debugger. Also, without the information provided by the symbol table, a
procedure traceback displays virtual addresses only.
The amount of code that the symbol table information adds to an executable is considerably
less than the amount that compiling with -g adds. For descriptions of the -g and +strip options,
see the HP Fortran Programmer’s Reference. For information about the strip utility, refer to the strip(1)
man page.
Handling runtime exceptions
Broadly defined, an exception is an error or fault condition that affects a program’s results.
Exceptions can range from the relatively benign inexact result condition that occurs in certain
floating-point operations to the more severe segmentation violation that occurs when a runaway
program attempts to access unallocated memory.
Exceptions that threaten the integrity of the operating system can cause HP-UX to raise an exception
signal (for example, SIGSEGV for a segmentation violation) so that the process can take appropriate
action to recover from the exception. Such exceptions may cause the program that took the exception
to abort, but not necessarily. By trapping an exception—that is, by catching the signal—a program
may handle the exception, if only by aborting when it occurs.
There is also a well-defined set of floating-point conditions that, although they pose no threat to
the operating system, can also cause an exception (for example, dividing a floating-point number
by zero). By default, traps for floating-point exceptions are disabled on HP 9000 computers, but
they can be enabled by the +fp_exception and +FP options. (You can also use the ON statement to
enable traps for floating-point exceptions.)
Programs that have been compiled with the +fp_exception option can trap the exceptions listed
in Table 25. Any of the exceptions listed in the second column will cause the operating system to
generate the signal listed in the first column. Programs compiled with +FP can trap specific
floating-point exceptions (SIGFPE).
Table 25 Signals recognized by +fp_exception
Signal     Exception
SIGBUS     Bus error instruction
SIGFPE     Floating-point exceptions
SIGILL     Illegal instruction
SIGSEGV    Segmentation violation or memory fault
SIGSYS     Bad argument to a kernel system call
When a program compiled with +fp_exception takes an exception, the following events occur:
•The program traps the exception.
•A procedure traceback is displayed on standard error. A procedure traceback lists routine
names and their offsets that are helpful in locating the code that triggered the exception.
•The program aborts.
The following sections discuss each of the exceptions listed in Table 25. For more information about
signals, refer to the signal(2) and signal(5) man pages.
NOTE: Standard Fortran 90 provides the IOSTAT= and ERR= specifiers for handling I/O runtime
errors. For information about these specifiers, refer to the descriptions of the I/O statements (for
example, OPEN and READ) in the HP Fortran Programmer’s Reference. For a descriptive list of the
error messages that can be returned by IOSTAT=, see the HP Fortran Programmer’s Reference.
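The following sketch shows one way the IOSTAT= specifier might be used to handle I/O errors without aborting. It is an illustration only: the program name, the unit number 10, and the file name input.dat are assumptions, and the specific nonzero values returned in ios are implementation-dependent:

```fortran
PROGRAM iostat_demo
  IMPLICIT NONE
  INTEGER :: ios, n

  ! STATUS='OLD' makes OPEN fail if the file does not exist. With
  ! IOSTAT= present, the failure is reported in ios instead of
  ! terminating the program.
  OPEN (UNIT=10, FILE='input.dat', STATUS='OLD', ACTION='READ', IOSTAT=ios)
  IF (ios /= 0) THEN
    PRINT *, 'Cannot open input.dat, IOSTAT =', ios
  ELSE
    READ (10, *, IOSTAT=ios) n
    IF (ios > 0) THEN
      PRINT *, 'Read error, IOSTAT =', ios
    ELSE IF (ios < 0) THEN
      PRINT *, 'End of file reached before a value was read'
    ELSE
      PRINT *, 'Read n =', n
    END IF
    CLOSE (10)
  END IF
END PROGRAM iostat_demo
```

A positive IOSTAT= value indicates an error condition, a negative value indicates an end-of-file or end-of-record condition, and zero indicates success.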
Bus error exception
A bus error exception occurs when a program references an inaccessible memory location, typically
because the reference is to an unaligned or nonexistent address, or because of a hardware failure.
The most likely cause of a bus error is an unaligned data reference. A program that passes an array
of (KIND=1) elements to a routine that attempts to access them as (KIND=4) elements may take
a bus error exception. Or if an array of (KIND=1) elements is declared in a common block and
the third element is passed to a routine that attempts to access it as a (KIND=4) variable, the
program will take a bus error exception. For information about the alignment of HP Fortran data
types, see the HP Fortran Programmer’s Reference.
Bus errors can occur (as can other exceptions) in any program that generates bad address
references. Although less likely to happen with programs that use the standard Fortran 90 pointer,
bad address references can happen when the Cray-style pointer extension is misused or when a
Fortran program unit passes a parameter by value to a C routine that attempts to use it as a pointer.
Floating-point exceptions
In accordance with the IEEE Posix Standard, traps for floating-point exceptions are disabled by
default on HP 9000 computers. Thus, if a program attempts the following operation:
x = 1.0/0.0
it will not trap it as an exception and will not abort. Instead, the value positive infinity (displayed
as +INF) will be assigned to x.
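Because no trap is taken by default, a program that needs to notice such a result must test for it explicitly. The following sketch shows one possible test; it is an illustration and not part of the original guide. It assumes that floating-point traps are disabled and that the compiler performs the division at run time. The program name and the variable zero are assumptions:

```fortran
PROGRAM fp_default
  IMPLICIT NONE
  REAL :: zero, x

  ! Use a variable divisor so that the division is not a
  ! constant expression in the source.
  zero = 0.0
  x = 1.0 / zero

  ! An infinity compares greater than HUGE(x), the largest
  ! finite value of x's kind.
  IF (x > HUGE(x)) THEN
    PRINT *, 'x is +INF; no trap was taken'
  ELSE
    PRINT *, 'x =', x
  END IF
END PROGRAM fp_default
```

If the same program is compiled with an option that enables the divide-by-zero trap, it is expected to abort at the division instead of reaching the test.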
HP Fortran provides two compile-line options, +FP and +fp_exception, which enable traps for
floating-point exceptions. The differences between the two options are:
•The +fp_exception option enables traps for the following IEEE floating-point exceptions:
❏ Invalid operation
❏ Division by zero
❏ Overflow
❏ Underflow
The +FP option also enables the trap for the inexact operation exception. For detailed descriptions
of these exceptions, see the HP-UX Floating-Point Guide.
•Unlike the +fp_exception option, the +FP option includes a flags argument by which you
can enable specific exceptions.
•The +FP option can also be used to enable fast underflow on systems that support it (chiefly
PA2.0 systems).
•Both options cause your program to abort when it traps the exception. However,
+fp_exception identifies the type of the exception that occurred and the virtual address of
the statement that triggered it. Also, +FP causes a core dump; +fp_exception does not.
You can also trap floating-point exceptions with the ON statement. Although the ON statement requires
you to modify source code, it enables you to write trap procedures so that your program can
recover from exceptions. For more information about using the ON statement, see Chapter 5, “Using
the ON statement” (page 81).
See the HP Fortran Programmer’s Reference for detailed information about the +FP and
+fp_exception options. Also, the HP-UX Floating-Point Guide has a useful discussion of both
options and includes detailed information on floating-point exceptions and how to handle them.
Illegal instruction exception
An illegal instruction exception occurs when a program attempts to execute a bit pattern that is
not an op-code. A common cause of this exception is an overwritten stack. If a program overwrites
the part of the stack that holds the return address, the new (and bad) address may cause execution
control to jump to a memory location that contains data or some other nonexecutable bit pattern.
The attempt to execute this location will result in an illegal instruction exception.
This exception can also occur if your program is linked to a bad library, especially if the library
contains code that was written in assembler or if it was corrupted during a file transfer.
This exception may indicate a compiler error. If you cannot find the cause of this exception in your
code, contact your HP support representative.
Segmentation violation exception
Before a program starts to execute, it is allocated a memory segment, which defines the area of
memory that it can use. If the program attempts to access a memory location outside its segment,
the operating system will raise the SIGSEGV signal, indicating a segmentation violation or memory
fault.
Any program that can generate address references outside its segment—for example, by indexing
beyond the declared boundary of an array—may cause a segmentation violation. In C programs,
bad pointers often result in this exception. The standard Fortran 90 pointer is more self-protective
than the C pointer, but it too can be misused and lead to the state of mind memorialized in the
lyric (known only to Cooper Redwine1): “I’ve got those segmentation violation, core dumped
blues.” The Cray-style pointer extension is more like the C pointer and is therefore more susceptible
to the abuse that results in segmentation violations.
Programs that cause a stack overflow (for example, by attempting to allocate more local variables
on the stack than the kernel can handle or by infinite recursion) can also cause a segmentation
violation. If your program needs a bigger stack, run the System Administration Manager (SAM)
and increase the maxssiz parameter. Also, see the HP-UX System Administration Tasks manual for
information about reconfiguring the kernel.
Segmentation violations are especially common when calling C functions from Fortran program
units. If the number, type, or calling conventions of the arguments being passed do not match, the
call is likely to result in an exception. For example, if you use the built-in function %VAL to declare
an argument as passed by value, but the C function is expecting a pointer, a segmentation violation
may occur. (%VAL and %REF are HP Fortran extensions; for information about using them when
calling a C routine from Fortran, see “Argument-passing conventions” (page 113).)
In most cases, debugging requires locating the code that caused the segmentation violation and
rewriting it. If your program aborts with this error, recompile it with the +fp_exception option.
A program compiled with this option will display a procedure traceback when it aborts. The
procedure traceback lists procedure names and offset addresses of the code that caused the
exception.
If you suspect that an out-of-bounds array reference is causing the segmentation violation, you can
use the +check=all option instead of the +fp_exception option. When compiled with the
+check=all option, a program that attempts to reference an array element that is outside the
declared array boundary will abort with an error message that gives the line number where the
reference was detected.
The +check=all option also performs runtime checks for out-of-bounds substrings and for integer
overflow; see “Calling a trap procedure” (page 84). The +check option is fully described in the
HP Fortran Programmer’s Reference.
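As a minimal sketch of the kind of defect that +check=all is meant to catch, consider the following program. The program and the file name oob.f90 are illustrative assumptions, not part of the original guide. The loop deliberately runs one iteration past the declared upper bound of the array.

```fortran
PROGRAM oob
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 10
  REAL :: a(n)
  INTEGER :: i
  ! Deliberate bug: the loop runs one iteration too far, so the
  ! last pass stores into a(n+1), outside the declared bounds.
  DO i = 1, n + 1
    a(i) = REAL(i)
  END DO
  PRINT *, a(n)
END PROGRAM oob
```

Compiled without checking, such a program may appear to work, silently corrupt adjacent memory, or abort with a segmentation violation, depending on what lies beyond the array. Compiled with f90 +check=all oob.f90, it should instead abort at the offending reference and report the line number.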
Bad argument exception
This exception occurs when a bad argument (for example, an out-of-range argument) is passed to
a kernel system routine. This exception can also occur in programs that make explicit calls to the
kernel threads library, /usr/lib/libpthread.sl, and pass bad arguments.
Using debugging lines
An HP Fortran program that has been written in fixed source form can contain debugging lines.
These are statements that begin with the letter D or d in column 1. When compiled with the
+dlines option, the debugging lines are treated as statements and compiled; otherwise, they are
treated as comments and ignored. A program that contains debugging lines must also be compiled
for fixed source form; that is, either the filename extension must be .f or the program must be
compiled with the +source=fixed option.
The +dlines option makes it possible to include WRITE statements as debugging lines in the source
file and to remove them from the production version of the program without having to change
source code. Instead of deleting the WRITE statements when you are ready to build the production
version, you recompile without the +dlines option, or with the +nodlines option.
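The following fixed-form fragment is a small sketch of how a debugging line might look in practice. The program and the file name dbg.f are hypothetical, chosen only for illustration. The WRITE statement with D in column 1 is compiled only when +dlines is in effect.

```fortran
      PROGRAM dbg
      INTEGER i, total
      total = 0
      DO 10 i = 1, 5
         total = total + i
C        The next line is a debugging line: D in column 1.
D        WRITE (*, *) 'i =', i, ' total =', total
   10 CONTINUE
      WRITE (*, *) 'total =', total
      END
```

Building with f90 +dlines dbg.f would print the running total on every iteration; building with f90 dbg.f treats the D line as a comment, so only the final total is printed.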
Although debugging lines are supported by many implementations of Fortran (especially
FORTRAN 77), they are nonstandard and therefore nonportable. The feature is further restricted
by being incompatible with free source form. If you try to compile a Fortran 90
program as free source form and the program contains debugging lines, the compilation will almost
certainly fail with syntax errors.
The C preprocessor (cpp) provides a set of directives that have the same functionality as debugging
lines but are much more powerful and can be used in either fixed or free source form. Although
the cpp directives are not part of standard Fortran 90, they are available on most UNIX systems,
such as HP-UX.
The cpp directives are described in the cpp(1) man page. See the HP Fortran Programmer’s Reference
for information about the source form of HP Fortran programs and the +dlines option.
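As a rough illustration of the cpp alternative, the sketch below guards a diagnostic PRINT statement with a conditional directive in a free-form source file. The macro name DEBUG and the program itself are assumptions made for this example. How the file is routed through cpp and how the macro is defined (commonly with a -D option) depends on the compiler, so consult cpp(1) and the HP Fortran Programmer’s Reference for the exact options.

```fortran
PROGRAM main
  IMPLICIT NONE
  INTEGER :: i, total
  total = 0
  DO i = 1, 5
    total = total + i
! The guarded statement is compiled only when DEBUG is defined
! at preprocessing time (DEBUG is a hypothetical macro name).
#ifdef DEBUG
    PRINT *, 'i =', i, ' total =', total
#endif
  END DO
  PRINT *, 'total =', total
END PROGRAM main
```

Unlike debugging lines, the guarded region can span several statements and can be selected by any macro name, which makes it possible to keep several independent sets of diagnostics in the same source file.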
5 Using the ON statement
Whenever a runtime error occurs, the default action of your program depends on the type of the
error. If the error results from a floating-point exception, the program will continue to execute.
Other errors will cause it to abort.
As described in “Handling runtime exceptions” (page 77), the +fp_exception and +FP options
provide control over how a program behaves when a runtime error occurs. The ON statement
provides an additional level of control by enabling your program to handle floating-point and
integer exceptions and Ctrl-C interrupts. Before an exception can be handled, the flow of
control must pass through an ON statement that specifies:
• The type of the exception
• One of the following actions:
  ❏ Execute a trap procedure
  ❏ Ignore the interrupt
  ❏ Abort the program
The action specified by the ON statement can only be changed by another ON statement that
specifies the same exception.
This chapter describes how to use the ON statement. The syntax of the ON statement is described
in the HP Fortran Programmer’s Reference. For detailed information about trapping math errors,
see the HP-UX Floating-Point Guide.
NOTE: If you include the ON statement in a program that you optimize at level 2 or higher and
the program takes an exception, the results may vary from those you would get from an unoptimized
program or from a program that didn’t have the ON statement.
Exceptions handled by the ON statement
Like the +fp_exception option, the ON statement enables traps for floating-point exceptions (by
default, traps for floating-point exceptions are disabled on HP 9000 computers). When traps are
enabled, an executing program that takes any of the following exceptions will abort, unless an
ON statement specifies a different action:
• Division by zero
• Overflow
• Underflow
• Invalid (or illegal) operation
These exceptions are defined by the IEEE standard for floating-point operations. The ON statement
enables traps for these exceptions, regardless of whether the exception is taken by user code or
by a call to a library routine. In addition, the ON statement enables traps for integer division
by zero, integer overflow, and Ctrl-C interrupts. A Ctrl-C interrupt occurs when the user
presses Ctrl-C during program execution.
Table 26 lists the exceptions handled by the ON statement and gives the keywords
that must be specified in the ON statement to indicate the exception being handled. The first column
indicates the type of exception. The second column gives the keywords that must appear in the
ON statement, immediately following the word ON. The third column gives alternate keywords you
can specify instead of those in the second column.
For example, the following ON statement will trap attempts to divide by zero with 8-byte
floating-point operands:
ON REAL(8) DIV 0 CALL div_zero_trap
The next example ON statement does the same as the first but uses the alternate keywords from the
third column of the table:
ON DOUBLE PRECISION DIV 0 CALL div_zero_trap
Table 26 Exceptions handled by the ON statement
Exceptions                    Exception keywords     Alternate keywords
Division by zero              REAL(4) DIV 0          REAL DIV 0
                              REAL(8) DIV 0          DOUBLE PRECISION DIV 0
                              REAL(16) DIV 0         (none)
                              INTEGER(2) DIV 0       INTEGER*2 DIV 0
                              INTEGER(4) DIV 0       INTEGER DIV 0
Overflow                      REAL(4) OVERFLOW       REAL OVERFLOW
                              REAL(8) OVERFLOW       DOUBLE PRECISION OVERFLOW
                              REAL(16) OVERFLOW      (none)
                              INTEGER(2) OVERFLOW    INTEGER*2 OVERFLOW
                              INTEGER(4) OVERFLOW    INTEGER OVERFLOW
Underflow                     REAL(4) UNDERFLOW      REAL UNDERFLOW
                              REAL(8) UNDERFLOW      DOUBLE PRECISION UNDERFLOW
                              REAL(16) UNDERFLOW     (none)
Invalid (illegal) operation   REAL(4) ILLEGAL        REAL ILLEGAL
                              REAL(8) ILLEGAL        DOUBLE PRECISION ILLEGAL
Ctrl-C interrupt              CONTROLC               (none)
Actions specified by ON
The action taken after an exception is trapped depends on the action specified by the most recently
executed ON statement for that exception. To specify an action, the ON statement must include the
keyword ABORT, IGNORE, or CALL. These keywords have the following meanings:
• If ABORT is specified, a standard error message is generated and the program is aborted.
• If IGNORE is specified, processing continues with the next instruction. If the exception is an
  integer division by zero, the result is set to zero. For other conditions, the previous content of
  the target register is supplied as the result. IGNORE is particularly useful for preventing Ctrl-C
  interrupts at inconvenient times during program execution.
• If CALL is specified, the normal (ABORT) error message is suppressed, and control is transferred
  to the specified trap procedure.
Zero or one parameter is passed to the trap procedure. If an argument is specified, it is the result
of the operation that took the exception. The procedure can analyze this value to get more precise
information, and it can assign another value to the parameter to recover from the error. The type
of the argument must be the same as that specified in the keywords. The specified trap procedure
is generally an external procedure. However, it is also possible to specify a dummy procedure
argument.
The following sections describe how to use the ON statement to specify different actions to take in
the event of an exception.
Terminating program execution
Use the ABORT form of the ON statement to terminate the program when an exception occurs. In
the following example, a number is divided by zero. The ABORT clause causes the program to abort
immediately after the exception is detected and to issue a procedure traceback:
Example 14 abort.f90
PROGRAM main
  REAL :: x, y, z
  ! The next statement enables traps for floating-point exceptions
  ! and specifies the action to take for divide by zero.
  ON REAL DIV 0 ABORT
  x = 10.0
  y = 0.0
  z = x / y
  PRINT *, z    ! not reached: the program aborts at the division
END PROGRAM main
Here is the command line and the output from a sample run:
$ f90 abort.f90
$ a.out
PROGRAM ABORTED : IEEE divide by zero
PROCEDURE TRACEBACK:
( 0) 0x0000248c _start + 0x6c [./a.out]
The program would have the same result if you were to comment out the ON statement and compile
with the +fp_exception option.
Ignoring errors
You can use the ON statement to ignore an exception by specifying the IGNORE keyword. The
following example program, ignore.f90, uses the ON statement to ignore an invalid operation.
Example 15 ignore.f90
PROGRAM main
  REAL :: x, y, z
  ! The following ON statement enables traps for floating-point
  ! exceptions and causes the program to ignore an invalid
  ! operation exception.
  ON REAL ILLEGAL IGNORE
  ! The next two statements pass a negative argument to the LOG
  ! intrinsic, resulting in an invalid operation. This
  ! exception is ignored, as specified by the ON statement.
  x = -10.0
  y = LOG(x)
  PRINT *, y
  ! The next three statements attempt to divide by zero. The
  ! trap for this exception is enabled by the previous
  ! ON statement but no action is specified. Therefore,
  ! the program will abort execution.
  x = 9.0
  y = 0
  z = x/y
  PRINT *, z
END PROGRAM main
As defined by the IEEE standard, a floating-point operation that results in a NaN is an exception
known as an invalid operation. The example program performs an invalid operation when it passes
a negative argument to the LOG intrinsic, causing the intrinsic to return a NaN. The following ON
statement:
ON REAL ILLEGAL IGNORE
causes the program to ignore this exception and continue execution.
The program also attempts to divide by zero. Although the ON statement enables the trap triggered
by a divide-by-zero exception, the statement has no other effect. As a result, the exception will
cause the program to abort. To ignore the divide-by-zero exception would require an additional
ON statement:
ON REAL DIV 0 IGNORE
Here is the command line to compile the program, followed by the output from a sample run:
$ f90 ignore.f90
$ a.out
NaN
PROGRAM ABORTED : IEEE divide by zero
PROCEDURE TRACEBACK:
(0) 0x00002504 _start + 0xbc [./a.out]
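For comparison, a variant of ignore.f90 that ignores both exceptions might look like the sketch below. It uses only the two ON statements shown in this section; the variant itself is not from the original guide and has not been run here, so no output is given.

```fortran
PROGRAM main
  REAL :: x, y, z
  ! Enable traps and ignore both exceptions (HP Fortran ON extension).
  ON REAL ILLEGAL IGNORE
  ON REAL DIV 0 IGNORE
  x = -10.0
  y = LOG(x)      ! invalid operation, ignored
  PRINT *, y
  x = 9.0
  y = 0.0
  z = x / y       ! divide by zero, ignored
  PRINT *, z
END PROGRAM main
```

With both statements in effect the program should run to completion instead of aborting. Because IGNORE supplies the previous content of the target register as the result of a floating-point exception, the value printed for z should not be relied on.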
Calling a trap procedure
You can write trap procedures that are callable by the ON statement to handle arithmetic errors
in user code and in library routines. Trap procedures can take zero or one argument. If an argument
is specified, it is the result of the operation that took the exception and must have the type
specified by the exception keyword. For example, if the following ON statement occurs in a program:
ON DOUBLE PRECISION OVERFLOW CALL trap
then the procedure trap could declare one argument of type DOUBLE PRECISION. Note that the
argument is optional. Also, depending on the exception, the contents of the argument may not
always be meaningful.
The following sections discuss two example programs that use the ON statement to call a trap
procedure for a floating-point exception and for an integer exception.
Trapping floating-point exceptions
The following program, call_fptrap.f90, causes an invalid operation exception and includes
an ON statement to handle the exception. The ON statement calls the trap procedure trap_illegal,
which assigns a different value to the result argument. The program prints the result. Here is the
program listing:
Example 16 call_fptrap.f90
PROGRAM main
  REAL :: x, y
  ON REAL ILLEGAL CALL trap_illegal
  x = -10.0
  y = LOG(x) ! causes an invalid operation
  PRINT *, y
END PROGRAM main
SUBROUTINE trap_illegal(res)
  ! res is the result value of the invalid operation
  ! trapped by the ON statement
  REAL :: res
  res = 99.87 ! assign another value to the result argument
END SUBROUTINE trap_illegal
Here is the command line, followed by the output from a sample run:
$ f90 call_fptrap.f90
$ a.out
99.87
Trapping integer overflow exceptions
This section discusses an example program that illustrates how to use the ON statement to call a
trap procedure for an integer overflow exception.
An integer overflow occurs when an operation on an integer variable results in the attempt to
assign it an out-of-range value. HP Fortran does not trap this exception by default. However, you
can use the ON statement in conjunction with the $HP$ CHECK_OVERFLOW directive to trap an
integer overflow. The following program, call_itrap.f90, illustrates how to do this:
Example 17 call_itrap.f90
PROGRAM main
  !$HP$ CHECK_OVERFLOW INTEGER ON
  INTEGER :: i
  ON INTEGER OVERFLOW CALL trap_oflow
  ! assign to i the biggest number it can hold
  i = 2147483647
  ! now add 1
  i = i + 1
  PRINT *, i
END PROGRAM main
SUBROUTINE trap_oflow(n)
  INTEGER :: n
  ! write error message to standard error
  WRITE (7, *) 'integer overflow occurred, assigning 0 to result'
  n = 0
END SUBROUTINE trap_oflow
If you were to comment out the ON statement but keep the directive, the program would abort with
a procedure traceback and a core dump. Compiling with the +check=all option would have the
same effect.
Trapping Ctrl-C interrupts
A Ctrl-C interrupt can occur during the following circumstances:
• When the user enters the interrupt code from the terminal while the program is running or
  awaiting input
• During the execution of a PAUSE statement
The trap procedure for a Ctrl-C interrupt must have no formal arguments. The interrupt code is
the character defined by the HP-UX stty(1) command for the intr parameter. The system default
for intr is Ctrl-C.
You can use the CONTROLC form of the ON statement to handle the interrupt signal (signal 2). In the
following example, when an interrupt occurs, the program reports status information on standard
output, assuring the user that the program is still at work in the DO loop. The program uses the
ON statement to set the action for a Ctrl-C interrupt to be the call to the trap handler status:
PROGRAM main
  COMMON i
  ON CONTROLC CALL status
  DO i = 1, 100000
    ... ! Long computation
  END DO
END
SUBROUTINE status
  COMMON i
  PRINT *, 'Currently on iteration ', i
END SUBROUTINE status
When this program is run, a Ctrl-C interrupt causes the status routine to be called, which prints
the iteration count. The program then resumes executing the DO loop.
Allowing core dumps
If a program includes the ON statement and takes an exception other than the one specified by the
exception keywords, the program will abort with a procedure traceback but without a core dump.
If you want to allow a core dump for one or more signals for a program that includes the
ON statement, you must revise the program for each such signal.
For example, you may wish to handle floating-point exceptions with the ON statement, but still allow
a core dump for other signals (for example, a bus error). The following example program uses
the SIGNAL routine in the libU77 library to reset the default behavior for a bus error signal. The
program uses the ON statement to handle floating-point exceptions, but allows a core dump when
a bus error occurs:
Example 18 allow_core.f90
PROGRAM main
  ON REAL OVERFLOW IGNORE
  CALL take_err
END PROGRAM main
SUBROUTINE take_err
  DOUBLE PRECISION :: d
  POINTER (ip, d) ! Cray-style pointer
  REAL :: x, y
  INTEGER, PARAMETER :: sigbus=10, sigdfl=0
  INTEGER :: sigrtn, SIGNAL
  ! Set the action for bus error to be the default (DUMP CORE),
  ! overriding the action of issuing a procedure traceback
  ! that is established by using the ON statement.
  ! To suppress the core dump and enable a procedure traceback,
  ! comment out the next statement
  sigrtn = SIGNAL(sigbus, 0, sigdfl)
  x = 1.0E38
  y = x * 10.0 ! causes a real overflow
  ! Bus error is caused by the next statements
  ip = MALLOC(40)
  ip = ip + 4 ! ip is now 4-byte aligned
  d = 99.0 ! bus error
END SUBROUTINE take_err
This program must be compiled with the +U77 option to link in the libU77 library. Here is the
command line and the output from a sample run:
$ f90 +U77 allow_core.f90
$ a.out
Bus error(coredump)
$ ls core
core
6 Performance and optimization
This chapter describes how to use different features of HP Fortran to tune your program for optimum
performance. The most important of these features is the optimizer. You invoke the optimizer when
compiling your program by specifying either +On (where n represents the level of optimization to
be applied to your program) or the -O option for the default level of optimization (level 2). The -O
option is provided for compatibility with the POSIX standard and has the same functionality as the
+O2 option. The following command line compiles prog.f90, using the default level of optimization:
$ f90 -O prog.f90
For most applications, -O provides effective optimization. However, some applications can realize
significant increases in performance at higher levels of optimization or when you use other features
of the optimizer to boost performance. This chapter discusses these features as well as the following
topics:
• Using profilers
• Using options to control optimization
• Conservative vs. aggressive optimization
• Parallelizing HP Fortran programs
• Vectorization
• Controlling code generation for performance
For information about getting the best performance from floating-point intensive applications running
on HP-UX, see the HP-UX Floating-Point Guide.
Using profilers
A profiler is a tool for sampling a program during execution so that you can determine where your
program spends most of its time. After examining the data provided by a profiler, you can decide
whether to redesign parts of the program to improve their performance or to re-compile the program
with optimization options. For example, if your program contains a loop with an embedded call
and profiling reveals that the program spends much of its time in the loop, you may decide to
inline the embedded call.
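The situation described above can be sketched as follows. The program, the function name half, and the loop count are illustrative assumptions rather than code from this guide. A profiler run on such a program would be expected to attribute most of the execution time to the loop and to the small function it calls on every iteration.

```fortran
PROGRAM hot_loop
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000000
  INTEGER :: i
  REAL :: total
  REAL, EXTERNAL :: half
  total = 0.0
  ! Nearly all run time is spent here; each iteration pays the
  ! overhead of a call to the small function half.
  DO i = 1, n
    total = total + half(REAL(i))
  END DO
  PRINT *, total
END PROGRAM hot_loop

REAL FUNCTION half(v)
  IMPLICIT NONE
  REAL, INTENT(IN) :: v
  half = 0.5 * v
END FUNCTION half
```

Because the function body is a single expression, replacing the call with the expression, or letting the optimizer inline it, removes the call overhead from the loop. Whether the compiler actually inlines a given call depends on the optimization level and options in effect.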
The following sections describe the HP Caliper performance analysis tool, which is bundled with
HP Fortran, as well as the two UNIX profilers, gprof and prof.
NOTE: As described in “Stripping debugging information” (page 76), all programs compiled
by HP Fortran include symbol table information in the executable file, unless you compile with the
+strip option or have removed the symbol table with the strip utility. This information must be
present in the executable in order to use the profiling tools.
HP Caliper
HP Caliper works with your compiler to perform Profile Based Optimization (PBO).
The compiler option +Oprofile=collect uses HP Caliper in the background to collect profile
information for your entire application. This is equivalent to the +I option on HP PA-RISC systems.
Invoking HP Caliper with the compiler to optimize your program
The simplest way to use HP Caliper is to use the PBO mode to create a flow-data file that the
compiler can use to further optimize your program.
Step 1. Compile your program with the +Oprofile=collect option.
        /opt/ansic/bin/cc -Aa +O3 -o program +Oprofile=collect program.c
        The loader sets a run-time flag on the program to automatically invoke HP Caliper.
        HP Caliper supports programs compiled with any compiler option. When you
        use the +Oprofile=collect option, however, the compiler forces
        +O1 optimization to fully measure the program performance.
Step 2. Run your program to generate the flow-data file.
        program [program_arguments]
        HP Caliper automatically runs to generate the flow-data information in a file
        named flow.data in the current directory.
        You can repeat this step multiple times with different program_arguments to create
        aggregated profile information in the flow.data file to improve your program’s
        optimization.
        If the flow.data file already exists in the current directory when you run your
        program, then HP Caliper merges the results into the file. If you run your program
        multiple times in different directories, then HP Caliper creates a separate flow.data
        file in each directory. You can combine the files using the fdm(1) utility program,
        which is bundled with the HP C, HP aC++, and Fortran 90 compilers.
        When collecting PBO data, the more your program use scenarios resemble the
        ways in which your program will actually be used, the more the compiler can
        optimize your program specifically for how it is actually used.
        When you make changes to your source files, you should delete the flow.data
        file before collecting more PBO data on your program.
Step 3. Use the flow.data file as input to the compiler when you specify the
        +Oprofile=use option on subsequent re-compiles.
        /opt/ansic/bin/cc -Aa +O3 -o program +Oprofile=use program.c
        The compiler uses the HP Caliper information to help optimize your program.
        Note that the benefit of profile-based optimization is application dependent. Some
        programs may not improve while others may improve significantly.
Comparing Program Performance
You can use HP Caliper’s other measurement features to explicitly see the results of using PBO.
An example process for comparing performance would be:
1. Compile your program with +Oprofile=collect to generate the executable to optimize.
2. Run your program to generate the profile data file. Use as many use scenarios as possible to
   collect representative profile data.
3. Compile your program with +O3 to generate the baseline executable to compare against the
   fully optimized executable.
4. Run HP Caliper on your +O3 executable with a configuration file such as total_cpu to measure
   performance.
5. Recompile your program with +O3 optimization and +Oprofile=use to use the profile data
   collected in Step 2 as input.
6. Run HP Caliper on your fully optimized executable from Step 5 with the same configuration
   file from Step 4 and compare the report results with the report from Step 4.
Using Options to Control Data Collection
You can use various options to specify how to collect profile information about your program.
The options support tasks such as:
• Collecting information on specific load modules
• Improving HP Caliper performance by excluding load modules you don’t want to profile
• Collecting information on specific processes
You can specify HP Caliper data collection options by using the CALIPER_OPTS environment
variable. In your Makefile or build scripts you can set CALIPER_OPTS to the options you want to
use. HP Caliper uses those options when it is started by the compiler.
Here are some examples of ways you can control how HP Caliper collects optimization information.
Specifying PBO file names and locations
HP Caliper creates two files when doing profile-based optimization:
flow.data    The file name used to store profile data.
flow.log     The file name used to log diagnostic messages.
You can use the FLOW_DATA and FLOW_LOG environment variables to specify the file names.
If the file you specify with the FLOW_DATA variable already exists when you run your program,
then HP Caliper merges the results of the current run into the file.
CAUTION: You cannot combine profile information from different programs into a
single flow.data file. Combining profile information from multiple programs corrupts the
flow.data file. The file is designed to contain data from multiple runs of the same program with
different input to help improve optimization results.
gprof
The gprof profiler enables you to determine which subprograms are called the most and how
much time is spent in each subprogram. To use gprof, do the following:
1. Compile the program with the +gprof option. For example:
   $ f90 -o prog +gprof prog.f90
2. Run the program. This creates the file gmon.out in the current directory. For example:
   $ prog
   $ ls gmon.out
   gmon.out
3. Run gprof, specifying the name of the program as an argument. It will display two tables to
   standard output: a flat profile and a call graph profile. Since these tables can be quite large,
   you may want to redirect the output from gprof, as follows:
   $ gprof prog >gprof.out
   The flat profile lists the number of times each subprogram was called and the percentage
   of the total execution time spent in each subprogram. The call graph profile
   includes such information as the index of the function in the call graph listing, the percentage
   of total time of the program accounted for by a routine and its descendants, and the number
   of seconds spent in the routine itself.
4. Once gprof is finished, you can view the output tables using an ASCII editor.
For more information about gprof, see the gprof(1) man page.
prof
The prof profiler can also be used for profiling. Unlike the gprof profiler, prof does not generate
the call graph profile. To use prof, do the following:
1. Compile the program with the +prof option. For example:
   $ f90 -o prog +prof prog.f90
2. Run the program. This creates a file named mon.out in the current directory. For example:
   $ prog
   $ ls mon.out
   mon.out
3. Run prof, giving the name of the program as an argument, as follows:
   $ prof prog
   prof produces a listing on standard output showing the time spent in each routine.
For more information about prof, see the prof(1) man page.
Using options to control optimization
HP Fortran includes a rich set of command-line options for controlling optimization. For most
applications, we recommend optimizing with -O, which enables the default level of optimization.
(For information about the default level of optimization, refer to Table 27; look up +O2
in the first column.) You can raise or lower the level of optimization with the +Oopt-level option,
and you can use the +Ooptimization option to control the kinds of optimizations that are available
at each level.
The following sections describe how to use the +Oopt-level and +Ooptimization options. For
detailed descriptions of the optimization options, see the HP Fortran Programmer’s Reference.
Using +O to set optimization levels
HP Fortran provides four levels of optimization. Each higher level is a superset of the lower levels;
level 4 is the highest level and can result in a significant increase in program performance. Level
2 is the default level of optimization.
You invoke optimization by compiling with the +Oopt-level option, where opt-level is an
integer in the range 0 - 4. The following command line invokes the optimizer at the highest level:
$ f90 +O4 file.f90
You can invoke level 2 (the default level) by specifying the -O option.
Table 27 summarizes each level, giving the option that invokes that level, the advantages,
disadvantages, and recommended usages. For technical information about the specific optimizations
at each level, see the HP PA-RISC Compiler Optimization Technology White Paper. A PostScript
version of this document is available online in /opt/langtools/newconfig/white_papers/optimize.ps.
NOTE: You can debug programs optimized up to level 2. To prepare an optimized program for
debugging, use the command line:
$ f90 -g +Oopt-level prog.f90
Table 27 Optimization levels
Option: +O0 (default)
  Optimizations performed: Constant folding and partial evaluation of test conditions.
  Advantages: Compiles fastest; compatible with the debugger option -g.
  Disadvantages: Does very little optimization.
  Recommended use: During program development.
Option: +O1
  Optimizations performed: Level 0 optimizations, plus branch optimization, dead code
    elimination, more efficient use of registers, instruction scheduling, and peephole
    optimization.
  Advantages: Produces faster programs than level 0; compiles faster than level 2; compatible
    with the debugger option -g.
  Disadvantages: Compiles slower than level 0.
  Recommended use: During program development.
Option: +O2, -O
  Optimizations performed: Default level optimizations, including level 1, plus coloring register
    allocation, induction variable elimination and strength reduction, common subexpression
    elimination, loop invariant code motion, store/copy optimization, unused definition
    elimination, software pipelining, and register reassociation.
  Advantages: Can significantly increase performance over level 1; works with debugger
    option -g.
  Disadvantages: Compiles slower than level 0 and 1.
  Recommended use: During program development and when building the production version;
    especially effective in optimizing loops that perform arithmetic operations on large float
    and double arrays.
Option: +O3
  Optimizations performed: Level 2 optimizations, plus loop transforms, parallelization,
    vectorization, cloning, and inlining within a file. Some optimizations may require additional
    options; see “Using the optimization options” (page 91).
  Advantages: Can significantly increase performance over level 2.
  Disadvantages: Compiles slower than lower levels; increases object code size; not compatible
    with the debugger option -g.
  Recommended use: When building the production version; especially effective when used on
    source files containing frequently executed loops and subprograms.
Option: +O4
  Optimizations performed: Level 3 optimizations applied across all program files compiled
    with +O4.
  Advantages: Provides the highest level of optimization; can significantly increase performance
    over level 3.
  Disadvantages: Can use large amounts of system resources; may increase link-time and object
    code size; not compatible with the debugger option -g.
  Recommended use: When building the production version; especially effective when used on
    source files containing frequently executed loops and subprograms.
Using the optimization options
The +Ooptimization options enable you to control the kind of optimizations that are applied to
your program at each level. Table 28 and Table 29 list the options. The
first column of each table lists each option, the second column gives the optimization level at which
the option can be used, and the third column identifies what the option does. When using any of
these options except +Oall, you must also use the +On option to specify the optimization level
listed in the second column of the tables. The +Oall option automatically invokes the optimizer at
the highest level.
Table 28 lists the “packaged” options. These options enable or disable a set of related optimizations,
such as optimizations that do not increase code size. Table 29 lists options that enable or disable
specific optimizations.
The options in both tables can be combined on the same command line, except as noted. For
example, the following command line requests aggressive optimizations at level 2 that do not
increase code size:
$ f90 +O2 +Oaggressive +Osize prog.f90
Nearly all of the optimization options can be used to enable or disable an optimization or a
package of optimizations. For example, the following command line requests aggressive level 4
optimizations that do not result in roundoff errors:
$ f90 +O4 +Oaggressive +Ofltacc prog.f90
The f90(1) man page fully describes all of the optimization options.
Table 28 Packaged optimization options
Option: +O[no]aggressive
  Level: +O2 or higher
  Function: Enable [disable] optimizations that can significantly improve performance in
    standard-conforming programs. The default is +Onoaggressive. For more information about
    this option, see “Conservative vs. aggressive optimization” (page 99).
Option: +O[no]all
  Level: Invokes highest level
  Function: Enable [disable] maximum optimization. The default is +Onoall.
Option: +O[no]autopar
  Level: +O3 or higher
  Function: Do [do not] automatically parallelize loops that are deemed safe and profitable by
    the loop parallelizer. The default is +Onoautopar. The +Oautopar option can be used for
    programs at optimization levels +O3 and above.
    If both +Oopenmp and +Oautopar are specified on the command line, the OpenMP directives
    gain precedence over the +Oautopar option. The compiler auto-parallelizes only the loops
    of code that are not controlled by the OpenMP directives.
    To compile a program with +Oautopar support, the libcps, libomp, and libpthreads runtime
    support libraries must be present at compile-time and runtime. This feature is available on
    Integrity systems and HP 9000 systems.
Option: +O[no]conservative
  Level: +O2 or higher
  Function: Suppress [do not suppress] optimizations that assume strict conformity to the
    Fortran 90 standard. The default is +Onoconservative. For more information about this
    option, see “Conservative vs. aggressive optimization” (page 99).
    NOTE: This option is valid only on the PA-RISC systems.
Option: +O[no]limit
  Level: +O2 or higher
  Function: Enable [disable] optimizations that do not make large demands on system resources.
    The default is +Onolimit.
Option: +O[no]size
  Level: +O2 or higher
  Function: Enable [disable] optimizations that do not significantly increase code size. The
    default is +Onosize.
Table 29 Fine-tuning optimization options
Option: +O[no]cache_pad_common
  Level: +O3 or higher

Option: +O[no]dataprefetch
Function: Insert [do not insert] instructions within innermost loops to explicitly prefetch data
from memory into the data cache. The default is +Onodataprefetch.

Function: Enable [disable] dynamic optimization for the output file, if the run-time environment
supports this feature. Both forms of this option change the default setting, which allows the
run-time environment to enable or disable dynamic optimization according to a system-wide default.
This option applies only to executable files and shared libraries. chatr(1) can be used to change
this setting, including restoration of the default setting, after the output file has been created.

Option: +O[no]entrysched
Function: Perform [do not perform] instruction scheduling on entry and exit code. The default is
+Onoentrysched.
NOTE: This option is valid only on PA-RISC systems.

Option: +O[no]fastaccess
Level: All
Function: Enable [disable] fast access to global data. The default is +Onofastaccess at levels 1,
2, and 3; +Ofastaccess at level 4.

Option: +Ofast
Level: All
Function: Select a combination of optimization options for optimum execution speed and reasonable
build times. Currently chosen options are: [...]
Do not use this option for programs that depend on IEEE standard floating point denormalized
numbers. Otherwise, different numerical results may occur. See +Ofltacc=relaxed. This option is a
synonym for -fast.

Option: +Ofaster
Level: All
Function: This option is equivalent to +Ofast with an increased optimization level. The definition
of +Ofaster may change or it may be deprecated in future releases.

Option: +O[no]fltacc
Level: +O2 or higher
Function: Disable [enable] floating-point optimizations that can result in numerical differences.
By default, the optimizer does not perform such optimizations. For information about the effect
this option can have on your program, see the HP-UX Floating-Point Guide.

Option: +Ofrequently_called=function1[,function2...]
Level: All
Function: The named functions are assumed to be frequently called. This option overrides any
information in a profile database.

Option: +Ofrequently_called:filename
Level: All
Function: The file indicated by filename contains a list of functions, separated by spaces or
newlines. These functions are assumed to be frequently called. This option overrides any
information in a profile database.

Option: +O[no]info
Level: All
Function: Display [do not display] information about the optimization process. This option is most
useful at level 3 and above. The default is +Onoinfo.

Option: +O[no]initcheck
Level: +O2 or higher
Function: Enable [disable] initialization of any local, scalar, automatic variable that is found to
be uninitialized. The default is to initialize if the variable is uninitialized with respect to
every path leading to its use. For more information about this option, see [...]

Option: +O[no]inline
Function: Enable [disable] inlining. The default is +Oinline.

Function: This option controls inlining in Fortran. The format for num is N[.n], where num is
either an integral value from 0 to 9 or a value with a single decimal place from 0.0 to 9.0. For
more information on this option, see the f90(1) manpage.

Option: +O[no]libcalls
Function: Substitute [do not substitute] millicode versions of specific intrinsics. The default is
+Olibcalls.

Function: Loop blocking is a combination of strip mining and interchange that improves data cache
locality. It is provided primarily to deal with nested loops that manipulate arrays that are too
large to fit into the data cache. Under certain circumstances, loop blocking allows reuse of these
arrays by transforming the loops that manipulate them so that they manipulate strips of the arrays
that fit into the cache.

Option: +O[no]loop_unroll=n
Level: +O2 or higher
Function: Unroll [do not unroll] program loops by a factor of n. The default is +Oloop_unroll=4.

Option: +O[no]loop_unroll_jam
Level: +O3 or higher
Function: Loop unroll-and-jam involves partially unrolling one or more loops higher in the nest
than the innermost loop, and fusing ("jamming") the resulting loops back together. This
transformation is primarily intended to increase register reuse and decrease memory loads and
stores per operation within an iteration of a nested loop.

Option: +moduleoptimize
Level: All
Function: The compiler reads only required information from a module file. Optimized module files
are created by discarding redundant information while importing the module file. In case of nested
modules or hierarchical modules, the compilation time and memory requirement of [...]

Option: +O[no]moveflops
Function: Enable [disable] moving conditional floating-point instructions out of loops. The default
is +Omoveflops.

Option: +O[no]parallel
Function: Transform [do not transform] eligible loops for parallel execution. The default is
+Onoparallel.
This option is available only on HP 9000 systems. On Integrity systems, you can use the +Oautopar
option instead to parallelize loops.

Option: +O[no]parmsoverlap
Function: Assume [do not assume] that arguments may refer to the same memory locations. When
overlap is assumed, optimizations that would be unsafe for overlapping arguments are suppressed.
The default is +Onoparmsoverlap.

Option: +O[no]pipeline
Function: Enable [disable] software pipelining. The default is +Opipeline.
NOTE: This option is valid only on PA-RISC systems.

Option: +O[no]procelim
Level: All
Function: Remove [do not remove] unreferenced procedures from the executable. The default is
+Onoprocelim at levels 0 - 3, +Oprocelim at level 4.

Option: +O[no]promote_indirect_calls
Level: +O3 or higher
Function: Use profile data from profile-based optimization and other information to determine the
most likely target of indirect calls and promote them to direct calls. Indirect calls occur with
pointers to functions and virtual calls.
In all cases the optimized code tests to make sure the direct call is being taken and, if not,
executes the indirect call. If +Oinline is in effect, the optimizer may also inline the promoted
calls.
+Opromote_indirect_calls is only effective with profile-based optimization, which supplies the
data used to determine the most likely target of indirect calls. If the profile data is incomplete
or ambiguous, the optimizer may not select the best target. If this happens, your code’s
performance may decrease.
This option can be used at optimization levels 3 and 4. At +O3, it is only effective if indirect
calls from functions within a file are mostly to target functions within the same file. This is
because +O3 optimizes only within a file, whereas +O4 optimizes across files.
The default is +Opromote_indirect_calls at optimization level 3 and above, and
+Onopromote_indirect_calls at optimization level 2 and below.

Option: +Orarely_called=function1[,function2...]
Level: All
Function: The named functions are assumed to be rarely called. This option overrides any
information in a profile database.

Option: +Orarely_called:filename
Function: The file indicated by filename contains a list of functions, separated by spaces or
newlines. These functions are assumed to be rarely called. This option overrides any information in
a profile database.

Option: +O[no]recovery
Level: All
Function: This option generates [does not generate] recovery code for control speculation. For code
that writes to uncacheable memory that may not be properly identified as volatile, the +Orecovery
option reduces the risk of incorrect behavior. The default is +Orecovery.
NOTE: A program that uses signal handlers to catch signals raised by memory accesses may not
behave correctly under +Onorecovery.

Option: +O[no]regreassoc
Function: Enable [disable] register reassociation. The default is +Oregreassoc.
NOTE: This option is valid only on PA-RISC systems.

Function: This option provides a hint to the optimizer that the named functions never return to the
call site. If the +Ono_return:filename form is used, the file filename contains a list of function
names separated by spaces or newlines. This optimization is available at all optimization levels.

Option: +Oshortdata[=size]
Function: All objects of size bytes or smaller are placed in the short data area, and references to
such data assume it resides in the short data area. Valid values of size are a decimal number
between 8 and 4,194,304 (4MB). If no size is specified, all data is placed in the short data area.
The default is +Oshortdata=8.

Option: +[no]srcpos
Level: All
Function: This option controls the generation of source position information for HP Caliper. The
default is +srcpos. When +srcpos is in effect, the compiler generates source position information.
When +nosrcpos is in effect, the compiler does not generate this information and the compiler
instructs the linker to discard any of this information found in the object files.

Option: +O[no]store_ordering
Level: All
Function: This option preserves [does not preserve] the original program order for stores to memory
that is visible to multiple threads. This does not imply strong ordering. The default is
+Onostore_ordering.

Option: +O[no]vectorize
Level: +O3 or higher
Function: Replace [do not replace] eligible loops with calls to the math library; for more
information, see “Using the +Ovectorize option” on page 168. The default is +Onovectorize.

Option: +O[no]whole_program_mode
Function: Enable the assertion that only the files that are compiled with this option directly
reference any global variables and procedures that are defined in these files. In other words, this
option asserts that there are no unseen accesses to the globals. When this assertion is in effect,
the optimizer can hold global variables in registers longer and delete inlined or cloned global
procedures. This option is in effect only at the +O4 level of optimization.
All files compiled with +Owhole_program_mode must also be compiled with +O4. If any of the files
are compiled with +O4 but are not compiled with +Owhole_program_mode, the linker disables the
assertion for all files in the program.
Use this option to increase performance speed, but only when you are certain that only the files
compiled with +Owhole_program_mode directly access any globals that are defined in these files.
The default is +Onowhole_program_mode, which disables the assertion.
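The loop blocking transformation described in the table can be pictured with a hand-written
version. The following is a minimal sketch, not the code the optimizer generates: the module name
blocking_sketch, the subroutine names, the array size, and the block size of 32 are all
illustrative assumptions. It transposes a matrix once with a plain loop nest and once with a nest
that has been strip mined and interchanged, then checks that both give the same result.

```fortran
! Hand-written sketch of loop blocking (strip mining plus interchange).
! All names, the array size, and the block size are illustrative assumptions.
module blocking_sketch
  implicit none
contains

  ! Plain transpose: B is written with a large stride in the inner loop.
  subroutine transpose_plain(a, b)
    real, intent(in)  :: a(:, :)
    real, intent(out) :: b(:, :)
    integer :: i, j
    do j = 1, size(a, 2)
      do i = 1, size(a, 1)
        b(j, i) = a(i, j)
      end do
    end do
  end subroutine transpose_plain

  ! Blocked transpose: the jj/ii loops step over bs-by-bs tiles, and the
  ! inner loops touch only one tile of A and one tile of B at a time.
  subroutine transpose_blocked(a, b, bs)
    real, intent(in)    :: a(:, :)
    real, intent(out)   :: b(:, :)
    integer, intent(in) :: bs
    integer :: i, j, ii, jj
    do jj = 1, size(a, 2), bs
      do ii = 1, size(a, 1), bs
        do j = jj, min(jj + bs - 1, size(a, 2))
          do i = ii, min(ii + bs - 1, size(a, 1))
            b(j, i) = a(i, j)
          end do
        end do
      end do
    end do
  end subroutine transpose_blocked

end module blocking_sketch

program blocking_demo
  use blocking_sketch
  implicit none
  integer, parameter :: n = 200      ! deliberately not a multiple of 32
  real :: a(n, n), b1(n, n), b2(n, n)
  integer :: i, j

  ! Give every element a distinct, exactly representable value.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(i) + 1000.0 * real(j)
    end do
  end do

  call transpose_plain(a, b1)
  call transpose_blocked(a, b2, 32)

  if (all(b1 == b2)) then
    print *, 'blocked and plain transposes agree'
  else
    print *, 'results differ'
  end if
end program blocking_demo
```

Both subroutines perform the same assignments; only the order in which elements are visited
changes, which is why blocking affects cache behavior but not results.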
Conservative vs. aggressive optimization
At optimization level 2 or higher, the optimizer makes a number of assumptions about the program
it is optimizing—for example, that re-ordering an expression for improved instruction scheduling
will not change its results. In general, these assumptions relate to how closely the target program
conforms to the Fortran 90 Standard. For programs that conform to the Standard, it is safe for the
optimizer to apply certain optimizations that can significantly improve performance. For
nonstandard-conforming programs, these same optimizations could change the results or behavior
of the program in ways that may not be acceptable to the programmer.
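To see why re-ordering an expression is treated as an assumption and not as a guarantee, consider
the following minimal sketch. The module and function names are hypothetical, and the sketch
assumes that default REAL is IEEE single precision. The two functions compute sums that are equal
in exact arithmetic but are grouped differently.

```fortran
! Sketch: regrouping a floating-point sum can change its value.
! Names are illustrative; default REAL is assumed to be IEEE single precision.
module reorder_sketch
  implicit none
contains

  ! Sum evaluated in the order written: (a + b) + c.
  function sum_as_written(a, b, c) result(s)
    real, intent(in) :: a, b, c
    real :: s
    s = (a + b) + c
  end function sum_as_written

  ! Mathematically equivalent regrouping: a + (b + c).
  function sum_regrouped(a, b, c) result(s)
    real, intent(in) :: a, b, c
    real :: s
    s = a + (b + c)
  end function sum_regrouped

end module reorder_sketch

program reorder_demo
  use reorder_sketch
  implicit none
  real :: a, b, c

  a = 1.0e8
  b = -1.0e8
  c = 1.0

  ! In single precision, c is lost when it is first added to b,
  ! because 1.0 is smaller than the spacing between reals near 1.0e8.
  print *, 'as written:', sum_as_written(a, b, c)
  print *, 'regrouped: ', sum_regrouped(a, b, c)
end program reorder_demo
```

Under the stated assumption the first form typically yields 1.0 and the second 0.0. A program that
depends on one particular grouping is therefore sensitive to optimizations that re-order
floating-point expressions.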
The +Oconservative and +Oaggressive options enable you to set the optimizer’s assumptions
about which optimizations it can and cannot apply to a program. Each option invokes a subset
of the fine-tuning options that balances safety and performance according to the coding style of
the target program. You can use either option at optimization level 2 or higher.
NOTE: +Oaggressive and +Oconservative are incompatible and must not appear on the
same command line.
Table 30 lists the assumptions that the optimizer makes about your program when you
compile with +Oconservative, +Oaggressive, or neither option (the default). The table also
lists the fine-tuning options that are invoked by +Oconservative and +Oaggressive. The
options listed for the default case are the subset of the ones invoked by +Oconservative and
+Oaggressive. For information about the fine-tuning options listed in the third column, see
Table 29.
Table 30 Conservative, aggressive, and default optimizations

Specified options: +Onoconservative +Onoaggressive (the default)
Assumptions:
• Standard-conforming
Invoked options: +Onoentrysched, +Omoveflops, +Onoparmsoverlap, +Onovectorize

Specified options: +Oconservative
Assumptions:
• Nonstandard
• Sensitive to rounding differences
• Contains floating-point expressions that must be evaluated in the specified order
• Procedure arguments may overlap
Invoked options: +Ofltacc, +Onomoveflops, +Oparmsoverlap

Specified options: +Oaggressive
Assumptions:
• Standard-conforming
• Contains floating-point expressions that permit re-ordering for optimization
• Does not contain uninitialized variables
Invoked options: +Oentrysched, +Onofltacc, +Onoinitcheck, +Ovectorize
NOTE: The +Oaggressive and +Oconservative options are valid only on PA-RISC
systems.
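The assumption that procedure arguments may overlap can be illustrated with a short program. This
is a minimal sketch with hypothetical names (overlap_sketch, shift_add). The second call passes
the same array for both dummy arguments while one of them is modified, which the Fortran standard
does not permit, so its result is not defined and may change with the optimization options in
effect.

```fortran
! Sketch: a call whose actual arguments overlap (nonstandard usage).
! All names are illustrative assumptions.
module overlap_sketch
  implicit none
contains

  ! Each output element depends on the previous input element.
  subroutine shift_add(n, x, y)
    integer, intent(in) :: n
    real :: x(n), y(n)
    integer :: i
    do i = 2, n
      y(i) = x(i - 1) + 1.0
    end do
  end subroutine shift_add

end module overlap_sketch

program overlap_demo
  use overlap_sketch
  implicit none
  real :: v(5), w(5)

  ! Conforming call: x and y are distinct arrays.
  v = 0.0
  w = 0.0
  call shift_add(5, v, w)
  print *, 'distinct arguments:   ', w

  ! Nonstandard call: x and y are the same array. If the optimizer assumes
  ! no overlap, it may read x(i-1) before earlier stores to y are visible,
  ! so the printed values can differ from an unoptimized build.
  call shift_add(5, v, v)
  print *, 'overlapping arguments:', v
end program overlap_demo
```

With distinct arguments, elements 2 through 5 of w are each set to 1.0. With overlapping
arguments, the loop becomes a recurrence only if every store is seen by the next iteration, which
is exactly what an optimizer that assumes no overlap is free to ignore.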
Parallelizing HP Fortran programs
The following sections discuss how to use the +Oparallel option and the parallel directives when
preparing and compiling HP Fortran programs for parallel execution. Later sections also discuss
reasons why the compiler may not have performed parallelization. The last section describes
runtime warning and error messages unique to parallel-executing programs.
For a description of the +Oparallel option, see “Fine-tuning optimization options” (page 40).
NOTE: The +Oparallel option is not available on Integrity systems for HP Fortran Version 3.2
and later. You must use the +Oautopar option instead to parallelize loops.
Compiling for parallel execution
The following command lines compile (without linking) three source files: x.f90, y.f90, and
z.f90. The files x.f90 and y.f90 are compiled for parallel execution. The file z.f90 is compiled
for serial execution, even though its object file will be linked with x.o and y.o.
f90 +O3 +Oparallel -c x.f90 y.f90
f90 +O3 -c z.f90
The following command line links the three object files, producing the executable file para_prog:
f90 +O3 +Oparallel -o para_prog x.o y.o z.o
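As an illustration of what a source file compiled this way might contain, the following minimal
sketch contrasts two loops. The module and subroutine names are hypothetical, and whether the
compiler actually parallelizes a given loop is its own decision. The first loop has independent
iterations, the usual precondition for parallel execution. The second carries a dependence from
one iteration to the next, a common reason for a loop to be left serial.

```fortran
! Sketch: one loop with independent iterations, one with a carried dependence.
! All names are illustrative assumptions.
module par_sketch
  implicit none
contains

  ! Independent iterations: y(i) depends only on x(i) and the old y(i).
  subroutine scale_add(x, y, alpha)
    real, intent(in)    :: x(:)
    real, intent(inout) :: y(:)
    real, intent(in)    :: alpha
    integer :: i
    do i = 1, size(x)
      y(i) = y(i) + alpha * x(i)
    end do
  end subroutine scale_add

  ! Loop-carried dependence: x(i) needs the x(i-1) computed just before it.
  subroutine running_sum(x)
    real, intent(inout) :: x(:)
    integer :: i
    do i = 2, size(x)
      x(i) = x(i) + x(i - 1)
    end do
  end subroutine running_sum

end module par_sketch

program par_demo
  use par_sketch
  implicit none
  real :: x(8), y(8)

  x = 1.0
  y = 2.0
  call scale_add(x, y, 3.0)   ! every element of y becomes 2.0 + 3.0 * 1.0
  call running_sum(x)         ! x becomes the sequence 1.0, 2.0, ..., 8.0
  print *, 'y(1), y(8):', y(1), y(8)
  print *, 'x(1), x(8):', x(1), x(8)
end program par_demo
```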