HP Fortran Programmer Guide

HP-UX 11i v1, HP-UX 11i v2, and HP-UX 11i v3

Abstract

The HP Fortran Programmer Guide describes how to use the features of HP Fortran to develop, compile, debug, and optimize programs on HP 9000 systems and Integrity systems. It also describes how to migrate HP FORTRAN 77 programs to the current HP Fortran compiler and how to use the compiler features for porting programs written for other vendors’ Fortran to HP Fortran. If you have any problems with the software, please contact your local Hewlett-Packard Sales Office or Customer Service Center.

You need not be familiar with the HP parallel architecture, programming models, or optimization concepts to understand the concepts introduced in this book.

HP Fortran for Integrity systems is a part of the HP-UX PA-RISC Fortran product. HP Fortran is source compatible between the HP 9000 and Integrity systems. However, HP Fortran for Integrity systems will not run on PA-RISC based systems.

HP Part Number: 766160-001 Published: March 2014 Edition: 12

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

The information contained in this document is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

UNIX is a registered trademark of The Open Group.

Contents

HP secure development lifecycle......................................................................7

1 An overview of HP Fortran...........................................................................8

The HP Fortran compiler environment...........................................................................................8

Driver......................................................................................................................................9

C preprocessor.......................................................................................................................10

Front-end...............................................................................................................................11

Back-end...............................................................................................................................13

Linker....................................................................................................................................15

Tools.....................................................................................................................................17

HP-UX operating system...........................................................................................................17

2 Compiling and linking..............................................................................18

Compiling with the f90 command.............................................................................................18

f90 command syntax..........................................................................................................18

Command-line options............................................................................................................19

Commonly-used options......................................................................................................20

Command-line options by category......................................................................................20

Option descriptions............................................................................................................21

Symbol binding options...........................................................................................................37

Using optimization options.......................................................................................................38

Reviewing general optimization options.....................................................................................38

Fine-tuning optimization options...............................................................................................40

Filenames..............................................................................................................................49

Linking HP Fortran programs....................................................................................................50

Linking with f90 vs. ld.........................................................................................................50

Linking to libraries..............................................................................................................51

Linking to nondefault libraries.........................................................................................52

Linking HP Fortran 90 routines........................................................................................52

Additional HP Fortran libraries........................................................................................53

Linking to shared libraries...............................................................................................53

Library search rules.......................................................................................................54

Special-purpose compilations...................................................................................................54

Compiling programs with modules.......................................................................................54

Examples.....................................................................................................................56

Compiling with make.....................................................................................................57

Managing .mod files.....................................................................................................58

Compiling for different PA-RISC machines..............................................................................58

Creating shared libraries....................................................................................................59

Compiling with +pic......................................................................................................59

Linking with -b..............................................................................................................60

Examples.....................................................................................................................60

Using the C preprocessor........................................................................................................60

Processing cpp directives....................................................................................................61

Saving the cpp output file...................................................................................................62

Creating demand-loadable executables.....................................................................................62

Creating shared executables....................................................................................................62

Compiling in 64-bit mode........................................................................................................63

Using environment variables.....................................................................................................63

F90ROOT environment variable...........................................................................................64

ST_F90COM[64] environment variable.................................................................................64

HP_F90OPTS environment variable......................................................................................64

LPATH environment variable.................................................................................................65

MP_NUMBER_OF_THREADS environment variable.................................................................65

Floating installation.................................................................................................................65

Setting up floating installation..............................................................................................66

3 Controlling data storage............................................................................67

Disabling implicit typing..........................................................................................................67

Automatic and static variables..................................................................................................67

Increasing the precision of constants.........................................................................................69

Increasing default data sizes....................................................................................................70

Increasing default data sizes...............................................................................................70

Sharing data among programs.................................................................................................72

Modules vs. common blocks.....................................................................................................74

4 Debugging..............................................................................................76

Using the HP WDB debugger..................................................................................................76

Stripping debugging information..............................................................................................76

Handling runtime exceptions....................................................................................................77

Bus error exception............................................................................................................78

Floating-point exceptions.....................................................................................................78

Illegal instruction exception.................................................................................................79

Segmentation violation exception.........................................................................................79

Bad argument exception.....................................................................................................80

Using debugging lines............................................................................................................80

5 Using the ON statement............................................................................81

Exceptions handled by the ON statement..................................................................................81

Actions specified by ON.........................................................................................................82

Terminating program execution............................................................................................83

Ignoring errors...................................................................................................................83

Calling a trap procedure....................................................................................................84

Trapping floating-point exceptions...................................................................................84

Trapping integer overflow exceptions...............................................................................84

Trapping +Ctrl-C trap interrupts................................................................................................85

Allowing core dumps..............................................................................................................85

6 Performance and optimization ...................................................................87

Using profilers........................................................................................................................87

HP Caliper........................................................................................................................87

Invoking HP Caliper with the compiler to optimize your program.........................................88

Comparing Program Performance....................................................................................88

Using Options to Control Data Collection.........................................................................89

Specifying PBO file names and locations..........................................................................89

gprof................................................................................................................................89

prof.................................................................................................................................90

Using options to control optimization.........................................................................................90

Using +O to set optimization levels......................................................................................90

Using the optimization options.............................................................................................91

Conservative vs. aggressive optimization...................................................................................99

Parallelizing HP Fortran programs...........................................................................................100

Compiling for parallel execution........................................................................................100

Performance and parallelization.........................................................................................101

Profiling parallelized programs..........................................................................................101

Conditions inhibiting loop parallelization............................................................................101

Calling routines with side effects parallelization.............................................................101

Indeterminate iteration counts.......................................................................................102

Data dependences......................................................................................................102

Vectorization........................................................................................................................103

Using the +Ovectorize option............................................................................................103

Controlling vectorization locally.........................................................................................104

Calling BLAS library routines.............................................................................................105

Controlling code generation for performance...........................................................................106

7 Writing HP-UX applications......................................................................107

Accessing command-line arguments........................................................................................107

Calling HP-UX system and library routines................................................................................108

Using HP-UX file I/O.............................................................................................................108

Stream I/O using FSTREAM..............................................................................................108

Performing I/O using HP-UX system calls.............................................................................108

Establishing a connection to a file......................................................................................108

Obtaining an HP-UX file descriptor.....................................................................................109

8 Calling C routines from HP Fortran............................................................110

Data types...........................................................................................................................110

Unsigned integers............................................................................................................111

Logicals..........................................................................................................................111

Complex numbers............................................................................................................112

Derived types..................................................................................................................113

Pointers...........................................................................................................................113

Argument-passing conventions................................................................................................113

Case sensitivity.....................................................................................................................114

Arrays.................................................................................................................................116

Memory layout of a two-dimensional array in Fortran and C..................................................116

C strings..............................................................................................................................118

C null-terminated string.....................................................................................................118

Fortran hidden length argument.........................................................................................118

Passing a string...............................................................................................................118

File handling........................................................................................................................120

Sharing data........................................................................................................................122

9 Using Fortran directives...........................................................................123

Directive syntax....................................................................................................................123

Using HP Fortran directives....................................................................................................123

$HP$ ALIAS....................................................................................................................124

$HP$ CHECK_OVERFLOW...............................................................................................126

$HP$ LIST.......................................................................................................................127

$HP$ OPTIMIZE..............................................................................................................127

Compatibility directives.........................................................................................................128

Controlling vectorization...................................................................................................128

Controlling parallelization.................................................................................................129

Controlling dependence checks.........................................................................................129

Controlling checks for side effects.......................................................................................129

10 Migrating to HP Fortran.........................................................................131

Incompatibilities with HP FORTRAN 77...................................................................................131

Compiler limits................................................................................................................131

Command-line options not supported..................................................................................131

Format field widths...........................................................................................................132

Floating-point constants.....................................................................................................132

Intrinsic functions.............................................................................................................132

Procedure calls and definitions..........................................................................................133

Data types and constants..................................................................................................133

Input/output....................................................................................................................134

Directives........................................................................................................................134

Miscellaneous ................................................................................................................135

Migration issues...................................................................................................................135

Source code issues...........................................................................................................135

Directives...................................................................................................................136

Intrinsic functions.........................................................................................................137

Command-line option issues..............................................................................................137

Object code issues...........................................................................................................138

Data file issues................................................................................................................138

Approaches to migration.......................................................................................................139

HP-supplied migration tools...............................................................................................139

HP FORTRAN 77 compiler...........................................................................................139

HP Fortran compiler.....................................................................................................139

Lintfor........................................................................................................................139

Fortran incompatibilities detector...................................................................................139

11 Porting to HP Fortran..............................................................................141

Compatibility extensions........................................................................................................141

Compiler directives..........................................................................................................142

Intrinsic procedures..........................................................................................................143

Using porting options............................................................................................................144

Uninitialized variables......................................................................................................144

Large word size...............................................................................................................145

One-trip DO loops...........................................................................................................145

Name conflicts................................................................................................................146

Names with appended underscores...................................................................................147

Source formats................................................................................................................147

Escape sequences............................................................................................................148

Porting from Tru64 to HP Fortran.............................................................................................148

+cfc...............................................................................................................................148

Enhancements.................................................................................................................149

New options...................................................................................................................149

12 Fortran 2003 Features...........................................................................151

Interoperability with C...........................................................................................................151

Input/output enhancements...............................................................................................151

Miscellaneous enhancements.............................................................................................151

Data enhancements.........................................................................................................152

Object orientation features................................................................................................152

13 Documentation Feedback.......................................................................153

Glossary..................................................................................................154

Index.......................................................................................................159

HP secure development lifecycle

Starting with the HP-UX 11i v3 March 2013 update release, the HP secure development lifecycle provides the ability to authenticate HP-UX software. Software delivered through this release has been digitally signed using HP's private key. You can now verify the authenticity of the software before installing the products delivered through this release.

To verify the software signatures in a signed depot, the following products must be installed on your system:

• B.11.31.1303 or later version of SD (Software Distributor)

• A.01.01.07 or later version of HP-UX Whitelisting (WhiteListInf)

To verify the signatures, run: /usr/sbin/swsign -v -s <depot_path>. For more information, see the Software Distributor documentation at http://www.hp.com/go/sd-docs.

NOTE: Ignite-UX software delivered with the HP-UX 11i v3 March 2014 release or later supports verification of the software signatures in a signed depot or media during cold installation. For more information, see the Ignite-UX documentation at http://www.hp.com/go/ignite-ux-docs.

1 An overview of HP Fortran

When you use the f90 command to compile a Fortran program, the command invokes a number of components, not just the compiler, to create the executable. By default, f90 invokes different components to parse the source files for syntax errors, produce intermediate code, optimize the intermediate code, produce object code, search a set of library files for any additional object code that may be required, and link all of the object code into an executable file that you can run without further processing.

For example, consider a program that consists of three source files: x.f90, y.f90, and z.f90. The following command line will process the source files and, if they are syntactically correct, produce an executable file with the default name a.out:

$ f90 x.f90 y.f90 z.f90

After compilation is complete, you can execute the program by invoking the name of the executable, as follows:

$ a.out

However, it is likely that you’ll want to control which components act on your program and what they do to it. For example, you may want to give the executable a name other than a.out or link in libraries other than the default ones. The HP Fortran compiler supports a variety of command-line options that enable you to control the compilation process. This chapter provides an overview of the process and of the options that enable you to control the different components invoked by the f90 command.
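As a minimal sketch of this kind of control, the following script uses the -o option (listed in Table 1) to name the executable instead of accepting a.out. The executable name myprog and the contents of the three source files are hypothetical stand-ins for x.f90, y.f90, and z.f90. The script runs f90 only if the driver is on the PATH; otherwise it prints the command line it would have used.

```shell
# Work in a scratch directory so the example leaves no files behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for the three source files.
cat > x.f90 <<'EOF'
program main
  call hello
  call goodbye
end program main
EOF
cat > y.f90 <<'EOF'
subroutine hello
  print *, 'hello'
end subroutine hello
EOF
cat > z.f90 <<'EOF'
subroutine goodbye
  print *, 'goodbye'
end subroutine goodbye
EOF

# -o names the executable myprog instead of the default a.out.
cmd="f90 -o myprog x.f90 y.f90 z.f90"

if command -v f90 >/dev/null 2>&1; then
    $cmd && ./myprog
else
    echo "f90 not found; would run: $cmd"
fi
```

Apart from the -o option, the command line is the same as in the earlier a.out example; only the name of the resulting executable changes.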

NOTE: To get a summary listing of all f90 options, refer to the f90(1) manpage or use the +usage option, as shown here:

$ f90 +usage

For a full description of the options, see the most current version of the Parallel Programming Guide for HP-UX Systems.

The HP Fortran compiler environment

Figure 1 illustrates the compilation process, showing the different components of the HP Fortran compiler environment; active processes are unshaded and data elements are shaded. With the exception of the performance analysis tools and the debugger (WDB), all components are invoked by the f90 command. The C preprocessor and linker can also be invoked separately by the cpp and ld commands; see the cpp(1) and ld(1) manpages, respectively, for more information. The remaining sections in this chapter briefly describe the different components and the command-line options that control them. Included in each section are references to other parts of this manual for more detailed information.

Figure 1 HP Fortran compiler environment

Driver

The driver parses the f90 command line by which you invoke the compiler, calls each subprocess as needed to complete the compilation, and retains control throughout the compilation process.

Command-line options that control driver functions enable you to do the following:

• Call subprocesses that you want to substitute for those that f90 calls by default

• Pass arguments to a subprocess

• Get a summary listing of all options supported by the compiler

• Display information about the version of f90 you are using

• Control the level of information that the driver displays about the compilation process

Table 1 (page 9) lists and briefly describes the options that control the driver.
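A few of the driver options from Table 1 can be sketched together as follows: -c to compile without linking, a separate f90 invocation to link the resulting object files, and +dryrun to report the subprocesses without running them. The file names main.f90 and greet.f90, their contents, and the executable name greet_demo are hypothetical. The script runs the commands only if f90 is on the PATH and otherwise prints them.

```shell
# Work in a scratch directory so the example leaves no files behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-file program.
cat > main.f90 <<'EOF'
program main
  call greet
end program main
EOF
cat > greet.f90 <<'EOF'
subroutine greet
  print *, 'hello from greet'
end subroutine greet
EOF

# Step 1: -c suppresses the link phase and leaves main.o and greet.o.
compile_cmd="f90 -c main.f90 greet.f90"
# Step 2: link the object files; -o names the executable.
link_cmd="f90 -o greet_demo main.o greet.o"
# +dryrun reports the subprocess information without running the subprocesses.
dryrun_cmd="f90 +dryrun -o greet_demo main.f90 greet.f90"

if command -v f90 >/dev/null 2>&1; then
    $dryrun_cmd
    $compile_cmd && $link_cmd && ./greet_demo
else
    printf 'f90 not found; would run:\n  %s\n  %s\n  %s\n' \
        "$dryrun_cmd" "$compile_cmd" "$link_cmd"
fi
```

Splitting the build into a compile step and a link step is useful when only some source files change, because the unchanged object files can be reused at link time.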

Table 1 Options for controlling the f90 driver

Option
    Function

-c
    Suppress the link phase and produce an object file (.o) from each source file on the command line.

+dryrun
    Generate sub-process information for a given f90 command line without running the sub-processes.

-o outfile
    Name the output file outfile instead of the default file name (a.out or filename.o). If linking has been suppressed, the front end uses this option to name the object file.

+pre_include=file
    Process contents of file before all source files specified on the command line. The command line can have multiple occurrences of this option, each specifying a different file; they are processed in the specified order.

-tx,name
    Substitute a private version (name) of one or more subprocesses (x) of the compilation. The values for x are:
    a  Assembler
    c  Compiler
    l  Linker
    p  C preprocessor
    s  Startup file (crt0.o library)
    e  Debug file (end.o library)
    If you compile and link separately and specify -tl on the compile line, you must also specify it on the link line.

+usage
    List and briefly describe all f90 options.

-v
    Print verbose information to standard output as the program is compiled.

+version
    Write compiler version information to standard output, without compiling.

-Wx,arg1[,arg2,...,argN]
    Pass arg1 through argN to a subprocess of the compilation, identified by x. The values for x are:
    a  Assembler
    c  Compiler
    l  Linker
    p  C preprocessor
    If you compile and link separately and specify -Wl on the compile line, you must also specify it on the link line.

C preprocessor

HP Fortran source files can include directives that the C preprocessor (cpp) reads and expands before the program is passed to the compiler. Among other things, cpp directives enable you to code parts of your program for conditional compilation. By default, the f90 command passes source files ending in the .F extension to the C preprocessor.

Table 2 (page 10) lists and briefly describes the options for controlling the preprocessor, including the +cpp option, which overrides the default behavior and passes all source files on the command line to the preprocessor. For additional information, see the cpp(1) manpage.
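As an illustration of conditional compilation with cpp directives, the following minimal sketch selects one of two PRINT statements at preprocessing time. The file name cpp_demo.f90 and the symbol DEBUG are hypothetical, chosen only for this example. Because the source is in free format and does not end in .F, it is assumed here that the file is compiled with +cpp=yes so that it is passed to the preprocessor.

```fortran
PROGRAM cpp_demo
  IMPLICIT NONE
  ! DEBUG is a hypothetical preprocessor symbol used only in this sketch.
  ! The preprocessor keeps exactly one of the two PRINT statements below.
#ifdef DEBUG
  PRINT *, 'Built with DEBUG defined'
#else
  PRINT *, 'Built without DEBUG'
#endif
END PROGRAM cpp_demo
```

Under these assumptions, a command line such as f90 +cpp=yes -DDEBUG cpp_demo.f90 should select the first PRINT statement, and omitting -DDEBUG should select the second.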

Table 2 Options for controlling the C preprocessor

Option
    Function

+cpp={yes|no|default}
    Invoke the C preprocessor. +cpp=yes passes all source files to the preprocessor. +cpp=default passes only files ending in the .F extension. +cpp=no suppresses passing any files. The default is +cpp=default.

+[no]cpp_keep
    Retain [discard] output from the C preprocessor. If the source filename is file.f or file.F, output is stored in file.i; if the source filename is file.f90, the output filename is file.i90. The default, +nocpp_keep, is to discard the output.

-Dname[=def]
    Define the symbol name to the preprocessor. If def is specified, name is defined to that value.

-Idirectory
    Add directory to the list of directories searched for files specified in include directives. The command line can have multiple occurrences of this option, each specifying a different directory.

-M
    Generate the make rule corresponding to the compilation. The output is written to stdout or to a user-specified file. The rule describes the dependencies of the source file and contains the object file name for that source file, a colon, and the names of all the included files.

-Uname
    Remove any initial definition of name, a reserved symbol that is predefined by the preprocessor.

Front-end

The front-end is responsible for parsing the source code and issuing warning and error messages when the parse fails. Command-line options enable you to control the front end’s assumptions about the source code, including whether the source is in fixed or free format, uses implicit or explicit typing, and contains extensions. Other front-end options control the level of error messages and their language (Native Language Support), default data sizes, and search rules for .mod files. For a list of the options that control the front end, see Table 3 (page 11).
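To make the difference between implicit and explicit typing concrete, the following free-format sketch relies on implicit typing and contains a deliberate misspelling. The program and variable names are illustrative only. With implicit typing, the misspelled name totl silently becomes a new REAL variable, so total is never updated and the program prints a total of zero.

```fortran
PROGRAM implicit_demo
  ! No IMPLICIT NONE statement: names beginning with I through N are
  ! implicitly INTEGER, and all other names are implicitly REAL.
  total = 0.0
  DO i = 1, 3
     ! Deliberate typo: "totl" is accepted as a new, implicitly typed variable.
     totl = total + REAL(i)
  END DO
  PRINT *, 'total =', total
END PROGRAM implicit_demo
```

When implicit typing is disabled, for example with the +implicit_none option listed in Table 3, every name in this program must be declared, so the compiler should reject the undeclared names (including the typo) instead of accepting them.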

Table 3 Options for controlling the front end

Option
    Function

+[no]autodbl
    Promote [do not promote] all integer, logical, and real items to 8 bytes, and all double-precision and complex items to 16 bytes. The default is +noautodbl.
    For information about using this option, see “Increasing default data sizes” (page 70).

+[no]autodbl4
    Promote [do not promote] all integer, logical, and real items to 8 bytes, and complex items to 16 bytes. The +autodbl4 option does not promote the size of double-precision and double-complex items. The default is +noautodbl4.
    For information about using this option, see “Increasing default data sizes” (page 70).

+check={all|none}
    Enable (+check=all) or disable (+check=none) compile-time range checking of array subscripts and runtime checking of uninitialized stack variables. The default is +check=none.
    For information about using this option, see “Segmentation violation exception” (page 79).

+[no]dlines
    Compile debug lines as source statements [comments]. Source lines must be in fixed format. The default, +nodlines, is to treat source lines with a D or d in column 1 as comments.
    For information on using this option, see “Using debugging lines” (page 80).

+[no]escape
    Treat the backslash character (\) as a C-like escape [literal] character. The default is +noescape.
    For information on using this option when porting, see “Escape sequences” (page 148).

+[no]extend_source
    Allow [do not allow] up to 254 characters on a single source line. The default, +noextend_source, is 72 characters for fixed format and 132 for free format.
    For information on using this option when porting, see “Source formats” (page 147).

+f2003
    Enable compiling of Fortran 2003 standard code.

+f2003allocassign
    Enable automatic allocation or reallocation of an allocatable array on assignment to it, a feature that is part of the Fortran 2003 standard.

-Idirectory
    Add directory to the list of directories searched for files specified in INCLUDE lines and include directives, and for .mod files. The command line can have multiple instances of this option, each specifying a different directory.
    For information about using this option, see “Managing .mod files” (page 58).

+[no]implicit_none
    Cause the types of identifiers to be implicitly undefined [defined]. The default is implicit typing (+noimplicit_none).
    For information about using this option, see “Disabling implicit typing” (page 67).

+langlvl={90|default}
    Issue warnings for all extensions to the Fortran standard (+langlvl=90). The default, +langlvl=default, allows extensions.
    For information about using this option, see Chapter 11, “Porting to HP Fortran” (page 141).

+[no]list
    Write [suppress] a program listing to standard output during compilation. The default is +nolist.

+moddir=directory
    Write .mod files to directory. The default is to write .mod files to the current directory.
    For information about using this option, see “The HP Fortran compiler environment” (page 8).

+moduleoptimize
    Create optimized module files by discarding redundant information. This reduces the compilation time and memory requirement of the compiler for nested or hierarchical modules.

+nls=lang
    Enable 16-bit Native Language Support (NLS) in strings and comments in the language specified by lang.

+[no]onetrip
    Execute any counter-controlled DO loop at least once (+onetrip). The default is +noonetrip.
    For information about using this option when porting, see “One-trip DO loops” (page 145).

+[no]ppu
    Postpend [do not postpend] underscores at the end of definitions of and references to externally visible symbols.
    For Itanium architecture, the default is +ppu. For PA-RISC 64-bit architecture, the default is +ppu. For PA-RISC 32-bit architecture, the default is +noppu.
    For information about using this option when porting, see “Names with appended underscores” (page 147).

+real_constant={single|double}
    Treat all single-precision real and complex constants as either single-precision (+real_constant=single) or double-precision (+real_constant=double). The default is +real_constant=single. This option has no effect on constants that are explicitly sized or when the +autodbl or +autodbl4 option is specified.
    For information about using this option, see “Increasing the precision of constants” (page 69).

+source={fixed|free|default}
    Accept source files in fixed format (+source=fixed) or free format (+source=free). The default, +source=default, is free for .f90 files and fixed for .f and .F source files.
    For information about using this option, see “Source formats” (page 147).

+[no]uppercase
    Use uppercase [lowercase] for all external names. The default is +nouppercase.
    For information about using this option, see “Case sensitivity” (page 114).

-w
    Suppress warning messages.

+w
    Warn about all questionable constructs and issue remarks about coding styles and performance. Without the +w option, the compiler issues warnings only about constructs that are almost certainly problems.

Back-end

The two main functions of the back-end are:

• To optimize your program for faster performance

• To generate the code that goes into the object file

Optimization is performed by two subcomponents of the compiler’s back end:

• The High-Level Optimizer (HLO), which performs large-scale, high-semantic-level analyses and transformations to increase performance.

• The low-level optimizer, which performs traditional optimizations (such as common subexpression elimination and dead-code removal) as well as machine-specific optimizations.

Options for controlling optimization form the largest group of the command-line options. These options enable you to do the following:

• To set the level of optimization that is applied to your program

• To apply a package of optimizations that meet certain requirements of your application (for example, optimizations that favor compile-time speed over performance)

• To apply specific optimization technologies to your program, or to specific parts of your program, for fine-tuning performance

Table 4 (page 90) lists (in summary form) the options that control optimization. For information about how to use these options, see “Performance and optimization” (page 87).

NOTE: If you use the f90 command to compile and link on separate command lines, many of the optimization options must appear on both the compile line and the link line; see “Performance and optimization” (page 87). For information about using f90 to compile and link, see “Linking with f90 vs. ld” (page 50).
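The traditional optimizations named above can be pictured with a small hand-worked sketch. The program below is illustrative only and does not show what HP Fortran actually emits: the first group of statements repeats the subexpression (a + b) and assigns a variable that is never read, and the second group is a hand-written equivalent of the kind of code that common subexpression elimination and dead-code removal aim for. All names are hypothetical.

```fortran
PROGRAM cse_demo
  IMPLICIT NONE
  REAL :: a, b, c, x, y, t, x2, y2, unused

  a = 1.5
  b = 2.5
  c = 4.0

  ! As written: (a + b) appears twice, and "unused" is assigned but never read.
  x = (a + b) * c
  y = (a + b) / c
  unused = a * b

  ! Hand-applied equivalent: the common subexpression is computed once,
  ! and the dead assignment is dropped.
  t  = a + b
  x2 = t * c
  y2 = t / c

  ! Both lines print the same pair of values.
  PRINT *, x, y
  PRINT *, x2, y2
END PROGRAM cse_demo
```

The point of the sketch is only that the two forms compute the same results; which transformations the optimizer applies depends on the optimization level and options in effect.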

Table 4 Options for controlling optimization

Option
    Function

+check=uninit
    Check at runtime for the use of stack variables before they are defined.

+DC7200
    Perform memory hierarchy optimizations for the PA7200 processor.

-O[optlevel]
    Optimize program, where optlevel is 0 (no optimization), 1, 2, 3, or 4 (the highest level). If optlevel is not specified, the program is optimized at level 2 (-O2).

+Ooptlevel
    This option has the same meaning as the -O[optlevel] option, except that optlevel must be specified. It is provided for compatibility with makefiles.

+O[no]info
    Provide [do not provide] feedback information about the optimization process. This option is most useful at optimization level 3 and higher. The default is +Onoinfo.

+O[no]optimization
    Enable [disable] optimization, a predefined string that indicates a category of optimizations (for example, those that do not increase code size) or a specific optimization technology (for example, inlining). See the HP Fortran Programmer’s Reference for the different values for optimization.

The other component of the back end is the code generator (CodeGen), which you can control by using the command-line options in Table 5. These options allow you to specify (among other things) that the output file include debugging or profiling information or that local variables be saved in static memory.

Table 5 Options for controlling code generation

Option
    Function

+[no]asm
    Compile the named source files and leave [do not leave] the assembly language output in corresponding files whose names are suffixed with .s. The default is +noasm.

+DAmodel
    Generate code for a specific version of the PA-RISC architecture. model can be one of the following:
    • PA-RISC version number (1.1 or 2.0).
    • A model number (for example, 750 or 870).
    • One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000).
    • The word portable to generate code compatible across all PA-RISC 1.1 and 2.0 workstations and servers.
    For information about using this option, see “Compiling for different PA-RISC machines” (page 58).

+DSmodel
    Perform instruction scheduling appropriate for a specific implementation of the PA-RISC architecture. model can be one of the following:
    • PA-RISC version number (1.1 or 2.0).
    • A model number (for example, 750 or 870).
    • One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000).
    For information about using this option, see “Compiling for different PA-RISC machines” (page 58).
    On Itanium processor family architecture, the defined values for model are:
    • blended: tune for best performance on a combination of processors (that is, Itanium or Itanium 2 processor). This is the Itanium architecture default.
    • itanium: tune for best performance on an Itanium processor.
    • itanium2: tune for best performance on an Itanium 2 processor.
    • native: tune for best performance on the processor on which the compiler is running.

-g
    Generate debugging information needed by the debugger. This option is compatible with optimization levels 0, 1, and 2. If you compile and link separately and specify -g on the compile line, you must also specify it on the link line.
    For information about using this option to prepare programs for the debugger, see “Using the HP WDB debugger” (page 76).

+[no]gprof
    Prepare [do not prepare] object files for profiling with gprof; see the gprof(1) manpage. The default is +nogprof. If you compile and link separately and specify +gprof on the compile line, you must also specify it on the link line.
    For information about using this option to profile programs with gprof, see “gprof” (page 89).

+k
    Generate code for programs that reference a very large number of shared data items. The linker will issue a diagnostic message in the rare cases when this option is needed.
    NOTE: This option is valid only on the PA-RISC systems.

+pic={short|long|no}
    Generate Position Independent Code (PIC) with short displacements (+pic=short) or long displacements (+pic=long) for use in shared libraries. The default is +pic=no.
    For information about using this option when creating shared libraries, see “Compiling with +pic” (page 59).

+[no]prof
    Prepare [do not prepare] object files for profiling with prof; see the prof(1) manpage. The default is +noprof. If you compile and link separately and specify +prof on the compile line, you must also specify it on the link line.
    For information about using this option to profile programs with prof, see “gprof” (page 89).

+[no]save
    Save [do not save] all local variables in all subprograms.
    For information about using this option when porting, see “Uninitialized variables” (page 144).

Linker

The linker (ld) builds an executable program from the object files produced by the back end and the libraries. An important group of options for controlling the linker specify what libraries the linker should search and where it should search for them. Other options control the type of information that the linker should or should not include in its output file, such as symbol table information used by the debugger or marks identifying the output file as shareable or demand loadable. Table 6 lists and briefly describes options that control the linker.

NOTE: If you use the f90 command to compile and link on separate command lines and compile with any of the options (except -c) listed in Table 6, you must specify the same options on the link line as well.

Table 6 Options for controlling the Linker

Option
    Function

-c
    Suppress linking; produce object files only.

+[no]demand_load
    Mark [do not mark] the output file from the linker demand load. If you compile and link separately and specify +demand_load on the compile line, you must also specify it on the link line. The default is +nodemand_load.
    For information about using this option, see “Creating demand-loadable executables” (page 62).

+FPflags
    Specify how the runtime environment for trapping floating-point exceptions should be initialized at program startup. If you compile and link separately and specify +FP on the compile line, you must also specify it on the link line with the identical set of flags. The default is that all traps are disabled. See the ld(1) manpage for specific values for flags.
    For information about using this option, see “Floating-point exceptions” (page 78).

+[no]fp_exceptions
    Enable [disable] floating-point exceptions. Enabling floating-point exceptions also causes the running program to issue a procedure traceback for runtime errors. The default is +nofp_exceptions.
    For information about using this option, see “Floating-point exceptions” (page 78).

-Ldirectory
    Add directory to the front of the library search path. This option applies only to libraries specified by the -l option (see below). If you compile and link separately and specify -L on the compile line, you must also specify it on the link line.
    For information about using this option, see “Library search rules” (page 54).

-lx
    Link the library libx.a or libx.sl to the executable program. If you compile and link separately and specify -l on the compile line, you must also specify it on the link line.
    For information about using this option, see “Linking to nondefault libraries” (page 52) and the ld(1) manpage.

-ooutfile
    Name the output file outfile instead of the default a.out. If linking is suppressed (-c), this option is used instead to name the object files.

+[no]shared
    Cause the output file from the linker to be marked shared [unshared]. If you compile and link separately and specify +shared on the compile line, you must also specify it on the link line. The default is +shared.
    For information about using this option, see “Creating shared executables” (page 62).

+[no]strip
    Strip [do not strip] symbol table information from the linker output. For more information, see the ld(1) and strip(1) manpages. This option is not compatible with -g. If you compile and link separately and specify +strip on the compile line, you must also specify it on the link line. The default is +nostrip.
    For information about using this option, see “Stripping debugging information” (page 76).

+[no]ttybuf
    Use buffered [unbuffered] output to the terminal. The default is +ttybuf.

+[no]U77
    Invoke [do not invoke] support for the libU77 library (BSD 3f). If you compile and link separately and specify +U77 on the compile line, you must also specify it on the link line. The default is +noU77.
    For information about the libU77 library, see “Additional HP Fortran libraries” (page 53) and the HP Fortran Programmer’s Reference.

-Wl,options
    Pass a comma-separated list of options to the linker.
    For information about options supported by the linker, see the ld(1) manpage.

Tools

The HP Fortran compiler environment includes a high-level language debugger and performance analysis tools. The debugger is HP WDB, which includes a graphical user interface. To prepare a program for debugging, you must compile it with the -g option. For information about this option, see “Using the HP WDB debugger” (page 76).

The performance analysis tools include the standard UNIX utilities, prof and gprof. To use prof and gprof, you must compile with the +prof and +gprof options, respectively. For more information about all of the performance analysis tools, see “Using profilers” (page 87) and the CXperf(1), prof(1), gprof(1), and ttv(1) manpages.

HP-UX operating system

Although the HP-UX operating system does not appear in Figure 1, it provides a variety of resources for programs executing within HP-UX. For example, HP-UX captures the command line you use to invoke an executable program, breaks it up into arguments, and makes them available to your program.

HP-UX also has many callable system routines that provide low-level access to kernel-level resources. For example, your program can call HP-UX file-processing routines as alternatives to Fortran I/O.

“Writing HP-UX applications” (page 107) discusses how HP Fortran programs can take advantage of HP-UX resources. For a full description of HP-UX system routines, see the HP-UX Reference.
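As a sketch of how a program can read the command-line arguments that HP-UX makes available, the following example uses the Fortran 2003 intrinsics COMMAND_ARGUMENT_COUNT and GET_COMMAND_ARGUMENT. It assumes a compiler that supports these standard intrinsics; this guide does not confirm that HP Fortran provides them, and “Writing HP-UX applications” (page 107) remains the reference for the mechanisms HP Fortran documents. The program name and the 256-character buffer length are arbitrary choices for illustration.

```fortran
PROGRAM show_args
  IMPLICIT NONE
  INTEGER :: i, n
  CHARACTER(LEN=256) :: arg   ! arbitrary buffer length for this sketch

  ! Number of arguments that follow the program name on the command line.
  n = COMMAND_ARGUMENT_COUNT()
  PRINT *, 'Number of arguments:', n

  ! Retrieve and print each argument in order.
  DO i = 1, n
     CALL GET_COMMAND_ARGUMENT(i, arg)
     PRINT *, i, TRIM(arg)
  END DO
END PROGRAM show_args
```

Invoked with, say, two arguments, the program reports a count of two and then lists each argument on its own line.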

2 Compiling and linking

This chapter discusses how to compile and link HP Fortran programs and covers the following topics:

• Compiling with the f90 command

• Linking HP Fortran programs

• Special-purpose compilations

• Using environment variables

Compiling with the f90 command

The default behavior of the f90 command is to compile source files listed on the command line and, if the compilation is successful, to pass the resulting object files to the linker. If the link stage is successful, the linker produces an executable program with the default name a.out.

Consider, for example, the program hello.f90:

Example 1 hello.f90

PROGRAM main
  CALL hello()
END PROGRAM main

SUBROUTINE hello()
  PRINT *, 'Hello, I must be going.'
END SUBROUTINE hello

When compiled with the command line:

$ f90 hello.f90

f90 produces two files, hello.o (object code) and a.out (the executable program).

If the command line contains only an object file, as in the following:

$ f90 hello.o

f90 passes the object file to the linker, which (if successful) produces the executable program a.out.

Here is a sample run of the executable program:

$ a.out
Hello, I must be going.

This section provides more detailed information about using the f90 command, including:

• Command-line syntax

• Command-line options

• Filenames recognized by f90

f90 command syntax

The syntax for using the f90 command is:

f90 [options] [files]

where options is a list of one or more command-line options and files is a list of one or more files containing HP Fortran source code to be compiled or object code to be linked. Items in options and files can be interspersed on the command line, separated by spaces. However, some options are order-sensitive. For example, the -l option, which is used to specify a library for linking, must follow the program file to which it will be linked.

For information about using the -l option, see “Linking to nondefault libraries” (page 52). For more information about the f90 command line, see the HP Fortran Programmer’s Reference.

Command-line options

Command-line options enable you to override the default behavior of the f90 command. Some options affect how files are compiled or linked; for example, the -O option requests optimization. Other options may cause the f90 command to skip a process entirely; for example, the -c option compiles without linking. And still others invoke processes other than the default ones; for example, the +cpp=yes option causes the f90 command to send source files to the C preprocessor (cpp) before compiling. (For information about using cpp, see “Using the C preprocessor” (page 60).)

Many options are of the form +[no]option, where +option enables the option’s functionality and +nooption disables it. Other options have more than just an on or off state; these are of the form +option=arg. You can cause f90 to list the values for arg on stderr by specifying just the option name without an argument. For example, given the command line:

$ f90 +langlvl= prog.f90

f90 will issue the following message:

f90: The '+langlvl=' option requires one of s

Still other options take a name as an argument. For example, the -oname option specifies the name you want to give to the output file. If you misspell an option on the f90 command line, the driver looks for options that are similar to the one you entered and lists them as possible alternatives on stderr. It meanwhile compiles the program without the option in question.

For detailed information about the syntax of all the options, see the HP Fortran Programmer’s Reference. For a brief descriptive list of the options, use the command line:

$ f90 +usage

Commonly-used options

Table 7 identifies commonly-used command-line options for when you want to change the compiler’s default behavior. For a full description of all HP Fortran command-line options, see “Option descriptions” (page 21).

Table 7 Commonly-used options

Option
    Function

-c
    Compile without linking. Use this option to compile and link in separate steps.

-g
    Prepare program for debugging.

-Ldirectory
    Specify where to look for libraries; applies only to succeeding -l options. Be careful about using this option if the LPATH environment variable is set.

-lx
    Specify a library. Use this option to link in library routines.

-O
    Optimize. Use this option to optimize your program at the default level of optimization.

+save
    Give the SAVE attribute to local variables. Use this option when porting older Fortran programs that may contain uninitialized variables.

-ooutfile
    Name the output file outfile. Use this option to name the executable or object file.

+usage
    List all compile-line options currently supported by f90.

-v
    Enable verbose mode. Use this option to get a report on the compilation process.

-Idirectory
    Set include file and module search path.

-Wl,arg
    Pass arg to the linker as a linker option.

Command-line options by category

Table 8 categorizes the f90 command-line options. For detailed information about each of the options, see “Option descriptions” (page 21).

Table 8 Options listed by category

Category
    Options

Compatibility and porting
    +autodbl, +autodbl4, +charlit77, +[no]es, +extend_source, +gformat77, +i8, +io77, +langlvl, +multi_open, +nocheckuf, +nopadsharedcommon, +onetrip, +ppu, +r8, +save, +[no]signedzero, and +U77

Compiler configuration
    -t and -W

Data storage
    +autodbl, +autodbl4, +hugecommon, +hugesize, +i8, +indirectcommonlist=file, +nopadsharedcommon, +r8, +real_constant, and +save

Directory, module, and library search path
    +U77, -I, -L, -l, and +moddir

Debugging
    +check, +dlines, +fp_exception, -g, +initheap_complex, +initheap_integer, +initheap_real, and +initheap_set

Error control
    +FP and +fp_exception

Industry standardized extensions
    +O[no]openmp

Language features
    +escape, +extend_source, +escape, +f2003, +implicit_none, +langlvl, +[no]signedzero, and +source

Linking
    +demand_load, -dynamic, +FP, +k, -L, -l, +shared, +sharedlibF90, +strip, and +uppercase

Listing and messages
    +asm, +dryrun, +langlvl=90, +list, -v, +version, +what, and -w

Native language support
    +nls

Performance and optimization
    +cat, +DA, +DC, +DO, +DS, +fastallocatable, +O, -O, and +Oparallel_intrinsics

Preprocessor
    +cpp, +cpp_keep, -D, and -U

Profiling
    +gprof, +prof, +pa, and +pal

Miscellaneous
    +asm, -c, +getarg0, +getarg1, +noalign64bitpointers, -o, +pic, +pre_include, +ttybuf, +usage, +Z, and +z

Option descriptions

The following alphabetical list describes each of the command-line options recognized by HP Fortran. The +usage option also lists and briefly describes all of the currently supported options.

+A

Set the alignment of data items within FORTRAN STRUCTUREs, COMMON blocks, and EQUIVALENCE classes. Specifying +A without any arguments aligns data on 2-byte boundaries; therefore, it is slightly different from the f77 +A.

+noalign64bitpointers

Disables the correct alignment of pointers in derived types when compiling for wide mode (+DA2.0W).

NOTE: This option is only valid on the PA-RISC systems.

+allow_ivdep

Allows the IVDEPdirective. The IVDEPdirective assists the compiler’s dependence analysis. It can only be applied to

iterative DO loops and is only honored when the +allow_ivdepcompiler flag is used. The IVDEPdirective takes the following form:

Command-line options 21

C$DIR IVDEP

Rules and behavior: The IVDEPdirective is an assertion to the compiler’s optimizer about the order of memory

references inside a DOloop. The IVDEPdirective tells the compiler to begin dependence analysis by assuming all

dependences occur in the same forward direction as their appearance in the normal scalar execution order. This contrasts with normal compiler behavior, which is for the dependence analysis to make no initial assumptions about the direction of a dependence.

The IVDEPdirective must precede the DOstatement for each DOloop it affects. No source code lines, other than the following.

The IVDEPdirective is applied to a DOloop in which the user knows that dependences are in lexical order. For example, if two memory references in the loop touch the same memory location and one of them modifies the memory location, then the first reference to touch the location has to be the one that appears earlier lexically in the program source code. This assumes that the right-hand side of an assignment statement is earlier than the left-hand side.

The IVDEPdirective informs the compiler that the program would behave correctly if the statements were executed in certain orders other than the sequential execution order, such as executing the first statement or block to completion of all iterations, then the next statement or block for all iterations, and so forth. The optimizer can use this information, along with whatever else it can prove about the dependences, to choose other execution orders.

Example 3 EXAMPLE:

In the following example, the IVDEP directive provides more information about the dependences within the loop, which may enable loop transformations to occur:

C$DIR IVDEP DO I+1, N A(INDARR(I)) = A(INDARR(I)) + B(I) END DO

+[no]asm

+[no]autodbl

In this case, the scalar execution order follows:

• Retrieve INDARR(I)

• Use the result from Step 1 to retrieve A(INDARR(I));

• Retrieve B(I);

• Add the results from Steps 2 and 3 ;

• Store the results from Step 4 into the location indicated by A(INDARR(I))from Step1.

IVDEPdirects the compiler to initially assume that when Steps 1 and 5 access a common memory location, Step 1 always accesses the location first because Step 1 occurs earlier in the execution sequence. This approach lets the compiler reorder instructions, as long as it chooses an instruction schedule that maintains the relative order of the array references.

+allow_unaligned

Relaxes the natural data type rules for alignment.

+[no]asm

+asm compiles the named programs and leaves the assembler-language output in corresponding files whose names have the .s extension. The assembler-language output produced by this option is not supported as input to the assembler. The default is +noasm. The -S option can be used to perform the same function as +asm.

+[no]autodbl

+autodbl increases the default size of integer, logical, and real items to 8 bytes; see Table 9. It also increases the default size of double precision and complex items to 16 bytes. This option does not increase the size of the following:

• Items of character type

• Items declared with the BYTE statement

• Items declared with the DOUBLE COMPLEX statement

• Explicitly sized items

For example, the following are unaffected by +autodbl:

INTEGER(KIND=4)

INTEGER(4) J

REAL*8 D

3.1416_4, 113_4

Note, however, that constants specified with an exponent (for example, 4.0E0 and 2.3D0) are doubled.

Items promoted include constants, scalar variables, arrays, components of derived types, and record fields. This option also promotes intrinsics as well as the results and arguments of user-defined functions to the proper precision. Types specified in ON statements are also promoted.

The entire program should be compiled with this option, not just selected files. This option is useful when porting programs that depend on the increased precision of 8 and 16 bytes. If you want to promote only single-precision items, use the +autodbl4 option. (REAL(KIND=16) arithmetic is slow.)

The default is +noautodbl.

Table 9 Data type sizes and +autodbl[4]

Sizes in bytes of intrinsic types

                 Integer, logical, and real   Double precision   Complex

Default sizes    4                            8                  8

+autodbl         8                            16                 16

+autodbl4        8                            8                  16
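The sizes in Table 9 can be restated as a small lookup, which makes the difference between +autodbl and +autodbl4 easy to check. This Python sketch is illustrative only; the dictionary and function names are hypothetical, and the rule that explicitly sized items keep their declared size is taken from the list of unaffected items above.

```python
# Sizes in bytes, transcribed from Table 9.
SIZES = {
    "default":   {"integer": 4, "logical": 4, "real": 4,
                  "double precision": 8, "complex": 8},
    "+autodbl":  {"integer": 8, "logical": 8, "real": 8,
                  "double precision": 16, "complex": 16},
    "+autodbl4": {"integer": 8, "logical": 8, "real": 8,
                  "double precision": 8, "complex": 16},
}


def item_size(type_name, option="default", explicit_bytes=None):
    """Return the size of an item; explicitly sized items are never promoted."""
    if explicit_bytes is not None:
        return explicit_bytes
    return SIZES[option][type_name]


print(item_size("real", "+autodbl"))               # default-sized REAL
print(item_size("double precision", "+autodbl4"))  # not promoted by +autodbl4
print(item_size("real", "+autodbl", explicit_bytes=8))  # REAL*8 D stays as declared
```

Under +autodbl4 the lookup gives the same size for REAL, INTEGER, and DOUBLE PRECISION, which is the situation the guide flags as violating the Fortran Standard.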

+[no]autodbl4

Like +autodbl, +autodbl4 increases the default size of integer, logical, and real items to 8 bytes, and the default size of complex items to 16 bytes; see Table 9. Unlike +autodbl, it does not increase the default size of double precision. This option does not increase the size of the following:

INTEGER(KIND=4)

INTEGER(4) J

REAL*8 D

3.1416_4, 113_4

Note, however, that constants specified with an exponent (for example, 4.0E0 and 2.3D0) are doubled.

The entire program should be compiled with this option, not just selected files. Use this option when you want to promote only the single-precision items.

The default is +noautodbl4.

NOTE: The +autodbl4 option causes items declared as REAL, INTEGER, and DOUBLE PRECISION all to have the same size. This violates the Fortran Standard.

-b

This option can be specified to invoke ld with -b to create a shared library.

-Boption

The -B options are recognized by HP Fortran to specify whether references to global symbols may be resolved to symbols defined in the current translation unit, or whether they must be assumed to be potentially resolved to symbols defined in another load module. The -B options are only supported on HP-UX version 11i and later. See “Symbol binding options” (page 37).

-c

-c compiles the specified source files but does not link them. The compiler produces a relocatable file (.o) for each file in the files list (these may include .f90, .f, .F, .i, .i90, and .s files). When using -c and -o together, you may specify only one source file on the command line; the resulting object file is renamed.


-C

Performs run-time error checking of array subscripts. This option is deprecated and replaced by the +check=all option.

+charlit77

+charlit77 causes character literals to be placed in writable static storage. This allows character strings passed as actual arguments to be modified by the called routine.

+check={all|none|uninit|bounds|bounds:array|bounds:all|bounds:none}

+check=all enables compile-time range checking for array subscripts. The +check=all option also causes an executing program to halt with a run-time error if any of the following is detected:

• Integer overflow

• Use of uninitialized stack variable

• Out-of-bounds subscripts

• Out-of-bounds substrings

+check=uninit enables run-time checks for use of uninitialized stack variables.

+check=bounds enables run-time checks for out-of-bounds references to array variables.

+check=bounds:array enables run-time checks for out-of-bounds references to array variables.

+check=bounds:all enables run-time checks for both +check=bounds:array and +check=uninit.

+check=bounds:none disables out-of-bounds run-time checks.

-check_bounds

Performs run-time error checking and promotes compile-time bound violations to errors.

+[no]cfc

Enables or disables Compaq/DEC F90 language mode.

+cpp={yes|no|default}

+cpp=yes tells the compiler to pass the source files specified on the command line to the C preprocessor before passing them on to the compiler. This option does not apply to .i and .i90 files.

The default, +cpp=default, is to apply the C preprocessor to files that end in the .F extension but not to files that end in .f or .f90.

Specifying +cpp=no tells the compiler not to invoke the C preprocessor for any file on the command line, including those ending in .F.

If you want to keep the output from the C preprocessor, use the +cpp_keep option.

+[no]cpp_keep

+cpp_keep causes the compiler to retain the output from the C preprocessor. If the source filename is file.f or file.F, the output filename is file.i; if the source filename is file.f90, the output filename is file.i90. The compiler will accept source files with the .i and .i90 extensions.

The default, +nocpp_keep, is to discard the output file.

Note that this option does not pass source files to the C preprocessor. To do that, you must also specify the +cpp=yes option.
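The interaction between the +cpp setting and the source file extension, and the name of the file retained by +cpp_keep, can be summarized in a short Python sketch. It is a model of the rules stated above, not compiler code; the function names are hypothetical, and extensions the guide does not mention are rejected rather than guessed.

```python
import os


def runs_cpp(filename, cpp="default"):
    """Return True if the file would be passed to the C preprocessor."""
    ext = os.path.splitext(filename)[1]
    if ext in (".i", ".i90"):
        return False          # +cpp does not apply to .i and .i90 files
    if cpp == "yes":
        return True
    if cpp == "no":
        return False
    return ext == ".F"        # +cpp=default: only .F files are preprocessed


def kept_output(filename):
    """Return the preprocessor output name retained by +cpp_keep."""
    root, ext = os.path.splitext(filename)
    suffix = {".f": ".i", ".F": ".i", ".f90": ".i90"}
    if ext not in suffix:
        raise ValueError("extension not covered by this sketch: " + ext)
    return root + suffix[ext]


print(runs_cpp("solver.F"))           # True
print(runs_cpp("solver.f90"))         # False
print(runs_cpp("solver.f90", "yes"))  # True
print(kept_output("solver.f90"))      # solver.i90
```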

-D name[=def]

-D defines a symbol name (name) to the C preprocessor. If you do not provide a definition (def) for the symbol name, name is defined as 1. This option applies only to files that are passed to the C preprocessor.

+DAmodel

+DA generates object code for a particular version of the PA-RISC architecture. By default, the compiler generates code for the machine model you are compiling on. With this option, you can override the default, causing the compiler to generate code for the machine on which the program will execute rather than for the machine on which it is compiled.

model can be one of the following:

• A PA-RISC version number (1.1 or 2.0)

• A model number (for example, 750 or 870)

• One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000)

• The word portable, which causes the compiler to generate code that is compatible across all PA-RISC 1.1 and 2.0 systems

For best performance, use +DA with the model number or architecture of the machine on which you plan to execute the program.

The +DA option also specifies the appropriate search path for HP-UX math libraries. If your program calls mathematical functions, +DA2.0 links in the PA2.0 version of the math library, while +DA1.1 links in the PA1.1 library version. (For more information about using math libraries, see the HP-UX Floating-Point Guide.)

With +DA2.0W, memory addresses are 64-bit values. This allows common blocks and dynamically allocated memory to exceed 32-bit address limits. This feature is restricted by the available virtual memory on the system where the application is run.

NOTE: You must specify +DA2.0W to generate 64-bit code. At PA64, all data types remain the same size as at PA32 except for pointers. Fortran pointers are hidden from the user and cannot be directly manipulated.

+DA2.0W

Changes the default size of a Cray pointer. It becomes the larger of a default-sized integer and a C pointer (8 bytes), rather than the 4-byte default.

+DC

Specifies cache values to the compiler. The syntax is +DC <a>:<b>:<c>, where <a> is the total cache size in bytes, <b> is the cache line size in bytes, and <c> is the cache associativity.

+DC7200

+DC7200 performs memory hierarchy optimizations for the PA7200 processor.

+DDdatamodel

Generates code using either the ILP32 or LP64 data model. Defined values for datamodel are:

+DD32 uses the ILP32 data model. The sizes of the int, long, and pointer data types are 32 bits.

+DD64 uses the LP64 data model. The size of the int data type is 32 bits, and the sizes of the long and pointer data types are 64 bits. Defines __LP64__ to the preprocessor.

The default is +DD32.

+[no]demand_load

+demand_load causes the output file from the linker to be marked demand load. When a process is marked demand load, its pages are brought into physical memory only when they are accessed. The default, +nodemand_load, causes the output file from the linker not to be marked demand load. The -q option performs the same function as +demand_load, and the -Q option performs the same function as +nodemand_load.

+dfname

Specifies the profile database file name for profile-based optimizations. +df requires the specification of +P.

This option is a synonym for +Oprofile=use:filename. This option is only valid on the Itanium® processor family architecture. The default is flow.data if a name is not specified.


+[no]dlines

+dlines treats source lines with a “D” or “d” in column 1 as statements to be compiled. The default, +nodlines, treats lines beginning with “D” or “d” in column 1 as comments.

The +dlines option must be used only with source files in fixed-format.
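The column-1 rule for +dlines can be modelled in a few lines of Python. This is a simplified sketch with a hypothetical function name: it looks only at column 1 of a fixed-format line and ignores every other way a line can be a comment or be blank.

```python
def is_compiled(line, dlines=False):
    """Classify a fixed-format source line by the character in column 1 only."""
    first = line[:1]
    if first in ("C", "c", "*", "!"):
        return False      # ordinary fixed-format comment indicators
    if first in ("D", "d"):
        return dlines     # debug line: a statement only under +dlines
    return True


debug_line = "D     PRINT *, 'I =', I"

print(is_compiled(debug_line))                # False (default, +nodlines)
print(is_compiled(debug_line, dlines=True))   # True  (+dlines)
print(is_compiled("      X = X + 1"))         # True
```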

+DOosname

+DOosname sets the target operating system for the compiler, and is intended for enabling optimizations that are not backward compatible. On PA-RISC-based systems, the recognized values for osname are 11.0 and 11.0EP9806. On Itanium®-based systems, the recognized values for osname are 11.20, 11.22, and 11.23. +DOosname can be used at any level of optimization. The default value for osname is the operating system version of the computer on which the compiler is invoked.

+dryrun

The +dryrun option generates sub-process information for a given f90 command line without running the sub-processes.

It is useful in the development process to obtain command lines of compiler sub-processes to run the commands manually or to use them with other tools.

+DSmodel

+DS specifies an instruction scheduling algorithm for a particular implementation of the PA-RISC architecture, as specified by model. model can be one of the following:

• A PA-RISC version number (1.1 or 2.0)

• A model number (for example, 750 or 870)

• One of the PA-RISC processor names (for example, PA7000, PA7100, or PA8000)

• Itanium® values as defined below.

For example, specifying +DS750 performs instruction scheduling tuned for one implementation of PA-RISC 1.1. Specifying +DS2.0 or +DS1.1 performs scheduling for a representative PA-RISC 2.0 or 1.1 system, respectively. To improve performance on a particular model, use +DS with that model number.

Object code with scheduling tuned for a particular model will execute on other systems, although possibly less efficiently.

If you do not use this option, the compiler uses the argument specified with the +DA option. If you use neither +DS nor +DA, the default instruction scheduling is for the system on which you are compiling.

On Itanium®-based architectures, the defined values for model are:

blended     Tune for best performance on a combination of processors (that is, Itanium® or Itanium® 2 processors).

itanium     Tune for best performance on an Itanium® processor.

itanium2    Tune for best performance on an Itanium® 2 processor.

native      Tune for best performance on the processor on which the compiler is running.

-dynamic

-dynamic is used to generate dynamically-bound executables.

+[no]escape

+escape treats the backslash character (\) as a C-like escape character. The default, +noescape, treats the backslash character as a literal character.

+[no]es

+[no]es is similar to +[no]extend_source except that character literals and Hollerith constants continued across a line boundary are not padded. This option provides compatibility with FORTRAN 77’s +es option.

-exec

Indicates that any object files created will be used to create an executable file. Constants with a protected or hidden export class are placed in the read-only data section. This option also implies -Bprotected_def.


+[no]extend_source

+extend_source allows extended source lines, which may contain up to 254 characters. The default, +noextend_source, restricts fixed-format source lines to 72 characters and free-format source lines to 132 characters.

Programs that depend on the compiler’s ignoring characters past column 72 will not compile correctly with the +extend_source option.

+externals=file

Specifies a file that contains a list of procedure names to be considered external as opposed to intrinsic. Using this option causes the routines whose names appear in file to be treated as if they were declared with the EXTERNAL attribute.

+f2003

+f2003 enables Fortran 2003 features supported by the compiler.

+fastallocatable

+fastallocatable enables a different representation for allocatable arrays in the object code produced by the compiler. This alternate representation avoids problems in optimizing code containing allocatable array references. Additionally, this alternate representation for allocatable arrays is binary compatible with the old representation.

+fltconst_strict

Uses a strict power table for converting floating constants with exponents. Use of the strict power table provides a more accurate floating-point representation.

+FPflags

+FP initializes the flags that specify how run-time floating-point exceptions should be trapped; uppercase flags enable traps, lowercase flags disable traps. The flags can be concatenated to specify a desired behavior and may not contain spaces or tabs. Valid values for flags are identified in Table 10.

By default, all traps are disabled. However, you can specifically disable a behavior either by excluding the upper-case letter from flags or by including the equivalent lower-case letter (v,z,o,u,i,d) in flags. For example, the following command lines are equivalent:

$ f90 +FPvZI test.f90

$ f90 +FPZI test.f90
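The equivalence of the two command lines follows from the rule that only upper-case letters enable a behavior. The Python sketch below models that rule; the function name is hypothetical, the letter set is taken from the list (v,z,o,u,i,d) above, and what happens when a letter appears in both cases is deliberately left out of the model.

```python
LETTERS = "VZOUID"


def enabled_behaviors(flags):
    """Return the set of upper-case +FP letters that a flags string enables."""
    for ch in flags:
        if ch.upper() not in LETTERS:
            raise ValueError("not a +FP flag letter: " + repr(ch))
    return {ch for ch in flags if ch.isupper()}


print(sorted(enabled_behaviors("vZI")))                       # ['I', 'Z']
print(enabled_behaviors("vZI") == enabled_behaviors("ZI"))    # True
print(enabled_behaviors("") == set())                         # True: all disabled
```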

If you are using PA1.1 libraries, you can dynamically change these settings at run time by using the fpsetdefaults or fpsetmask routines. For more information about these routines, see the fpgetround(3M) man page and the HP-UX Floating-Point Guide.

Enabling sudden underflow may cause the same program to compute different results on different implementations of the PA-RISC 1.1 and 2.0 architectures. This is because some hardware implementations have sudden underflow available, while others do not. The +FPD option enables the hardware to flush denormalized values to zero, but it does not require that it do so.

Table 10 Values for the +FP option

Value   Meaning

V       Trap on invalid floating-point operations. Examples of invalid floating-point operations include the following:

        • Arithmetic operation on NaNs

        • Operations such as (+inf) + (-inf) and (+inf) - (+inf)

        • Multiplication of 0 and infinity

        • Division operations 0/0 and inf/inf

        • Certain floating-point remainder operations

        • Square root of a negative value

        • Certain kinds of comparisons of unordered values

Z       Trap on floating-point divide by zero.

O       Trap on floating-point overflow.

U       Trap on floating-point underflow.

I       Trap on floating-point operations that produce inexact results. Inexact result traps may occur whenever roundoff is necessary to produce the result. For example, the fraction 1.0/3.0 produces an inexact trap because there is no exact floating-point representation for this fraction.

D       Enable sudden underflow (flush to zero) of denormalized values on those PA-RISC systems greater than version 1.0 that have implemented sudden underflow. (That is, +FPD enables sudden underflow only if it is available on the processor that is used at run time.) Denormalized values are those values whose absolute value is very close to zero. For IEEE single precision data types, the largest denormalized value is approximately equal to 2^-126. For IEEE double precision data types, such values are approximately equal to 2^-1022. Sudden underflow will cause some floating-point applications to run faster, with a possible loss of numerical accuracy on numbers very close to zero.
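The magnitudes quoted for denormalized values can be checked with ordinary IEEE arithmetic. The following Python sketch is independent of HP Fortran and of +FPD; it assumes only that Python floats are IEEE double precision, and it uses the struct module to round a value to IEEE single precision. The variable and function names are illustrative.

```python
import struct
import sys


def to_single(x):
    """Round a Python float to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Single precision: smallest normalized value and largest denormalized value.
smallest_normal_single = 2.0 ** -126
largest_denormal_single = (1 - 2.0 ** -23) * 2.0 ** -126

# Double precision: the same two quantities.
smallest_normal_double = 2.0 ** -1022
largest_denormal_double = (1 - 2.0 ** -52) * 2.0 ** -1022

# The largest denormalized single survives a round trip through 32 bits.
print(to_single(largest_denormal_single) == largest_denormal_single)  # True
# It lies just below 2**-126, hence "approximately equal to".
print(largest_denormal_single < smallest_normal_single)               # True
# Python's own report of the smallest normalized double agrees.
print(smallest_normal_double == sys.float_info.min)                   # True
```

With sudden underflow in effect on hardware that supports it, values below these normalized limits may be flushed to zero instead of being represented as denormalized numbers.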

+[no]fp_exception

+fp_exception is a compile-time flag. It causes a descriptive message and a procedure traceback to be issued to standard error when the HP-UX signals listed in Table 11 are generated.

For a description of these signals, see signal(2) and signal(5) in the HP-UX Reference. For information about floating-point exceptions and error handling, see the HP-UX Floating-Point Guide.

You can also use the ON statement to write your own trap procedures. For information about the syntax of the ON statement, see “Using the ON statement” (page 81).

The default, +nofp_exception, disables traceback information.

Table 11 Signals recognized by the +fp_exception option

Signal     Meaning

SIGILL     Illegal instruction

SIGFPE     Floating-point exception

SIGBUS     Bus error instruction

SIGSEGV    Segmentation violation

SIGSYS     Bad argument to system call

-g

-g causes the compiler to generate information for use by the HP WDB debugger. The -g option can be used to prepare code for debugging that has been compiled with the optimization options -O, -O1/+O1, and -O2/+O2, but not -O3/+O3 or higher.

-G

-G performs the same function as +gprof; see +[no]gprof.

+getarg0 and +getarg1

+getarg0 and +getarg1 control the behavior of the getarg intrinsic subroutine. +getarg0 requests the industry standard behavior for getarg, where an index value of zero causes the program name to be returned. HP’s FORTRAN 77 getarg intrinsic also implements this industry standard convention. +getarg1 is used to request non-standard behavior, where an index value of one causes the program name to be returned (older releases of HP Fortran behaved in this manner). The default is +getarg0.

+gformat77

+gformat77 requests the FORTRAN 77 style of formatting a value of zero with the G edit descriptor. Fortran 90 uses an F edit descriptor when the value being written is zero, while FORTRAN 77 uses an E edit descriptor.

+[no]gprof

+gprof prepares object code files for profiling with gprof. The default is +nogprof. gprof is provided as part of the “HP-UX General Programming Tools” product; see gprof(1). The -G option can be used to perform the same function as +gprof.

+hugecommon

+hugecommon instructs the compiler to place the specified COMMON block into a huge data segment. The format for this option is:

+hugecommon=name

where name is the name of a COMMON block. By default, only COMMON blocks larger than 2 gigabytes are placed into huge data segments.

For example:

Example 4

% f90 +hugecommon=results pcvals.f90

places the COMMON block named results into a huge data segment. +hugecommon is especially useful when a program contains several different COMMON blocks that together occupy more than two gigabytes but individually occupy less than two gigabytes. In this situation, the largest COMMON blocks could be placed in a huge data segment when the program is compiled by specifying their names in multiple +hugecommon options.

If a common block is specified as huge in one object file, it must be specified huge in all object files. If it is not, the program will fail to link.

NOTE: PA2.0W objects cannot be combined with 32-bit object files. 64-bit applications will only execute on PA8000-based systems.

+hugesize

+hugesize instructs the compiler to place COMMON blocks that are larger than the specified size into a huge data segment. The format for this option is:

+hugesize=n

where n is the size in kilobytes (1024 bytes). The default is to place COMMON blocks larger than two gigabytes (2147483648 bytes) into huge data segments; that is, +hugesize=2097152 is the default.

For example:

% f90 +hugesize=1024 hello.f90

specifies that COMMON blocks larger than 1048576 bytes (1 megabyte) should be placed into a huge data segment.

If a common block is specified as huge in one object file, it must be specified huge in all object files. If it is not, the program will fail to link. PA2.0W objects cannot be combined with 32-bit object files. 64-bit applications will only execute on PA8000-based systems.

NOTE: The +hugesize and +hugecommon options are not valid on HP Itanium®-based systems.
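Because +hugesize takes its argument in kilobytes while the text quotes thresholds in bytes, the conversion is worth checking. The Python sketch below reproduces the arithmetic; the function names are hypothetical, and the comparison is strict because the guide says "larger than".

```python
KILOBYTE = 1024
DEFAULT_HUGESIZE_KB = 2097152   # the documented default for +hugesize=n


def threshold_bytes(n_kb=DEFAULT_HUGESIZE_KB):
    """Convert a +hugesize argument (kilobytes) to a threshold in bytes."""
    return n_kb * KILOBYTE


def placed_in_huge_segment(block_bytes, n_kb=DEFAULT_HUGESIZE_KB):
    """A COMMON block qualifies only if it is larger than the threshold."""
    return block_bytes > threshold_bytes(n_kb)


print(threshold_bytes())        # 2147483648 bytes, that is, two gigabytes
print(threshold_bytes(1024))    # 1048576 bytes, that is, one megabyte
print(placed_in_huge_segment(1_500_000, n_kb=1024))
```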

+i2

+i2 changes 4-byte integer and logical constants, intrinsics, and user variables to 2-byte integers (rather than the 4-byte default).

+i8

+i8 changes 4-byte integer and logical constants, intrinsics, and user variables to 8-byte integers (rather than the 4-byte default).

+ild

Specifies incremental linking. This option is valid only for 64-bit links. If the output file does not exist, or if it was created without the +ild option, the linker performs an initial incremental link. The output file produced is suitable for subsequent incremental links. The incremental link option is valid for both executable and shared library links. It is not valid for relocatable links, options (or tools) that strip the output module, and certain optimization options. See ld(1) for more information.

+ildrelink

Performs an initial incremental link, regardless of the output load module. This option is valid only for 64-bit links.

In certain situations during incremental linking (for example, internal padding space is exhausted), the incremental linker is forced to perform an initial incremental link. The +ildrelink option allows you to avoid such unexpected initial incremental links by periodically rebuilding the output file.

-Idirectory

-I specifies a directory where .mod files and files named in the INCLUDE line or in #include directives may be found if their name is a relative pathname, that is, one that does not begin with a slash (/). Directories are searched in the following order:

• The current source directory, that is, the directory containing the file with the INCLUDE line or #include

• Directories specified by the -I option, in the order specified

• The current working directory

• The /usr/include directory
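The search order listed above can be expressed as a first-match lookup over an ordered list of directories. The Python sketch below is only a model of that order, with a hypothetical function name and parameters; it does not reproduce how the compiler itself locates .mod or include files.

```python
import os
import tempfile


def resolve_include(name, source_dir, include_dirs, cwd, system_dir="/usr/include"):
    """Return the first existing match for name, following the documented order."""
    if name.startswith("/"):
        # Absolute pathnames are not searched for.
        return name if os.path.isfile(name) else None
    for directory in [source_dir, *include_dirs, cwd, system_dir]:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Build a throwaway tree: params.h exists in both places, only_inc.h in one.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src")
    inc = os.path.join(root, "inc")
    os.makedirs(src)
    os.makedirs(inc)
    for directory in (src, inc):
        open(os.path.join(directory, "params.h"), "w").close()
    open(os.path.join(inc, "only_inc.h"), "w").close()

    first = resolve_include("params.h", src, [inc], root)
    second = resolve_include("only_inc.h", src, [inc], root)

print(os.path.dirname(first) == src)    # True: the source directory wins
print(os.path.dirname(second) == inc)   # True: found through the -I directory
```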

+[no]implicit_none

+implicit_none forces the types of identifiers to be implicitly undefined. This is equivalent to specifying IMPLICIT NONE for each program unit in each file in the files list. The source code that is to be compiled with this option may contain other IMPLICIT statements; the statements will be honored. The default, +noimplicit_none, allows identifiers to be implicitly defined.

+indirectcommonlist=file

The common blocks listed in file (one per line, without enclosing slashes) are treated as shared common blocks, but are not attached. The user must attach or otherwise allocate storage for such common blocks before they are referenced.

A C language program would typically be used to either attach a shared memory segment, or malloc a block of memory, and store that address into the external symbol for the common block. All Fortran code that references such a common block will indirect through the address in the external symbol for that indirect common block.

All source files that reference variables in such a common block must be compiled with the +indirectcommonlist flag, and that common block name must appear in the named file.

+initheap_complex=rval:ival

Default initialization of heap variables (those selected by the +initheap_set option) of type complex. rval is the real part and ival is the imaginary part of the complex number. Valid values for rval and ival are normal real numbers, nan, and snan.

The numbers represented by rval and ival are listed in the following table.

Table 12

Val     Num

nan     Quiet NaN

snan    Signaling NaN

snan is supported at the default optimization level only.

+initheap_integer=ival

Default initialization of heap variables (those selected by the +initheap_set option) of type integer. Valid values of ival are integer values in decimal, octal (starting with 0), or hexadecimal (starting with 0x) form.

+initheap_real=rval

Default initialization of heap variables (those selected by the +initheap_set option) of type real. Valid values for rval are normal real numbers, nan, and snan.

Table 13

Val     Num

nan     Quiet NaN

snan    Signaling NaN

snan is supported at the default optimization level only.

+initheap_set=var

Default initialization of heap variables is done for the variables specified by var.

The defined values for var are:

Table 14

all            all heap variables

allocatable    variables having allocatable attribute

common         variables in common block

module         variables in module

none           none of the heap variables

save           variables having save attribute

A combination of different types of heap variables can be specified by colon-separated values, for example +initheap_set=common:module:save.

The default is +initheap_set=none.

+io77

Suppresses the generation of the optional leading “0” before the decimal point for real numbers printed with the E and F edit descriptors. Fortran 77 suppressed these leading zeros unless the NOSTANDARDIO flag was used.

-ipo

This option is only valid for Itanium processor family architectures. Enables interprocedural optimizations across files. Object files produced using this option contain intermediate code in IELF format. At link time, ld automatically invokes the interprocedural optimizer u2comp if any of the input object files is an IELF file. This option is ignored at optimization levels +O1 and +O2. It is enabled by default when +O4 or +Ofaster is used.

+k

+k generates code for programs that reference a very large number of shared data items. The linker will issue a diagnostic message in the rare case when this option is needed. By default, the compiler generates short-displacement code sequences for programs that reference global data in shared libraries. For nearly all programs, this is sufficient.

NOTE: The +k option is not valid on HP Itanium®-based systems.

+nocheckuf

+nocheckuf disables the OPEN statement error check for opening text files with ACCESS="sequential", FORM="unformatted". This option is useful only when BUFFERIN/BUFFEROUT statements will be used to access the opened unit. The main program must be compiled with this option for it to have any effect, and all OPEN statements will then skip this error check.

+nolibs

Disables the passing of libraries to the linker. This is useful when symbols are resolved at run time through dynamically loaded libraries.

+nopadsharedcommon

Does not pad shared common blocks to a multiple of 8 bytes. This option is useful when sharing shared common blocks between f77 programs and f90-generated programs. All source files referencing the same shared common block must be compiled with the same setting of this flag.

-O[n]

-O invokes the optimizer, where n is the level of optimization, 0 - 4. The default is optimization level 2.

+On

+O invokes the optimizer, where n is the level of optimization, 0 - 4. The -g option is compatible with the +O0, +O1, and +O2 options.

Table 15 lists and describes the different levels of optimization.

NOTE: See the Parallel Programming Guide for HP-UX Systems for a detailed description of optimization levels and methods.

+O[no]optimization

+O[no] options enable or disable specific optimizations or classes of optimizations (for example, optimizations that affect compilation time). For detailed information about +O[no]optimization, see “Using optimization options” (page 38).


Table 15 Levels of optimization

Level   Optimizations

0       Local optimizations, including constant folding and partial evaluation of test conditions.

1       Peephole optimizations, including:

        • Basic block optimizations

        • Branch optimizations

        • Instruction scheduling

2       Optimizations performed at level 1, plus the following:

        • Coloring register allocation

        • Induction variables and strength reduction

        • Common subexpression elimination

        • Loop invariant code motion

        • Store/copy optimization

        • Unused definition elimination

        • Dataflow analysis

        • Software pipelining

        • Scalar replacement

        • Sum reduction optimization

3       Optimizations performed at levels 1 and 2, plus the following:

        • Interprocedural optimizations, including cloning and inlining

        • Loop transformations to improve memory performance, including fusion and interchange

4       Level 4 optimizations are not currently supported by the PA-RISC compiler. If +O4 is specified, the compiler will issue a warning message and compile at optimization level 3.

        For Itanium® applications, performs level 3 optimizations as well as interprocedural optimizations across translation units (link-time optimizations). Object files generated at this level contain an intermediate representation of the user code and are intended to be temporary files. These intermediate object files are not guaranteed to be compatible from one version of the compiler to the next. This requires concurrent use of the +Oprofile=use option.

Decrease the optimization to optlevel for one or more specified routines. This option can occur at optimization levels 1, 2, 3, or 4. The option cannot raise the optimization level above what was specified in the +Ooptlevel option. This option can occur multiple times, possibly with different values of optlevel.

-ooutfile

-o names the executable file outfile rather than the default name of a.out. If not specified, a.out will be overwritten if it exists, or created if it does not. The outfile name must not end with .f, .f90, .F, .i, or .i90. Also, it must not begin with + or -. When using -c and -o together, you may specify only one source file on the command line; the resulting object file is renamed.

+[no]objdebug

This option, when used with any of the -g options, will cause the debug information to be left in the object files instead of being placed in the a.out. This will bypass the pxdb step and the links will be faster. This is available only for wdb(1) or gdb(1). +objdebug is the default at compile time. +noobjdebug can be used at link time to continue to place the debug info into the a.out, even if some objects were compiled with +objdebug.


NOTE: Object files and archive libraries must be accessible to the debugger if the +objdebug method of storing debug information is used.

+[no]onetrip

+onetrip generates code that executes any DO loop at least once. In accordance with the language standard, HP Fortran will not execute a DO loop if either of the following conditions is true:

• The increment value is greater than zero, and the initial value is greater than the limit.

• The increment value is less than zero, and the initial value is less than the limit.

However, older implementations of Fortran (for example, some FORTRAN 66 processors) always execute a DO loop at least once. The +onetrip option provides compatibility with those nonstandard implementations.

The default is +noonetrip.
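The two conditions above are what the Fortran standard's iteration-count formula, MAX(INT((limit - initial + increment) / increment), 0), produces as a count of zero. The Python sketch below evaluates that formula and models +onetrip as raising a zero count to one; the function name is hypothetical, and the assumption that a loop skipped by the standard runs exactly once under +onetrip goes beyond the guide, which says only "at least once".

```python
def trip_count(initial, limit, increment=1, onetrip=False):
    """Iteration count of DO I = initial, limit, increment (integer arguments)."""
    if increment == 0:
        raise ValueError("the increment must not be zero")
    numerator = limit - initial + increment
    # INT() truncates toward zero, unlike Python's floor division.
    quotient = abs(numerator) // abs(increment)
    if (numerator < 0) != (increment < 0):
        quotient = -quotient
    count = max(quotient, 0)
    # Model of +onetrip: a loop the standard would skip still runs once.
    return max(count, 1) if onetrip else count


print(trip_count(1, 10))                 # 10
print(trip_count(5, 1))                  # 0: increment > 0 and initial > limit
print(trip_count(1, 5, -1))              # 0: increment < 0 and initial < limit
print(trip_count(5, 1, onetrip=True))    # 1
```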

Generate code for prof(1) profiling. The has the same functionality as the +prof option. The default is +noprof.

Optimize the application based on profile data found in the database fileflow.data , produced by compilation with +Ild(1), +I, and +df, for more details. The +P option is incompatible with the +I and -Soptions. It is incompatible with the -g option only during compile time.

NOTE: This option is only valid on Itanium®-based systems.

+pa1 compiles an application for routine-level profiling.+pa1 +pa is ignored when the HP Fortran compiler generates position-independent code

(PIC). The following options cause+pa to be ignored: +pic=short, +pic=long, +zand +Z.

+[no]parallel_intrinsics

Link with the parallel version of some intrinsics in the libF90_parallel library. The default is +Onoparallel_intrinsics.

+pic={short|long|no}

+pic generates object code that can be added to a shared library. Object code generated with this option is position-independent code (PIC). All addresses are either pc-relative or indirect references.

The argument (short or long) specifies the allocated size of the data linkage table. Normally you would specify +pic=short to generate PIC. Use +pic=long when the linker issues an error message indicating data linkage table overflow. Specifying +pic=long causes the compiler to allocate additional space for more imported symbols.

+[no]ppu

+ppu appends underscores to external names, including subroutines, functions, and common blocks (for example, int_sum_ rather than the default int_sum).

For Itanium® and PA-RISC 64-bit architectures, the default is +ppu. For PA-RISC 32-bit architecture, the default is +noppu.

NOTE: Mixed-language programs are affected by the +ppu option. C language references to Fortran routines and COMMON blocks require a trailing underscore when the Fortran code is compiled with +ppu. +noppu may be used in 64-bit mode to avoid trailing underscores.

+pre_include=file

+pre_include causes the compiler to prepend the code in file before any compilation occurs. This option can appear more than once on the same command line, each time specifying a different file.

+[no]prof

+prof prepares object files for profiling with prof. The default is +noprof. The -p option can be used to perform the same function as +prof. prof is provided as part of the “HP-UX General Programming Tools” product (see prof(1)).
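The effect of +ppu on external names, described above, can be modeled with a few lines of Python. This is only a sketch of the naming rule, not of the compiler: the function external_name is a hypothetical helper, the uppercase flag stands for the +uppercase option described later in this chapter, and the way the two options combine is an assumption.

```python
def external_name(name, ppu=False, uppercase=False):
    """Return a model of the external (linker-visible) name of a Fortran entity."""
    # By default external names are converted to lowercase; +uppercase
    # selects uppercase instead.
    symbol = name.upper() if uppercase else name.lower()
    # +ppu appends one trailing underscore.
    if ppu:
        symbol += "_"
    return symbol


for flags in ({}, {"ppu": True}, {"uppercase": True}):
    print(flags, external_name("Int_Sum", **flags))
```

A C caller has to use whatever name this rule produces, which is why C references need the trailing underscore when the Fortran code is compiled with +ppu.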

-q

Mark output from the linker demand load. This has the same functionality as the +demand_load option. The default is +nodemand_load.

-Q

Mark output from the linker no demand load. This has the same functionality as the +nodemand_load option (the default).

+r8

+r8 changes 4-byte real constants, intrinsics, and user variables to 8-byte reals (rather than the 4-byte default).

-R4

Set the precision of real and complex constants to single-precision. This has the same functionality as the +real_constant=single option (the default).

-R8

Set the precision of real and complex constants to double-precision. This has the same functionality as the +real_constant=double option. The default is +real_constant=single.
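To see why the precision of real constants matters, the following Python sketch compares the value a constant such as 0.1 takes when stored as a 4-byte real and as an 8-byte real. It assumes IEEE 754 single and double formats; the helper to_single is a hypothetical name, and Python's own float plays the role of the 8-byte real.

```python
import struct


def to_single(value):
    """Round a Python float (8-byte) to the nearest 4-byte IEEE real."""
    return struct.unpack("f", struct.pack("f", value))[0]


constant = 0.1
as_single = to_single(constant)   # what a 4-byte real constant holds
as_double = constant              # what an 8-byte real constant holds

print(repr(as_single))
print(repr(as_double))
print(abs(as_single - as_double))  # representation error of the 4-byte form
```

Values such as 0.5 that are exactly representable in both formats are unaffected; the difference shows up only for constants that must be rounded.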

+real_constant={single|double}

+real_constant=single treats all single-precision numerical constants as single-precision, and the +real_constant=double option treats all single-precision numerical constants as double-precision. The default is +real_constant=single.

The -R4 and -R8 options can be used to perform the same function.

-s

Strip symbol table information from linker output. This has the same functionality as the +strip option. The default is +nostrip.

-S

Generate assembly listings for the specified source files. This has the same functionality as the +asm option. The default is +noasm.

+[no]save

+save forces static storage for all local variables. This option provides a convenient path for porting older Fortran programs that may depend on static allocation of memory. (Variables in static storage retain their values between invocations of the program units in which they are declared.) The +save option causes all uninitialized variables to be initialized to zero. The default is +nosave.

If you explicitly declare a variable with the AUTOMATIC attribute, the attribute overrides the +save option.

The +save command-line option inhibits many of the optimizations performed by the compiler. Generally, you will get better performance with the +Oinitcheck option, which also sets uninitialized variables to zero but is more selective than +save; see “Using optimization options” (page 38).

+[no]shared

+noshared causes the output file from the linker to be marked unshared. The default, +shared, is to mark the output file as shared.

The -n option performs the same function as +shared, and the -N option performs the same function as +noshared.

+[no]sharedlibF90

+sharedlibF90 allows users to link the shared version of libF90 or libF90_parallel from /usr/lib. This resolves potential issues with the Fortran 90 driver trying to link with the shared versions of libF90. The default is +nosharedlibF90.

+[no]sharedlibIO77

Link with the shared [archive] version of the libIO77 library. The default is to link with the archive version of libIO77. This option is only supported on the Itanium® platform.

+[no]sharedlibU77

Link with the shared [archive] version of the libU77 library, when combined with +U77. The default is to link with the archive version of libU77. This option is only supported on the Itanium® platform.

+[no]signedzero

+[no]signedzero enables [disables] signed-zero support. When enabled, a floating-point value of negative zero that appears as a formatted output list item is represented in the output record with a leading “-”. This option also changes the behavior of the SIGN intrinsic. The default is +signedzero.

+source={fixed|free|default}

+source tells the compiler that source files are in either fixed or free form. The default (+source=default) is free form for .f90 source files and fixed form for .f and .F source files.

+[no]srcpos

Control the generation of source position information for the Caliper analysis tool. When +srcpos is in effect (the default), the compiler generates this information.

When +nosrcpos is in effect, the compiler does not generate this information, and the compiler will instruct the linker to discard any of this information encountered in object files. This option is only supported on Itanium® platforms.
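The signed-zero behavior controlled by +[no]signedzero, described earlier in this group of options, can be sketched in Python. The text does not spell out how the SIGN intrinsic changes, so the model below assumes the usual rule that SIGN(a, b) honors the sign bit of a negative zero only when signed-zero support is on. Both function names are hypothetical.

```python
import math


def sign_intrinsic(a, b, signedzero=True):
    """Model SIGN(a, b): the magnitude of a with the sign of b."""
    if signedzero:
        # The sign bit of b is honored, so -0.0 counts as negative.
        return math.copysign(abs(a), b)
    # Assumption: without signed-zero support, -0.0 is treated like +0.0.
    return -abs(a) if b < 0 else abs(a)


def formatted(value, signedzero=True):
    """Model F5.1-style output of a value that may be negative zero."""
    if value == 0 and not signedzero:
        value = 0.0          # drop the sign of a negative zero
    return "%5.1f" % value


print(formatted(-0.0, signedzero=True))
print(formatted(-0.0, signedzero=False))
print(sign_intrinsic(1.0, -0.0, signedzero=True))
print(sign_intrinsic(1.0, -0.0, signedzero=False))
```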

+[no]strip

+strip causes the linker to strip symbol table information from the executable program. This option is incompatible with the -g option. The default is +nostrip. The -s option can be used to perform the same function as +strip.

-tx,path

-t looks in path for the subprocess identified by x and substitutes it for the default subprocess. x can be one or more identifiers indicating the subprocesses.

This option works in two modes:

• If x is a single identifier and path ends with a slash (/), path represents the directory with the new subprocess, and the name of the subprocess is the standard name. If path ends in a filename, it is the name of the subprocess.

• If x is a set of identifiers, path is a directory that holds the subprocesses identified in x. The subprocesses in path have their standard names.

Table 16 lists the identifiers for x, the subprocesses each indicates, and the standard subprocess name.

The following example of the -t option tells the compiler to pass the source files to the K&R version of the C preprocessor for preprocessing:

-tp,/usr/ccs/lbin/cpp
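The two modes of the -t option can be made concrete with a small Python model. This is a sketch of the rules as stated above, not of the driver itself: resolve_t_option is a hypothetical helper, /opt/tools is a made-up directory, and the identifier for the start-up files is left out because it is not legible in the table of identifiers.

```python
# Identifier-to-standard-name pairs taken from the table of -t identifiers.
STANDARD_NAMES = {"a": "as", "c": "f90com", "e": "end.o", "l": "ld", "p": "cpp"}


def resolve_t_option(option):
    """Map each identifier in a -tx,path option to the file it selects."""
    if not option.startswith("-t") or "," not in option:
        raise ValueError("expected an option of the form -tx,path")
    identifiers, path = option[2:].split(",", 1)
    if len(identifiers) == 1 and not path.endswith("/"):
        # Single identifier and a path that names a file: use it as is.
        return {identifiers: path}
    # Otherwise path is a directory holding the standard names.
    directory = path if path.endswith("/") else path + "/"
    return {x: directory + STANDARD_NAMES[x] for x in identifiers}


print(resolve_t_option("-tp,/usr/ccs/lbin/cpp"))
print(resolve_t_option("-tp,/usr/ccs/lbin/"))
print(resolve_t_option("-tal,/opt/tools"))
```

The first two calls select the same preprocessor, once by file name and once by directory plus standard name.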

Table 16

Value   Subprocess        Standard name
a       Assembler         as
c       Compiler          f90com
e       Debug file        end.o
l       Linker            ld
p       C preprocessor    cpp
        Start-up files    crt0.o, gcrt0.o, mcrt0.o

+trigdacc

The compiler will use cosd, sind, and tand intrinsics that are more accurate than the standard ones. The standard intrinsics have some accuracy problems with boundary cases (for example, cosd(90) needs to be 0.0); the new intrinsics get these cases right.

NOTE: The +trigdacc option is not valid on HP Itanium®-based systems.

+[no]ttybuf

+ttybuf controls tty buffering, using buffered output. +nottybuf uses unbuffered output. The default is buffered output (+ttybuf). The +ttybuf option forces buffered output even on systems whose default is unbuffered output.

The +[no]ttybuf option is recognized only when the main program is a Fortran program. If the main program is written in another language, use the TTYUNBUF environment variable (see f90(1)).

-Uname

-U undefines or removes any initial definition of name in the C preprocessor (cpp). See cpp(1) in the HP-UX Reference for details.
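The boundary-case problem that +trigdacc addresses, described above, is easy to reproduce in any language that converts degrees to radians before calling the cosine routine. The Python sketch below is not HP's implementation; both function names are hypothetical, and the exact-value table is just one simple way to get the boundary cases right.

```python
import math


def cosd_naive(degrees):
    """Cosine of an angle in degrees via a plain degrees-to-radians conversion."""
    return math.cos(math.radians(degrees))


def cosd_exact_at_boundaries(degrees):
    """Cosine in degrees that returns exact values at multiples of 90."""
    reduced = math.fmod(degrees, 360.0)
    if reduced < 0:
        reduced += 360.0
    exact = {0.0: 1.0, 90.0: 0.0, 180.0: -1.0, 270.0: 0.0}
    if reduced in exact:
        return exact[reduced]
    return math.cos(math.radians(reduced))


print(cosd_naive(90.0))                 # typically a tiny nonzero value
print(cosd_exact_at_boundaries(90.0))
```

The naive version is inexact at 90 degrees because pi/2 cannot be represented exactly in binary floating point.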

+[no]U77

The +U77 option invokes support for the BSD 3F library, libU77, which provides an HP Fortran interface to some of the libc system routines. To call routines in this library, you must compile and link with +U77. For information about these routines, see the HP Fortran Programmer’s Reference. ... of an unsatisfied symbol. If the libU77 name is the same as a libc name, the name might resolve to a libc name. This situation does not cause an error at compile time, but can produce unpredictable results.

+[no]uppercase

+uppercase uses uppercase for external names. The default, +nouppercase, is to convert external names to lowercase.

If you need to control the case of specific names, use the $HP$ ALIAS directive, as described in “$HP$ ALIAS” (page 124).

+usage

+usage lists and briefly describes all of the command-line options currently supported by the HP Fortran compiler. No compile occurs.

-v

-v enables the verbose mode, producing a step-by-step description of the compilation process on the standard error output.

+version

+version displays compiler version information only; no compilation occurs.

-w

-w suppresses warning messages. If this option is omitted, warnings are sent to standard error.

+what

+what prints the what string for the Fortran 90 driver, providing version and patch numbers.

-Wx,arg1,arg2,...,argN

-W causes arg1 through argN to be handed off to subprocess x. Each arg takes the form:

-option[,value]

where option is the name of an option recognized by the subprocess and value is a separate argument to option, where necessary. The values that x can assume are listed in Table 17.

For example, the following option tells the linker to print a trace of each input file as ld processes it:

-Wl,-t

The next example passes the -a shared option to the linker, causing it to select shared libraries for linking.

-Wl,-a,shared
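How a -W option breaks down into a target subprocess and its argument list can be shown with a short Python model. It follows the syntax given above and is only a sketch: parse_w_option is a hypothetical helper, not part of the f90 driver.

```python
# Subprocess identifiers from the table of values for x.
SUBPROCESSES = {"a": "assembler", "c": "compiler", "l": "linker", "p": "C preprocessor"}


def parse_w_option(option):
    """Split -Wx,arg1,...,argN into the target subprocess and its arguments."""
    if not option.startswith("-W") or len(option) < 4 or option[3] != ",":
        raise ValueError("expected an option of the form -Wx,arg1[,arg2...]")
    target = SUBPROCESSES[option[2]]
    return target, option[4:].split(",")


print(parse_w_option("-Wl,-t"))
print(parse_w_option("-Wl,-a,shared"))
```

In the second call the comma separates the option -a from its value shared, so the linker receives them as two arguments.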

Table 17

Value   Meaning
a       Assembler
c       Compiler
l       Linker
p       C preprocessor

-Y=lang

Enable Native Language Support in the given language (lang). This has the same functionality as the +nls option.

+z

See +pic=short in this chapter. If +z is specified when creating 64-bit code, it instead maps to +Z.

+Z

See +pic=long in this chapter for a description. Note that when creating 64-bit shared executables (such as when +DA2.0W is specified), the +Z option is on by default. This is the only PIC option supported for 64-bit executables.

NOTE: To not generate position-independent code for 64-bit executables, specify the -Wl,-noshared option.

Symbol binding options

The following -B options are recognized by HP Fortran to specify whether references to global symbols may be resolved to symbols defined in the current translation unit, or whether they must be assumed to be potentially resolved to symbols defined in another load module.

All -B options are applied in left-to-right order, so options at the end of a string may supersede those that appear earlier in the string.

-Bdefault=symbol[,symbol...]

The named symbols are assigned the default export class. These symbols may be imported or exported outside of the current load module. The compiler will access tentative symbols through the linkage table. Any symbol that is not assigned to another export class through use of another -B option (or the deprecated +O[no]extern option) will have the default export class. Use -Bdefault on a per-symbol basis to specify exceptions to global -Bprotected, -Bhidden, or -Bextern options.

-Bdefault:filename

The file indicated by filename contains a list of symbols, separated by spaces or newlines. These symbols are assigned the default export class.

-Bextern[=symbol[,symbol...]]

The named symbols, or all undefined symbols if no list is provided, are assigned the default export class. Additionally, the compiler will inline the import stub for calls to these symbols. No compile-time binding of these symbols will be done. All references to these symbols will be through the linkage table, so an unnecessary performance penalty will occur if -Bextern is applied to a listed symbol that is resolved in the same load module.

-Bextern:filename

The file indicated by filename contains a list of symbols, separated by spaces or newlines. These symbols are assigned the default export class. Additionally, the compiler will inline the import stub for calls to these symbols. No compile-time binding of these symbols will be done. All references to these symbols will be through the linkage table, so an unnecessary performance penalty will occur if -Bextern is applied to a symbol that is resolved in the same load module.

-Bhidden[=symbol[,symbol...]]

The named symbols, or all symbols if no symbols are specified, are assigned the hidden export class. The hidden export class is similar to the protected export class. These symbols will not be preempted by symbols from other load modules, so the compiler may bypass the linkage table for both code and data references and bind them to locally-defined code and data symbols. In addition, hidden symbols will not be exported outside the current load module. The linker may eliminate them from a shared library, but in an executable, they remain accessible to the debugger unless +Oprocelim is also specified.

When used with no symbol list, -Bhidden implies -Wl,-aarchive_shared, causing the linker to prefer an archive library over a shared library if one is available. This can be overridden by following the -Bhidden option with a subsequent -Wl,-a option.

-Bhidden:filename

The file indicated by filename contains a list of symbols, separated by spaces or newlines. These symbols are assigned the hidden export class.

-Bprotected_data

Marks only data symbols as having the protected export class.

-Bprotected_def

This is the same as -Bprotected, but only locally-defined (non-tentative) symbols are assigned the protected export class.

-Bsymbolic

All symbols are assigned the protected export class. This is equivalent to -Bprotected with no symbol list.

Using optimization options

The options described in this section allow you to control the different optimizations that the compiler can apply to your program. These options fall into two categories:

• Options that control classes of optimization (for example, optimizations that affect code size)

• Options that control specific optimizations (for example, inlining)

The following subsections describe the options in both categories. For information about the options that control levels of optimization, see the description of the +On option on page 27. The +O[no]info option, which provides compile-time information about the optimization process, is described in the same section.

NOTE: You can insert (or remove) underscore characters in the names of any of the optimization options to improve their readability. The compiler will recognize the option name with or without underscores.

Reviewing general optimization options

The following options allow you to control how optimization affects code size, compilation time, runtime performance, and other user-visible effects. The syntax for using these options is:

+O[no]optimization

where optimization is a parameter that specifies the class of optimization to apply to your program. The different parameters are described below. The prefix no negates the effect of optimization.

Except for +Oall, +Ofast, and +Ofaster, the options do not override a specified level of optimization, nor do they imply a particular level. (The +Oall option automatically invokes the highest level of optimization. +Ofast invokes the +O2 level of optimization in addition to a few other specific flags described later, whereas +Ofaster invokes the +O3 level of optimization.) To use any of these options you must also include the +On option on the same command line, where n specifies the level at which the type of optimization is effective. Thus, if you wish to apply all optimizations available at level 3 except those that might significantly increase code size, you would use the command line:

f90 +O3 +Osize my_prog.f90

If an option is mistakenly used at a level at which the corresponding optimization is not performed, the compiler will issue a warning message.

The defaults specified in the following descriptions are in effect only at the specified optimization levels, unless stated otherwise.

+O[no]aggressive

+Oaggressive enables optimizations that can result in significant performance improvement but can also change a program’s behavior. This option is only effective at optimization level 2 or higher.

The +Oaggressive option performs optimizations invoked by the following options:

• +Oentrysched

• +Olibcalls

• +Onofltacc

• +Onoinitcheck

• +FPD

+FPD is enabled only if +Oaggressive is used on the link line.

NOTE: The +Oaggressive option is incompatible with +Oconservative. The default is +Onoaggressive.

NOTE: This option is only valid on the PA-RISC systems.

+O[no]all

+Oall performs maximum optimization, including aggressive optimizations and optimizations that can significantly increase compile time and memory usage. The +Oall option automatically invokes the highest level of optimization.

The default is +Onoall. This option has been deprecated on HP-UX version 11i and later.

+O[no]autopar

+O[no]autopar controls automatic parallelization of loops that are deemed safe and profitable by the loop parallelizer.

The auto-parallelization feature enables applications to use the idle resources on multicore or multiprocessor systems, by automatically

The +Oautopar option can be used for programs at optimization levels +O3 and above. If both +Oopenmp and +Oautopar are specified on the command line, the OpenMP directives take precedence over the +Oautopar option. The compiler auto-parallelizes only the loops that are not controlled by the OpenMP directives.

On HP 9000 systems, when +Oautopar is used with the +Oparallel option, +O4 is inhibited. If automatic parallelization is desired along with +O4, then use +Oautopar and omit +Oparallel.

To compile a program with +Oautopar support, the libcps, libomp, and libpthreads runtime support libraries must be present at compile time and at run time. This feature is available on Itanium®-based systems and HP 9000 systems.

+O[no]conservative

The +Oconservative option sets the following options:

• +Ofltacc

• +Onofltacc

• +Onomoveflops

• +Oparmsoverlap

Use +Oconservative when conservative assumptions are necessary due to the coding style, as with nonstandard-conforming programs. Note that it is incompatible with +Oaggressive.

The +Onoconservative option relaxes the optimizer’s assumptions about the target program.

The default is +Onoconservative. This option has been deprecated starting with HP-UX version 11i.

+O[no]limit

+Olimit suppresses optimizations that significantly increase compilation time or that can consume large amounts of memory at compile time. This option is only effective at optimization level 2 or higher.

The +Onolimit option allows optimizations to be performed regardless of their effect on compilation time or memory usage. The default is +Olimit.

+O[no]loop_transform

Enables [disables] the following transformations: loop unroll and jam, loop distribution, loop interchange, loop blocking, loop fusion, and loop unroll. The default is +Oloop_transform.

+O[no]size

+Osize suppresses optimizations that significantly increase code size. This option is only effective at optimization level 2 or higher. The +Onosize option permits optimizations that can increase code size. The default is +Onosize.

Fine-tuning optimization options

The following options allow you to fine-tune the optimization process by providing control over the specific techniques that the optimizer applies to your program. The syntax for using these options is:

+O[no]optimization

where optimization is a parameter that specifies an optimization technique to apply to your program. The different parameters are described below. The prefix no negates the effect of optimization.

The options do not override a specified level of optimization, nor do they imply a particular level. To use any of these options you must also include the +On option on the same command line, where n specifies the level at which the type of optimization can be performed.

For example, if you find that the optimizer is causing your program to produce different floating-point results from those produced by the unoptimized program, you could use the following command line to suppress optimizations that affect floating-point calculations:

f90 +O3 +Onomoveflops +Ofltacc my_prog.f90

If an option is mistakenly used at a level for which the corresponding optimization is not performed, the compiler will issue a warning message.

The defaults given in the following descriptions are in effect only at the specified optimization levels, unless stated otherwise.

+O[no]cache_pad_common

+Ocache_pad_common can improve program performance by padding common blocks to avoid cache collisions. Cache-line collisions occur when the difference between the addresses of two data points is a multiple of the cache size. By inserting empty space between large variables (for example, arrays), the optimizer ensures that they do not start at nearby addresses, where the possibility of a cache collision is greater. This option is only effective at optimization level 3 or higher.

Note the following precautions when using this option:

• All program modules that reference the common block must be compiled with the +Ocache_pad_common option.

• Each common block in the program should have the same layout in all program units within which it is declared. If the layouts are different, they must be fully independent; that is, they must not pass values between them.

The default, +Onocache_pad_common, disables padding.
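The collision condition stated above (two addresses collide when they differ by a multiple of the cache size) can be checked with simple arithmetic. The Python sketch below uses that simplified rule only; the cache size, array size, padding, and base address are all hypothetical values picked for the example, and real caches are more complicated.

```python
def collide(address_a, address_b, cache_size):
    """True when two addresses differ by a multiple of the cache size."""
    return (address_a - address_b) % cache_size == 0


CACHE_SIZE = 32 * 1024      # hypothetical cache size in bytes
ARRAY_BYTES = 64 * 1024     # hypothetical size of each array in a common block
PAD_BYTES = 128             # hypothetical padding inserted between the arrays

base = 0x10000              # hypothetical start address of the common block
a_start = base
b_unpadded = a_start + ARRAY_BYTES
b_padded = b_unpadded + PAD_BYTES

print(collide(a_start, b_unpadded, CACHE_SIZE))  # arrays start on colliding addresses
print(collide(a_start, b_padded, CACHE_SIZE))    # padding moves b off the collision
```

Because padding changes the offsets of variables inside the common block, every program unit that references the block has to agree on the padded layout, which is the reason for the first precaution above.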

+O[no]cxlimitedrange

Enables [disables] the use of floating point math in the compilation unit. The default is +Onocxlimitedrange.

+O[no]cross_region_addressing

Enable [disable] the use of cross-region addressing. Cross-region addressing is required if a pointer (such as an array base) points to a different region than the data being addressed. This is usually due to an offset which results in a cross-over into another region. Standard-conforming applications do not require the use of cross-region addressing. The default is +Onocross_region_addressing.

+O[no]dataprefetch

+Odataprefetch causes the optimizer to insert instructions within innermost loops to explicitly prefetch data from memory into the data cache. Data prefetch instructions will be inserted only for data structures referenced within innermost loops using simple loop-varying addresses (that is, in a simple arithmetic progression). It is not available for PA-RISC 1.1 targets. Use this option for applications that have high data cache miss overhead. The default is +Onodataprefetch. On HP-UX version 11i and later, +Odataprefetch is the same as +Odataprefetch=indirect and +Onodataprefetch is the same as +Odataprefetch=none. At +O2 and higher, the default is +Odataprefetch.

+O[no]dataprefetch=[direct|indirect|none]

Control generation of data prefetch instructions for data structures referenced within innermost loops. The defined values for kind are:

direct: enable generation of data prefetch instructions for the benefit of direct memory accesses, but not indirect memory accesses.

indirect: enable generation of data prefetch instructions for the benefit of both direct and indirect memory accesses. This is the default at optimization levels +O2 and above.

none: disable generation of data prefetch instructions. This is the default at optimization levels +O1 and below.

+O[no]dynopt

Enable [disable] dynamic optimization for the output file, if the run-time environment supports this feature. Both forms of this option change the default setting, which allows the run-time environment to enable or disable dynamic optimization according to a system-wide default. This option applies only to executable files and shared libraries. chatr(1) can be used to change this setting, including restoration of the default setting, after the output file has been created.

+O[no]entrysched

+Oentrysched allows the optimizer to perform instruction scheduling on a subprogram’s entry and exit code sequences. This option is only effective at optimization level 1 or higher.

The option can change the behavior of programs that perform exception-handling or that handle asynchronous interrupts.

The default is +Onoentrysched.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]failsafe

Enable [disable] failsafe optimization. When a compilation fails at the current optimization level, +Ofailsafe will automatically restart the compilation at +O2 (for specific high-level optimizer errors at +O3/+O4) or +O1 for Itanium® processor family architecture and +O0 for PA-RISC architecture. The default is +Ofailsafe.

+Ofast

Select a combination of compilation options for optimum execution speed of the compiled program. Those options include: +O2, +Olibcalls, +Onolimit, +Ofltacc=relaxed, and +FPD. +Ofast also causes the +pi4M, +pd4M options to be passed to the linker. In addition to these options, on the Itanium processor family architecture, +DSnative instruction scheduling is selected and +mergeseg is also passed to the linker.

+Ofaster

Selects the +Ofast option at optimization level +O3 for PA-RISC based architectures. For Itanium processor family architectures, +Ofaster runs at +O4 and implies -ipo.

+O[no]fenvaccess

Inform the compiler that the program accesses [does not access] the floating point environment to test flags or run under non-default modes. If it knows that a program does not access the floating point environment, the compiler is allowed to perform certain optimizations that it otherwise may not perform, such as global common subexpression elimination, code motion, or constant folding. The default is +Onofenvaccess.

+O[no]fastaccess

+Ofastaccess improves execution time by speeding up access to global data items. You can use this option at optimization level 1 or higher. Note that the +Ofastaccess option may increase link time. The default is +Onofastaccess at optimization levels 1, 2, and 3; and +Ofastaccess at optimization level

NOTE: This option is valid only on

+O[no]fltacc


+Onofltacc enables optimizations that follow the rules of algebra but may change the order of expression evaluation. For example, if a, b, and c are floating-point variables, the expressions (a + b) + c and a + (b + c) may give slightly different results due to roundoff.
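The roundoff difference between the two groupings can be observed directly. The following Python sketch uses 8-byte floating-point values (Python's float) and arbitrarily chosen operands; it illustrates the arithmetic only and says nothing about which grouping the optimizer would pick.

```python
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c     # order written in the source
reassociated = a + (b + c)      # an algebraically equivalent reordering

print(repr(left_to_right))
print(repr(reassociated))
print(left_to_right == reassociated)   # False: the roundoff differs
```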

The +Onofltacc option also enables the fusion of adjacent multiply and add operations—resulting in Fused Multiply-Add (FMA). (At optimization level 2 or higher, FMA occurs by default.) FMA improves performance but occasionally produces results that may differ in accuracy from results produced by code where fusion has not occurred. In general, the differences are slight.
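The accuracy difference that fusion introduces comes from rounding once instead of twice. The Python sketch below models a fused multiply-add by computing the exact result with rational arithmetic and rounding it a single time; the helper names and operand values are chosen for this example, and the model assumes 8-byte IEEE arithmetic.

```python
from fractions import Fraction


def fused_multiply_add(x, y, z):
    """Compute x*y + z exactly, then round once to a double (models FMA)."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))


def separate_multiply_add(x, y, z):
    """Round after the multiply and again after the add."""
    return x * y + z


x = 1.0 + 2.0 ** -30
y = 1.0 - 2.0 ** -30
z = -1.0

print(separate_multiply_add(x, y, z))   # the product rounds to 1.0 first
print(fused_multiply_add(x, y, z))      # keeps the low-order bits of the product
```

Here the fused form is the more accurate one, but either way the two results differ, which is what matters when comparing optimized and unoptimized output.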

+Ofltacc disables optimizations that change the order of expression evaluation and therefore may affect the accuracy of the result. The +Ofltacc option also disables fusing.

NOTE: +Ofltacc is the same as +Ofltacc=strict. +Onofltacc is the same as +Ofltacc=relaxed.

Table 18 identifies the different actions taken by the optimizer, according to whether you specify +Ofltacc, +Onofltacc, or neither option. In all cases, the table assumes that you are compiling at optimization level 2 (+O2) or higher.

Table 18 Optimizations performed by +O[no]fltacc

+O[no]fltacc    Expression reordering?    FMA?
(neither)       No                        Yes
+Ofltacc        No                        No
+Onofltacc      Yes                       Yes

+O[no]fltacc=[strict|default|limited|relaxed]

Control the level of floating point optimizations that the compiler may perform. The defined values for optlevel are:

strict: disallow any floating point optimization that can result in numerical differences. This is the same as +Ofltacc.

default: allow contractions, such as fused multiply-add (FMA), but disallow any other floating point optimization that can result in numerical differences.

limited: like default, but also allow floating point optimizations which may affect the generation and propagation of infinities, NaNs, and the sign of zero. Also implies +Ocxlimitedrange.

relaxed: in addition to the optimizations allowed by limited, permit optimizations, such as reordering of expressions, even if parenthesized, that may affect a rounding error. This is the same as +Onofltacc. Also implies +Ocxlimitedrange.

+Ofrequently_called=function1[,function2...]

The named functions are assumed to be frequently called. This option overrides any information in a profile database.

+Ofrequently_called:filename

The file indicated by filename contains a list of function names, separated by spaces or newlines. These functions are assumed to be frequently called. This option overrides any information in a profile database.

+O[no]info

+Oinfo causes the compiler to display informational messages about the optimization process. The +Oinfo option provides feedback that can help you to determine whether the compiler optimized time-critical sections of your program. It can be used at any level of optimization but is most useful at level 3.

Currently, this option provides feedback for the following optimizations:

• Cloning, the replacement of a call to a routine by a call to a clone, which is a copy of the routine with changes specific to that call site.

• Inlining.

• Loop transformations to improve cache performance.

• Vectorization.

The default, +Onoinfo, disables the display of informational messages about optimization.

+O[no]initcheck

The initialization checking feature of the optimizer has three possible states: on, off, or unspecified. When this option is specified in the on state (+Oinitcheck), the optimizer initializes to zero any local, nonarray, nonstatic variables that are uninitialized with respect to at least one path leading to a use of the variable.

When +Onoinitcheck is specified, the optimizer issues warning messages when it discovers definitely uninitialized variables, but does not initialize them.

When this option is unspecified, the optimizer initializes to zero any local, scalar, nonstatic variables that are definitely uninitialized with respect to all paths leading to a use of the variable.

This option is only effective at optimization level 2 or higher.

+O[no]inline

+Oinline makes all subprograms eligible for inlining. This option is only effective at optimization level 3 or higher.

The +Onoinline option disables inlining for all subprograms in your program.

The default is +Oinline at optimization level 3 and +Onoinline at the lower levels.

+O[no]inline:filename

The file indicated by filename contains a list of function names, separated by spaces or newlines. This option enables [disables] optimizer inlining for the named functions. This optimization can occur at optimization levels 3 and 4. The default is +Oinline.

+O[no]inline=function1[,function2...]

Enable [disable] optimizer inlining for the named functions. This optimization can occur at optimization levels 3 and 4. The default is +Oinline.

+Oinline_budget=n

+Oinline_budget enables the optimizer to perform more aggressive inlining. This option has the following syntax:

+Oinline_budget=n

where n is an integer in the range 1 - 1000000 that specifies the level of aggressiveness, as listed in Table 19. The +Onolimit and +Osize options also affect inlining. Specifying the +Onolimit option has the same effect as specifying +Oinline_budget=200. The +Osize option has the same effect as +Oinline_budget=1.

Note, however, that the +Oinline_budget option takes precedence over both of these options. This means that you can override the effect of the +Onolimit or +Osize option on inlining by specifying the +Oinline_budget option on the same command line.

This option is only effective at optimization level 3 or higher.

Table 19 Values for the +Oinline_budget option

Values for n    Meaning
= 100           Default level of inlining.
> 100           More aggressive inlining. The optimizer is less restricted by compilation time and code size when searching for eligible routines to inline.
2 - 99          Less aggressive inlining. The optimizer gives more weight to compilation time and code size when determining whether to inline.
= 1             Only inline if it reduces code size.

+O[no]libcalls

+Olibcalls invokes millicode versions of a number of frequently called intrinsic functions; see Table 20. Millicode routines have very low call overhead and provide no error-handling. Use this option to improve the performance of selected library routines only when your program does not depend upon exception-handling. The default is +Onolibcalls at optimization levels 0 and 1; at optimization level 2 or higher, the default is +Olibcalls.

Table 20 Millicode versions of intrinsic functions

acos     cos      pow
asin     exp      sin
atan     log      tan
atan2    log10

+O[no]libmerrno

Enable [disable] support for errno in libm functions. The default is +Onolibmerrno.

+O[no]loop_block

+O[no]loop_block enables or disables blocking of eligible loops for improved cache performance. The +Onoloop_block option disables both automatic and directive-specified loop blocking.


+O[no]loop_transform

+Oloop_transform enables transformation of eligible loops for improved cache performance. The most important transformation is the interchange of nested loops to make the inner loop unit stride, resulting in fewer cache misses. +Onoloop_transform disables transformation of eligible loops. The default is +Oloop_transform.

NOTE: This option is deprecated and will not be supported in future releases. Use +inline_level.

+inline_level num

This option controls inlining in Fortran. The format for num is N[.n], where num is either an integral value from 0 to 9 or a value with a single decimal place from 0.0 to 9.0. For more information on this option, see the f90(1) man page.

+O[no]loop_unroll[=factor]

+Oloop_unroll turns on loop unrolling. factor is the unroll factor that controls the code expansion. The default unroll factor is 4; that is, four copies of the loop body. By experimenting with different factors, you may improve the performance of your program. This option is only effective at optimization level 2 or higher. The default is +Oloop_unroll=4.

+O[no]loop_unroll_jam

+Oloop_unroll_jam enables loop unrolling and jamming. +Onoloop_unroll_jam (the default) disables both automatic and directive-specified unroll and jam. Loop unrolling and jamming increases register exploitation.

+O[no]moveflops

+Omoveflops allows the optimizer to move conditional floating-point instructions, enabling other optimizations to occur. This option is only effective at optimization level 2 or higher.

The behavior of floating-point exception handling may be altered by this option.

Using +Onomoveflops is recommended if floating-point traps are enabled and you do not want the behavior of floating-point exceptions to be altered by the relocation of floating-point instructions, as when your program uses the ON statement. The default is +Omoveflops.

+O[no]multiprocessor

+Omultiprocessor tells the compiler to appropriately optimize several different processes on multiprocessor machines. The optimizations are those appropriate for executables and/or shared libraries. +Onomultiprocessor, the default, disables the optimization of more than one process running on a multiprocessor machine.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]openmp

+Oopenmp enables the OpenMP directives; +Onoopenmp disables them. +O[no]openmp is accepted at all optimization levels. The default is +Onoopenmp.

+O[no]parallel

+Oparallel causes the compiler to transform eligible loops for parallel execution on multiprocessor machines. This option is effective only at optimization level 3 or higher. This option is available only on HP 9000 systems. If you link separately from the command line and compile the program with the +Oparallel option, you must link with the f90 command and specify the +Oparallel option to link in the correct runtime support. The +Onoparallel option disables parallelization for the target program. It is the default at all levels of optimization. The +Oparallel option is not available on Itanium®-based systems for HP Fortran Version 3.2 and later. You can use the +Oautopar option instead of +Oparallel on Itanium®-based systems.

NOTE: The +Oparallel option should not be used for programs that make explicit calls to the kernel threads library.

+Oparallel_intrinsics

+Oparallel_intrinsics links in the parallel version of many of the Fortran intrinsics located in libF90_parallel.

+O[no]parmsoverlap

+Oparmsoverlap causes the optimizer to assume that the actual arguments of function calls overlap in memory, thus preventing any optimizations that violate this assumption. This option is only effective at optimization level 2 or higher.

Use the +Onoparmsoverlap option with programs that conform to the standard requirement that parameters must not overlap.

The default is +Onoparmsoverlap.

+O[no]pipeline

+Opipeline enables software pipelining. This option is only effective at optimization level 2 or higher. Use +Onopipeline (disable software pipelining) to conserve code space. The default is +Opipeline.

NOTE: This option is valid only on the PA-RISC systems.

+Oprefetch_latency=num

Applies to loops for which the compiler generates data prefetch instructions. For a given loop, the compiler divides the specified number of cycles by the estimated loop length to arrive at the number of loop iterations for which to generate advance prefetches. The default value is 480 cycles for loops containing floating-point accesses and 150 cycles for loops that do not contain any floating-point accesses. For tuning purposes, it is recommended that users measure their application's performance using a few different prefetch latency settings to determine the optimal value. Some floating-point codes may benefit by increasing the distance to 960. Parallel applications frequently benefit from a shorter prefetch distance of 150.

+O[no]procelim

When +Oprocelim is specified, procedures that are not referenced by the application are eliminated from the output executable file. When +Onoprocelim is specified, procedures that are not referenced by the application are not eliminated from the output executable file. You can use this option at any level of optimization.

Use +Oprocelim to reduce the size of the executable file, especially when optimizing at levels 3 and 4, when inlining can remove all calls to some routines.

The default is +Onoprocelim at levels 0-3, and +Oprocelim at level 4.
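The prefetch-distance arithmetic described under +Oprefetch_latency can be worked through by hand. The following shell sketch divides a latency setting by an estimated loop length to get the number of iterations prefetched ahead. The 12-cycle loop length is a hypothetical figure chosen for illustration, and truncating integer division is an assumption; the compiler's own loop-length estimate and rounding are not described in this guide.

```shell
#!/bin/sh
# Sketch only: the 12-cycle loop length is a hypothetical estimate, and
# truncating integer division is an assumption about rounding.
prefetch_distance() {
    # $1 = latency in cycles (the +Oprefetch_latency value)
    # $2 = estimated loop length, in cycles per iteration
    echo $(( $1 / $2 ))
}

loop_length=12
for latency in 150 480 960; do
    echo "+Oprefetch_latency=$latency: $(prefetch_distance "$latency" "$loop_length") iterations ahead"
done
```

With these inputs the sketch reports 12, 40, and 80 iterations, which shows how doubling the latency setting doubles how far ahead of the current iteration the prefetches are issued.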

+Oprofile=use:filename

Specify filename as the name of the profile database file. This option is a synonym for the +P and +df filename options. The :filename argument is optional (the default filename is flow.data). See ld(1), +P, and +df for more details. This option is only valid for Itanium®-based applications.

+Oprofile=collect<qualifiers>

+Oprofile=collect instruments the application for profile-based optimization. In the absence of qualifiers, +Oprofile=collect is the same as +I.

<qualifiers> is a comma-separated list of profile collection qualifiers. The supported profile collection qualifiers are:

• arc: collect arc counts (synonym for +Oprofile=collect). This is the default value.

• stride: collect stride data.

• all: collect all types of profile data. This is a synonym for +Oprofile=collect:arc,stride.

This option is only valid for Itanium®-based applications.

+O[no]promote_indirect_calls

Enable [disable] the promotion of indirect calls to direct calls. Indirect calls occur with pointers to functions. This option can be used at optimization levels 3 and 4. The default is +Onopromote_indirect_calls.

+O[no]ptrs_to_globals[=name1,name2,...,nameN]

Tells the optimizer whether global variables are modified [are not modified] through pointers. This optimization can occur at levels 2, 3, and 4. The default is +Optrs_to_globals.

+Orarely_called=function1[,function2...]

The named functions are assumed to be infrequently called. This option overrides any information in a profile database.

+Orarely_called:filename

The file indicated by filename contains a list of function names, separated by spaces or newlines. These functions are assumed to be infrequently called. This option overrides any information in a profile database.

+O[no]recovery

Generate [do not generate] recovery code for control speculation. The default is +Onorecovery.

+O[no]regreassoc

+Onoregreassoc disables register reassociation. This option is only effective at optimization level 2 or higher.

Use +Onoregreassoc to disable register reassociation in the rare case that this optimization degrades performance.

+Oregreassoc is the default.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]report

+Oreport specifies the contents of the Optimization Report.

+Oshortdata[=size]

All objects of size bytes or smaller will be placed in the short data area, and references to such data will assume it resides in the short data area. Valid values of size are 0, or a decimal number between 8 and 4,194,304 (4 MB). If no size is specified, all data is placed in the short data area. If size is 0, no data will be placed in the short data area, and all data references will use long offsets. The default is +Oshortdata=8.

NOTE: Using a value that is too big, or omitting the optional size (possibly through +Ofast), may give various linker fixup errors if there is more than 4 MB of short data.

NOTE: All object files participating in a link need to be consistently compiled with respect to +Oshortdata. Since the +Ofaster and +Ofast options imply +Oshortdata, other Fortran object files which reference common blocks should also be explicitly compiled with +Oshortdata. Alternatively, the +Ofaster and +Ofast compilations can be compiled with an additional +Oshortdata=8 appended to the command line.


+O[no]sideeffects=name1,name2,...,nameN

Optimize with the assumption that the specified subprograms do [do not] modify global variables. This option can be used at optimization level 2 or higher. The default is to assume that all subprograms have side effects unless the optimizer can determine that there are none.

+O[no]static_prediction

Enables [disables] the use of static branch prediction for decisions on conditional branches. This is more applicable to large programs with poor locality. This option is available at optimization level 3 and above.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]store_ordering

Preserve [do not preserve] the original program order for stores to memory that is potentially visible to multiple threads. This does not imply strong ordering. The default is +Onostore_ordering.

+O[no]vectorize

+Ovectorize causes the compiler to replace certain loops with calls to the math library. This option is only effective at optimization level 3 or higher.

If you link separately from the command line and you compiled with the +Ovectorize option, you must ensure that the link line causes the math library to be searched.

+Onovectorize is the default.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]whole_program_mode

Assert [do not assert] that only files compiled with this option directly reference any global variables and procedures that are defined in these files. This option enables optimizations that assume that there are no unseen accesses to the globals. This option can be used at optimization level 4. The default is +Onowhole_program_mode.

Filenames

The f90 command accepts files with any of the filename extensions listed in Table 21. The table also describes the meaning each name has for the f90 command. Files with names other than those listed in the table are passed to the linker.

Table 21 Filenames recognized by f90

Filenames   Meaning
file.f90    Free-form Fortran source code; processed by the compiler.
file.f      Fixed-form Fortran source code; processed by the compiler.
file.F      Fixed-form Fortran source code; first processed by the C preprocessor (cpp), then by the compiler.
file.i90    Free-form output from the C preprocessor (if the source file ends in .f90); processed by the compiler.
file.i      Fixed-form output from the C preprocessor (if the source file ends in .F or .f); processed by the compiler.
file.o      Object code; passed to the linker (ld).
file.s      Assembly language code; passed to the assembler (as).


NOTE: The compiler generates a .mod file for each file that defines a Fortran module. It also reads the .mod files when compiling source files that use modules. Do not specify .mod files on the command line. If you do, the compiler will pass them to the linker, which will try (and fail) to link them into the executable. For more information about .mod files, see “Compiling programs with modules” (page 54).

Linking HP Fortran programs

This section discusses how to link object files and covers the following topics:

• The advantages of using the f90 command for linking as well as for compiling

• How to link libraries, including shared libraries

• How to establish the search rules used by the linker when it looks for libraries

For more information about the linker, see Programming on HP-UX and the ld(1) man page.

Linking with f90 vs. ld

By default, the f90 command both compiles and links, producing an executable program. You can modify this behavior with the -c option, which causes f90 to compile only, writing the object files (if the compilation is successful) in the current working directory. If the command line contains object files only, f90 passes them to the linker (ld) for linking into the executable program. In other words, you can use the f90 command to compile and link in one command line or in separate command lines. You do not need to invoke the ld command separately.

In fact, we recommend that you use the f90 command whenever you link HP Fortran object files and that you use the same command line for linking as for compiling.

When you use the f90 command to compile and link in the same command line, the driver passes certain information (search paths, library names, and options) to the linker. If you use the ld command to link separately, you must specify this same information on the ld command line. Not doing so can cause the link to fail. Using the same f90 command line to link as you use to compile avoids the problem of passing insufficient or incorrect information to the linker.

To see what information f90 passes to the linker, compile with the -v option (verbose mode). Here is the hello.f90 program (listed in “Compiling with the f90 command” (page 18)) compiled in verbose mode. The lines are numbered for the convenience of referencing:

1 $ f90 -v hello.f90
2 /opt/fortran90/lbin/f90com -cm -w90 -nbs -auto -WB
  -hp\"-Oq00,al,ag,cn,Lm,sz,Ic,vo,lc,mf,po,es,rs,sp,
  in,vc,pi,fa,pe,Rr,Fl,pv,pa,nf,cp,lx,st,ap,Pg,
  ug,lu,dp,fs,bp,wp\!\" hello.f90
3 hello.f90
4 program MAIN
5 external subroutine HELLO
6 7 Lines Compiled
7 LPATH is: /opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:
  /opt/fortran90/lib:/usr/lib:/opt/langtools/lib
8 /usr/ccs/bin/ld -x /opt/langtools/lib/crt0.o hello.o
  /opt/fortran90/lib/libF90.a -lcl -lc -lisamstub

Line 1 is the f90 command line.

Line 2 is the information f90 passes to the compiler, including the full pathname of the compiler, the name of the source file (hello.f90), and the internal names of the option settings as determined by the defaults and the f90 command line.

Lines 3 - 6 show the progress of the compilation; line 6 indicates that the compilation was successful.

Line 7 displays the value to which f90 has defined the LPATH environment variable. If you use the ld command to link hello.f90, you must define LPATH on the command line before invoking the linker. See “LPATH environment variable” (page 65).

Line 8 is the command line that f90 passes to the linker (ld). If you use the ld command to link hello.f90, the command line should be similar to the one shown here.

As noted in the comments on lines 7 and 8, compiling and linking hello.f90 successfully using both the f90 and ld commands requires three command lines:

$ f90 -c hello.f90          # compile

$ export LPATH=/opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:\
/opt/fortran90/lib:/usr/lib:/opt/langtools/lib          # set LPATH

$ ld -x /opt/langtools/lib/crt0.o hello.o \
  /opt/fortran90/lib/libF90.a -lcl -lc -lisamstub       # link

The command line to set LPATH in the csh is:

$ setenv LPATH /opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:/opt/fortran90/lib:/usr/lib:/opt/langtools/lib
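Because LPATH is a colon-separated list that is searched in order, it can help to print it one directory per line before invoking ld by hand. The following is a minimal sketch in POSIX shell. The helper name list_lpath is made up for this illustration and is not an HP-UX command; the LPATH value is the one reported in the verbose output above.

```shell
#!/bin/sh
# Sketch: show the LPATH search order, one directory per line.
# list_lpath is a hypothetical helper name, not an HP-UX command.
list_lpath() {
    printf '%s\n' "$1" | tr ':' '\n'
}

LPATH=/opt/fortran90/lib/pa1.1:/usr/lib/pa1.1:/opt/fortran90/lib:/usr/lib:/opt/langtools/lib
export LPATH

# Flag directories that do not exist on this machine before running ld.
list_lpath "$LPATH" | while IFS= read -r dir; do
    if [ -d "$dir" ]; then
        echo "ok       $dir"
    else
        echo "missing  $dir"
    fi
done
```

A directory reported as missing is a likely cause of unresolved references when linking with ld directly, since the linker has one fewer place to look.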

For more information about the linker, see the ld(1) man page. For a list of f90 options that you can use to control the linker, see Table on page 14. To pass linker options from the f90 command line to the linker, use the -Wl option (for an example, see “Linking to shared libraries” (page 53)). The HP Fortran Programmer’s Reference fully describes the -Wl option.

Linking to libraries

When you use the f90 command to create an executable program, the linker looks in the libraries listed in Table 22 (PA-RISC) and Table 23 (Itanium) to resolve references. By default, the linker uses the shared libraries, if available. For information about shared libraries, see “Linking to shared libraries” (page 53).

The libisamstub library is provided as a tool for migrating HP FORTRAN 77 programs that call ISAM routines. The ISAM library is not available with HP Fortran, but the stub library allows the linker to resolve references to ISAM routines in HP FORTRAN 77 programs.

Table 22 Libraries linked by default on PA-RISC

Library                              Contents
/usr/lib/libcl.a                     Archive version of HP Fortran runtime library
/usr/lib/libcl.sl                    Shared version of HP Fortran runtime library
/opt/fortran90/lib/libF90.a          Archive library of array intrinsic procedures
/usr/lib/libc.sl                     Shared library of intrinsic procedures and system routines
/opt/fortran90/lib/libisamstub.a     Archive libraries of stubs to satisfy ISAM references
/opt/fortran90/lib/libisamstubs.a
/usr/lib/libisamstub.sl              Shared library of stubs to satisfy ISAM references

Table 23 Libraries linked by default on Itanium

Library                                      Contents
/usr/lib/hpux[32|64]/libIO77.a               Archive version of HP Fortran runtime library
/usr/lib/hpux[32|64]/libIO77.so              Shared version of HP Fortran runtime library
/usr/lib/hpux[32|64]/libF90.a                Archive library of array intrinsic procedures
/usr/lib/hpux[32|64]/libF90.so               Shared library for Fortran intrinsic procedures
/usr/lib/hpux[32|64]/libF90_parallel.a       Archive library for Fortran intrinsic procedures on parallel mode
/usr/lib/hpux[32|64]/libF90_parallel.so      Shared library for Fortran intrinsic procedures on parallel mode
/usr/lib/hpux[32|64]/libc.so                 Shared library for intrinsic procedures and system routines
/opt/fortran90/lib/hpux[32|64]/libF2003.a    Archive library for Fortran 2003 standard features

Linking to nondefault libraries

The -l option enables you to specify other libraries for linking, in addition to the default libraries listed in Table 22 and Table 23. The syntax for this option is:

-lx

where x is a sequence of characters that completes a library name of the form /lib/libx.a or /usr/lib/libx.a. For example, -lm specifies the math library, /usr/lib/libm.a. (The .a extension indicates an archive library. You can also link to shared libraries, which have the .sl extension; see “Linking to shared libraries” (page 53).)

The -l option is order-sensitive: when the linker finds a reference in an object file that it cannot resolve in the default libraries, it searches the libraries (if any) specified after the file on the command line. For example, the following command line tells the linker to look for unresolved references in the math library as well as the default libraries:

$ f90 prog.f90 -lm

You can also link a library to your program by specifying its name after the name of the source file that references it, as follows:

$ f90 prog.f90 /usr/lib/libm.a

This form of the command line is useful for linking libraries that do not conform to the naming convention required by the -l option or that reside in a directory other than /lib or /usr/lib. As with the -l option, the library name must follow the name of the source file that references it. For example, the following command line links prog.f90 with the my_routines library, both of which reside in the current working directory:

$ f90 prog.f90 my_routines

If your program calls routines in a library but the linker is unable to resolve the references, compile with the -Wl,-v option. The f90 command passes -v to the linker, causing it to process in verbose mode. The verbose information includes:

• The names of the libraries that the linker is searching. This information can confirm that the linker is searching the correct libraries.

• The names of the object files selected by the linker to resolve the references. The linker may have found the same name in another library and resolved the reference there.

Many library-related problems are caused by a misplaced -l option on the command line. The -L option (discussed in “Library search rules” (page 54)) is also order-sensitive and can cause similar problems.

Linking HP Fortran 90 routines

When calling HP Fortran 90 routines on the HP 9000 Series 800, you must include the appropriate run-time libraries by adding certain arguments to the aCC command when linking your program. These arguments depend on how the Fortran 90 routines were compiled:

32-bit PA-RISC 1.1 code

-L/opt/fortran90/lib/ -lF90 -lisamstub

32-bit PA-RISC 2.0 code

-L/opt/fortran90/lib/pa2.0/ -lF90 -lisamstub

64-bit PA-RISC 2.0 code

-L/opt/fortran90/lib/pa20_64/ -lF90 -lisamstub

NOTE: For more information on Itanium library paths corresponding to the above examples, see Table 23 (page 51).

When the linker finds a reference in your program to a name that is not defined in the program (for example, the DOT_PRODUCT intrinsic), it looks to resolve it in the default libraries. If it cannot find the name in the default libraries, the link will fail unless the command line specifies additional, nondefault libraries. This section discusses how to link to nondefault libraries (including shared libraries) and library search rules.

Additional HP Fortran libraries

HP Fortran provides the following two libraries you can link with Fortran programs:

• /opt/fortran90/lib/libU77.a on PA-RISC or /usr/lib/hpux[32|64]/libU77.a on Itanium: The BSD 3f (libU77) library, which provides a Fortran interface to some of the libc system routines. Programs that reference routines in this library must be compiled with the +U77 option. For information about porting Fortran programs that reference libU77 routines, see “Migrating to HP Fortran” (page 131).

• /opt/fortran90/lib/libblas.a: The Basic Linear Algebra Subroutine (BLAS) library, which provides routines that perform common vector and matrix operations. Programs that reference routines in this library must be compiled with the +lblas option. For more information, see “Calling BLAS library routines” (page 105). (PA-RISC only)

Both the libU77 and BLAS libraries are described in the HP Fortran Programmer’s Reference.

Linking to shared libraries

Many HP Fortran libraries as well as HP-UX libraries exist in both shared and archive versions, as indicated by the library extension name (.sl or .a; .so or .a on Itanium). For example, there are both shared and archive versions of the HP Fortran runtime library, /usr/lib/libcl.sl and /usr/lib/libcl.a.

The difference between a shared library and an archive library is that the linker does not actually link the code in a shared library with your program. Instead, any references that your program makes to entities in the shared library are resolved at load-time, when the library is loaded into the executable program’s address space. By contrast, code in the archive library is copied to the executable program file.

The advantages of linking shared libraries are:

• The executable is smaller than it would be if linked with an archive file because the executable file is incomplete; it does not include code from the library.

• Using shared libraries ensures that you always get the most recent version of the library. If you link with an archive version, you get the version that was available at link-time. If, later on, you want a more recent version of the library, you must re-link your program with that library.

The disadvantage of linking with a shared library is that it creates a dependency between the library and the program; both the shared library and the program must always be installed together on the same system. By contrast, linking with an archive library makes the executable program independent of the library to which it was linked. Also, programs that make frequent calls to library routines may run more slowly when linked to shared libraries.

By default, the linker selects the shared version of a library, if one is available; otherwise, it selects the archive version.

NOTE: For libF90, libU77, and libIO77 (Itanium® only), archive libraries are selected by default (see the +sharedlibF90, +sharedlibU77, and +sharedlibIO77 options).

To force the linker to select archive libraries, specify the -Wl,-a,archive option on the f90 command line. f90 passes the arguments to the -Wl option (-a and archive) to the linker. This option must appear before the names of any libraries also specified on the command line. The following command line compiles prog.f90 and links it with the archive versions of the default libraries as well as with the math library (as specified by the -lm option):

$ f90 -Wl,-a,archive prog.f90 -lm

For information about the linker’s -a option, see the ld(1) man page. For more information about shared libraries, see “Creating shared libraries” (page 59).

Library search rules

When you use the -l option to specify the name of a library, the linker searches for the library in the directories specified by the LPATH environment variable. The f90 command sets this variable so that the linker looks first in /opt/fortran90/lib, then in /usr/lib. You can specify another directory to search by setting LPATH yourself; see “LPATH environment variable” (page 65).

Alternatively, you can use the -Ldirectory option to direct the linker to search directory before it looks anywhere else to resolve references. For example, the following command line:

$ f90 -L/my_libs prog.f90 -lstuff

causes the linker to search for libraries (including libstuff.sl and libstuff.a), starting with the directory /my_libs and then looking in /opt/fortran90/lib and /usr/lib.
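The search order described above can be approximated with a short shell function, which is sometimes handy for predicting which file a given -l option will pick up. This is a sketch under stated assumptions, not the real ld algorithm: it assumes that directories are tried strictly in the order given (-L directories first, then the LPATH directories) and that within each directory the shared library (.sl) is preferred over the archive (.a). The name find_lib and the temporary directories are hypothetical.

```shell
#!/bin/sh
# Sketch of the documented search order; NOT the real ld algorithm.
# Assumptions: directories are tried in the order given, and within a
# directory libNAME.sl is preferred over libNAME.a.
# find_lib is a hypothetical helper name.
find_lib() {
    # $1 = library name as given to -l (for example "stuff")
    # $2 = colon-separated directory list, highest priority first
    name=$1
    dirs=$2
    while [ -n "$dirs" ]; do
        dir=${dirs%%:*}                  # first directory in the list
        case $dirs in
            *:*) dirs=${dirs#*:} ;;      # drop it and keep the rest
            *)   dirs= ;;
        esac
        for ext in sl a; do
            if [ -f "$dir/lib$name.$ext" ]; then
                echo "$dir/lib$name.$ext"
                return 0
            fi
        done
    done
    return 1
}

# Demonstration in a scratch area standing in for /my_libs and /usr/lib.
demo=$(mktemp -d)
mkdir "$demo/my_libs" "$demo/usr_lib"
: > "$demo/my_libs/libstuff.a"      # archive only in the -L directory
: > "$demo/usr_lib/libstuff.sl"     # shared version later in the path
: > "$demo/usr_lib/libm.a"

find_lib stuff "$demo/my_libs:$demo/usr_lib"    # the my_libs archive wins
find_lib m     "$demo/my_libs:$demo/usr_lib"    # found in usr_lib
find_lib nope  "$demo/my_libs:$demo/usr_lib" || echo "libnope not found"
rm -rf "$demo"
```

Under these assumptions, a library placed in a -L directory shadows a library of the same name later in the path, even when the later one is a shared library. When the real linker's choice is in doubt, the -Wl,-v option reports the libraries it actually searched.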

Special-purpose compilations

The default behavior of the HP Fortran compiler has been designed to handle typical compilations. Most applications should require no more than a few of the f90 options to compile successfully (see Table on page 25 for a list of commonly used options).

However, the compiler can also meet the needs of more specialized compilations. This section explains how to use the f90 command for the following purposes:

• To compile programs that contain Fortran modules.

• To compile programs that will execute on different PA-RISC machines.

• To create object files for shared libraries.

• To process source files that contain C preprocessor directives.

• To create demand-loadable programs.

• To create shareable executable programs.

• To compile 32-bit programs in 64-bit mode.

Compiling programs with modules

One of the features of standard Fortran is the module, a program unit that facilitates shared access to data and procedures. Modules are fully described in the HP Fortran Programmer’s Reference.

A benefit to using modules is that they provide interface information to the compiler, allowing it to catch mismatch errors between (for example) dummy arguments and actual arguments. When the HP Fortran compiler processes a file that defines a module, it generates a .mod file with the interface information. Later, when the compiler processes a file that uses the module, it reads the .mod file and checks that module entities that are referenced in the using file correctly match the information in the .mod file.

To make the .mod files available to the compiler, you must therefore compile the files that define modules before the files that use modules. Likewise, if you make changes to a file that defines a module, you must recompile that file as well as any files that use the module, in that order.

Also, if a module is defined and used in the same file, the definition must lexically precede any USE statements that reference the module. This requirement allows the compiler to generate the .mod file first, so that it can resolve the references in any USE statements.

This section discusses the following topics:

• How to compile a program that uses modules

• How to design makefiles to work with modules

• How to use the -I and +moddir options to manage .mod files

Examples

Example 5

Consider, for example, a program that consists of three files: main.f90, code.f90, and data.f90. The main program unit is in main.f90, as follows.

Example 6 Example 2-2 main.f90

PROGRAM keep_stats
  ! stats_code contains module procedures for operating
  ! on statistical database
  USE stats_code
  INTEGER :: n
  ! print prompt, using nonadvancing I/O
  WRITE (*, FMT='(A)', ADVANCE='NO') 'Enter an integer '// &
       '(hint: 77 is current average): '
  READ *, n
  IF (n == 0) THEN
    PRINT *, 'But not that one.'
  ELSE
    CALL update_db(n)
    IF (n >= get_avg()) THEN   ! get_avg is in stats_code
      PRINT *, 'Average or better.'
    ELSE
      PRINT *, 'Below average.'
    END IF
  END IF
END PROGRAM keep_stats

The first specification statement (USE) in the main program indicates that it uses the module stats_code. This module is defined in code.f90, as follows:

Example 7 Example 2-3 code.f90

! stats_code: a (partial!) package of module procedures for
! performing statistical operations
MODULE stats_code
  ! shared data to be used by procedures declared below
  USE stats_db
CONTAINS   ! module procedures
  ! update_db: updates shared variables in module stats_db
  SUBROUTINE update_db (new_item)
    INTEGER :: new_item
    n_items = n_items + 1
    item(n_items) = new_item
    sum = sum + new_item
  END SUBROUTINE update_db
  ! get_avg: returns arithmetic mean
  INTEGER FUNCTION get_avg ()
    get_avg = sum / n_items
  END FUNCTION get_avg
END MODULE stats_code

This program unit also begins with a USE statement, which identifies the module it uses as stats_db. This module is defined in data.f90, as follows:

Example 8 Example 2-4 data.f90

! stats_db: shared data declared here
MODULE stats_db
  INTEGER, PARAMETER :: size = 100   ! max number of items in array
  ! n_items, sum, and item hold the data for statistical analysis
  INTEGER :: n_items, sum
  INTEGER, DIMENSION(size) :: item
  ! the initializations are just to start the program going
  DATA n_items, sum, item/3, 233, 97, 22, 114, 97*0/
END MODULE stats_db

The use of modules in this program creates dependencies between the files because a file that uses a module that is defined in another file is dependent on that other file. These dependencies affect the order in which the program files must be compiled. The dependencies in the example program are:

• main.f90 is dependent upon code.f90.

• code.f90 is dependent upon data.f90.

These dependencies require that data.f90 be compiled before code.f90, and that code.f90 be compiled before main.f90. This order ensures that the compiler will have created each of the .mod files before it needs to read them.

The order of the source files listed in the following command line ensures that they will compile and link successfully:

$ f90 -o do_stats data.f90 code.f90 main.f90

During compilation, f90will create two .mod files, STATS_CODE.modand STATS_DB.mod. These will be written to the current working directory, along with the object files and the executable program, do_stats. Following is a sample run of the executable program:

$ do_stats

Enter an integer (hint: 77 is current average): 77 Average or better. If instead of the preceding command line, the program had been compiled as follows:

$ f90 -o do_stats main.f90 data.f90 code.f90

the compilation would fail and f90 would print the error message:

Error FCE37 : Module STATS_CODE not found

The compilation would fail because the compiler cannot process main.f90 without STATS_CODE.mod. But the order in which the program files appear on the command line prevents the compiler from processing code.f90 (and thereby creating STATS_CODE.mod) until after it has processed main.f90.
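The ordering rule can also be checked mechanically. The following shell sketch is not part of HP Fortran. It assumes one MODULE or USE statement per line, and the function name module_order and the stand-in source files it creates are made up for illustration. It scans each file for the modules it defines and uses, then lets tsort(1) print an order in which every module is compiled before the files that use it:

```shell
#!/bin/sh
# module_order (hypothetical helper): print the given Fortran source
# files, one per line, so that a file defining a module comes before
# every file that USEs it.
module_order() {
    awk '
        FNR == 1 { files[++nfiles] = FILENAME }
        {
            kw = tolower($1)
            name = tolower($2)
            sub(/,.*/, "", name)      # drop a trailing ", ONLY: ..." list
        }
        # "MODULE name" defines a module ("MODULE PROCEDURE" does not)
        kw == "module" && name != "procedure" { definer[name] = FILENAME }
        # "USE name" records a dependency on that module
        kw == "use" { nuses++; user[nuses] = FILENAME; used[nuses] = name }
        END {
            # a "file file" pair only tells tsort that the file exists
            for (i = 1; i <= nfiles; i++) print files[i], files[i]
            # a "definer user" pair asks tsort to list the definer first
            for (i = 1; i <= nuses; i++)
                if (used[i] in definer && definer[used[i]] != user[i])
                    print definer[used[i]], user[i]
        }
    ' "$@" | tsort
}

# Stand-in sources that mirror the dependencies of the example program.
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'MODULE stats_db\nEND MODULE stats_db\n' > data.f90
printf 'MODULE stats_code\nUSE stats_db\nEND MODULE stats_code\n' > code.f90
printf 'PROGRAM main\nUSE stats_code\nEND PROGRAM main\n' > main.f90

# The files are deliberately passed in the failing order from the text.
order=$(module_order main.f90 data.f90 code.f90 | paste -s -d ' ' -)
echo "f90 -o do_stats $order"

cd "$start_dir" && rm -rf "$demo_dir"
```

For the three stand-in files, the sketch prints the same command line that the text gives as the working one. A real project would need a more careful scanner (continuation lines, several statements per line), so treat this as a starting point only.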

Compiling with make

If you use the make utility to compile Fortran programs, the description file should take into account the dependencies created by modules. For example, to compile the do_stats program using the make utility, the description file should express the dependencies as follows:


Example 9 Example 2-5 makefile

# description for building do_stats
do_stats : main.o code.o data.o
	f90 -o do_stats main.o code.o data.o

# main.o is dependent on main.f90 and code.f90
main.o : main.f90 code.o
	f90 -c main.f90

# code.o is dependent on code.f90 and data.f90
code.o : code.f90 data.o
	f90 -c code.f90

# data.o is dependent only on its source, data.f90
data.o : data.f90
	f90 -c data.f90

Note that the dependencies correspond to the order in which the source files are specified in the following f90 command line:

$ f90 -o do_stats data.f90 code.f90 main.f90

Assuming that you name the description file makefile, the command line to compile the program with make is:

$ make

Managing .mod files

By default, the compiler writes .mod files to the current working directory and looks there when it has to read them. The +moddir=directory and -I directory options enable you to specify different directories. The +moddir option causes the compiler to write .mod files in directory, and the -I option causes the compiler to search directory for .mod files to read. (The space character between -I and directory is optional.)

Using the example of the do_stats program, the following command line compiles (without linking) data.f90 and writes a .mod file to the subdirectory mod_files:

$ f90 -c +moddir=mod_files data.f90

The command line:

$ f90 -c +moddir=mod_files -I mod_files code.f90

uses both the +moddir and -I options, as follows:

• The +moddir option causes f90 to write the .mod file for code.f90 in the subdirectory mod_files.

• The -I option causes f90 to look in the same subdirectory for the .mod file to read when compiling code.f90.

The command line:

$ f90 -o do_stats -I mod_files main.f90 code.o data.o

causes f90 to compile main.f90, look for the .mod file in the subdirectory mod_files, and link all of the object files into an executable program named do_stats.

Compiling for different PA-RISC machines

When you compile an HP Fortran 90 program, the object code that the compiler generates by default is based on the PA-RISC model of the machine that is running the compiler. If your program will execute on a different PA-RISC model machine, the code may run less efficiently or (in the case of PA2.0 code that attempts to run on a PA1.1 machine) may not run at all.

Also, some libraries (for example, the math library) are available in different PA-RISC versions. By default, the compiler selects the version that is based on the PA-RISC model of the compiling machine. If your program will execute on a different model machine, it may not be linked with the appropriate libraries.

Compiling with the +DAmodel option ensures that the compiler generates code that is based on the architecture specified by model and that the linker selects libraries that are compatible with model. model must be one of the following:

• A PA-RISC version number: 1.1, 2.0, or 2.0W. Use +DA2.0W to compile in 64-bit mode; see “Compiling in 64-bit mode” (page 63).

• A model number, for example, 750 or 870.


• A PA-RISC processor name, for example, PA7100 or PA8000.

• portable, which generates code that is compatible across all models. Use +DAportable only if you want to ensure that your program will run on different models.

Use the uname -m command to learn the model of your machine, as follows:

$ uname -m
9000/879

Alternatively, you can use the grep command to look up the model number in the file /opt/langtools/lib/sched.models and find its architecture type, as follows:

$ grep 879 /opt/langtools/lib/sched.models

879     2.0     PA8000
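The two lookups can be combined in a short shell sketch. The function name model_number is made up for illustration, and the sketch assumes only what the text states: that uname -m prints a value such as 9000/879 and that the model number is the part after the slash. On a system without /opt/langtools/lib/sched.models, it just reports the model string:

```shell
#!/bin/sh
# model_number (hypothetical helper): strip the series prefix from the
# output of "uname -m", for example 9000/879 -> 879.
model_number() {
    printf '%s\n' "${1##*/}"
}

models_file=/opt/langtools/lib/sched.models
model=$(model_number "$(uname -m)")

# Look the model up only where the file exists (that is, on HP-UX).
if [ -r "$models_file" ]; then
    grep "$model" "$models_file"
else
    echo "no $models_file on this system; model is $model"
fi
```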

You can also use the +DSmodel option to specify an architecture-specific instruction scheduler, where model has the same meaning as it does for the +DA option. Like the +DA option, the +DS option is unnecessary if the program will run on the same machine as you use to compile it. Also, if you compile with +DAmodel, the compiler will select the scheduling algorithm based on the same architecture, unless you use the +DS option to specify a different architecture.

NOTE: Code generated for PA1.1 systems will execute on PA2.0 systems, but the reverse is not true: the loader will not allow PA2.0 code to run on a PA1.1 system.

Creating shared libraries

As mentioned in “Linking to shared libraries” (page 53), many HP-UX and HP Fortran libraries are available in both shared and archive versions. Linking with shared libraries can make the executable program smaller and can ensure that it always has the most current version of the library.

You can make shared versions of your own libraries, using the +pic command-line option and the -b linker option. The following sections describe how to use these options and show an example of how to create a shared library.

Compiling with +pic

The +pic option causes the compiler to generate Position-Independent Code (PIC) for use in a shared library. PIC contains no absolute addresses and can therefore be placed anywhere in a process’s address space without addresses having to be relocated. This characteristic of PIC makes it shareable by multiple processes.

The syntax of the +pic option is:

+pic={short|long|no}

Although compiling with either +pic=short or +pic=long will generate PIC, in general you should use the +pic=short option. If the linker issues an error message saying that the number of referenced symbols in the shared library exceeds its limit, recompile with +pic=long, which will cause the compiler to allocate space for a longer symbol table.

The default is +pic=no, which causes the compiler to generate absolute code, such as you would want for executable programs.

The following command line creates three object files—x.o, y.o, and z.o; the code in each file will be PIC:

$ f90 -c +pic=short x.f90 y.f90 z.f90

For more information about the +pic option, see the HP Fortran Programmer’s Reference.


Linking with -b

The -b option is a linker option. It causes the linker to bind PIC object files into a shared library, instead of creating a normal executable file. The -b option must be used with the ld command; you cannot use the f90 command to create a shared library. Also, the object files specified on the ld command line must consist of PIC; that is, they must have been created with either +pic=short or +pic=long.

The following command line links the object files x.o, y.o, and z.o into a shared library, named my_lib.sl:

$ ld -b -o my_lib.sl x.o y.o z.o

Note that this ld command line is much simpler than the ld command line required to link an executable file (for example, see “Linking with f90 vs. ld” (page 50)).

Examples

Example 10 Example 2-6 hi.f90

SUBROUTINE say_hi()
  PRINT *, 'Hi!'
END SUBROUTINE say_hi

Example 11 Example 2-7 bye.f90

SUBROUTINE say_bye()
  PRINT *, 'Bye!'
END SUBROUTINE say_bye

Example 12 Example 2-8 greet.f90

PROGRAM main
  CALL say_hi()
  CALL say_bye()
END PROGRAM main

The following command line creates the PIC object files (the -c option suppresses linking):

$ f90 -c +pic=short bye.f90 hi.f90

The next command line links the object files into the shared library:

$ ld -b -o my_lib.sl bye.o hi.o

The last command line compiles the source file greet.f90 and links the object code with the shared library to produce the executable program a.out:

$ f90 greet.f90 my_lib.sl

The following is the output from a sample run of the executable program:

$ a.out

Hi!

Bye!

Using the C preprocessor

You can use the f90 command to pass source files to the C preprocessor (cpp) before they are compiled. If the source files contain C preprocessor directives, cpp will act on the directives, modifying the source text accordingly. The f90 driver will then pass the preprocessed source text to the compiler. Adding cpp directives to program source files and having the cpp command preprocess them is a convenient way to maintain multiple versions of a program—for example, a debugging version and a production version—in one set of files.


cpp directives are similar to debugging lines, a feature of many Fortran implementations; see “Using debugging lines” (page 80). Like cpp directives, debugging lines enable the compiler to treat source lines as either compilable statements or comments to be removed before compilation. But debugging lines are nonstandard, available only in fixed-form source, and not nearly as powerful as the cpp directives. Although cpp directives are not a standard feature of Fortran, cpp is a de facto standard feature of UNIX systems.

This section discusses how to do the following:

• Invoke cpp from the f90 command line.

• Use the -D option to define cpp macros.

• Save the preprocessed output generated by cpp.

For more information about the cpp command and the directives it supports, see the cpp (1) man page.

Processing cpp directives

By default, the f90 command passes source files ending in the .F extension to cpp. Compiling with the +cpp=yes option enables you to override this default and cause the f90 driver to pass all source files to cpp. If you do not compile with the +cpp=yes option and if the source file does not have the .F extension, the compiler treats any cpp directives (but not any embedded Fortran statements) as comments and ignores them. (As a language extension, HP Fortran allows comments to begin with the # character, which is also the prefix character for all cpp directives.)

Consider the following program:

Example 13 Example 2-9 cpp_direct.f90

PROGRAM main
  REAL :: x
  WRITE (6, FMT='(A)', ADVANCE='NO') 'Enter a real number: '
  READ *, x
#ifdef DEBUG
  PRINT *, 'The value of x in main: ', x
#endif
  PRINT *, 'x =', double_it(x)
END PROGRAM main

REAL FUNCTION double_it(arg)
  REAL :: arg
#ifdef DEBUG
  PRINT *, 'The value of x in double_it: ', arg
#endif
  double_it = 2.0 * arg
END FUNCTION double_it

The program uses the #ifdef and #endif directives around PRINT statements. If the macro DEBUG is defined, cpp will leave the PRINT statements in the source text that is passed to the compiler; if it is not defined, cpp will remove the statements. You can define the macro in the source text, using the #define directive; or you can define it on the command line, using the -D command-line option. The advantage of the option is that it does not require editing the source file to define or undefine a macro.

$ f90 +cpp=yes -D DEBUG cpp_direct.f90

Here is the output from a sample run of the executable program created by the preceding command line:

$ a.out
Enter a real number: 3
The value of x in main: 3.0
The value of x in double_it: 3.0
x = 6.0

The next command line does not use the -D option, so that DEBUG is undefined, causing cpp to remove the PRINT statements from the source text that is passed to the compiler:

$ f90 +cpp=yes cpp_direct.f90

Here is the output from the nondebugging version of the program:

$ a.out
Enter a real number: 3.3
x = 6.6

Saving the cpp output file

By default, the f90 command discards the source text as processed by cpp after compilation. However, you can preserve this text by compiling with the +cpp_keep option. If the source file has the .F or .f extension, the output from cpp is written to a file with the same name but with the .i extension. If the source file extension is .f90, the output file has the .i90 extension.

Here is the previous command line to preprocess and compile cpp_direct.f90, with the addition of the +cpp_keep option:

$ f90 +cpp_keep +cpp=yes cpp_direct.f90

After the PRINT statements have been removed, the resulting output file looks like this:

$ cat cpp_direct.i90
# 1 "cpp_direct.f90"
PROGRAM main
  REAL :: x
  WRITE (6, FMT='(A)', ADVANCE='NO') 'Enter a real number:'
  READ *, x
  PRINT *, 'x =', double_it(x)
END PROGRAM main
REAL FUNCTION double_it(arg)
  REAL :: arg
  double_it = 2.0 * arg
END FUNCTION double_it
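The naming rule for the saved cpp output can be summarized in a short shell sketch. The function name cpp_keep_name and the file name solver.F are hypothetical, and only the three extensions named in this section are handled. The behavior for any other extension is not stated here, so the sketch reports it as unknown rather than guessing:

```shell
#!/bin/sh
# cpp_keep_name (hypothetical helper): print the name of the file that
# +cpp_keep is described as writing for a given source file.
cpp_keep_name() {
    case $1 in
        *.f90)     printf '%s\n' "${1%.f90}.i90" ;;
        *.F | *.f) printf '%s\n' "${1%.?}.i" ;;
        *)         printf 'unknown extension: %s\n' "$1" >&2
                   return 1 ;;
    esac
}

cpp_keep_name cpp_direct.f90   # prints cpp_direct.i90
cpp_keep_name solver.F         # prints solver.i (solver.F is a made-up name)
```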

Creating demand-loadable executables

By default, the loader loads the entire code for an executable program into virtual memory. For very large programs, this can increase startup time. You can override this default by causing the linker to mark your program demand load. A demand-loadable program is loaded into memory a page at a time, as it is accessed.

Use the +demand_load option to make your program demand loadable, as follows:

$ f90 +demand_load prog.f90

The f90 command passes this option to the linker, which marks the executable program demand load.

Demand loading allows a program to start up faster because page loading can be spread across the execution of the program. The disadvantage of demand loading is that it can degrade performance throughout execution.

Creating shared executables

By default, the linker marks an executable program as shared. A shared executable is shareable by all processes that use the program. The first process to run the program loads its code into virtual memory. Any process that later runs the program while it is still loaded shares that code instead of loading its own copy.

You can override this default with the +noshared option, which causes the linker to mark the executable as unshared, making the program’s code nonshareable. The following command line causes the linker to mark the executable built from prog.f90 as unshared:

$ f90 +noshared prog.f90

In some circumstances, it may help to debug a program or to improve its runtime performance by making it nonshareable. In general, however, it is not desirable because nonshareable executables place greater demands on memory resources.

Compiling in 64-bit mode

Compiling HP Fortran programs with the +DA2.0W option (PA-RISC only; for Itanium, use +DD64) causes f90 to produce 64-bit executable programs. You should consider compiling in 64-bit mode if your program does any of the following:

• Accesses a large shared memory (greater than 1.75 gigabytes) or large data spaces (greater than 1 gigabyte or, if using EXEC_MAGIC, greater than 1.9 gigabytes)

• Uses large data elements (greater than 32-bit words)

• Provides objects or libraries that might be used in a 64-bit application

There are no HP Fortran language differences between 32-bit and 64-bit programs. Recompiling should suffice to convert a 32-bit Fortran program to run as a 64-bit program.

However, the C language has some differences in data type sizes. If your Fortran program calls functions written in C and is compiled in 64-bit mode, the size differences may require promoting the data items that are passed to or from the C functions. See the tables on page 181 for the size differences between Fortran and C data types when compiled in 64-bit mode.

NOTE: If your program does not need to run in 64-bit mode, there is no benefit to compiling it in 64-bit mode. In fact, the executable program may run slower than if compiled in 32-bit mode.

Using environment variables

Environment variables are variables that are defined in the operating environment of the system and are available to various system components. For example, when you run a program, the shell looks at the PATH variable to determine where the program is located. Table 24 lists and briefly describes the environment variables that control the way HP Fortran programs are compiled, linked, and run.

Table 24 HP Fortran environment variables

Environment variable     Description

F90ROOT                  Specifies the alternate root for the Fortran compiler and its libraries.

ST_F90COM[64]            Specifies the path of the 32-bit or 64-bit f90 compiler, respectively, when the compiler is on a non-standard path.

HP_F90OPTS               Specifies a list of command-line options that f90 inserts in the command line that invokes the HP Fortran compiler.

LPATH                    Specifies a list of directories that the linker is to search for libraries.

MP_NUMBER_OF_THREADS     Specifies the desired number of processors to be used to run HP Fortran programs that have been compiled for parallel execution.

TMPDIR                   Specifies a directory for temporary files; used in place of the default directory /var/tmp.

TTYUNBUF                 Controls tty buffering. To enable tty buffering, set TTYUNBUF to 0; to disable tty buffering, set it to a nonzero value.

The following sections describe how to use the F90ROOT, ST_F90COM[64], HP_F90OPTS, LPATH, and MP_NUMBER_OF_THREADS environment variables. See the environ (5) man page for information about system-level environment variables.

F90ROOT environment variable

The F90ROOT environment variable causes the f90 driver to invoke all subprocesses from an alternate F90 root directory, rather than from their default directory. The default F90 root directory is /opt/fortran90/. For example:

• Install the F90 compiler in an alternate root using the -r option of swinstall, or copy the /opt/fortran90/ directory to an alternate directory. Then set F90ROOT:

$ export F90ROOT=directory      (ksh/sh notation)
% setenv F90ROOT directory      (csh notation)

• Invoke f90 as <F90ROOT>/opt/fortran90/bin/f90.

• f90com32 or f90com64 is then invoked from the <F90ROOT>/opt/fortran90/lbin/ directory.

Only files in /opt/fortran90 are affected by the F90ROOT environment variable. Regardless of the HP Fortran driver you use, the compiler still uses the libraries, linker, and other files located in /usr/lib and /usr/ccs.

ST_F90COM[64] environment variable

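The ST_F90COM and ST_F90COM64 environment variables specify the location of f90com32 and f90com64, respectively, when they reside on a non-standard path. The following command lines set them using (respectively) the ksh and csh syntax:

$ export ST_F90COM=/tmp/f90com32
$ export ST_F90COM64=/tmp/f90com64

% setenv ST_F90COM /tmp/f90com32
% setenv ST_F90COM64 /tmp/f90com64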

HP_F90OPTS environment variable

The HP_F90OPTS environment variable is read by the f90 driver for options to insert in the command line. This variable is useful when you want the same options and arguments each time you invoke the f90 command. For example, if HP_F90OPTS is set to the -v option, the following command line:

$ f90 +list hello.f90

is equivalent to:

$ f90 -v +list hello.f90

The syntax of the HP_F90OPTS variable allows the bar (|) character to be used to specify that options appearing before | are to be recognized before any options on the command line and that options appearing after | are to be recognized after any options on the command line. For example, the commands:

$ export HP_F90OPTS="-O|-lmylib"
$ f90 -v hello.f90

are equivalent to the command line:

$ f90 -O -v hello.f90 -lmylib
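The effect of the bar can be sketched in a few lines of shell. This only illustrates the rule stated in this section; it is not the f90 driver's actual implementation. The function name effective_f90_command is made up, and the sketch assumes that no option contains embedded spaces or wildcard characters:

```shell
#!/bin/sh
# effective_f90_command (hypothetical helper): print the command line
# that results from combining HP_F90OPTS with the arguments typed by
# the user, following the rule for the bar (|) character.
effective_f90_command() {
    opts=${HP_F90OPTS-}
    case $opts in
        *'|'*) before=${opts%%'|'*}    # text left of the first bar
               after=${opts#*'|'} ;;   # text right of the first bar
        *)     before=$opts            # no bar: everything goes first
               after= ;;
    esac
    # $before and $after are left unquoted on purpose so that each
    # option becomes a separate word.
    set -- f90 $before "$@" $after
    printf '%s\n' "$*"
}

HP_F90OPTS="-O|-lmylib"
effective_f90_command -v hello.f90      # prints: f90 -O -v hello.f90 -lmylib

HP_F90OPTS="-v"
effective_f90_command +list hello.f90   # prints: f90 -v +list hello.f90
```

The second call reproduces the earlier -v example: without a bar, every option in HP_F90OPTS is placed ahead of the options typed on the command line.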

LPATH environment variable

The LPATH environment variable is read by the linker to determine where to look for libraries to link with a program’s object file. Depending on whether LPATH is set or not, one of the following actions occurs:

• If LPATH is already set, only the directories listed in LPATH are searched. This happens, for example, when LPATH is set in a user’s .kshrc or .cshrc file, or after LPATH is defined from the command line.

• If LPATH is not set, the f90 command sets default LPATH settings that are used when linking the object files listed on the f90 command line.

Because the f90 command sets LPATH before calling the linker, it should not be necessary to set this variable for most compilations. However, if you do need to set it (for example, you use the ld command to link), the following directories should be the first items in LPATH:

• /opt/fortran90/lib

• /usr/lib

• /opt/langtools/lib

The following command lines set LPATH to include these directories, using (respectively) the ksh and csh syntax:

$ export LPATH=/opt/fortran90/lib:/usr/lib:/opt/langtools/lib

% setenv LPATH "/opt/fortran90/lib:/usr/lib:/opt/langtools/lib"
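If LPATH may already hold other directories, one way to keep the three required directories first is to prepend them. The following sketch assumes a POSIX shell; the function name fortran_lpath and the directory /home/me/lib are illustrative only:

```shell
#!/bin/sh
# fortran_lpath (hypothetical helper): print an LPATH value that lists
# the directories required for linking first, followed by any existing
# value passed as the first argument.
fortran_lpath() {
    required=/opt/fortran90/lib:/usr/lib:/opt/langtools/lib
    # ${1:+:$1} appends ":<existing>" only when an existing value is given
    printf '%s\n' "${required}${1:+:$1}"
}

# Keep whatever LPATH already holds, but put the required entries first.
LPATH=$(fortran_lpath "${LPATH-}")
export LPATH

fortran_lpath /home/me/lib
# prints /opt/fortran90/lib:/usr/lib:/opt/langtools/lib:/home/me/lib
```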

To see how f90 sets LPATH before calling the linker, compile with the -v option for verbose output. For an example, see “Linking with f90 vs. ld” (page 50).

MP_NUMBER_OF_THREADS environment variable

The MP_NUMBER_OF_THREADS environment variable sets the number of processors that are to execute a program that has been compiled for parallel execution. If you do not set this variable, it defaults to the number of processors on the executing machine.

The following command lines set MP_NUMBER_OF_THREADS to specify that programs compiled for parallel execution can execute on two processors:

$ export MP_NUMBER_OF_THREADS=2 # ksh syntax

% setenv MP_NUMBER_OF_THREADS 2 # csh syntax

For information about parallel execution, see “Compiling for parallel execution” (page 100).

Floating installation

More than one version of the HP Fortran compiler can be installed on one system at the same time. The floating installation feature allows you to install the compiler in any location. You can install as many compiler versions as required, depending on your system’s resources.

By default, HP Fortran is installed under the /opt/fortran90 directory. In earlier releases, the compiler driver (f90) looked for related files in subdirectories of the /opt/fortran90 directory. This prevented installation of more than one version of HP Fortran on the same system at the same time.

Only files in /opt/fortran90 are affected by floating installation. Regardless of the HP Fortran driver you use, the compiler still uses the libraries, linker, and other files located in /usr/lib and /usr/ccs.

Floating installation is designed to help facilitate in-house development. You must not ship libraries in non-standard places, because explicit runtime library specifications and linker options are required.


Setting up floating installation

You may want to install the most recent compiler version and keep the prior version on one system. If there are problems with the most recent version, you can easily switch to the prior one. Following is an example of how to set up the floating installation feature for this purpose. Assume that your system will have two versions of the compiler, both floating install enabled. In this case, 3.6.1 is the prior version, and 3.6.2 or 3.7 is the more recent version.

To set up floating installation, complete the following steps:

1. Copy the prior version to another directory.

$ cp -rp /opt/fortran90 /opt/fortran90.3.6.1

2. Use swinstall to install the new version (3.6.2 or 3.7 in this case).

3. To use 3.6.1, invoke the driver from the copied directory:

$ /opt/fortran90.3.6.1/bin/f90

In this case, the compiler is picked up from /opt/fortran90.3.6.1/lbin/. To use 3.6.2 or 3.7 (whichever was installed most recently), invoke the driver from its default location, in either of the following ways:

$ f90
$ /opt/fortran90/bin/f90
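Switching between the two versions can be wrapped in a small shell function. This is a sketch that assumes the directory layout from the steps in this section (a copy named /opt/fortran90.<version> next to the default /opt/fortran90); the function name f90_driver_path is made up for illustration:

```shell
#!/bin/sh
# f90_driver_path (hypothetical helper): print the path of the f90
# driver for a given compiler version. With no argument, print the
# path of the default driver under /opt/fortran90.
f90_driver_path() {
    if [ -n "${1-}" ]; then
        printf '%s\n' "/opt/fortran90.$1/bin/f90"
    else
        printf '%s\n' "/opt/fortran90/bin/f90"
    fi
}

f90_driver_path 3.6.1   # prints /opt/fortran90.3.6.1/bin/f90
f90_driver_path         # prints /opt/fortran90/bin/f90
```

A build script could then run "$(f90_driver_path 3.6.1)" in place of f90 when it needs to fall back to the prior version.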

Floating Installation can also work with:

• Alternate path, for example,

<alternate-path>/opt/fortran90.3.6.1/

• Extended float path, that is, if the Fortran driver is invoked from <float-path>/bin/f90, the compiler is picked up from the relative <float-path>/lbin/ path. Because this usage is not recommended, the compiler still uses the Fortran and other libraries, the linker, and other files located in the default paths.

The precedence of various environment variables that control the Fortran driver to pick up the Fortran compiler is in the following order:

• F90ROOT

• SDKROOT

• ST_F90COM[64]

NOTE: To use floating installation, do not set the following environment variables:

• F90ROOT

• SDKROOT

• ST_F90COM[64]


3 Controlling data storage

This chapter describes the use of command-line options, directives, and other language features to control data in HP Fortran programs. In particular, it discusses the following topics:

• Disabling implicit typing

• Automatic and static variables

• Increasing the precision of constants

• Sharing data among programs

• Modules vs. common blocks

NOTE: For information about how HP Fortran aligns data, see HP Fortran Programmer's Reference.

Disabling implicit typing

By default, HP Fortran uses implicit typing to determine the type of a variable or function that has not been declared with a type declaration statement. That is, the type of an undeclared entity is determined by the first letter of its name: if the letter is in the range I - N, the entity is of type integer; otherwise, it is of type real.

Although implicit typing is mandated by the Standard, its use can become a source of runtime bugs because implicit typing allows the inadvertent use of undeclared variables or functions. For the sake of illustration, consider a program that calls a nonintrinsic library function named foo. Assume that:

• The default typing rules are in effect.

• foo returns an integer.

• The programmer has not declared the return type of foo and has assigned its return value to a variable of type real.

Experience has shown that this is not an unlikely scenario and that it can produce unexpected results.

The Standard provides the IMPLICIT NONE statement to override implicit typing. But the IMPLICIT NONE statement is limited in scope to the program unit in which it appears. To force explicit typing for all files specified on the command line, use the +implicit_none option. This option disables implicit typing; that is, all variables, arrays, named constants, function subprograms, ENTRY names, and statement functions (but not intrinsic functions) must be explicitly declared.

Using this option is equivalent to specifying IMPLICIT NONE for each program unit in each file specified on the f90 command line. However, the +implicit_none option does not override any IMPLICIT statements in the source file. The HP Fortran Programmer's Reference describes the implicit typing rules, the IMPLICIT NONE statement, and the +implicit_none option.

Automatic and static variables

By default, HP Fortran allocates stack storage for program variables. Such variables are called automatic variables because they are allocated at each invocation of the program unit in which they are declared.

Static variables are allocated storage from static memory when the program is first loaded into memory. They remain allocated for the life of the program.


HP Fortran allocates static storage for the following variables:

• Variables specified in a COMMON or EQUIVALENCE statement.

• Variables initialized in a type declaration statement or in a DATA statement.

• Variables specified in a SAVE or STATIC statement. A SAVE statement without a variable list specifies static storage for all variables in the scoping unit.

• Variables in program files that have been compiled with the +save or +Oinitcheck command-line option. See “Uninitialized variables” (page 144) for information about using these options when porting.

Static variables have two characteristics that are of special interest:

• They are set to 0 or null value at load-time.

• They do not require re-initialization at each invocation of their program unit.

Static variables have several disadvantages. In Fortran programs that use recursion, static variables can defeat one purpose of recursion—to provide a fresh set of local variables at each recursive call. Also, the widespread use of static variables in a program can slow its performance: static variables are ineligible for such fundamental optimizations as register allocation, and they can limit the optimization of program units that use them.

The following example program illustrates the difference between automatic and static variables. The program consists of a main program unit that calls a recursive internal subroutine. The subroutine increments two variables (stat_val and auto_val), prints the updated variables, and then calls itself recursively. Neither of the two variables is explicitly initialized, but stat_val is declared with the SAVE attribute, which means that it is allocated static storage and is pre-initialized to 0 by the compiler.

Example 3-1 recursive.f90

PROGRAM main
  ! This program calls a recursive internal subroutine.
  CALL recurse
CONTAINS
  ! This subroutine calls itself four times.
  ! Each time it is called, it adds 1 to the values in
  ! stat_val and auto_val and displays the result.
  ! stat_val has the SAVE attribute and therefore is
  ! pre-initialized and retains its value between calls.
  ! auto_val is an automatic variable and therefore has
  ! an unpredictable value (plus 1) at each call.
  RECURSIVE SUBROUTINE recurse
    INTEGER(KIND=1), SAVE :: stat_val
    INTEGER(KIND=1) :: auto_val
    stat_val = stat_val + 1
    auto_val = auto_val + 1
    PRINT *, 'stat_val = ', stat_val
    PRINT *, 'auto_val = ', auto_val
    IF (stat_val < 4) THEN
      CALL recurse()
    END IF
  END SUBROUTINE recurse
END PROGRAM main

Following are the command lines to compile and execute this program, along with sample output. Notice that stat_val regularly increments at each call. The reason is that it is a static variable and therefore retains its value between calls. But auto_val is not actually incremented; it is an automatic variable and is given a fresh (and uninitialized) memory location at each call. In other words, the subroutine adds 1 to whatever value happened to be in the memory location that was allocated to auto_val at the start of the call:

$ f90 recursive.f90
$ a.out
stat_val = 1
auto_val = 124
stat_val = 2
auto_val = 1
stat_val = 3
auto_val = 65
stat_val = 4
auto_val = 65

NOTE: HP Fortran provides the AUTOMATIC and STATIC statements as porting extensions. The STATIC statement is functionally the same as the SAVE statement, and the AUTOMATIC statement may be used to declare a variable as automatic. However, such a declaration is generally pointless because variables compiled under HP Fortran are automatic by default.

The HP Fortran Programmer's Reference provides detailed information about the AUTOMATIC, SAVE, and STATIC statements.

Increasing the precision of constants

By default, HP Fortran evaluates all floating-point constants as single-precision. For example, the compiler treats the following constant

3.14159265358979323846

as though you had specified:

3.1415927

Although the loss of precision might be acceptable when assigning to single-precision variables, it might be less acceptable when assigning to double-precision variables or when using floating-point constants in expressions where the loss in precision might result in significant round-off differences.

NOTE: HP Fortran provides two ways to override the default precision of individual constants: the kind parameter and the exponent form. The kind parameter indicates the precision of floating-point constants: 4 for single-precision, 8 for double-precision, and 16 for quad-precision.

In the following example, the kind parameter _8 specifies that the constant is to be evaluated as double-precision:

3.14159265358979323846_8

To change the precision of all floating-point constants (except those having a kind parameter), you can use the +real_constant option. This option takes two forms, +real_constant=double and +real_constant=single, which specify (respectively) double-precision and single-precision for floating-point constants in the files compiled with this option. The +real_constant=single form is the default. Neither form of the option has any effect on constants that have the kind parameter.

To promote all floating-point constants in the source files x.f, y.f, and z.f, compile with the command line:

$ f90 +real_constant=double x.f y.f z.f

The +real_constant=singleoption specifies that all floating-point constants in a file are to be treated as single-precision (the default). The following command line specifies single-precision for all floating-point constants in the files a.f, b.f, andc.f :

$ f90 +real_constant=single a.f b.f c.f

Note that +real_constant=singledoes not demote constants that use either the kind parameter or the exponent form (for example, 4.0D0) .

For information about increasing the precision of variables, see “Increasing default data sizes” (page 70). The HP Fortran Programmer’s Reference describes the syntax of the kind parameter and the exponent form and the +real_constant option. For detailed information about how floating-point arithmetic is implemented on HP 9000 computers and how floating-point behavior affects the programmer, see the HP-UX Floating-Point Guide.

Increasing default data sizes

The +autodbl and +autodbl4 options enable you to increase the default sizes (that is, the number of storage bytes) for both constants and variables of default numeric and logical types. Unlike the +real_constant option, the +autodbl and +autodbl4 options affect both constants and variables of both real and integer types. (For information about using the +real_constant option, see “Increasing the precision of constants” (page 69).)

When a program is compiled with the +autodbl or +autodbl4 option, constants are treated as though they had twice the default number of bytes (4) available for evaluating them. The effect of these options is to increase the range of default integers and the precision of default reals.

The +autodbl and +autodbl4 options have no effect on the size of entities declared with the CHARACTER, BYTE, or DOUBLE COMPLEX statements, nor on entities that are explicitly sized. That is, if a variable is declared with a kind parameter or if a constant has a kind parameter, it is unchanged by +autodbl or +autodbl4.

NOTE: HP Fortran interprets the kind parameter as indicating the number of storage bytes to allocate for a variable. When used with variables and constants of type real, the kind parameter also indicates the precision: 4 for single-precision, 8 for double-precision, and 16 for quad-precision.

Promoting double-precision variables to quad-precision can have a severe impact on performance because the instructions to perform quad-precision operations are implemented in software. If you are concerned about performance and want to increase default data sizes, consider using the +autodbl4 option, which does not promote variables declared with the DOUBLE PRECISION statement. There is no other difference between +autodbl and +autodbl4.

These options affect all files on the command line. To increase the size or precision of selected variables and constants, use the kind parameter.
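One way to see which default sizes are in effect for a given compilation is to print the results of the standard inquiry functions KIND, RANGE, and PRECISION. The following is a minimal sketch with illustrative names. The expectation that the reported values change for the default types when the program is compiled with +autodbl or +autodbl4 is an assumption, not something stated in this guide.

```fortran
PROGRAM default_sizes
  IMPLICIT NONE
  INTEGER          :: i      ! default integer
  REAL             :: x      ! default real
  DOUBLE PRECISION :: d      ! declared with DOUBLE PRECISION
  REAL(KIND=8)     :: e      ! explicitly sized with the kind parameter
  LOGICAL          :: flag   ! default logical

  ! KIND, RANGE, and PRECISION are inquiry functions: they report
  ! properties of the argument's type and do not read its value.
  PRINT *, 'default INTEGER:  kind', KIND(i), ' decimal range ', RANGE(i)
  PRINT *, 'default REAL:     kind', KIND(x), ' decimal digits', PRECISION(x)
  PRINT *, 'DOUBLE PRECISION: kind', KIND(d), ' decimal digits', PRECISION(d)
  PRINT *, 'REAL(KIND=8):     kind', KIND(e), ' decimal digits', PRECISION(e)
  PRINT *, 'default LOGICAL:  kind', KIND(flag)
END PROGRAM default_sizes
```

Comparing the output of builds made with and without the options shows which declarations are affected; the explicitly sized variable e should report the same values in every build.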

Figure 2, “Increasing default data sizes” (page 70), shows the default data types whose sizes are changed by the +autodbl and +autodbl4 options.


Figure 2 Increasing default data sizes

70 Controlling data storage

The following program illustrates the different effects of the +autodbl and +autodbl4 options. The program assigns the same quad-precision constant to three variables:

• x, a default (that is, single-precision) real

• y, a real that is declared as quad-precision with the kind parameter

• z, a double-precision real that is declared with the DOUBLE PRECISION statement

Example 3-2 precision.f90

PROGRAM main
  REAL x
  REAL(KIND=16) y
  DOUBLE PRECISION z

  ! Assign a quad-precision constant to a default real:
  x = 3.14159265358979323846_16
  PRINT 10, 'Stored in x: ', x

  ! Assign a quad-precision constant to a variable that
  ! has been explicitly sized for quad-precision:
  y = 3.14159265358979323846_16
  PRINT 10, 'Stored in y: ', y

  ! Assign a quad-precision constant to a variable
  ! declared with the DOUBLE PRECISION statement:
  z = 3.14159265358979323846_16
  PRINT 10, 'Stored in z: ', z

10 FORMAT (A, F22.20)
END PROGRAM main

Following are three different sets of command lines to compile and execute this program, including sample output from each compilation. Note that variable y remains the same for each compilation: the compiler does not promote variables that are sized with the kind parameter.

First, the program is compiled without any option:

$ f90 precision.f90
$ a.out
Stored in x: 3.14159274101257320000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979310000

Next, the program is compiled with the +autodbl option. As shown in the output, x is promoted to double-precision and z to quad-precision:

$ f90 +autodbl precision.f90
$ a.out
Stored in x: 3.14159265358979310000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979323846

Finally, the program is compiled with the +autodbl4 option. As shown in the output, x is promoted, but z is not:

$ f90 +autodbl4 precision.f90
$ a.out
Stored in x: 3.14159265358979310000
Stored in y: 3.14159265358979323846
Stored in z: 3.14159265358979310000

Though useful for increasing the range and precision of numerical data, the +autodbl and +autodbl4 options are especially useful when porting; see “Large word size” (page 145). For detailed information about these options, see the HP Fortran Programmer’s Reference. For detailed information about how floating-point arithmetic is implemented on HP 9000 computers and how floating-point behavior affects the programmer, see the HP-UX Floating-Point Guide.


Sharing data among programs

If you are designing an application that requires multiple threads of control that share the same data, the design can take either of two forms:

• The program makes calls to the threads library, /usr/lib/libpthread.sl, which creates multiple threads executing in a single process and therefore all sharing the same address space.

• The application consists of several programs that run simultaneously in separate processes and that access an HP-UX shared memory segment.

The first approach is beyond the scope of this manual and requires that you have an understanding of how to call the threads library.1 The second approach is described here.

To share data among several HP Fortran programs that are executing simultaneously in separate processes, use the $HP$ SHARED_COMMON directive. This directive enables you to create a common block that is accessible by HP Fortran programs executing in different processes.

The $HP$ SHARED_COMMON directive causes the compiler to insert HP-UX system calls to perform shared memory operations. To the programmer, the programs sharing the memory segment appear as though they were program units in the same program, accessing a set of common block variables.

Following are two programs to illustrate how the $HP$ SHARED_COMMON directive works:

• The first program, go_to_sleep.f90, must execute first. Because it executes first, it creates the shared memory segment and then enters a DO loop, where it waits until the second program starts to execute. You can use the ipcs -m command to confirm that a shared memory segment has been created.

• When the second program, wake_up.f90, starts to execute, it writes to the shared common block variables, one of which causes go_to_sleep.f90 to break out of the DO loop and run to completion.

The $HP$ SHARED_COMMON directive must appear at the beginning of the specification part of the main program unit of each program sharing the memory segment. Also, the common block specified for sharing must have the same layout in all files in which it is declared.

You can use the ipcs -m command both to determine that HP-UX has created a shared memory segment and, after the programs complete execution, to confirm that it has been released.

• Specifying the +Oparallel option causes the compiler to transform eligible loops in an HP Fortran program for parallel execution on HP 9000 systems. For information about compiling for parallel execution, see “Compiling for parallel execution” (page 100).

The following two examples illustrate these concepts.

Example 3-3 go_to_sleep.f90

PROGRAM main
! This program, go_to_sleep.f90, and its companion, wake_up.f90,
! share data in a common block, using the $HP$ SHARED_COMMON
! directive. Execute this program first. After it starts to
! execute, use ipcs(1) to confirm that a shared memory segment
! has been created. In a separate process, run wake_up.f90.
! When it executes, it assigns to alarm, ending this program.

LOGICAL :: alarm
CHARACTER(LEN=8) :: message

! Declare a common block, shared_data, for sharing among
! multiple, simultaneously executing programs. Each program
! that shares the common block must reference it by the same
! key, 'scb1'
!$HP$ SHARED_COMMON KEY='scb1' /shared_data/

! Declare a common block with two variables: alarm and message.
! When alarm is set by wake_up.f90, this program breaks out
! of the DO loop, prints message (which wake_up.f90 has
! written to), and exits.
COMMON /shared_data/ alarm, message

alarm = .FALSE.

! Wait for alarm to be set...
DO WHILE (.NOT. alarm)
  ! sleep(1) is an HP-UX system call that suspends a process
  ! for the number of seconds specified by the argument.
  ! The %VAL function tells Fortran that sleep expects its
  ! argument to be passed by value.
  CALL sleep(%VAL(1))
END DO

! Message from wake_up.f90:
PRINT *, message

! The shared memory segment is destroyed when this program halts.
END

IMPORTANT: In the example above, you must use +U77 to access the correct sleep in the Fortran library. If you use +U77, the line above:

CALL sleep (%VAL(1))

should instead read:

CALL sleep (1)

PROGRAM main
! This program, wake_up.f90, should be run just after its
! companion, go_to_sleep.f90, starts to execute but in a
! separate process. The $HP$ SHARED_COMMON directive
! enables both programs to share the same memory.

! Directive puts the common block in shared memory.
!$HP$ SHARED_COMMON KEY='scb1' /shared_common/

LOGICAL :: alarm
CHARACTER(LEN=8) :: message

! Declare a named common block for shared memory. It must
! be laid out in exactly the same way in both programs.
COMMON /shared_common/ alarm, message

! Write to message, sleep reads it.
message = "I'm up!"

! Set alarm to wake up sleep.
alarm = .TRUE.

! The shared memory segment will now be detached.
! However, because go_to_sleep is still running,
! the segment will still be present in memory until
! it stops executing, too.
END

Following are the command lines to compile each program:

$ f90 -o go_to_sleep go_to_sleep.f90
$ f90 -o wake_up wake_up.f90

Run the first program in any process by doing the following:

$ go_to_sleep

In another process, use the following command line to confirm that a shared memory segment has been created for the program (the last in the list is the newly created one):

$ ipcs -m
IPC status from /dev/kmem as of Fri Mar 21 15:55:29 1997
T     ID     KEY        MODE        OWNER     GROUP
Shared Memory:
m      0 0x4119c72b --rw-rw-rw-      root      root
m      1 0x4e180002 --rw-rw-rw-      root      root
m      2 0x41187bf4 --rw-rw-rw-      root      root
m      3 0x00000000 --rw-------      root       sys
m   7004 0x43186ea0 --rw-rw-rw-    daemon    daemon
m   6005 0x73636231 --rw-rw-rw-        ed      lang
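In the sample listing, the KEY value of the last entry, 0x73636231, matches the character codes of the string 'scb1' that both programs give in the directive. This is an observation about the sample output, not a documented rule for how the key is derived. The following sketch (the program name is illustrative) prints the character codes of a four-character key in the same hexadecimal form, which may help when looking for a segment in the ipcs listing:

```fortran
PROGRAM key_to_hex
  IMPLICIT NONE
  CHARACTER(LEN=4) :: key = 'scb1'
  CHARACTER(LEN=8) :: hex
  INTEGER :: i

  ! Write the code of each character as two hexadecimal digits
  ! into the internal file hex.
  WRITE (hex, '(4Z2.2)') (ICHAR(key(i:i)), i = 1, LEN(key))

  PRINT '(A)', '0x' // hex
END PROGRAM key_to_hex
```

On a system that uses ASCII character codes, this prints 0x73636231.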


Now run the second program in the second process:

$ wake_up

At this point, the program executing in the first process outputs the following and completes execution:

I'm up!

The following command line confirms that the shared memory segment was released:

$ ipcs -m

For information about sharing data between Fortran program units and C functions within the same program, see “Sharing data” (page 122). The HP Fortran Programmer's Reference provides detailed information about the COMMON statement and about the $HP$ SHARED_COMMON directive. Refer to the shmop(2) man page for information about HP-UX shared memory operations.

Modules vs. common blocks

The common block has been a mainstay of Fortran programs throughout the evolution of the language, and it continues to be a part of Fortran. The common block provides a convenient means to share data among program units, especially when the program units sharing data do not otherwise communicate with each other. The common block can also be used to share data between simultaneously executing Fortran programs (see “Sharing data among programs” (page 72)) and between Fortran program units and C functions linked together in the same program (see “Sharing data” (page 122)).

One of the problems with the common block, however, is that the programmer must replicate the COMMON declaration in each of the sharing program units. If any of the common variables are out of order or have a different type or size, the program units may not access the same data. The compiler gives no indication of this discrepancy because it assumes that the programmer is giving one program unit a different view of the shared storage, even when the discrepancy is due to oversight.
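The kind of discrepancy described above can be sketched as follows. The example is deliberately incorrect: the two subroutines list the same common block variables in a different order, so reader does not see the values that writer stored. All names are illustrative, and the values printed depend on the system, so none are shown.

```fortran
SUBROUTINE writer
  IMPLICIT NONE
  INTEGER :: count
  REAL    :: scale
  COMMON /settings/ count, scale
  count = 3
  scale = 2.5
END SUBROUTINE writer

SUBROUTINE reader
  IMPLICIT NONE
  REAL    :: scale
  INTEGER :: count
  ! Intentional mistake: the variables are listed in the opposite
  ! order, so each name overlays the storage of the other.
  COMMON /settings/ scale, count
  PRINT *, 'count =', count, ' scale =', scale
END SUBROUTINE reader

PROGRAM mismatch
  IMPLICIT NONE
  CALL writer
  CALL reader
END PROGRAM mismatch
```

Because both declarations describe the same amount of storage, a compiler has no reason to reject either one, which is why this kind of error tends to surface only as wrong results at run time.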

To deal with this problem, many implementations of FORTRAN 77 have provided the INCLUDE extension. This extension enables the user to centralize common block definitions in one file. At compile-time, the compiler reads the file into program units that have the INCLUDE line. While this approach eliminates the problem of discrepant common blocks, it introduces another problem: the INCLUDE facility is nonstandard FORTRAN 77, and its use is nonportable.

To deal with the portability issue, Standard Fortran defines the INCLUDE line. Unfortunately, the definition in the Standard leaves many of the details up to the implementation, so that use of the INCLUDE line in Fortran programs still runs the risk of nonportability.

Another problem with the common block—especially when used with equivalencing—is that it can inhibit optimization. Common block variables are generally ineligible for register allocation, and aliasing variables in common can prevent the optimization of the program units that use the aliased variables.

The module program unit is the Fortran answer to the common block. The programmer declares shareable variables in a module. Any program unit that wants to access them references the name of the module in a USE statement. The concept of the module eliminates the need to re-declare the common variables, without requiring the INCLUDE line.


In addition, the module provides the following controls on access to module data:

• The PUBLIC and PRIVATE statements declare which module variables are accessible outside the module and which are not.

• The USE statement has an ONLY clause that specifies which module variables are accessible to a particular program unit.

• The USE statement also has a renaming feature to resolve name clashes between local variables and module variables.

Another feature of the module is that it can include procedures. This feature provides a way to package data with the procedures needed to operate on the data. A program unit accesses module procedures in the same way it does module data, with the USE statement. The interface of module procedures is available to the compiler, which can perform compile-time checks on the actual arguments that are passed to a module procedure.
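The module features described above (PUBLIC and PRIVATE, the ONLY clause, renaming, and module procedures) can be combined as in the following minimal sketch. The module, variable, and procedure names are illustrative and do not come from the HP documentation.

```fortran
MODULE shared_state
  IMPLICIT NONE
  PRIVATE                                ! hide everything by default
  PUBLIC :: alarm, message, set_alarm    ! export only these names

  LOGICAL          :: alarm = .FALSE.
  CHARACTER(LEN=8) :: message = ' '
  INTEGER          :: update_count = 0   ! private: not visible to users

CONTAINS

  ! Module procedure: its interface is known to the compiler, so
  ! the actual argument in each call is checked at compile time.
  SUBROUTINE set_alarm(text)
    CHARACTER(LEN=*), INTENT(IN) :: text
    message = text
    alarm = .TRUE.
    update_count = update_count + 1
  END SUBROUTINE set_alarm

END MODULE shared_state

PROGRAM main
  ! ONLY limits what is imported; is_set => alarm renames the
  ! module variable alarm to is_set within this program unit.
  USE shared_state, ONLY: is_set => alarm, message, set_alarm
  IMPLICIT NONE

  CALL set_alarm("I'm up!")
  IF (is_set) PRINT *, message
END PROGRAM main
```

Unlike the common block, the declarations of alarm and message appear only once, in the module, so the program units that use them cannot disagree about their order, type, or size.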

Although the module does not completely replace the common block (see, for example, “Sharing data among programs” (page 72)), it does provide a safer and more flexible alternative to the more common uses (and abuses) of the common block. For an example of a program that uses the module to share data, see “Compiling programs with modules” (page 54). The HP Fortran Programmer's Reference provides detailed information about the module program unit and the MODULE and USE statements.


4 Debugging

This chapter describes different HP Fortran features for debugging programs. These features include compile-line options, compiler directives, and programming tools that are useful for locating errors in your program. More specifically, this chapter discusses the following topics:

• Using the HP WDB debugger

• Stripping debugging information

• Handling runtime exceptions

• Using debugging lines

Using the HP WDB debugger

The HP WDB debugger is the primary tool for debugging HP Fortran programs. The debugger provides such basic debugging functions as program control, process control, program and data monitoring, and expression evaluation. The debugger has both a graphical interface and a line-mode interface.

The debugger software includes different managers that enable it to handle different source languages, target machines, object file formats, and user formats. The Fortran language manager allows you to use Fortran syntax when entering expressions on the debugger command line.

Before beginning a debugging session, you must compile the program with the -g compile-line option. If you compile and link separately, you must use the -g option on both command lines. The option causes the compiler to generate additional information needed by the debugger and to insert it into the output code.

After compiling your program with the -g option, invoke the debugger with the wdb command, supplying the name of the executable as an argument. For example, the following command compiles prog.f90 for debugging:

$ f90 -g prog.f90 -o db_prog

Here is the command to start debugging the executable program:

$ wdb db_prog

You can use the debugger to debug code that has been optimized at levels 0, 1, and 2. To debug optimized code, compile the program with both the -g and +Oopt-level options, where opt-level is 0, 1, or 2. The following command line compiles prog.f90 at optimization level 2 and prepares it for debugging:

$ f90 +O2 -g prog.f90 -o db_prog

Compiling with the -g option increases the size of both the object file and the executable file. After you have debugged your program and are ready to build the production version, you may want to recompile without the -g option.

For complete information about the HP WDB debugger, see http://www.hp.com/go/wdb. Printed and online documentation are available at this site.

Stripping debugging information

Programs compiled with HP Fortran include minimal debugging information in the executable program. This information consists of a symbol table—a list of all the symbols in your program and their offset addresses. The symbol table provides the information needed to produce a procedure traceback. It is also used by the debugger and by the CXperf performance analysis tool.

However, the symbol table is not the same as the debugging information that is added to your program when you compile with the -g option. The symbol table is added to an executable even if the program is not compiled with the -g option.


If the size of the executable is critical to your application, you can use the +strip option to remove symbol table information from the production version of your program. If you compile and link on separate command lines, you must use the +strip option on both command lines. Instead of recompiling with +strip, you can use the strip utility, which removes all debugging information, including the symbol table.

If the size of your executable is not important, you may want to retain the symbol table in the production version of your program. This table can be used by the debugger to provide minimal debugging. If a program has not been compiled with -g and does not include a symbol table, it is unusable by the debugger. Also, without the information provided by the symbol table, a procedure traceback displays virtual addresses only.

The amount of code that the symbol table information adds to an executable is considerably less than the amount that compiling with -g adds. For descriptions of the -g and +strip options, see the HP Fortran Programmer’s Reference. For information about the strip utility, refer to the strip(1) man page.

Handling runtime exceptions

Broadly defined, an exception is an error or fault condition that affects a program’s results. Exceptions can range from the relatively benign inexact result condition that occurs in certain floating-point operations to the more severe segmentation violation that occurs when a runaway program attempts to access unallocated memory.

Exceptions that threaten the integrity of the operating system can cause HP-UX to raise an exception signal (for example, SIGSEGV for a segmentation violation) so that the process can take appropriate action to recover from the exception. Such exceptions may cause the program that took the exception to abort, but not necessarily. By trapping an exception—that is, by catching the signal—a program may handle the exception, if only by aborting when it occurs.

There is also a well-defined set of floating-point conditions that, although they pose no threat to the operating system, can also cause an exception, for example, dividing a floating-point number by zero. By default, traps for floating-point exceptions are disabled on HP 9000 computers, but they can be enabled by the +fp_exception and +FP options. (You can also use the ON statement to enable traps for floating-point exceptions.)

Programs that have been compiled with the +fp_exception option can trap the exceptions listed in Table 25. Each exception listed in the first column causes the operating system to generate the signal listed in the second column. Programs compiled with +FP can trap specific floating-point exceptions (SIGFPE).

Table 25 Signals recognized by +fp_exception

Exception                                 Signal

Bus error instruction                     SIGBUS

Floating-point exceptions                 SIGFPE

Illegal instruction                       SIGILL

Segmentation violation or memory fault    SIGSEGV

Bad argument to a kernel system call      SIGSYS

When a program compiled with +fp_exception takes an exception, the following events occur:

• The program traps the exception.

• A procedure traceback is displayed on standard error. A procedure traceback lists routine names and their offsets that are helpful in locating the code that triggered the exception.

• The program aborts.


The following sections discuss each of the exceptions listed in Table 25. For more information about signals, refer to the signal(2) and signal(5) man pages.

NOTE: Standard Fortran 90 provides the IOSTAT= and ERR= specifiers for handling I/O runtime errors. For information about these specifiers, refer to the descriptions of the I/O statements (for example, OPEN and READ) in the HP Fortran Programmer’s Reference. For a descriptive list of the error messages that can be returned by IOSTAT=, see the HP Fortran Programmer’s Reference.
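As a brief illustration of the IOSTAT= specifier, the following sketch tries to open a file that is assumed not to exist and tests the status value instead of letting the program abort. The program name, unit number, and file name are illustrative.

```fortran
PROGRAM open_check
  IMPLICIT NONE
  INTEGER :: ios

  ! STATUS='OLD' requires the file to exist. With IOSTAT= present,
  ! a failure sets ios to a nonzero value and execution continues.
  OPEN (UNIT=10, FILE='no_such_file.dat', STATUS='OLD', &
        ACTION='READ', IOSTAT=ios)

  IF (ios /= 0) THEN
    PRINT *, 'OPEN failed, IOSTAT =', ios
  ELSE
    PRINT *, 'OPEN succeeded'
    CLOSE (10)
  END IF
END PROGRAM open_check
```

The numeric value stored in ios is processor-dependent, which is why the program tests only whether it is nonzero.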

Bus error exception

A bus error exception occurs when a program references an inaccessible memory location, typically because the reference is to an unaligned or nonexistent address, or because of a hardware failure.

The most likely cause of a bus error is an unaligned data reference. A program that passes an array of (KIND=1) elements to a routine that attempts to access them as (KIND=4) elements may take a bus error exception. Or if an array of (KIND=1) elements is declared in a common block and the third element is passed to a routine that attempts to access it as a (KIND=4) variable, the program will take a bus error exception. For information about the alignment of HP Fortran data types, see the HP Fortran Programmer’s Reference.

Bus errors can occur (as can other exceptions) in any program that generates bad address references. Although less likely to happen with programs that use the standard Fortran 90 pointer, bad address references can happen when the Cray-style pointer extension is misused or when a Fortran program unit passes a parameter by value to a C routine that attempts to use it as a pointer.

Floating-point exceptions

In accordance with the IEEE POSIX Standard, traps for floating-point exceptions are disabled by default on HP 9000 computers. Thus, if a program attempts the following operation:

x = 1.0/0.0

it will not trap it as an exception and will not abort. Instead, the value of a positive infinity (displayed as +INF) will be assigned to x.
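The default behavior can be observed with a small test program. The following is a minimal sketch with illustrative names. The divisor is held in a variable rather than written as a constant, on the assumption that some compilers diagnose a constant division by zero at compile time.

```fortran
PROGRAM div_by_zero
  IMPLICIT NONE
  REAL :: x, zero

  zero = 0.0

  ! With traps disabled (the default), this division does not abort
  ! the program; x receives positive infinity.
  x = 1.0 / zero

  PRINT *, 'x =', x
END PROGRAM div_by_zero
```

How the infinity is displayed depends on the compiler and runtime library. When traps for division by zero are enabled, the same program is expected to abort at the division instead of reaching the PRINT statement.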

HP Fortran provides two compile-line options, +FP and +fp_exception, which enable traps for floating-point exceptions. The differences between the two options are:

• The +fp_exception option enables traps for the following IEEE floating-point exceptions:

❏ Invalid operation
❏ Division by zero
❏ Overflow
❏ Underflow

The +FP option also enables the trap for the inexact operation exception. For detailed descriptions of these exceptions, see the HP-UX Floating-Point Guide.

• Unlike the +fp_exception option, the +FP option includes a flags argument by which you can enable specific exceptions.

• The +FP option can also be used to enable fast underflow on systems that support it (chiefly PA2.0 systems).

• Both options cause your program to abort when it traps the exception. However, +fp_exception identifies the type of the exception that occurred and the virtual address of the statement that triggered it. Also, +FP causes a core dump; +fp_exception does not.

You can also trap floating-point exceptions with the ON statement. Although the ON statement requires you to modify source code, it enables you to write trap procedures so that your program can recover from exceptions. For more information about using the ON statement, see Chapter 5, “Using the ON statement” (page 81).

See the HP Fortran Programmer’s Reference for detailed information about the +FP and +fp_exception options. Also, the HP-UX Floating-Point Guide has a useful discussion of both options and includes detailed information on floating-point exceptions and how to handle them.

Illegal instruction exception

An illegal instruction exception occurs when a program attempts to execute a bit pattern that is not an op-code. A common cause of this exception is an overwritten stack. If a program overwrites the part of the stack that holds the return address, the new (and bad) address may cause execution control to jump to a memory location that contains data or some other nonexecutable bit pattern. The attempt to execute this location will result in an illegal instruction exception.

This exception can also occur if your program is linked to a bad library, especially if the library contains code that was written in assembler or if it was corrupted during a file transfer.

This exception may indicate a compiler error. If you cannot find the cause of this exception in your code, contact your HP support representative.

Segmentation violation exception

Before a program starts to execute, it is allocated a memory segment, which defines the area of memory that it can use. If the program attempts to access a memory location outside its segment, the operating system will raise the SIGSEGV signal, indicating a segmentation violation or memory fault.

Any program that can generate address references outside its segment (for example, by indexing beyond the declared boundary of an array) may cause a segmentation violation. In C programs, bad pointers often result in this exception. The standard Fortran 90 pointer is more self-protective than the C pointer, but it too can be misused and lead to the state of mind memorialized in the lyric (known only to Cooper Redwine1): “I’ve got those segmentation violation, core dumped blues.” The Cray-style pointer extension is more like the C pointer and is therefore more susceptible to the abuse that results in segmentation violations.

Programs that cause a stack overflow (for example, by attempting to allocate more local variables on the stack than the kernel can handle or by infinite recursion) can also cause a segmentation violation. If your program needs a bigger stack, run the System Administration Manager (SAM) and increase the maxssiz parameter. Also, see the HP-UX System Administration Tasks manual for information about reconfiguring the kernel.

Segmentation violations are especially common when calling C functions from Fortran program units. If the number, type, or calling conventions of the arguments being passed do not match, the call is likely to result in an exception. For example, if you use the built-in function %VAL to declare an argument as passed by value, but the C function is expecting a pointer, a segmentation violation may occur. (%VAL and %REF are HP Fortran extensions; for information about using them when calling a C routine from Fortran, see “Argument-passing conventions” (page 113).)

In most cases, debugging requires locating the code that caused the segmentation violation and rewriting it. If your program aborts with this error, recompile it with the +fp_exception option. A program compiled with this option will display a procedure traceback when it aborts. The procedure traceback lists procedure names and offset addresses of the code that caused the exception.

If you suspect that an out-of-bounds array reference is causing the segmentation violation, you can use the +check=all option instead of the +fp_exception option. When compiled with the +check=all option, a program that attempts to reference an array element that is outside the declared array boundary will abort with an error message that gives the line number where the reference was detected.

The +check=all option also performs runtime checks for out-of-bounds substrings and for integer overflow; see “Calling a trap procedure” (page 84). The +check option is fully described in the HP Fortran Programmer’s Reference.


Bad argument exception

This exception occurs when a bad argument (for example, an out-of-range argument) is passed to a kernel system routine. This exception can also occur in programs that make explicit calls to the kernel threads library, /usr/lib/libpthread.sl, and pass bad arguments.

Using debugging lines

An HP Fortran program that has been written in fixed source form can contain debugging lines. These are statements that begin with the letter D or d in column 1. When the program is compiled with the +dlines option, the debugging lines are treated as statements and compiled; otherwise, they are treated as comments and ignored. A program that contains debugging lines must also be compiled for fixed source form; that is, either the filename extension must be .f or the program must be compiled with the +source=fixed option.

The +dlines option makes it possible to include WRITE statements as debugging lines in the source file and to remove them from the production version of the program without having to change source code. Instead of deleting the WRITE statements when you are ready to build the production version, you recompile without the +dlines option, or with the +nodlines option.
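A debugging line might look like the one in the following sketch, which is written in fixed source form and assumed to be saved in a file with the .f extension (the program name and its contents are illustrative). The statement with D in column 1 is compiled only when the +dlines option is used.

```fortran
C     Fixed source form: statements begin in column 7.
      PROGRAM dlines
      INTEGER total, i
      total = 0
      DO 10 i = 1, 5
         total = total + i
C        The next line has D in column 1, so it is a debugging line.
D        WRITE (*,*) 'debug: i =', i, ' total =', total
   10 CONTINUE
      WRITE (*,*) 'total =', total
      END
```

Compiled without +dlines, the program prints only the final total; compiled with +dlines, it should also print one debug line per loop iteration.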

Although debugging lines are supported by many implementations of Fortran (especially FORTRAN 77), the feature is nonstandard and therefore nonportable. Its use is further restricted by its incompatibility with free source form. If you try to compile a Fortran 90 program as free source form and the program contains debugging lines, the compilation will almost certainly fail with syntax errors.

The C preprocessor (cpp) provides a set of directives that have the same functionality as debugging lines but are much more powerful and can be used in either fixed or free source form. Although the cpp directives are not part of standard Fortran 90, they are available on most UNIX systems, such as HP-UX.
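The following sketch shows how a cpp conditional can take the place of a debugging line in free source form. The macro name DEBUG and the program name are illustrative. How preprocessing is enabled and how the macro is defined on the compile line are not shown here, because they depend on the compiler options described in the HP Fortran Programmer’s Reference.

```fortran
PROGRAM cpp_debug
  IMPLICIT NONE
  INTEGER :: total, i

  total = 0
  DO i = 1, 5
    total = total + i
! The PRINT statement below is compiled only if the file is
! preprocessed and the macro DEBUG is defined.
#ifdef DEBUG
    PRINT *, 'debug: i =', i, ' total =', total
#endif
  END DO

  PRINT *, 'total =', total
END PROGRAM cpp_debug
```

The #ifdef and #endif directives must begin in column 1. Because they are removed by the preprocessor before compilation, the same source can be used in either fixed or free source form.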

The cpp directives are described in the cpp(1) man page. See the HP Fortran Programmer’s Reference for information about the source form of HP Fortran programs and the +dlines option.


5 Using the ON statement

Whenever a runtime error occurs, the default action of your program depends on the type of the error. If the error results from a floating-point exception, the program will continue to execute. Other errors will cause it to abort.

As described in “Handling runtime exceptions” (page 77), the +fp_exception and +FP options provide control over how a program behaves when a runtime error occurs. The ON statement provides an additional level of control by enabling your program to handle floating-point and integer exceptions and Ctrl-C interrupts. Before an exception can be handled, the flow of control must pass through an ON statement that specifies:

• The type of the exception

• One of the following actions:

❏ Execute a trap procedure
❏ Ignore the interrupt
❏ Abort the program

The action specified by the ON statement can only be changed by another ON statement that specifies the same exception.

This chapter describes how to use the ON statement. The syntax of the ON statement is described in the HP Fortran Programmer’s Reference. For detailed information about trapping math errors, see the HP-UX Floating-Point Guide.

NOTE: If you include the ON statement in a program that you optimize at level 2 or higher and the program takes an exception, the results may vary from those you would get from an unoptimized program or from a program that didn’t have the ON statement.

Exceptions handled by the ON statement

Like the +fp_exception option, the ON statement enables traps for floating-point exceptions (by default, traps for floating-point exceptions are disabled on HP 9000 computers). When traps are enabled, an executing program that takes any of the following exceptions will abort, unless an ON statement specifies a different action:

• Division by zero

• Overflow

• Underflow

• Invalid (or illegal) operation

These exceptions are defined by the IEEE standard for floating-point operations. The ON statement enables traps for these exceptions, regardless of whether the exception is taken by user code or by a call to a library routine. In addition, the ON statement enables traps for integer division by zero, integer overflow, and Ctrl-C interrupts. A Ctrl-C interrupt occurs when the user presses Ctrl-C during program execution.

Table 26 lists the exceptions handled by the ON statement and gives the keywords that must be specified in the ON statement to indicate the exception being handled. The first column indicates the type of exception. The second column gives the keywords that must appear in the ON statement, immediately following the word ON. The third column gives alternate keywords you can specify instead of those in the second column.

For example, the following ON statement will trap attempts to divide by zero with 8-byte floating-point operands:

ON REAL(8) DIV 0 CALL div_zero_trap

The next example ON statement does the same as the first but uses the alternate keywords from the third column of the table:

ON DOUBLE PRECISION DIV 0 CALL div_zero_trap

Table 26 Exceptions handled by the ON statement

Exceptions                     Exception keywords      Alternate keywords
Division by zero               REAL(4) DIV 0           REAL DIV 0
                               REAL(8) DIV 0           DOUBLE PRECISION DIV 0
                               REAL(16) DIV 0          (none)
                               INTEGER(2) DIV 0        INTEGER*2 DIV 0
                               INTEGER(4) DIV 0        INTEGER DIV 0
Overflow                       REAL(4) OVERFLOW        REAL OVERFLOW
                               REAL(8) OVERFLOW        DOUBLE PRECISION OVERFLOW
                               REAL(16) OVERFLOW       (none)
                               INTEGER(2) OVERFLOW     INTEGER*2 OVERFLOW
                               INTEGER(4) OVERFLOW     INTEGER OVERFLOW
Underflow                      REAL(4) UNDERFLOW       REAL UNDERFLOW
                               REAL(8) UNDERFLOW       DOUBLE PRECISION UNDERFLOW
                               REAL(16) UNDERFLOW      (none)
Invalid (illegal) operation    REAL(4) ILLEGAL         REAL ILLEGAL
                               REAL(8) ILLEGAL         DOUBLE PRECISION ILLEGAL
Ctrl-C interrupt               CONTROLC                (none)

Actions specified by ON

The action taken after an exception is trapped depends on the action specified by the most recently executed ON statement for that exception. To specify an action, the ON statement must include the keyword ABORT, IGNORE, or CALL. These keywords have the following meanings:

• If ABORT is specified, a standard error message is generated and the program is aborted.

• If IGNORE is specified, processing continues with the next instruction. If the exception is an integer division by zero, the result is set to zero. For other conditions, the previous content of the target register is supplied as the result. IGNORE is particularly useful for preventing Ctrl-C interrupts at inconvenient times during program execution.

• If CALL is specified, the normal (ABORT) error message is suppressed, and control is transferred to the specified trap procedure.

Zero or one parameter is passed to the trap procedure. If an argument is specified, it is the result of the operation that took the exception. The procedure can analyze this value to get more precise information, and it can assign another value to the parameter to recover from the error. The type of the argument must be the same as that specified in the keywords. The specified trap procedure is generally an external procedure. However, it is also possible to specify a dummy procedure argument.

The following sections describe how to use the ON statement to specify different actions to take in the event of an exception.


Terminating program execution

Use the ABORT form of the ON statement to terminate the program when an exception occurs. In the following example, a real number is divided by zero. The ABORT clause causes the program to abort immediately after the exception is detected and to issue a procedure traceback:

Example 14 abort.f90

PROGRAM main
  REAL :: x, y, z
  ! The next statement enables traps for floating-point exceptions
  ! and specifies the action to take for divide by zero.
  !
  ON REAL DIV 0 ABORT
  x = 10.0
  y = 0.0
  z = x / y
  PRINT *, y
END PROGRAM main

Here is the command line and the output from a sample run:

$ f90 abort.f90
$ a.out
PROGRAM ABORTED : IEEE divide by zero
PROCEDURE TRACEBACK:
( 0) 0x0000248c _start + 0x6c [./a.out]

The program would have the same result if you were to comment out the ON statement and compile with the +fp_exception option.

Ignoring errors

You can use the ON statement to ignore an exception by specifying the IGNORE keyword. The following paragraphs discuss an example program, ignore.f90, that uses the ON statement to ignore an invalid operation.

Example 15 ignore.f90

PROGRAM main
  REAL :: x, y, z
  ! The following ON statement enables traps for floating-point
  ! exceptions and causes the program to ignore an invalid
  ! operation exception.
  ON REAL ILLEGAL IGNORE
  ! The next two statements pass a negative argument to the LOG
  ! intrinsic, resulting in an invalid operation. This
  ! exception is ignored, as specified by the ON statement.
  x = -10.0
  y = LOG(x)
  PRINT *, y
  ! The next three statements attempt to divide by zero. The
  ! trap for this exception is enabled by the previous
  ! ON statement but no action is specified. Therefore,
  ! the program will abort execution.
  x = 9.0
  y = 0
  z = x/y
  PRINT *, z
END PROGRAM main

As defined by the IEEE standard, a floating-point operation that results in a NaN is an exception known as an invalid operation. The example program performs an invalid operation when it passes a negative argument to the LOG intrinsic, causing the intrinsic to return a NaN. The following ON statement:

ON REAL ILLEGAL IGNORE

causes the program to ignore this exception and continue execution.

The program also attempts to divide by zero. Although the ON statement enables the trap triggered by a divide-by-zero exception, the statement has no other effect. As a result, the exception will cause the program to abort. To ignore the divide-by-zero exception would require an additional ON statement:

ON REAL DIV 0 IGNORE

Here is the command line to compile the program, followed by the output from a sample run:

$ f90 ignore.f90
$ a.out
NaN
PROGRAM ABORTED : IEEE divide by zero
PROCEDURE TRACEBACK:
(0) 0x00002504 _start + 0xbc [./a.out]
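Combining the two ON statements discussed in this section gives a variant of ignore.f90 that ignores both exceptions. This is only a sketch assembled from the statements shown above. It relies on the HP Fortran ON extension, has not been run, and makes no claim about the values printed, because those depend on what IGNORE supplies as the result of each operation.

```fortran
PROGRAM main
  REAL :: x, y, z
  ! Ignore invalid operations and divisions by zero
  ! that involve REAL operands.
  ON REAL ILLEGAL IGNORE
  ON REAL DIV 0 IGNORE
  x = -10.0
  y = LOG(x)      ! invalid operation, ignored
  PRINT *, y
  x = 9.0
  y = 0.0
  z = x / y       ! divide by zero, ignored
  PRINT *, z
END PROGRAM main
```

With both statements in effect, neither exception should abort the program, so execution is expected to reach the second PRINT statement.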

Calling a trap procedure

You can write trap procedures that are callable by the ON statement to handle arithmetic errors in user code and in library routines. Trap procedures can take zero or one argument. If an argument is specified, it is the result of the operation that took the exception and must have the type specified by the exception keywords. For example, if the following ON statement occurs in a program:

ON DOUBLE PRECISION OVERFLOW CALL trap

then the procedure trap could declare one argument of type DOUBLE PRECISION. Note that the argument is optional. Also, depending on the exception, the contents of the argument may not always be meaningful.
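A minimal sketch of such a trap procedure follows, assuming the HP Fortran ON extension described in this chapter. The main program and the replacement value are illustrative assumptions. Because the contents of the argument may not be meaningful, the procedure assigns to the argument and does not read it.

```fortran
PROGRAM main
  DOUBLE PRECISION :: big
  ! Call trap when a DOUBLE PRECISION operation overflows.
  ON DOUBLE PRECISION OVERFLOW CALL trap
  big = 1.0D308
  big = big * 10.0D0   ! result is out of range: overflow
  PRINT *, big
END PROGRAM main

SUBROUTINE trap(res)
  ! res stands for the result of the operation that overflowed;
  ! its type matches the exception keywords in the ON statement.
  DOUBLE PRECISION :: res
  ! Replace the result with the largest representable value
  ! (an arbitrary choice for this sketch).
  res = HUGE(res)
END SUBROUTINE trap
```

Declaring trap with no arguments would also be valid, since the argument is optional.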

The following sections discuss two example programs that use the ON statement to call a trap procedure, one for a floating-point exception and one for an integer exception.

Trapping floating-point exceptions

The following program, call_fptrap.f90, causes an invalid operation exception and includes an ON statement to handle the exception. The ON statement calls the trap procedure trap_illegal, which assigns a different value to the result argument. The program prints the result. Here is the program listing:

Example 16 call_fptrap.f90

PROGRAM main
  REAL :: x, y
  ON REAL ILLEGAL CALL trap_illegal
  x = -10.0
  y = LOG(x) ! causes an invalid operation
  PRINT *, y
END PROGRAM main

SUBROUTINE trap_illegal(res)
  ! res is the result value of the invalid operation
  ! trapped by the ON statement
  REAL :: res
  res = 99.87 ! assign another value to the result argument
END SUBROUTINE trap_illegal

Here is the command line, followed by the output from a sample run:

$ f90 call_fptrap.f90
$ a.out
99.87

Trapping integer overflow exceptions

This section discusses an example program that illustrates how to use the ON statement to call a trap procedure for an integer overflow exception.

An integer overflow occurs when an operation on an integer variable results in the attempt to assign it an out-of-range value. HP Fortran does not trap this exception by default. However, you can use the ON statement in conjunction with the $HP$ CHECK_OVERFLOW directive to trap an integer overflow. The following program, call_itrap.f90, illustrates how to do this:

Example 17 call_itrap.f90

PROGRAM main
!$HP$ CHECK_OVERFLOW INTEGER ON
  INTEGER :: i
  ON INTEGER OVERFLOW CALL trap_oflow
  ! assign to i the biggest number it can hold
  i = 2147483647
  ! now add 1
  i = i + 1
  PRINT *, i
END PROGRAM main

SUBROUTINE trap_oflow(n)
  INTEGER :: n
  ! write error message to standard error
  WRITE (7, *) 'integer overflow occurred, assigning 0 to result'
  n = 0
END SUBROUTINE trap_oflow

If you were to comment out the ON statement but keep the directive, the program would abort with a procedure traceback and a core dump. Compiling with the +check=all option would have the same effect.

Trapping Ctrl-C interrupts

A Ctrl-C interrupt can occur during the following circumstances:

• When the user enters the interrupt code from the terminal while the program is running or awaiting input

• During the execution of a PAUSE statement

The trap procedure for a Ctrl-C interrupt must have no formal arguments. The interrupt code is the character defined by the HP-UX stty(1) command for the intr parameter. The system default for intr is Ctrl-C.

You can use the Ctrl-C form of the ON statement to handle the interrupt signal (signal 2). In the following example, when an interrupt occurs, the program reports status information on standard output, assuring the user that the program is still at work in the DO loop. The program uses the ON statement to set the action for a Ctrl-C interrupt to be the call to the trap handler status:

PROGRAM main
  COMMON i
  ON CONTROLC CALL status
  DO i = 1, 100000
    ... ! Long computation
  END DO
END

SUBROUTINE status
  COMMON i
  PRINT *, 'Currently on iteration ', i
END SUBROUTINE status

When this program is run, a Ctrl-C interrupt causes the status routine to be called, which prints the iteration count. The program then resumes executing the DO loop.

Allowing core dumps

If a program includes the ON statement and takes an exception other than the one specified by the exception keywords, the program will abort with a procedure traceback but without a core dump. If you want to allow a core dump for one or more signals for a program that includes the ON statement, you must revise the program for each such signal.

For example, you may wish to handle floating-point exceptions with the ON statement, but still allow a core dump for other signals (for example, a bus error). The following example program uses the SIGNAL routine in the libU77 library to reset the default behavior for a bus error signal. The program uses the ON statement to handle floating-point exceptions, but allows a core dump when a bus error occurs:

Example 18 allow_core.f90

PROGRAM main
  ON REAL OVERFLOW IGNORE
  CALL take_err
END PROGRAM main

SUBROUTINE take_err
  DOUBLE PRECISION :: d
  POINTER (ip, d) ! Cray-style pointer
  REAL :: x, y
  INTEGER, PARAMETER :: sigbus=10, sigdfl=0
  INTEGER :: sigrtn, SIGNAL
  ! Set the action for bus error to be the default (DUMP CORE),
  ! overriding the action of issuing a procedure traceback
  ! that is established by using the ON statement.
  ! To suppress the core dump and enable a procedure traceback,
  ! comment out the next statement
  sigrtn = SIGNAL(sigbus, 0, sigdfl)
  y = 1.0E38
  x = y * 10.0 ! causes a real overflow
  ! Bus error is caused by the next statements
  ip = MALLOC(40)
  ip = ip + 4 ! ip is now 4-byte aligned
  d = 99.0 ! bus error
END SUBROUTINE take_err

This program must be compiled with the +U77 option to link in the libU77 library. Here is the command line and the output from a sample run:

$ f90 +U77 allow_core.f90
$ a.out
Bus error(coredump)
$ ls core
core


6 Performance and optimization

This chapter describes how to use different features of HP Fortran to tune your program for optimum performance. The most important of these features is the optimizer. You invoke the optimizer when compiling your program by specifying either +On (where n represents the level of optimization to be applied to your program) or the -O option for the default level of optimization (level 2). The -O option is provided for compatibility with the POSIX standard and has the same functionality as the +O2 option. The following command line compiles prog.f90, using the default level of optimization:

$ f90 -O prog.f90

For most applications, -O provides effective optimization. However, some applications can realize significant increases in performance at higher levels of optimization or when you use other features of the optimizer to boost performance. This chapter discusses these features as well as the following topics:

• Using profilers

• Using options to control optimization

• Conservative vs. aggressive optimization

• Parallelizing HP Fortran programs

• Vectorization

• Controlling code generation for performance

For information about getting the best performance from floating-point intensive applications running on HP-UX, see the HP-UX Floating-Point Guide.

Using profilers

A profiler is a tool for sampling a program during execution so that you can determine where your program spends most of its time. After examining the data provided by a profiler, you can decide whether to redesign parts of the program to improve their performance or to re-compile the program with optimization options. For example, if your program contains a loop with an embedded call and profiling reveals that the program spends much of its time in the loop, you may decide to inline the embedded call.
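The inlining decision mentioned above can be sketched in standard Fortran 90. The module, function, and variable names below are illustrative assumptions. The program runs the same loop twice, once with an embedded call and once with the body of the called function substituted by hand, and checks that both forms produce the same values.

```fortran
MODULE kernels
  IMPLICIT NONE
CONTAINS
  REAL FUNCTION scale_add(a, b)
    REAL, INTENT(IN) :: a, b
    scale_add = 2.0 * a + b
  END FUNCTION scale_add
END MODULE kernels

PROGRAM inline_demo
  USE kernels
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000
  REAL :: v(n), w1(n), w2(n)
  INTEGER :: i

  DO i = 1, n
    v(i) = REAL(i)
  END DO

  ! Original form: one procedure call per iteration.
  DO i = 1, n
    w1(i) = scale_add(v(i), 1.0)
  END DO

  ! Manually inlined form: the body of scale_add replaces the call.
  DO i = 1, n
    w2(i) = 2.0 * v(i) + 1.0
  END DO

  IF (ALL(w1 == w2)) THEN
    PRINT *, 'inlined loop matches original'
  ELSE
    PRINT *, 'mismatch between the two loops'
  END IF
END PROGRAM inline_demo
```

Manual inlining trades readability for the removal of call overhead in a hot loop. The optimizer can perform the same transformation on your behalf, as described later in this chapter.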

The following sections describe the HP Caliper performance analysis tool, which is bundled with HP Fortran, as well as the two UNIX profilers, gprof and prof.

NOTE: As described in “Stripping debugging information” (page 76), all programs compiled by HP Fortran include symbol table information in the executable file, unless you compile with the +strip option or have removed the symbol table with the strip utility. This information must be present in the executable in order to use the profiling tools.

HP Caliper

HP Caliper works with your compiler to perform Profile Based Optimization (PBO). The compiler option +Oprofile=collect uses HP Caliper in the background to collect profile information for your entire application. This is equivalent to the +I option on HP PA-RISC systems.

Invoking HP Caliper with the compiler to optimize your program

The simplest way to use HP Caliper is to use the PBO mode to create a flow-data file that the compiler can use to further optimize your program.

Step 1. Compile your program with the +Oprofile=collect option.

/opt/ansic/bin/cc -Aa +O3 -o program +Oprofile=collect program.c

The loader sets a run-time flag on the program to automatically invoke HP Caliper. HP Caliper supports programs compiled with any compiler option. When you use the +Oprofile=collect option, however, the compiler forces +O1 optimization to fully measure the program performance.

Step 2. Run your program to generate the flow-data file.

program [program_arguments]

HP Caliper automatically runs to generate the flow-data information in a file named flow.data in the current directory.

You can repeat this step multiple times with different program_arguments to create aggregated profile information in the flow.data file to improve your program’s optimization.

If the flow.data file already exists in the current directory when you run your program, then HP Caliper merges the results into the file. If you run your program multiple times in different directories, then HP Caliper creates a separate flow.data file in each directory. You can combine the files using the fdm(1) utility program, which is bundled with the HP C, HP aC++, and Fortran 90 compilers.

When collecting PBO data, the more closely your test scenarios resemble the ways in which your program will actually be used, the better the compiler can optimize your program for that use.

When you make changes to your source files, you should delete the flow.data file before collecting more PBO data on your program.

Step 3. Use the flow.data file as input to the compiler when you specify the +Oprofile=use option on subsequent re-compiles.

/opt/ansic/bin/cc -Aa +O3 -o program +Oprofile=use program.c

The compiler uses the HP Caliper information to help optimize your program. Note that the benefit of profile-based optimization is application dependent. Some programs may not improve while others may improve significantly.

Comparing Program Performance

You can use HP Caliper’s other measurement features to explicitly see the results of using PBO. An example process for comparing performance would be:

1. Compile your program with +Oprofile=collect to generate the executable to optimize.

2. Run your program to generate the profile data file. Use as many use scenarios as possible to collect representative profile data.

3. Compile your program with +O3 to generate the baseline executable to compare against the fully optimized executable.

4. Run HP Caliper on your +O3 executable with a configuration file such as total_cpu to measure performance.

5. Recompile your program with +O3 optimization and +Oprofile=use to use the profile data collected in Step 2 as input.

6. Run HP Caliper on your fully optimized executable from Step 5 with the same configuration file from Step 4 and compare the report results with the report from Step 4.


Using Options to Control Data Collection

You can use various options to specify how to collect profile information about your program.

• Collecting information on specific load modules.

• Improving HP Caliper performance by excluding load modules you don’t want to profile.

• Collecting information on specific processes.

You can specify HP Caliper data collection options by using the CALIPER_OPTS environment variable. In your Makefile or build scripts you can set CALIPER_OPTS to the options you want to use. HP Caliper uses those options when it is started by the compiler.

Here are some examples of ways you can control how HP Caliper collects optimization information.

Specifying PBO file names and locations

HP Caliper creates two files when doing profile-based optimization:

flow.data    The file name used to store profile data.

flow.log     The file name used to log diagnostic messages.

You can use the FLOW_DATA and FLOW_LOG environment variables to specify the file names. If the file you specify with the FLOW_DATA variable already exists when you run your program, then HP Caliper merges the results of the current run into the file.

CAUTION: You cannot combine profile information from different programs into a single flow.data file. Combining profile information from multiple programs corrupts the flow.data file. The file is designed to contain data from multiple runs of the same program with different input to help improve optimization results.

gprof

The gprof profiler enables you to determine which subprograms are called the most and how much time is spent in each subprogram. To use gprof, do the following:

1. Compile the program with the +gprof option. For example:

$ f90 -o prog +gprof prog.f90

2. Run the program. This creates the file gmon.out in the current directory. For example:

$ prog
$ ls gmon.out
gmon.out

3. Run gprof, specifying the name of the program as an argument. It will display two tables to standard output: a flat profile and a call graph profile. Since these tables can be quite large, you may want to redirect the output from gprof, as follows:

$ gprof prog >gprof.out

The flat profile lists the number of times each subprogram was called and the percentage of the total execution time spent in each subprogram. The call graph profile includes such information as the index of the function in the call graph listing, the percentage of total time of the program accounted for by a routine and its descendants, and the number of seconds spent in the routine itself.

4. Once gprof is finished, you can view the output tables using an ASCII editor.

For more information about gprof, see the gprof(1) man page.

prof

The prof profiler can also be used for profiling. Unlike the gprof profiler, prof does not generate the call graph profile. To use prof, do the following:

1. Compile the program with the +prof option. For example:

$ f90 -o prog +prof prog.f90

2. Run the program. This creates a file named mon.out in the current directory. For example:

$ prog
$ ls mon.out
mon.out

3. Run prof, giving the name of the program as an argument, as follows:

$ prof prog

prof produces a listing on standard output showing the time spent in each routine.

For more information about prof, see the prof(1) man page.

Using options to control optimization

HP Fortran includes a rich set of command-line options for controlling optimization. For most applications, we recommend optimizing with -O, which enables the default level of optimization. (For information about the default level of optimization, refer to Table 27; look up +O2 in the Option column.) You can raise or lower the level of optimization with the +Oopt-level option, and you can use the +Ooptimization option to control the kinds of optimizations that are available at each level.

The following sections describe how to use the +Oopt-level and +Ooptimization options. For detailed descriptions of the optimization options, see the HP Fortran Programmer’s Reference.

Using +O to set optimization levels

HP Fortran provides four levels of optimization. Each higher level is a superset of the lower levels; level 4 is the highest level and can result in a significant increase in program performance. Level 2 is the default level of optimization.

You invoke optimization by compiling with the +Oopt-level option, where opt-level is an integer in the range 0 - 4. The following command line invokes the optimizer at the highest level:

$ f90 +O4 file.f90

You can invoke level 2 (the default level) by specifying the -O option.

Table 27 summarizes each level, giving the option that invokes that level, the advantages, disadvantages, and recommended usages. For technical information about the specific optimizations at each level, see the HP PA-RISC Compiler Optimization Technology White Paper. A PostScript version of this document is available online in /opt/langtools/newconfig/white_papers/optimize.ps.

NOTE: You can debug programs optimized up to level 2. To prepare an optimized program for debugging, use the command line:

$ f90 -g +Oopt-level prog.f90

Table 27 Optimization levels

Option: +O0 (default)
Optimizations performed: Constant folding and partial evaluation of test conditions.
Advantages: Compiles fastest; compatible with the debugger option -g.
Disadvantages: Does very little optimization.
Recommended use: During program development.

Option: +O1
Optimizations performed: Level 0 optimizations, plus branch optimization, dead code elimination, more efficient use of registers, instruction scheduling, and peephole optimization.
Advantages: Produces faster programs than level 0; compiles faster than level 2; compatible with the debugger option -g.
Disadvantages: Compiles slower than level 0.
Recommended use: During program development.

Option: +O2, -O
Optimizations performed: Default level optimizations, including level 1, plus coloring register allocation, induction variable elimination and strength reduction, common subexpression elimination, loop invariant code motion, store/copy optimization, unused definition elimination, software pipelining, and register reassociation.
Advantages: Can significantly increase performance over level 1; works with debugger option -g.
Disadvantages: Compiles slower than levels 0 and 1.
Recommended use: During program development and when building the production version; especially effective in optimizing loops that perform arithmetic operations on large float and double arrays.

Option: +O3
Optimizations performed: Level 2 optimizations, plus loop transforms, parallelization, vectorization, cloning, and inlining within a file. Some optimizations may require additional options; see “Using the optimization options” (page 91).
Advantages: Can significantly increase performance over level 2.
Disadvantages: Compiles slower than lower levels; increases object code size; not compatible with the debugger option -g.
Recommended use: When building the production version; especially effective when used on source files containing frequently executed loops and subprograms.

Option: +O4
Optimizations performed: Level 3 optimizations applied across all program files compiled with +O4.
Advantages: Provides the highest level of optimization; can significantly increase performance over level 3.
Disadvantages: Can use large amounts of system resources; may increase link-time and object code size; not compatible with the debugger option -g.
Recommended use: When building the production version; especially effective when used on source files containing frequently executed loops and subprograms.

Using the optimization options

The +Ooptimization options enable you to control the kind of optimizations that are applied to your program at each level. Table 28 and Table 29 list the options. For each option, the tables give the optimization level at which the option can be used and what the option does. When using any of these options except +Oall, you must also use the +On option to specify the optimization level given in the tables. The +Oall option automatically invokes the optimizer at the highest level.

Table 28 lists the “packaged” options. These options enable or disable a set of related optimizations, such as optimizations that do not increase code size. Table 29 lists options that enable or disable specific optimizations.

The options in both tables can be combined on the same command line, except as noted. For example, the following command line requests aggressive optimizations at level 2 that do not increase code size:

$ f90 +O2 +Oaggressive +Osize prog.f90

Nearly all of the optimization options can be used to enable or disable an optimization or a package of optimizations. For example, the following command line requests aggressive level 4 optimizations that do not result in roundoff errors:

$ f90 +O4 +Oaggressive +Ofltacc prog.f90

The f90(1) man page fully describes all of the optimization options.

Table 28 Packaged optimization options

Option: +O[no]aggressive
Level: +O2 or higher
Function: Enable [disable] optimizations that can significantly improve performance in standard-conforming programs. The default is +Onoaggressive. For more information about this option, see “Conservative vs. aggressive optimization” (page 99).

Option: +O[no]all
Level: Invokes highest level
Function: Enable [disable] maximum optimization. The default is +Onoall.

Option: +O[no]autopar
Level: +O3 or higher
Function: Do [do not] automatically parallelize loops that are deemed safe and profitable by the loop parallelizer. The default is +Onoautopar. The +Oautopar option can be used for programs at optimization levels +O3 and above.

If both +Oopenmp and +Oautopar are specified on the command line, the OpenMP directives take precedence over the +Oautopar option. The compiler auto-parallelizes only the loops that are not controlled by the OpenMP directives.

To compile a program with +Oautopar support, the libcps, libomp, and libpthreads runtime support libraries must be present at compile-time and runtime. This feature is available on Integrity systems and HP 9000 systems.

Option: +O[no]conservative
Level: +O2 or higher
Function: Suppress [do not suppress] optimizations that assume strict conformity to the Fortran 90 standard. The default is +Onoconservative. For more information about this option, see “Conservative vs. aggressive optimization” (page 99).

NOTE: This option is valid only on the PA-RISC systems.

Option: +O[no]limit
Level: +O2 or higher
Function: Enable [disable] optimizations that do not make large demands on system resources. The default is +Onolimit.

Option: +O[no]size
Level: +O2 or higher
Function: Enable [disable] optimizations that do not significantly increase code size. The default is +Onosize.

Table 29 Fine-tuning optimization options

Option: +O[no]cache_pad_common
Level: +O3 or higher
Function: Pad [do not pad] common blocks to avoid cache collisions. The default is +Onocache_pad_common.

Option: +O[no]dataprefetch
Level: +O2 or higher
Function: Insert [do not insert] instructions within innermost loops to explicitly prefetch data from memory into the data cache. The default is +Onodataprefetch.

Option: +O[no]dynopt
Level: All
Function: Enable [disable] dynamic optimization for the output file, if the run-time environment supports this feature. Both forms of this option change the default setting, which allows the run-time environment to enable or disable dynamic optimization according to a system-wide default. This option applies only to executable files and shared libraries. chatr(1) can be used to change this setting, including restoration of the default setting, after the output file has been created.

Option: +O[no]entrysched
Level: All
Function: Perform [do not perform] instruction scheduling on entry and exit code. The default is +Onoentrysched.

NOTE: This option is valid only on the PA-RISC systems.

Option: +O[no]fastaccess
Level: All
Function: Enable [disable] fast access to global data. The default is +Onofastaccess at levels 1, 2, and 3; +Ofastaccess at level 4.

Option: +Ofast
Level: All
Function: Select a combination of optimization options for optimum execution speed and reasonable build times. Currently chosen options are:

• +O2

• +Ofltacc=relaxed

• +Onolimit

• +DSnative

• +FPD

• -Wl,+pi,1M

• -Wl,+pd,1M

• -Wl,+mergeseg

NOTE: Do not use this option for programs that depend on IEEE standard floating point denormalized numbers. Otherwise, different numerical results may occur. See +Ofltacc=relaxed. This option is a synonym for -fast.

Option: +Ofaster
Level: All
Function: This option is equivalent to +Ofast with an increased optimization level. The definition of +Ofaster may change or it may be deprecated in future releases.

Option: +O[no]fltacc
Level: +O2 or higher
Function: Disable [enable] floating-point optimizations that can result in numerical differences. By default, the optimizer does not perform such optimizations. For information about the effect this option can have on your program, see the HP-UX Floating-Point Guide.

Option: +Ofrequently_called=function1[,function2...]
Level: All
Function: The named functions are assumed to be frequently called. This option overrides any information in a profile database.

Option: +Ofrequently_called:filename
Level: All
Function: The file indicated by filename contains a list of functions, separated by spaces or newlines. These functions are assumed to be frequently called. This option overrides any information in a profile database.


+O[no]info (level: All)

Display [do not display] information about the optimization process. This option is most useful at level 3 and above. The default is +Onoinfo.

+O[no]initcheck (level: +O2 or higher)

Enable [disable] initialization of any local, scalar, automatic variable that is found to be uninitialized. The default is to initialize if the variable is uninitialized with respect to every path leading to its use. For more information about this option, see “Uninitialized variables” (page 144).


+O[no]inline (level: +O3 or higher)

Enable [disable] inlining. The default is +Oinline.

+Oinline_level num (level: All)

This option controls inlining in Fortran. The format for num is N[.n], where num is either an integral value from 0 to 9 or a value with a single decimal place from 0.0 to 9.0. For more information on this option, see the f90(1) manpage.

+O[no]libcalls (level: All)

Substitute [do not substitute] millicode versions of specific intrinsics. The default is +Olibcalls.

+O[no]loop_block (level: +O3 or higher)

Loop blocking is a combination of strip mining and interchange that improves data cache locality. It is provided primarily to deal with nested loops that manipulate arrays that are too large to fit into the data cache. Under certain circumstances, loop blocking allows reuse of these arrays by transforming the loops that manipulate them so that they manipulate strips of the arrays that fit into the cache.

+O[no]loop_unroll=n (level: +O2 or higher)

Unroll [do not unroll] program loops by a factor of n. The default is +Oloop_unroll=4.

+O[no]loop_unroll_jam (level: +O3 or higher)

Loop unroll-and-jam involves partially unrolling one or more loops higher in the nest than the innermost loop, and fusing ("jamming") the resulting loops back together. This transformation is primarily intended to increase register reuse and decrease memory loads and stores per operation within an iteration of a nested loop.

+moduleoptimize (level: All)

The compiler reads only required information from a module file. Optimized module files are created by discarding redundant information while importing the module file. In case of nested modules or hierarchical modules, the compilation time and memory requirement of the compiler can be reduced by using this option.

+O[no]moveflops (level: +O2 or higher)

Enable [disable] moving conditional floating-point instructions out of loops. The default is +Omoveflops.

+O[no]parallel (level: +O3 or higher)

Transform [do not transform] eligible loops for parallel execution. The default is +Onoparallel.

This option is available only on HP 9000 systems. On Integrity systems, you can use the +Oautopar option instead to parallelize loops.

+O[no]parmsoverlap (level: +O2 or higher)

Suppress optimizations that assume [do not assume] that arguments may refer to the same memory locations. The default is +Onoparmsoverlap.

+O[no]pipeline (level: +O2 or higher)

Enable [disable] software pipelining. The default is +Opipeline.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]procelim (level: All)

Remove [do not remove] unreferenced procedures from the executable. The default is +Onoprocelim at levels 0 - 3, +Oprocelim at level 4.

+O[no]promote_indirect_calls (level: +O3 or higher)

Use profile data from profile-based optimization and other information to determine the most likely target of indirect calls and promote them to direct calls. Indirect calls occur with pointers to functions and virtual calls. In all cases the optimized code tests to make sure the direct call is being taken and, if not, executes the indirect call. If +Oinline is in effect, the optimizer may also inline the promoted calls.

+Opromote_indirect_calls is only effective with profile-based optimization.

NOTE: The optimizer tries to determine the most likely target of indirect calls. If the profile data is incomplete or ambiguous, the optimizer may not select the best target. If this happens, your code’s performance may decrease.

This option can be used at optimization levels 3 and 4. At +O3, it is only effective if indirect calls from functions within a file are mostly to target functions within the same file. This is because +O3 optimizes only within a file, whereas +O4 optimizes across files.

The default is +Opromote_indirect_calls at optimization level 3 and above. +Onopromote_indirect_calls is the default at optimization level 2 and below.

+Orarely_called=function1[,function2...] (level: All)

The named functions are assumed to be rarely called. This option overrides any information in a profile database.

+Orarely_called:filename (level: All)

The file indicated by filename contains a list of functions, separated by spaces or newlines. These functions are assumed to be rarely called. This option overrides any information in a profile database.

+O[no]recovery (level: All)

This option generates [does not generate] recovery code for control speculation. For code that writes to uncacheable memory that may not be properly identified as volatile, the +Orecovery option reduces the risk of incorrect behavior. The default is +Orecovery.

NOTE: A program that uses signal handlers to catch signals raised by memory accesses may not behave correctly under +Onorecovery.


+O[no]regreassoc (level: +O2 or higher)

Enable [disable] register reassociation. The default is +Oregreassoc.

NOTE: This option is valid only on the PA-RISC systems.

+Ono_return (level: All)

This option provides a hint to the optimizer that the named functions never return to the call site. If the +Ono_return:filename form is used, the file filename contains a list of function names separated by spaces or newlines. This optimization is available at all optimization levels.

+Oshortdata[=size] (level: All)

All objects of size bytes or smaller are placed in the short data area, and references to such data assume it resides in the short data area. Valid values of size are decimal numbers between 8 and 4,194,304 (4 MB). If no size is specified, all data is placed in the short data area. The default is +Oshortdata=8.

+[no]srcpos (level: All)

This option controls the generation of source position information for HP Caliper. The default is +srcpos.

When +srcpos is in effect, the compiler generates source position information. When +nosrcpos is in effect, the compiler does not generate this information, and the compiler instructs the linker to discard any of this information found in the object files.

+O[no]store_ordering (level: All)

This option preserves [does not preserve] the original program order for stores to memory that is visible to multiple threads. This does not imply strong ordering. The default is +Onostore_ordering.

+O[no]vectorize (level: +O3 or higher)

Replace [do not replace] eligible loops with calls to the math library; for more information, see “Using the +Ovectorize option” on page 168. The default is +Onovectorize.

NOTE: This option is valid only on the PA-RISC systems.

+O[no]whole_program_mode (level: +O4 or higher)

Enable the assertion that only the files that are compiled with this option directly reference any global variables and procedures that are defined in these files. In other words, this option asserts that there are no unseen accesses to the globals. When this assertion is in effect, the optimizer can hold global variables in registers longer and delete inlined or cloned global procedures.

This option is in effect only at the +O4 level of optimization. All files compiled with +Owhole_program_mode must also be compiled with +O4. If any of the files are compiled with +O4 but are not compiled with +Owhole_program_mode, the linker disables the assertion for all files in the program.

Use this option to increase performance, but only when you are certain that only the files compiled with +Owhole_program_mode directly access any globals that are defined in these files.

The default is +Onowhole_program_mode, which disables the assertion.
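The description of +O[no]loop_block in Table 29 says that loop blocking combines strip mining and interchange. The sketch below applies that transformation by hand to a simple transpose loop, purely to show its shape. The program name, the array extent n, and the strip size bs are hypothetical choices, and the code the optimizer actually generates may look quite different.

```fortran
program loop_block_sketch
  implicit none
  integer, parameter :: n  = 512   ! array extent (hypothetical)
  integer, parameter :: bs = 64    ! strip size (hypothetical)
  real :: a(n, n), b(n, n), c(n, n)
  integer :: i, j, ii, jj

  call random_number(a)

  ! Original nest: B is written with stride 1, but A is read with
  ! stride n, so each column of B pulls in elements of A that are
  ! far apart in memory.
  do j = 1, n
    do i = 1, n
      b(i, j) = a(j, i)
    end do
  end do

  ! Blocked nest: both loops are strip-mined, and the two strip loops
  ! are interchanged to the outside. The inner loops then work on one
  ! bs-by-bs tile of A and C at a time.
  do jj = 1, n, bs
    do ii = 1, n, bs
      do j = jj, min(jj + bs - 1, n)
        do i = ii, min(ii + bs - 1, n)
          c(i, j) = a(j, i)
        end do
      end do
    end do
  end do

  ! Both nests copy the same elements, so the results must agree.
  if (all(b == c)) then
    print *, 'blocked and original results match'
  else
    print *, 'results differ'
  end if
end program loop_block_sketch
```

The strip size only matters for performance: any positive bs gives the same result, and a value for which a tile of each array fits in the data cache is the case the table entry describes.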

Conservative vs. aggressive optimization

At optimization level 2 or higher, the optimizer makes a number of assumptions about the program it is optimizing; for example, that re-ordering an expression for improved instruction scheduling will not change its results. In general, these assumptions relate to how closely the target program conforms to the Fortran 90 Standard. For programs that conform to the Standard, it is safe for the optimizer to apply certain optimizations that can significantly improve performance. For nonstandard-conforming programs, these same optimizations could change the results or behavior of the program in ways that may not be acceptable to the programmer.
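To see why re-ordering a floating-point expression can matter, consider the following minimal sketch. It evaluates two algebraically identical groupings of the same three terms. The program and variable names are hypothetical, and the explicit parentheses stand in for the kind of regrouping an optimizer might apply; the sketch does not show what any particular option does.

```fortran
program reorder_sketch
  implicit none
  real :: big, small, as_written, regrouped

  big   = 1.0e8
  small = 1.0

  ! Grouping 1: SMALL is added to BIG first. In IEEE single precision
  ! the spacing between representable values near 1.0e8 is larger
  ! than 1.0, so the sum is expected to round back to BIG.
  as_written = (big + small) - big

  ! Grouping 2: BIG cancels exactly first, so SMALL is preserved.
  regrouped = (big - big) + small

  print *, 'as written: ', as_written
  print *, 'regrouped : ', regrouped
end program reorder_sketch
```

On a system that evaluates these expressions in IEEE single precision, the two printed values are expected to differ. A program whose results depend on one particular grouping is sensitive to this kind of re-ordering.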

The +Oconservative and +Oaggressive options enable you to set the optimizer’s assumptions about which optimizations it can and cannot apply to a program. Each option invokes a subset of the fine-tuning options that balances safety and performance according to the coding style of the target program. You can use either option at optimization level 2 or higher.

NOTE: +Oaggressive and +Oconservative are incompatible and must not appear on the same command line.

Table 30 lists the assumptions that the optimizer makes about your program when you compile with +Oconservative, +Oaggressive, or neither option (the default). The table also lists the fine-tuning options that are invoked by +Oconservative and +Oaggressive. The options listed for the default case are the default settings of a subset of the options that +Oconservative and +Oaggressive change. For information about the fine-tuning options listed in the third column, see Table 29.

Table 30 Conservative, aggressive, and default optimizations

Specified options: +Onoconservative +Onoaggressive (the default)
Assumptions:
• Standard-conforming
Invoked options:
• +Onoentrysched
• +Omoveflops
• +Onoparmsoverlap
• +Onovectorize

Specified options: +Oconservative
Assumptions:
• Nonstandard
• Sensitive to rounding differences
• Contains floating-point expressions that must be evaluated in the specified order
• Procedure arguments may overlap
Invoked options:
• +Ofltacc
• +Onomoveflops
• +Oparmsoverlap

Specified options: +Oaggressive
Assumptions:
• Standard-conforming
• Contains floating-point expressions that permit re-ordering for optimization
• Does not contain uninitialized variables
Invoked options:
• +Oentrysched
• +Onofltacc
• +Onoinitcheck
• +Ovectorize

NOTE: The +Oaggressive and +Oconservative options are valid only on the PA-RISC systems.
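The assumption "Procedure arguments may overlap" in Table 30 refers to calls like the one in the following sketch, where two dummy arguments are associated with overlapping parts of the same array and one of them is modified. The program and the subroutine add_shifted are hypothetical and deliberately nonstandard; they are meant only to illustrate the coding style in question.

```fortran
program overlap_sketch
  implicit none
  real :: a(5)

  a = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)

  ! Nonstandard call: X is associated with A(1:4) and Y with A(2:5),
  ! so the two dummy arguments overlap and Y is modified inside the
  ! procedure.
  call add_shifted(a, a(2), 4)

  print *, a

contains

  subroutine add_shifted(x, y, n)
    integer, intent(in) :: n
    real :: x(n), y(n)
    integer :: i

    ! Because of the overlap, each store to Y(I) also changes X(I+1),
    ! which the next iteration reads.
    do i = 1, n
      y(i) = y(i) + x(i)
    end do
  end subroutine add_shifted

end program overlap_sketch
```

If the loop runs strictly in the order written, each iteration sees the value stored by the previous one and the array ends up holding running sums. An optimizer that assumes X and Y do not overlap is free to load the elements of X ahead of the stores to Y, which would give a different result. Code written in this style is the case that the +Oparmsoverlap setting listed under +Oconservative is meant to accommodate.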

Parallelizing HP Fortran programs

The following sections discuss how to use the +Oparallel option and the parallel directives when preparing and compiling HP Fortran programs for parallel execution. Later sections also discuss reasons why the compiler may not have performed parallelization. The last section describes runtime warning and error messages unique to parallel-executing programs.

For a description of the +Oparallel option, see “Fine-tuning optimization options” (page 40).

NOTE: The +Oparallel option is not available on Integrity systems for HP Fortran Version 3.2 and later. You must use the +Oautopar option instead to parallelize loops.

Compiling for parallel execution

The following command lines compile (without linking) three source files: x.f90, y.f90, and z.f90. The files x.f90 and y.f90 are compiled for parallel execution. The file z.f90 is compiled for serial execution, even though its object file will be linked with x.o and y.o.

f90 +O3 +Oparallel -c x.f90 y.f90

f90 +O3 -c z.f90

The following command line links the three object files, producing the executable file para_prog:

f90 +O3 +Oparallel -o para_prog x.o y.o z.o

