Release Notes: PathScale, LLC.
PathScale Compiler Suite Release 3.3-Beta

NOTE: The most current version of these notes is on the PathScale Website
=========================================================================

Copyright (C) 2007, 2008, 2009 PathScale, LLC. All Rights Reserved.

Thank you for purchasing the PathScale Compiler Suite.

This file describes new features, bugs fixed, and known issues with the
PathScale Compiler Suite. Where possible, we provide workarounds for known
problems.

Support & contact information
-----------------------------

To report bugs, or for more information, send an email message to
support@pathscale.com. Please report problems the first time you encounter
them, even if they are listed here as known issues. Knowing who has
encountered a bug helps us prioritize the bugs we know about.

New Features in 3.3
-------------------

GCC Compatibility and Features
* Full support of the different GNU inline asm constraint strings
* More GNU x86 intrinsic functions supported

Fortran 2003
* The "intent" attribute on pointer dummy arguments is supported
* The "protected" attribute is supported
* The "move_alloc" intrinsic function for allocatable entities is supported

Debugging Support
* Under -g compilation, the line number information generated for use by
  debuggers is more accurate
* Under -g compilation, type information for various program constructs
  is more complete
* The space taken up by debugging sections in object files has been
  reduced without affecting functionality

Performance
* Performance improvements in both floating point and integer code.
  - Improved loop unrolling
  - Improved loop fusion
  - Improved prefetch efficiency
  - Improved instruction scheduling
  - SPEC configuration files reflecting these improvements are included
    in the release, and available on the website
  - Better tuning for multicore on both AMD and Intel chip sets
    (-march=barcelona, -march=istanbul, -march=core, -march=wolfdale,
    -march=harpertown, -march=nehalem)
  - Support for huge pages (operating system-dependent)

* This release has been tested on the following Linux distributions:
    Fedora Core 3, 4, 5, 6
    Fedora 7, 8
    RedHat Enterprise Linux 4 and 5
    SUSE Linux Enterprise Server 9
    SUSE Linux Enterprise Server 10
    SUSE Linux Professional 9.3
    SUSE Linux Professional 10
    SUSE Linux Professional 10.1
    openSUSE 10.3
    Ubuntu 7 & 8

NOTES: Portions of the PathScale Compiler Suite are 32-bit C++ applications
built with the GNU 3.x tool chain. They require the 32-bit GNU 3.x C++
runtime libraries to be installed. This includes the GNU 4.x based
PathScale Compilers.

SUSE Linux Enterprise 10 refers to both the Desktop and Server editions.

Known issues and workarounds
============================

OpenMP support for C++
----------------------

OpenMP support for C++ is available only under -gnu4, which is the default
on GNU 4.x-based systems.

Thread-local storage
--------------------

The compiler supports thread-local storage declared by the GCC __thread
storage class. Thread-local support is not yet available for
position-independent code. When compiling code that uses thread-local
storage, -fPIC must not be used. Thread-local storage is also not supported
under the GCC 3.x C/C++ front-ends (-gnu3).

pragma support
--------------

"#pragma options" and "#pragma frequency_hint" are supported only in
GCC 3.x compatibility mode (-gnu3). They are not yet supported when
compiling for GCC 4.x (-gnu4).

Installation
------------

The README file and install script on the CD image are out of date
(3.2 image).

Workaround: Download a current install script from the website.
- - - - - - - - - - - - - - - - - - - - - - - - - - -

The PathScale Compiler Suite is a 32-bit program, and requires a 32-bit
execution environment to run. We have found that some users do not have the
32-bit environment installed on their system, so the compiler cannot run.
This tends to be a problem only on systems where the non-root tar
installation is used or where "--force" is used to install the RPMs.

RPM Installation on SUSE 9.3 and SLES 9
- - - - - - - - - - - - - - - - - - - - - - - - - - -

On the SUSE Linux distributions, RPM may report that no package provides
ld-linux-x86-64.so.2 when that is not the case. (This library is provided by
the glibc RPM.) This is a bug in the version of RPM distributed with the
specified SUSE environments.

Some of the bugs fixed since the last 3.2 release:

BUG    Description
===================
4716   Improper use of Fortran IOSTAT in data transfer list causes compiler error
5921   Fortran list-directed write produces commas/repeat factors unlike other compilers
7969   Fortran -C array bounds checking - the compiler throws an error at compile time instead of run time
8502   pathf95 - compiler fails when bounds checking enabled
9645   pathf95 - zero length string clobbers preceding constant
9976   pathf95 -O0 -c Assert during IR->WHIRL Conversion: Expecting a BLOCK
10378  Fortran runtime checking under -C does not check for disassociated pointers
10455  Fortran intrinsic "free" does not handle character pointer
11614  pathf95 - compiler seg faults when bounds checking enabled
11829  pathf90 external max tries to call max with the intrinsic's calling sequence
12711  pathf90 memory leak in the handling of allocatable components
13662  Fortran do loop index variable should default to OMP private
14348  pathf90 compile error if a continuing char string starts with '!'
14402  asm constraint 'g' not handled correctly
14456  Assertion failure at line 3954 of ../../be/cg/lra.cxx
14480  pathf90 compiler hang in Front End Parse/Semantic phase due to empty interface block
14507  Global register variable with ASM allocation not handled correctly
14511  Compiler is leaving ekopath_crash files in /tmp
14516  Fortran 2003: Loosen argument association rules for character type
14548  Fortran fails to print runtime error for substring
14588  Seg fault in wgen
14690  pathcc not generating code for function with same name as extern inline function
14720  __attribute__((section(".data"))) resets .org and produces error
14723  pathcc does not recognize -Wpointer-sign and -Wno-pointer-sign
14739  OpenMP loop to initialize arrays fails when more than 1 thread
14773  Compiler incorrectly reporting "Target processor does not support 3DNow"
14792  pathCC aborts when the program calls an intrinsic without using its result
14797  Local register allocation unsuccessful due to large ASM clobber set
14798  no register is available for a pre-allocated tn
14799  Error: `%al' not allowed with `movl'
14806  pathCC 3.2 assertion failure at line 521 of tnutil when program uses gnu vector types
14820  Not recognizing 64-bit pentium4 processors
14837  gdb fails when -feliminate-unused-debug-types is used
14869  Compiler should not use SSE registers under -m64 -mno-sse -mno-sse2
14876  Fortran DATE_AND_TIME gives wrong offset from UTC for daylight time
14891  erroneously disallows "allocatable" with "len=*"
14909  Empty space in .ctors section causes runtime segmentation fault (issue 50)
14957  Interaction between input-output asm operand and output-only operand causes wrong code
14967  GNU visibility attribute not supported
14969  File-static symbol accessed using GOTPCREL under -fpic
15034  Fortran: bad line number in debug info due to '# n "filename" 2' from cpp
15048  OpenMP: illegal copy in symbol which is not threadprivate
15049  OpenMP: include omp_lib.h fails for free-form fortran90
15050  small program produces wrong answer at -O3 -r8 due to LNO's SIMD bug
15073  missing line number for the then part of an if statement
15104  Fortran: z_div et al. conflict with CBLAS, SuperLU
15154  -nog77mangle is not working with pathf95
15174  Exp_Intrinsic_Op: unsupported intrinsic ((null))
15176  function returning 32-bit not zeroing out high-order 32 bits
15205  Assertion failure in back end compiling NAMD 2.7b1 - null intrinsic

Known issues in 3.3:

[Bug 595] Some AMD64 glibc distributions include broken obstack code
[Bug 949] C, C++: complex integer data types not supported
[Bug 1316] F77: %loc extension not supported
[Bug 1320] Fortran: some kinds of variable alignment are not supported
[Bug 2312] IPA: linking may fail if filenames contain shell metacharacters
[Bug 2395] The implementation of __builtin_return_address is not complete.
[Bug 2446] IPA linker does not handle .a files containing IPA .o files correctly
[Bug 2509] Control of floating-point trapping behavior
[Bug 2809] C: compiler handles unspecified array sizes incorrectly
[Bug 2896] Some fast math routines are only available in the static math library
[Bug 3289] pathf90: writing to a constant passed as an argument causes program to abort
[Bug 3830] C: Incomplete debug information in nested lexical scopes
[Bug 4374] IPA: "make: warning: Clock skew detected."
[Bug 4433] pathf90: Fortran programs without a "main" will appear to successfully link
[Bug 5090] Fortran OpenMP: statically nested parallel constructs not supported.
[Bug 5195] C OpenMP FIRSTPRIVATE and LASTPRIVATE on same variable for PARALLEL DO
[Bug 5882] OpenMP: Serial version of a parallel region is not localizing "private" variables.
[Bug 5952] pathf90: On 32-bit the logic to automatically set stack size may not work
[Bug 6236] pathf90: The Fortran compiler currently does not support IEEE intrinsics
[Bug 6259] pathf90: Restricted expressions use Fortran 90 rules.
[Bug 7728] Inline assembly '=A' constraint not supported
[Bug 10192] pathf90 cpp preprocessor does not handle ## concatenation
[Bug 10388] pathf95: Use of cpp with directives before or between continuations


[Bug 595] Some AMD64 glibc distributions include broken obstack code

Some glibc distributions for AMD64 include broken obstack code, which
incorrectly mixes 32-bit and 64-bit references to stack data. This can cause
code that uses obstacks (such as gcc) to crash under some circumstances, and
may occur when using either pathcc or other compilers to build code that
uses obstacks.

There are two possible workarounds:

For packages such as gcc, use their internal obstack implementations if
available. For example, you can build gcc with -D_LIBC to do this.

You can also fix the obstack alignment mask manually before you use any
obstacks:

> obstack_alignment_mask (obstack_ptr) = 3;

[Bug 949] C, C++: complex integer data types not supported

Although the PathScale Compiler Suite fully supports floating point complex
numbers, it does not support complex integer data types, such as
"_Complex int". Complex integers are a gcc "we did it because we could"
extension to ISO C99, and we have never seen any uses of them outside of the
gcc test suite.

[Bug 1066] C, C++: __builtin_strpbrk not implemented correctly

The gcc builtin function __builtin_strpbrk (gcc's implementation of the
standard strpbrk library function) is not implemented correctly, and causes
code to crash at runtime.

[Bug 1316] F77: %loc extension not supported

The F77 %loc directive, which returns the address of a variable, is
implemented by some Fortran compilers such as g77, but not yet by the
PathScale compiler. We recommend using loc() instead.

[Bug 1320] Fortran: some kinds of variable alignment are not supported

Given a Fortran program fragment like this:

> character c(11)
> real r
> equivalence (r,c(2))

The intention of the fragment is that the variable c should start on a
7-byte alignment, i.e.
that c(2) should be aligned with r. This is not standard Fortran, but is an
extension supported by some Fortran compilers, e.g. g77. The PathScale
Fortran compiler does not currently support this kind of alignment
requirement, and will issue a compilation error.

[Bug 2312] IPA: linking may fail if filenames contain shell metacharacters

The IPA linker may produce strange error messages and fail to link if a file
name passed to it contains a shell or "make" metacharacter. The list of
characters to avoid is as follows:

;:$()[]{}<>%

[Bug 2395] The implementation of __builtin_return_address is not complete.

The current implementation of __builtin_return_address appears to only work
correctly for an argument of 0.

[Bug 2446] IPA linker does not handle .a files containing IPA .o files correctly

The IPA linker attempts to correctly handle archive files containing .o
files compiled with -ipa, but this support is not complete. While a regular
linker will only use .o files that export needed symbols, the IPA linker
uses all .o files inside the archive, which can often lead to undefined
symbol errors.

To work around this, extract the .o files you need from the .a file, and
link explicitly against those files.

[Bug 2509] Control of floating-point trapping behavior

The PathScale compilers support the following options for controlling
floating-point traps:

> -TENV:X=(0..4)
>     Specify the level of enabled exceptions that will be
>     assumed for purposes of performing speculative code
>     motion (default is level 1 at all optimization levels).
>     In general, an instruction will not be speculated (i.e.
>     moved above a branch by the optimizer) unless any
>     exceptions it might cause are disabled by this option.
>
>     Level 0 - No speculative code motion may be performed.
>
>     Level 1 - Safe speculative code motion may be performed,
>               with IEEE-754 underflow and inexact exceptions
>               disabled.
>
>     Level 2 - All IEEE-754 exceptions are disabled except
>               divide by zero.
>
>     Level 3 - All IEEE-754 exceptions are disabled including
>               divide by zero.
>
>     Level 4 - Memory exceptions may be disabled or ignored.
>
> -TENV:simd_imask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     invalid-operation exception.
>
> -TENV:simd_dmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     denormalized-operand exception.
>
> -TENV:simd_zmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     zero-divide exception.
>
> -TENV:simd_omask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     overflow exception.
>
> -TENV:simd_umask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     underflow exception.
>
> -TENV:simd_pmask=(ON|OFF)
>     Default is ON. Turning it OFF unmasks SIMD floating-point
>     precision exception.

[Bug 2809] C: compiler handles unspecified array sizes incorrectly

If the compiler encounters C code that uses ISO C99 initializers to
initialize an array of flexible size, it will generate assembly code that
causes the assembler to issue an "attempt to move .org backwards" error.

An example of this syntax usage is as follows:

> struct K {
>     int i;
>     int f[];
> };
> struct K a = {
>     9,
>     { 1, 4, 5 }
> };
> struct K b = {
>     9,
>     { 1, 4, 5, 12, 14,9,0,9 }
> };

A workaround is to specify a size in the array declaration.

[Bug 2896] Some fast math routines are only available in the static math library

Some high-performance standard math library routines (single- and
double-precision versions of pow, fmin, fmax, finite, and copysign) are only
available in the static version of the math library (libmpath) that we ship.
In order to benefit from these faster routines, you should link explicitly
against the static version of the math library. You can find out the
location of this library using the following command:

> pathcc -print-file-name=libmpath.a

The static version of the math library is faster in general than the shared version.
You should always use it if you want the highest floating point performance.

[Bug 3289] pathf90: writing to a constant passed as an argument causes program to abort

Fortran programs that pass constants to subprograms that then try to write
to that argument will abort. The Fortran compiler places constants in
read-only memory.

Workaround: Use the option "-LANG:rw_const=on".

Note: The use of this workaround may result in a degradation in performance.

[Bug 3697] GNU `used' attribute not supported

The GNU 'used' attribute is not currently supported, so functions that the
compiler perceives as dead code are eliminated even when they carry this
attribute.

Workaround: Use the '-INLINE:dfe=0' option when compiling the code.

[Bug 3830] C: Incomplete debug information in nested lexical scopes

The compiler omits the debugging output that a debugger needs to
differentiate between variables with identical names in nested lexical
scopes. For example:

> void foo() {
>     int i = 0 ;
>     { int i = 1 ;
>         { int i = 2 ;
>         }
>     }
> }

A debugger will not be able to tell the difference between the three
variables called 'i'.

[Bug 4374] IPA: "make: warning: Clock skew detected."

You may receive a "make: warning: Clock skew detected. Your build may be
incomplete." message when compiling with the '-ipa' option using a remote
file system such as NFS for the build directory. The '-ipa' option currently
uses the 'make' command, which can be sensitive to differences in system
clock between the system where the compilation is taking place and the file
server. Clock skew will not affect the success of your build, but you can
avoid the warning by ensuring that the system times of the file server and
the build server are synchronized.

[Bug 4433] pathf90: Fortran programs without a "main" will appear to successfully link.

Fortran programs that do not provide a "main" entry point will appear to
successfully link but will fail at runtime with the following message:

> $ ./a.out
> Someone linked a Fortran program with no MAIN__!

Workaround: Provide a main program.

[Bug 5090] Fortran OpenMP: statically nested parallel constructs not supported.

The compiler does not support statically nested parallel constructs. Such
constructs may cause compilation or runtime failure. For example, the
following is not supported:

> !$OMP PARALLEL
> !$OMP PARALLEL
> !$OMP END PARALLEL
> !$OMP END PARALLEL

[Bug 5195] C OpenMP FIRSTPRIVATE and LASTPRIVATE on same variable for PARALLEL DO

The current implementation of OpenMP in the C compiler does not support
FIRSTPRIVATE and LASTPRIVATE on the same variable for PARALLEL DO. The
compiler will issue the following error if this is encountered:

> Error: FIRSTPRIVATE and LASTPRIVATE on same variable not
> yet implemented for PARALLEL DO

[Bug 5571] pathCC: Inlining in the C++ front end.

The g++ 3.3 front end implements extremely aggressive inlining. We have
found that the increase in code size caused by this inlining may cause the
compiler's back end to have excessively long run times. As a result, the
default is to turn off inlining in the C++ front end and let the compiler's
back end do the inlining. In most cases the back end inlining does as well
as, if not better than, the C++ front end. In some rare cases, allowing the
C++ front end to do the inlining does result in a faster executable. (This
is typically true in cases where g++ produces faster runtimes than pathCC.)

[Bug 5882] OpenMP: Serial version of a parallel region is not localizing "private" variables.

The current OpenMP implementation may not create copies of private variables
in parallel regions in the single threaded case as required by the OpenMP
standard.

Workaround: Define the following environment variables:

> export PSC_OMP_SILENT=1
> export PSC_OMP_SERIAL_OUTLINE=1

[Bug 5952] pathf90: On 32-bit the logic to automatically set stack size may not work

For 32-bit applications the logic described in section 3.10 of the User
Guide may not work as documented. If the application aborts during
execution, try setting the stack size to a large value or unlimited and see
if that resolves the issue.

Workaround: Set the stack size to unlimited before running the application:

> ulimit -s unlimited

[Bug 6236] pathf90: The Fortran compiler currently does not support IEEE intrinsics

The current release of the Fortran compiler does not implement the IEEE
floating point intrinsics:

> clear_ieee_exception
> disable_ieee_exception
> enable_ieee_exception
> get_ieee_exception
> get_ieee_interrupts
> get_ieee_rounding_mode
> get_ieee_status
> ieee_class
> ieee_next_after
> ieee_unordered
> set_ieee_exception
> set_ieee_exceptions
> set_ieee_interrupts
> set_ieee_rounding_mode
> set_ieee_status
> test_ieee_exception
> test_ieee_interrupt

[Bug 6259] pathf90: Restricted expressions use Fortran 90 rules.

The current Fortran compiler implements the rules for restricted expressions
in accordance with the Fortran 90 standard. As a result, some valid
Fortran 95 programs that rely on the broader definition of restricted
expressions may generate errors when compiled with pathf90.

[Bug 7728] Inline assembly '=A' constraint not supported

The =A constraint indicates that both eax and edx are used to hold a 64-bit
value. This allows the 64-bit value to be propagated to ret without having
to build it up from two 32-bit parts. The current PathScale compilers do not
implement this constraint correctly.

[Bug 10192] pathf95's -cpp may not handle the ## operator correctly

The command "pathf95 -cpp" invokes cpp with the -traditional option so that tabs are handled properly.
Without -traditional, cpp will convert tabs to single spaces, which may
corrupt a fixed format file.

[Bug 10388] pathf95: Use of cpp with directives before or between continuations

Using cpp as the preprocessor causes # lines to be generated. When one of
these lines appears before a continuation line, the compilation fails with
an "unexpected syntax" message.
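
As background for Bug 10192 above: in ISO C preprocessing, the ## operator
pastes two tokens into a single token. As far as we know, cpp in
-traditional mode does not treat ## as an operator, which is why sources
that depend on it may not preprocess as intended under "pathf95 -cpp". The
following C sketch only illustrates what a standard (non-traditional)
preprocessor does with ##. The names PASTE, counter_a, counter_b,
sum_counters, and check_paste are hypothetical and chosen for illustration.

```c
#include <assert.h>

/* Hypothetical macro: pastes its two arguments into one identifier. */
#define PASTE(a, b) a##b

/* Two ordinary variables whose names share the prefix "counter". */
static int counter_a = 10;
static int counter_b = 32;

/* PASTE(counter, _a) expands to the single identifier counter_a,
   and PASTE(counter, _b) expands to counter_b. */
int sum_counters(void)
{
    return PASTE(counter, _a) + PASTE(counter, _b);
}

/* Self-check that can be called from a main() of your own. */
void check_paste(void)
{
    assert(sum_counters() == 42);
}
```

Calling check_paste() from a small main() confirms that both pasted names
resolve to the intended variables. If a Fortran source relies on this kind
of pasting, one option is to preprocess it separately with a
non-traditional cpp, keeping in mind the tab-handling caveat described
under Bug 10192.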