Segmentation fault or bus error
Introduction
A segmentation fault occurs when a program tries to access a memory area it is not allowed to access, or accesses memory in an invalid way. A bus error usually has the same cause. The error may occur more often when you use a bigger data set, or after you update to a version of the programs in which we have made some changes. When the program fails (crashes) without any message, a segmentation fault has most likely occurred.
There are several possible reasons for this error.
- Missing configuration in your computing environment
- Running out of free memory
- A bug in the program
This error can happen even if your computer has a lot of memory installed. Please try the following settings before you file a bug report.
Settings in Linux and macOS
Stack size: Please type the command ulimit -s in your shell (terminal). If it shows unlimited, the configuration looks good. If you see a number (like 8192), that is probably the problem. Please type the following command before running our programs.
ulimit -s unlimited
The operating system limits the resources (e.g. memory) that a user can use on the computer. This command changes that limit. The option -s refers to the stack, a memory area that each program uses for its temporary (local) data. The number is in kilobytes, so 8192 means 8 MB. Please see the Linux/UNIX manual for details.
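The value shown by ulimit -s is the soft limit, and an ordinary user can raise it only up to the hard limit. The sketch below prints both, so you can see in advance whether ulimit -s unlimited can succeed. The function name show_stack_limits is made up for illustration, and the -S and -H options are those of bash and similar shells.

```shell
# Hypothetical helper: print the soft and hard stack-size limits.
# The soft limit is the one in effect; the hard limit is its ceiling.
show_stack_limits() {
    echo "soft: $(ulimit -S -s)"
    echo "hard: $(ulimit -H -s)"
}

show_stack_limits
```

If the hard limit is a number rather than unlimited (which can be the case on macOS), the request for unlimited is refused, and the hard limit is the largest value you can set without administrator rights.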
OpenMP stack size: Please type the command echo $OMP_STACKSIZE in your shell. If it shows nothing, it may be a problem. Even if it shows a number with a unit (like 4M for 4 megabytes), the value may be too small. By default, this value is 4M and most likely it is too small. Please type the following command before running our program.
export OMP_STACKSIZE=64M
Do not put any spaces around =. If the program still stops with the same error, please increase the number gradually (like 128M, 192M, etc.). A value that is too large will consume a lot of memory because each thread can use this amount of memory. It is hard to tell what is suitable for the user; it is system-dependent.
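As a rough worked example of why a large value is costly: if every thread used its full stack, the total would be about the number of threads times OMP_STACKSIZE. The sketch below does this multiplication. The function name and the thread count of 16 are illustrative assumptions, and the result is an upper bound rather than a measurement, because a thread normally touches only part of its stack.

```shell
# Hypothetical helper: upper bound (in MB) on thread-stack memory,
# assuming each thread uses its whole OMP_STACKSIZE.
omp_stack_budget_mb() {
    threads=$1        # number of OpenMP threads (assumed by the user)
    stack_mb=$2       # OMP_STACKSIZE in megabytes, e.g. 64 for 64M
    echo $(( threads * stack_mb ))
}

omp_stack_budget_mb 16 64     # prints 1024
omp_stack_budget_mb 16 512    # prints 8192
```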
If you want to change it temporarily, you can put it on the command line when you run the program (no export).
OMP_STACKSIZE=64M ./airemlf90
In this way, you can find a reasonable setting empirically.
OMP_STACKSIZE defines the stack size for the OpenMP library, which handles parallel computing. This value is independent of the system stack size.
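The empirical search can be automated with a small loop. The sketch below is a hypothetical helper (try_stack_sizes is a made-up name, not an existing tool) that runs a given command with increasing OMP_STACKSIZE values and stops at the first one that exits successfully. It treats any non-zero exit status as a failure, so a run that fails for an unrelated reason also triggers the next attempt. The list of sizes is only an example.

```shell
# Hypothetical helper: run a command with increasing OMP_STACKSIZE values
# and print the first value for which the command exits with status 0.
try_stack_sizes() {
    for size in 64M 128M 192M 256M; do
        echo "Trying OMP_STACKSIZE=$size" >&2
        # The assignment applies to this single run only (no export).
        if OMP_STACKSIZE=$size "$@"; then
            echo "$size"
            return 0
        fi
    done
    echo "No value in the list worked" >&2
    return 1
}

# Usage (add the arguments or input redirection you normally use):
#   try_stack_sizes ./airemlf90
```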
The above commands, ulimit and export OMP_STACKSIZE, can be saved in a start-up file that is executed automatically, so you don't have to type them manually before running the program. Please put the commands into either .bash_profile or .bashrc in your home directory. After the change, log out and log in again so that the settings take effect.
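For example, the lines added to the start-up file might look like the sketch below. It assumes bash. The fallback to the hard limit and the warning message are additions for systems that refuse unlimited, and 64M is only the starting value suggested earlier.

```shell
# Lines for ~/.bashrc or ~/.bash_profile (bash assumed).

# Raise the soft stack limit. If the system refuses "unlimited", fall back
# to the hard limit, the largest value an ordinary user may set.
ulimit -S -s unlimited 2>/dev/null \
    || ulimit -S -s hard 2>/dev/null \
    || echo "warning: could not raise the stack size limit" >&2

# Per-thread stack size for OpenMP; increase it if the error persists.
export OMP_STACKSIZE=64M
```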
Settings in Windows
We compile the programs with the unlimited-stack-size option, so the user shouldn't hit the system stack issue. OpenMP may still be a limiting factor.
You can set the environment variable OMP_STACKSIZE in Command Prompt (for a temporary change) or on the Control Panel/system-configuration page (for a permanent change). For details, please search the Internet for keywords like windows, environment, variable.
The value should be a number plus a unit, like 64M for 64 megabytes.
First, try 64M, and if you still see the problem, increase the number to 128M or more.
A value that is too large will consume a lot of memory because each thread can use this amount of memory.
The suitable value depends on your computer, so please find it empirically.
Memory usage
Another cause of a segmentation fault is a memory shortage (although this usually generates the insufficient memory error).
Please monitor the memory usage of the program.
There are a few options to solve this issue: install more memory modules in your computer, increase the swap area or the page-file size, use smaller data sets, or turn off the parallel computations.
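One simple way to monitor memory usage on Linux or macOS is to sample the resident set size (RSS) of the running process with ps. The sketch below is illustrative: rss_kb is a made-up helper name, and you have to look up the process ID of the program yourself (for example with top). RSS is reported in kilobytes.

```shell
# Hypothetical helper: print the resident memory (RSS, in KB) of a process.
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Example: memory used by the current shell itself.
rss_kb $$

# To follow a running program, replace PID with its process ID:
#   while kill -0 PID 2>/dev/null; do rss_kb PID; sleep 5; done
```

If the reported value keeps growing toward the amount of installed memory before the crash, a memory shortage is the more likely cause than the stack size.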
Bug report
If you have tried all of the above suggestions but still have the error, it may be a bug. Your report helps us find possible bugs in our programs. Please file it in the blupf90 discussion group on Groups.io or send an email to one of the people working in the Animal Breeding and Genetics Group at the University of Georgia. The support is volunteer-based and it may take some time to solve the problem.
Further reading about segmentation faults
See the Wikipedia article on segmentation fault.
FAQ and tips
The previous version does not have this problem. Isn't it a bug in the program? No, most likely not. We use the latest version in our research every day and we do not hit the issue. The output of the software is effectively identical between versions unless there is an intended change.
Why does the new version fail? We use the Intel Fortran Compiler, which tries to generate the fastest possible executable. The binary tends to use more memory than one generated by other compilers (like gfortran). Even with a small change in the code, the memory usage may differ, and sometimes you are unlucky and hit the error.
The program works on my computer but fails on somebody else's computer. Why? It is system-dependent. It also depends on the data set and the model you are using. The segmentation fault occurs more often when you use genomic data, which requires more memory and multiple threads in computations.
Can you give me a more precise explanation of this fault? Our software tries to achieve maximum speed in exchange for a bit more memory. There are two types of memory-allocation strategies in your system: stack and heap (see https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap). Stack memory is much faster than heap memory, but its size is usually limited by the system. Even if you have plenty of physical memory, the stack size is small by default. The heap can use all of the memory in your system, but it is slower than the stack because its memory management is more complicated. Our programs use the stack more aggressively than the heap to improve speed, so the stack size should be large (or unlimited).