All tasks error out on Linux 64 bit host


S@NL - John van Gorsel
Joined: May 7, 2007
Posts: 26
ID: 6250
Credit: 7,250,611
RAC: 54
Message 2798 - Posted 12 Feb 2012 11:52:19 UTC

I recently updated this host from OpenSuse 11.0 (64 bit) to OpenSuse 12.1 (64 bit) (kernel update from 2.6.25 to 3.1.9).
I re-installed BOINC and its projects and it works fine with Seti, but all Leiden tasks error out after 30-40 minutes like this:


Sun Feb 12 01:42:30 2012 Leiden Classical Starting task wu_164284800_1328084580_23580_2 using classical version 556
Sun Feb 12 02:18:19 2012 Leiden Classical Computation for task wu_164284800_1328084580_23580_2 finished
Sun Feb 12 02:18:19 2012 Leiden Classical Output file wu_164284800_1328084580_23580_2_1 for task wu_164284800_1328084580_23580_2 absent


All libraries seem to be present:

ldd classical_5.56_x86_64-pc-linux-gnu.exe
linux-vdso.so.1 => (0x00007fffe8bff000)
libm.so.6 => /lib64/libm.so.6 (0x00007f119e494000)
libglut.so.3 => /usr/lib64/libglut.so.3 (0x00007f119e24c000)
libGLU.so.1 => /usr/lib64/libGLU.so.1 (0x00007f119dfde000)
libGL.so.1 => /usr/lib64/libGL.so.1 (0x00007f119dcce000)
libXmu.so.6 => /usr/lib64/libXmu.so.6 (0x00007f119dab4000)
libXt.so.6 => /usr/lib64/libXt.so.6 (0x00007f119d84d000)
libXext.so.6 => /usr/lib64/libXext.so.6 (0x00007f119d63a000)
libXi.so.6 => /usr/lib64/libXi.so.6 (0x00007f119d42a000)
libSM.so.6 => /usr/lib64/libSM.so.6 (0x00007f119d221000)
libICE.so.6 => /usr/lib64/libICE.so.6 (0x00007f119d005000)
libX11.so.6 => /usr/lib64/libX11.so.6 (0x00007f119ccc4000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f119caa7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f119c718000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f119c40e000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f119c1f8000)
libnvidia-tls.so.275.09.07 => /usr/lib64/tls/libnvidia-tls.so.275.09.07 (0x00007f119bff6000)
libnvidia-glcore.so.275.09.07 => /usr/lib64/libnvidia-glcore.so.275.09.07 (0x00007f119a1d0000)
librt.so.1 => /lib64/librt.so.1 (0x00007f1199fc8000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f1199dc4000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f1199bbf000)
libxcb.so.1 => /usr/lib64/libxcb.so.1 (0x00007f11999a2000)
/lib64/ld-linux-x86-64.so.2 (0x00007f119e6eb000)
libXau.so.6 => /usr/lib64/libXau.so.6 (0x00007f119979e000)


Any advice is welcome!
____________

m.somers
Forum moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: Nov 14, 2005
Posts: 662
ID: 1
Credit: 1,417,572
RAC: 2
Message 2799 - Posted 16 Feb 2012 9:32:36 UTC

Try downloading the executable from http://boinc.gorlaeus.net/download/classical_5.56_x86_64-pc-linux-gnu.exe and get the example input file from http://boinc.gorlaeus.net/download/classical.water_molecules_BIG_64

Then run the executable using this input:

classical_5.56_x86_64-pc-linux-gnu.exe classical.water_molecules_BIG_64


That might give some more clues....

m.
____________
M.F. Somers

S@NL - John van Gorsel
Message 2801 - Posted 16 Feb 2012 17:38:36 UTC
Last modified: 16 Feb 2012 17:40:57 UTC

And the result is:


Can't open init data file - running in standalone mode
SIGSEGV: segmentation violation
Stack trace (3 frames):
./classical_5.56_x86_64-pc-linux-gnu.exe[0x4756ad]
/lib64/libc.so.6(+0x34e10)[0x7fb7c7c14e10]
/usr/lib64/libGL.so.1(+0x2ecd49)[0x7fb7c9482d49]

Exiting...


This is the stderr.txt output. The first line was written at the start; the other lines were added when the program crashed after 22 minutes.
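
To see at a glance which modules show up in a trace like the one above, the frames can be split into module, offset and address. The following is only a minimal Python sketch: the function name parse_frames and the regular expression are made up for illustration, and it assumes frames are printed as "module(offset)[address]" or "module[address]", as in this stderr.txt.

```python
import re

# One frame per line: a module path, an optional "(symbol+offset)" part,
# and a hexadecimal address in square brackets.
FRAME = re.compile(
    r"^(?P<module>[^\s(\[]+)"          # path of the executable or library
    r"(?:\((?P<symbol>[^)]*)\))?"      # optional "(+0x...)" part
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(text):
    """Return one dict per stack frame found in text; other lines are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


SAMPLE = """SIGSEGV: segmentation violation
Stack trace (3 frames):
./classical_5.56_x86_64-pc-linux-gnu.exe[0x4756ad]
/lib64/libc.so.6(+0x34e10)[0x7fb7c7c14e10]
/usr/lib64/libGL.so.1(+0x2ecd49)[0x7fb7c9482d49]
"""

for frame in parse_frames(SAMPLE):
    print(frame["module"], frame["symbol"], frame["address"])
```

In this trace the last frame listed lies in /usr/lib64/libGL.so.1, so the graphics library is one place to look, although the trace alone does not prove where the fault originates.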

S@NL - John van Gorsel
Message 2802 - Posted 16 Feb 2012 20:04:40 UTC

I noticed the same problem on a different pc: different hardware, but the identical Linux version.

m.somers
Message 2804 - Posted 21 Feb 2012 9:15:55 UTC

H'mmm please also try the stand alone executable from http://boinc.gorlaeus.net/download/DownLoads/Standalone/Executables/Graphics/GLUT_ClassicalDynamics_Linux_EM64T.x

and see if that also crashes (to eliminate BOINC from the equation)?

m.
____________
M.F. Somers

S@NL - John van Gorsel
Message 2805 - Posted 21 Feb 2012 16:46:48 UTC - in response to Message ID 2804.
Last modified: 21 Feb 2012 16:47:32 UTC

I used ./GLUT_ClassicalDynamics_Linux_EM64T.x classical.water_molecules_BIG_64

This ran for about 25 minutes and finished without any indication of a crash. The console output ends with "Dynamical simulation has finished:" with a list of parameters below it (e.g. The current potential energy of system). I assume this means it ran just fine.

m.somers
Message 2806 - Posted 22 Feb 2012 7:12:10 UTC

H'mmm okay; so it is not our Classical application itself; it is probably something with the BOINC library included in the app. This is not trivial to change. The change is also global (for all hosts etc.), I'm afraid. Also, there is no guarantee that a newer BOINC library will fix things. The next step would be to see if it is related to your kernel / OS and check if a newer BOINC lib would fix it on your kernel.

This could take a while as understaffed as I am right now ;-).

You might want to have a look at it yourself?

http://boinc.gorlaeus.net/download/DownLoads/Classical.tar.gz

contains all the code and makefiles; you could untar it, recompile it on your host with the current BOINC lib, and test (go into the 'Classical Dynamics for Linux' directory and run 'make -f Makefile_BOINC_EM64T'). If that test does not help, the trickery starts, because then a new BOINC library needs to be recompiled on your host into the 'boinc_libs' directory...

m.
____________
M.F. Somers

S@NL - John van Gorsel
Message 2814 - Posted 31 Mar 2012 12:39:17 UTC
Last modified: 31 Mar 2012 12:40:27 UTC

I have tried to follow the steps you mentioned, but the output of the make command starts with:


Apollo:~/Downloads/Classical/ClassicalDynamics for Linux # make -f Makefile_BOINC_EM64T
g++ -DUSE_BOINC_GRAPH_API -DUSE_CRLIBM -m64 -O3 -I../crlibm_libs/linux/includes -I../glut_libs/linux/includes -I../boinc_libs/linux/includes -I../sources -c ../sources/*.cpp
In file included from ../sources/Main.cpp:874:0:
../sources/OpenGl.inc:21:22: fatal error: GL/gl.h: No such file or directory
compilation terminated.

After that, a long list of warnings follows.
____________

m.somers
Message 2815 - Posted 2 Apr 2012 8:10:56 UTC

H'mmm, it seems that you have not installed your X11 / OpenGL development libraries (which include the needed headers). For my CentOS 5 system, these headers are in the mesa-libGL-devel-6.5.1-7.10.el5 package.
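
A quick way to check whether a header such as GL/gl.h is present before running make is sketched below in Python. The function name find_header and the two default directories are assumptions for illustration only; the compiler also searches the -I directories given in the makefile, which this sketch does not look at.

```python
from pathlib import Path


def find_header(relative_name, include_dirs=("/usr/include", "/usr/local/include")):
    """Return the first existing path for relative_name, or None if it is absent."""
    for directory in include_dirs:
        candidate = Path(directory) / relative_name
        if candidate.is_file():
            return candidate
    return None


hit = find_header("GL/gl.h")
if hit is None:
    # Assumed default locations only; a distribution may install headers elsewhere.
    print("GL/gl.h not found in the default include directories")
else:
    print("found", hit)
```

If the header is reported missing, installing the distribution's OpenGL development package is the likely next step.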

m.
____________
M.F. Somers

S@NL - John van Gorsel
Message 2816 - Posted 2 Apr 2012 11:52:48 UTC - in response to Message ID 2815.

Ok, I needed to install the mesa libraries in openSuse.

I now compiled BOINC_ClassicalDynamics_Linux_E64T.x (please note that I had to remove the existing one, as the Make utility stated that the existing file was up-to-date).

How do I test this file? The file does not run in stand-alone mode, and renaming it to "classical_5.56_64-pc-linux-gnu.exe" and replacing the file in BOINC/projects/boinc.gorlaeus.net doesn't work either: task files fail to download with the message: [error] File classical_5.56_64-pc-linux-gnu.exe has wrong size: expected 1179512, got 1158176.

m.somers
Message 2817 - Posted 3 Apr 2012 12:30:07 UTC - in response to Message ID 2816.

You should not replace your BOINC binary (the binaries are digitally signed by BOINC, so that's why that didn't work). What you can do is run the binary, supplying an input file: './binary inputfile'. Examples are included in the .tar.gz file.


m.

____________
M.F. Somers

S@NL - John van Gorsel
Message 2824 - Posted 2 May 2012 14:02:32 UTC

I ran the file in standalone mode using ./BOINC_ClassicalDynamics_Linux_E64T.x classical.water_molecules_BIG_64.

The result is identical to running ./GLUT_ClassicalDynamics_Linux_EM64T.x classical.water_molecules_BIG_64 (as I did on February 21)

Anyway, the newly compiled file seems to work fine.

S@NL - John van Gorsel
Message 2826 - Posted 19 May 2012 10:10:18 UTC

Looks like I found a solution for the problem. Since the problem was not in the Classical executable, I updated BOINC to the latest version, 7.0.25. The BOINC manager refused to start and ldd showed the following:


ldd boincmgr
linux-vdso.so.1 => (0x00007fff5d5ff000)
libwx_gtk2u_html-2.8.so.0 => not found
libwx_gtk2u_adv-2.8.so.0 => not found
libwx_gtk2u_core-2.8.so.0 => not found
libwx_baseu_net-2.8.so.0 => not found
libwx_baseu-2.8.so.0 => not found
libsqlite3.so.0 => /usr/lib64/libsqlite3.so.0 (0x00007ff2b4669000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ff2b4465000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff2b4248000)
libgtk-x11-2.0.so.0 => /usr/lib64/libgtk-x11-2.0.so.0 (0x00007ff2b3c0f000)
libgobject-2.0.so.0 => /usr/lib64/libgobject-2.0.so.0 (0x00007ff2b39bf000)
libglib-2.0.so.0 => /usr/lib64/libglib-2.0.so.0 (0x00007ff2b36c7000)
libnotify.so.4 => /usr/lib64/libnotify.so.4 (0x00007ff2b34bf000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007ff2b31b5000)
libm.so.6 => /lib64/libm.so.6 (0x00007ff2b2f5e000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff2b2d48000)
libc.so.6 => /lib64/libc.so.6 (0x00007ff2b29b8000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff2b4931000)
libgdk-x11-2.0.so.0 => /usr/lib64/libgdk-x11-2.0.so.0 (0x00007ff2b2704000)
libpangocairo-1.0.so.0 => /usr/lib64/libpangocairo-1.0.so.0 (0x00007ff2b24f7000)
libX11.so.6 => /usr/lib64/libX11.so.6 (0x00007ff2b21b6000)
libXfixes.so.3 => /usr/lib64/libXfixes.so.3 (0x00007ff2b1faf000)
libatk-1.0.so.0 => /usr/lib64/libatk-1.0.so.0 (0x00007ff2b1d8c000)
libcairo.so.2 => /usr/lib64/libcairo.so.2 (0x00007ff2b1ad7000)
libgdk_pixbuf-2.0.so.0 => /usr/lib64/libgdk_pixbuf-2.0.so.0 (0x00007ff2b18b8000)
libgio-2.0.so.0 => /usr/lib64/libgio-2.0.so.0 (0x00007ff2b1574000)
libpangoft2-1.0.so.0 => /usr/lib64/libpangoft2-1.0.so.0 (0x00007ff2b1349000)
libpango-1.0.so.0 => /usr/lib64/libpango-1.0.so.0 (0x00007ff2b10fd000)
libfontconfig.so.1 => /usr/lib64/libfontconfig.so.1 (0x00007ff2b0ec7000)
libgmodule-2.0.so.0 => /usr/lib64/libgmodule-2.0.so.0 (0x00007ff2b0cc3000)
libgthread-2.0.so.0 => /usr/lib64/libgthread-2.0.so.0 (0x00007ff2b0abe000)
libffi.so.4 => /usr/lib64/libffi.so.4 (0x00007ff2b08b6000)
libpcre.so.0 => /lib64/libpcre.so.0 (0x00007ff2b0678000)
librt.so.1 => /lib64/librt.so.1 (0x00007ff2b0470000)
libXext.so.6 => /usr/lib64/libXext.so.6 (0x00007ff2b025d000)
libXrender.so.1 => /usr/lib64/libXrender.so.1 (0x00007ff2b0052000)
libXinerama.so.1 => /usr/lib64/libXinerama.so.1 (0x00007ff2afe4f000)
libXi.so.6 => /usr/lib64/libXi.so.6 (0x00007ff2afc3f000)
libXrandr.so.2 => /usr/lib64/libXrandr.so.2 (0x00007ff2afa36000)
libXcursor.so.1 => /usr/lib64/libXcursor.so.1 (0x00007ff2af82b000)
libXcomposite.so.1 => /usr/lib64/libXcomposite.so.1 (0x00007ff2af628000)
libXdamage.so.1 => /usr/lib64/libXdamage.so.1 (0x00007ff2af425000)
libfreetype.so.6 => /usr/lib64/libfreetype.so.6 (0x00007ff2af199000)
libxcb.so.1 => /usr/lib64/libxcb.so.1 (0x00007ff2aef7c000)
libpixman-1.so.0 => /usr/lib64/libpixman-1.so.0 (0x00007ff2aecf4000)
libpng14.so.14 => /usr/lib64/libpng14.so.14 (0x00007ff2aeacb000)
libz.so.1 => /lib64/libz.so.1 (0x00007ff2ae8b3000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007ff2ae695000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007ff2ae47e000)
libexpat.so.1 => /lib64/libexpat.so.1 (0x00007ff2ae253000)
libXau.so.6 => /usr/lib64/libXau.so.6 (0x00007ff2ae04f000)


After installing the missing libraries (all available from the OpenSuse DVD), BOINC runs fine and the pc successfully finished two Classical tasks (not yet validated though).
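
In a long ldd listing like the one above, the unresolved entries are easy to miss. A minimal Python sketch for pulling them out follows; the function name missing_libraries is hypothetical, and it assumes only the "name => not found" line format that ldd printed here.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines without " => " (such as the dynamic loader) have no arrow part.
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


# A few lines copied from the boincmgr listing in this post.
SAMPLE = """linux-vdso.so.1 => (0x00007fff5d5ff000)
libwx_gtk2u_html-2.8.so.0 => not found
libwx_baseu-2.8.so.0 => not found
libsqlite3.so.0 => /usr/lib64/libsqlite3.so.0 (0x00007ff2b4669000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff2b4931000)
"""

for name in missing_libraries(SAMPLE):
    print(name)
```

The names it prints are the ones to look up in the distribution's package manager.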



Copyright © 2017 Leiden University - Leiden Institute of Chemistry - Theoretical Chemistry Department