Failing work units



Message boards : Leiden Classical : Failing work units

Biggles
Joined: Feb 13, 2006
Posts: 5
ID: 89
Credit: 447,470
RAC: 4
Message 2271 - Posted 23 Sep 2008 14:59:45 UTC
Last modified: 23 Sep 2008 15:02:16 UTC

I've been crunching on a new laptop over the past week or so. It has an Intel Core 2 Duo T7500 processor. The problem is that it keeps failing work units that use the trajtou-pt111 5.38 application. They just end up with computation errors after 1 second. Other work units are fine, and the laptop is otherwise stable.

Anybody else seen something similar? Any ideas as to the cause? I tried re-downloading the app in case it was corrupt, but that hasn't fixed it.

It's quite irritating and will be slowing down crunching a bit.

The troublesome host.
____________

[AF>EDLS]GuL
Joined: Apr 29, 2006
Posts: 6
ID: 1209
Credit: 1,959,746
RAC: 791
Message 2274 - Posted 25 Sep 2008 15:15:05 UTC

Hello, thanks for all your work.

I have the same problem with the same application, on both a Pentium 4 running Win2000 with BOINC 5.10.45 and a Core 2 Duo running WinXP with BOINC 6.2.18. The result IDs are 10639649 and 10633143.

Looking at the error log, it seems the program calls an unknown function. Other Leiden applications and other projects work fine.

Thank you for your help
____________

Beyond
Joined: Jul 15, 2008
Posts: 6
ID: 16078
Credit: 2,022,415
RAC: 308
Message 2279 - Posted 29 Sep 2008 23:37:06 UTC

I have the same thing happening with 1 or 2 WUs a day, except they sometimes run for up to 30 minutes before failing. All the failing WUs are from the trajtou-pt111 5.38 application. In my case the machine is an AMD X2. All other WUs run fine, and only about 1 in 5 of the trajtou-pt111 WUs fail. I was thinking of maybe aborting those WUs when they arrive.

m.somers
Forum moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: Nov 14, 2005
Posts: 662
ID: 1
Credit: 1,417,572
RAC: 2
Message 2282 - Posted 30 Sep 2008 7:14:55 UTC

The Pt111 code runs trajectories based on a random seed derived from the WU name. Depending on the combination of the initial random seed and the specific hardware you run the WU on, the trajectory sometimes goes haywire. Tests on my machine here with many, many seeds show that about 3% of the trajectories do this. Reproducing such a trajectory from a client without having the exact same hardware is nearly impossible, which makes this really hard to fix. When I have some spare time on my hands, I'll investigate this a bit more and see if I can fix it...

m.

____________
M.F. Somers
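
To illustrate the point about seeds and hardware: the thread does not say how the Pt111 code derives its seed or integrates a trajectory, so the following is only a minimal Python sketch under stated assumptions. The hash-based `seed_from_wu_name`, the made-up work unit name, and the logistic-map `toy_trajectory` are all hypothetical stand-ins. It shows that the same name always gives the same run on one machine, while a change of about 1e-15 in the starting value (standing in for a rounding difference between CPUs) can send a chaotic trajectory somewhere else entirely.

```python
import hashlib
import random


def seed_from_wu_name(wu_name: str) -> int:
    """Hypothetical seed derivation: first 8 bytes of SHA-256 of the WU name."""
    digest = hashlib.sha256(wu_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def toy_trajectory(seed: int, steps: int = 60, nudge: float = 0.0) -> list:
    """Iterate the chaotic logistic map from a seeded starting point.

    `nudge` stands in for a tiny rounding difference between machines;
    this is NOT the real Pt111 dynamics, just a chaotic toy system.
    """
    rng = random.Random(seed)
    x = rng.uniform(0.1, 0.9) + nudge
    path = []
    for _ in range(steps):
        x = 3.99 * x * (1.0 - x)
        path.append(x)
    return path


wu = "wu_example_pt111_000123"  # made-up work unit name
seed = seed_from_wu_name(wu)

same_a = toy_trajectory(seed)
same_b = toy_trajectory(seed)
nudged = toy_trajectory(seed, nudge=1e-15)

print("identical reruns:", same_a == same_b)
print("final gap after nudge:", abs(same_a[-1] - nudged[-1]))
```

Rerunning with the same seed on the same machine is exactly reproducible, which is why a failing trajectory is easy to replay locally but hard to replay from someone else's hardware.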

Beyond
Joined: Jul 15, 2008
Posts: 6
ID: 16078
Credit: 2,022,415
RAC: 308
Message 2295 - Posted 2 Oct 2008 15:54:55 UTC - in response to Message ID 2282.

The Pt111 code runs trajectories based on a random seed derived from the WU name. Depending on the combination of the initial random seed and the specific hardware you run the WU on, the trajectory sometimes goes haywire. Tests on my machine here with many, many seeds show that about 3% of the trajectories do this. Reproducing such a trajectory from a client without having the exact same hardware is nearly impossible, which makes this really hard to fix. When I have some spare time on my hands, I'll investigate this a bit more and see if I can fix it...

m.

Thanks for the reply. Notice that the three of us are running 3 very different architectures: Intel C2D, Intel P4 and AMD X2. I went 1.5 days without one of these failing and have now had 3 bad ones in the last few hours. I wouldn't mind so much if they failed immediately, but wasting up to 30 minutes per failure is annoying.

Rob.B
Joined: Mar 21, 2009
Posts: 1
ID: 20277
Credit: 14,714
RAC: 6
Message 2409 - Posted 18 Apr 2009 20:03:05 UTC
Last modified: 18 Apr 2009 20:03:37 UTC

Right, third try at posting this reply. Some blasted S/W somewhere on the board is flagging my message as SPAM! I wouldn't mind, but I'm a vegetarian!!!!!

OK, being a little more serious: I too have had failing work units on more than one box. Looking at the results, I see that quite often I am not the only one to fail that WU. Additionally, in more cases than not the failures come from the BOINC 6.6.x family of clients. For example, WU 5660010.

This failure is from my Win7 test rig, but I have had similar issues with 6.6.x clients on Vista and XP Pro.

Rob.

ocf81
Joined: Oct 4, 2009
Posts: 2
ID: 23871
Credit: 304,618
RAC: 17
Message 2504 - Posted 23 Oct 2009 20:44:23 UTC
Last modified: 23 Oct 2009 20:53:46 UTC

I too have a lot of failures of Leiden Classical WUs. I'm running two i920s: one on Vista x64 (6 GiB RAM) and one on Vista x64 SP1 (3 GiB RAM).

Both are running BOINC 6.6.38.

Other projects are running fine.

ocf81
Joined: Oct 4, 2009
Posts: 2
ID: 23871
Credit: 304,618
RAC: 17
Message 2508 - Posted 2 Nov 2009 9:32:59 UTC

One thing I've noticed is that it usually involves 'finished' work units that somehow become corrupted when the BOINC client starts up after reboot.

dasy2k1
Joined: Feb 18, 2007
Posts: 4
ID: 4256
Credit: 111,908
RAC: 78
Message 2719 - Posted 30 May 2011 15:54:45 UTC

It could be that you don't have GLUT installed.

I'm not sure how to do that on Windows, but it certainly seems to be needed on Linux.
____________
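
For anyone who wants to check the GLUT suggestion on their own machine, here is a rough Python sketch using the standard `ctypes` module. Whether the Leiden Classical applications actually depend on GLUT is the poster's suggestion and is not confirmed in this thread; the library names tried (`glut`, `freeglut`) and the helper names `find_glut` and `glut_loadable` are assumptions made for illustration.

```python
import ctypes
import ctypes.util


def find_glut():
    """Return the name or path of a GLUT shared library, or None if not found."""
    # Assumed candidate names; your system may ship GLUT under another name.
    for name in ("glut", "freeglut"):
        found = ctypes.util.find_library(name)
        if found:
            return found
    return None


def glut_loadable() -> bool:
    """True if a GLUT library was found and the dynamic loader can open it."""
    lib = find_glut()
    if lib is None:
        return False
    try:
        ctypes.CDLL(lib)
    except OSError:
        return False
    return True


print("GLUT library:", find_glut())
print("Loadable:", glut_loadable())
```

If this reports that nothing was found, installing your distribution's GLUT or freeglut package and rerunning the check would be the obvious next step.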







Copyright © 2017 Leiden University - Leiden Institute of Chemistry - Theoretical Chemistry Department