| Author | Message |
|
Hello, I am reposting my message here, as it seems to be a better place. Other people seem to have the same problem (see the message boards).
Hello, thanks for all your work.
I have the same problem with the same application, on both a Pentium 4 running Windows 2000 with BOINC 5.10.45 and a Core 2 Duo running Windows XP with BOINC 6.2.18. The result IDs are 10639649 and 10633143.
Looking at the error log, it seems that the program calls an unknown function. Other Leiden applications and other projects work fine.
Thank you for your help.
____________
|
|
|
|
I got a reply from Mark on this "problem". He thinks it is a strange trajectory taken by the trajtou calculation. It's a known problem and it happens with about 3% of the tasks. Probably caused by sloppy coding by the original scientist who wrote the f77 code.
____________
Jord
BOINC FAQ Service |
|
|
|
I got a reply from Mark on this "problem". He thinks it is a strange trajectory taken by the trajtou calculation. It's a known problem and it happens with about 3% of the tasks. Probably caused by sloppy coding by the original scientist who wrote the f77 code.
Many thanks for your answer.
In my case, it is not 3% of the tasks but all of the pt111 WUs that fail. Other WUs are OK.
The WU runs for just 1 second and then ends in error.
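As a rough sanity check, assuming that failures are independent and that each task fails with the quoted probability of about 3% (both are assumptions, not something measured here), the chance that several tasks in a row all fail drops off very quickly. The function name below is just for illustration:

```python
# Assumption: each task fails independently with probability 0.03,
# i.e. the "about 3%" figure quoted above.
def prob_all_fail(n_tasks: int, p_fail: float = 0.03) -> float:
    """Probability that n_tasks independent tasks all fail."""
    return p_fail ** n_tasks


# Print the probability for a few run lengths.
for n in (1, 2, 5, 10):
    print(f"{n:2d} tasks all failing: {prob_all_fail(n):.3e}")
```

If every pt111 WU fails on a machine, a 3% random failure rate alone is therefore an unlikely explanation.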
____________
|
|
|
m.somers Forum moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist
Joined: Nov 14, 2005 Posts: 630 ID: 1 Credit: 1,417,572 RAC: 2
|
The Pt111 code runs trajectories based on a random seed derived from the WU name. Depending on the combination of the initial random seed and the specific hardware you run the WU on, the trajectory sometimes goes haywire. Tests on my machine here with many, many seeds show that about 3% of the trajectories do this. Reproducing such a trajectory from a client without having the exact same hardware is nearly impossible, which makes this really hard to fix. When I have some spare time on my hands, I'll investigate this a bit more and see if I can fix it...
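To illustrate the general idea of a seed derived from a name, here is a minimal sketch. How the real application derives its seed is not described in this thread, so the CRC32 hashing, the function name and the WU name below are all assumptions made up for the example:

```python
import random
import zlib


def seed_from_wu_name(wu_name: str) -> int:
    # Hypothetical derivation: CRC32 of the name.
    # The real application may do this quite differently.
    return zlib.crc32(wu_name.encode("utf-8"))


wu_name = "example_wu_0001"  # made-up name, not a real WU
seed = seed_from_wu_name(wu_name)

# Two generators started from the same seed give the same draws.
rng_a = random.Random(seed)
rng_b = random.Random(seed)
draws_a = [rng_a.random() for _ in range(3)]
draws_b = [rng_b.random() for _ in range(3)]
print(seed, draws_a == draws_b)
```

The same name always gives the same seed, so the random draws themselves are repeatable. What differs between machines is how the floating-point trajectory integration behaves after those draws.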
m.
____________
M.F. Somers |
|
|
|
Thanks,
It's really strange that it works on your machine, since each of my Windows machines seems to have the problem. The message is always the same for every pt111 unit:
<core_client_version>6.2.18</core_client_version>
<![CDATA[
<message>
Fonction incorrecte. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1223339633.000000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 100.000000
Skipping: /max_ncpus_pct
endfile: truncation failed in endfile
apparent state: unit 11 named details.res
last format: list io
lately writing sequential formatted external IO
</stderr_txt>
]]>
____________
|
|
|
m.somers Forum moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist
|
As I explained, it is specific to the combination of the random number being used, the hardware and the system. It is very hard to reproduce this. What you might try is to get the input file and the executable directly from the job directory, put them in a separate temporary directory and run it. See what happens...
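A minimal sketch of that staging step, assuming Python is available on the machine. The helper names, the file prefixes and the dummy file names are illustrative assumptions and not part of the project tools; the demo at the end uses placeholder files so that it runs anywhere:

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def stage_files(src_dir: Path, dest_dir: Path, prefixes: tuple) -> list:
    """Copy every regular file in src_dir whose name starts with a prefix."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.iterdir()):
        if path.is_file() and path.name.startswith(prefixes):
            shutil.copy2(path, dest_dir / path.name)
            copied.append(path.name)
    return copied


def run_standalone(exe: Path, work_dir: Path) -> int:
    """Run the executable with work_dir as current directory; return exit code."""
    return subprocess.run([str(exe)], cwd=work_dir).returncode


# Demo with placeholder files (hypothetical names, no real executable is run).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "project"
    dst = Path(tmp) / "standalone"
    src.mkdir()
    for name in ("trajtou_dummy.exe", "pt111_dummy.inp", "unrelated.txt"):
        (src / name).write_text("placeholder")
    staged = stage_files(src, dst, ("trajtou", "pt111"))
    print(staged)
```

On a real machine you would point `stage_files` at the project directory and then call `run_standalone` on the copied executable, checking stderr.txt and stdout.txt in the temporary directory afterwards.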
m.
____________
M.F. Somers |
|
|
|
Hello,
1) I copied the application trajtou-pt111_5.38_windows_intelx86.exe and all the files beginning with pt111 into a separate directory and ran it from the command line. The Leiden window opened briefly, then closed. Three new files were created: init_data.xml with the parameters used, stdout.txt with nothing inside, and stderr.txt with the following lines:
Can't open init data file - running in standalone mode
open: No such file or directory
apparent state: unit 1 named pt111_trajtou.inp
lately writing direct unformatted external IO
2) If I copy the file pt111_trajtou.inp_BIG_81185150_1219230273_25236 to pt111_trajtou.inp, delete init_data.xml, stdout.txt and stderr.txt, and then retry, the three files are created again.
stderr.txt says:
Can't open init data file - running in standalone mode
endfile: truncation failed in endfile
apparent state: unit 11 named details.res
last format: list io
lately writing sequential formatted external IO
stdout.txt says:
Quasi-classical calculation
Numerical potential
End of potential data input
Interatomic distance: 5.23966125 ATOMIC UNITS
Asymptotic potential:
# the asymptotic potential energy function for H2 on Pt(111)
Random parallel velocity
Random impact-parameter
Random azimuthal orientation
New files are created: detail.res with nothing inside and trajnbr with just 1 inside. The result.res file is here:
Quasi-classical calculation of dimension 6
Interatomic distance: 5.23966125 ATOMIC UNITS
Perp. translation energy: 1.0000E-01 eV
Initial rovib. energy: 9.6296E-03 au Rotational quant. nbr: 1
Mass of atom A: 1.0078E+00, Mass of atom B: 1.0078E+00
Random parallel velocity
Velocity direction: thetav= 0.0000E+00 deg
Random impact-parameter
Random azimuthal orientation
Trajectories 1 to 1500
System parameters:
Initial altitude: 8.0000E+00 Ang.
Maximum integration time: 1.0000E+01 ps
Integration parameters:
Precision: 1.0000E-06, Initial time step: 1.0000E-03 ps
Equation of motion for initial state:
Precision: 1.0000E-06, Initial time step: 7.0000E+00 au
Precision on turning point: 1.0000E-06
Dissociation for r larger than: 2.2200E+00 Ang.
Minimum Z: 5.0000E-02 Ang.
Maximum Z for adsorption: 3.0000E+00 Ang.
Finally, the innmol3.res file:
Bottom of well: 2.94246326 E: -3.78161195
Equilibrium distance (with rotation): 2.94248033
Outer turning point: 3.0071E+00 au
Inner turning point: 2.8833E+00 au
Initial total energy: -3.7720E+00 au
7. 2.99979006 -1.89212869
14. 2.97925916 -3.40013926
21. 2.9499041 -4.14815473
28. 2.9187298 -3.84392318
35. 2.89413733 -2.44328397
42. 2.88344404 -.28569232
49. 2.89001952 1.96343762
56. 2.91177833 3.59624981
63. 2.94212601 4.17821091
70. 2.9726583 3.66721441
77. 2.99586436 2.31153349
84. 3.00667621 .47816494
91. 3.00291095 -1.44977963
ntr 14
98. 2.98531804 -3.08717015
105. 2.95756567 -4.05384754
112. 2.92608055 -4.02393569
Final total energy: -3.7720E+00 au
Period: 8.5714E+01 au
What I understand is that in the first case the application does not find the input file, while in the second case it gets further. Is that right?
I hope this helps.
____________
|
|
|
|
Any news? Did that help you? What else can I do to help?
____________
|
|
|
m.somers Forum moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist
|
Thanks for the effort. The problem is still unsolved. There is hardly anything more you can do...
m.
____________
M.F. Somers |
|
|