Scheduler request failed: HTTP file not found |
Message boards : Leiden Classical : Scheduler request failed: HTTP file not found
Author | Message | |
---|---|---|
Hi, | ||
ID: 2590 | Rating: 0 | |
This machine also has a Windows 7 64-bit hard drive, which does not show this 'error'. | ||
ID: 2591 | Rating: 0 | |
Same here on Ubuntu 10.04 x64. | ||
ID: 2593 | Rating: 0 | |
Because Mr. Juriaan Bakx is situated in my building, I passed by ;-). But having discussed his issue, we have not found a cause yet. | ||
ID: 2596 | Rating: 0 | |
Hi, | ||
ID: 2599 | Rating: 0 | |
H'mmm, according to the dump below, this is not because of a new client or something... I'll investigate the HTTP logs today... | ||
ID: 2600 | Rating: 0 | |
According to the Apache logs, the client makes a request to a non-existent URL: it tries to hit Classical_ci instead of Classical_cgi. Given that plenty of other clients work just fine, I'm beginning to suspect a small bug in the client. The networking (TCP/IP) should be fault tolerant. | ||
ID: 2602 | Rating: 0 | |
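A check like the one described above (finding clients that request a near-miss of the scheduler path) could be sketched as follows. This is only an illustration: the log lines are made up, the Apache common/combined log layout is assumed, and the helper name `near_miss_404s` and the 0.8 similarity threshold are hypothetical choices, not anything used on the project server.

```python
import re
from difflib import SequenceMatcher

EXPECTED_SEGMENT = "Classical_cgi"  # scheduler directory named in the thread

# Matches the request and status fields of an Apache common/combined log line.
REQUEST_RE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

def near_miss_404s(log_lines, expected=EXPECTED_SEGMENT, threshold=0.8):
    """Return 404 paths whose first segment resembles, but is not, `expected`."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match or match.group("status") != "404":
            continue
        path = match.group("path")
        first_segment = path.lstrip("/").split("/", 1)[0]
        similarity = SequenceMatcher(None, first_segment, expected).ratio()
        if first_segment != expected and similarity >= threshold:
            hits.append(path)
    return hits

# Hypothetical log lines, made up for illustration only.
sample_log = [
    '192.0.2.10 - - [21/Aug/2010:12:40:26 -0500] "POST /Classical_ci/cgi HTTP/1.1" 404 210',
    '192.0.2.11 - - [21/Aug/2010:12:45:39 -0500] "POST /Classical_cgi/cgi HTTP/1.1" 200 512',
    '192.0.2.12 - - [21/Aug/2010:12:50:00 -0500] "GET /favicon.ico HTTP/1.1" 404 209',
]

print(near_miss_404s(sample_log))
```

The similarity filter keeps unrelated 404s (such as a missing favicon) out of the report, so only requests that look like a corrupted scheduler URL remain.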
I am having a similar problem: I have a number of WUs sitting there, trying over and over to upload. | ||
ID: 2610 | Rating: 0 | |
Same with me. Unable to upload results for more than 24 hours now. Error is that servers may be temporarily down. | ||
ID: 2611 | Rating: 0 | |
Well, the servers here are online 24 hours a day, 7 days a week... or should be. As it turns out, a power glitch occurred here in Leiden a day or so ago, and apparently one of the servers did not mount its NFS mount correctly. This has been fixed, no work should be lost, and I'll keep an eye on it for now... | ||
ID: 2612 | Rating: 0 | |
Yes, all tasks uploaded overnight. | ||
ID: 2614 | Rating: 0 | |
But the upload failures in 'doze weren't the original problem posted... or at least these sound like two different issues (and messages).

Sat 21 Aug 2010 12:40:23 PM CDT http://boinc.gorlaeus.net/ Requesting new tasks for CPU and GPU
Sat 21 Aug 2010 12:40:26 PM CDT http://boinc.gorlaeus.net/ Scheduler request failed: HTTP file not found

I am NOT getting the error on my Ubuntu v9.10 x64 desktop or on one of the v9.10 x64 'crunchers'; both of those happen to be running boinc v6.10.58 from the PPA.

Sat 21 Aug 2010 12:45:38 PM CDT Leiden Classical Sending scheduler request: Requested by user.
Sat 21 Aug 2010 12:45:38 PM CDT Leiden Classical Requesting new tasks for GPU
Sat 21 Aug 2010 12:45:39 PM CDT Leiden Classical Scheduler request completed: got 0 new tasks

I have a machine running Xubuntu v10.04 x64 with v6.10.58 from the PPA, and it is not having the problem either.

Sat 21 Aug 2010 12:58:13 PM CDT Leiden Classical Sending scheduler request: Requested by user.
Sat 21 Aug 2010 12:58:13 PM CDT Leiden Classical Not reporting or requesting tasks
Sat 21 Aug 2010 12:58:20 PM CDT Leiden Classical Scheduler request completed

So I'm thinking this issue (today anyway) is only with the Linux x64 boinc v6.10.56 from Berkeley. Why I'm only seeing it on this project is a mystery but it seems it has shown up on other projects at various times in the past.

Since the repository package for Debian/Ubuntu is at v6.10.17+, is it safe to assume that all those with this problem are running the x64 client downloaded from Berkeley?

The solution, if running the package-installed Debian/Ubuntu boinc, is to enable backports and add the PPA line to your /etc/apt/sources.list. This will also allow you to keep pretty current on boinc w/o doing the ol' overlay binaries hack that I was using until this PPA came about. See this link to set up the PPA lib and get the current boinc from the repositories. Thanx MUCH to the Debian/Ubuntu boinc package maint. crew!

Personally, I MUCH prefer the Debian/Ubuntu scripts and startup methods over the Berkeley-installed locations, scripts, etc. (all in /home/user/BOINC/). The reason I'm running the Berkeley version on that one freshly redone machine is that there is currently a driver permissions bug in the Debian/Ubuntu setup when trying to do GPU crunching on the ATI HD4xxx & HD5xxx series cards.

I hope someone finds the above ramblings helpful.

- da shu @ HeliOS, "Lack of resources must not keep a child from having access to technology." | ||
ID: 2626 | Rating: 0 | |
"... So I'm thinking this issue (today anyway) is only with the Linux x64 boinc v6.10.56 from Berkeley. Why I'm only seeing it on this project is a mystery but it seems it has shown up on other projects at various times in the past."

It's worse than that... With v6.10.58 from the PPA, the problem doesn't appear on my Xubuntu v10.04 x64 system but DOES on this fresh Ubuntu v10.04 x64 system. Obviously something's different between them... back to that.

- da shu @ HeliOS, "Lack of resources must not keep a child from having access to technology." | ||
ID: 2627 | Rating: 0 | |
Have same problem on Ubuntu 10.04 Server x64 with BOINC 6.10.58 :( | ||
ID: 2663 | Rating: 0 | |
Same here. Since I upgraded to Ubuntu 10.10 64-bit and Boinc client 6.10.58, I've been unable to access Leiden Scheduler. | ||
ID: 2664 | Rating: 0 | |
Please convince yourself that if you go to boinc.gorlaeus.net and look at the page source, you will find the correct url to the scheduler: <scheduler> http://boinc.gorlaeus.net/Classical_cgi/cgi </scheduler>. If the client has something different; the issue is a bug in the BOINC client itself. TCP/IP does ensure correct transport. | ||
ID: 2669 | Rating: 0 | |
"Please convince yourself that if you go to boinc.gorlaeus.net and look at the page source, you will find the correct url to the scheduler: <scheduler> http://boinc.gorlaeus.net/Classical_cgi/cgi </scheduler>. If the client has something different; the issue is a bug in the BOINC client itself. TCP/IP does ensure correct transport."

I had a look in master_boinc.gorlaeus.net.xml and found:
<scheduler> http://boinc.gorlaeus.net/Classical_cgi/cgi </scheduler>

I'm currently running malariacontrol.net, where:
<scheduler_url>http://www.malariacontrol.net/malariacontrol_cgi/cgi</scheduler_url>

No problems there, so it would be weird if the BOINC client were only messing up the Leiden Classical scheduler URL...

If I change the scheduler URL within client_state.xml to e.g. http://boinc.gorlaeus.net/joop/cgi and then run a project update (./boinccmd --project http://boinc.gorlaeus.net/ update), the URL gets changed back to http://boinc.gorlaeus.net/Classical_ci//cgi. So where is this URL fetched from?

[EDIT #1] Installed a (very) old BOINC version (6.6.41), same issue...
[EDIT #2] Installed BOINC 6.10.58 on Ubuntu 9.10 Server x64, no problems... | ||
ID: 2679 | Rating: 0 | |
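The comparison made in the post above, between what the master file advertises and what the client ends up using, can be sketched as below. This is a hedged illustration only: the regular expression and the helper `scheduler_urls` are hypothetical, and nothing here reproduces how the BOINC client itself reads the master file. The two URLs are the ones quoted in the thread.

```python
import re

# Captures everything between the tags, including any padding whitespace.
SCHEDULER_RE = re.compile(r"<scheduler>(.*?)</scheduler>", re.DOTALL)

def scheduler_urls(page_source):
    """Return (raw, stripped) pairs for every <scheduler> element in the text."""
    return [(raw, raw.strip()) for raw in SCHEDULER_RE.findall(page_source)]

# Master file entry and client-side URL as quoted in the thread.
master = "<scheduler> http://boinc.gorlaeus.net/Classical_cgi/cgi </scheduler>"
client_url = "http://boinc.gorlaeus.net/Classical_ci//cgi"

for raw, clean in scheduler_urls(master):
    print(repr(raw), "->", repr(clean))
    print("client matches master:", client_url == clean)
```

Printing the raw value with `repr` makes the leading and trailing spaces visible, which is the one difference between the Leiden Classical entry and the malariacontrol.net entry quoted above.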
Found another issue:

<file_info>
...
<url>http://boinc.gorlaeus.net/Classical_cgi/file_upload_hnddler</url>
<signed_xml>
...
<url> http://boinc.gorlaeus.net/Classical_cgi/file_upload_handler </url>
<generated_locally/>
...
</signed_xml>
...
</file_info>

Files will not be uploaded because of this wrong URL. I changed the URL throughout the whole XML file and uploaded 130 WUs, which ended in compute errors because of these bugs. Wasted enough precious time and moved the machine to another project. Linux support sucks over here!!! | ||
ID: 2680 | Rating: 0 | |
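The mismatch reported above, where the outer `<url>` and the one inside `<signed_xml>` disagree, can be spotted with a small scan. The sketch below works on a reduced fragment containing only the tags quoted in the post (the elided parts are simply left out), and the helper `distinct_urls` is a hypothetical name, not part of any BOINC tool.

```python
import re

# Captures the URL without the whitespace that may pad it inside the tag.
URL_RE = re.compile(r"<url>\s*(.*?)\s*</url>", re.DOTALL)

# Reduced fragment: only the tags quoted in the post, elided parts omitted.
fragment = """
<file_info>
    <url>http://boinc.gorlaeus.net/Classical_cgi/file_upload_hnddler</url>
    <signed_xml>
        <url> http://boinc.gorlaeus.net/Classical_cgi/file_upload_handler </url>
        <generated_locally/>
    </signed_xml>
</file_info>
"""

def distinct_urls(xml_text):
    """Return the distinct whitespace-trimmed <url> values, in order of appearance."""
    seen = []
    for url in URL_RE.findall(xml_text):
        if url not in seen:
            seen.append(url)
    return seen

urls = distinct_urls(fragment)
if len(urls) > 1:
    print("URLs disagree inside one <file_info>:")
    for url in urls:
        print("  ", url)
```

Because the URLs are trimmed before comparison, a difference that remains is a real character difference (here `hnddler` versus `handler`) and not just padding.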
There is clearly a client-side issue going on; with a different Ubuntu version / client you have a system that runs fine. I did not change anything server-side here => client-side issue. Do you expect this project to also solve all your Linux/BOINC client issues? | ||
ID: 2681 | Rating: 0 | |
Then the BOINC client probably contains code like if (project == "Leiden Classical") { <screw up url> }, because other projects have no issues with their URLs. So I assume you reported this issue to BOINC? | ||
ID: 2682 | Rating: 0 | |
I tried finding the origin of this issue within the BOINC client, and filed a bug for it. Someone has just today proposed a possible reason/solution and I'm going to try the suggested patch in the near future. | ||
ID: 2690 | Rating: 0 | |
Thanx, I think it will be solved now looking at the ticket. | ||
ID: 2692 | Rating: 0 | |
I think there is a problem from both sides. | ||
ID: 2705 | Rating: 0 | |
I created Debian packages with the patches included: | ||
ID: 2706 | Rating: 0 | |
Hi, | ||
ID: 2794 | Rating: 0 | |
Removed the spaces in the URLs for the scheduler... | ||
ID: 2795 | Rating: 0 | |
Thank you | ||
ID: 2796 | Rating: 0 | |
After processing a couple of WUs I got calculation errors for each of them during upload because of a missing result file.

<command_line> classical.in classical.out classical.tddout </command_line>
which should be
<command_line> classical.in classical.out classical.stdout </command_line>

After
<command_line> classical.in classical.out classical.stdout </command_line>
but it should be
<command_line>classical.in classical.out classical.stdout</command_line>

The same file includes other tags that wrap their content in whitespace:
<name> name </name>
<max_nbytes> 123456789 </max_nbytes>
<file_name> name </file_name>
<open_name> name </open_name>

Conclusion
I assume that those wrapping whitespace characters are responsible for the calculation errors, and I kindly ask whether you can go through all of your templates and take them out.

Regards | ||
ID: 2797 | Rating: 0 | |
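The cleanup requested above, removing the whitespace that pads tag contents, could look roughly like this. It is a sketch under assumptions: the enclosing `<result>` element and the helper `strip_padding` are hypothetical, only tags quoted in the post are used, and this is not the project's actual template or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around tags quoted in the post.
template = """<result>
  <command_line> classical.in classical.out classical.stdout </command_line>
  <file_name> name </file_name>
  <max_nbytes> 123456789 </max_nbytes>
</result>"""

def strip_padding(root):
    """Trim leading/trailing whitespace from element text, leaving indentation alone."""
    for elem in root.iter():
        # Whitespace-only text is just indentation between child tags; skip it.
        if elem.text and elem.text.strip():
            elem.text = elem.text.strip()
    return root

root = strip_padding(ET.fromstring(template))
print(ET.tostring(root, encoding="unicode"))
```

Only the padding at both ends is removed; the single spaces separating the three file names inside `<command_line>` are kept.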
Looked into this; alas not trivial to change (all WUs and RUs in database need changes for that to happen)... | ||
ID: 2800 | Rating: 0 | |
I'm back (oh no, not him again)!

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
Unrecognized XML in parse_init_data_file: userid Skipping: 0 Skipping: /userid
Unrecognized XML in parse_init_data_file: teamid Skipping: 0 Skipping: /teamid
Unrecognized XML in parse_init_data_file: hostid Skipping: 94764 Skipping: /hostid
Unrecognized XML in parse_init_data_file: result_name Skipping: wu_164284800_1329893830_14545_0 Skipping: /result_name
Unrecognized XML in parse_init_data_file: starting_elapsed_time Skipping: 0.000000 Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline Skipping: 1330619806.000000 Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active Skipping: 1 Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: suspend_cpu_usage Skipping: 0.000000 Skipping: /suspend_cpu_usage
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct Skipping: 50.000000 Skipping: /max_ncpus_pct
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_limit_mb Skipping: 0.000000 Skipping: /daily_xfer_limit_mb
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_period_days Skipping: 0 Skipping: /daily_xfer_period_days
Unrecognized XML in parse_init_data_file: userid Skipping: 0 Skipping: /userid
Unrecognized XML in parse_init_data_file: teamid Skipping: 0 Skipping: /teamid
Unrecognized XML in parse_init_data_file: hostid Skipping: 94764 Skipping: /hostid
Unrecognized XML in parse_init_data_file: result_name Skipping: wu_164284800_1329893830_14545_0 Skipping: /result_name
Unrecognized XML in parse_init_data_file: starting_elapsed_time Skipping: 1755.504030 Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline Skipping: 1330619806.000000 Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active Skipping: 1 Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: suspend_cpu_usage Skipping: 0.000000 Skipping: /suspend_cpu_usage
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct Skipping: 50.000000 Skipping: /max_ncpus_pct
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_limit_mb Skipping: 0.000000 Skipping: /daily_xfer_limit_mb
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_period_days Skipping: 0 Skipping: /daily_xfer_period_days
freeglut ERROR: Function <glutTimerFunc> called without first calling 'glutInit'.
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>wu_164284800_1329893830_14545_0_1</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>

So besides a bunch of XML errors/warnings, the actual error is an upload failure with error code -161. After editing client_state.xml (as computezrmle described), the upload error is gone (the XML errors/warnings still remain), and there is no calculation error.

The system is running Ubuntu Server 10.10 x64 with BOINC client v6.12.34.

EDIT: Another system running Ubuntu Server 10.04.3 LTS with BOINC client v6.12.34 has the same issue. | ||
ID: 2807 | Rating: 0 | |
When does the hurting stop...?

"Looked into this; alas not trivial to change (all WUs and RUs in database need changes for that to happen)..."

How hard can it be to run some update queries on your database? Oh well, moving rig to other project... | ||
ID: 2874 | Rating: 0 | |
Well, you'd be surprised; all WU and result entries in the database are regenerated from one single original one. That one contains spaces. Attempts to modify it (i.e. removing the spaces) will result in a signature mismatch => download error. Removing the spaces means stopping the workflow (no more automatically generated WUs), which means drying up the project. This affects many (ALL) users. The error we are dealing with only affects 1% or less of the users/volunteers. All in all the xml parsing error is within the client and having spaces there is not invalid xml as such. So... the situation: there is a bug in the client's XML parsing that affects less than 1% of the volunteers, and the assumed workaround (not sure and surely not tested) is stopping a project and all outstanding work for that? See my point? | ||
ID: 2875 | Rating: 0 | |
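The signature-mismatch argument above rests on a general property: any byte-level change to signed content, even removing padding spaces, changes its digest, so a signature made over the original no longer verifies. The sketch below illustrates only that property. SHA-256 from Python's standard library is used as a stand-in, and the thread does not say which signing scheme the project uses, so none is reproduced here; the helper name `digest` is hypothetical.

```python
import hashlib

def digest(text):
    """Hex digest of the UTF-8 bytes of `text` (stand-in for a signed hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Same tag with and without the padding spaces discussed in the thread.
original = "<file_name> name </file_name>"
trimmed = "<file_name>name</file_name>"

print("original:", digest(original))
print("trimmed: ", digest(trimmed))
print("digests equal:", digest(original) == digest(trimmed))  # prints False
```

This is why trimming the spaces in place is not a pure text edit: everything derived from the signed original would have to be regenerated and signed again.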
"All in all the xml parsing error is within the client and having spaces there is not invalid xml as such."

I agree that having spaces is not invalid, and the client not being able to handle spaces is indeed a bug. However, assuming that the client will trim leading/trailing spaces is a wrong assumption; that would also mean spaces put there intentionally get trimmed, which would be a bug in my opinion. And a property called ' name ' is not the same as 'name'. This means that if the client does not trim the values, your XML files do not meet the client specifications (unless the specifications say that spaces will be trimmed).

Looking at LC's active user base, I assumed every additional machine would be appreciated...

Anyway, I have installed a new machine with Ubuntu 12.04 LTS x64 and BOINC client v7.0.28, and everything seems to be running fine. But upgrading the other machines is not an option right now (a newer BOINC version assumes the machine is running the latest Linux distro/kernel with the appropriate libraries, and installing these libraries on older distros is not always supported). | ||
ID: 2878 | Rating: 0 | |