Wednesday, March 07, 2007

assumptions: part 2

continuing from the last post, but on an entirely different subject (with assumptions still part of the picture).

yesterday afternoon, i got a call from the technical manager. in weeks past, i was supposed to set up a little laboratory rig for him: a mini render farm for his analysis and other stuff. however, with that overflow of edit/composite work, it had gone on my back burner. the call brought the initiative back to the fore.

now, i had already (in fits and starts) managed to set up the bare bones of the mini-farm: his machine had the software installed, configured - though not working for reasons i didn't want to delve into all that deeply in my extreme busyness; the two slave units were semi-configured, one of them refusing to allow root access (odd, that; but it's os x).

in the course of the conversation, i mentioned that i had used the ip address he emailed me (xx.xx.xx.66) in the operating systems' hosts files, which the render management software relied on to make reliable connections between machines. to my surprise, he mentioned that his ip actually ended in 62. what the?! i made it plain that changing ip addresses once they had been set up was quite involved; at least six machines would need to be re-configured if he insisted on using 62 instead of the 66 he had last specified. the mismatch may also have been one reason the render management software refused to consider his machine a valid client.

...so he changed it back.
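
in hindsight, a few lines of script could have caught the 62-versus-66 mix-up before anyone climbed any stairs. what follows is only a rough sketch, not what we actually ran: the hostnames (manager-box, slave-1) are made up, and the 192.0.2.x addresses are documentation placeholders standing in for the real ones. it parses hosts-file text and compares each entry against the address a machine says it has.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its ip address."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            entries[name] = ip
    return entries


def find_mismatches(hosts_text, reported):
    """Return {hostname: (hosts_ip, reported_ip)} wherever the two disagree.

    `reported` maps hostname -> the ip the machine's owner says it uses.
    A hostname missing from the hosts text shows up with hosts_ip = None.
    """
    entries = parse_hosts(hosts_text)
    return {
        name: (entries.get(name), ip)
        for name, ip in reported.items()
        if entries.get(name) != ip
    }


if __name__ == "__main__":
    # placeholder data only: hypothetical hostnames, documentation addresses
    hosts_text = "192.0.2.66 manager-box\n192.0.2.71 slave-1  # render slave\n"
    reported = {"manager-box": "192.0.2.62", "slave-1": "192.0.2.71"}
    for name, (in_hosts, in_reality) in find_mismatches(hosts_text, reported).items():
        print(f"{name}: hosts file says {in_hosts}, machine says {in_reality}")
```

the point is less the script than the habit: compare what was written down against what the machine reports, rather than trusting the email.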

upstairs i went to his office, and did the rest of the configuration there. still not recognized. as a last resort, i had him force his network card to full gigabit from its default auto setting. this seems to have made a difference: now his machine validated.

on to the slave units. the unit we tried wouldn't verify either. i asked him what networking switch he had attached the units to. turned out that it was 100base-T. okaay. told him that somehow the render management software had problems with connections that mixed gigabit and fast ethernet, and that he would need gigabit throughout for this to work. reconnect, he did therefore.

once that was done, that machine validated. okay, now we have a minimum farm. one master, one slave.

okay, run render test. fail.

troubleshoot. this is a fresh os x install, yes? he affirmed. i checked the mounts. no servers that we used at present were mounted. riiiight. okay, go to weird utility for the purpose. it declined to accept either admin or root and the password that had allowed the login in the first place. wtf?

okay, use his alternate login, it accepted the password, authenticated with the weird utility, and i mounted the servers. reboot. login. check mounts? present.

run render test. fail.

i looked at the error log. the render component was not present. i turned to him and asked: "i thought this was a fresh install?" it was. turned out that he had cloned an older system disk, one without the latest version of the render component we used - which also explained why the new mounts were not present. arrrrgh. installers? he didn't have one. downstairs i went, got mine, upstairs, installed it, left the cds with him.
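
both failures (no mounts, no render component) came down to things i assumed were on the disk and weren't. a pre-flight check along these lines would have said so before the first render test. again, just a sketch under assumptions: the function name `preflight` and any paths you feed it are hypothetical, since the real mount points and component locations depend on the setup.

```python
import os


def preflight(mount_points, component_paths):
    """Return a list of problems found; an empty list means nothing is missing.

    mount_points:    directories that should be mounted volumes (hypothetical).
    component_paths: files or folders the render software needs (hypothetical).
    """
    problems = []
    for mount in mount_points:
        # ismount is False for plain directories and for paths that don't exist
        if not os.path.ismount(mount):
            problems.append(f"not mounted: {mount}")
    for path in component_paths:
        if not os.path.exists(path):
            problems.append(f"missing component: {path}")
    return problems
```

run it on each machine with the mounts and components the farm expects, and only start the render test when the list comes back empty.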

run render test. work.

downstairs i went, thus ended the day.

moral of the story? in his case, never assume.
