More Hadoop Mysteries – order of initialisation

Hey out there! Still not tired of my Hadoop experiments? Not yet? Here’s another one for you!

What do you think the difference is between these two snippets of code? Say, this:

    SomeCodeWhichChangesConfig.initialise(getConf());
    Job job = new Job(conf, "MyHadoopJob");
    // ... setting the job details

    if (!job.waitForCompletion(true)) {
        System.err.println("FAILED, cannot continue");
    }

… and this:

    Job job = new Job(conf, "MyHadoopJob");
    // ... setting the job details

    SomeCodeWhichChangesConfig.initialise(getConf());
    if (!job.waitForCompletion(true)) {
        System.err.println("FAILED, cannot continue");
    }

No difference, you say? Not quite right, sir: whatever you do to conf after creating the job has no effect on that job. Apparently the Job constructor copies all the data and doesn’t link your Configuration object with its own copy. Brilliant, no?
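To see the effect without a cluster, here is a minimal sketch in plain Java. The Config and Job classes below are hypothetical stand-ins I made up for illustration, not Hadoop's real classes; the only assumption they encode is the behaviour described above, namely that the constructor takes a private copy of the configuration it is handed.

```java
import java.util.HashMap;
import java.util.Map;

public class CopyOnConstructDemo {

    /** Hypothetical stand-in for a configuration object: a mutable key/value bag. */
    static class Config {
        private final Map<String, String> values = new HashMap<>();

        Config() {}

        /** Copy constructor: snapshots the other object's current values. */
        Config(Config other) {
            values.putAll(other.values);
        }

        void set(String key, String value) { values.put(key, value); }

        String get(String key) { return values.get(key); }
    }

    /** Hypothetical stand-in for a job that copies the Config it is given. */
    static class Job {
        private final Config conf;

        Job(Config conf) {
            // Private snapshot, not a shared reference to the caller's object.
            this.conf = new Config(conf);
        }

        Config getConfiguration() { return conf; }
    }

    public static void main(String[] args) {
        Config conf = new Config();
        conf.set("set.before", "yes");      // visible: set before the job exists

        Job job = new Job(conf);

        conf.set("set.after", "yes");       // too late: the job never sees this
        job.getConfiguration().set("set.on.job", "yes"); // goes to the job's own copy

        System.out.println(job.getConfiguration().get("set.before")); // prints "yes"
        System.out.println(job.getConfiguration().get("set.after"));  // prints "null"
        System.out.println(job.getConfiguration().get("set.on.job")); // prints "yes"
    }
}
```

In real Hadoop the equivalent escape hatch is job.getConfiguration(), which hands back the job's own copy, so late changes should probably go there rather than to the conf you passed in.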

(And I spent a couple of hours trying to understand why the distributed cache worked properly in one app and didn’t work at all in another.) So now you know. Be warned.



