Hadoop’s “DistributedFileSystem vs DistributedCache” mystery
try {
    FileSystem dfs = DistributedFileSystem.get(hadoopJobConfiguration);
    final FileStatus[] sts = dfs.listStatus(new Path(this.hdfsDirectory));
    for (FileStatus s : sts) {
        if (s.getPath().toString().endsWith(".jar")) {
            log.info("Jar found: " + s.getPath().toString());
            DistributedCache.addFileToClassPath(new Path(s.getPath().toUri().getPath()), hadoopJobConfiguration);
        }
    }
} catch (IOException [...]
Recently I wrote a Hadoop article in Russian for one of the very popular Russian IT blogs. On second thought, I translated the article (or rather its first part, as the second is still in progress) into English and uploaded it to my website (Posterous format isn’t very good for [...]
More Hadoop Mysteries – order of initialisation
Hey out there! Still not tired of my Hadoop experiments? Not yet? That’s another one for you!
What do you think the difference is between these two snippets of code? Say, this:
SomeCodeWhichChangesConfig.initialise(getConf());
Job job = new Job(conf, "MyHadoopJob");
// … setting the job details
if (!job.waitForCompletion(true)) {
    System.err.println("FAILED, cannot continue");
}
… and [...]
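The excerpt stops before the second snippet, so the following is only a plain-Java sketch of one way call order can matter. It assumes the job object takes its own copy of the configuration when it is constructed, which is what the Javadoc for Hadoop's Job.getInstance(Configuration) describes. ToyConf, ToyJob and the key my.setting are hypothetical stand-ins, not Hadoop types, and this may not be the effect the post goes on to explain.

```java
import java.util.HashMap;
import java.util.Map;

public class InitOrderSketch {

    /** Hypothetical stand-in for a configuration object; not a Hadoop class. */
    static class ToyConf {
        private final Map<String, String> values = new HashMap<>();

        ToyConf() {}

        /** Copy constructor: the new object gets its own map. */
        ToyConf(ToyConf other) {
            values.putAll(other.values);
        }

        void set(String key, String value) {
            values.put(key, value);
        }

        String get(String key) {
            return values.get(key);
        }
    }

    /** Hypothetical stand-in for a job that snapshots its configuration (assumption). */
    static class ToyJob {
        private final ToyConf conf;

        ToyJob(ToyConf conf) {
            this.conf = new ToyConf(conf); // defensive copy at construction time
        }

        ToyConf getConfiguration() {
            return conf;
        }
    }

    /** Change the configuration first, then create the job: the job sees the change. */
    public static String configureThenCreate() {
        ToyConf conf = new ToyConf();
        conf.set("my.setting", "on");
        ToyJob job = new ToyJob(conf);
        return job.getConfiguration().get("my.setting");
    }

    /** Create the job first, then change the original configuration: the job's copy is unaffected. */
    public static String createThenConfigure() {
        ToyConf conf = new ToyConf();
        ToyJob job = new ToyJob(conf);
        conf.set("my.setting", "on");
        return job.getConfiguration().get("my.setting");
    }

    public static void main(String[] args) {
        System.out.println(configureThenCreate()); // prints "on"
        System.out.println(createThenConfigure()); // prints "null"
    }
}
```

Under that assumption, anything that has to reach the job must either be set before the job is created or be written to the job's own configuration object.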
XML input and Hadoop – custom InputFormat
Today I finally hit the task I had been dreading for so long: processing large XML files on Hadoop. I won’t tell you how long I crawled the Internet trying to find a working solution (not that anyone wants to know). Eventually, I came up with a solution of my own [...]
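The excerpt does not show the author's solution. One common approach, sketched here in plain Java with no Hadoop dependency, is to scan the input for a start tag and take everything up to the matching end tag as one record. The class name XmlRecordScanner, the tag name and the sample document are assumptions for illustration. The sketch ignores start tags with attributes and nested elements of the same name, and a real InputFormat would also have to deal with records that cross split boundaries.

```java
import java.util.ArrayList;
import java.util.List;

public class XmlRecordScanner {

    /**
     * Returns every complete <tag>...</tag> block found in the text, in order.
     * A trailing block with no closing tag is dropped.
     */
    public static List<String> extractRecords(String xml, String tag) {
        String start = "<" + tag + ">";
        String end = "</" + tag + ">";
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int s = xml.indexOf(start, pos);
            if (s < 0) {
                break; // no more start tags
            }
            int e = xml.indexOf(end, s + start.length());
            if (e < 0) {
                break; // record is incomplete, stop here
            }
            records.add(xml.substring(s, e + end.length()));
            pos = e + end.length();
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the last record is deliberately cut off.
        String sample = "<docs><doc>a</doc><doc>b</doc><doc>c";
        for (String record : extractRecords(sample, "doc")) {
            System.out.println(record);
        }
    }
}
```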
Debugging Hadoop applications using your Eclipse
Well, it can be annoying, awfully annoying in fact, to debug Hadoop applications. But sometimes you need to, because logging doesn’t show anything and you’ve tried everything but still cannot get under Hadoop’s cover. In this case, follow a few simple steps.
1. Download and unpack Hadoop to your [...]
