Layout | First group | Second group | Third group
---|---|---|---
Columnar Layout | r1-c1, r2-c1, r3-c1, ... | r1-c2, r2-c2, r3-c2, ... | r1-c3, r2-c3, r3-c3, ...
Row-based Layout | r1-c1, r1-c2, r1-c3, ... | r2-c1, r2-c2, r2-c3, ... | r3-c1, r3-c2, r3-c3, ...
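The difference between the two layouts comes down to loop order. The sketch below is only an illustration, assuming a 3x3 table whose cells are labelled r<row>-c<column>; it does not read or write ORC data.

```shell
#!/bin/sh
# Illustrative sketch: list the cells of a 3x3 table in the order each layout stores them.
# The 3x3 size and the r<row>-c<column> labels are assumptions made for the example.
rows="1 2 3"
cols="1 2 3"

# Columnar layout: the outer loop walks columns, so all cells of one column sit together.
columnar=""
for c in $cols; do
  for r in $rows; do
    columnar="$columnar r$r-c$c"
  done
done
columnar="${columnar# }"   # drop the leading space

# Row-based layout: the outer loop walks rows, so all cells of one row sit together.
row_based=""
for r in $rows; do
  for c in $cols; do
    row_based="$row_based r$r-c$c"
  done
done
row_based="${row_based# }"

echo "columnar:  $columnar"
echo "row-based: $row_based"
```

A query that reads a single column touches one contiguous run of cells in the columnar order, whereas in the row-based order the same cells are spread across every row.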
hadoop jar orc-tools-*-uber.jar [--help] [--define X=Y] <command> <args>
java -jar orc-tools-*-uber.jar [--help] [--define X=Y] <command> <args>
Command | Description
---|---
meta | Print the metadata about the ORC file.
data | Print the data from the ORC file.
scan | Scan the ORC file.
convert | Convert CSV and JSON files to ORC.
json-schema | Scan JSON files to determine their schema.
key | Print information about the keys.
When invoked without parameters or with the "--help" parameter, orc-tools-*-uber.jar prints its help:
hadoop jar orc-tools-*-uber.jar --help
To print the help for a single command, pass "--help" after the command name:
hadoop jar orc-tools-*-uber.jar COMMAND --help
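To see the per-command help for every subcommand in one pass, the invocation can be generated in a loop. This is a dry-run sketch: it only prints the command lines and does not call hadoop. The variable ORC_TOOLS_JAR and the function print_help_cmds are hypothetical names, and the jar version is taken from the examples in this document.

```shell
#!/bin/sh
# Dry run: print the help invocation for each subcommand instead of executing it.
# ORC_TOOLS_JAR is a hypothetical variable; point it at your actual uber jar.
ORC_TOOLS_JAR="orc-tools-1.5.4-uber.jar"

print_help_cmds() {
  # Subcommand names come from the command list in this document.
  for cmd in meta data scan convert json-schema key; do
    echo "hadoop jar $ORC_TOOLS_JAR $cmd --help"
  done
}

print_help_cmds
```

Piping the output to sh (print_help_cmds | sh) would run the commands, assuming hadoop and the jar are available on the machine.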
$ hadoop jar orc-tools-1.5.4-uber.jar meta --help
usage: meta <input>
where <input> is the orc file to print its meta data to standard output.
$ hadoop jar orc-tools-1.5.4-uber.jar meta hdfs://localhost:8020/test1.orc
$ hadoop jar orc-tools-1.5.4-uber.jar data --help
usage: data <input>
where <input> is the orc file to print its data to standard output.
$ hadoop jar orc-tools-1.5.4-uber.jar data hdfs://localhost:8020/test1.orc
$ hadoop jar orc-tools-1.5.4-uber.jar scan --help
usage: scan <input>
where <input> is the orc file to scan and print its info to standard output.
$ hadoop jar orc-tools-1.5.4-uber.jar scan hdfs://localhost:8020/test1.orc
$ hadoop jar orc-tools-1.5.4-uber.jar convert --help
usage: convert <option> <input>
where <input> is the csv/json file to convert to orc file.

Option | Description
---|---
-e,--escape <arg> | CSV escape character.
-h,--help | Provide help.
-H,--header <arg> | CSV header lines.
-n,--null <arg> | CSV null string.
-o,--output <arg> | Output filename.
-q,--quote <arg> | CSV quote character.
-s,--schema <arg> | The schema to write in to the file.
-S,--separator <arg> | CSV separator character.
-t,--timestampformat <arg> | Timestamp Format.
$ hadoop jar orc-tools-1.5.4-uber.jar convert hdfs://localhost:8020/test1.json -o hdfs://localhost:8020/test1.orc
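The CSV options listed above can be combined in a single convert call. The following dry-run sketch assembles such a command without executing it. The input path test1.csv, the two-column schema, and the option values are illustrative assumptions, and it further assumes that -H takes the number of header lines and that -s accepts an ORC type string of the form struct<name:type,...>.

```shell
#!/bin/sh
# Dry run: assemble a CSV-to-ORC convert command from the documented options.
# Paths, schema, and option values below are illustrative assumptions.
ORC_TOOLS_JAR="orc-tools-1.5.4-uber.jar"
INPUT="hdfs://localhost:8020/test1.csv"    # hypothetical CSV input
OUTPUT="hdfs://localhost:8020/test1.orc"
SCHEMA="struct<id:int,name:string>"        # hypothetical two-column schema

# -H 1 : one header line (assumed meaning), -S , : separator character,
# -s   : schema to write, -o : output filename
set -- hadoop jar "$ORC_TOOLS_JAR" convert -H 1 -S , -s "$SCHEMA" -o "$OUTPUT" "$INPUT"

convert_cmd="$*"
echo "$convert_cmd"
# To execute instead of printing, run: "$@"
```

Keeping the arguments in the positional parameters rather than in one string means the schema, which contains < and >, stays a single quoted argument when the command is eventually executed with "$@".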
$ hadoop jar orc-tools-1.5.4-uber.jar json-schema --help
usage: json-schema <option> <input>
where <input> is the json file to scan and print its schema to standard output.

Option | Description
---|---
-f,--flat | Print types as flat list of types.
-h,--help | Provide help.
-p,--pretty | Pretty print the schema.
-t,--table | Print types as Hive table declaration.
$ hadoop jar orc-tools-1.5.4-uber.jar json-schema hdfs://localhost:8020/test1.json