| Row-based Layout | r{1}-c{1}, r{1}-c{2}, r{1}-c{3}, ... | r{2}-c{1}, r{2}-c{2}, r{2}-c{3}, ... | r{3}-c{1}, r{3}-c{2}, r{3}-c{3}, ... |
|---|---|---|---|

| Columnar Layout | r{1}-c{1}, r{2}-c{1}, r{3}-c{1}, ... | r{1}-c{2}, r{2}-c{2}, r{3}-c{2}, ... | r{1}-c{3}, r{2}-c{3}, r{3}-c{3}, ... |
|---|---|---|---|
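To make the two orderings concrete, the following sketch walks a hypothetical 3x3 table and prints the cells in the order each layout would store them. The table size and the variable names are assumptions chosen for illustration; only the r{i}-c{j} naming comes from the layouts above.

```shell
#!/bin/sh
# Illustrative only: storage order of a 3x3 table under each layout.
rows="1 2 3"
cols="1 2 3"

# Row-based: all columns of row 1, then all columns of row 2, and so on.
row_layout=""
for r in $rows; do
  for c in $cols; do
    row_layout="$row_layout r$r-c$c"
  done
done

# Columnar: all rows of column 1, then all rows of column 2, and so on.
col_layout=""
for c in $cols; do
  for r in $rows; do
    col_layout="$col_layout r$r-c$c"
  done
done

echo "row-based:$row_layout"
echo "columnar: $col_layout"
```

The only difference between the two loops is which index varies fastest, which is exactly what distinguishes the two layouts.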
hadoop jar avro-tools-*.jar <command> <args>
java -jar avro-tools-*.jar <command> <args>
| Command | Description |
|---|---|
| getschema | Prints out schema of an Avro data file. |
| getmeta | Prints out the metadata of an Avro data file. |
| tojson | Dumps an Avro data file as JSON, record per line or pretty. |
| fromjson | Reads JSON records and writes an Avro data file. |
| canonical | Converts an Avro Schema to its canonical form. |
| cat | Extracts samples from files. |
| compile | Generates Java code for the given schema. |
| concat | Concatenates avro files without re-compressing. |
| fingerprint | Returns the fingerprint for the schemas. |
| fragtojson | Renders a binary-encoded Avro datum as JSON. |
| fromtext | Imports a text file into an avro data file. |
| idl | Generates a JSON schema from an Avro IDL file. |
| idl2schemata | Extract JSON schemata of the types from an Avro IDL file. |
| induce | Induce schema/protocol from Java class/interface via reflection. |
| jsontofrag | Renders a JSON-encoded Avro datum as binary. |
| random | Creates a file with randomly generated instances of a schema. |
| recodec | Alters the codec of a data file. |
| repair | Recovers data from a corrupt Avro Data file. |
| rpcprotocol | Output the protocol of a RPC service. |
| rpcreceive | Opens an RPC Server and listens for one message. |
| rpcsend | Sends a single RPC message. |
| tether | Run a tethered mapreduce job. |
| totext | Converts an Avro data file to a text file. |
| totrevni | Converts an Avro data file to a Trevni file. |
| trevni_meta | Dumps a Trevni file’s metadata as JSON. |
| trevni_random | Create a Trevni file filled with random instances of a schema. |
| trevni_tojson | Dumps a Trevni file as JSON. |
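Since the tools can be started with either `hadoop jar` or `java -jar`, a small wrapper can pick whichever launcher is available. This is only a sketch: `AVRO_TOOLS_JAR`, `avro_tools_launcher`, and `avro_tools` are hypothetical names introduced here, and the jar file name is assumed to be the 1.9.0 build used in the examples.

```shell
#!/bin/sh
# Hypothetical helper names; adjust the jar path to your installation.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools-1.9.0.jar}"

# Print the launcher prefix: "hadoop jar" when hadoop is on PATH, else "java -jar".
avro_tools_launcher() {
  if command -v hadoop >/dev/null 2>&1; then
    echo "hadoop jar"
  else
    echo "java -jar"
  fi
}

# Run an avro-tools command with the selected launcher.
avro_tools() {
  # The unquoted substitution is intentional: it splits into two words.
  $(avro_tools_launcher) "$AVRO_TOOLS_JAR" "$@"
}

echo "launcher: $(avro_tools_launcher) $AVRO_TOOLS_JAR"
```

With the functions defined, `avro_tools getschema hdfs://localhost:8020/test1.avro` would run the same command as the examples in this section.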
$ hadoop jar avro-tools-1.9.0.jar getschema
usage: getschema <input>
where <input> is the avro file to print its schema to standard output.

$ hadoop jar avro-tools-1.9.0.jar getschema hdfs://localhost:8020/test1.avro
{
  "type" : "record",
  "name" : "test1",
  "namespace" : "com.mtitek",
  "doc" : "",
  "fields" : [ {
    "name" : "field1",
    "type" : "long",
    "doc" : ""
  }, {
    "name" : "field2",
    "type" : [ "null", "string" ],
    "doc" : "",
    "default" : null
  } ]
}
$ hadoop jar avro-tools-1.9.0.jar getmeta
usage: getmeta <input> --key [String]
where <input> is the avro file to print its metadata to standard output.

| Option | Description |
|---|---|
| --key [String] | Metadata key. |
$ hadoop jar avro-tools-1.9.0.jar getmeta hdfs://localhost:8020/test1.avro
avro.schema {"type":"record","name":"test1","namespace":"com.mtitek","doc":"","fields":[{"name":"field1","type":"long","doc":""},{"name":"field2","type":["null","string"],"doc":"","default":null}]}
avro.codec snappy

$ hadoop jar avro-tools-1.9.0.jar getmeta hdfs://localhost:8020/test1.avro --key "avro.schema"
{"type":"record","name":"test1","namespace":"com.mtitek","doc":"","fields":[{"name":"field1","type":"long","doc":""},{"name":"field2","type":["null","string"],"doc":"","default":null}]}

$ hadoop jar avro-tools-1.9.0.jar getmeta hdfs://localhost:8020/test1.avro --key "avro.codec"
snappy
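When the full `getmeta` output has already been captured, individual values can be pulled out of it without calling the tool again. The sketch below works on a sample copied from the output above (schema shortened). It assumes each line is a key and a value separated by a tab; the separator is not visible in the listing, so verify it against your own output. `meta_value` is a hypothetical helper name.

```shell
#!/bin/sh
# Sample getmeta output (schema shortened); the tab separator is an assumption.
meta=$(printf 'avro.schema\t{"type":"record","name":"test1"}\navro.codec\tsnappy\n')

# Print the value stored under one metadata key.
meta_value() {
  printf '%s\n' "$meta" | awk -F '\t' -v key="$1" '$1 == key { print $2 }'
}

codec=$(meta_value avro.codec)
echo "codec: $codec"
```

For a single key, passing `--key` to `getmeta` as shown above is simpler; parsing is mainly useful when several keys are needed from one captured output.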
$ hadoop jar avro-tools-1.9.0.jar tojson
usage: tojson [--pretty] [--head[=X]] <input>
Dumps an Avro data file as JSON, record per line or pretty.
where:
<input> is the avro file to convert to json file. A dash ('-') can be given as an input file to use stdin

| Option | Description |
|---|---|
| --head [String] | Converts the first X records (default is 10). |
| --pretty | Turns on pretty printing. |
$ hadoop jar avro-tools-1.9.0.jar tojson hdfs://localhost:8020/test1.avro
{"field1":123,"field2":{"string":"abc"}}
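The `--pretty` and `--head` options from the usage text can be combined to inspect just the first few records of a large file. The following sketch uses the `--head=X` form given in the usage line; the record count of 5 is arbitrary, and the guard around the call is an addition so the script exits cleanly on machines without Hadoop or the jar.

```shell
#!/bin/sh
JAR="avro-tools-1.9.0.jar"
INPUT="hdfs://localhost:8020/test1.avro"

if command -v hadoop >/dev/null 2>&1 && [ -f "$JAR" ]; then
  # Pretty-print only the first 5 records of the file.
  hadoop jar "$JAR" tojson --pretty --head=5 "$INPUT"
  status="ran"
else
  echo "hadoop or $JAR not found; skipping tojson" >&2
  status="skipped"
fi
```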
$ hadoop jar avro-tools-1.9.0.jar fromjson
usage: fromjson [OPTIONS] <input>
where <input> is the json file to convert to avro file.

| Option | Description |
|---|---|
| --schema [String] | Schema. |
| --schema-file [String] | Schema File. |
| --codec <String> | Compression codec (default: null). |
| --level <Integer> | Compression level (only applies to deflate and xz) (default: -1). |
$ hadoop jar avro-tools-1.9.0.jar fromjson hdfs://localhost:8020/test1.json --schema-file hdfs://localhost:8020/test1.schema --codec snappy
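For a local end-to-end run, the inputs to `fromjson` can be created by hand. The sketch below writes a schema file and a JSON file matching the `test1` record shown by `getschema` (the empty `doc` attributes are omitted), then converts them if `java` and the jar are present. The scratch directory, the second sample record, and the local file names are assumptions for illustration. `fromjson` writes the Avro data to standard output, so the result is redirected to a file.

```shell
#!/bin/sh
# Use a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)

# Schema for the test1 record.
cat > "$workdir/test1.schema" <<'EOF'
{
  "type": "record", "name": "test1", "namespace": "com.mtitek",
  "fields": [
    {"name": "field1", "type": "long"},
    {"name": "field2", "type": ["null", "string"], "default": null}
  ]
}
EOF

# One JSON record per line. Non-null union values name their branch; plain types do not.
cat > "$workdir/test1.json" <<'EOF'
{"field1":123,"field2":{"string":"abc"}}
{"field1":456,"field2":null}
EOF

JAR="${AVRO_TOOLS_JAR:-avro-tools-1.9.0.jar}"
if command -v java >/dev/null 2>&1 && [ -f "$JAR" ]; then
  # Redirect standard output to capture the Avro data file.
  java -jar "$JAR" fromjson --schema-file "$workdir/test1.schema" \
    --codec snappy "$workdir/test1.json" > "$workdir/test1.avro"
else
  echo "java or $JAR not found; input files left in $workdir" >&2
fi
```

Running `tojson` on the resulting `test1.avro` should print the two records back, which is a quick way to check the round trip.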