ThinkBigAnalytics/hive-json-serde

hive-json-serde README
======================

Code changes Copyright (C) 2010-2014 Think Big Analytics, Inc. All Rights Reserved.

Branched from the Google Code project as it stood on July 5, 2011.  Changes made to the Google Code project after that date are not reflected here.  The main goals of this branch were as follows:

1. Incorporate JSON Path as a standardized way of accessing JSON values.
2. Let users specify exactly which JSON elements they want to import into Hive.  In particular, we added a way of mapping column names to JSON Path expressions through SerDe properties (similar to the way that Hive and HBase currently interact).  For example:

   CREATE EXTERNAL TABLE IF NOT EXISTS response_log_example (
      request_id string
   )
   ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.JsonSerde"
   WITH SERDEPROPERTIES (
      "request_id"="$.search_result.requestId"
   );

3. Handle nested JSON structures better via JSON Path.  The code still makes no attempt to handle arrays or maps, and it explicitly rejects any JSON Path expression that is not "definite."
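
To make the column mapping and the "definite" restriction concrete, here is a minimal Python sketch. It is not the SerDe's Java code, and it rests on stated assumptions: a "definite" path is taken to mean `$` followed only by dotted property names (no wildcards, recursive descent, filters, or array indexing), and a missing element is assumed to yield an empty value. The helper names `resolve_definite_path` and `extract_row` and the sample record are hypothetical.

```python
import json
import re

# Assumption: a "definite" path is "$" followed by one or more ".name" steps,
# so it can address at most one value in the record.
_DEFINITE_PATH = re.compile(r"\$(\.[A-Za-z_][A-Za-z0-9_]*)+")


def resolve_definite_path(record, path):
    """Return the single value addressed by path, or None if a step is missing."""
    if not _DEFINITE_PATH.fullmatch(path):
        raise ValueError(f"not a definite JSON Path: {path!r}")
    value = record
    # Drop the leading "$." and walk one property name at a time.
    for key in path[2:].split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # assumption: a missing element becomes an empty column
        value = value[key]
    return value


def extract_row(json_line, column_paths):
    """Map each column name to the value found at its JSON Path."""
    record = json.loads(json_line)
    return {col: resolve_definite_path(record, p) for col, p in column_paths.items()}


# Hypothetical input line shaped to match the table definition in item 2.
line = '{"search_result": {"requestId": "abc-123", "hits": 7}}'
row = extract_row(line, {"request_id": "$.search_result.requestId"})
print(row)  # {'request_id': 'abc-123'}
```

Under these assumptions, a path such as `$..requestId` or `$.search_result.items[*]` raises `ValueError` because it could match more than one value, which mirrors the restriction described in item 3.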
