hadoop - Hive/Impala UDF with String input/output


I am looking at some Impala/Hive UDF examples, e.g.:

    public class FuzzyEqualsUdf extends UDF {
        public FuzzyEqualsUdf() {
        }

        public BooleanWritable evaluate(DoubleWritable x, DoubleWritable y) {
            double epsilon = 0.000001f;
            if (x == null || y == null)
                return null;
            return new BooleanWritable(Math.abs(x.get() - y.get()) < epsilon);
        }
    }

Then I tried to create my own UDF, which has a String input and a String output. Ideally, it should look like:

    public class MyUdf extends UDF {
        public MyUdf() {
        }

        public StringWritable evaluate(StringWritable x) {
            String[] y = x.split(",");
            String z = y[0] + "|" + y[1];
            return new StringWritable(z);
        }
    }

However, the problem is that there is no StringWritable class! See:

    import org.apache.hadoop.hive.serde2.io.ByteWritable;
    import org.apache.hadoop.hive.serde2.io.DoubleWritable;
    import org.apache.hadoop.hive.serde2.io.ShortWritable;
    import org.apache.hadoop.hive.serde2.io.TimestampWritable;

How can I make a UDF with String type input/output without a StringWritable class? Thanks!

Edamame

You can use the org.apache.hadoop.io.Text class.

You can refer to one of Hive's built-in functions. I referred to trim, which takes a string and outputs a string:

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/genericudfbasetrim.java
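As a rough illustration of the answer, here is a minimal sketch of the string logic from the question, written in plain Java so it can be run without Hive on the classpath. The class name FirstTwoFields and its join method are hypothetical, and returning null for input with fewer than two fields is an assumed policy, not something the original post specifies. The commented-out wrapper shows how evaluate could take and return org.apache.hadoop.io.Text instead of the nonexistent StringWritable.

```java
// Hypothetical helper (not part of Hive or Impala): the string logic from the question.
final class FirstTwoFields {
    private FirstTwoFields() {
    }

    // Returns "first|second" for comma-separated input. Returns null when the input
    // is null or has fewer than two fields (an assumed policy, chosen to mirror how
    // SQL functions usually propagate NULL).
    static String join(String csv) {
        if (csv == null) {
            return null;
        }
        // Limit -1 keeps trailing empty fields, so "a," yields ["a", ""].
        String[] parts = csv.split(",", -1);
        if (parts.length < 2) {
            return null;
        }
        return parts[0] + "|" + parts[1];
    }

    public static void main(String[] args) {
        System.out.println(join("a,b,c")); // prints a|b
    }
}

// Inside a Hive UDF the wrapper could look roughly like this. It needs hive-exec and
// hadoop-common on the classpath, so it is shown only as a comment:
//
//   import org.apache.hadoop.hive.ql.exec.UDF;
//   import org.apache.hadoop.io.Text;
//
//   public class MyUdf extends UDF {
//       public Text evaluate(Text x) {
//           if (x == null) return null;
//           String joined = FirstTwoFields.join(x.toString());
//           return joined == null ? null : new Text(joined);
//       }
//   }
```

Keeping the string handling in a plain static method means it can be unit tested on its own, while the UDF class only converts between Text and String.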

