hadoop - Hive/Impala UDF with String input/output
I was looking at Impala/Hive UDF examples, e.g.:
    public class FuzzyEqualsUdf extends UDF {
        public FuzzyEqualsUdf() { }

        public BooleanWritable evaluate(DoubleWritable x, DoubleWritable y) {
            double epsilon = 0.000001f;
            if (x == null || y == null) return null;
            return new BooleanWritable(Math.abs(x.get() - y.get()) < epsilon);
        }
    }
Then I tried to create my own UDF, which has string input and string output. Ideally, it should look like:
    public class MyUdf extends UDF {
        public MyUdf() { }

        public StringWritable evaluate(StringWritable x) {
            String[] y = x.split(",");
            String z = y[0] + "|" + y[1];
            return new StringWritable(z);
        }
    }
However, the problem is that there is no StringWritable class! See:
    import org.apache.hadoop.hive.serde2.io.ByteWritable;
    import org.apache.hadoop.hive.serde2.io.DoubleWritable;
    import org.apache.hadoop.hive.serde2.io.ShortWritable;
    import org.apache.hadoop.hive.serde2.io.TimestampWritable;
How do I make a UDF with string-type input/output without a StringWritable class? Thanks!
edamame, you can use the org.apache.hadoop.io.Text class.
You can refer to one of Hive's built-in functions. I referred to trim, which takes a string and outputs a string.
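For illustration, here is a minimal sketch of how the UDF from the question might look with Text in place of the nonexistent StringWritable. It assumes hive-exec and hadoop-common are on the classpath. The null checks and the choice to return NULL when the input has fewer than two comma-separated fields are my own assumptions, not something from the question.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Sketch of the asker's MyUdf: joins the first two comma-separated
// fields of the input with "|", using Text for string input/output.
public class MyUdf extends UDF {
    public MyUdf() { }

    public Text evaluate(Text x) {
        // NULL in, NULL out (same convention as the FuzzyEqualsUdf example).
        if (x == null) return null;

        // Text is not a String; convert before using String methods.
        String[] parts = x.toString().split(",");

        // Assumption: inputs with fewer than two fields yield NULL
        // instead of throwing ArrayIndexOutOfBoundsException.
        if (parts.length < 2) return null;

        return new Text(parts[0] + "|" + parts[1]);
    }
}
```

The key difference from the original attempt is the x.toString() call: Text wraps UTF-8 bytes and has no split method of its own, so the value is converted to a String, processed, and wrapped in a new Text for the return value.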