Using UDF in Guzzle

Overview

This document describes how to register a UDF and use it in Guzzle.

On Databricks

  1. Create the UDF and register it

Create the UDF by extending the org.apache.hadoop.hive.ql.exec.UDF class.

package com.example.sparkfunctions

import org.apache.hadoop.hive.ql.exec.UDF

// Hive-style UDF: the evaluate method is called with the value of the SQL argument.
class ConvertVNI2UTF extends UDF {
  def evaluate(inputString: String): String = {
    // ... function definition ...
  }
}
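The body of `evaluate` is elided above. As a minimal sketch of its general shape, the object below uses the same signature but contains no real conversion logic: `EvaluateSketch` and `convert` are hypothetical names, and the trimming is only a placeholder for the actual mapping, which this page does not show. It assumes that a SQL NULL reaches the method as a JVM `null`, so the value is passed through rather than dereferenced.

```scala
// Hypothetical stand-in for the elided UDF body; not the real conversion.
object EvaluateSketch {
  // Placeholder transformation (assumption): only trims whitespace.
  private def convert(s: String): String = s.trim

  // Same signature as the UDF's evaluate method.
  // A null input is returned as null instead of being dereferenced.
  def evaluate(inputString: String): String =
    if (inputString == null) null else convert(inputString)
}
```

In the real class, the same null check would sit at the top of `evaluate` in `ConvertVNI2UTF`, which extends `UDF`.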

Build the JAR file and place it in the /guzzle/libs directory.

  2. In Guzzle, reference it in Ingestion (Validate/Transform)

Create a temporary function using pre-SQL:

CREATE TEMPORARY FUNCTION convertVNI2UTF AS 'com.example.sparkfunctions.ConvertVNI2UTF'

After creating the temporary function, reference it from the query, validate, and transform sections:

convertVNI2UTF(<column_name>)

  3. In SQL for Processing

Use the temporary function inside a select query:

select convertVNI2UTF("ÔÕEÕOÙ")
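One way to check the registration outside Guzzle is to run the same two statements from a Scala notebook on the cluster. This is only a sketch. It assumes that the JAR containing `com.example.sparkfunctions.ConvertVNI2UTF` is already on the cluster classpath and that the session has Hive support, which Hive-style UDFs need.

```scala
import org.apache.spark.sql.SparkSession

// On Databricks, getOrCreate() returns the session the notebook already has.
val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// Register the UDF class under a SQL function name for this session only.
spark.sql(
  "CREATE TEMPORARY FUNCTION convertVNI2UTF AS 'com.example.sparkfunctions.ConvertVNI2UTF'"
)

// Apply the function to a literal and print the single-row result untruncated.
val result = spark.sql("SELECT convertVNI2UTF('ÔÕEÕOÙ') AS converted")
result.show(false)
```

If the class cannot be found, the `CREATE TEMPORARY FUNCTION` statement or the first query that uses the function should fail, which points to the JAR not being on the classpath.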