Jaro-Winkler comparison in duckdb #551

Answered by RobinL
ericmanning asked this question in Q&A

Hi Eric,

A few notes.

First, you may find the 2 to 3 converter useful. If you plug in the .json of a trained v2 model, it will attempt to convert it into the corresponding Splink3 Spark code.

Second, on the issue of UDFs in DuckDB.

The general principle is that, in our move to multiple SQL backends, we can't offer blanket support for every function (e.g. Jaro-Winkler). In general, users will have to use the functions available in their chosen backend, possibly registering UDFs where the backend supports them.

This is probably the biggest drawback of DuckDB right now. As you say, the available functions are enumerated under the text similarity heading here.
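To make the UDF point concrete, here is a minimal sketch of what a Jaro-Winkler UDF would need to compute, written in plain Python. This is the textbook definition (prefix scale 0.1, prefix capped at 4 characters), not Splink's or any backend's implementation, and the function names are my own. How (or whether) you can register something like this depends entirely on the backend, so that part is not shown.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0

    # Characters only count as matching if they are within this distance.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order,
    # counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len(s1)
        + matches / len(s2)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler_similarity(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(s1, s2)
    prefix_len = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix_len += 1
    return jaro + prefix_len * prefix_scale * (1 - jaro)


print(round(jaro_winkler_similarity("MARTHA", "MARHTA"), 4))  # 0.9611
```

A pure Python function like this will be much slower than a native SQL function, so it is worth checking what the backend offers natively before going down this route.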

First, there is p…

Answer selected by RobinL