Text blobs are a pain, but they are commonly used with data from email platforms, Salesforce, or dynamic UTMs. In these cases, the JOIN key is probably hidden in a blob of text.
For example, say you want to join your users table using the
user_id from the Salesforce notes column. But the notes data looks like this…
Ahmed (user_id: 1f65e898-0f67-4d34-8ac3-ae223bb71a79) called us today
First Attempt: Regex Parsing
Initially, you can try to use regex to parse the user_id out of the value column. This is always annoying but can get close.
If it works, then a simple subquery and join should do the trick. However, like any free text column, the regex may not work perfectly.
Regex can be risky when the structure of the data changes:
Ahmed (user: 1f65e898-0f67-4d34-8ac3-ae223bb71a79)
(references “user” not “user_id”)
When regex doesn’t work, use this hack…
Solution: Use the "ilike" SQL hack
ilike to JOIN data by creating a dynamic key:
SELECT ... FROM some_table a. -- structured data JOIN other_table b -- data with the text blob ON ( a.column ilike '%' || b.other_column || '%')
This can be a handy trick if you’re in a pinch, but it can also be slow, so use it with caution. And watch out for unwanted duplication in your JOIN if the “ilike” matches more than once!