Tue 18th Aug 2015 at 12:30

Mysql: Finding the intersection of two SET fields (or delimited fields)

I've been working on a project that uses delimented field values rather than creating seperate join tables, as you would traditionally. Originally we were using seperate tables but most of this applications queries where utilising the GROUP_CONCAT function, so we decided to bypass this call and just store the values in text fields comma seperated. This allowed us to utilise the FIND_IN_SET funtion and has been working well for over a year.

One of the draw backs of the FIND_IN_SET function is that you can only use it to cross reference a single value e.g. FIND_IN_SET('orange', fruits). FIND_IN_SET will not let you find all lemons and oranges by calling FIND_IN_SET('lemons,orange', fruits), thus we we have been using a code solution to get around this:

<?php
$implode_type = ' OR ';
// $implode_type = ' AND ';
$search = array('lemons','oranges');
$SQL= 'SELECT a.* FROM a WHERE (';
$where_clauses = array();
forech( $search as $this_fruit ) {
    $where_clauses[] = " FIND_IN_SET({$this_fruit}, a.fruit) ";
}

$SQL .= implode($implode_type, $where_clauses);
$SQL .= ')';

$dbRes = mysql_query($SQL);
?>

 

This is has been working for us for over a year now. However, we have had a recent development request that involves cross referencing two tables on these types of fields. Up until now, we have been luck that all joins have been one-to-many relationships e.g.

SELECT a.* FROM a
INNER JOIN b ON FIND_IN_SET(b.fruit, a.fruit);

Where,

a.fruit = 'apples,pears,oranges,lemons';
b.fruit = 'oranges';

 

However, now we have:

a.fruit = 'apples,pears,oranges,lemons';
b.fruit = 'lemons,oranges';

For this we can not use code as that would require loading all records into memory and we really don't wantto do that. So, I hit my favourite search engine to see how the rest of the world would solve this. The solution came from here http://stackoverflow.com/questions/11670216/how-to-get-the-intersection-of-two-columns and while I'm not a great fan of store procedures as I like to keep my code in one place and my data in another, it appears that until MySQL implement a function, then this will do just niucely/

CREATE FUNCTION fn_intersect_string(arg_str1 VARCHAR(255), arg_str2 VARCHAR(255))
RETURNS VARCHAR(255)
BEGIN
    SET arg_str1 = CONCAT(arg_str1, ",");
    SET @var_result = "";

    WHILE(INSTR(arg_str1, ",") > 0)
    DO
        SET @var_val = SUBSTRING_INDEX(arg_str1, ",", 1);
        SET arg_str1 = SUBSTRING(arg_str1, INSTR(arg_str1, ",") + 1);

        IF(FIND_IN_SET(@var_val, arg_str2) > 0)
        THEN
            SET @var_result = CONCAT(@var_result, @var_val, ",");
        END IF;
    END WHILE;

    RETURN TRIM(BOTH "," FROM @var_result);
END;

The allows us now to do the following:

SELECT a.*, b.* FROM a
INNER JOIN b ON fn_intersect_string(a.fruit, b.fruit) != ''

This site uses cookies, please read my cookie policy.