» Aggregate join results with MySQL using GROUP_CONCAT
An ever recurring task with relational databases is to select data from a table and a list of values coming from another table (an 1-n navigation on an n-m relationship).
Say you have a table
person
and a table role
, defined as follows:CREATE TABLE person ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(255) NOT NULL ); CREATE TABLE role ( id INTEGER NOT NULL PRIMARY KEY, role VARCHAR(255) NOT NULL );For the sake of simplicity, I'll omit real-world fields in the
person
table, as well as UNIQUE
constraints.
There, you'd typically want to have a relationship table to bind person
rows to role
rows, as you can't directly model n-m relationships. Let's name it person_role
, aggregating 1-n relationships to person
and role
:CREATE TABLE person_role ( person_id INTEGER NOT NULL, role_id INTEGER NOT NULL, PRIMARY KEY (person_id, role_id), FOREIGN KEY (person_id) REFERENCES person.id, FOREIGN KEY (role_id) REFERENCES role.id );Let's insert some data for the same of the example:
INSERT INTO role VALUES(1, 'admin'); INSERT INTO role VALUES(2, 'developer'); INSERT INTO role VALUES(3, 'user'); INSERT INTO person VALUES(1, 'John Doe'); INSERT INTO person VALUES(2, 'Scott Tiger'); INSERT INTO person_role VALUES(1, 1); INSERT INTO person_role VALUES(1, 2); INSERT INTO person_role VALUES(1, 3);The content of the tables is now:
mysql> select * from person; +----+-------------+ | id | name | +----+-------------+ | 1 | John Doe | | 2 | Scott Tiger | +----+-------------+ mysql> select * from person_role; +-----------+---------+ | person_id | role_id | +-----------+---------+ | 1 | 1 | | 1 | 2 | | 1 | 3 | +-----------+---------+ mysql> select * from role; +----+-----------+ | id | role | +----+-----------+ | 1 | admin | | 2 | developer | | 3 | user | +----+-----------+What we want to do now is to retrieve every
person
with its associated role
s.
You basically have two options:
1) Only retrieve person
rows and complete them with data from role
one by one (needs to be done in the invoking programming language, cannot be done with SQL), here using some sort of python-ish pseudocode:persons = SELECT id, name FROM person ORDER BY id; for p in persons: p.roles = [ SELECT role FROM person_role INNER JOIN role ON person_role.role_id=role.id WHERE person_id=p.id ]2) Select everything at once, and reduce duplicates in your invoking code:
SELECT person.id AS id, name, role FROM person LEFT OUTER JOIN person_role ON person.id=person_role.person_id LEFT OUTER JOIN role ON person_role.role_id=role.id ORDER BY person.id;That'll give you lots of duplicates and you'll have to sort it out yourself in your Java/PHP/Python/Ruby/C++/C/C# code. The result from the query above would look something like this:
+----+-------------+-----------+ | id | name | role | +----+-------------+-----------+ | 1 | John Doe | admin | | 1 | John Doe | developer | | 1 | John Doe | user | | 2 | Scott Tiger | NULL | +----+-------------+-----------+Note that you need to use
LEFT OUTER JOIN
to also get results from person
rows that don't have any roles assigned through person_role
.
Neither of these solutions are all that satisfactory.
Well, with MySQL's GROUP_CONCAT function, there's a third option:SELECT person.id AS id, name, GROUP_CONCAT(role.role SEPARATOR ',') AS roles FROM person LEFT OUTER JOIN person_role ON person.id=person_role.person_id LEFT OUTER JOIN role ON person_role.role_id=role.id GROUP BY id ORDER BY id;The result of the query above is:
+----+-------------+----------------------+ | id | name | roles | +----+-------------+----------------------+ | 1 | John Doe | admin,developer,user | | 2 | Scott Tiger | NULL | +----+-------------+----------------------+Neat
;)
When you use EXPLAIN
to compare the query optimizer strategy for option 2 and 3, it does exactly the same operations. It's just that you can reduce it into a single statement, and avoid having to reduce the result rows into single objects (you just need to split the roles
result column using the separator you specified in the query).
Here's what EXPLAIN
on solution 2 and 3 gives us:+----+-------------+-------------+--------+---------------+---------+---------+--------------------------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+--------+---------------+---------+---------+--------------------------+------+----------------+ | 1 | SIMPLE | person | ALL | NULL | NULL | NULL | NULL | 2 | Using filesort | | 1 | SIMPLE | person_role | ref | PRIMARY | PRIMARY | 4 | play.person.id | 1 | Using index | | 1 | SIMPLE | role | eq_ref | PRIMARY | PRIMARY | 4 | play.person_role.role_id | 1 | | +----+-------------+-------------+--------+---------------+---------+---------+--------------------------+------+----------------+
Labels: mysql
1 Comments:
YAHOO! Thank you thank you thank you! I was getting pretty frustrated with this one. I got so close to your solution, I had almost everything, but it kept adding it all to one record. After hours of googling, I tried googling
mysql join with group_Concat
And you showed up. 15 seconds later I had added GROUP BY id and it was working fine.
Again, thank you so much for this post.
--
techdude
Post a Comment
<< Home