[storm] Subselect query

Aurélien Bompard aurelien at bompard.org
Fri Jun 14 11:33:10 UTC 2013


> This SQL query works:
> SELECT DISTINCT sender_name, sender_email, (SELECT count(*) FROM
> email e2 WHERE e2.sender_email = e1.sender_email) AS number FROM
> email e1 ORDER BY number DESC;

OK, the subselect isn't necessary: I've replaced it with a GROUP BY,
which brings me closer to the solution, but I'm not quite there yet.
Here's my new query:

SELECT sender_name, sender_email, COUNT(sender_email) AS number
FROM email GROUP BY sender_email ORDER BY number DESC;
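
As a quick sanity check that the two queries are equivalent, here is a
minimal sketch using Python's sqlite3 module instead of Storm. The
table layout and the sample rows are made up for illustration; only the
two SQL statements come from this thread.

```python
import sqlite3

# In-memory database with a hypothetical, minimal "email" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (sender_name TEXT, sender_email TEXT)")
conn.executemany(
    "INSERT INTO email VALUES (?, ?)",
    [
        ("Alice", "alice@example.com"),
        ("Alice", "alice@example.com"),
        ("Alice", "alice@example.com"),
        ("Bob", "bob@example.com"),
    ],
)

# Original query with the correlated subselect.
subselect_rows = conn.execute(
    "SELECT DISTINCT sender_name, sender_email, "
    "(SELECT count(*) FROM email e2 "
    " WHERE e2.sender_email = e1.sender_email) AS number "
    "FROM email e1 ORDER BY number DESC"
).fetchall()

# Rewritten query using GROUP BY.
group_by_rows = conn.execute(
    "SELECT sender_name, sender_email, COUNT(sender_email) AS number "
    "FROM email GROUP BY sender_email ORDER BY number DESC"
).fetchall()

print(subselect_rows)
print(group_by_rows)
```

On this sample data both queries return the same rows. Note that
sender_name is selected without being listed in the GROUP BY clause;
SQLite accepts that, but stricter databases such as PostgreSQL reject
it unless sender_name is added to the GROUP BY.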

In Storm, I can execute the GROUP BY query as:

store.find((Email, Count(Email.sender_email))).group_by(Email.sender_email)

This returns (Email instance, int) tuples, which is nice. However, I
used to run the query without the count, and returned only the
sender_name and sender_email columns by adding this:

.values(Email.sender_name, Email.sender_email)

Is there a way to return only (sender_name, sender_email, count)
tuples? Just adding the count to the values() call raises an
AttributeError: 'Count' object has no attribute 'variable_factory'.

Thanks!

Aurélien
-- 
http://aurelien.bompard.org    ~~~~~~    xmpp:aurelien at bompard.org
"Life is what happens to you while you're busy making other plans."
 -- John Lennon
