Need help optimizing inner MAX subquery

Hi Everyone,

First accept my appreciation for any time you can take to help. I've banged my head against this one for a bit, and am unsure as to how I can achieve what I require in a non-programmatical way - it would be nice to have one query do all of the work. I've also consulted the mysql forums with this question without luck (although there were many responses):

Consider the following three tables from the schema:

authorization

CREATE TABLE `authorization` (
`username` varchar(40) NOT NULL default '',
`password` varchar(12) NOT NULL default '',
`validated` int(1) NOT NULL default '0',
`customer_id` bigint(20) unsigned NOT NULL default '0',
`rights` int(32) NOT NULL default '0',
`vkey` tinytext NOT NULL,
PRIMARY KEY (`username`),
KEY `customer_id` (`customer_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

topics

CREATE TABLE `topics` (
`id` int(32) unsigned NOT NULL auto_increment,
`category` varchar(12) NOT NULL default '0',
`subject` varchar(64) NOT NULL default '',
`priority` varchar(128) NOT NULL default '',
`initiator` bigint(20) unsigned NOT NULL default '0',
`date` datetime NOT NULL default '0000-00-00 00:00:00',
`status` int(1) unsigned NOT NULL default '1',
`emailed` int(1) NOT NULL default '0',
`extra` text NOT NULL,
`flags` int(32) unsigned NOT NULL default '0',
`public` int(1) unsigned NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `initiator` (`initiator`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Topic Index'

messages

CREATE TABLE `messages` (
`id` int(32) unsigned NOT NULL auto_increment,
`topic_id` int(32) unsigned NOT NULL default '0',
`user` bigint(20) unsigned NOT NULL default '0',
`date` datetime NOT NULL default '0000-00-00 00:00:00',
`message` text NOT NULL,
PRIMARY KEY (`id`),
KEY `topic_id` (`topic_id`),
KEY `date` (`date`),
KEY `user` (`user`),
FULLTEXT KEY `search` (`message`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Message Container'

On these tables I am running the following query, which does the trick:

SELECT topics.*, X.mdate, authorization.username
FROM (topics LEFT JOIN
( SELECT MAX( date ) mdate, topic_id, user
FROM messages
GROUP BY topic_id
) AS X ON topics.id = X.topic_id )
LEFT JOIN authorization ON authorization.customer_id = X.user
WHERE topics.initiator='3433' AND topics.status != '4'
GROUP BY id ORDER BY topics.category, X.mdate;

EXPLAINing this query however, shows the following understandable results, which initially not a problem, are really starting to affect performance since the size of the message table is quickly growing:

"id","select_type","table","type","possible_keys", "key","key_len","ref","rows","Extra"
1,"PRIMARY","topics","ref","initiator","initiator" ,8,"const",26,"Using where; Using temporary; Using filesort"
1,"PRIMARY","<derived2>","ALL","","",(null),"",416 6,""
1,"PRIMARY","authorization","ref","customer_id","c ustomer_id",8,"X.user",1,""
2,"DERIVED","messages","ALL","","",(null),"",23685 ,"Using temporary; Using filesort"

The inner SELECT MAX( date ) query is scanning all of the 23685 rows in the message table to find the actual max. When in fact, it should in the best case, only scan the rows that belong to the topic that matches the topic being scanned in the outer query.

_________________________________________________________
SELECT COUNT(*) AS total
FROM messages LEFT JOIN topics ON messages.topic_id = topics.id
WHERE topics.initiator='3433' AND topics.status !='4'
_________________________________________________________

Yields that there are 341 message rows to match the topics with initiator 3433 and status != 4.

The question then, is how to formulate the query such that it only scans appropriate rows for the Max subquery?

Help most appreciated!

Cordially.
Alex
[4344 byte] By [Saeven] at [2007-11-19 19:55:28]
# 1 Re: Need help optimizing inner MAX subquery
The inner SELECT MAX( date ) query is scanning all of the 23685 rows in the message table to find the actual max. When in fact, it should in the best case, only scan the rows that belong to the topic that matches the topic being scanned in the outer query.Well, the inner query with the aggregate MAX does not know what conditions are being applied outside it.. So it has to make a complete scan based on what filters are currently there for it and it only...

Try to add those conditionals to the inner sub-query as well - you can even choose the same joins for this sub-query that you have on the outside.. and see if it makes sense and helps. I would not be able to help with the syntax at the moment, though. Good luck. :thumb: Regards...
exterminator at 2007-11-9 13:43:23 >