To get the full performance out of Analysis Services for distinct count, insure that a distinct count measure is in its own cube and you can perform custom aggregations for the dimensions within the cube. But the most powerful way to improve performance is to use a partitioning strategy.
The idea is to create partitions where the distinct IDs are spread out through all of the partitions (i.e. partition by dimension(s) that have the IDs repeated in most/all partitions); which is slightly counter-intuitive. At query time, this design forces AS2k to utilize multiple threads to calculate distinct count instead of just one thread; i.e. distributed query calculations in parallel. From real-world scenarios, there was a 4-10 time improvement w/ this design.
Note, you willneed to be careful with your partitioning strategy – you can actually run the risk where you peg the processors if there are too many threads running; hence the importance of multi-proc OLAP machines.