In high-speed switches the input queued (IQ) architecture is very popular due to its low memory-bandwidth requirement compared to the output queued (OQ) switch architecture, which is extremely desirable in terms of performance but requires very high memory-bandwidth. In the past decade researchers and industry people have been trying hard to find good scheduling algorithm for IQ switches. The two main performance criteria for a scheduling algorithm are: (i) throughput, and (ii) delay. There has been a lot of research done to find throughput of scheduling algorithms, but a little has been known about delay performance of algorithms. This paper mainly studies the delay properties of a class of scheduling algorithms known as maximal matching algorithms. It has been known that maximum weight matching (MWM) scheduling algorithm provides the maximum possible throughput, also denoted as 100% throughput [Tassiulas, L., et al., Dec. 1993], [McKneown, N., et al., March 1996], [Dai, J., et al., March 2000]. The delay bounds for MWM algorithm, and a suite of approximations of MWM algorithm, are known under Bernoulli i.i.d. traffic. Unfortunately there are two problems: (i) MWM and its approximations are not implementable, and (ii) delay bounds are very weak compared to the known theoretical lower bounds that can be obtained in terms of performance of an OQ switch. On the other end of spectrum lies simple maximal matching algorithm like iSLIP [McKeown, N., April 1999], which is implemented in commercially available routers. In [Dai, J., et al., March 2000] it was shown that all maximal matching scheduling algorithms are stable at speedup of 2. But nothing is known about their delay performance. In this paper, we obtain bounds on all maximal matching scheduling algorithm running at speedup 2 when traffic is Bernoulli i.i.d. Interestingly, these bounds match the theoretical lower bound very closely and much better than the bounds on MWM. In particular, we show that any CIOQ switch running at speedup 2 with maximal matching schedule as at most 5 times longer queue-sizes on average compared to the OQ switch under Bernolli i.i.d. traffic. This suggests that under assumption of traffic being independent enough, no switch can do better than a simple maximal matching algorithm running at speedup 2. This provides the first theoretical support to “iSLIP can provide QoS”. We would like to note that any IQ switch architecture that needs to sup- port OQ switch must have speedup 2 as shown in [Chuang, S.-T., et al., 1999], [Prabhakar, P., et al., 1999]. The algorithms proposed in [Chuang, S.-T., et al., 1999], [Prabhakar, P., et al., 1999] are very complex compared to algorithms like !SLIP.