[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*To*: tk@research.att.com*Subject*: union performance bug in symbolics CL*From*: barmar@Think.COM (Barry Margolin)*Date*: Fri, 16 Aug 1991 18:33:00 -0400*Cc*: customer-reports@symbolics.com, slug@ai.sri.com, lisp@ai.research.att.com*In-reply-to*: <19910816195023.4.TK@PETZL.research.att.com>

Date: Fri, 16 Aug 1991 15:50 EDT From: tk@AI.research.att.com (Thomas Kirk) the implementation of union in symbolics CL has a nasty performance bug. i haven't analyzed the algorithm, but it appears that the number of pairwise comparisons done in this implementation of union is quadratic in terms of the lengths of the arguments; the length of the second argument is in the square term. It has a worst-case behavior of L2(L1+(L2-1)/2) comparisons, where L1 and L2 are the lengths of the first and second arguments. Worst case is when the two sets are disjoint, the first argument is non-null, and the second argument contains no duplicates. I just looked at the code and see why this happens. For each element of the second argument it compares it with all the elements of the result, and adds it to the result if the comparison fails. Note that it is comparing with the result, not the original first argument. So, as the result grows, the remaining elements of the second argument are compared to more elements; it's essentially performing a REMOVE-DUPLICATES operation on the second argument, which is O(L2**2). ouch. this can be done with N * M comparisons. Changing it to use an N*M algorithm is pretty easy. One difference is that duplicates in both arguments would be duplicated in the result; currently, only duplicates in the first argument are duplicated in the result. Both behaviors are permitted by the specification, and it's not clear that it's worth paying an extra factor of L2 for this partial duplicate suppression; a comment in the code even mentions that the use of the first argument to initialize the result is done to take advantage of the allowance for duplicates in the result. barmar

**References**:**union performance bug in symbolics CL***From:*tk@AI.research.att.com (Thomas Kirk)

- Prev by Date:
**union performance bug in symbolics CL** - Next by Date:
**efficient set computations** - Previous by thread:
**union performance bug in symbolics CL** - Next by thread:
**efficient set computations** - Index(es):