Distinct Value Estimation by Sampling on Unstructured Peer-to-Peer Networks
Abstract
Peer-to-peer networks have become very popular on the Internet, with millions of peers around the world sharing large volumes of data. The sheer scale of these networks makes it difficult to gather statistics that could be used to build new features. This thesis presents a sampling-based technique for estimating the number of distinct values on the network that match a query. The technique is then evaluated through simulations, whose results demonstrate its effectiveness and its flexibility in supporting a variety of queries and applications.