Monday, 2 September 2013

How to match a list of URLs with a list of patterns (a regex)

I have a list of URLs and want to find the service name for each one,
given a list of URL patterns and their service names. Currently I take
each URL and match it against every pattern in turn. Since both lists
can be huge, what is the best way to match the URLs against the patterns
and find the service name? Sample data and my current implementation
are below.

URLs:
http://www.facebook.com
http://0.facebook.com
http://m.facebook.com
http://www.linkedin.com

Pattern        Service name
facebook.com   Facebook
linkedin.com   LinkedIn

def get_service_name(url, services_details):
    # Drop the query string so only the scheme, host and path are searched.
    url = url.split('?', 1)[0]
    # services_details is a sequence of (pattern, service name) pairs;
    # return the name for the first pattern found as a substring of the URL.
    for pattern, name in services_details:
        if pattern in url:
            return name
    return "Unknown Service"
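
One possible way to avoid scanning every pattern for every URL, sketched here under some assumptions rather than offered as a definitive answer: if every pattern is a plain domain name (as in the sample data), the patterns can go into a dict, and each URL's host can be looked up by trying its domain suffixes. The work per URL then depends on the number of labels in the host, not on the number of patterns. The names SERVICES and get_service_name_by_host are hypothetical, the code assumes Python 3, and it assumes each URL includes a scheme such as http://. It also matches on the host only, so unlike the substring check above it will not match a pattern that appears in the path or inside a longer domain name.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table built from the pattern list above.
SERVICES = {
    "facebook.com": "Facebook",
    "linkedin.com": "LinkedIn",
}


def get_service_name_by_host(url, services=SERVICES):
    """Return the service whose pattern is a domain suffix of the URL's host."""
    # hostname is None when the URL has no scheme or host part.
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Try the full host first, then progressively shorter suffixes:
    # m.facebook.com, then facebook.com, then com.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in services:
            return services[candidate]
    return "Unknown Service"


urls = [
    "http://www.facebook.com",
    "http://0.facebook.com",
    "http://m.facebook.com",
    "http://www.linkedin.com",
]
names = [get_service_name_by_host(u) for u in urls]
```

If the patterns are real regular expressions rather than plain domains, this lookup does not apply as written, and each pattern would still need to be tried (or combined into one compiled expression).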
