sigizmund.com
A URL matching regex in Python — any problems?
By sigizmund on January 17, 2010
Can anyone see any flaws in this regex for real-world URLs?
>>> import re
>>> text = 'and now http://sub.domain.com/something/?here3=3ab&what=1#where=1 that was a URL'
>>> urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&#+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', text)
>>> urls
['http://sub.domain.com/something/?here3=3ab&what=1#where=1']
It looks like it works to me, but you never know… Comments from @HD42 would be highly appreciated =)
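One way to go looking for flaws is to run the same pattern over a few awkward inputs. The following is only a rough sketch: the helper name `find_urls` and the probe strings are made up for illustration, and the list is nowhere near a complete test of what real-world URLs look like.

```python
import re

# The pattern from the post, unchanged apart from the raw-string prefix.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&#+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


def find_urls(text):
    """Return every substring of text that the pattern matches (hypothetical helper)."""
    return URL_PATTERN.findall(text)


# Made-up probe strings (assumptions, not taken from real data).
samples = [
    'see http://example.com/page. Next sentence',  # trailing full stop
    '(http://example.com/a)',                      # closing parenthesis
    'http://example.com/~user/home',               # tilde in the path
    'HTTP://EXAMPLE.COM/',                         # upper-case scheme
]

for sample in samples:
    print(repr(sample), '->', find_urls(sample))
```

With these inputs the trailing `.` and `)` end up inside the match, the `~` URL is cut off at `http://example.com/`, and the upper-case scheme is not matched at all. The first two happen because `$-_` inside a character class is an ASCII range from `$` to `_`, which covers `.`, `,`, `)`, `<`, `>` and a bare `%` among others; `~` falls outside every class in the pattern, so the match stops there.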