
title: perl regex to replace a hash character
link: https://onemoretech.wordpress.com/2013/10/30/perl-regex-to-replace-a-hash-character/
author: phil2nc
description:
post_id: 6569
created: 2013/10/30 12:21:14
created_gmt: 2013/10/30 16:21:14
comment_status: closed
post_name: perl-regex-to-replace-a-hash-character
status: publish
post_type: post

perl regex to replace a hash character

The humble #, also called the "hash" or "pound" character (at least here in the U.S.), can be quite useful. Or not. Here's how to deal with it using a perl regex.

My colleague Mike recently enabled a new feature on our Google Search Appliance that displays a map of someone's location when you click the link on their address. Neat feature, but to work best it needs a full address string, since the existence and formatting of postal codes is nowhere near universal around the world.

We had a problem right away because some of our corporate street addresses use a hash (#) symbol in place of the old-fashioned "No." at the beginning of the address. Like this:

#15 Simpson Road, Bangalore 555555, India

(Yeah, there's some confusion between the use of the LDAP 'street' vs 'postaladdress' attributes in some quarters.)

Unfortunately, the stock map lookup routine pretty much chokes on that leading hash symbol. After fiddling around with it for a while, I came up with a regex filter to replace that leading hash in all our street addresses before they hit the backend LDAP directory that the GSA crawls.

s/^\x23/No. /;
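To see what the `^` anchor buys you, here is a minimal sketch that runs the same substitution over two made-up addresses. The helper name `fix_leading_hash` and the second sample address are assumptions for illustration only, not part of the original filter.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the address with a leading '#'
# replaced by 'No. '. The caller's string is left untouched.
sub fix_leading_hash {
    my ($street) = @_;
    # ^ anchors the match to the start of the string, so only a leading
    # hash is rewritten; a hash later in the string is left alone.
    $street =~ s/^\x23/No. /;
    return $street;
}

# Prints: No. 15 Simpson Road, Bangalore 555555, India
print fix_leading_hash('#15 Simpson Road, Bangalore 555555, India'), "\n";

# Prints the address unchanged, because its '#' is not at the start.
print fix_leading_hash('15 Simpson Road, Suite #4, Bangalore 555555, India'), "\n";
```

Without the `/x` modifier a bare `#` is an ordinary literal in a perl regex, so `s/^#/No. /` should behave the same way. Writing it as `\x23` just keeps the pattern safe if `/x` is ever added, since under `/x` an unescaped `#` starts a comment.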

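(\x23 is perl's hex escape for character code 0x23, which is # in ASCII and in ASCII-compatible character sets such as Latin-1 and Unicode.) That's all there really is to it. Here's how I'm using this method in context:

[code language="perl" gutter="false"]
if ($street) {
    # for() aliases $_ to $street, so the substitution edits $street in place
    for ($street) {
        s/^\x23/No. /;
    }
}
# pass the attribute name and its new value together in one argument list
$entry->replace('street' => $street);
[/code]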
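If the street attribute can hold more than one value in your directory (an assumption; the post only deals with a single `$street`), the same filter can be mapped over a list. This is only a sketch: `clean_streets` is a hypothetical helper name, the sample values are invented, and the `/r` modifier needs perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of street values and returns cleaned copies.
sub clean_streets {
    my @values = @_;
    # grep drops undefined or empty values, roughly mirroring the
    # if ($street) guard; s///r returns the modified copy instead of
    # editing the value in place.
    return map { s/^\x23/No. /r }
           grep { defined($_) && length($_) } @values;
}

my @cleaned = clean_streets('#15 Simpson Road', '22 Residency Road', undef);

# Prints:
#   No. 15 Simpson Road
#   22 Residency Road
print "$_\n" for @cleaned;
```

Returning copies rather than editing in place keeps the original values around, which can help if you want to log what was changed before writing the entry back.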

Copyright 2004-2019 Phil Lembo