How to remove all special characters except the percent sign (a real-world use case)
If you run a coupon website, you will hit this issue sooner or later while parsing or scraping coupons from merchants.
For example, you will encounter these four types of strings:
1. Get 10% Cashback
2. Get Rs. 10/- Cashback
3. Get 10/- Cashback
4. Get ₹ 10 Cashback (₹ is the Indian rupee symbol)
If you use the regex pattern below, it will remove all the special characters:
[ ](?=[ ])|[^-_,A-Za-z0-9 ]+
The sample strings then become:
1. Get 10 Cashback
2. Get Rs 10- Cashback
3. Get 10- Cashback
4. Get 10 Cashback
Strings 1 and 4 now produce the same text (apart from an extra space left where the rupee symbol was), which is wrong: "10%" and "₹ 10" both collapse to "10" because the regex removes every special character.
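The collision can be reproduced with a minimal sketch. The class and method names here are hypothetical, and the second `replaceAll` pass is an addition for illustration: the `[ ](?=[ ])` lookahead only removes a space that is already followed by another space in the input, so it does not close the gap left behind by a removed symbol.

```java
class StripAllDemo {
    // Hypothetical helper: removes everything except letters, digits, spaces and - _ ,
    // then collapses the runs of spaces left behind by removed symbols.
    static String stripAll(String s) {
        return s.replaceAll("[^-_,A-Za-z0-9 ]+", "")
                .replaceAll(" {2,}", " ")
                .trim();
    }

    public static void main(String[] args) {
        String percent = "Get 10% Cashback";
        String rupee = "Get \u20B9 10 Cashback"; // U+20B9 is the Indian rupee sign

        System.out.println(stripAll(percent)); // Get 10 Cashback
        System.out.println(stripAll(rupee));   // Get 10 Cashback
        // Both offers are now indistinguishable.
        System.out.println(stripAll(percent).equals(stripAll(rupee))); // true
    }
}
```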
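If you instead remove every special character except the percent sign (%), the problem goes away:
[ ](?=[ ])|[^-_,A-Za-z0-9% ]+
Adding % to the negated character class is all that is needed. Inside [...] the characters ? and ! are ordinary literals, not an escape, so writing ?!% would keep question marks and exclamation marks as well.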
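A short sketch shows how `?` and `!` behave inside a character class in `java.util.regex`. The class name, method names, and sample input are made up for illustration.

```java
class CharClassDemo {
    // ? and ! are ordinary literals inside [...], so this class keeps them too.
    static String keepWithPrefix(String s) {
        return s.replaceAll("[^-_,A-Za-z0-9?!% ]+", "");
    }

    // Listing % alone is enough to keep the percent sign.
    static String keepPercentOnly(String s) {
        return s.replaceAll("[^-_,A-Za-z0-9% ]+", "");
    }

    public static void main(String[] args) {
        String input = "Really?! Get 10% Cashback!";
        System.out.println(keepWithPrefix(input));  // Really?! Get 10% Cashback!
        System.out.println(keepPercentOnly(input)); // Really Get 10% Cashback
    }
}
```

Both patterns keep the percent sign. The only effect of the `?!` prefix is that stray question marks and exclamation marks survive the cleanup.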
Now it is:
1. Get 10% Cashback
2. Get Rs 10- Cashback
3. Get 10- Cashback
4. Get 10 Cashback
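A further change keeps /- and Rs. as well:
[ ](?=[ ])|[^-_,A-Za-z0-9%./ ]+
Note that a character class whitelists single characters, not sequences, so this keeps every . and / in the input, not only those in "Rs." and "/-".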
Now it is:
1. Get 10% Cashback
2. Get Rs. 10/- Cashback
3. Get 10/- Cashback
4. Get 10 Cashback
Now all four outputs are distinct, and a percentage is no longer confused with a rupee amount.
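Whitelisting `.` in the character class keeps every dot, including sentence-ending ones. If that matters for your data, one possible refinement is a second pass with a negative lookbehind that drops dots not preceded by "Rs". This is only a sketch under that assumption, with hypothetical names, and it is not part of the original pattern.

```java
class KeepRsDotDemo {
    // Hypothetical helper: whitelist pass, then remove dots that do not follow "Rs",
    // then collapse leftover runs of spaces.
    static String clean(String s) {
        return s.replaceAll("[^-_,A-Za-z0-9%./ ]+", "") // keep % . / plus letters, digits, space, - _ ,
                .replaceAll("(?<!Rs)\\.", "")           // drop any dot not directly after "Rs"
                .replaceAll(" {2,}", " ")
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("Get Rs. 10/- Cashback.")); // Get Rs. 10/- Cashback
        System.out.println(clean("Flat 10% off. Hurry!"));   // Flat 10% off Hurry
    }
}
```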
Full source code of the above regex example:
[java]
package in.javadomain;

public class RegexExceptPercentage {
    public static void main(String[] args) {
        String str1 = "Get 10% Cashback";
        String str2 = "Get Rs. 10/- Cashback";
        String str3 = "Get 10/- Cashback";
        String str4 = "Get \u20B9 10 Cashback"; // U+20B9, the Indian rupee sign

        // Keep % in addition to letters, digits, spaces and - _ ,
        String afterRem1 = str1.replaceAll("[ ](?=[ ])|[^-_,A-Za-z0-9% ]+", "");
        System.out.println(afterRem1);

        // Keep . and / so that "Rs." and "/-" survive
        String afterRem2 = str2.replaceAll("[ ](?=[ ])|[^-_,A-Za-z0-9./ ]+", "");
        System.out.println(afterRem2);

        // Keep / so that "/-" survives
        String afterRem3 = str3.replaceAll("[ ](?=[ ])|[^-_,A-Za-z0-9/ ]+", "");
        System.out.println(afterRem3);

        // Remove every special character; the removed symbol leaves two
        // adjacent spaces, so a second pass collapses them into one
        String afterRem4 = str4.replaceAll("[^-_,A-Za-z0-9 ]+", "")
                               .replaceAll(" {2,}", " ");
        System.out.println(afterRem4);
    }
}
[/java]
Output:
[plain]
Get 10% Cashback
Get Rs. 10/- Cashback
Get 10/- Cashback
Get 10 Cashback
[/plain]