Ruby: 15-second Profanity Filter
In the specification of a current project we have need of a profanity filter. While writing the spec I thought it would be a fun diversion to see how quickly and in how few lines of code a reasonably functional basic profanity filter could be created in Ruby (I don’t actually enjoy writing documentation so I’m easily convinced to do otherwise).
Fear not. The following code won’t see production. I’m aware of how easy it is to circumvent the “protection” and, besides, the final project will be in PHP, through no fault of Ruby’s. (Don’t even go down that road.)
This was originally a one-liner but I couldn’t resist wrapping it in a function and adding in basic (i.e., poor) pluralization:
# A bare-bones word ('profanity') filter that checks a string of text for prohibited words
# and returns any that appear in the prohibition list. Also checks for some basic plurals.
# Not intended to be comprehensive, complete, or particularly expansive. Just hacking.
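def filter_profanity( check_str = "", prohibited_words = [] )
  # Add a naive plural ("cat" -> "cats") for every prohibited word
  prohibited_words += prohibited_words.map { |w| w + "s" }
  # Non-bang methods are used throughout: gsub!, uniq! and compact! return nil when they change nothing
  check_str.downcase.gsub(/[^a-z]/, " ").split.uniq.select { |w| prohibited_words.include?( w ) }
end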
# This will spit out a list of the words contained in "str" that aren't allowed according to "prohibited"
str = "This is a CAT and another cat and some dogs. And parrots too."
prohibited = ["cat", "dog", "pigeon", "ocelot"]
puts filter_profanity( str, prohibited )
Question: Can the pluralization be worked into it such that the function is once again a one-liner and is still readable by the average Rubyist? I haven’t figured out a way, but…
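One possible answer, offered as a rough sketch rather than a definitive solution: fold the optional trailing "s" into a regular expression built from the word list and let scan do the matching. The name filter_profanity_oneliner is made up for this illustration, and whether the result counts as "readable by the average Rubyist" is debatable. It assumes the same naive rule as above (a plural is just the word plus "s") and uses lookarounds so that "catsup" does not count as "cat".

```ruby
# Hypothetical one-line variant (the name is an assumption, not from the original post).
# Regexp.union escapes each word and joins them with "|"; the lookarounds require
# that a match is not preceded or followed by another letter.
def filter_profanity_oneliner( check_str = "", prohibited_words = [] )
  check_str.downcase.scan(/(?<![a-z])(?:#{Regexp.union(prohibited_words).source})s?(?![a-z])/).uniq
end

str = "This is a CAT and another cat and some dogs. And parrots too."
prohibited = ["cat", "dog", "pigeon", "ocelot"]
puts filter_profanity_oneliner( str, prohibited )
# prints:
# cat
# dogs
```

The trade-off is the usual one: the word list and the pluralization rule now live inside a single pattern, which is shorter but harder to extend than the two-step version.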
Update: Seems like this might benefit mightily from the inclusion of Jeremy McAnally’s acts_as_good_speeler Rails plugin. Implementation is left as an exercise for the compulsive-obsessive reader.
James Rosenstein responded on 07 Aug 2008 at 8:29 am
I found it a lot easier to use a web service like the WebPurify Profanity Web Service, rather than having to constantly update your list of prohibited words and trying to figure out ways to catch as much profanity as possible.
It took me no time to integrate it into my Ruby app.