Using BBcode to format text
As technology improves, some things become outdated and get dropped, but it is worth keeping in mind the ideas that led people to use them in the first place. Today there are several options for rendering text into HTML; this blog uses BBcode.
BBcode was created to aid people posting messages on bulletin boards a long time ago, around when Doom or Doom II came out. By trying to bring it back again I basically get to live in a time when video games were new.
Here is everything I did to implement the old style using Ruby on Rails.
application_helper.rb
def format_with_bbcode(content, trusted=false)
  result = []
  # Escape HTML and split the text into blocks
  h(content).split(/(\[quote=.+?\].*?\[\/quote\]|\[quote\].*?\[\/quote\]|\[code\].*?\[\/code\])/mi).each do |text|
    text.strip! # Remove unwanted whitespace
    next if text.blank? # Skip empty blocks
    if i = text.scan(/^\[quote=(.+?)\](.*?)\[\/quote\]$/mi).first
      # Quote with an author; the inner double quotes must be escaped
      text = content_tag('blockquote', format_inline_bbcode("[i][b]\"#{i[0]}\" wrote:[/b][/i]\n#{i[1].strip}", trusted))
    elsif i = text.scan(/^\[quote\](.*?)\[\/quote\]$/mi).first
      text = content_tag('blockquote', format_inline_bbcode("[i][b]Quote:[/b][/i]\n#{i[0].strip}", trusted))
    elsif i = text.scan(/^\[code\](.*?)\[\/code\]$/mi).first
      # Code blocks get no inline formatting, only normalized line endings
      text = content_tag('pre', i[0].gsub(/\r\n?/, "\n"), :class => 'prettyprint')
    else
      text = format_inline_bbcode(text, trusted) # Format paragraph text
    end
    result << text
  end
  return result.join("\n\n")
end
def format_inline_bbcode(content, trusted)
  result = simple_format(content)
  result.gsub!(/\[b\](.+?)\[\/b\]/i, '<strong>\1</strong>')
  result.gsub!(/\[i\](.+?)\[\/i\]/i, '<em>\1</em>')
  result.gsub!(/\[u\](.+?)\[\/u\]/i, '<u>\1</u>')
  result.gsub!(/\[email=(.+?)\](.+?)\[\/email\]/i, '<a href="mailto:\1">\2</a>')
  result.gsub!(/\[email\](.+?)\[\/email\]/i, '<a href="mailto:\1">\1</a>')
  return result unless trusted
  result.gsub!(/\[url\](.+?)\[\/url\]/i, '<a href="\1" target="_blank">\1</a>')
  result.gsub!(/\[url=(.+?)\](.+?)\[\/url\]/i, '<a href="\1" target="_blank">\2</a>')
  result.gsub!(/\[img\](.+?)\[\/img\]/i, '<img src="\1" />')
  return result
end
First, the text is split into blocks, each of one of four types: a quote with an author, a plain quote, a code block, or ordinary paragraph text. Each block is then formatted separately and the results are reassembled. This avoids extraneous <p> tags around the block elements and ensures no formatting is applied to the text inside <pre> tags.
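The splitting step can be tried outside Rails. The sketch below reuses the pattern from format_with_bbcode; the helper name split_into_blocks and the sample text are made up for illustration. It relies on the fact that String#split keeps the matched delimiters in the result when the pattern contains a capture group.

```ruby
# Same block-splitting pattern as in format_with_bbcode above.
BLOCK_PATTERN = /(\[quote=.+?\].*?\[\/quote\]|\[quote\].*?\[\/quote\]|\[code\].*?\[\/code\])/mi

# Hypothetical helper: returns the stripped, non-empty blocks of a post.
# Because the pattern is wrapped in a capture group, split returns the
# quote and code blocks themselves as well as the text between them.
def split_into_blocks(content)
  content.split(BLOCK_PATTERN).map(&:strip).reject(&:empty?)
end

sample = "Hello there.\n[quote=Bob]Hi![/quote]\nSome code:\n[code]puts 1[/code]\nBye."
blocks = split_into_blocks(sample)
blocks.each { |block| p block }
```

Each element of blocks is either one complete quote or code block, or the paragraph text that sat between two of them.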
Each regular expression requires both an opening and a closing tag before it matches, so a tag that is never closed is left in the output as plain text.
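A quick way to see this is to run one of the inline patterns against a closed and an unclosed tag. This is only an illustration in plain Ruby; the constant name BOLD and the sample strings are not part of the helper.

```ruby
# The bold pattern from format_inline_bbcode above.
BOLD = /\[b\](.+?)\[\/b\]/i

closed   = "This is [b]bold[/b] text".gsub(BOLD, '<strong>\1</strong>')
unclosed = "This is [b]bold text".gsub(BOLD, '<strong>\1</strong>')

puts closed    # This is <strong>bold</strong> text
puts unclosed  # This is [b]bold text
```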
For the text inside <pre> tags, I decided to make the line breaks cross-platform, which is why I've added gsub(/\r\n?/, "\n") to convert them all to Unix line endings. Rails' simple_format method does the same conversion automatically for the other blocks.
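As a small check of what that substitution does, here it is applied to a string that mixes Windows, old Mac, and Unix line endings. The method name normalize_newlines is only for this example.

```ruby
# Hypothetical wrapper around the gsub used for [code] blocks above.
# \r\n? matches a carriage return with or without a following line feed,
# so both "\r\n" and a lone "\r" become "\n".
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

mixed = "line one\r\nline two\rline three\nline four"
p normalize_newlines(mixed)  # "line one\nline two\nline three\nline four"
```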
Following that, simple_format is used on all other blocks to convert paragraphs into HTML, and then each piece of inline formatting, such as italics, is applied. This all works well enough for me because I don't need nested quotes; some additional Ruby programming will be necessary if you have to pull that off.
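If nested quotes are needed, one possible approach (not what this blog does) is to convert the innermost quote first and repeat until nothing is left to convert. The sketch below is plain Ruby and assumes the text has already been HTML-escaped; the names INNERMOST_QUOTE and format_nested_quotes, and the simplified markup it emits, are assumptions for illustration only.

```ruby
# Matches a quote whose body contains no further opening quote tag,
# i.e. an innermost quote. Group 1 is the optional author, group 2 the body.
INNERMOST_QUOTE = /\[quote(?:=([^\]]+))?\]((?:(?!\[quote[=\]]).)*?)\[\/quote\]/mi

# Hypothetical helper: converts quotes from the inside out.
# gsub! returns nil when nothing matched, which ends the loop.
def format_nested_quotes(text)
  text = text.dup
  loop do
    changed = text.gsub!(INNERMOST_QUOTE) do
      author = Regexp.last_match(1)
      body   = Regexp.last_match(2).strip
      label  = author ? "#{author} wrote:" : "Quote:"
      "<blockquote><strong>#{label}</strong> #{body}</blockquote>"
    end
    break unless changed
  end
  text
end

puts format_nested_quotes("[quote=Ann]outer [quote]inner[/quote] end[/quote]")
```

Each pass removes at least one closing tag, so the loop always ends; quotes that are never closed are left as plain text.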
Because all HTML is escaped before any BBcode is converted, no additional whitelists or large text sanitization plugins are necessary at this point, and skipping them should speed up your application.
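The escape-first order can be sketched in plain Ruby. Here CGI.escapeHTML from the standard library stands in for Rails' h helper, and escape_then_format is a made-up name that applies only the bold rule.

```ruby
require 'cgi'

# Hypothetical, cut-down version of the helper: escape everything first,
# then turn the one BBcode tag we support back into real HTML.
def escape_then_format(content)
  escaped = CGI.escapeHTML(content) # stand-in for Rails' h helper
  escaped.gsub(/\[b\](.+?)\[\/b\]/i, '<strong>\1</strong>')
end

puts escape_then_format("[b]hi[/b] <script>alert(1)</script>")
# <strong>hi</strong> &lt;script&gt;alert(1)&lt;/script&gt;
```

Any markup typed by the user arrives at the substitution step already escaped, so the only tags in the output are the ones the BBcode rules generate.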