Posts

Showing posts from April, 2011

HTML to TEXT in Python

I just wrote a small Python program. One part of the script needed to take the body of a web page and strip out all the HTML tags, JavaScript, CSS styles, HTML comments, and so on. I searched Google, read several threads on Stack Overflow, and then found this: http://www.aaronsw.com/2002/html2text/

This looks cool. But when I tested it against the 'about me' page of my blog, it didn't work because of some broken tags! So I started writing the HTML-to-text function myself to get only the plain text. With the help of regular expressions I solved my problem (but maybe I created more problems!). Here is my Python code:

    import re

    def html_to_text(data):
        # remove the newlines
        data = data.replace("\n", " ")
        data = data.replace("\r", " ")

        # replace consecutive spaces into a single one
        data = " ".join(data.split())

        # get only the body content
        bodyPat = re.compile(r'< body[
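The excerpt above is cut off, so here is a separate minimal sketch of the same idea, not the original function. It swaps the regular-expression approach for the standard library's `html.parser.HTMLParser` (Python 3), which tends to tolerate unclosed tags. The names `_TextExtractor` and `html_to_plain_text` are made up for this illustration, and skipping the `title` element is an assumption about what counts as "body text".

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping some elements."""

    # Contents of these elements are dropped (illustrative choice).
    SKIP_TAGS = {"script", "style", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # HTML comments never reach this method; the base class routes
        # them to handle_comment(), which does nothing by default.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks with spaces, then collapse whitespace runs.
        # Note: this also puts a space where an inline tag splits a word.
        return " ".join(" ".join(self._chunks).split())


def html_to_plain_text(html):
    """Return the visible text of an HTML string as a single line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `html_to_plain_text(page_source)` on a downloaded page returns a single-line string. It has not been tested against the 'about me' page mentioned above, so treat it as a starting point only.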

Replace consecutive whitespace with a single space

Sometimes we need to replace consecutive whitespace in a string with a single space. This is good practice when parsing HTML files. Let me show you two ways of doing it.

The first is to split the string and join it again. Here is the code snippet:

    >>> s = "a  b   c    d e f"
    >>> " ".join(s.split())
    'a b c d e f'

You can check more string methods here: http://docs.python.org/release/2.5.2/lib/string-methods.html

The second method is to use a regular expression. Here is the code:

    >>> import re
    >>> s = "a  b   c    d e f"
    >>> p = re.compile(r'\s+')
    >>> data = p.sub(' ', s)
    >>> data
    'a b c d e f'
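The two methods are not quite identical when the string starts or ends with whitespace. Here is a small sketch of the difference; the function names and the sample string are made up for illustration:

```python
import re

WHITESPACE_RUN = re.compile(r"\s+")


def collapse_with_split(s):
    # split() with no arguments discards leading and trailing whitespace.
    return " ".join(s.split())


def collapse_with_regex(s):
    # The regex turns every whitespace run, including those at the
    # edges of the string, into a single space.
    return WHITESPACE_RUN.sub(" ", s)


sample = "  a \t b\n\nc  "
print(repr(collapse_with_split(sample)))   # 'a b c'
print(repr(collapse_with_regex(sample)))   # ' a b c '
```

If you want the regex version to match the split/join result, calling `.strip()` on its output should be enough.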

Create Facebook Application in Python using App Engine

So far I have used PHP for all the Facebook applications that I (and my other team members) have developed. Today I was thinking about using Python to develop a Facebook app. A Google search turned up some links, and at the same time I asked one of my ex-colleagues whether he had used Python for any Facebook application (I knew he was exploring Google App Engine and Facebook app development). He told me that Facebook already has a sample application that uses Python and, most interestingly, Google App Engine! The application is named 'Run With Friends'. I had seen this page before but never looked at it closely. So I think it is the right place to get started creating a Facebook app in Python and GAE: https://developers.facebook.com/docs/samples/canvas/ . The project I am planning might take 4 to 6 months (if I get regular free time). Let me know if you have already built any interesting Facebook app using Google App Engine.