Posts

Showing posts from February, 2008

swap values - the Python way

"Swap the values of two variables" is a well-known problem given to beginners. In our first introductory programming course (structured programming in C) we solved it in different ways. Most of us used a temporary variable. Some of us did math tricks. I remember that one of my friends wrote the following code (in C):

    int a, b;
    scanf("%d %d", &a, &b);
    printf("%d %d\n", b, a);

And it made us laugh :-D

Here is the Pythonic way of doing this. Try the following code:

    a = 2
    b = 3
    print a, b
    a, b = b, a
    print a, b

:-)
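Why does the one-line swap work? The right-hand side is evaluated first and packed into a tuple, and only then are the names on the left rebound. A minimal sketch of that, written for Python 3 (an assumption on my side, the post itself uses the Python 2 print statement); the three-variable rotation is just an extra illustration:

```python
# Python 3 sketch; the post's own code is Python 2.
a, b = 2, 3
a, b = b, a          # (b, a) is built first, then unpacked into a and b
print(a, b)

# The same idea rotates three values at once, still without a temporary.
x, y, z = 1, 2, 3
x, y, z = y, z, x
print(x, y, z)
```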

How to download a file using Python?

A couple of weeks ago, I had to write a spider that harvests data from a website into a CSV file and downloads the images. At first I wondered how to do the download, then I came up with a simple idea and wrote a function save_image that takes the URL of the JPG image and a filename, downloads the file, and saves it under the name given in filename.

    import urllib2

    def save_image(url, filename):
        # read the raw bytes from the url
        usock = urllib2.urlopen(url)
        data = usock.read()
        usock.close()
        # write them to disk unchanged
        fp = open(filename, 'wb')
        fp.write(data)
        fp.close()

Actually, I just write the file in binary mode. Now post your code that performs this task in a different manner.
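One different manner, as a hedged sketch: on Python 3 (an assumption, the post targets Python 2) urllib2 lives in urllib.request, and the response can be copied to disk in chunks so a large file never sits in memory all at once. The name save_file and the chunk size are my own choices for illustration.

```python
import shutil
import urllib.request


def save_file(url, filename, chunk_size=64 * 1024):
    """Download url to filename, copying in chunks instead of one big read()."""
    # Both objects are closed automatically when the with block ends.
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
```

Usage would look like save_file('http://www.example.com/photo.jpg', 'photo.jpg'), where the URL is only a placeholder.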

GetACoder for freelance Python projects

GetACoder.com is a good place for freelancers, where you can search for Python projects (of course you can search for other projects too, not only Python :) Go here for Python projects. There are a lot of websites and blogs that can help you become a successful freelancer, so I don't want to write about these things here. Just one line: 'have patience and be honest'. Have fun with Python!

Use user agent in your spider

Some websites don't allow your spider to scrape their pages unless you send a user agent from your code. You can fool such websites with a user agent so that they think the request is coming from a browser. Here is a piece of code that uses the user agent 'Mozilla/5.0' to get the HTML content of a website:

    import urllib2

    url = "http://www.example.com"  # write your url here
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    usock = opener.open(url)
    url = usock.geturl()
    data = usock.read()
    usock.close()
    print data

You can use other user agents as well. For example, the user agent my Firefox browser uses:

    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20061201 Firefox/2.0.0.12 (Ubuntu-feisty)"

What is your user agent?
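For Python 3 readers (an assumption, the post's code is Python 2), here is a small sketch of the same trick using a Request object with a headers dictionary. build_request is a hypothetical helper name; nothing is sent over the network until the request is passed to urlopen.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.example.com")
# Request stores header names capitalized, hence the 'User-agent' spelling here.
print(req.get_header("User-agent"))

# To actually fetch the page:
# with urllib.request.urlopen(req) as usock:
#     data = usock.read()
```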

Python 2.5.2 released!

The final version of Python 2.5.2 has been released! Download it from here: http://www.python.org/download/releases/2.5.2/

From python.org:

This is the second bugfix release of Python 2.5. Python 2.5 is now in bugfix-only mode; no new features are being added. According to the release notes, over 100 bugs and patches have been addressed since Python 2.5.1, many of them improving the stability of the interpreter, and improving its portability.

Read the release notes for more: http://www.python.org/download/releases/2.5.2/NEWS.txt

Execute Linux commands in Python

For the last few days I have been working on an interesting project that requires executing some Linux commands from Python. So I searched a bit and found a way to do it. For example, if I want to run the command 'ls -lt', I can use the following code:

    import os
    os.system('ls -lt')

But the problem is that I want to store the result in a list. Then I thought of writing the output to a file and reading the file into a list. My code was:

    import os
    os.system('ls -lt > output.txt')

Can you suggest a better way?
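One possible better way: os.system only returns the exit status, while the subprocess module can hand the output back directly, so no temporary file is needed. A minimal sketch, assuming Python 3.7 or later (the post itself is Python 2); run_lines is a name I made up for illustration.

```python
import subprocess


def run_lines(command):
    """Run command (a list of arguments) and return its stdout as a list of lines."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# On Linux, for example:
# listing = run_lines(["ls", "-lt"])
```

Passing the command as a list avoids going through the shell, and check=True raises an exception if the command exits with an error.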

which Python version do you use?

Yes, this was the first poll on this blog. 9 people voted for Python 2.5 and 1 voted for 3.0. I myself still use Python 2.5.1, waiting for the final release of 3.0... :)

How to round a floating point number?

It's another problem I faced a few days ago. I needed to round a floating point number to two digits after the decimal point. Then I found a function named round(). Look at the following code block written in Python:

    d = 10.0 / 3
    print d
    d = round(d, 2)
    print d
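A small follow-up sketch (Python 3 syntax, which is my assumption): round() gives back a float, so it is the right tool when you need the number itself, while string formatting is handy when you only want to display two decimal places, trailing zero included.

```python
d = 10.0 / 3

rounded = round(d, 2)   # still a float, usable in further arithmetic
text = "%.2f" % d       # a string, meant for display only
print(rounded, text)

# Formatting always keeps two decimal places; round() does not pad with zeros.
print("%.2f" % 2.5, round(2.5, 2))
```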

Get Original URL

Once I got into trouble while crawling some websites. Some of the URLs I had weren't the original URLs; rather, they redirected to other URLs. Then I came up with a function to get the original URL, i.e. the one the redirect finally leads to. Here I share it with you:

    import cookielib
    import urllib2

    def get_original_url(url):
        """Take a url and return the original url
        together with the cookie jar (cookies, if any).
        """
        cj = cookielib.CookieJar()
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
        opener.addheaders = [('User-agent', 'Mozilla/5.0')]
        usock = opener.open(url)
        url = usock.geturl()
        usock.close()
        return url, cj

Please send me your comments on this piece of code.
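For comparison, a hedged Python 3 sketch of the same function (Python 3 is my assumption; there cookielib became http.cookiejar and urllib2 became urllib.request). The name resolve_url is hypothetical.

```python
import http.cookiejar
import urllib.request


def resolve_url(url, user_agent="Mozilla/5.0"):
    """Follow redirects from url; return the final url and the cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", user_agent)]
    # geturl() reports the address that was actually retrieved after redirects.
    with opener.open(url) as response:
        return response.geturl(), jar
```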

Python projects in GetAFreelancer

If you are interested in freelance work using Python, there are several websites where you can search for projects. GetAFreelancer.com is one of them. You can find Python projects here: http://getafreelancer.com/projects/by-job/Python.html If you don't have time to work on projects, browsing them will still give you some ideas about real-world applications of Python.

Command Line Arguments in Python

Another problem Python beginners face is taking input from command line arguments. Maybe you want to give a text file name as a command line argument, or you need it for another purpose (the first time, my need was the former). The solution is to use sys.argv. You need to import the sys module first. Then sys.argv[0] contains the Python filename itself, sys.argv[1] contains the next argument, and so on...

    import sys

    print 'I am ', sys.argv[0]
    print sys.argv[1]

Run the program:

    python my.py file.txt

Now you may assign sys.argv[1] to a variable and do whatever you want. To know more about the sys module, check http://docs.python.org/lib/module-sys.html

Hope you find this tip useful :)
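One thing to watch out for: if the program is run without an argument, sys.argv[1] raises an IndexError. A minimal Python 3 sketch (my assumption, the post uses Python 2) that checks the length first; the function name main and the message texts are only illustrative.

```python
import sys


def main(argv):
    """Return a message built from the argument list (argv[0] is the script)."""
    if len(argv) < 2:
        return "usage: python %s <filename>" % argv[0]
    filename = argv[1]
    return "I am %s, got %s" % (argv[0], filename)


if __name__ == "__main__":
    print(main(sys.argv))
```

Taking the argument list as a parameter also makes the function easy to try from the interpreter, e.g. main(['my.py', 'file.txt']).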

Search Python code in Krugle

I found an interesting search engine that searches for source code. Visit http://www.krugle.org/kse/files and search in your desired programming language. The interesting thing about Krugle is that you can not only search code in your desired language but also specify where it should look for the keyword: in source code, comments, or documentation ... Hope you will find the site useful :-)

set timeout while spidering a site

Though I depend heavily on the urllib2 module to develop web crawlers, sometimes the crawlers just get stuck ... :(. So it's necessary to set a timeout, but unfortunately urllib2 doesn't provide anything for this purpose, so we have to depend on the socket module. Here is the code that I use:

    import socket

    timeout = 300  # seconds
    socket.setdefaulttimeout(timeout)
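A note for later Python versions: from Python 2.6 on, urlopen accepts a timeout argument, so the limit can also be set per request instead of globally. A short Python 3 sketch of both options (Python 3 and the helper name fetch are my assumptions):

```python
import socket
import urllib.request

# Global default: applies to every new socket created after this call.
socket.setdefaulttimeout(300)  # seconds
print(socket.getdefaulttimeout())


def fetch(url, timeout=30):
    """Fetch url, giving up if the server stays silent for `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as usock:
        return usock.read()
```

The per-call value is usually the safer choice in a crawler, because it does not change the behaviour of other code that also opens sockets.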

get html source of an URL

I have been using Python to write web crawlers/spiders/scrapers for a long time, and it's an interesting experience indeed. The good news is, I have decided to share my web crawler experience with you. I shall use the terms crawler, spider, and scraper interchangeably.

The most basic thing in writing a web spider is to get the HTML source (i.e. content) of a URL. There are many ways to do it. Here I post some simple code that gets the HTML source from a URL.

    import urllib2

    url = 'http://abc.com'  # write the url here
    usock = urllib2.urlopen(url)
    data = usock.read()
    usock.close()
    print data

urllib2 is a very useful module for the spiderman ;) so take a look at the documentation: http://www.python.org/doc/current/lib/module-urllib2.html
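On Python 3 (an assumption, the post is written for Python 2) read() returns bytes, so a spider usually wants to decode them as well. A minimal sketch; fetch_html and the UTF-8 fallback are my own choices, and real pages may declare their encoding elsewhere (for example in a meta tag), which this sketch ignores.

```python
import urllib.request


def fetch_html(url, default_charset="utf-8"):
    """Return the page at url as text, using the charset from the headers if any."""
    with urllib.request.urlopen(url) as usock:
        charset = usock.headers.get_content_charset() or default_charset
        return usock.read().decode(charset, errors="replace")
```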

How to give input from console in Python?

The answer to this frequent beginner question:

    name = raw_input()
    print name

If you want to read an integer:

    num = raw_input()
    num = int(num)  # raw_input returns a string, so convert it to an integer

You will also find this post useful: http://love-python.blogspot.com/2010/11/fast-way-to-get-input-from-stdin-python.html
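In Python 3, raw_input() was renamed to input(). The sketch below (Python 3 is my assumption, and ask_int is a made-up helper name) also keeps asking until the text really is an integer, since int() raises ValueError on something like 'abc'.

```python
def ask_int(prompt="", reader=input):
    """Keep asking until the user types something int() can convert."""
    while True:
        text = reader(prompt)
        try:
            return int(text)
        except ValueError:
            prompt = "Please enter a whole number: "


# num = ask_int("Enter a number: ")
```

The reader parameter exists only so the function can be tried without a keyboard; by default it is the built-in input.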

reverse a string in Python

How to reverse a string?

    s = 'abc'
    s = s[::-1]
    print s

Simple! :-)
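The slice s[::-1] means "take the whole string with a step of -1", i.e. walk it backwards. A short Python 3 sketch (my assumption, the post uses Python 2) showing the same thing with reversed(), plus the same slice applied to a list of words:

```python
s = "abc"
print(s[::-1])               # slice with step -1
print("".join(reversed(s)))  # same result via the reversed() iterator

# The slice works on any sequence, e.g. to reverse word order:
words = "love python".split()
print(" ".join(words[::-1]))
```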

Use Python in Online Programming Contest

Most online judges allow C/C++ and Java for programming contests. Other languages are not very popular among problem solvers and/or judges. If you want to use your Python skills in programming contests, Sphere Online Judge is a good place for this. You can try many languages there. Enjoy Python, enjoy programming contests!

read CSV file in Python

Python has a useful module named csv to deal with CSV files. But recently I faced a problem reading CSV files, and I want to share with you how I got around it. I wanted to get the data of the first column of my CSV file. I tried the following code:

    import csv

    portfolio = csv.reader(open("portfolio.csv", "rb"))
    names = []
    for data in portfolio:
        names.append(data[0])
    print names

but the output was an empty list ([]) :-( Then I checked the type of the portfolio object using print type(portfolio) and found that it is '_csv.reader'. I changed my program to the following and got the list :-)

    import csv

    portfolio = csv.reader(open("portfolio.csv", "rb"))
    portfolio_list = []
    portfolio_list.extend(portfolio)
    names = []
    for data in portfolio_list:
        names.append(data[0])
    print names

If you have any better idea, please share. To know more about the csv module: http://docs.python.org/lib/module-csv.html
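One idea, as a hedged sketch: a csv.reader is an iterator, so it can be walked only once; if anything has already consumed it, a later loop sees nothing. Reading the column in a single pass avoids the question entirely. The code assumes Python 3 (where the file is opened in text mode with newline='' instead of 'rb'), and first_column is a hypothetical name.

```python
import csv


def first_column(path):
    """Return the first field of every non-empty row in the CSV file at path."""
    with open(path, newline="") as f:
        # The reader is consumed exactly once, inside the with block.
        return [row[0] for row in csv.reader(f) if row]


# names = first_column("portfolio.csv")
```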

FTP file upload

Uploading a file to your FTP server is very simple. You can use the following code for that purpose:

    import ftplib

    sftp = ftplib.FTP('myserver.com', 'login', 'password')  # Connect
    fp = open('todo.txt', 'rb')                              # file to send
    sftp.storbinary('STOR todo.txt', fp)                     # Send the file
    fp.close()                                               # Close file and FTP
    sftp.quit()
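A variation on the same upload, sketched for Python 3.3 or later (my assumption), where ftplib.FTP can be used in a with statement so the connection and the file are closed even if the transfer fails. upload_file is a made-up name, and the host and credentials in the usage line are the placeholders from the post.

```python
import ftplib
import os


def upload_file(host, user, password, local_path):
    """Upload local_path to the FTP server, keeping the same file name."""
    remote_name = os.path.basename(local_path)
    # Leaving the with block sends QUIT and closes the local file.
    with ftplib.FTP(host, user, password) as ftp, open(local_path, "rb") as fp:
        ftp.storbinary("STOR " + remote_name, fp)


# upload_file('myserver.com', 'login', 'password', 'todo.txt')
```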

lambda magic to find prime numbers

Find the prime numbers up to 100 with just 3 lines of code. The 4th line is for printing ;)

    nums = range(2, 100)
    for i in range(2, 10):
        nums = filter(lambda x: x == i or x % i, nums)
    print nums

Isn't it amazing? Check my new post about prime numbers: http://love-python.blogspot.com/2010/11/prime-number-generator-in-python-using.html
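How it works: each pass keeps i itself and drops every other multiple of i, and trying i from 2 to 9 is enough because any composite below 100 has a factor below 10. Note that under Python 3 filter() is lazy and every lambda would see the final value of i, so the loop as written would not give the primes there. A hedged Python 3 sketch of the same idea with list comprehensions (primes_below is a made-up name, and the general limit is my own extension):

```python
def primes_below(limit):
    """Return the primes smaller than limit by repeatedly dropping multiples."""
    nums = list(range(2, limit))
    i = 2
    while i * i < limit:
        # Keep i itself, drop everything else that i divides evenly.
        nums = [x for x in nums if x == i or x % i]
        i += 1
    return nums


print(primes_below(100))
```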