Lists are a native part of the Python language, and they make programming easy and fast. But every moon has a dark side, and I would like to shed some light on it. The problem with lists is heavy resource usage. Everyone should keep this in mind while coding.
If a list is used, speed is good but memory usage is really bad. Is it possible to have both good speed and low memory usage? Let's try. Two problems stand in the way; they are listed after the measurements. First, a simple example from the Python tutorial:
myfile = open("myfile.txt")
myfile.readlines()
Python opens the file and creates a list containing every line in it. The simple script below gives some information about execution speed and memory usage:
#!/usr/bin/python
# Python 2 script: copies a text file with readlines() and with readline().
import datetime
import resource

# Create the test file: ten million lines, one timestamp per line.
currenttime = datetime.datetime.now()
print "="*20
print "Creating a file "
print "="*20
myfile = open("textfile.txt", "w")
simplerange = xrange(10000000)
try:
    for i in simplerange:
        myfile.write(unicode(datetime.datetime.now()))
        myfile.write('\n')
finally:
    myfile.close()
timespend = datetime.datetime.now() - currenttime
print timespend
print "="*20

# Copy the file by loading every line into one list.
print "="*20
print "Open file using readlines"
print "="*20
myfile = open("textfile.txt", "r")
linesinlistfile = open("linesinthelist.txt", "w")
currenttime = datetime.datetime.now()
linesinlist = myfile.readlines()
for currentline in linesinlist:
    linesinlistfile.write(currentline)
myfile.close()
linesinlistfile.close()
timespend = datetime.datetime.now() - currenttime
print timespend

# Copy the file one line at a time.
print "="*20
print "openfile using readline"
print "="*20
myfile = open("textfile.txt", "r")
readonelinefile = open("readonelinefile.txt", "w")
currenttime = datetime.datetime.now()  # restart the timer for this section
while 1:
    currentline = myfile.readline()
    if not currentline:
        break
    readonelinefile.write(currentline)
myfile.close()
readonelinefile.close()
timespend = datetime.datetime.now() - currenttime
print timespend

# Peak memory used by this process.
print "="*20
print "Resource usage"
print "="*20
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
This script creates a simple text file with a timestamp string on each line, then copies it using the readline() and readlines() functions. The last part prints the peak memory usage in kilobytes. To get correct data for each method, I commented out the part of the code related to the other one (readline or readlines). The result is below:
| | readline() | readlines() |
|---|---|---|
| executing time | 0:01:10.799743 | 0:00:04.562637 |
| memory usage (KB) | 3620 | 526464 |
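The comparison in the table can be reproduced on a small scale inside one process. The sketch below is not the script used for the measurements: it is written for Python 3 and uses the standard `tracemalloc` module instead of `resource`, because `ru_maxrss` is a high-water mark for the whole process and cannot be reset between two runs. The helper names (`copy_with_readlines`, `copy_with_readline`, `peak_traced_bytes`, `make_sample_file`) and the sample line format are assumptions made for illustration.

```python
import os
import tempfile
import tracemalloc


def copy_with_readlines(src_path, dst_path):
    """Copy a text file by loading every line into one list first."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src.readlines():
            dst.write(line)


def copy_with_readline(src_path, dst_path):
    """Copy a text file one line at a time, keeping only the current line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        while True:
            line = src.readline()
            if not line:
                break
            dst.write(line)


def peak_traced_bytes(copy_func, src_path, dst_path):
    """Return the peak bytes allocated by Python while copy_func runs."""
    tracemalloc.start()
    try:
        copy_func(src_path, dst_path)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def make_sample_file(path, line_count):
    """Write line_count short lines (a hypothetical stand-in for timestamps)."""
    with open(path, "w") as f:
        for i in range(line_count):
            f.write("sample line number %08d\n" % i)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "textfile.txt")
    dst = os.path.join(workdir, "copy.txt")
    make_sample_file(src, 50000)
    print("readlines() peak bytes:",
          peak_traced_bytes(copy_with_readlines, src, dst))
    print("readline()  peak bytes:",
          peak_traced_bytes(copy_with_readline, src, dst))
```

The absolute numbers will differ from the table, since `tracemalloc` counts only allocations made through Python's allocator and the sample file is much smaller, but the list-based copy should show a clearly higher peak.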
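The two problems are:
- A big list requires a lot of memory
- A solution without a list cannot be cached, so it is not quick

readlines() accepts an optional size hint, so the file can be read as a series of small lists: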
while 1:
    linesinlist = myfile.readlines(1000)
    if not linesinlist:
        break
    for currentline in linesinlist:
        linesinlistfile.write(currentline)
And the result is below:

====================
Open file using readlines
====================
0:00:04.383583
====================
Resource usage
====================
3636

It is not hard to make a good application; you only have to feel like doing it!
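The chunked readlines(1000) loop can also be wrapped in a small generator, so the reading logic is written once and reused. This is a minimal sketch for Python 3, not code from the measurements above; the names `iter_line_chunks` and `copy_in_chunks` are hypothetical. In Python 3 the size hint is a soft limit: reading stops once the total size of the lines collected so far exceeds it.

```python
def iter_line_chunks(path, size_hint=1000):
    """Yield small lists of lines instead of one list for the whole file."""
    with open(path) as f:
        while True:
            lines = f.readlines(size_hint)
            if not lines:
                # An empty list means the end of the file was reached.
                break
            yield lines


def copy_in_chunks(src_path, dst_path, size_hint=1000):
    """Copy a text file while holding only one small list in memory."""
    with open(dst_path, "w") as dst:
        for lines in iter_line_chunks(src_path, size_hint):
            dst.writelines(lines)
```

Calling `copy_in_chunks("textfile.txt", "linesinthelist.txt")` would then do the same work as the loop above, with the file names taken from the script.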