Using Redis with Python
May 1, 2010
I have a dodgy hard drive, which contains some recordings from a mythtv box. This disk works ok after boot for a while, and then throws a wobbler, marking some of the files on it as zero length. I needed a plan for getting the files off it, but I had to do it piecemeal and I didn't have a big enough disk to store a complete copy locally. The disk on my mythtv server was big enough, but it was across a wireless network and therefore slow.
What I needed was a way to work out which files were where, and find out how many files I had left to recover.
Redis
Enter redis and python
# -*- coding: utf-8 -*-
import redis

r = redis.Redis()

# Start from a clean slate: remove any sets left over from a previous run.
r.delete('local')
r.delete('mythtv')
r.delete('broken')

# Each listing has a fixed-length prefix before the filename, so slice it
# off along with the trailing newline.
lLocalFiles = open('local.txt')
for line in lLocalFiles:
    lFilename = "%s" % line[49:-1]
    r.sadd('local', lFilename)
    print(lFilename)
lLocalFiles.close()

lMythTvFiles = open('mythtv.txt')
for line in lMythTvFiles:
    lFilename = "%s" % line[55:-1]
    r.sadd('mythtv', lFilename)
    print(lFilename)
lMythTvFiles.close()

lBrokenFiles = open('broken.txt')
for line in lBrokenFiles:
    lFilename = "%s" % line[56:-1]
    r.sadd('broken', lFilename)
    print(lFilename)
lBrokenFiles.close()

# Size of each set.
print("Local files %s" % r.scard('local'))
print("MythTv files %s" % r.scard('mythtv'))
print("Broken files %s" % r.scard('broken'))

# Files in the local working directory that are not yet on the mythtv machine.
lLocalNotOnMythTv = r.sdiff(keys=('local', 'mythtv'))
print("Local files not on MythTv %s " % len(lLocalNotOnMythTv))
for file in lLocalNotOnMythTv:
    print(file)

# Files on the broken disk that are not yet on the mythtv machine.
lBrokenNotOnMythTv = r.sdiff(keys=('broken', 'mythtv'))
print("Broken files not on MythTv %s " % len(lBrokenNotOnMythTv))
for file in lBrokenNotOnMythTv:
    print(file)
This program first deletes the three keys local, mythtv and broken from the Redis database. It then reads some text files which have been created from the various directories involved using ls *.mpg > filename.txt. These were massaged to keep just the filename and to strip off the newline at the end of each line.
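The fixed offsets in the script (49, 55 and 56) depend on the exact prefix of each listing. A less brittle alternative, as a minimal sketch, assuming each line of a listing is a file path: strip the line ending and take the last path component. The names extract_filename and read_filenames are hypothetical and not part of the original script.

```python
import os.path


def extract_filename(line):
    """Return the bare filename from one line of a listing file.

    Assumes the line is a path such as /some/dir/1001.mpg followed by
    a line ending.
    """
    # Remove LF or CRLF line endings before splitting off the directory.
    return os.path.basename(line.rstrip("\r\n"))


def read_filenames(path):
    """Read a listing file and return the set of filenames in it."""
    with open(path) as listing:
        # Skip blank lines so an empty string never ends up in the set.
        return {extract_filename(line) for line in listing if line.strip()}
```

Because the directory prefix is removed by name rather than by position, the same function would work for all three listings.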
Each filename was added to a set under the appropriate key in Redis, using SADD.
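The script sends one SADD per filename. From Redis 2.4 onwards SADD also accepts several members in one call, and redis-py exposes this as sadd(name, *values), so a whole listing could be loaded in a single round trip. A minimal sketch, where load_set is a hypothetical helper and client is assumed to be a redis.Redis() instance:

```python
def load_set(client, key, filenames):
    """Replace the Redis set at `key` with the given filenames.

    `client` is assumed to be a redis.Redis() instance, or anything that
    offers the same delete, sadd and scard methods.
    Returns the size of the resulting set.
    """
    members = list(filenames)
    # Start from a clean slate, as the original script does.
    client.delete(key)
    # SADD needs at least one member, so skip the call for an empty listing.
    if members:
        client.sadd(key, *members)
    return client.scard(key)
```

With a running server this would be called as load_set(redis.Redis(), 'local', names). Duplicate names are harmless, since a set stores each member only once.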
We then print out the size of each set using the Redis command SCARD.
We then use SDIFF to find the files that are in the local working directory but not on the mythtv machine. Finally, we use the same SDIFF command to find out how many files we have left to recover.
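SDIFF returns the members of the first set that are not in any of the following sets, which is the same operation as Python's set difference. The sketch below uses built-in sets and made-up filenames to show what the two queries compute without needing a Redis server. It only illustrates the semantics and is not a replacement for the script.

```python
# Made-up filenames, purely for illustration.
local = {"1001.mpg", "1002.mpg", "1003.mpg"}
mythtv = {"1001.mpg", "1004.mpg"}
broken = {"1001.mpg", "1003.mpg", "1004.mpg", "1005.mpg"}

# Equivalent of SDIFF local mythtv:
# files in the local working directory that are not on the mythtv machine.
local_not_on_mythtv = local - mythtv

# Equivalent of SDIFF broken mythtv:
# files on the broken disk that are not on the mythtv machine.
broken_not_on_mythtv = broken - mythtv

print(sorted(local_not_on_mythtv))   # ['1002.mpg', '1003.mpg']
print(sorted(broken_not_on_mythtv))  # ['1003.mpg', '1005.mpg']
```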