Using Redis with Python

May 1, 2010

I have a dodgy hard drive, which contains some recordings from a mythtv box. This disk works ok after boot for a while, and then throws a wobbler, marking some of the files on it as zero length. I needed a plan for getting the files off it, but I had to do it piecemeal and I didn't have a big enough disk to store a complete copy locally. The disk on my mythtv server was big enough, but it was across a wireless network and therefore slow.

What I needed was a way to work out which files were where, and find out how many files I had left to recover.

Enter Redis and Python:

# Requires Python 3 and the redis-py package.
import redis

# decode_responses=True makes redis-py return str instead of bytes.
r = redis.Redis(decode_responses=True)

# Clear any sets left over from a previous run.
r.delete('local', 'mythtv', 'broken')

# (set key, listing file, number of leading characters to drop from each line)
lListings = (
  ('local', 'local.txt', 49),
  ('mythtv', 'mythtv.txt', 55),
  ('broken', 'broken.txt', 56),
)

for lKey, lPath, lPrefixLength in lListings:
  with open(lPath) as lFiles:
    for line in lFiles:
      # Drop the leading characters and the trailing newline.
      lFilename = line[lPrefixLength:].rstrip('\n')
      r.sadd(lKey, lFilename)
      print(lFilename)

print("Local files %s" % r.scard('local'))
print("MythTv files %s" % r.scard('mythtv'))
print("Broken files %s" % r.scard('broken'))

# Members of 'local' that are not in 'mythtv'.
lLocalNotOnMythTv = r.sdiff('local', 'mythtv')
print("Local files not on MythTv %s" % len(lLocalNotOnMythTv))
for lFilename in sorted(lLocalNotOnMythTv):
  print(lFilename)

# Members of 'broken' that are not in 'mythtv'.
lBrokenNotOnMythTv = r.sdiff('broken', 'mythtv')
print("Broken files not on MythTv %s" % len(lBrokenNotOnMythTv))
for lFilename in sorted(lBrokenNotOnMythTv):
  print(lFilename)

This program first deletes the three keys local, mythtv and broken from the Redis database. It then reads three text files, one for each of the directories involved, which were created using ls *.mpg > filename.txt. Each line is sliced to keep just the filename and to strip off the newline at the end of the line.
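The slicing step can be sketched on its own as follows. The helper name extract_filename and the sample path are made up for illustration; the real offsets (49, 55 and 56) depend on whatever precedes the filename in each listing.

```python
def extract_filename(line, prefix_length):
    """Drop a fixed-width prefix and the trailing newline from one listing line."""
    return line[prefix_length:].rstrip("\n")


# Hypothetical listing line; the directory is invented for this example.
prefix = "/hypothetical/recordings/"
line = prefix + "1001_20100501.mpg\n"

print(extract_filename(line, len(prefix)))
```

Using rstrip("\n") rather than a [:-1] slice means the last line of a file is handled correctly even if it has no trailing newline.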

Each filename is added to a set under the appropriate key in Redis, using SADD.

We then print out the size of each set using the redis command SCARD.
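What SADD and SCARD do can be modelled, roughly, with Python's built-in set, with no Redis server involved. The sadd helper below is a stand-in written for this example, not part of redis-py, and the filenames are made up. Like the real command, it reports how many members were newly added, so duplicates are silently ignored.

```python
def sadd(target, *members):
    """Mimic SADD: add members and return how many were not already present."""
    before = len(target)
    target.update(members)
    return len(target) - before


recordings = set()
added_first = sadd(recordings, "1001.mpg", "1002.mpg")
added_again = sadd(recordings, "1002.mpg", "1003.mpg")  # "1002.mpg" is a duplicate

# len() plays the role of SCARD here.
print(added_first, added_again, len(recordings))
```

This is why the script can be fed listings with repeated entries without inflating the counts.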

We then use SDIFF to find the files that are in the local working directory but not on the mythtv machine. Finally, we use SDIFF again, this time on broken and mythtv, to find out how many files we have left to recover.
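SDIFF returns the members of the first set that appear in none of the sets that follow, so the order of the keys matters. As a small illustration, the same two questions can be asked of plain Python sets. The filenames here are invented for the example.

```python
# Hypothetical filenames, invented for illustration only.
local = {"1001.mpg", "1002.mpg", "1003.mpg"}
mythtv = {"1001.mpg", "1002.mpg"}
broken = {"1002.mpg", "1003.mpg", "1004.mpg"}

# Equivalent of SDIFF local mythtv: in the working directory, not yet on the server.
local_not_on_mythtv = local - mythtv

# Equivalent of SDIFF broken mythtv: on the failing disk, not yet on the server.
broken_not_on_mythtv = broken - mythtv

print(sorted(local_not_on_mythtv))
print(sorted(broken_not_on_mythtv))
```

Swapping the operands (mythtv - local) would answer a different question: which files are on the server but not in the working directory.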

Tags: redis python