python - How can I make this code quicker?


I have a lot of 750x750 images. I want to take the geometric mean of the non-overlapping 5x5 patches of each image and then, for each image, average those geometric means to create one feature per image. I wrote the code below, and it seems to work fine. But I know it's not very efficient. Running it on 300 or so images takes around 60 seconds, and I have about 3000 images. So, while it works for my purpose, it's not efficient. How can I improve this code?

import numpy as np
from scipy import misc  # misc.imread exists only in older SciPy releases
from scipy.stats import gmean

# subfolders and get_all_images() are defined elsewhere (not shown).

# Each sublist of gmeans will contain a list of 22500 geometric means
# corresponding to the non-overlapping 5x5 patches of a given image.
gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]

# The loop here populates gmeans.
for folder in range(len(subfolders)):
    just_thefilename, colorsourceimages, graycroppedfiles = get_all_images(folder)
    for items in graycroppedfiles:
        myarray = misc.imread(items)
        area_of_big_matrix = 750*750
        area_of_small_matrix = 5*5
        how_many = area_of_big_matrix / area_of_small_matrix
        n = 0
        p = 0
        mylist = []
        while len(mylist) < how_many:
            mylist.append(gmean(myarray[n:n+5, p:p+5], None))
            n = n + 5
            if n == 750:
                p = p + 5
                n = 0
        gmeans[folder].append(mylist)

# Each sublist of mean_of_gmeans will contain one feature per image:
# the mean of the geometric means of the 5x5 patches.
mean_of_gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]
for folder in range(len(subfolders)):
    for items in range(len(gmeans[0])):
        mean_of_gmeans[folder].append((np.mean(gmeans[folder][items], dtype=np.float64)))

I can understand the suggestion to move this to the Code Review site, but this problem provides a nice example of the power of using vectorized numpy and scipy functions, so I'll give an answer.

The function below, cleverly called func, computes the desired value. The key is to reshape the image into a four-dimensional array. That array can then be interpreted as a two-dimensional array of two-dimensional arrays, where the inner arrays are the 5x5 blocks.

scipy.stats.gmean can compute the geometric mean over more than one dimension, so it is used to reduce the four-dimensional array to the desired two-dimensional array of geometric means. The return value is the (arithmetic) mean of those geometric means.

import numpy as np
from scipy.stats import gmean


def func(img, blocksize=5):
    # img must be a 2-d array whose dimensions are divisible by blocksize.
    if (img.shape[0] % blocksize) != 0 or (img.shape[1] % blocksize) != 0:
        raise ValueError("blocksize does not divide the shape of img.")

    # Reshape 'img' into a 4-d array 'blocks', so that blocks[i, :, j, :]
    # is the subarray with shape (blocksize, blocksize).
    blocks_nrows = img.shape[0] // blocksize
    blocks_ncols = img.shape[1] // blocksize
    blocks = img.reshape(blocks_nrows, blocksize, blocks_ncols, blocksize)

    # Compute the geometric mean over axes 1 and 3 of 'blocks'.  This results
    # in an array of geometric means with size (blocks_nrows, blocks_ncols).
    gmeans = gmean(blocks, axis=(1, 3), dtype=np.float64)

    # The return value is the average of 'gmeans'.
    avg = gmeans.mean()

    return avg
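To see what the reshape is doing, here is a small sketch on a 4x6 array split into 2x3 blocks. The sizes and the names bh and bw are chosen only for illustration; rectangular blocks make it easier to tell the axes apart. It checks that blocks[i, :, j, :] matches the corresponding slice of the original array.

```python
import numpy as np

# Illustrative sizes only: a 4x6 array split into 2x3 blocks.
img = np.arange(24).reshape(4, 6)
bh, bw = 2, 3  # block height and width (hypothetical names)

# Axes of 'blocks': (block row, row within block, block column, column within block).
blocks = img.reshape(img.shape[0] // bh, bh, img.shape[1] // bw, bw)

# blocks[i, :, j, :] is the block in block-row i and block-column j.
for i in range(blocks.shape[0]):
    for j in range(blocks.shape[2]):
        direct = img[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
        assert np.array_equal(blocks[i, :, j, :], direct)

print(blocks.shape)  # (2, 2, 2, 3)
```

Because the block's own rows and columns sit on axes 1 and 3, reducing over those two axes leaves one value per block.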

For example, here the function is applied to an array with shape (750, 750).

In [358]: np.random.seed(123)

In [359]: img = np.random.randint(1, 256, size=(750, 750)).astype(np.uint8)

In [360]: func(img)
Out[360]: 97.035648309350179

It isn't easy to verify that this is the correct result, so here is a much smaller example:

In [365]: np.random.seed(123)

In [366]: img = np.random.randint(1, 4, size=(3, 6))

In [367]: img
Out[367]:
array([[3, 2, 3, 3, 1, 3],
       [3, 2, 3, 2, 3, 2],
       [1, 2, 3, 2, 1, 3]])

In [368]: func(img, blocksize=3)
Out[368]: 2.1863131342986666

Here is the direct calculation:

In [369]: 0.5*(gmean(img[:, :3], axis=None) + gmean(img[:, 3:], axis=None))
Out[369]: 2.1863131342986666
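Since the question involves thousands of images, the same reshape idea can probably be applied to a whole stack at once. The sketch below is not part of the answer's func. It assumes the images have already been loaded into a single array of shape (n_images, height, width) with strictly positive values, and it swaps scipy.stats.gmean for the equivalent NumPy expression exp(mean(log(x))). The helper name mean_of_block_gmeans is hypothetical. The result is cross-checked against an explicit loop over patches on small random data.

```python
import numpy as np


def mean_of_block_gmeans(stack, blocksize=5):
    """Hypothetical helper: one feature per image for a stack of images.

    Assumes stack has shape (n_images, height, width) and that all
    values are strictly positive (log of zero is -inf).
    """
    n, h, w = stack.shape
    if h % blocksize != 0 or w % blocksize != 0:
        raise ValueError("blocksize does not divide the image shape.")
    blocks = stack.reshape(n, h // blocksize, blocksize,
                           w // blocksize, blocksize)
    # Geometric mean of each block, written as exp(mean(log(values))).
    gmeans = np.exp(np.log(blocks.astype(np.float64)).mean(axis=(2, 4)))
    # Average the block geometric means within each image.
    return gmeans.mean(axis=(1, 2))


# Cross-check against an explicit loop on small random data.
rng = np.random.default_rng(0)
stack = rng.integers(1, 256, size=(3, 10, 15)).astype(np.uint8)
fast = mean_of_block_gmeans(stack)

slow = []
for img in stack:
    vals = []
    for r in range(0, img.shape[0], 5):
        for c in range(0, img.shape[1], 5):
            patch = img[r:r+5, c:c+5].astype(np.float64)
            vals.append(np.exp(np.log(patch).mean()))
    slow.append(np.mean(vals))

assert np.allclose(fast, slow)
```

Whether stacking is practical depends on memory: 3000 images of 750x750 uint8 pixels take roughly 1.7 GB, and the float64 logarithm is eight times larger, so processing the stack in chunks may be the safer choice.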
