python - Constructing Zipf Distribution with matplotlib, trying to draw a fitted line
I have a list of paragraphs and want to run a Zipf distribution on their combination.
My code is below:
    from itertools import *
    from pylab import *
    from collections import Counter
    import matplotlib.pyplot as plt

    paragraphs = " ".join(targeted_paragraphs)
    for paragraph in paragraphs:
        frequency = Counter(paragraph.split())
    counts = array(frequency.values())
    tokens = frequency.keys()

    ranks = arange(1, len(counts)+1)
    indices = argsort(-counts)
    frequencies = counts[indices]
    loglog(ranks, frequencies, marker=".")
    title("Zipf plot for combined article paragraphs")
    xlabel("Frequency rank of token")
    ylabel("Absolute frequency of token")
    grid(True)
    for n in list(logspace(-0.5, log10(len(counts)-1), 20).astype(int)):
        dummy = text(ranks[n], frequencies[n], " " + tokens[indices[n]],
                     verticalalignment="bottom",
                     horizontalalignment="left")

At first, I encountered the following error for some reason, and I do not know why:

    IndexError: index 1 out of bounds for axis 0 with size 1

My purpose is to draw "a fitted line" in this graph and assign its value to a variable. However, I do not know how to add that. Any help with both of these issues would be appreciated.
I don't know what targeted_paragraphs looks like, but I got your error using:

    targeted_paragraphs = ['a', 'b', 'c']

Based on that, it looks like the problem is in how you set up the for loop. You're indexing ranks and frequencies using a list generated from the length of counts, which gives you an off-by-one error because (as far as I can tell) ranks, frequencies, and counts should all have the same length. Change the loop index to use len(counts)-1 as below:
    for n in list(logspace(-0.5, log10(len(counts)-1), 20).astype(int)):
        dummy = text(ranks[n], frequencies[n], " " + tokens[indices[n]],
                     verticalalignment="bottom",
                     horizontalalignment="left")
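
For what it's worth, here is a rough sketch of both pieces in plain NumPy (Python 3), without the plotting calls. The helper names ranked_frequencies, label_indices and fit_zipf_line are made up for illustration and are not from the question. It assumes the tokens should be counted over the whole joined text in one pass; that is my guess at the intent, since looping over a joined string visits it one character at a time. It also assumes that a "fitted line" means an ordinary least-squares line through the points in log-log space, which is only one of several ways to fit a Zipf curve.

```python
from collections import Counter

import numpy as np


def ranked_frequencies(text):
    """Hypothetical helper: tokens, ranks and counts, most frequent first."""
    frequency = Counter(text.split())
    tokens = list(frequency.keys())
    counts = np.array(list(frequency.values()))
    # Stable sort so equally frequent tokens keep their first-seen order.
    indices = np.argsort(-counts, kind="stable")
    ranks = np.arange(1, len(counts) + 1)
    return [tokens[i] for i in indices], ranks, counts[indices]


def label_indices(n_tokens, n_labels=20):
    """Hypothetical helper: log-spaced positions, all within 0..n_tokens-1."""
    if n_tokens < 2:
        # log10(0) is undefined, so handle the tiny cases by hand.
        return [0] if n_tokens == 1 else []
    # The upper bound is n_tokens - 1, the last valid index.
    raw = np.logspace(-0.5, np.log10(n_tokens - 1), n_labels).astype(int)
    return sorted({int(i) for i in raw})


def fit_zipf_line(ranks, frequencies):
    """Hypothetical helper: least-squares line in log10-log10 space."""
    slope, intercept = np.polyfit(np.log10(ranks), np.log10(frequencies), 1)
    return slope, intercept


# With a single token and log10(len(counts)) as the upper bound, the last
# generated index is 1, which is out of bounds for an array of size 1.
bad = np.logspace(-0.5, np.log10(1), 20).astype(int)

tokens, ranks, frequencies = ranked_frequencies("a a a a b b c")
slope, intercept = fit_zipf_line(ranks, frequencies)
```

The two returned values are ordinary variables, so the line can be drawn on the existing axes with something like loglog(ranks, 10**intercept * ranks**slope). A slope close to -1 is what an ideal Zipf distribution would give.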