python - Constructing Zipf Distribution with matplotlib, trying to draw fitted line
I have a list of paragraphs, and I want to run a Zipf distribution on their combination.
My code is below:
from itertools import *
from pylab import *
from collections import Counter
import matplotlib.pyplot as plt

paragraphs = " ".join(targeted_paragraphs)
for paragraph in paragraphs:
    frequency = Counter(paragraph.split())
counts = array(frequency.values())
tokens = frequency.keys()

ranks = arange(1, len(counts)+1)
indices = argsort(-counts)
frequencies = counts[indices]
loglog(ranks, frequencies, marker=".")
title("Zipf plot for combined article paragraphs")
xlabel("frequency rank of token")
ylabel("absolute frequency of token")
grid(True)
for n in list(logspace(-0.5, log10(len(counts)-1), 20).astype(int)):
    dummy = text(ranks[n], frequencies[n], " " + tokens[indices[n]],
                 verticalalignment="bottom",
                 horizontalalignment="left")
At first, I encountered the following error for some reason, and I do not know why:
IndexError: index 1 is out of bounds for axis 0 with size 1
My purpose is to draw "a fitted line" in this graph and assign its value to a variable. However, I do not know how to add that. Any help with both of these issues would be much appreciated.
I don't know what targeted_paragraphs looks like, but I got your error using:
targeted_paragraphs = ['a', 'b', 'c']
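One way to see why that input leaves arrays of size 1 is to run the counting loop on its own. This is a minimal sketch using only collections.Counter. The name combined is hypothetical (it is not in the original code), and counting over the whole joined string is an assumption about what was intended.

```python
from collections import Counter

targeted_paragraphs = ['a', 'b', 'c']
paragraphs = " ".join(targeted_paragraphs)  # a single string: "a b c"

# Iterating over a string yields one character at a time, and
# `frequency` is overwritten on every pass, so only the Counter
# built from the last character survives the loop.
for paragraph in paragraphs:
    frequency = Counter(paragraph.split())

print(len(frequency))  # 1

# Assumed intent: count tokens across the whole combined text.
combined = Counter(paragraphs.split())
print(len(combined))  # 3
```

With a single entry in frequency, the arrays derived from it have length 1, so any index of 1 or more is out of bounds.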
Based on that, it looks like the problem is in how you set up the for loop. You're indexing ranks and frequencies using a list generated from the length of counts, and that gives you an off-by-one error because (as far as I can tell) ranks, frequencies, and counts should all have the same length, so the largest valid index is len(counts)-1. Change the loop index to use len(counts)-1, as below:
for n in list(logspace(-0.5, log10(len(counts)-1), 20).astype(int)):
    dummy = text(ranks[n], frequencies[n], " " + tokens[indices[n]],
                 verticalalignment="bottom",
                 horizontalalignment="left")
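The fitted-line part of the question is not covered above. A minimal sketch follows, assuming that "fitted line" means an ordinary least-squares line through the points in log-log space (a common choice for Zipf plots, but not the only one). The names fit_zipf_line, plot_zipf_with_fit, and sample_paragraphs are hypothetical and do not appear in the original code. The fit is written in plain Python; np.polyfit(log_ranks, log_freqs, 1) on the same log10 values should give the same slope and intercept.

```python
import math
from collections import Counter


def fit_zipf_line(frequencies):
    """Least-squares fit of log10(frequency) = slope * log10(rank) + intercept.

    `frequencies` must be sorted in descending order, so that position i
    holds the frequency of rank i + 1. At least two entries are required.
    """
    xs = [math.log10(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log10(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def plot_zipf_with_fit(frequencies, slope, intercept):
    """Draw the rank-frequency points and the fitted line on log-log axes."""
    import matplotlib.pyplot as plt  # local import: the fit itself does not need it

    ranks = list(range(1, len(frequencies) + 1))
    # Convert the line back from log space: f = 10**intercept * rank**slope
    fitted = [10 ** intercept * rank ** slope for rank in ranks]
    plt.loglog(ranks, frequencies, marker=".", linestyle="none", label="observed")
    plt.loglog(ranks, fitted, label="fit (slope = %.2f)" % slope)
    plt.xlabel("frequency rank of token")
    plt.ylabel("absolute frequency of token")
    plt.grid(True)
    plt.legend()
    plt.show()


# Hypothetical stand-in for targeted_paragraphs
sample_paragraphs = ["the cat sat on the mat", "the dog sat", "the end"]
frequency = Counter(" ".join(sample_paragraphs).split())
frequencies = sorted(frequency.values(), reverse=True)

# The fitted line is kept in these two variables
slope, intercept = fit_zipf_line(frequencies)
# plot_zipf_with_fit(frequencies, slope, intercept)  # uncomment to draw the plot
```

Sorting frequency.values() directly also sidesteps a Python 3 pitfall: there, values() and keys() return views, so they need to be wrapped in list() before being passed to array() or indexed.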