Stanford NLP - French coreference annotation using CoreNLP
Can anyone tell me the correct settings for performing coreference annotation for French using CoreNLP? I have tried the basic suggestion of editing the properties file:
annotators = tokenize, ssplit, pos, parse, lemma, ner, parse, depparse, mention, coref
tokenize.language = fr
pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger
parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz
and then running this command:
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props frenchprops.properties -file frenchfile.txt
which produces the following output log:
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator pos
reading pos tagger model edu/stanford/nlp/models/pos-tagger/french/french.tagger ... done [0.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - loading parser serialized file edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz ... done [2.2 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator ner
loading classifier edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2.0 sec].
loading classifier edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.7 sec].
loading classifier edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.9 sec].
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - initializing jollydayholiday sutime classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml sutime.binder.1.
reading tokensregex rules edu/stanford/nlp/models/sutime/defs.sutime.txt
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFORMACIÓN: read 83 rules
reading tokensregex rules edu/stanford/nlp/models/sutime/english.sutime.txt
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFORMACIÓN: read 267 rules
reading tokensregex rules edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFORMACIÓN: read 25 rules
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator parse
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator depparse
loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
precomputed 100000, elapsed time: 1.639 (s)
initializing dependency parser done [6.4 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator mention
using mention detector type: rule
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - adding annotator coref
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3097)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2892)
    at java.io.ObjectInputStream.readString(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at java.util.HashMap.readObject(HashMap.java:1402)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at edu.stanford.nlp.io.IOUtils.readObjectFromURLOrClasspathOrFileSystem(IOUtils.java:324)
    at edu.stanford.nlp.scoref.SimpleLinearClassifier.<init>(SimpleLinearClassifier.java:30)
    at edu.stanford.nlp.scoref.PairwiseModel.<init>(PairwiseModel.java:75)
    at edu.stanford.nlp.scoref.PairwiseModel$Builder.build(PairwiseModel.java:57)
    at edu.stanford.nlp.scoref.ClusteringCorefSystem.<init>(ClusteringCorefSystem.java:31)
    at edu.stanford.nlp.scoref.StatisticalCorefSystem.fromProps(StatisticalCorefSystem.java:48)
    at edu.stanford.nlp.pipeline.CorefAnnotator.<init>(CorefAnnotator.java:66)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.coref(AnnotatorImplementations.java:220)
    at edu.stanford.nlp.pipeline.AnnotatorFactories$13.create(AnnotatorFactories.java:515)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:375)
This made me think that some configuration is missing.
AFAIK, CoreNLP doesn't offer coreference resolution for French (see http://stanfordnlp.github.io/corenlp/coref.html).
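If the goal is just to get the rest of the French pipeline running, one option is to drop the coreference stage from the annotator list before starting CoreNLP. The following is a minimal sketch using only the Java standard library; the class name `FrenchPropsWithoutCoref` and the helper `withoutCoref` are made up for illustration and are not part of CoreNLP. It assumes `mention` and `coref` are the only annotators that need to go, and it also collapses the duplicated `parse` entry from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper, not a CoreNLP class.
class FrenchPropsWithoutCoref {

    // Assumption: these two annotators form the coreference stage in the question's pipeline.
    private static final Set<String> COREF_STAGE =
            new HashSet<>(Arrays.asList("mention", "coref"));

    // Returns the annotator list without the coreference stage,
    // keeping the original order and dropping duplicate entries.
    static String withoutCoref(String annotators) {
        Set<String> kept = new LinkedHashSet<>();
        for (String name : annotators.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty() && !COREF_STAGE.contains(trimmed)) {
                kept.add(trimmed);
            }
        }
        return String.join(", ", kept);
    }

    public static void main(String[] args) throws IOException {
        // The annotator and language settings from the question, inlined for the example.
        String original =
                "annotators = tokenize, ssplit, pos, parse, lemma, ner, parse, depparse, mention, coref\n"
              + "tokenize.language = fr\n";

        Properties props = new Properties();
        props.load(new StringReader(original));
        props.setProperty("annotators", withoutCoref(props.getProperty("annotators")));

        System.out.println("annotators = " + props.getProperty("annotators"));
    }
}
```

Note that the log in the question shows `ner` and `depparse` loading English models, so whether to keep those two annotators for French text is a separate decision that this sketch does not make.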