Twofishes indexer does not handle MongoDB query failures properly. #53

@piranha32

Description

While importing data into the DB with "./src/jvm/io/fsq/twofishes/scripts/parse.py -w pwd/data/", the indexer crashed with the error message below. After the crash the indexer hung and no further records were processed. The input data was downloaded with "./src/jvm/io/fsq/twofishes/scripts/download-world.sh".

8246718 INFO  i.f.t.indexer.output.PrefixIndexer - done with 114000 of 2994876 prefixes
8247980 INFO  i.f.t.indexer.output.PrefixIndexer - done with 115000 of 2994876 prefixes
Exception in thread "main" io.fsq.rogue.RogueException: Mongo query on geocoder [db.name_index.find({ "name" : { "$regex" : "^?" , "$options" : ""} , "excludeFromPrefixIndex" : { "$ne" : true}}).sort({ "pop" : -1}).limit(1000)] failed after 2 ms
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:118)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter.queryRunner(MongoClientAdapter.scala:471)
        at io.fsq.rogue.adapter.MongoClientAdapter.query(MongoClientAdapter.scala:496)
        at io.fsq.rogue.query.QueryExecutor.fetch(QueryExecutor.scala:144)
        at io.fsq.twofishes.indexer.output.PrefixIndexer.getRecordsByPrefix(PrefixIndexer.scala:76)
        at io.fsq.twofishes.indexer.output.PrefixIndexer$$anonfun$writeIndexImpl$2.apply(PrefixIndexer.scala:115)
        at io.fsq.twofishes.indexer.output.PrefixIndexer$$anonfun$writeIndexImpl$2.apply(PrefixIndexer.scala:110)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at io.fsq.twofishes.indexer.output.PrefixIndexer.writeIndexImpl(PrefixIndexer.scala:110)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply$mcV$sp(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.util.DurationUtils$.inNanoseconds(DurationUtils.scala:16)
        at io.fsq.twofishes.util.DurationUtils$class.logDuration(DurationUtils.scala:23)
        at io.fsq.twofishes.indexer.output.Indexer.logDuration(Indexer.scala:37)
        at io.fsq.twofishes.indexer.output.Indexer.writeIndex(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.NameIndexer.writeIndexImpl(NameIndexer.scala:84)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply$mcV$sp(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.util.DurationUtils$.inNanoseconds(DurationUtils.scala:16)
        at io.fsq.twofishes.util.DurationUtils$class.logDuration(DurationUtils.scala:23)
        at io.fsq.twofishes.indexer.output.Indexer.logDuration(Indexer.scala:37)
        at io.fsq.twofishes.indexer.output.Indexer.writeIndex(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.OutputIndexes.buildIndexes(OutputHFile.scala:28)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser$.writeIndexes(GeonamesParser.scala:144)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser$.main(GeonamesParser.scala:106)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser.main(GeonamesParser.scala)
Caused by: com.mongodb.MongoQueryException: Query failed with error code 2 and error message 'Regular expression is invalid: invalid UTF-8 string' on server 127.0.0.1:27017
        at com.mongodb.operation.FindOperation$1.call(FindOperation.java:521)
        at com.mongodb.operation.FindOperation$1.call(FindOperation.java:510)
        at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:431)
        at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:404)
        at com.mongodb.operation.FindOperation.execute(FindOperation.java:510)
        at com.mongodb.operation.FindOperation.execute(FindOperation.java:81)
        at com.mongodb.Mongo.execute(Mongo.java:836)
        at com.mongodb.Mongo$2.execute(Mongo.java:823)
        at com.mongodb.OperationIterable.iterator(OperationIterable.java:47)
        at com.mongodb.OperationIterable.forEach(OperationIterable.java:70)
        at com.mongodb.FindIterableImpl.forEach(FindIterableImpl.java:166)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.forEachProcessor(BlockingMongoClientAdapter.scala:162)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.forEachProcessor(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter$$anonfun$query$1.apply(MongoClientAdapter.scala:496)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.findImpl(BlockingMongoClientAdapter.scala:272)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.findImpl(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter$$anonfun$queryRunner$1.apply(MongoClientAdapter.scala:472)
        at io.fsq.rogue.util.DefaultQueryLogger.onExecuteQuery(QueryLogger.scala:52)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:113)
        ... 30 more
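
One possible way to keep a single bad prefix from aborting the whole run is sketched below. It assumes the failing prefix contained a character sequence that cannot be encoded as valid UTF-8 (for example an unpaired surrogate), which the log above suggests but does not confirm. `PrefixGuard`, `isEncodableAsUtf8` and `runForPrefixes` are hypothetical names for illustration and are not part of twofishes or rogue; the `runQuery` callback stands in for whatever call issues the regex query.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of twofishes or rogue.
final class PrefixGuard {

    // Returns true when s can be encoded as strict UTF-8.
    // A fresh CharsetEncoder reports malformed input by default, so an
    // unpaired surrogate makes encode() throw instead of being replaced.
    static boolean isEncodableAsUtf8(String s) {
        try {
            StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Calls runQuery for each prefix and returns the prefixes that were
    // skipped, either because validation failed or because the query threw.
    static List<String> runForPrefixes(List<String> prefixes, Consumer<String> runQuery) {
        List<String> skipped = new ArrayList<>();
        for (String prefix : prefixes) {
            if (!isEncodableAsUtf8(prefix)) {
                skipped.add(prefix);
                continue;
            }
            try {
                runQuery.accept(prefix);
            } catch (RuntimeException e) {
                // Log and continue so one failed query does not stop the run.
                System.err.println("query failed for a prefix of length "
                        + prefix.length() + ": " + e.getMessage());
                skipped.add(prefix);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        String loneSurrogate = String.valueOf((char) 0xD800);
        List<String> skipped = runForPrefixes(
                Arrays.asList("par", loneSurrogate, "ber"),
                prefix -> System.out.println("querying ^" + prefix));
        System.out.println("skipped " + skipped.size() + " prefix(es)");
    }
}
```

Whether skipping is acceptable depends on the index: silently dropping prefixes changes the output, so an alternative is to fail fast with a non-zero exit code so that the process does not hang after the exception.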
