Code for the Text Engineering seminar (see the seminar plan)

| Package | Content | Resources/Dependencies | Literature |
|---|---|---|---|
| basic | Corpus, linear search, term-document matrix | Shakespeare | IIR Ch. 1 |
| boole | Inverted index, list intersection, preprocessing, positional index, PositionalIntersect | | IIR Ch. 1 + 2 |
| ranked | Ranked retrieval: term weighting, vector space model | | IIR Ch. 6 + 7 |
| evaluation | Evaluation: precision, recall, F-measure | | IIR Ch. 8 |
| lucene | Lucene: Indexer and Searcher | lucene-core, lucene-queryparser, lucene-analyzers-common | Lucene in Action |
| web | Crawler, WebDocument | commons-io, nekohtml, jrobotx | IIR Ch. 19 + 20 |
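
The `boole` row names an inverted index and list intersection. As a rough illustration of those two ideas, here is a minimal sketch. The class `InvertedIndexSketch`, its method names, and the naive tokenization are assumptions made for this example; they are not taken from the seminar code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the seminar's actual classes.
class InvertedIndexSketch {

    // Maps each term to the ascending list of IDs of the documents containing it.
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    InvertedIndexSketch(List<String> docs) {
        for (int docId = 0; docId < docs.size(); docId++) {
            // Naive preprocessing (an assumption): lower-case, split on non-letters.
            for (String term : docs.get(docId).toLowerCase().split("[^\\p{L}]+")) {
                if (term.isEmpty()) continue;
                List<Integer> list = postings.computeIfAbsent(term, t -> new ArrayList<>());
                // Doc IDs arrive in ascending order, so the list stays sorted;
                // skip the ID if this document was already recorded for the term.
                if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                    list.add(docId);
                }
            }
        }
    }

    List<Integer> postingsFor(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptyList());
    }

    // Merge-style intersection of two sorted postings lists:
    // advance the pointer at the smaller doc ID, emit on equality.
    static List<Integer> intersect(List<Integer> p1, List<Integer> p2) {
        List<Integer> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < p1.size() && j < p2.size()) {
            int a = p1.get(i), b = p2.get(j);
            if (a == b) { result.add(a); i++; j++; }
            else if (a < b) i++;
            else j++;
        }
        return result;
    }

    // Boolean AND query over two terms.
    List<Integer> and(String t1, String t2) {
        return intersect(postingsFor(t1), postingsFor(t2));
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch(Arrays.asList(
                "Brutus killed Caesar",
                "Caesar was ambitious",
                "Brutus and Caesar and Calpurnia"));
        System.out.println(index.and("Brutus", "Caesar")); // prints [0, 2]
    }
}
```

Because both postings lists are sorted, the intersection needs only a single pass over each list and never has to scan the documents themselves.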

| Package | Content | Resources/Dependencies | Literature |
|---|---|---|---|
| document | Document, Topics, TermIndex, FeatureVector | | |
| corpus | Corpus, DB, DocumentIndex, Crawler | db4o, crawler (see package ir.web) | |
| classification | TextClassifier, Naive Bayes | | |
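
The `classification` row mentions Naive Bayes. To show roughly what such a classifier does, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The class `NaiveBayesSketch`, its methods, the whitespace tokenization, and the toy training data are assumptions for illustration; the seminar's `TextClassifier` may be designed quite differently.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the seminar's TextClassifier.
class NaiveBayesSketch {

    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Record one labeled training document.
    void train(String label, String text) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = termCounts.computeIfAbsent(label, l -> new HashMap<>());
        for (String term : tokenize(text)) {
            if (term.isEmpty()) continue;
            counts.merge(term, 1, Integer::sum);
            tokensPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(term);
        }
    }

    // Return the label with the highest log posterior, or null if untrained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // Log prior: fraction of training documents with this label.
            double score = Math.log((double) docsPerClass.get(label) / totalDocs);
            Map<String, Integer> counts = termCounts.get(label);
            int tokens = tokensPerClass.getOrDefault(label, 0);
            for (String term : tokenize(text)) {
                if (!vocabulary.contains(term)) continue; // ignore terms never seen in training
                int count = counts.getOrDefault(term, 0);
                // Add-one (Laplace) smoothed term likelihood, in log space.
                score += Math.log((count + 1.0) / (tokens + vocabulary.size()));
            }
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }

    // Naive tokenization (an assumption): lower-case and split on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("china", "Chinese Beijing Chinese");
        nb.train("china", "Chinese Chinese Shanghai");
        nb.train("china", "Chinese Macao");
        nb.train("other", "Tokyo Japan Chinese");
        System.out.println(nb.classify("Chinese Chinese Chinese Tokyo Japan")); // prints china
    }
}
```

Scores are summed in log space so that products of many small probabilities do not underflow.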