DocuQuery: Intelligent Q&A API for PDFs, URLs & CSVs (In Progress)

Overview

DocuQuery is an intelligent, scalable Question & Answer (Q&A) API that extracts relevant information from PDFs, URLs, and CSV files. It uses Natural Language Processing (NLP) and Machine Learning (ML) models to answer questions based on the content of these sources, supporting automated data retrieval and decision support.

This project demonstrates the ability to build a robust API that can interface with diverse data sources, process unstructured content, and deliver precise answers in response to user queries. DocuQuery can be used in various applications, including data extraction, document automation, and interactive knowledge systems.

Key Features

Multi-format Support: Extracts content from PDFs, URLs, and CSV files.
Intelligent Q&A: Utilizes NLP and ML models to provide context-aware answers.
Flexible Integration: Easy to integrate into existing systems via a RESTful API.
Text Extraction: Efficient extraction of text and tables from PDFs and CSVs, and web scraping from URLs.
Real-time Queries: Offers real-time responses to user questions based on document content.
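
To make the multi-format idea concrete, here is a minimal sketch of how an incoming source string could be routed to the right extractor. The helper name `detect_source_type` and its extension-based rules are assumptions for illustration, not part of the project's published code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def detect_source_type(source: str) -> str:
    """Classify a source string as 'pdf', 'csv', or 'url' (hypothetical helper)."""
    parsed = urlparse(source)
    is_remote = parsed.scheme in ("http", "https")
    # For remote sources, inspect only the URL path so that a query string
    # such as "?download=1" does not hide the file extension.
    suffix = PurePosixPath(parsed.path if is_remote else source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".csv":
        return "csv"
    if is_remote:
        # Any other http(s) address is treated as a web page to scrape.
        return "url"
    raise ValueError(f"Unsupported source: {source!r}")
```

A caller could branch on the returned label to pick a PDF, CSV, or web-page extractor. Classifying by extension is a simplification; a stricter version might also check the response's content type.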

Technology Stack

Programming Language: Python
API Framework: Flask (for RESTful API development)
Natural Language Processing: Hugging Face Transformers (for pre-trained language models)
File Parsing:
- PyPDF2 for PDF text extraction
- BeautifulSoup for web scraping from URLs
- Pandas for parsing and reading CSV data
Machine Learning: Pre-trained models (e.g., GPT-3, BERT) for contextual question answering
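
As a rough illustration of how these pieces could fit together, the sketch below wires PyPDF2, BeautifulSoup, Pandas, and a Hugging Face question-answering pipeline behind a single Flask route. The `/ask` route, the JSON field names `source` and `question`, the helper names, and the `distilbert-base-cased-distilled-squad` checkpoint are all assumptions chosen for the example; the project's actual endpoints and models may differ.

```python
from functools import lru_cache
from urllib.request import urlopen

from flask import Flask, jsonify, request


def extract_text(source: str) -> str:
    """Return plain text from an http(s) URL, a local PDF, or a local CSV."""
    lowered = source.lower()
    if lowered.startswith(("http://", "https://")):
        from bs4 import BeautifulSoup  # lazy import: only needed for URLs
        with urlopen(source, timeout=10) as resp:
            html = resp.read()
        return BeautifulSoup(html, "html.parser").get_text(separator=" ", strip=True)
    if lowered.endswith(".pdf"):
        from PyPDF2 import PdfReader  # lazy import: only needed for PDFs
        reader = PdfReader(source)
        # extract_text() may return nothing for scanned or image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if lowered.endswith(".csv"):
        import pandas as pd  # lazy import: only needed for CSVs
        return pd.read_csv(source).to_string(index=False)
    raise ValueError(f"Unsupported source: {source!r}")


@lru_cache(maxsize=1)
def _load_qa():
    """Load the QA pipeline once; the checkpoint name is an assumption."""
    from transformers import pipeline
    return pipeline("question-answering", model="distilbert-base-cased-distilled-squad")


def hf_answer(question: str, context: str) -> dict:
    """Answer a question from the given context with an extractive QA model."""
    result = _load_qa()(question=question, context=context)
    return {"answer": result["answer"], "score": float(result["score"])}


def create_app(extract=extract_text, answer=hf_answer) -> Flask:
    """Build the Flask app; extract/answer are injectable to ease testing."""
    app = Flask(__name__)

    @app.route("/ask", methods=["POST"])  # hypothetical route name
    def ask():
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict):
            payload = {}
        source, question = payload.get("source"), payload.get("question")
        if not source or not question:
            return jsonify(error="'source' and 'question' are required"), 400
        try:
            context = extract(source)
        except ValueError as exc:
            return jsonify(error=str(exc)), 400
        return jsonify(answer(question, context))

    return app
```

With the app running, a client would POST JSON such as `{"source": "report.pdf", "question": "Who wrote it?"}` to `/ask`. Passing `extract` and `answer` into `create_app` keeps the route testable without downloading a model. A real deployment would also need to restrict which paths and URLs callers may request.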
