Daily Papers: Personalized ArXiv Research Digest

Overview

Daily Papers is an intelligent research paper aggregation and summarization system that helps researchers stay up-to-date with the latest developments in their field. By leveraging Large Language Models (LLMs), it automatically curates personalized daily digests of relevant papers from arXiv and HuggingFace.

View on GitHub

Key Features

Multi-Source Paper Aggregation

ArXiv Integration: Fetches papers from specified arXiv categories (cs.AI, cs.LG, cs.CV, cs.CL, cs.RO, etc.)
HuggingFace Trending: Tracks trending and daily papers from the HuggingFace platform
Social Metrics: Displays upvotes, GitHub stars, and trending indicators

AI-Powered Curation

Smart Filtering: LLM-based relevance assessment based on your research interests and custom prompts
Intelligent Ranking: Automatically selects the most valuable papers when count exceeds limit
Deep Summarization: Generates comprehensive summaries highlighting key contributions, methodology, and results

Automated Delivery

Email Reports: Markdown and HTML-formatted reports delivered directly to your inbox
Multi-User Support: Configure multiple users with independent settings and preferences
Cost Tracking: Monitors token usage and provides cost estimates

Deployment Options

GitHub Actions: Free, zero-maintenance deployment using GitHub’s infrastructure
Local Scheduler: Option to run on local machines with customizable scheduling
Minimal Cost: Approximately $0.50/day using DeepSeek V3 (~$15/month)

Technical Implementation

Architecture

Built in Python with a modular design for easy customization
RESTful API integration with multiple LLM providers (DeepSeek, OpenAI-compatible endpoints)
Configurable filtering and summarization prompts for different research domains
Robust error handling and retry mechanisms

Workflow

Fetch: Retrieves latest papers from configured sources
Filter: LLM evaluates relevance based on user preferences
Rank: Selects top papers when volume exceeds threshold
Summarize: Generates detailed analysis of each paper
Deliver: Formats and sends email reports with original abstracts and AI summaries

Configuration

Users can customize:

arXiv categories and lookback window
Maximum papers per digest
Custom filtering, ranking, and summarization prompts
Multiple user profiles with independent settings

Use Cases

Researchers: Stay current with developments in specific subfields
PhD Students: Track relevant papers for literature reviews
Industry Professionals: Monitor practical applications and sota techniques
Study Groups: Share curated digests with team members

Cost Efficiency

With DeepSeek V3 pricing (~$0.27/M input tokens, ~$1.10/M output tokens), the system costs approximately:

$0.50/day for 10 papers (~$15/month)
Alternative models: DeepSeek V3.2-Exp, Qwen-Turbo, GLM-4-Long (~$0.30-0.40/day)
Free option: Gemini 2.0 Flash with generous free tier

Acknowledgments

Inspired by:

Share on

Bluesky Facebook LinkedIn X (formerly Twitter)

Zehua Wang