Social Crawler Index
A Mastra AI-powered multi-platform social media crawler with comprehensive sentiment analysis and vectorization.
Twitter Crypto Crawler
A Mastra AI-powered Twitter crawler that monitors specified Twitter accounts for cryptocurrency mentions, analyzes sentiment, and detects potential shill behavior. Built using TypeScript and the Mastra agent framework.
Features
- Twitter Monitoring: Automatically monitors specified Twitter accounts for cryptocurrency-related content
- Intelligent Coin Detection:
  - Detects cryptocurrency mentions without relying on a predefined list
  - Uses pattern matching and context analysis
  - Handles common variations (e.g., BTC/Bitcoin/₿)
  - Assigns confidence scores to detections
- Sentiment Analysis:
  - Uses Google's Gemini 2.0 model for context-aware sentiment analysis
  - Considers crypto-specific context and terminology
  - Provides sentiment scores, labels, and confidence metrics
- Shill Detection:
  - Tracks mention frequency per account
  - Identifies repeated mentions within time windows
  - Considers multiple accounts mentioning the same coins
- Data Storage:
  - Uses TursoDB for efficient and reliable storage
  - Maintains relationships between accounts, tweets, and coins
  - Enables complex queries and analytics
- Real-time Monitoring:
  - Runs every 5 minutes to stay current
  - Handles rate limiting and error recovery
  - Maintains state across runs
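To make the coin detection feature more concrete, here is a minimal sketch of pattern matching with confidence scores. It is not the project's implementation: `CoinDetection`, `ALIASES`, `detectCoins`, the context word list, and the confidence values are all hypothetical, and the small alias table is included only to illustrate variation handling (the feature list says the real detector does not rely on a predefined list).

```typescript
// Hypothetical result shape; field names are illustrative, not the project's types.
interface CoinDetection {
  symbol: string;
  matchedText: string;
  confidence: number; // 0..1, higher means a stronger signal
}

// Illustrative alias table for common variations (assumption, not project data).
const ALIASES: Array<[string, string]> = [
  ["bitcoin", "BTC"],
  ["₿", "BTC"],
  ["ethereum", "ETH"],
];

function detectCoins(text: string): CoinDetection[] {
  // Keep only the highest-confidence detection per symbol.
  const best = new Map<string, CoinDetection>();
  const keep = (d: CoinDetection): void => {
    const prev = best.get(d.symbol);
    if (!prev || d.confidence > prev.confidence) best.set(d.symbol, d);
  };

  // Cashtags such as $SOL are treated as a strong signal.
  const cashtag = /\$([A-Za-z]{2,10})\b/g;
  let m: RegExpExecArray | null;
  while ((m = cashtag.exec(text)) !== null) {
    keep({ symbol: m[1].toUpperCase(), matchedText: m[0], confidence: 0.9 });
  }

  // Known name variants map onto a canonical symbol.
  const lower = text.toLowerCase();
  for (const [alias, symbol] of ALIASES) {
    if (lower.includes(alias)) keep({ symbol, matchedText: alias, confidence: 0.8 });
  }

  // Bare uppercase tickers are ambiguous, so accept them only when the tweet
  // also contains a crypto context word, and with lower confidence.
  const hasContext = /\b(coin|token|crypto|pump|airdrop|bullish|bearish)\b/i.test(text);
  if (hasContext) {
    const bare = /(^|[^$\w])([A-Z]{3,5})\b/g;
    while ((m = bare.exec(text)) !== null) {
      keep({ symbol: m[2], matchedText: m[2], confidence: 0.5 });
    }
  }

  return Array.from(best.values());
}
```

Calling `detectCoins("This token PEPE will pump")` yields a single low-confidence `PEPE` detection, because the bare ticker is accepted only on the strength of the surrounding context words.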
Prerequisites
- Node.js 18 or higher
- TursoDB instance (local or cloud)
- Twitter account credentials
- Google Cloud API key with Gemini access
Installation
- Clone the repository:
git clone <repository-url>
cd memedd-twitter-crawler
- Install dependencies:
npm install
- Create a .env file with your configuration:
# Database Configuration
TURSO_DB_URL=your_turso_db_url
TURSO_DB_AUTH_TOKEN=your_turso_auth_token
# Twitter Credentials
TWITTER_USERNAME=your_twitter_username
TWITTER_PASSWORD=your_twitter_password
TWITTER_EMAIL=your_twitter_email
# AI/ML Services
GEMINI_API_KEY=your_gemini_api_key
# Application Settings
APP_ENV=development
LOG_LEVEL=info
TWEET_FETCH_INTERVAL=300000 # 5 minutes in milliseconds
MAX_TWEETS_PER_FETCH=10
Usage
Building the Project
Build the TypeScript code:
npm run build
Managing Twitter Accounts
Add a Twitter account to monitor:
npm run add-account <username>
Running the Crawler
Start the monitoring service:
npm start
The crawler will:
- Initialize the database schema if needed
- Start monitoring configured Twitter accounts
- Analyze tweets for cryptocurrency mentions
- Perform sentiment analysis on relevant tweets
- Track potential shill behavior
- Store all results in the database
- Run continuously every 5 minutes
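The repeating cycle above can be sketched as a small polling loop driven by the `TWEET_FETCH_INTERVAL` and `MAX_TWEETS_PER_FETCH` settings. This is a sketch under assumptions, not the crawler's actual code: `CrawlerConfig`, `readPositiveInt`, `loadConfig`, and `startPolling` are hypothetical names, and the defaults simply mirror the example `.env` values.

```typescript
// Hypothetical config shape derived from the documented environment variables.
interface CrawlerConfig {
  fetchIntervalMs: number;
  maxTweetsPerFetch: number;
}

// Reads a positive integer, falling back to a default when unset or invalid.
function readPositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Takes the environment as a parameter so it can be tested without process.env.
function loadConfig(env: Record<string, string | undefined>): CrawlerConfig {
  return {
    fetchIntervalMs: readPositiveInt(env.TWEET_FETCH_INTERVAL, 300_000),
    maxTweetsPerFetch: readPositiveInt(env.MAX_TWEETS_PER_FETCH, 10),
  };
}

// Runs one cycle immediately, then schedules the next only after the previous
// one has finished, so a slow cycle never overlaps the following one.
// Returns a function that stops the loop.
function startPolling(runCycle: () => Promise<void>, intervalMs: number): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      await runCycle();
    } catch (err) {
      // A failed cycle is logged and the loop keeps going.
      console.error("crawl cycle failed:", err);
    }
    if (!stopped) timer = setTimeout(tick, intervalMs);
  };

  void tick();
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

In a Node.js entry point this might be wired up as `startPolling(runCycle, loadConfig(process.env).fetchIntervalMs)`, where `runCycle` stands in for whatever function performs one fetch, analyze, and store pass.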
Viewing Results
View the latest analyses and statistics:
npm run view-results
This will show:
- Recent tweet analyses with detected coins and sentiment
- Top mentioned cryptocurrencies
- Account statistics and shill detection results
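As an illustration of how the shill detection results could be derived, the sketch below counts mentions per account and per coin inside a time window. It follows the signals listed under Features (repeated mentions within a window, several accounts mentioning the same coin), but the names `Mention`, `ShillSignal`, and `detectShilling`, the one-hour window, and both thresholds are assumptions rather than the project's actual rules.

```typescript
// Hypothetical input: one record per detected coin mention.
interface Mention {
  accountId: string;
  symbol: string;
  timestamp: number; // milliseconds since epoch
}

// Hypothetical output: one record per coin that looks suspicious.
interface ShillSignal {
  symbol: string;
  repeatOffenders: string[]; // accounts at or above the per-account threshold
  distinctAccounts: number;  // accounts mentioning the coin inside the window
  coordinated: boolean;      // true when enough distinct accounts mention it
}

function detectShilling(
  mentions: Mention[],
  now: number,
  windowMs = 60 * 60 * 1000, // assumed one-hour window
  repeatThreshold = 3,       // assumed mentions per account
  accountThreshold = 3,      // assumed distinct accounts per coin
): ShillSignal[] {
  // symbol -> accountId -> number of mentions inside the window
  const counts = new Map<string, Map<string, number>>();
  for (const m of mentions) {
    if (m.timestamp > now || now - m.timestamp > windowMs) continue;
    const perAccount = counts.get(m.symbol) ?? new Map<string, number>();
    perAccount.set(m.accountId, (perAccount.get(m.accountId) ?? 0) + 1);
    counts.set(m.symbol, perAccount);
  }

  const signals: ShillSignal[] = [];
  counts.forEach((perAccount, symbol) => {
    const repeatOffenders: string[] = [];
    perAccount.forEach((count, accountId) => {
      if (count >= repeatThreshold) repeatOffenders.push(accountId);
    });
    const coordinated = perAccount.size >= accountThreshold;
    if (repeatOffenders.length > 0 || coordinated) {
      signals.push({
        symbol,
        repeatOffenders,
        distinctAccounts: perAccount.size,
        coordinated,
      });
    }
  });
  return signals;
}
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to replay stored mentions and tune the thresholds.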
Database Schema
TwitterAccount
CREATE TABLE twitter_accounts (
id TEXT PRIMARY KEY,
username TEXT UNIQUE NOT NULL,
last_checked DATETIME NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
TweetAnalysis
CREATE TABLE tweet_analyses (
id TEXT PRIMARY KEY,
tweet_id TEXT NOT NULL,
account_id TEXT NOT NULL,
text TEXT NOT NULL,
timestamp DATETIME NOT NULL,
sentiment_score REAL NOT NULL,
sentiment_label TEXT NOT NULL,
sentiment_confidence REAL NOT NULL,
is_shill_tweet INTEGER NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (account_id) REFERENCES twitter_accounts(id)
);
CoinMention
CREATE TABLE coin_mentions (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
symbol TEXT NOT NULL,
confidence REAL NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
Relationships
CREATE TABLE tweet_coin_mentions (
tweet_analysis_id TEXT NOT NULL,
coin_mention_id TEXT NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (tweet_analysis_id, coin_mention_id),
FOREIGN KEY (tweet_analysis_id) REFERENCES tweet_analyses(id),
FOREIGN KEY (coin_mention_id) REFERENCES coin_mentions(id)
);
CREATE TABLE coin_tracker_accounts (
coin_id TEXT NOT NULL,
account_id TEXT NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (coin_id, account_id),
FOREIGN KEY (coin_id) REFERENCES coin_trackers(coin_id),
FOREIGN KEY (account_id) REFERENCES twitter_accounts(id)
);
Development
Project Structure
src/
├── mastra/
│ ├── agents/
│ │ └── twitter-crawler.agent.ts # Main agent implementation
│ ├── db/
│ │ ├── schema.ts # Database schema definitions
│ │ └── client.ts # Database client implementation
│ └── tools/
│ ├── twitter-scraper.tool.ts # Twitter scraping implementation
│ ├── coin-detection.tool.ts # Coin detection logic
│ └── sentiment-analysis.tool.ts # Gemini sentiment analysis
├── scripts/
│ ├── add-account.ts # Account management utility
│ └── view-results.ts # Results viewing utility
└── index.ts # Application entry point
Running in Development Mode
Start with live reload:
npm run dev
Running Tests
Execute the test suite:
npm test
Error Handling
The application includes comprehensive error handling:
- Database Errors: Automatic retries for transient errors
- Twitter API: Rate limit handling and backoff
- Network Issues: Graceful degradation and recovery
- Invalid Data: Robust validation and sanitization
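The retry and backoff behavior listed above can be sketched as a generic wrapper with exponential backoff. This is an assumption about the general technique, not the project's code: `backoffDelayMs`, `sleep`, and `withRetry` are hypothetical helpers, and the base delay, cap, and attempt count are placeholder values.

```typescript
// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries an async operation while the caller-supplied predicate classifies
// the error as transient (e.g. a rate limit or a dropped connection).
// Non-transient errors and the final failed attempt are rethrown unchanged.
async function withRetry<T>(
  operation: () => Promise<T>,
  isTransient: (err: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const lastAttempt = attempt >= maxAttempts - 1;
      if (lastAttempt || !isTransient(err)) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Leaving the transient-error decision to a predicate keeps the wrapper independent of any particular database or scraping library; each call site can decide which of its errors are worth retrying.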
Monitoring and Maintenance
Logs
- Application logs are written to stdout/stderr
- Error logs include stack traces and context
- Successful analyses are logged with timestamps
Health Checks
- Database connectivity is verified on startup
- Twitter API access is validated before crawling
- Gemini API availability is confirmed before analysis
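One way to structure such checks is a list of named probes that run in order and report failures together. The sketch below is illustrative only: `HealthCheck`, `HealthResult`, `runHealthChecks`, and `failedChecks` are hypothetical names, and the actual probes (database ping, Twitter login, Gemini request) are left to the caller.

```typescript
// A named probe that resolves when healthy and rejects when not.
interface HealthCheck {
  name: string;
  check: () => Promise<void>;
}

interface HealthResult {
  name: string;
  ok: boolean;
  error?: string;
}

// Runs every check in order and records the outcome instead of stopping at
// the first failure, so one report covers all dependencies.
async function runHealthChecks(checks: HealthCheck[]): Promise<HealthResult[]> {
  const results: HealthResult[] = [];
  for (const { name, check } of checks) {
    try {
      await check();
      results.push({ name, ok: true });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ name, ok: false, error });
    }
  }
  return results;
}

// Formats the failed checks as "name: reason" lines for logging.
function failedChecks(results: HealthResult[]): string[] {
  return results
    .filter((r) => !r.ok)
    .map((r) => `${r.name}: ${r.error ?? "unknown error"}`);
}
```

A startup routine could refuse to begin crawling when `failedChecks` returns a non-empty list and log each line to stderr.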
Contributing
- Fork the repository
- Create your feature branch:
git checkout -b feature/amazing-feature
- Make your changes and commit:
git commit -m 'Add some amazing feature'
- Push to your branch:
git push origin feature/amazing-feature
- Open a Pull Request
Development Guidelines
- Follow TypeScript best practices
- Maintain test coverage
- Document new features
- Update schema migrations if needed
- Follow the existing code style
Troubleshooting
Common Issues
- Database Connection Errors
  - Verify TursoDB credentials
  - Check network connectivity
  - Ensure database is running
- Twitter Authentication Failures
  - Confirm credentials are correct
  - Check for account restrictions
  - Verify network access
- Sentiment Analysis Errors
  - Validate Gemini API key
  - Check API quotas
  - Verify request formatting
Getting Help
- Open an issue for bugs
- Use discussions for questions
- Check existing issues first
License
This project is licensed under the ISC License - see the LICENSE file for details.
Acknowledgments
- Mastra - AI Agent Framework
- @the-convocation/twitter-scraper - Twitter Scraping Library
- Google Gemini - AI Model
- TursoDB - Database Engine