Social Crawler Index

A Mastra AI-powered multi-platform social media crawler with comprehensive sentiment analysis and vectorization.

Twitter Crypto Crawler

A Mastra AI-powered Twitter crawler that monitors specified Twitter accounts for cryptocurrency mentions, analyzes sentiment, and detects potential shill behavior. Built using TypeScript and the Mastra agent framework.

Features

  • Twitter Monitoring: Automatically monitors specified Twitter accounts for cryptocurrency-related content
  • Intelligent Coin Detection:
    • Detects cryptocurrency mentions without relying on a predefined list
    • Uses pattern matching and context analysis
    • Handles common variations (e.g., BTC/Bitcoin/₿)
    • Assigns confidence scores to detections
  • Sentiment Analysis:
    • Uses Google's Gemini 2.0 model for context-aware sentiment analysis
    • Considers crypto-specific context and terminology
    • Provides sentiment scores, labels, and confidence metrics
  • Shill Detection:
    • Tracks mention frequency per account
    • Identifies repeated mentions within time windows
    • Considers multiple accounts mentioning the same coins
  • Data Storage:
    • Uses TursoDB for efficient and reliable storage
    • Maintains relationships between accounts, tweets, and coins
    • Enables complex queries and analytics
  • Real-time Monitoring:
    • Runs every 5 minutes by default (set by TWEET_FETCH_INTERVAL)
    • Handles rate limiting and error recovery
    • Maintains state across runs
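
The coin detection described above (pattern matching plus context, with a confidence score per detection) could look roughly like the following. This is a minimal sketch, not the project's actual implementation: the function name detectCoins, the regular expressions, the context word list, and the confidence values are all assumptions chosen for illustration. It does not attempt name or symbol variations such as Bitcoin/₿.

```typescript
// Hypothetical sketch: names, patterns, and scores are illustrative assumptions.
interface CoinCandidate {
  symbol: string;
  confidence: number; // 0 to 1
}

// Cashtags such as $BTC are treated as a strong signal.
const CASHTAG = /\$([A-Za-z]{2,10})\b/g;
// Bare 3-5 letter uppercase words are a weak signal on their own.
const BARE_TICKER = /\b[A-Z]{3,5}\b/g;
// Nearby trading vocabulary raises confidence for bare tickers.
const CRYPTO_CONTEXT = /\b(buy|sell|hodl|pump|moon|token|coin|airdrop)\b/i;

function detectCoins(text: string): CoinCandidate[] {
  // Keep the highest confidence seen for each symbol.
  const best = new Map<string, number>();
  const record = (symbol: string, confidence: number): void => {
    const key = symbol.toUpperCase();
    best.set(key, Math.max(best.get(key) ?? 0, confidence));
  };

  for (const match of text.matchAll(CASHTAG)) record(match[1], 0.9);

  const hasContext = CRYPTO_CONTEXT.test(text);
  for (const match of text.matchAll(BARE_TICKER)) {
    record(match[0], hasContext ? 0.6 : 0.3);
  }

  return Array.from(best, ([symbol, confidence]) => ({ symbol, confidence }));
}
```

Because no coin list is consulted, ordinary uppercase words (for example "USA") would also be returned with low confidence, so a caller would likely filter on a confidence threshold.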

Prerequisites

  • Node.js 18 or higher
  • TursoDB instance (local or cloud)
  • Twitter account credentials
  • Google Cloud API key with Gemini access

Installation

  1. Clone the repository:
git clone <repository-url>
cd memedd-twitter-crawler
  2. Install dependencies:
npm install
  3. Create a .env file with your configuration:
# Database Configuration
TURSO_DB_URL=your_turso_db_url
TURSO_DB_AUTH_TOKEN=your_turso_auth_token

# Twitter Credentials
TWITTER_USERNAME=your_twitter_username
TWITTER_PASSWORD=your_twitter_password
TWITTER_EMAIL=your_twitter_email

# AI/ML Services
GEMINI_API_KEY=your_gemini_api_key

# Application Settings
APP_ENV=development
LOG_LEVEL=info
TWEET_FETCH_INTERVAL=300000 # 5 minutes in milliseconds
MAX_TWEETS_PER_FETCH=10
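
As an illustration of how the application settings above might be read, here is a small sketch that parses the numeric values and falls back to the defaults shown in the .env example. The names CrawlerConfig, readPositiveInt, and loadConfig are hypothetical and not taken from the project.

```typescript
// Hypothetical sketch of config loading; the project may do this differently.
interface CrawlerConfig {
  fetchIntervalMs: number;
  maxTweetsPerFetch: number;
  logLevel: string;
}

// Parse a positive integer, falling back when the value is missing or invalid.
// parseInt reads only the leading digits, so trailing text is ignored.
function readPositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt((raw ?? "").trim(), 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Record<string, string | undefined>): CrawlerConfig {
  return {
    fetchIntervalMs: readPositiveInt(env.TWEET_FETCH_INTERVAL, 300_000),
    maxTweetsPerFetch: readPositiveInt(env.MAX_TWEETS_PER_FETCH, 10),
    logLevel: env.LOG_LEVEL ?? "info",
  };
}
```

In a Node.js process this would typically be called as loadConfig(process.env) once the .env file has been loaded.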

Usage

Building the Project

Build the TypeScript code:

npm run build

Managing Twitter Accounts

Add a Twitter account to monitor:

npm run add-account <username>

Running the Crawler

Start the monitoring service:

npm start

The crawler will:

  1. Initialize the database schema if needed
  2. Start monitoring configured Twitter accounts
  3. Analyze tweets for cryptocurrency mentions
  4. Perform sentiment analysis on relevant tweets
  5. Track potential shill behavior
  6. Store all results in the database
  7. Repeat the cycle every 5 minutes by default (set by TWEET_FETCH_INTERVAL)
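
Step 5 (tracking potential shill behavior) is described in the Features list as counting repeated mentions within a time window and noting when several accounts mention the same coin. One possible shape for that logic is sketched below. The Mention type, the function names, the one-hour window, and the threshold of three mentions are assumptions for illustration, not values from the project.

```typescript
// Hypothetical sketch of window-based shill signals; thresholds are assumptions.
interface Mention {
  accountId: string;
  symbol: string;
  timestampMs: number; // Unix epoch milliseconds
}

const ONE_HOUR_MS = 3_600_000;

// Mentions of a symbol that fall inside the window ending at nowMs.
function recentMentions(
  mentions: Mention[], symbol: string, nowMs: number, windowMs: number,
): Mention[] {
  return mentions.filter(
    (m) => m.symbol === symbol && m.timestampMs <= nowMs && nowMs - m.timestampMs <= windowMs,
  );
}

// True when one account mentioned the symbol at least `threshold` times in the window.
function isRepeatedMention(
  mentions: Mention[], accountId: string, symbol: string, nowMs: number,
  windowMs: number = ONE_HOUR_MS, threshold: number = 3,
): boolean {
  const own = recentMentions(mentions, symbol, nowMs, windowMs)
    .filter((m) => m.accountId === accountId);
  return own.length >= threshold;
}

// Number of distinct accounts that mentioned the symbol in the window.
function distinctAccounts(
  mentions: Mention[], symbol: string, nowMs: number, windowMs: number = ONE_HOUR_MS,
): number {
  return new Set(recentMentions(mentions, symbol, nowMs, windowMs).map((m) => m.accountId)).size;
}
```

The two signals are kept separate here so that a caller can decide how to combine them when setting a flag such as is_shill_tweet.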

Viewing Results

View the latest analyses and statistics:

npm run view-results

This will show:

  • Recent tweet analyses with detected coins and sentiment
  • Top mentioned cryptocurrencies
  • Account statistics and shill detection results

Database Schema

TwitterAccount

CREATE TABLE twitter_accounts (
  id TEXT PRIMARY KEY,
  username TEXT UNIQUE NOT NULL,
  last_checked DATETIME NOT NULL,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);

TweetAnalysis

CREATE TABLE tweet_analyses (
  id TEXT PRIMARY KEY,
  tweet_id TEXT NOT NULL,
  account_id TEXT NOT NULL,
  text TEXT NOT NULL,
  timestamp DATETIME NOT NULL,
  sentiment_score REAL NOT NULL,
  sentiment_label TEXT NOT NULL,
  sentiment_confidence REAL NOT NULL,
  is_shill_tweet INTEGER NOT NULL,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (account_id) REFERENCES twitter_accounts(id)
);

CoinMention

CREATE TABLE coin_mentions (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  symbol TEXT NOT NULL,
  confidence REAL NOT NULL,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);

Relationships

CREATE TABLE tweet_coin_mentions (
  tweet_analysis_id TEXT NOT NULL,
  coin_mention_id TEXT NOT NULL,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (tweet_analysis_id, coin_mention_id),
  FOREIGN KEY (tweet_analysis_id) REFERENCES tweet_analyses(id),
  FOREIGN KEY (coin_mention_id) REFERENCES coin_mentions(id)
);

CREATE TABLE coin_tracker_accounts (
  coin_id TEXT NOT NULL,
  account_id TEXT NOT NULL,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (coin_id, account_id),
  FOREIGN KEY (coin_id) REFERENCES coin_trackers(coin_id),
  FOREIGN KEY (account_id) REFERENCES twitter_accounts(id)
);
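
When reading from tweet_analyses in TypeScript, rows usually need a small conversion step, since the table stores is_shill_tweet as an INTEGER and timestamps as DATETIME text. The sketch below shows one way to map a row to an application type. The interface and function names are hypothetical, and it assumes timestamps are stored as ISO 8601 strings with an explicit time zone.

```typescript
// Hypothetical row mapping; column names follow the tweet_analyses table above.
interface TweetAnalysisRow {
  id: string;
  tweet_id: string;
  account_id: string;
  text: string;
  timestamp: string; // assumed ISO 8601, e.g. "2024-01-01T00:00:00Z"
  sentiment_score: number;
  sentiment_label: string;
  sentiment_confidence: number;
  is_shill_tweet: number; // 0 or 1
}

interface TweetAnalysis {
  id: string;
  tweetId: string;
  accountId: string;
  text: string;
  timestamp: Date;
  sentiment: { score: number; label: string; confidence: number };
  isShillTweet: boolean;
}

function toTweetAnalysis(row: TweetAnalysisRow): TweetAnalysis {
  return {
    id: row.id,
    tweetId: row.tweet_id,
    accountId: row.account_id,
    text: row.text,
    timestamp: new Date(row.timestamp),
    sentiment: {
      score: row.sentiment_score,
      label: row.sentiment_label,
      confidence: row.sentiment_confidence,
    },
    // The INTEGER column becomes a boolean in application code.
    isShillTweet: row.is_shill_tweet !== 0,
  };
}
```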

Development

Project Structure

src/
├── mastra/
│   ├── agents/
│   │   └── twitter-crawler.agent.ts    # Main agent implementation
│   ├── db/
│   │   ├── schema.ts                   # Database schema definitions
│   │   └── client.ts                   # Database client implementation
│   └── tools/
│       ├── twitter-scraper.tool.ts     # Twitter scraping implementation
│       ├── coin-detection.tool.ts      # Coin detection logic
│       └── sentiment-analysis.tool.ts  # Gemini sentiment analysis
├── scripts/
│   ├── add-account.ts                  # Account management utility
│   └── view-results.ts                 # Results viewing utility
└── index.ts                            # Application entry point

Running in Development Mode

Start with live reload:

npm run dev

Running Tests

Execute the test suite:

npm test

Error Handling

The application includes comprehensive error handling:

  • Database Errors: Automatic retries for transient errors
  • Twitter API: Rate limit handling and backoff
  • Network Issues: Graceful degradation and recovery
  • Invalid Data: Robust validation and sanitization

Monitoring and Maintenance

Logs

  • Application logs are written to stdout/stderr
  • Error logs include stack traces and context
  • Successful analyses are logged with timestamps

Health Checks

  • Database connectivity is verified on startup
  • Twitter API access is validated before crawling
  • Gemini API availability is confirmed before analysis
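
The checks above could be modeled as a list of named async probes whose results are collected and reported together. This is only a sketch under that assumption: HealthCheck, runHealthChecks, and failedChecks are hypothetical names, and the project may instead run each check at a different point in its lifecycle.

```typescript
// Hypothetical health-check runner; each probe throws or rejects on failure.
interface HealthCheck {
  name: string;
  run: () => Promise<void>;
}

interface HealthResult {
  name: string;
  ok: boolean;
  error?: string;
}

async function runHealthChecks(checks: HealthCheck[]): Promise<HealthResult[]> {
  const results: HealthResult[] = [];
  for (const check of checks) {
    try {
      await check.run();
      results.push({ name: check.name, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ name: check.name, ok: false, error: message });
    }
  }
  return results;
}

// Names of the checks that did not pass, for logging or aborting startup.
function failedChecks(results: HealthResult[]): string[] {
  return results.filter((r) => !r.ok).map((r) => r.name);
}
```

Running the probes sequentially keeps log output in a predictable order; they could also be run concurrently if startup time matters.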

Contributing

  1. Fork the repository
  2. Create your feature branch:
    git checkout -b feature/amazing-feature
  3. Make your changes and commit:
    git commit -m 'Add some amazing feature'
  4. Push to your branch:
    git push origin feature/amazing-feature
  5. Open a Pull Request

Development Guidelines

  • Follow TypeScript best practices
  • Maintain test coverage
  • Document new features
  • Update schema migrations if needed
  • Follow the existing code style

Troubleshooting

Common Issues

  1. Database Connection Errors

    • Verify TursoDB credentials
    • Check network connectivity
    • Ensure database is running
  2. Twitter Authentication Failures

    • Confirm credentials are correct
    • Check for account restrictions
    • Verify network access
  3. Sentiment Analysis Errors

    • Validate Gemini API key
    • Check API quotas
    • Verify request formatting

Getting Help

  • Open an issue for bugs
  • Use discussions for questions
  • Check existing issues first

License

This project is licensed under the ISC License - see the LICENSE file for details.

Acknowledgments
