feat: Twitter Spaces Integration #1550

Merged: 6 commits into elizaOS:develop, Jan 1, 2025

@slkzgm (Contributor) commented Dec 29, 2024

Risks

Low. Users who relied on the default provider (Deepgram) see no change unless they explicitly set TRANSCRIPTION_PROVIDER. The fallback logic preserves the original behavior (Deepgram → OpenAI → Local).

Background

What does this PR do?

  • Adds an optional TRANSCRIPTION_PROVIDER setting (deepgram, openai, or local) with fallback logic.
    • If not set, old behavior remains: Deepgram → OpenAI → Local.
  • Moves Twitter Spaces plugins from agent-twitter-client into this repo for better flexibility and less friction in plugin development.
  • Introduces an AI-driven Twitter Spaces flow:
    1. Automatic Space launch decisions (random chance, business hours, cooldown intervals).
    2. Multi-speaker logic with queue management (maxSpeakers).
    3. GPT-based filler/idle messages, optional STT/TTS bridging, local audio recording.
    4. Graceful shutdown and cooldown for repeated Spaces.

Transcription Service Changes

  • We introduced a new TranscriptionProvider enum (Deepgram, OpenAI, or Local) to replace string flags.
  • In initialize(), the provider is chosen in this order:
    1. character.settings.transcription (if the API keys exist),
    2. .env (TRANSCRIPTION_PROVIDER),
    3. Old fallback logic (Deepgram → OpenAI → Local) if neither is configured.
  • For example, in your character.json, you can specify:
    {
      // ...
      "settings": {
        "transcription": "Deepgram"
      }
    }
    If you have DEEPGRAM_API_KEY set, the service will use Deepgram; otherwise it continues to the next check.
  • processQueue() uses a switch on this.transcriptionProvider to pick the final method (transcribeWithDeepgram, transcribeWithOpenAI, or transcribeLocally).
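The resolution order above can be sketched as a small pure function. This is a minimal illustration, not the code in transcription.service.ts: `ProviderInputs`, `parseProvider`, `hasKeyFor`, and `resolveProvider` are hypothetical names, and applying the API-key check to the .env value as well as the character setting is an assumption (the description only states it for the character setting).

```typescript
// Hypothetical sketch; only the enum members and the priority order come from the PR text.
enum TranscriptionProvider {
  Deepgram = "deepgram",
  OpenAI = "openai",
  Local = "local",
}

interface ProviderInputs {
  characterSetting?: string; // character.settings.transcription
  envSetting?: string;       // TRANSCRIPTION_PROVIDER
  deepgramKey?: string;      // DEEPGRAM_API_KEY
  openaiKey?: string;        // OPENAI_API_KEY
}

// Case-insensitive parse, so "Deepgram" (character.json) and "deepgram" (.env) both match.
function parseProvider(value?: string): TranscriptionProvider | undefined {
  const normalized = (value || "").trim().toLowerCase();
  const all = [
    TranscriptionProvider.Deepgram,
    TranscriptionProvider.OpenAI,
    TranscriptionProvider.Local,
  ];
  for (const provider of all) {
    if (provider === normalized) return provider;
  }
  return undefined;
}

// A provider is usable only if its API key is present; Local needs no key.
function hasKeyFor(provider: TranscriptionProvider, inputs: ProviderInputs): boolean {
  if (provider === TranscriptionProvider.Deepgram) return Boolean(inputs.deepgramKey);
  if (provider === TranscriptionProvider.OpenAI) return Boolean(inputs.openaiKey);
  return true;
}

function resolveProvider(inputs: ProviderInputs): TranscriptionProvider {
  // 1. Character setting, 2. environment variable (key check on both is an assumption).
  const preferred = [parseProvider(inputs.characterSetting), parseProvider(inputs.envSetting)];
  for (const candidate of preferred) {
    if (candidate !== undefined && hasKeyFor(candidate, inputs)) return candidate;
  }
  // 3. Old fallback: Deepgram → OpenAI → Local.
  if (inputs.deepgramKey) return TranscriptionProvider.Deepgram;
  if (inputs.openaiKey) return TranscriptionProvider.OpenAI;
  return TranscriptionProvider.Local;
}
```

Keeping the resolution in a function that takes its inputs explicitly, rather than reading `process.env` directly, makes each branch of the priority order easy to exercise in isolation.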

Flow Recap

  1. Periodic Check

    • If no Space is running, possibly launch one by shouldLaunchSpace() (random chance, business hours, cooldown).
    • If a Space is running, manageCurrentSpace() handles speaker timeouts, occupancy updates, queue acceptance, etc.
  2. Space Creation

    • Generates a SpaceConfig (topics from config or GPT).
    • Attaches plugins: audio recording, STT/TTS, idle monitor, etc.
    • Hooks into speakerRequest, occupancyUpdate, idleTimeout, etc.
  3. Speaker Logic

    • Maintains an activeSpeakers array + a queue if at capacity (maxSpeakers).
    • Enforces speakerMaxDurationMs per speaker.
    • If a speaker is removed, accept next in queue if available.
  4. Stopping

    • stopSpace() finalizes the Space, logs completion, clears states, etc.
    • Resumes periodic checks at a slower interval until the next launch is decided.
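The speaker logic in step 3 amounts to a bounded set of active speakers plus a FIFO waiting list. The sketch below is an illustration under assumptions, not the PR's implementation: the class `SpeakerQueue` and its methods `request`, `expire`, and `activeSpeakers` are hypothetical names, and timestamps are passed in explicitly (in milliseconds) so the behavior is deterministic.

```typescript
// Hypothetical sketch of maxSpeakers / speakerMaxDurationMs handling.
type RequestResult = "accepted" | "queued" | "ignored";

class SpeakerQueue {
  private readonly maxSpeakers: number;
  private readonly speakerMaxDurationMs: number;
  private readonly active = new Map<string, number>(); // userId -> start time (ms)
  private readonly waiting: string[] = [];

  constructor(maxSpeakers: number, speakerMaxDurationMs: number) {
    this.maxSpeakers = maxSpeakers;
    this.speakerMaxDurationMs = speakerMaxDurationMs;
  }

  // Called on a speaker request: accept if there is room, otherwise queue.
  request(userId: string, now: number): RequestResult {
    if (this.active.has(userId) || this.waiting.indexOf(userId) !== -1) return "ignored";
    if (this.active.size < this.maxSpeakers) {
      this.active.set(userId, now);
      return "accepted";
    }
    this.waiting.push(userId);
    return "queued";
  }

  // Called on each periodic check: remove speakers over their time limit,
  // then accept the next queued users into the freed slots.
  expire(now: number): string[] {
    const expired: string[] = [];
    this.active.forEach((startedAt, userId) => {
      if (now - startedAt >= this.speakerMaxDurationMs) expired.push(userId);
    });
    expired.forEach((userId) => this.active.delete(userId));
    while (this.active.size < this.maxSpeakers && this.waiting.length > 0) {
      const next = this.waiting.shift() as string;
      this.active.set(next, now);
    }
    return expired;
  }

  activeSpeakers(): string[] {
    const ids: string[] = [];
    this.active.forEach((_startedAt, userId) => ids.push(userId));
    return ids;
  }
}
```

With `maxSpeakers` set to 2, a third requester is queued and only becomes active once `expire` removes a speaker who has exceeded `speakerMaxDurationMs`.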

Configuration

A) .env / Environment Variables

# Transcription Provider
TRANSCRIPTION_PROVIDER=         # Optional (possible values: deepgram, openai, local); if empty, the old fallback applies: Deepgram → OpenAI → Local
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...

B) character.json "twitterSpaces" Field

{
  // ...
  "settings": {
    // ...
    "transcription": "Deepgram"
  },
  "twitterSpaces": {
    "maxSpeakers": 2,
    "topics": [
      "Blockchain Trends",
      "AI Innovations"
    ],
    "typicalDurationMinutes": 45,
    "idleKickTimeoutMs": 300000,
    "minIntervalBetweenSpacesMinutes": 60,
    "businessHoursOnly": true,
    "randomChance": 0.3,
    "enableIdleMonitor": true,
    "enableSttTts": true,
    "enableRecording": false,
    "voiceId": "21m00Tcm4TlvDq8ikWAM",
    "sttLanguage": "en",
    "gptModel": "gpt-3.5-turbo",
    "systemPrompt": "You are a helpful AI co-host assistant.",
    "speakerMaxDurationMs": 240000
  }
}
  • maxSpeakers: number of concurrent speakers allowed.
  • topics: if none are provided, GPT generates them dynamically.
  • randomChance: probability for each check cycle to spawn a new Space.
  • speakerMaxDurationMs: maximum time each speaker can speak before removal.
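To show how randomChance, businessHoursOnly, and minIntervalBetweenSpacesMinutes could combine into a launch decision, here is a minimal sketch. It is not the PR's shouldLaunchSpace(): the signature, the `LaunchConfig` interface, the injected `roll` parameter, and the 09:00 to 17:00 UTC definition of business hours are all assumptions made for illustration.

```typescript
// Hypothetical sketch; field names mirror the twitterSpaces config, everything else is assumed.
interface LaunchConfig {
  randomChance: number;                    // probability per check cycle, 0..1
  businessHoursOnly: boolean;
  minIntervalBetweenSpacesMinutes: number; // cooldown after the previous Space
}

function shouldLaunchSpace(
  config: LaunchConfig,
  now: Date,
  lastSpaceEndedAtMs: number | undefined,
  roll: number = Math.random(), // injectable so the decision can be tested deterministically
): boolean {
  // Business hours: assumed to mean 09:00 to 17:00 UTC.
  if (config.businessHoursOnly) {
    const hour = now.getUTCHours();
    if (hour < 9 || hour >= 17) return false;
  }
  // Cooldown: do not launch until the minimum interval has elapsed.
  if (lastSpaceEndedAtMs !== undefined) {
    const elapsedMinutes = (now.getTime() - lastSpaceEndedAtMs) / 60000;
    if (elapsedMinutes < config.minIntervalBetweenSpacesMinutes) return false;
  }
  // Random chance: Math.random() is in [0, 1), so randomChance = 1 always passes.
  return roll < config.randomChance;
}
```

Because `roll` is strictly less than 1, setting randomChance to 1 makes every eligible check cycle launch a Space, which is what the testing steps rely on.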

What kind of change is this?

  • Features (new Twitter Spaces integration and optional transcription provider).
  • Improvements (unified plugin development, more config options, fallback logic maintained).

Documentation changes needed?

Yes, minimal. We must mention:

  • The new TRANSCRIPTION_PROVIDER in .env (optional).
  • The new twitterSpaces config section in character.json.

Testing

Where should a reviewer start?

  • Check transcription.service.ts to review how it resolves conflicts by prioritizing character settings, then .env, then old fallback.
  • Check new or relocated Twitter Spaces integration files for the Space lifecycle (launch, speaker management, idle detection, etc.).

Detailed testing steps

  1. Define TRANSCRIPTION_PROVIDER in .env (or leave it empty to keep old fallback).
  2. Provide valid API keys if choosing deepgram or openai.
  3. Set twitterSpaces.randomChance in the character JSON to 1 (so a Space starts on every eligible check).
  4. Run the agent; verify that Spaces launch automatically, respect the chosen transcription provider, and handle multi-speaker logic as expected.

No special database migrations are needed. Basic local runs and logs confirm correct functioning.

Future Improvements

  • More robust decision logic for accepting speakers, switching, and timeouts.
  • Realtime API plugin for smoother, on-the-fly conversation handling.
  • Solo Broadcast Mode: launch Spaces focused on a single host monologue with no external speakers.
  • True VAD (Voice Activity Detection) to detect when a speaker finishes talking, instead of relying on manual mute/unmute cues.
  • Advanced scheduling triggers (e.g., event-based or calendar-based).
  • Analytics & insights for post-Space summaries or usage metrics.

@github-actions bot (Contributor) left a comment

Hi @slkzgm! Welcome to the ai16z community. Thanks for submitting your first pull request; your efforts are helping us accelerate towards AGI. We'll review it shortly. You are now an ai16z contributor!

@odilitime changed the base branch from main to develop on December 29, 2024 19:45

@odilitime (Collaborator) left a comment

please add back the documentation

packages/client-twitter/src/environment.ts (review thread resolved)

@slkzgm requested a review from odilitime on December 30, 2024 20:19

@lalalune (Member) commented Jan 1, 2025

Some conflicts need review; we should prioritize getting this in since it's a pretty big push.

@lalalune merged commit 6f576b6 into elizaOS:develop on Jan 1, 2025
4 checks passed
3 participants