Skip to content

Conversation

@icecrasher321
Copy link
Collaborator

Summary

Need to migrate to redis for sockets room membership to work correctly in multi-task context. Maintains in memory fallback for single-pod context.

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link

vercel bot commented Jan 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jan 30, 2026 4:15am

Request Review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 30, 2026

Greptile Overview

Greptile Summary

Migrated socket room management from in-memory to Redis-backed storage to enable multi-pod horizontal scaling. The implementation provides a clean abstraction with IRoomManager interface and two implementations: RedisRoomManager for production multi-pod deployments and MemoryRoomManager as a fallback for single-pod environments.

Key Changes

  • Redis Integration: Added Redis adapter for Socket.IO cross-pod broadcasting and Redis-backed state storage with atomic Lua scripts
  • Async Room Manager: Converted all room operations to async to support Redis, making the API consistent across both implementations
  • Stale Connection Cleanup: Added tabSessionId tracking to detect and clean up stale connections from same user/tab
  • Memory Leak Prevention: Added cleanup functions for pending debounced operations when sockets disconnect
  • Graceful Shutdown: Implemented proper cleanup of Redis connections and room manager state on server shutdown
  • Auth Failure Handling: Client now stops reconnection attempts on auth failures to prevent infinite loops

Architecture

The migration follows a factory pattern where createRoomManager() selects the appropriate implementation based on REDIS_URL environment variable. When Redis is configured:

  • Socket.IO uses Redis adapter for cross-pod event broadcasting
  • Room manager uses Redis for shared state with atomic operations via Lua scripts
  • All pods can handle workflow deletion, update, and revert notifications via HTTP endpoints

Testing

Tests were updated to use MemoryRoomManager with proper async initialization and mocking of Redis package to prevent actual connections during test runs.

Confidence Score: 4/5

  • Safe to merge with minor type casting issues that should be cleaned up
  • The implementation is well-architected with atomic operations, proper error handling, and comprehensive testing. The use of Lua scripts prevents race conditions in Redis operations. However, there are minor type casting issues in the room manager factory function that create unnecessary complexity. The async migration is thorough and consistent across all handlers.
  • Fix the type casting issues in apps/sim/socket/index.ts (lines 20, 53) and the avatarUrl handling in apps/sim/socket/rooms/redis-manager.ts (line 227)

Important Files Changed

Filename Overview
apps/sim/socket/rooms/redis-manager.ts New Redis-backed room manager with Lua scripts for atomic operations and proper reconnection handling
apps/sim/socket/rooms/memory-manager.ts New in-memory room manager as fallback for single-pod deployments
apps/sim/socket/rooms/types.ts Defines IRoomManager interface with async methods for both memory and Redis implementations
apps/sim/socket/index.ts Refactored to async initialization with proper room manager factory and graceful shutdown
apps/sim/socket/config/socket.ts Added Redis adapter configuration for cross-pod Socket.IO broadcasting
apps/sim/socket/handlers/workflow.ts Refactored join-workflow to handle stale connections, use async room manager, and add tabSessionId tracking
apps/sim/app/workspace/providers/socket-provider.tsx Added tabSessionId tracking, auth failure handling, and improved reconnection logic

Sequence Diagram

sequenceDiagram
    participant Client as Client Browser
    participant SocketIO as Socket.IO Server
    participant RoomMgr as Room Manager
    participant Redis as Redis
    participant Adapter as Redis Adapter
    
    Note over Client,Redis: Connection Establishment
    Client->>SocketIO: Connect with auth token
    SocketIO->>RoomMgr: Initialize (createRoomManager)
    alt Redis URL configured
        RoomMgr->>Redis: Connect & load Lua scripts
        Redis-->>RoomMgr: Connected
        SocketIO->>Adapter: Configure Redis adapter
        Note over Adapter: Enables cross-pod broadcasting
    else No Redis URL
        Note over RoomMgr: Use MemoryRoomManager
    end
    
    Note over Client,Redis: Join Workflow
    Client->>SocketIO: emit('join-workflow', {workflowId, tabSessionId})
    SocketIO->>RoomMgr: getWorkflowIdForSocket(socketId)
    RoomMgr->>Redis: GET socket:{socketId}:workflow
    Redis-->>RoomMgr: Current workflowId or null
    
    alt User in different workflow
        SocketIO->>RoomMgr: removeUserFromRoom(socketId)
        RoomMgr->>Redis: EVALSHA removeUserScript
        Note over Redis: Atomic: remove user,<br/>cleanup empty rooms
        Redis-->>RoomMgr: Previous workflowId
    end
    
    SocketIO->>RoomMgr: getWorkflowUsers(workflowId)
    RoomMgr->>Redis: HGETALL workflow:{wfId}:users
    Redis-->>RoomMgr: Existing users list
    
    Note over SocketIO: Check for stale connections<br/>from same user/tab
    
    SocketIO->>RoomMgr: addUserToRoom(workflowId, socketId, presence)
    RoomMgr->>Redis: MULTI pipeline
    RoomMgr->>Redis: HSET workflow:{wfId}:users
    RoomMgr->>Redis: SET socket:{socketId}:workflow
    RoomMgr->>Redis: HSET socket:{socketId}:session
    RoomMgr->>Redis: EXEC
    Redis-->>RoomMgr: Success
    
    SocketIO->>Client: emit('join-workflow-success')
    SocketIO->>RoomMgr: broadcastPresenceUpdate(workflowId)
    RoomMgr->>Adapter: emit to room (cross-pod)
    Adapter->>Redis: PUBLISH to channel
    Note over Redis,Adapter: All pods receive via Redis pub/sub
    Adapter-->>SocketIO: Broadcast to local clients
    SocketIO-->>Client: emit('presence-update')
    
    Note over Client,Redis: Collaborative Updates
    Client->>SocketIO: emit('cursor-update')
    SocketIO->>RoomMgr: updateUserActivity(wfId, socketId, {cursor})
    RoomMgr->>Redis: EVALSHA updateActivityScript
    Note over Redis: Atomic read-modify-write
    Redis-->>RoomMgr: Updated
    SocketIO->>Adapter: emit to room (cross-pod)
    Adapter->>Redis: PUBLISH to channel
    Adapter-->>SocketIO: Broadcast to all pods
    SocketIO-->>Client: emit('cursor-update')
    
    Note over Client,Redis: Disconnect
    Client->>SocketIO: disconnect
    SocketIO->>RoomMgr: removeUserFromRoom(socketId)
    RoomMgr->>Redis: EVALSHA removeUserScript
    Note over Redis: Atomic: HDEL user,<br/>DEL socket keys,<br/>cleanup if room empty
    Redis-->>RoomMgr: workflowId
    SocketIO->>RoomMgr: broadcastPresenceUpdate(workflowId)
    RoomMgr->>Adapter: emit to room (cross-pod)
    Adapter-->>SocketIO: Update all pods
    SocketIO-->>Client: emit('presence-update')
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

@icecrasher321
Copy link
Collaborator Author

@cursor review

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

@icecrasher321 icecrasher321 merged commit b0fbf36 into staging Jan 30, 2026
12 checks passed
@waleedlatif1 waleedlatif1 deleted the improvement/sockets-system branch January 30, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants