Solving Chat Message Loss in Multi-Instance Environment

Problem

During the Kakao Tech Campus final project, I implemented team chat over WebSocket. It worked fine locally, but after deploying to AWS ECS, messages started going missing.

Two instances, A and B, were running in parallel. User A connects to Instance A and User B connects to Instance B. User A sends a message… and User B never receives it.


Problem Analysis

Each instance keeps its WebSocket sessions in its own memory, so no instance knows about the sessions held by another.

User A → Instance A (has A's session)
User B → Instance B (has B's session)

User A sends message
→ Instance A sends to its sessions
→ User B's session is on Instance B, message not delivered
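
The failure can be reproduced without any network code. What follows is a minimal in-memory sketch, not the project's code: `Instance`, `connect`, and `broadcastLocal` are hypothetical names, and a plain `List<String>` inbox stands in for each WebSocket session.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one server instance: it only knows the sessions
// that connected to it (user name -> that user's inbox).
class Instance {
    private final Map<String, List<String>> sessions = new HashMap<>();

    // Register a user on this instance and return their inbox.
    List<String> connect(String user) {
        List<String> inbox = new ArrayList<>();
        sessions.put(user, inbox);
        return inbox;
    }

    // Deliver a message to every session held by THIS instance only.
    void broadcastLocal(String text) {
        for (List<String> inbox : sessions.values()) {
            inbox.add(text);
        }
    }
}

class LocalOnlyDemo {
    public static void main(String[] args) {
        Instance a = new Instance();
        Instance b = new Instance();
        List<String> inboxA = a.connect("userA");
        List<String> inboxB = b.connect("userB");

        a.broadcastLocal("hello"); // User A's message arrives at Instance A

        System.out.println("A received: " + inboxA); // A received: [hello]
        System.out.println("B received: " + inboxB); // B received: []
    }
}
```

Instance A has no reference to User B's inbox, so nothing Instance A does locally can reach it.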

Solution: Redis Pub/Sub

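Every instance publishes incoming messages to a shared Redis channel and subscribes to that same channel, so each message reaches all instances.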

User A sends message
→ Instance A publishes to Redis
→ Redis delivers to all subscribers (Instances A and B)
→ Instance B sends to B's session
→ User B receives
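
This flow can also be modeled in memory to show why a shared channel fixes delivery. It is a sketch under assumptions: `InMemoryBus` is a hypothetical stand-in for a Redis channel and `ChatNode` stands in for an application instance; neither is part of the project or of any Redis client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a Redis channel: every subscriber gets every message.
class InMemoryBus {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String text) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(text);
        }
    }
}

// Hypothetical application instance that holds only its own sessions.
class ChatNode {
    private final InMemoryBus bus;
    private final List<List<String>> sessions = new ArrayList<>();

    ChatNode(InMemoryBus bus) {
        this.bus = bus;
        // On every bus message, fan out to the sessions on this node.
        bus.subscribe(text -> sessions.forEach(inbox -> inbox.add(text)));
    }

    List<String> connect() {
        List<String> inbox = new ArrayList<>();
        sessions.add(inbox);
        return inbox;
    }

    // Incoming chat messages are only published, never delivered locally
    // as well, so the sender's own node gets exactly one copy via the bus.
    void send(String text) {
        bus.publish(text);
    }
}

class PubSubDemo {
    public static void main(String[] args) {
        InMemoryBus bus = new InMemoryBus();
        ChatNode a = new ChatNode(bus);
        ChatNode b = new ChatNode(bus);
        List<String> inboxA = a.connect();
        List<String> inboxB = b.connect();

        a.send("hello");

        System.out.println("A received: " + inboxA); // A received: [hello]
        System.out.println("B received: " + inboxB); // B received: [hello]
    }
}
```

One design point the sketch makes visible: the sending node should not also write to its local sessions directly, or its own users would see each message twice.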

Redis Config

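import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.listener.ChannelTopic;
import org.springframework.data.redis.listener.RedisMessageListenerContainer;
import org.springframework.data.redis.listener.adapter.MessageListenerAdapter;

@Configuration
public class RedisConfig {
    // Subscribes this instance to the "chat" channel. The adapter is assumed
    // to be a MessageListenerAdapter bean, defined elsewhere, that delegates
    // to the message listener.
    @Bean
    public RedisMessageListenerContainer redisContainer(
            RedisConnectionFactory factory,
            MessageListenerAdapter adapter) {
        RedisMessageListenerContainer container =
            new RedisMessageListenerContainer();
        container.setConnectionFactory(factory);
        container.addMessageListener(adapter, new ChannelTopic("chat"));
        return container;
    }
}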

Message Publish

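import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ChatService {
    private final RedisTemplate<String, String> redisTemplate;
    private final ObjectMapper objectMapper;
    public ChatService(RedisTemplate<String, String> redisTemplate, ObjectMapper objectMapper) {
        this.redisTemplate = redisTemplate;
        this.objectMapper = objectMapper;
    }

    public void sendMessage(ChatMessage message) throws JsonProcessingException {
        redisTemplate.convertAndSend("chat", objectMapper.writeValueAsString(message));
    }
}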

Message Receive

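import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import org.springframework.data.redis.connection.Message;
import org.springframework.data.redis.connection.MessageListener;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Component;

@Component
public class ChatMessageListener implements MessageListener {
    private final ObjectMapper objectMapper;
    // Assumed to be Spring's STOMP SimpMessagingTemplate
    private final SimpMessagingTemplate messagingTemplate;
    public ChatMessageListener(ObjectMapper objectMapper, SimpMessagingTemplate messagingTemplate) {
        this.objectMapper = objectMapper;
        this.messagingTemplate = messagingTemplate;
    }

    @Override
    public void onMessage(Message message, byte[] pattern) {
        try {
            ChatMessage chatMessage = objectMapper.readValue(message.getBody(), ChatMessage.class);
            // Broadcast to all WebSocket sessions on this instance
            messagingTemplate.convertAndSend("/topic/chat/" + chatMessage.getRoomId(), chatMessage);
        } catch (IOException e) {
            throw new IllegalStateException("Could not parse chat message", e);
        }
    }
}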
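
The post never shows `ChatMessage` itself, so here is one possible shape, purely as an illustration. Only `getRoomId()` appears in the code above; the `sender` and `content` fields and the `destination()` helper are assumptions. The no-argument constructor and public getters and setters follow the bean style that Jackson maps by default.

```java
// Hypothetical shape of ChatMessage. Only getRoomId() is confirmed by the
// post; sender, content, and destination() are assumptions.
class ChatMessage {
    private String roomId;
    private String sender;
    private String content;

    // No-argument constructor for bean-style JSON mapping.
    public ChatMessage() {
    }

    public ChatMessage(String roomId, String sender, String content) {
        this.roomId = roomId;
        this.sender = sender;
        this.content = content;
    }

    public String getRoomId() { return roomId; }
    public void setRoomId(String roomId) { this.roomId = roomId; }

    public String getSender() { return sender; }
    public void setSender(String sender) { this.sender = sender; }

    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }

    // Destination for this message's room, in the same format the
    // listener builds by hand.
    String destination() {
        return "/topic/chat/" + roomId;
    }
}
```

With this shape, `new ChatMessage("42", "alice", "hi").destination()` yields `/topic/chat/42`, so only clients subscribed to room 42 on a given instance receive the broadcast.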

Results

  • Messages are delivered correctly across multiple instances
  • Verified under load with JMeter
  • Keeps working as more instances are added (horizontal scaling)

Alternative Considered

I also considered a message queue (SQS, RabbitMQ). Redis Pub/Sub was simpler for chat, which needs real-time delivery. A message queue offers persistence, but that was overkill for this use case.
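
The trade-off can be made concrete. Redis Pub/Sub delivers only to subscribers connected at publish time and keeps no copy, while a queue holds messages until a consumer reads them. The sketch below models both behaviors in memory; `FireAndForgetChannel` and `DurableQueue` are hypothetical names, not Redis, SQS, or RabbitMQ APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Pub/Sub-style channel: delivers to current subscribers, stores nothing.
class FireAndForgetChannel {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String text) {
        subscribers.forEach(subscriber -> subscriber.accept(text));
    }
}

// Queue-style channel: stores messages until a consumer polls them.
class DurableQueue {
    private final Queue<String> stored = new ArrayDeque<>();

    void publish(String text) {
        stored.add(text);
    }

    // Returns the oldest stored message, or null if the queue is empty.
    String poll() {
        return stored.poll();
    }
}

class DeliveryDemo {
    public static void main(String[] args) {
        FireAndForgetChannel channel = new FireAndForgetChannel();
        channel.publish("sent while offline"); // no subscriber yet: dropped
        List<String> lateSubscriber = new ArrayList<>();
        channel.subscribe(lateSubscriber::add);
        channel.publish("sent while online");
        System.out.println(lateSubscriber); // [sent while online]

        DurableQueue queue = new DurableQueue();
        queue.publish("sent while offline"); // kept until someone reads it
        System.out.println(queue.poll()); // sent while offline
    }
}
```

For live chat, a message that misses a disconnected instance is usually acceptable, which is why the simpler fire-and-forget model was enough here.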


Lessons Learned

  • Testing in a local environment alone isn’t enough for verification
  • Production typically runs multiple instances
  • Broadcasting messages across instances needs an external system (Redis, a message queue, etc.)
  • A distributed environment requires changes to the system design

From the Kakao Tech Campus 3rd cohort final project (a student schedule management service).

This post is licensed under CC BY 4.0 by the author.