Solving Chat Message Loss in a Multi-Instance Environment
Problem
For the Kakao Tech Campus final project, I implemented a team chat feature over WebSocket. It worked fine locally, but problems appeared after deploying to AWS ECS.
Two instances, A and B, were running in parallel. User A connects to Instance A, and User B connects to Instance B. User A sends a message… and User B never receives it.
Problem Analysis
Each instance keeps its WebSocket sessions in its own memory, and instances have no knowledge of each other's sessions.
User A → Instance A (has A's session)
User B → Instance B (has B's session)
User A sends message
→ Instance A sends to its sessions
→ User B's session is on Instance B, message not delivered
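The flow above can be reproduced without any networking. The following is a minimal sketch, assuming each instance is just an in-memory map of sessions; `LocalOnlyInstance` and its methods are hypothetical names for illustration, not the project's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MessageLossDemo {
    public static void main(String[] args) {
        LocalOnlyInstance instanceA = new LocalOnlyInstance();
        LocalOnlyInstance instanceB = new LocalOnlyInstance();
        instanceA.connect("userA");
        instanceB.connect("userB");

        instanceA.broadcast("hello"); // User A's message arrives at Instance A

        System.out.println(instanceA.received("userA")); // [hello]
        System.out.println(instanceB.received("userB")); // []
    }
}

// Hypothetical stand-in for one server instance.
class LocalOnlyInstance {
    // userId -> messages delivered to that user's session on THIS instance
    private final Map<String, List<String>> sessions = new HashMap<>();

    void connect(String userId) {
        sessions.put(userId, new ArrayList<>());
    }

    // Delivers only to sessions held in this instance's memory.
    void broadcast(String text) {
        for (List<String> inbox : sessions.values()) {
            inbox.add(text);
        }
    }

    List<String> received(String userId) {
        return sessions.getOrDefault(userId, List.of());
    }
}
```

Nothing is wrong with either instance on its own; the message is lost only because the sender and the receiver happen to be attached to different processes.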
Solution: Redis Pub/Sub
Have every instance publish incoming messages to Redis and subscribe to the same channel, so each message reaches all instances.
User A sends message
→ Instance A publishes to Redis
→ Redis delivers to all subscribers (Instances A and B)
→ Instance B sends to B's session
→ User B receives
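The fan-out can be sketched with an in-memory stand-in for Redis. This is only an illustration of the pattern: `InMemoryBroker` and `BrokerBackedInstance` are hypothetical names, and a real Redis connection is asynchronous and networked, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class FanOutDemo {
    public static void main(String[] args) {
        InMemoryBroker broker = new InMemoryBroker();
        BrokerBackedInstance instanceA = new BrokerBackedInstance(broker);
        BrokerBackedInstance instanceB = new BrokerBackedInstance(broker);
        instanceA.connect("userA");
        instanceB.connect("userB");

        instanceA.send("hello"); // User A's message arrives at Instance A

        System.out.println(instanceA.received("userA")); // [hello]
        System.out.println(instanceB.received("userB")); // [hello]
    }
}

// In-memory stand-in for Redis Pub/Sub (assumption: synchronous delivery).
class InMemoryBroker {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String channel, Consumer<String> listener) {
        subscribers.computeIfAbsent(channel, c -> new ArrayList<>()).add(listener);
    }

    // Delivers the payload to every listener currently subscribed to the channel.
    void publish(String channel, String payload) {
        for (Consumer<String> listener : subscribers.getOrDefault(channel, List.of())) {
            listener.accept(payload);
        }
    }
}

class BrokerBackedInstance {
    private final InMemoryBroker broker;
    private final Map<String, List<String>> sessions = new HashMap<>();

    BrokerBackedInstance(InMemoryBroker broker) {
        this.broker = broker;
        // Every instance subscribes to the shared channel at startup.
        broker.subscribe("chat", this::deliverToLocalSessions);
    }

    void connect(String userId) {
        sessions.put(userId, new ArrayList<>());
    }

    // Incoming client messages are published, never delivered directly.
    void send(String text) {
        broker.publish("chat", text);
    }

    private void deliverToLocalSessions(String text) {
        sessions.values().forEach(inbox -> inbox.add(text));
    }

    List<String> received(String userId) {
        return sessions.getOrDefault(userId, List.of());
    }
}
```

Note that the publishing instance is itself a subscriber, so it delivers to its own sessions through the same path. If it also delivered directly before publishing, its local users would see each message twice.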
Redis Config
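import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.listener.ChannelTopic;
import org.springframework.data.redis.listener.RedisMessageListenerContainer;

@Configuration
public class RedisConfig {

    // Subscribes this instance to the "chat" channel at startup.
    @Bean
    public RedisMessageListenerContainer redisContainer(
            RedisConnectionFactory factory,
            ChatMessageListener listener) {
        RedisMessageListenerContainer container =
                new RedisMessageListenerContainer();
        container.setConnectionFactory(factory);
        // ChatMessageListener (see Message Receive) implements MessageListener,
        // so it can be registered directly without a MessageListenerAdapter.
        container.addMessageListener(listener, new ChannelTopic("chat"));
        return container;
    }
}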
Message Publish
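import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ChatService {
    private final RedisTemplate<String, String> redisTemplate;
    private final ObjectMapper objectMapper;
    public ChatService(RedisTemplate<String, String> redisTemplate, ObjectMapper objectMapper) {
        this.redisTemplate = redisTemplate;
        this.objectMapper = objectMapper;
    }
    public void sendMessage(ChatMessage message) throws JsonProcessingException {
        redisTemplate.convertAndSend("chat", objectMapper.writeValueAsString(message));
    }
}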
Message Receive
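import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import org.springframework.data.redis.connection.Message;
import org.springframework.data.redis.connection.MessageListener;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Component;

@Component
public class ChatMessageListener implements MessageListener {
    private final ObjectMapper objectMapper;
    private final SimpMessagingTemplate messagingTemplate;
    public ChatMessageListener(ObjectMapper objectMapper, SimpMessagingTemplate messagingTemplate) {
        this.objectMapper = objectMapper;
        this.messagingTemplate = messagingTemplate;
    }
    @Override
    public void onMessage(Message message, byte[] pattern) {
        try {
            ChatMessage chatMessage = objectMapper.readValue(message.getBody(), ChatMessage.class);
            // Broadcast to all WebSocket sessions on this instance
            messagingTemplate.convertAndSend("/topic/chat/" + chatMessage.getRoomId(), chatMessage);
        } catch (IOException e) {
            throw new IllegalStateException("Failed to parse chat message", e);
        }
    }
}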
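All rooms share the single Redis channel "chat", and the listener picks the local destination from the message's room ID. A rough sketch of that routing step, assuming a plain map in place of the STOMP broker (`LocalTopicRouter` is a hypothetical name, and the real broker's subscription handling is more involved):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RoomRoutingDemo {
    public static void main(String[] args) {
        LocalTopicRouter router = new LocalTopicRouter();
        List<String> room1Session = router.subscribe("/topic/chat/1");
        List<String> room2Session = router.subscribe("/topic/chat/2");

        router.onChannelMessage("1", "hi"); // arrives from the shared "chat" channel

        System.out.println(room1Session); // [hi]
        System.out.println(room2Session); // []
    }
}

// Stand-in for the local broker: destination -> inboxes of subscribed sessions.
class LocalTopicRouter {
    private final Map<String, List<List<String>>> subscriptions = new HashMap<>();

    // Returns the inbox that collects what this session receives.
    List<String> subscribe(String destination) {
        List<String> inbox = new ArrayList<>();
        subscriptions.computeIfAbsent(destination, d -> new ArrayList<>()).add(inbox);
        return inbox;
    }

    // Called once per message received from Redis, whatever room it belongs to.
    void onChannelMessage(String roomId, String text) {
        String destination = "/topic/chat/" + roomId;
        for (List<String> inbox : subscriptions.getOrDefault(destination, List.of())) {
            inbox.add(text);
        }
    }
}
```

With one shared channel, every instance receives every room's messages and discards nothing until the local routing step, which keeps the Redis configuration to a single subscription.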
Results
- Messages are delivered correctly even in a multi-instance environment
- Verified with JMeter load tests
- Keeps working under horizontal scaling (adding more instances)
Alternative Considered
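I also considered a message queue (SQS, RabbitMQ). A queue can persist messages, whereas Redis Pub/Sub is fire-and-forget: a subscriber that is not connected when a message is published never sees it. For chat that only needs real-time delivery, Redis Pub/Sub was simpler, and a full MQ was overkill for this use case.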
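The persistence trade-off can be illustrated with two toy classes. This is a sketch under simplifying assumptions: `FireAndForgetChannel` and `RetainingQueue` are hypothetical in-memory models, not the Redis, SQS, or RabbitMQ APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

class DeliveryGuaranteeDemo {
    public static void main(String[] args) {
        FireAndForgetChannel channel = new FireAndForgetChannel();
        channel.publish("hello");          // nobody is subscribed yet
        List<String> lateInbox = new ArrayList<>();
        channel.subscribe(lateInbox::add);
        System.out.println(lateInbox);     // []

        RetainingQueue queue = new RetainingQueue();
        queue.publish("hello");            // no consumer yet, message is kept
        System.out.println(queue.drain()); // [hello]
    }
}

// Pub/Sub model: a message goes only to listeners present at publish time.
class FireAndForgetChannel {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    void publish(String payload) {
        listeners.forEach(listener -> listener.accept(payload));
    }
}

// Queue model: messages wait until a consumer picks them up.
class RetainingQueue {
    private final Queue<String> pending = new ArrayDeque<>();

    void publish(String payload) {
        pending.add(payload);
    }

    // Returns and removes everything not yet consumed.
    List<String> drain() {
        List<String> out = new ArrayList<>(pending);
        pending.clear();
        return out;
    }
}
```

For live chat, the first model is usually acceptable because a user who is offline has no open session to deliver to anyway; message history, if needed, would have to come from somewhere else such as a database.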
Lessons Learned
- Testing in a local environment alone isn’t sufficient for verification
- Multi-instance is the default in production
- Message broadcasting across instances needs an external system (Redis, MQ, etc.)
- Moving to a distributed environment requires system design changes
From the Kakao Tech Campus 3rd cohort final project (a student schedule management service).