How I Finally Cracked the Perfect Table Schema for My Chat App

Perfecting a Chat Database Schema: A Journey of Trial and Error

Have you ever embarked on a project that seems straightforward at first but quickly became a learning experience? That was me with my first project of the year.

I decided to build a web-based chat app as my first project of the year. It’s meant to help me brush up on my Laravel skills, as I only started using Laravel late last year. The app is going to be very simple in functionality, allowing users to find available users and start conversations with them, create groups, and chat with group members. The message input would support emojis, file attachments, and maybe voice notes. It’s going great so far and I’m very excited about it.

I commenced the project by setting up the development environment, which was the first and easiest part. I used Docker to containerize all the necessary software, including Node.js for managing the frontend, PHP, Composer, Nginx, MySQL, and PDO.

The next step was to design the system architecture of the application, with the database table schema being a significant component. It didn't take me long to develop the first schema, and it seemed perfect—until it wasn't.

In this guide, I'll be sharing with you:

  • An analysis of the initial problems with the schema I first developed.

  • The ideal system design to use if you ever want to build your own chat app.

  • Insights into database table relationships.

If you're a Laravel beginner, a developer exploring database design, or someone planning to build their own chat app, you'll find this guide helpful.

The Initial Struggles

The initial design included tables for users and messages, with each user capable of having multiple messages. Additionally, it featured tables for conversations, groups, and message attachments. The relationships between these tables are shown below.

This all seemed promising, and I even managed to seed the tables without breaking a sweat. But then came the conversations table, holding its receiver_id with pride. At first glance, it looked all good—after all, the messages table already had a sender_id. You’d probably think, 'Hey, they’re linked... what could possibly go wrong?' Spoiler alert: a lot!

The issue with the conversations table having a receiver_id is that it restricts the conversation to a one-way communication model. This means that each conversation is designed to have only one receiver, which limits the interaction to messages being sent from multiple senders to a single receiver. In a typical chat application, you would expect a conversation to allow for two-way communication, where both parties can send and receive messages. The current design does not support this, as it only allows for messages to be directed to one receiver, making it unsuitable for a dynamic chat environment where users can freely exchange messages back and forth.

Another problem I noticed was in the seeding. Somehow, multiple users ended up with messages tied to a single private conversation (through the conversation_id). It was like assigning the same seat on a bus to several people—it just didn’t make sense.

It was clear that I needed to develop a better schema and seeding system. I spent the following week iterating back and forth to achieve the perfect design, which I am now ready to share with you.

The Current Table Schema

For the most part, this new schema resembles the original. The main changes were made to the conversations and messages tables to address the issues identified earlier.

  1. CONVERSATIONS

     CREATE TABLE `conversations`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `type` ENUM('private', 'group') NOT NULL,
         `group_id` BIGINT NULL,
         `last_message_id` BIGINT NOT NULL,
         `created_at` DATETIME NOT NULL
     );
    

    Compared to the previous version, this one now consists only of type (an Enum indicating whether the conversation is private or group) and a nullable group_id for group conversations, in addition to id and timestamps.

  2. MESSAGES

     CREATE TABLE `messages`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `message` VARCHAR(255) NOT NULL,
         `conversation_id` BIGINT NOT NULL,
         `sender_id` BIGINT NOT NULL,
         `receiver_id` BIGINT NULL,
         `created_at` DATETIME NOT NULL,
         `updated_at` DATETIME NOT NULL
     );
    

    This version now consists of sender_id, receiver_id (nullable), and conversation_id. You might wonder why the messages table doesn't just handle the group_id column. This is because having both group_id and conversation_id in the messages table would be redundant, as conversation_id is already linked to the group (through group_id). You'll understand this better when we discuss relationships.

  3. MESSAGE ATTACHMENTS

     CREATE TABLE `message_attchmanet`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `message_id` BIGINT NOT NULL,
         `name` VARCHAR(255) NOT NULL,
         `path` VARCHAR(1024) NOT NULL,
         `mime` VARCHAR(255) NOT NULL,
         `size` BIGINT NOT NULL,
         `created_at` DATETIME NOT NULL
     );
    
  4. USERS

     CREATE TABLE `users`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `name` VARCHAR(255) NOT NULL,
         `email` VARCHAR(255) NOT NULL,
         `avatar` VARCHAR(255) NULL,
         `password` VARCHAR(255) NOT NULL,
         `created_at` DATETIME NOT NULL,
         `updated_at` DATETIME NOT NULL
     );
     ALTER TABLE
         `users` ADD UNIQUE `users_email_unique`(`email`);
    
  5. GROUP

     CREATE TABLE `groups`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `name` VARCHAR(255) NOT NULL,
         `decription` VARCHAR(255) NULL,
         `avatar` VARCHAR(255) NULL,
         `owner_id` BIGINT NOT NULL,
         `created_at` DATETIME NOT NULL,
         `updated_at` DATETIME NOT NULL
     );
    

Relationships

The relationships between the tables play a vital role in simplifying data retrieval. A well-designed schema relationship ensures that querying the tables is efficient and straightforward.

The relationships in this schema are as follows:

  1. USERS AND GROUPS

    As you might have guessed, Users and Groups have a Many-to-Many relationship. That is, a user can belong to multiple groups, and each group should contain more than one user. However, there’s a slight twist when it comes to Group Ownership. While a single user can own many groups, each group can only have one owner. This creates a One-to-Many relationship between users and groups in the context of ownership.

  2. CONVERSATIONS AND GROUPS

    A group can have only one conversation, and all users in the group can participate in that conversation. Conversely, a (group) conversation can belong to only one group, which makes the relationship between these two tables One-to-One.

    Which Table is the Owning Side?

    Now comes the important question: Which table should be the owning side of this relationship? To answer this, let’s first clarify the concept: In a One-to-One relationship, the owning side of the relationship is typically the table that holds the foreign key reference to the other table. to better understand this, suppose database table A had a column B_id which references foreign table B, then table B is said to be the owning side of the relationship.

    Here’s an easy way to think about it:

    • Can a conversation exist without a group?
      The answer is No. If we define the type column of a conversation row as "group", then that conversation must be linked to a group (and have it’s group_id column set to the id of the group it belongs to).

    • Can a group exist without a conversation?
      Yes, a group can exist without a conversation. This is possible when a group is created, but no messages have been sent yet. Essentially, it means "the group has no conversation (yet)."

Having established that, the groups table is the owning side, which means the conversations table will include a group_id foreign key reference.

Now you see why the messages table shouldn’t have a group_id—since the conversations table already has a group_id linking it to the group, adding it again in the messages table would create redundancy.

  1. CONVERSATIONS AND MESSAGES

    For every conversation, there can be an unlimited number of messages, so this is pretty straightforward. The relationship is Many-to-One.

  2. USER AND MESSAGES

    The relationship here is also Many-to-One because a user can have an unlimited number of messages associated with them. The key difference is that a message can be linked to a user in two ways: as the sender or as the receiver. (It’s important to note that the receiver_id on a message can be null in cases where the message is part of a group conversation.) In any case, the relationship remains the same, as a user can be the recipient of many messages and can also be the sender of many messages.

  3. MESSAGES AND MESSAGE ATTCHMENTS

    Each message can have one or more file attachments, though it might be practical to set a limit on the number of attachments allowed per message. Nonetheless, this creates a One-to-Many relationship between messages and message attachments.

  4. CONVERSATIONS AND USERS

    If a conversation is not a group conversation, it is a personal conversation involving two users. These users can also participate in other conversations besides this one. Therefore, the relationship between users and conversations is Many-to-Many.

The Role of Intermediate Tables in Many-to-Many Relationships

In the case of Many-to-Many relationships, like between users and groups (or conversations and users), we need an intermediate table (also known as a pivot table) to manage the links between users and groups. This table acts as a bridge, storing references to both tables, which simplifies the process of querying and managing relationships.

Taking users and groups, the intermediate table could be called something like group_user, with columns for user_id and group_id, linking users to the groups they belong to and groups to their members (users). In frameworks like Laravel, it’s common to name this table in alphabetical order of the related tables, but the naming convention ultimately depends on your preference and consistency. For other relationships, like between users and conversations, you might choose names like user_conversation or private_conversation, depending on the context and clarity you wish to achieve.

The intermediate tables in our case are as follows:

  1. USERS AND GROUPS

     CREATE TABLE `group_user`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `user_id` BIGINT NOT NULL,
         `group_id` BIGINT NOT NULL,
         `role` ENUM('admin', 'member') NOT NULL
     );
    

    As you can see, the group_user table also includes a role column, allowing the pivot table to store additional data relevant to both related tables.

  2. CONVERSATIONS (private) AND USERS

     CREATE TABLE `user_conversation`(
         `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         `user_id` BIGINT NOT NULL,
         `conversation_id` BIGINT NOT NULL,
         `created_at` DATETIME NOT NULL
     );
    

By using pivot tables, we can efficiently manage Many-to-Many relationships while also allowing for additional attributes, like roles, to be associated with the relationship itself. With this structure in place, querying and managing these connections becomes much more straightforward.

Foreign Key Constraints

To maintain data integrity and enforce relationships between tables, foreign key constraints are used to ensure that the data in one table corresponds to data in another table. These constraints help maintain the referential integrity of the database by ensuring that relationships between tables remain consistent. For example, a foreign key constraint might ensure that every group_id in the group_user table corresponds to a valid id in the groups table, and similarly, every user_id in the group_user table corresponds to a valid id in the users table. This prevents orphaned records and ensures that all references are valid.

The following foreign key constraints are used:

ALTER TABLE
    `messages` ADD CONSTRAINT `messages_sender_id_foreign` FOREIGN KEY(`sender_id`) REFERENCES `users`(`id`);
ALTER TABLE
    `messages` ADD CONSTRAINT `messages_receiver_id_foreign` FOREIGN KEY(`receiver_id`) REFERENCES `users`(`id`);
ALTER TABLE
    `messages` ADD CONSTRAINT `messages_conversation_id_foreign` FOREIGN KEY(`conversation_id`) REFERENCES `conversations`(`id`);
ALTER TABLE
    `conversations` ADD CONSTRAINT `conversations_group_id_foreign` FOREIGN KEY(`group_id`) REFERENCES `groups`(`id`);
ALTER TABLE
    `user_conversation` ADD CONSTRAINT `user_conversation_conversation_id_foreign` FOREIGN KEY(`conversation_id`) REFERENCES `conversations`(`id`);
ALTER TABLE
    `user_conversation` ADD CONSTRAINT `user_conversation_user_id_foreign` FOREIGN KEY(`user_id`) REFERENCES `users`(`id`);
ALTER TABLE
    `group_user` ADD CONSTRAINT `group_user_group_id_foreign` FOREIGN KEY(`group_id`) REFERENCES `groups`(`id`);
ALTER TABLE
    `group_user` ADD CONSTRAINT `group_user_user_id_foreign` FOREIGN KEY(`user_id`) REFERENCES `users`(`id`);
ALTER TABLE
    `message_attchmanet` ADD CONSTRAINT `message_attchmanet_message_id_foreign` FOREIGN KEY(`message_id`) REFERENCES `messages`(`id`);

Below is a visual representation of the final schema, illustrating how the tables are structured and interconnected through their relationships. This overview provides a comprehensive view of how data flows within the system.

With this structure in place, we now have a well-organized schema that ensures data integrity and facilitates efficient querying.

Key Takeaways

  • Table relationships are the backbone of a well-structured database – Understanding One-to-One, One-to-Many, and Many-to-Many relationships is crucial for efficient data retrieval.

  • Pivot tables help manage Many-to-Many relationships – Instead of storing redundant data, using an intermediate table (like group_user or user_conversation) simplifies querying.

  • Defining ownership in relationships prevents ambiguity – In One-to-One relationships, the table that cannot exist independently should hold the foreign key. Similarly, in One-to-Many relationships, a single record (like a message) can have multiple related records (like attachments), ensuring scalability.

  • A well-thought-out schema reduces complexity in application logic – Proper table relationships make querying, indexing, and managing data easier in the long run.

Conclusion

Designing a solid table schema for a chat application comes with its challenges, but getting it right makes everything else—querying, performance, and data integrity—much smoother. From handling Many-to-Many relationships with pivot tables to deciding which table should own a foreign key, every decision plays a role in how well the database functions. Looking back, refining these relationships was key to making the database both efficient and scalable. More than just structuring tables, it was about anticipating how data would flow and ensuring flexibility for future features. Hopefully, this breakdown has given you useful insights into structuring your own chat app or tackling database design more effectively. Cheers!

Further Reading

Laravel Eloquent Relationships

MySQL Foreign Key Constraints