Photo by AbsolutVision on Unsplash
How I Finally Cracked the Perfect Table Schema for My Chat App
Perfecting a Chat Database Schema: A Journey of Trial and Error
Have you ever embarked on a project that seems straightforward at first but quickly became a learning experience? That was me with my first project of the year.
I decided to build a web-based chat app as my first project of the year. It’s meant to help me brush up on my Laravel skills, as I only started using Laravel late last year. The app is going to be very simple in functionality, allowing users to find available users and start conversations with them, create groups, and chat with group members. The message input would support emojis, file attachments, and maybe voice notes. It’s going great so far and I’m very excited about it.
I commenced the project by setting up the development environment, which was the first and easiest part. I used Docker to containerize all the necessary software, including Node.js for managing the frontend, PHP, Composer, Nginx, MySQL, and PDO.
The next step was to design the system architecture of the application, with the database table schema being a significant component. It didn't take me long to develop the first schema, and it seemed perfect—until it wasn't.
In this guide, I'll be sharing with you:
An analysis of the initial problems with the schema I first developed.
The ideal system design to use if you ever want to build your own chat app.
Insights into database table relationships.
If you're a Laravel beginner, a developer exploring database design, or someone planning to build their own chat app, you'll find this guide helpful.
The Initial Struggles
The initial design included tables for users and messages, with each user capable of having multiple messages. Additionally, it featured tables for conversations, groups, and message attachments. The relationships between these tables are shown below.
This all seemed promising, and I even managed to seed the tables without breaking a sweat. But then came the conversations table, holding its receiver_id
with pride. At first glance, it looked all good—after all, the messages
table already had a sender_id
. You’d probably think, 'Hey, they’re linked... what could possibly go wrong?' Spoiler alert: a lot!
The issue with the conversations
table having a receiver_id
is that it restricts the conversation to a one-way communication model. This means that each conversation is designed to have only one receiver, which limits the interaction to messages being sent from multiple senders to a single receiver. In a typical chat application, you would expect a conversation to allow for two-way communication, where both parties can send and receive messages. The current design does not support this, as it only allows for messages to be directed to one receiver, making it unsuitable for a dynamic chat environment where users can freely exchange messages back and forth.
Another problem I noticed was in the seeding. Somehow, multiple users ended up with messages tied to a single private conversation (through the conversation_id
). It was like assigning the same seat on a bus to several people—it just didn’t make sense.
It was clear that I needed to develop a better schema and seeding system. I spent the following week iterating back and forth to achieve the perfect design, which I am now ready to share with you.
The Current Table Schema
For the most part, this new schema resembles the original. The main changes were made to the conversations and messages tables to address the issues identified earlier.
CONVERSATIONS
CREATE TABLE `conversations`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `type` ENUM('private', 'group') NOT NULL, `group_id` BIGINT NULL, `last_message_id` BIGINT NOT NULL, `created_at` DATETIME NOT NULL );
Compared to the previous version, this one now consists only of
type
(an Enum indicating whether the conversation is private or group) and a nullablegroup_id
for group conversations, in addition toid
and timestamps.MESSAGES
CREATE TABLE `messages`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `message` VARCHAR(255) NOT NULL, `conversation_id` BIGINT NOT NULL, `sender_id` BIGINT NOT NULL, `receiver_id` BIGINT NULL, `created_at` DATETIME NOT NULL, `updated_at` DATETIME NOT NULL );
This version now consists of
sender_id
,receiver_id
(nullable), andconversation_id
. You might wonder why the messages table doesn't just handle thegroup_id
column. This is because having bothgroup_id
andconversation_id
in the messages table would be redundant, asconversation_id
is already linked to the group (throughgroup_id
). You'll understand this better when we discuss relationships.MESSAGE ATTACHMENTS
CREATE TABLE `message_attchmanet`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `message_id` BIGINT NOT NULL, `name` VARCHAR(255) NOT NULL, `path` VARCHAR(1024) NOT NULL, `mime` VARCHAR(255) NOT NULL, `size` BIGINT NOT NULL, `created_at` DATETIME NOT NULL );
USERS
CREATE TABLE `users`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `name` VARCHAR(255) NOT NULL, `email` VARCHAR(255) NOT NULL, `avatar` VARCHAR(255) NULL, `password` VARCHAR(255) NOT NULL, `created_at` DATETIME NOT NULL, `updated_at` DATETIME NOT NULL ); ALTER TABLE `users` ADD UNIQUE `users_email_unique`(`email`);
GROUP
CREATE TABLE `groups`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `name` VARCHAR(255) NOT NULL, `decription` VARCHAR(255) NULL, `avatar` VARCHAR(255) NULL, `owner_id` BIGINT NOT NULL, `created_at` DATETIME NOT NULL, `updated_at` DATETIME NOT NULL );
Relationships
The relationships between the tables play a vital role in simplifying data retrieval. A well-designed schema relationship ensures that querying the tables is efficient and straightforward.
The relationships in this schema are as follows:
USERS AND GROUPS
As you might have guessed, Users and Groups have a
Many-to-Many
relationship. That is, a user can belong to multiple groups, and each group should contain more than one user. However, there’s a slight twist when it comes to Group Ownership. While a single user can own many groups, each group can only have one owner. This creates aOne-to-Many
relationship between users and groups in the context of ownership.CONVERSATIONS AND GROUPS
A group can have only one conversation, and all users in the group can participate in that conversation. Conversely, a (group) conversation can belong to only one group, which makes the relationship between these two tables
One-to-One
.Which Table is the Owning Side?
Now comes the important question: Which table should be the owning side of this relationship? To answer this, let’s first clarify the concept: In a One-to-One relationship, the owning side of the relationship is typically the table that holds the foreign key reference to the other table. to better understand this, suppose database table
A
had a columnB_id
which references foreign tableB
, then tableB
is said to be the owning side of the relationship.Here’s an easy way to think about it:
Can a conversation exist without a group?
The answer is No. If we define thetype
column of a conversation row as "group", then that conversation must be linked to a group (and have it’sgroup_id
column set to theid
of the group it belongs to).Can a group exist without a conversation?
Yes, a group can exist without a conversation. This is possible when a group is created, but no messages have been sent yet. Essentially, it means "the group has no conversation (yet)."
Having established that, the groups
table is the owning side, which means the conversations
table will include a group_id
foreign key reference.
Now you see why the messages
table shouldn’t have a group_id
—since the conversations
table already has a group_id
linking it to the group, adding it again in the messages
table would create redundancy.
CONVERSATIONS AND MESSAGES
For every conversation, there can be an unlimited number of messages, so this is pretty straightforward. The relationship is
Many-to-One
.USER AND MESSAGES
The relationship here is also
Many-to-One
because a user can have an unlimited number of messages associated with them. The key difference is that a message can be linked to a user in two ways: as the sender or as the receiver. (It’s important to note that thereceiver_id
on a message can benull
in cases where the message is part of a group conversation.) In any case, the relationship remains the same, as a user can be the recipient of many messages and can also be the sender of many messages.MESSAGES AND MESSAGE ATTCHMENTS
Each message can have one or more file attachments, though it might be practical to set a limit on the number of attachments allowed per message. Nonetheless, this creates a
One-to-Many
relationship between messages and message attachments.CONVERSATIONS AND USERS
If a conversation is not a group conversation, it is a personal conversation involving two users. These users can also participate in other conversations besides this one. Therefore, the relationship between users and conversations is
Many-to-Many
.
The Role of Intermediate Tables in Many-to-Many Relationships
In the case of Many-to-Many
relationships, like between users
and groups
(or conversations
and users
), we need an intermediate table (also known as a pivot table) to manage the links between users and groups. This table acts as a bridge, storing references to both tables, which simplifies the process of querying and managing relationships.
Taking users
and groups
, the intermediate table could be called something like group_user
, with columns for user_id
and group_id
, linking users to the groups they belong to and groups to their members (users). In frameworks like Laravel, it’s common to name this table in alphabetical order of the related tables, but the naming convention ultimately depends on your preference and consistency. For other relationships, like between users
and conversations
, you might choose names like user_conversation
or private_conversation
, depending on the context and clarity you wish to achieve.
The intermediate tables in our case are as follows:
USERS AND GROUPS
CREATE TABLE `group_user`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `user_id` BIGINT NOT NULL, `group_id` BIGINT NOT NULL, `role` ENUM('admin', 'member') NOT NULL );
As you can see, the
group_user
table also includes arole
column, allowing the pivot table to store additional data relevant to both related tables.CONVERSATIONS (private) AND USERS
CREATE TABLE `user_conversation`( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `user_id` BIGINT NOT NULL, `conversation_id` BIGINT NOT NULL, `created_at` DATETIME NOT NULL );
By using pivot tables, we can efficiently manage Many-to-Many
relationships while also allowing for additional attributes, like roles, to be associated with the relationship itself. With this structure in place, querying and managing these connections becomes much more straightforward.
Foreign Key Constraints
To maintain data integrity and enforce relationships between tables, foreign key constraints are used to ensure that the data in one table corresponds to data in another table. These constraints help maintain the referential integrity of the database by ensuring that relationships between tables remain consistent. For example, a foreign key constraint might ensure that every group_id
in the group_user
table corresponds to a valid id
in the groups
table, and similarly, every user_id
in the group_user
table corresponds to a valid id
in the users
table. This prevents orphaned records and ensures that all references are valid.
The following foreign key constraints are used:
ALTER TABLE
`messages` ADD CONSTRAINT `messages_sender_id_foreign` FOREIGN KEY(`sender_id`) REFERENCES `users`(`id`);
ALTER TABLE
`messages` ADD CONSTRAINT `messages_receiver_id_foreign` FOREIGN KEY(`receiver_id`) REFERENCES `users`(`id`);
ALTER TABLE
`messages` ADD CONSTRAINT `messages_conversation_id_foreign` FOREIGN KEY(`conversation_id`) REFERENCES `conversations`(`id`);
ALTER TABLE
`conversations` ADD CONSTRAINT `conversations_group_id_foreign` FOREIGN KEY(`group_id`) REFERENCES `groups`(`id`);
ALTER TABLE
`user_conversation` ADD CONSTRAINT `user_conversation_conversation_id_foreign` FOREIGN KEY(`conversation_id`) REFERENCES `conversations`(`id`);
ALTER TABLE
`user_conversation` ADD CONSTRAINT `user_conversation_user_id_foreign` FOREIGN KEY(`user_id`) REFERENCES `users`(`id`);
ALTER TABLE
`group_user` ADD CONSTRAINT `group_user_group_id_foreign` FOREIGN KEY(`group_id`) REFERENCES `groups`(`id`);
ALTER TABLE
`group_user` ADD CONSTRAINT `group_user_user_id_foreign` FOREIGN KEY(`user_id`) REFERENCES `users`(`id`);
ALTER TABLE
`message_attchmanet` ADD CONSTRAINT `message_attchmanet_message_id_foreign` FOREIGN KEY(`message_id`) REFERENCES `messages`(`id`);
Below is a visual representation of the final schema, illustrating how the tables are structured and interconnected through their relationships. This overview provides a comprehensive view of how data flows within the system.
With this structure in place, we now have a well-organized schema that ensures data integrity and facilitates efficient querying.
Key Takeaways
Table relationships are the backbone of a well-structured database – Understanding One-to-One, One-to-Many, and Many-to-Many relationships is crucial for efficient data retrieval.
Pivot tables help manage Many-to-Many relationships – Instead of storing redundant data, using an intermediate table (like
group_user
oruser_conversation
) simplifies querying.Defining ownership in relationships prevents ambiguity – In One-to-One relationships, the table that cannot exist independently should hold the foreign key. Similarly, in One-to-Many relationships, a single record (like a message) can have multiple related records (like attachments), ensuring scalability.
A well-thought-out schema reduces complexity in application logic – Proper table relationships make querying, indexing, and managing data easier in the long run.
Conclusion
Designing a solid table schema for a chat application comes with its challenges, but getting it right makes everything else—querying, performance, and data integrity—much smoother. From handling Many-to-Many relationships with pivot tables to deciding which table should own a foreign key, every decision plays a role in how well the database functions. Looking back, refining these relationships was key to making the database both efficient and scalable. More than just structuring tables, it was about anticipating how data would flow and ensuring flexibility for future features. Hopefully, this breakdown has given you useful insights into structuring your own chat app or tackling database design more effectively. Cheers!