When the binlog file is larger than 4G, data loss occurs #1366
Comments
Thank you for your detailed report and the depth of understanding you've shown regarding gh-ost and MySQL internals. We have carefully reviewed the implications of accepting binlog events beyond the standard limits. gh-ost functions by presenting itself as a replica to the primary database. This design means that in scenarios where a transaction exceeds 1GB, the likelihood of gh-ost successfully replicating this data without encountering errors is significantly reduced. MySQL's replication mechanism, which gh-ost relies upon, is not designed to handle individual transactions of this size, as they exceed its established limits.

Our commitment with gh-ost is to provide a tool that not only enhances MySQL's capabilities but also aligns closely with its core principles and standards. Deviating from these established limits, even for edge cases, could lead to unpredictable challenges and potentially encourage practices outside MySQL's recommended configurations.

In light of your report, we are considering ways to harden this check by moving it from a soft failure to a hard failure. However, at this time we cannot provide a specific timeline for this enhancement. We believe this approach will further safeguard the tool against unintended consequences and align it more closely with MySQL's standards.

We understand this might pose challenges in specific scenarios, especially with very large transactions. We recommend exploring alternative strategies, such as breaking these transactions down into smaller, more manageable sizes that align with MySQL's configuration guidelines. Thank you again for your contribution and understanding.
Hi. Since the table contains large rows, a single 4MB row with a chunk size of 1000 produces a copy transaction larger than 4GB, and there is nothing on our side to prevent this from happening. Losing data is a serious problem. Thanks for the evaluation.
Hi. When a binlog file grows larger than 4GB, data loss occurs.
MySQL allows a single binlog file to be larger than 4GB, but in the MySQL source code the binlog event end_log_pos field is defined as uint32, which can represent positions only up to 4GB. Therefore, when a single binlog file exceeds 4GB, the stored end_log_pos overflows and wraps around, and the values repeat cyclically.
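To make the wraparound concrete, here is a minimal Go sketch (gh-ost is written in Go) reproducing the arithmetic with the figures reported below: an event following position 4294962881 whose size pushes past 2^32 wraps to 3601. The 8016-byte event size is chosen purely so the numbers work out; it is not from the original report.

```go
package main

import "fmt"

func main() {
	// end_log_pos is a uint32 in the binlog event header, so positions
	// past 4GB (2^32 bytes) wrap around to small values.
	var endLogPos uint32 = 4294962881

	// A hypothetical next event of 8016 bytes: 4294962881 + 8016
	// exceeds 2^32, so the stored position wraps to 3601.
	const nextEventSize = 8016
	next := endLogPos + nextEventSize // uint32 addition wraps modulo 2^32

	fmt.Println(next)             // 3601
	fmt.Println(next < endLogPos) // true: the "later" event compares smaller
}
```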
Try using mysqlbinlog to parse a binlog file larger than 4GB. For example:
when end_log_pos reaches 4294962881, the next event's end_log_pos overflows to 3601 and becomes smaller than its predecessor's.
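As a rough sketch, the wrap point in mysqlbinlog output would look something like this; everything other than the two end_log_pos values above (timestamp, offsets, table id, event type) is invented for illustration:

```
# at 4294954865
#240101 12:00:00 server id 1  end_log_pos 4294962881 	Write_rows: table id 108 flags: STMT_END_F
# at 4294962881
#240101 12:00:00 server id 1  end_log_pos 3601 	Write_rows: table id 108 flags: STMT_END_F
```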
Why data is lost
In the handleRowsEvent method of the gh-ost code, which processes DML events, there is a guard: if the end_log_pos of the currently received event is less than or equal to the end_log_pos of the last executed event, the current event is ignored. When a binlog file grows beyond 4GB, every event past the 4GB mark is discarded, because its stored end_log_pos has overflowed and compares smaller.
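A simplified Go sketch of the guard being described (a paraphrase of the logic, not the literal gh-ost source; the type and function names here are illustrative):

```go
// Paraphrase of the skip logic described above. Coordinates are a
// (file, position) pair, where the position comes from the uint32
// end_log_pos field of the event header.
type binlogCoordinates struct {
	LogFile string
	LogPos  uint32 // wraps once the file grows past 4GB
}

func (c *binlogCoordinates) smallerThanOrEquals(other *binlogCoordinates) bool {
	if c.LogFile != other.LogFile {
		return c.LogFile < other.LogFile
	}
	return c.LogPos <= other.LogPos
}

// handleRowsEvent-style guard: events whose coordinates do not advance
// are assumed to be replays and are dropped. Once LogPos wraps at 4GB,
// every subsequent event in the same file "fails to advance" and is lost.
func shouldSkip(current, lastApplied *binlogCoordinates) bool {
	return current.smallerThanOrEquals(lastApplied)
}
```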
The reason why binlog is larger than 4G
When the data table contains large rows, copying too many rows per chunk can produce a very large transaction (say, larger than 4GB), or the business workload itself may issue large transactions. Since MySQL always writes a transaction to a single binlog file, that file can grow past 4GB.
Group commit and binlog
Under the group commit mechanism, committing a large transaction (say, larger than 4GB) flushes the binlog caches of the N transactions in the same commit group and writes them all to the same binlog file. DML changes concurrently being made to the table under DDL may also land in this group and be written to that binlog file together.
Fix
My understanding is that this code is only an optimization to avoid re-executing events when the MySQL reader retries. Could this code be removed to fix the problem described above? (A possible overflow-aware alternative is sketched below.)
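One possible alternative, rather than deleting the guard outright, is to detect the wrap and keep a 64-bit notion of position. This is a hypothetical sketch, not gh-ost's actual fix; all names here are invented:

```go
// Hypothetical mitigation sketch: track a 64-bit logical position per
// binlog file and un-wrap the 32-bit end_log_pos by counting how many
// times it has rolled over within the file.
type unwrappedPosition struct {
	logFile   string
	lastRaw   uint32 // last raw end_log_pos seen
	rollovers uint64 // how many times the 32-bit counter has wrapped
}

func (p *unwrappedPosition) advance(logFile string, raw uint32) uint64 {
	if logFile != p.logFile {
		// New binlog file: reset the rollover counter.
		p.logFile, p.lastRaw, p.rollovers = logFile, raw, 0
	} else {
		if raw < p.lastRaw {
			// Positions only move forward within a file, so a smaller raw
			// value means the 32-bit counter wrapped past 4GB.
			p.rollovers++
		}
		p.lastRaw = raw
	}
	return p.rollovers<<32 | uint64(raw) // 64-bit, monotonically increasing
}
```

A real fix would also need to distinguish a genuine wrap from a reader retry that legitimately replays earlier positions; this sketch glosses over that.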
Thank you!