
When the binlog file is larger than 4G, data loss occurs #1366

Closed
dongwenpeng opened this issue Jan 11, 2024 · 2 comments

Comments

@dongwenpeng

dongwenpeng commented Jan 11, 2024

Hi, when a binlog file is larger than 4G, data loss occurs.

MySQL allows a single binlog file to grow larger than 4G, but in the MySQL source code the binlog event end_log_pos field is defined as uint32, which can represent at most 4G (2^32 - 1). When a single binlog file exceeds 4G, the stored end_log_pos overflows and wraps around cyclically.

Parsing a binlog file larger than 4G with mysqlbinlog shows the wraparound, for example:

#202409 20:06:42 server id 100  end_log_pos 4294954865 CRC32 0x5e0195d9         Update_rows: table id 266
# at 4294954865
#202409 20:06:42 server id 100  end_log_pos 4294962881 CRC32 0x2f0b79cc         Update_rows: table id 266
# at 4294962881
#202409 20:06:42 server id 100  end_log_pos 3601 CRC32 0x2f367261       Update_rows: table id 266
# at 4294970897
#202409 20:06:42 server id 100  end_log_pos 11617 CRC32 0xb1ac6949      Update_rows: table id 266

When end_log_pos reaches 4294962881, the next event's end_log_pos overflows and wraps to 3601, which is smaller than the previous value.

Why data is lost

In the handleRowsEvent method of gh-ost, which processes DML events, there is a guard: if the end_log_pos of the currently received event is less than or equal to the end_log_pos of the last applied event, the current event is skipped. When the binlog file is larger than 4G, every event past the 4G boundary is discarded, because its stored end_log_pos has overflowed and compares as smaller.

// StreamEvents
func (this *GoMySQLReader) handleRowsEvent(ev *replication.BinlogEvent, rowsEvent *replication.RowsEvent, entriesChannel chan<- *BinlogEntry) error {
	if this.currentCoordinates.SmallerThanOrEquals(&this.LastAppliedRowsEventHint) {
		this.migrationContext.Log.Debugf("Skipping handled query at %+v", this.currentCoordinates)
		return nil
	}
	...
	// Execute event
	...
	// Record the position of the last applied event
	this.LastAppliedRowsEventHint = this.currentCoordinates
	return nil
}
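To see why the guard misfires, here is a simplified stand-in for the coordinate comparison (the real `BinlogCoordinates` type in gh-ost's `go/mysql` package compares the log file name and position; this sketch keeps only the parts relevant to the overflow):

```go
package main

import "fmt"

// coords is a simplified stand-in for gh-ost's BinlogCoordinates.
type coords struct {
	logFile string
	logPos  int64
}

// smallerThanOrEquals mirrors the ordering used by the guard in handleRowsEvent:
// compare positions within the same file, otherwise compare file names.
func (c coords) smallerThanOrEquals(other coords) bool {
	if c.logFile == other.logFile {
		return c.logPos <= other.logPos
	}
	return c.logFile < other.logFile
}

func main() {
	last := coords{"mysql-bin.000001", 4294962881} // last applied event
	next := coords{"mysql-bin.000001", 3601}       // wrapped end_log_pos

	// The wrapped position compares as "already handled", so every
	// event past the 4GiB boundary in this file would be dropped.
	fmt.Println(next.smallerThanOrEquals(last)) // true — event is skipped
}
```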

Why a binlog file can grow larger than 4G

When the data table contains big rows, copying too many rows per chunk may produce a large transaction (say, larger than 4G), or the business workload itself may generate large transactions. A transaction is always written to a single binlog file, so that file can exceed 4G.

Group commit and binlog

Under the group commit mechanism, committing a large transaction (say, larger than 4G) flushes the binlog caches of the N transactions in the same commit group into the same binlog file. DML changes being made concurrently to the table under DDL may also be part of that group and get written to the same binlog file.

Fix

As I understand it, this check is only an optimization to avoid re-applying events when the MySQL reader retries. Could this code be removed to fix the problem above?

if this.currentCoordinates.SmallerThanOrEquals(&this.LastAppliedRowsEventHint) {
	this.migrationContext.Log.Debugf("Skipping handled query at %+v", this.currentCoordinates)
	return nil
}
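As an alternative to removing the check entirely, the skip optimization could be kept by extending the 32-bit end_log_pos into a monotonically increasing 64-bit offset. This is a hypothetical sketch, not gh-ost code; the tracker would have to be reset whenever the binlog file rotates:

```go
package main

import "fmt"

// positionTracker extends the on-disk uint32 end_log_pos into a 64-bit
// offset by counting 2^32 wraparounds within a single binlog file.
// (Hypothetical helper; reset it on binlog file rotation.)
type positionTracker struct {
	wraps   uint64
	lastPos uint32
}

// extend returns a position that keeps increasing even when the
// stored end_log_pos overflows past the 4GiB boundary.
func (t *positionTracker) extend(pos uint32) uint64 {
	if pos < t.lastPos {
		t.wraps++ // position went backwards within one file: a 2^32 wrap
	}
	t.lastPos = pos
	return t.wraps<<32 | uint64(pos)
}

func main() {
	var t positionTracker
	fmt.Println(t.extend(4294962881)) // 4294962881
	fmt.Println(t.extend(3601))       // 4294970897 — still increasing
}
```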

Thank you!

@TeeKraken

Thank you for your detailed report and the depth of understanding you've shown regarding gh-ost and MySQL functionalities.

We have carefully reviewed the implications of accepting binlog events that exceed the standard max_binlog_size and max_allowed_packet settings, as documented in the MySQL Reference Manual (max_binlog_size, max_allowed_packet). After this review, we have decided against implementing the requested change, primarily due to the way gh-ost interacts with the primary instance.

gh-ost functions by mimicking a replica of the primary database. This design means that in scenarios where a transaction exceeds 1GB, the likelihood of gh-ost successfully replicating the data without encountering errors is significantly reduced. MySQL's replication mechanism, which gh-ost relies upon, is not designed to handle individual transactions of this size, as they exceed the max_binlog_size and max_allowed_packet limits. For information on replication and max_allowed_packet, see this dev doc.

Our commitment with gh-ost is to provide a tool that not only enhances MySQL's capabilities but also aligns closely with its core principles and standards. Deviating from these established limits, even for edge cases, could lead to unpredictable challenges and potentially encourage practices that are outside MySQL’s recommended configurations.

In light of your report, we are considering ways to harden this check to move it from a soft failure to a hard failure. However, at this time, we cannot provide a specific timeline for this enhancement. We believe this approach will further safeguard the tool against unintended consequences and align it more closely with MySQL's standards.

We understand this might pose challenges in specific scenarios, especially with very large transactions. We recommend exploring alternate strategies, such as breaking down these transactions into smaller, more manageable sizes that align with MySQL's configuration guidelines.

Thank you again for your contribution and understanding.

@dongwenpeng
Author

Hi, since the data table has big rows, assuming a single row is 4M and the chunk size is 1000, the transaction generated by one copy step is larger than 4G. There is nothing we can do to prevent this from happening, and losing data is a serious problem.
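For scale, a back-of-the-envelope check of that scenario (the 4M row size and 1000-row chunk are the assumed figures from the comment above; binlog event headers, row images and surrounding transactions in the same file add more on top of the raw row data):

```go
package main

import "fmt"

func main() {
	// Assumed figures from the scenario: ~4 MiB per row, 1000 rows per chunk.
	rowSize := int64(4 << 20) // 4 MiB
	chunkRows := int64(1000)

	txBytes := rowSize * chunkRows
	fmt.Println(txBytes) // 4194304000 bytes of raw row data in one copy chunk
	// A uint32 end_log_pos tops out at 4294967295, so a binlog file holding
	// this transaction plus ordinary traffic crosses the 2^32 boundary.
}
```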

Thanks for the evaluation.
