Problem:
You have encountered the following error message while writing data to Google BigQuery (BigQuery reports the error while reading and parsing the uploaded load payload):
com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: Data cannot be parsed within the memory limit. Try reducing the data size or not using compression.
Cause:
This error is typically caused by loading too much data into BigQuery in a single request, or by compressing a payload that is too large to be parsed within BigQuery's memory limit. When writing data to BigQuery, the connector uses an intermediate compression stage, and BigQuery enforces a 4 GB limit per compressed CSV file in a load job.
Solution:
To resolve this issue, split the data in the upstream component, prior to the Write Connector, so that discrete, smaller payloads are written to BigQuery. This keeps each compressed payload within the memory limit instead of compressing a single payload that is too large. If you are writing from Spark directly, the equivalent approach is to repartition the DataFrame before the write, as in the sketch below.
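Here is a minimal PySpark sketch of that approach, assuming the open-source spark-bigquery-connector; the bucket, dataset, table names, and partition count are hypothetical, and the `writeMethod` and `temporaryGcsBucket` option names should be verified against your connector version:

```python
# A minimal sketch, assuming the open-source spark-bigquery-connector.
# All names (bucket, dataset, table, partition count) are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-write-split").getOrCreate()

# Hypothetical upstream data; substitute your own source DataFrame.
df = spark.read.parquet("gs://my-bucket/input/")

# More partitions -> smaller discrete payloads per intermediate file,
# keeping each compressed file under BigQuery's 4 GB CSV limit.
(df.repartition(200)
   .write
   .format("bigquery")
   .option("writeMethod", "indirect")            # load via GCS staging files
   .option("temporaryGcsBucket", "my-temp-bucket")
   .mode("append")
   .save("my_project.my_dataset.my_table"))
```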
To estimate the payload size, multiply the average record length by the number of records. If the uncompressed payload is over 20 GB, it is unlikely to compress down to the 4 GB limit for transfer; the back-of-envelope sketch below illustrates the arithmetic. For more information on load jobs and their limits, refer to the BigQuery quotas and limits documentation.
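A back-of-envelope sketch of that estimate; the record size, record count, and the 2 GB-per-payload target below are hypothetical illustrations, not connector behavior:

```python
# A rough estimate of payload size and a suggested split count.
# All figures here are hypothetical illustrations.
import math

avg_record_bytes = 512            # assumed average record length
num_records = 50_000_000          # assumed record count

payload_gb = avg_record_bytes * num_records / (1024 ** 3)
print(f"Estimated uncompressed payload: {payload_gb:.1f} GB")  # ~23.8 GB

# Aim for ~2 GB uncompressed per payload so that even a modest
# compression ratio keeps each file under the 4 GB compressed limit.
target_gb_per_payload = 2
suggested_splits = math.ceil(payload_gb / target_gb_per_payload)
print(f"Suggested number of splits: {suggested_splits}")
```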