Thursday, February 1, 2024

BigQuery - How to Import CSV data into BigQuery

In this article, you will learn how to build a new GCP project, construct a dataset or data schema, and then create a table by uploading raw data or a CSV file into BigQuery.

You can import CSV data into BigQuery manually using the BigQuery web UI. Here are the steps to do so:
  1. Navigate to BigQuery Console: Open the BigQuery web UI by visiting the BigQuery Console in your web browser. Make sure you're logged in with your Google Cloud Platform (GCP) account that has access to the desired BigQuery dataset.


  2. Select Project and Dataset: Select the GCP project and dataset where you want to import the CSV data. You can do this from the project dropdown menu and dataset dropdown menu on the left-hand side of the BigQuery web UI.


  3. Click "Create Table" Button: Within the selected dataset, click on the "Create Table" button to start the process of importing the CSV data.


  4. Specify Table Details: Fill out the necessary details for the new table you're creating:

    • Table Name: Choose a name for your table.
    • Schema: Define the schema of your table (i.e., column names and data types). You can let BigQuery auto-detect the schema based on the CSV data, or you can specify it manually.
    • Table Type: Choose whether your table is Native table or External table (for data stored outside of BigQuery).

  5. Select Source Data: Choose "Upload" as the source data. You'll be prompted to upload the CSV file from your local machine.


  6. Upload CSV File: Click on the "Select a file from your local machine" button to browse and select the CSV file you want to import.


  7. Configure Import Options: BigQuery provides options for configuring the import process, such as selecting the delimiter, choosing whether to allow quoted newlines, and specifying the number of rows to skip.


  8. Review and Confirm: Review the table details and import configuration to ensure everything is set up correctly. Once you're satisfied, click on the "Create Table" button to start the import process.


  9. Monitor Import Progress: BigQuery will begin importing the CSV data into the specified table. You can monitor the progress of the import job from the BigQuery web UI. Depending on the size of the CSV file and your network connection speed, the import process may take some time to complete.


  10. Verify Imported Data: Once the import job is finished, you can verify that the CSV data has been successfully imported into the BigQuery table by running SQL queries or viewing the table data from the BigQuery web UI.

To complete the tasks outlined above, you must have a GCP account and appropriate access. 


Data validation — Data validation is the process of checking the data against predefined rules and standards, such as data types, formats, ranges, and constraints.

  1. πŸ’«Schema Validation: Verify data adherence to predefined schemas, checking types, formats, and structures.
  2. πŸ’«Integrity Constraints: Enforce rules and constraints to maintain data integrity, preventing inconsistencies.
  3. πŸ’«Cross-Field Validation: Validate relationships and dependencies between different fields to ensure logical coherence.
  4. πŸ’«Data Quality Metrics: Define and track quality metrics, such as completeness, accuracy, and consistency.
  5. πŸ’«Automated Validation Scripts: Develop and run automated scripts to check data against predefined rules and criteria.

 

To learn more, please follow us -
πŸ”Š http://www.sql-datatools.com

To Learn more, please visit our YouTube channel at —
πŸ”Š http://www.youtube.com/c/Sql-datatools

To Learn more, please visit our Instagram account at -
πŸ”Š https://www.instagram.com/asp.mukesh/

To Learn more, please visit our twitter account at -
πŸ”Š
 https://twitter.com/macxima

No comments:

Post a Comment