Enable the BigQuery sandbox | Google Cloud
IBM Netezza
Netezza is a data warehouse system that offers analytics, AI, and machine learning (ML) capabilities. It is an IBM product (IBM acquired Netezza in 2010) and is available on IBM Cloud, AWS, and Microsoft Azure.
Features
- Scalability: Scales up and down based on usage
- Open formats: Supports open formats like Parquet and Iceberg for secure data sharing
- In-database analytics: Allows users to run complex queries and build models directly in the database
- Geospatial capabilities: Built-in geospatial capabilities for analyzing data
- Solid-state disks: Data is stored on solid-state disks (SSDs) that are self-encrypting drives (SEDs)