Struct google_api_proto::google::cloud::bigquery::v2::JobConfigurationLoad
pub struct JobConfigurationLoad {
pub source_uris: Vec<String>,
pub file_set_spec_type: i32,
pub schema: Option<TableSchema>,
pub destination_table: Option<TableReference>,
pub destination_table_properties: Option<DestinationTableProperties>,
pub create_disposition: String,
pub write_disposition: String,
pub null_marker: Option<String>,
pub field_delimiter: String,
pub skip_leading_rows: Option<i32>,
pub encoding: String,
pub quote: Option<String>,
pub max_bad_records: Option<i32>,
pub allow_quoted_newlines: Option<bool>,
pub source_format: String,
pub allow_jagged_rows: Option<bool>,
pub ignore_unknown_values: Option<bool>,
pub projection_fields: Vec<String>,
pub autodetect: Option<bool>,
pub schema_update_options: Vec<String>,
pub time_partitioning: Option<TimePartitioning>,
pub range_partitioning: Option<RangePartitioning>,
pub clustering: Option<Clustering>,
pub destination_encryption_configuration: Option<EncryptionConfiguration>,
pub use_avro_logical_types: Option<bool>,
pub reference_file_schema_uri: Option<String>,
pub hive_partitioning_options: Option<HivePartitioningOptions>,
pub decimal_target_types: Vec<i32>,
pub json_extension: i32,
pub parquet_options: Option<ParquetOptions>,
pub preserve_ascii_control_characters: Option<bool>,
pub connection_properties: Vec<ConnectionProperty>,
pub create_session: Option<bool>,
pub column_name_character_map: i32,
pub copy_files_only: Option<bool>,
}
JobConfigurationLoad contains the configuration properties for loading data into a destination table.
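Because the struct implements Default, a configuration is usually built by naming only the fields that matter and defaulting the rest. The sketch below uses a trimmed stand-in struct with a handful of the fields listed above so that it compiles without the crate; the helper `csv_load` and the bucket path in the usage note are hypothetical.

```rust
// Stand-in carrying a few of the documented fields. The real type is generated
// by prost in the google_api_proto crate and has many more fields.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct JobConfigurationLoad {
    pub source_uris: Vec<String>,
    pub create_disposition: String,
    pub write_disposition: String,
    pub source_format: String,
    pub skip_leading_rows: Option<i32>,
    pub autodetect: Option<bool>,
}

/// Hypothetical helper: a CSV load with one header row, appending to the
/// destination table. Every field not named here keeps its Default value.
pub fn csv_load(uri: &str) -> JobConfigurationLoad {
    JobConfigurationLoad {
        source_uris: vec![uri.to_string()],
        source_format: "CSV".to_string(),
        write_disposition: "WRITE_APPEND".to_string(),
        skip_leading_rows: Some(1),
        ..Default::default()
    }
}
```

Calling `csv_load("gs://example-bucket/data/*.csv")` leaves `autodetect` as `None` and `create_disposition` as an empty string, which the service is expected to treat as unset.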
Fields§
source_uris: Vec<String>
[Required] The fully-qualified URIs that point to your data in Google Cloud. For Google Cloud Storage URIs: Each URI can contain one ‘*’ wildcard character and it must come after the ‘bucket’ name. Size limits related to load jobs apply to external data sources. For Google Cloud Bigtable URIs: Exactly one URI can be specified and it has to be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table. For Google Cloud Datastore backups: Exactly one URI can be specified. Also, the ‘*’ wildcard character is not allowed.
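The Cloud Storage wildcard rule can be checked on the client before a job is submitted. This is a minimal sketch, assuming URIs use the `gs://bucket/object` form; the function name is hypothetical and the check is not part of the crate.

```rust
/// Returns true if `uri` looks like a Cloud Storage URI whose bucket name has
/// no wildcard and whose object path contains at most one `*`.
pub fn is_valid_gcs_source_uri(uri: &str) -> bool {
    // Assumption: Cloud Storage URIs start with the gs:// scheme.
    let rest = match uri.strip_prefix("gs://") {
        Some(rest) => rest,
        None => return false,
    };
    // Split "bucket/object/path" into the bucket and the object path.
    let (bucket, object) = match rest.split_once('/') {
        Some(parts) => parts,
        None => (rest, ""),
    };
    !bucket.is_empty() && !bucket.contains('*') && object.matches('*').count() <= 1
}
```

The service remains the authority on what it accepts; this only catches the two mistakes the description names.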
file_set_spec_type: i32
Optional. Specifies how source URIs are interpreted for constructing the file set to load. By default, source URIs are expanded against the underlying storage. You can also specify manifest files to control how the file set is constructed. This option is only applicable to object storage systems.
schema: Option<TableSchema>
Optional. The schema for the destination table. The schema can be omitted if the destination table already exists, or if you’re loading data from Google Cloud Datastore.
destination_table: Option<TableReference>
[Required] The destination table to load the data into.
destination_table_properties: Option<DestinationTableProperties>
Optional. [Experimental] Properties with which to create the destination table if it is new.
create_disposition: String
Optional. Specifies whether the job is allowed to create new tables. The following values are supported:
- CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table.
- CREATE_NEVER: The table must already exist. If it does not, a ‘notFound’ error is returned in the job result.
The default value is CREATE_IF_NEEDED. Creation, truncation and append actions occur as one atomic update upon job completion.
write_disposition: String
Optional. Specifies the action that occurs if the destination table already exists. The following values are supported:
- WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the data, removes the constraints and uses the schema from the load job.
- WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
- WRITE_EMPTY: If the table already exists and contains data, a ‘duplicate’ error is returned in the job result.
The default value is WRITE_APPEND. Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.
null_marker: Option<String>
Optional. Specifies a string that represents a null value in a CSV file. For example, if you specify “\N”, BigQuery interprets “\N” as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
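The null-marker rules can be modelled as a small function. This is a sketch of the behaviour described above, not BigQuery's parser; the `Cell` enum, the function name, and the boolean column flag are all hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum Cell {
    Null,
    Empty,
    Value(String),
}

/// Interprets one raw CSV cell. `null_marker` is the configured marker (the
/// default is the empty string); `is_string_or_bytes` says whether the target
/// column is STRING or BYTE.
pub fn interpret_cell(
    raw: &str,
    null_marker: &str,
    is_string_or_bytes: bool,
) -> Result<Cell, String> {
    if raw == null_marker {
        // Covers the default case too: an empty cell matches the empty marker.
        return Ok(Cell::Null);
    }
    if raw.is_empty() {
        // A custom marker is in use, so an empty cell is no longer a null.
        return if is_string_or_bytes {
            Ok(Cell::Empty)
        } else {
            Err("empty string in a column that is not STRING or BYTE".to_string())
        };
    }
    Ok(Cell::Value(raw.to_string()))
}
```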
field_delimiter: String
Optional. The separator character for fields in a CSV file. The separator is interpreted as a single byte. For files encoded in ISO-8859-1, any single character can be used as a separator. For files encoded in UTF-8, characters represented in decimal range 1-127 (U+0001-U+007F) can be used without any modification. UTF-8 characters encoded with multiple bytes (i.e. U+0080 and above) will have only the first byte used for separating fields. The remaining bytes will be treated as a part of the field. BigQuery also supports the escape sequence “\t” (U+0009) to specify a tab separator. The default value is comma (“,”, U+002C).
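For UTF-8 input, the single-byte rule means a multi-byte separator silently degrades to its first byte. The sketch below computes the byte that would be used, assuming an empty `field_delimiter` string stands for the default comma; the function name is hypothetical.

```rust
/// Returns the byte used to split fields for a UTF-8 encoded file.
pub fn effective_delimiter(field_delimiter: &str) -> u8 {
    match field_delimiter {
        // Assumption: an unset (empty) value means the default comma.
        "" => b',',
        // The two-character escape sequence \t selects a tab (U+0009).
        "\\t" => b'\t',
        // Otherwise only the first byte of the UTF-8 encoding is used.
        other => other.as_bytes()[0],
    }
}
```

For example, U+00E9 is encoded in UTF-8 as the bytes 0xC3 0xA9, so only 0xC3 would act as the separator and 0xA9 would be left inside the field data.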
skip_leading_rows: Option<i32>
Optional. The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped. When autodetect is on, the behavior is the following:
- skipLeadingRows unspecified - Autodetect tries to detect headers in the first row. If they are not detected, the row is read as data. Otherwise data is read starting from the second row.
- skipLeadingRows is 0 - Instructs autodetect that there are no headers and data should be read starting from the first row.
- skipLeadingRows = N > 0 - Autodetect skips N-1 rows and tries to detect headers in row N. If headers are not detected, row N is just skipped. Otherwise row N is used to extract column names for the detected schema.
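The three autodetect cases above map directly onto a match over the `Option<i32>` field. This is a sketch for illustration; the `HeaderPlan` enum and function name are hypothetical, and negative values return `None` because the description does not cover them.

```rust
#[derive(Debug, PartialEq)]
pub enum HeaderPlan {
    /// Probe the 1-based row `row` for headers after skipping the rows before it.
    DetectAt { row: u32, skipped_before: u32 },
    /// No header detection; data is read starting from the first row.
    NoHeaders,
}

/// Describes what autodetect does for a given `skip_leading_rows` value.
pub fn autodetect_header_plan(skip_leading_rows: Option<i32>) -> Option<HeaderPlan> {
    match skip_leading_rows {
        // Unspecified: try to detect headers in the first row.
        None => Some(HeaderPlan::DetectAt { row: 1, skipped_before: 0 }),
        // Zero: no headers at all.
        Some(0) => Some(HeaderPlan::NoHeaders),
        // N > 0: skip N-1 rows, then probe row N.
        Some(n) if n > 0 => Some(HeaderPlan::DetectAt {
            row: n as u32,
            skipped_before: (n - 1) as u32,
        }),
        // Negative values are not described in the source.
        Some(_) => None,
    }
}
```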
encoding: String
Optional. The character encoding of the data.
The supported values are UTF-8, ISO-8859-1, UTF-16BE, UTF-16LE, UTF-32BE, and UTF-32LE. The default value is UTF-8. BigQuery decodes the data after the raw, binary data has been split using the values of the quote and fieldDelimiter properties.
If you don’t specify an encoding, or if you specify a UTF-8 encoding when the CSV file is not UTF-8 encoded, BigQuery attempts to convert the data to UTF-8. Generally, your data loads successfully, but it may not match byte-for-byte what you expect. To avoid this, specify the correct encoding by using the --encoding flag.
If BigQuery can’t convert a character other than the ASCII 0 character, BigQuery converts the character to the standard Unicode replacement character: �.
quote: Option<String>
Optional. The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double quote ("). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true. To include the specific quote character within a quoted value, precede it with an additional matching quote character. For example, to escape the default character ", use "".
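When producing CSV for such a load, the doubling rule is simple to apply on the writer side. A minimal sketch, with a hypothetical function name:

```rust
/// Wraps `value` in `quote` characters, doubling every occurrence of `quote`
/// inside the value so that it survives as a literal character.
pub fn quote_field(value: &str, quote: char) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push(quote);
    for ch in value.chars() {
        if ch == quote {
            // Precede the quote character with an additional matching one.
            out.push(quote);
        }
        out.push(ch);
    }
    out.push(quote);
    out
}
```

With the default double quote, the value `say "hi"` is written as `"say ""hi"""`.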
max_bad_records: Option<i32>
Optional. The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid. This is only supported for CSV and NEWLINE_DELIMITED_JSON file formats.
allow_quoted_newlines: Option<bool>
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
source_format: String
Optional. The format of the data files. For CSV files, specify “CSV”. For datastore backups, specify “DATASTORE_BACKUP”. For newline-delimited JSON, specify “NEWLINE_DELIMITED_JSON”. For Avro, specify “AVRO”. For parquet, specify “PARQUET”. For orc, specify “ORC”. The default value is CSV.
allow_jagged_rows: Option<bool>
Optional. Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
ignore_unknown_values: Option<bool>
Optional. Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. The sourceFormat property determines what BigQuery treats as an extra value:
- CSV: Trailing columns
- JSON: Named values that don’t match any column names in the table schema
- Avro, Parquet, ORC: Fields in the file schema that don’t exist in the table schema.
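For CSV, allow_jagged_rows and ignore_unknown_values cover the two ways a row can disagree with the schema width. The sketch below models that interaction; the names are hypothetical, and it assumes every missing trailing column is optional (nullable).

```rust
#[derive(Debug, PartialEq)]
pub enum RowVerdict {
    Accepted,
    BadRecord,
}

/// Classifies a CSV row by comparing its value count with the schema width.
pub fn classify_csv_row(
    values: usize,
    schema_columns: usize,
    allow_jagged_rows: bool,
    ignore_unknown_values: bool,
) -> RowVerdict {
    let accepted = if values < schema_columns {
        // Missing trailing columns are read as nulls only when jagged rows are allowed.
        allow_jagged_rows
    } else if values > schema_columns {
        // Extra trailing columns are dropped only when unknown values are ignored.
        ignore_unknown_values
    } else {
        true
    };
    if accepted { RowVerdict::Accepted } else { RowVerdict::BadRecord }
}
```

Rows classified as bad records count against max_bad_records.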
projection_fields: Vec<String>
If sourceFormat is set to “DATASTORE_BACKUP”, indicates which entity properties to load into BigQuery from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If no properties are specified, BigQuery loads all properties. If any named property isn’t found in the Cloud Datastore backup, an invalid error is returned in the job result.
autodetect: Option<bool>
Optional. Indicates if we should automatically infer the options and schema for CSV and JSON sources.
schema_update_options: Vec<String>
Allows the schema of the destination table to be updated as a side effect of the load job if a schema is autodetected or supplied in the job configuration. Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema. One or more of the following values are specified:
- ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema.
- ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
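The two supported cases can be expressed as a pre-flight check. This is a sketch under stated assumptions: an empty write_disposition is treated as the default WRITE_APPEND, and a partition decorator is recognised by a `$` in the table ID (for example `events$20240101`). The function name is hypothetical.

```rust
/// Returns true if schema update options may be used with this combination
/// of write disposition and destination table ID.
pub fn schema_update_allowed(write_disposition: &str, destination_table_id: &str) -> bool {
    match write_disposition {
        // Assumption: empty means the field was left at its default, WRITE_APPEND.
        "" | "WRITE_APPEND" => true,
        // Allowed only when the destination is a single partition of a table.
        // Assumption: a '$' in the table ID marks a partition decorator.
        "WRITE_TRUNCATE" => destination_table_id.contains('$'),
        _ => false,
    }
}
```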
time_partitioning: Option<TimePartitioning>
Time-based partitioning specification for the destination table. Only one of timePartitioning and rangePartitioning should be specified.
range_partitioning: Option<RangePartitioning>
Range partitioning specification for the destination table. Only one of timePartitioning and rangePartitioning should be specified.
clustering: Option<Clustering>
Clustering specification for the destination table.
destination_encryption_configuration: Option<EncryptionConfiguration>
Custom encryption configuration (e.g., Cloud KMS keys)
use_avro_logical_types: Option<bool>
Optional. If sourceFormat is set to “AVRO”, indicates whether to interpret logical types as the corresponding BigQuery data type (for example, TIMESTAMP), instead of using the raw type (for example, INTEGER).
reference_file_schema_uri: Option<String>
Optional. The user can provide a reference file with the reader schema. This file is only loaded if it is part of source URIs, but is not loaded otherwise. It is enabled for the following formats: AVRO, PARQUET, ORC.
hive_partitioning_options: Option<HivePartitioningOptions>
Optional. When set, configures hive partitioning support. Not all storage formats support hive partitioning – requesting hive partitioning on an unsupported format will lead to an error, as will providing an invalid specification.
decimal_target_types: Vec<i32>
Defines the list of possible SQL data types to which the source decimal values are converted. This list and the precision and the scale parameters of the decimal field determine the target type. In the order of NUMERIC, BIGNUMERIC, and STRING, a type is picked if it is in the specified list and if it supports the precision and the scale. STRING supports all precision and scale values. If none of the listed types supports the precision and the scale, the type supporting the widest range in the specified list is picked, and if a value exceeds the supported range when reading the data, an error will be thrown.
Example: Suppose the value of this field is [“NUMERIC”, “BIGNUMERIC”]. If (precision,scale) is:
- (38,9) -> NUMERIC;
- (39,9) -> BIGNUMERIC (NUMERIC cannot hold 30 integer digits);
- (38,10) -> BIGNUMERIC (NUMERIC cannot hold 10 fractional digits);
- (76,38) -> BIGNUMERIC;
- (77,38) -> BIGNUMERIC (error if value exceeds supported range).
This field cannot contain duplicate types. The order of the types in this field is ignored. For example, [“BIGNUMERIC”, “NUMERIC”] is the same as [“NUMERIC”, “BIGNUMERIC”] and NUMERIC always takes precedence over BIGNUMERIC.
Defaults to [“NUMERIC”, “STRING”] for ORC and [“NUMERIC”] for the other file formats.
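The selection rule can be reproduced in a few lines. This is a sketch, not the service's implementation: the digit limits (NUMERIC: 29 integer and 9 fractional digits; BIGNUMERIC: 38 and 38) are assumptions chosen to agree with the worked examples above, and the `Target` enum is a local stand-in for DecimalTargetType.

```rust
// Declaration order gives the precedence NUMERIC < BIGNUMERIC < STRING.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Target {
    Numeric,
    Bignumeric,
    String,
}

impl Target {
    /// Whether this type can hold every value with the given precision and scale.
    fn supports(self, precision: u32, scale: u32) -> bool {
        let integer_digits = precision.saturating_sub(scale);
        match self {
            Target::Numeric => scale <= 9 && integer_digits <= 29,
            Target::Bignumeric => scale <= 38 && integer_digits <= 38,
            Target::String => true,
        }
    }
}

/// Picks the target type for a decimal column; the order of `allowed` is ignored.
/// Returns None only when `allowed` is empty.
pub fn pick_target(precision: u32, scale: u32, allowed: &[Target]) -> Option<Target> {
    let mut sorted = allowed.to_vec();
    sorted.sort();
    sorted
        .iter()
        .copied()
        // First listed type, in precedence order, that supports the column.
        .find(|t| t.supports(precision, scale))
        // Otherwise fall back to the widest listed type.
        .or_else(|| sorted.last().copied())
}
```

With the list `[Bignumeric, Numeric]`, the pairs (38,9), (39,9), (38,10), (76,38) and (77,38) resolve to the same types as in the example list above; for (77,38) the fallback branch is what returns BIGNUMERIC.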
json_extension: i32
Optional. Load option to be used together with source_format newline-delimited JSON to indicate that a variant of JSON is being loaded. To load newline-delimited GeoJSON, specify GEOJSON (and source_format must be set to NEWLINE_DELIMITED_JSON).
parquet_options: Option<ParquetOptions>
Optional. Additional properties to set if sourceFormat is set to PARQUET.
preserve_ascii_control_characters: Option<bool>
Optional. When sourceFormat is set to “CSV”, this indicates whether the embedded ASCII control characters (the first 32 characters in the ASCII-table, from ‘\x00’ to ‘\x1F’) are preserved.
connection_properties: Vec<ConnectionProperty>
Optional. Connection properties which can modify the load job behavior. Currently, only the ‘session_id’ connection property is supported, and is used to resolve _SESSION appearing as the dataset id.
create_session: Option<bool>
Optional. If this property is true, the job creates a new session using a randomly generated session_id. To continue using a created session with subsequent queries, pass the existing session identifier as a ConnectionProperty value. The session identifier is returned as part of the SessionInfo message within the query statistics.
The new session’s location will be set to Job.JobReference.location if it is present, otherwise it’s set to the default location based on existing routing logic.
column_name_character_map: i32
Optional. Character map supported for column names in CSV/Parquet loads. Defaults to STRICT and can be overridden by Project Config Service. Using this option with unsupported load formats will result in an error.
copy_files_only: Option<bool>
Optional. [Experimental] Configures the load job to copy files directly to the destination BigLake managed table, bypassing file content reading and rewriting.
Copying files only is supported when all the following are true:
- source_uris are located in the same Cloud Storage location as the destination table’s storage_uri location.
- source_format is PARQUET.
- destination_table is an existing BigLake managed table. The table’s schema does not have flexible column names. The table’s columns do not have type parameters other than precision and scale.
- No options other than the above are specified.
Implementations§
impl JobConfigurationLoad
pub fn decimal_target_types(&self) -> FilterMap<Cloned<Iter<'_, i32>>, fn(_: i32) -> Option<DecimalTargetType>>
Returns an iterator which yields the valid enum values contained in decimal_target_types.
pub fn push_decimal_target_types(&mut self, value: DecimalTargetType)
Appends the provided enum value to decimal_target_types.
pub fn json_extension(&self) -> JsonExtension
Returns the enum value of json_extension, or the default if the field is set to an invalid enum value.
pub fn set_json_extension(&mut self, value: JsonExtension)
Sets json_extension to the provided enum value.
pub fn file_set_spec_type(&self) -> FileSetSpecType
Returns the enum value of file_set_spec_type, or the default if the field is set to an invalid enum value.
pub fn set_file_set_spec_type(&mut self, value: FileSetSpecType)
Sets file_set_spec_type to the provided enum value.
pub fn column_name_character_map(&self) -> ColumnNameCharacterMap
Returns the enum value of column_name_character_map, or the default if the field is set to an invalid enum value.
pub fn set_column_name_character_map(&mut self, value: ColumnNameCharacterMap)
Sets column_name_character_map to the provided enum value.
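The enum-typed fields are stored as raw i32 values, and the accessor pairs above convert to and from the typed enum. The sketch below imitates that pattern with local stand-ins so it compiles without the crate; the variant names, their numeric values (0 and 1), the `from_i32` helper, and the `LoadConfig` struct are assumptions for illustration, not the generated code.

```rust
// Stand-in for the generated enum; variant names and numbers are assumed.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
#[repr(i32)]
pub enum JsonExtension {
    #[default]
    Unspecified = 0,
    Geojson = 1,
}

impl JsonExtension {
    /// Converts a raw wire value, returning None for unknown numbers.
    pub fn from_i32(value: i32) -> Option<Self> {
        match value {
            0 => Some(Self::Unspecified),
            1 => Some(Self::Geojson),
            _ => None,
        }
    }
}

// Stand-in for the message, holding only the field used here.
#[derive(Default)]
pub struct LoadConfig {
    pub json_extension: i32,
}

impl LoadConfig {
    /// Returns the enum value, or the default if the raw value is invalid.
    pub fn json_extension(&self) -> JsonExtension {
        JsonExtension::from_i32(self.json_extension).unwrap_or_default()
    }

    /// Stores the enum as its raw i32 value.
    pub fn set_json_extension(&mut self, value: JsonExtension) {
        self.json_extension = value as i32;
    }
}
```

Keeping the raw i32 in the struct lets a message decoded from a newer schema carry an enum number this build does not know; the getter then falls back to the default instead of failing.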
Trait Implementations§
impl Clone for JobConfigurationLoad
fn clone(&self) -> JobConfigurationLoad
fn clone_from(&mut self, source: &Self)
impl Debug for JobConfigurationLoad
impl Default for JobConfigurationLoad
impl Message for JobConfigurationLoad
fn encoded_len(&self) -> usize
fn encode(&self, buf: &mut impl BufMut) -> Result<(), EncodeError> where Self: Sized,
fn encode_to_vec(&self) -> Vec<u8> where Self: Sized,
fn encode_length_delimited(&self, buf: &mut impl BufMut) -> Result<(), EncodeError> where Self: Sized,
fn encode_length_delimited_to_vec(&self) -> Vec<u8> where Self: Sized,
fn decode(buf: impl Buf) -> Result<Self, DecodeError> where Self: Default,
fn decode_length_delimited(buf: impl Buf) -> Result<Self, DecodeError> where Self: Default,
fn merge(&mut self, buf: impl Buf) -> Result<(), DecodeError> where Self: Sized,
fn merge_length_delimited(&mut self, buf: impl Buf) -> Result<(), DecodeError> where Self: Sized,
impl PartialEq for JobConfigurationLoad
fn eq(&self, other: &JobConfigurationLoad) -> bool
Tests for self and other values to be equal, and is used by ==.
impl StructuralPartialEq for JobConfigurationLoad
Auto Trait Implementations§
impl Freeze for JobConfigurationLoad
impl RefUnwindSafe for JobConfigurationLoad
impl Send for JobConfigurationLoad
impl Sync for JobConfigurationLoad
impl Unpin for JobConfigurationLoad
impl UnwindSafe for JobConfigurationLoad
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request