BULK INSERT in SQL Server
The BULK INSERT statement in SQL Server (a T-SQL counterpart of the bcp bulk-copy utility) is very helpful for quickly transferring a large amount of data from a text or CSV file into a SQL Server table or view.
-- Example of BULK INSERT in SQL Server
BULK INSERT [SQL Server Tutorials].[dbo].[DimGeography]
FROM 'F:\MS BI\FILE EXAMPLES\Geography.txt'
WITH
(
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
BULK INSERT imports a data file into a database table or view in a user-specified format in SQL Server:
BULK INSERT Sales.Orders
FROM '\\SystemX\DiskZ\Sales\data\orders.dat';
BULK INSERT Sales.Orders
FROM '\\SystemX\DiskZ\Sales\data\orders.csv'
WITH ( FORMAT='CSV');
database_name Is the
database name in which the specified table or view resides. If not specified,
this is the current database.
schema_name Is the name of
the table or view schema. schema_name is
optional if the default schema for the user performing the bulk-import operation
is the schema of the specified table or view. If schema_name is
not specified and the default schema of the user performing the bulk-import
operation is different from that of the specified table or view, SQL Server returns an
error message, and the bulk-import operation is canceled.
table_name Is the name of
the table or view to bulk import data into. Only views in which all columns
refer to the same base table can be used. For more information about the
restrictions for loading data into views, see INSERT (Transact-SQL).
' data_file ' Is the full
path of the data file that contains data to import into the specified table or
view. BULK INSERT can import data from a disk (including network drive, floppy
disk, hard disk, and so on) or from Azure Blob storage. data_file must specify a valid path from the server on which SQL
Server is running. If data_file is
a remote file, specify the Universal Naming Convention (UNC) name. A UNC name
has the form \\Systemname\ShareName\Path\FileName.
For example: \\SystemX\DiskZ\Sales\data\orders.dat.
' data_source_name ' Applies to: SQL Server 2017 (14.x) CTP 1.1 and
Azure SQL Database. Is a named external data source pointing to the Azure Blob
storage location of the file that will be imported. The external data source
must be created using the
TYPE = BLOB_STORAGE
option added in SQL Server 2017 (14.x) CTP 1.1.
BATCHSIZE =batch_size Specifies the number of rows in a batch. Each batch is
copied to the server as one transaction. If this fails, SQL Server commits or
rolls back the transaction for every batch. By default, all data in the
specified data file is one batch. For information about performance
considerations, see "Remarks," later in this topic.
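A minimal sketch, assuming a hypothetical table dbo.BigImport and file D:\big_load.txt (neither is from the original): each 10,000-row batch commits as its own transaction, so a failure only rolls back the batch in progress.
-- Hypothetical names: dbo.BigImport and D:\big_load.txt are placeholders.
BULK INSERT dbo.BigImport
FROM 'D:\big_load.txt'
WITH
(
    BATCHSIZE = 10000,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);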
CHECK_CONSTRAINTS Specifies that all constraints on the target table or view must
be checked during the bulk-import operation. Without the CHECK_CONSTRAINTS
option, any CHECK and FOREIGN KEY constraints are ignored, and after the
operation, the constraint on the table is marked as not-trusted.
At some point, you
must examine the constraints on the whole table. If the table was non-empty
before the bulk-import operation, the cost of revalidating the constraint may
exceed the cost of applying CHECK constraints to the incremental data.
A situation in which you might want constraints disabled (the
default behavior) is if the input data contains rows that violate constraints.
With CHECK constraints disabled, you can import the data and then use
Transact-SQL statements to remove the invalid data.
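A minimal sketch of enforcing constraints during the load, assuming a hypothetical table dbo.Orders and file D:\orders.txt:
-- Hypothetical names: dbo.Orders and D:\orders.txt are placeholders.
-- CHECK and FOREIGN KEY constraints are validated during the load,
-- so the constraints stay trusted afterwards.
BULK INSERT dbo.Orders
FROM 'D:\orders.txt'
WITH
(
    CHECK_CONSTRAINTS,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);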
CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } Specifies the code page of the data in the data file.
CODEPAGE is relevant only if the data contains char, varchar, or text columns with
character values greater than 127 or less than 32. For an example, see Specifying a code page.
DATAFILETYPE = { 'char' | 'native' | 'widechar' | 'widenative' } Specifies that BULK INSERT performs the import operation
using the specified data-file type value.
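For instance, a sketch of loading a Unicode (UTF-16) file, with hypothetical table and file names:
-- Hypothetical names: dbo.Products and D:\products_unicode.txt are placeholders.
-- 'widechar' declares the file as Unicode character data.
BULK INSERT dbo.Products
FROM 'D:\products_unicode.txt'
WITH
(
    DATAFILETYPE = 'widechar',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);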
ERRORFILE ='file_name' Specifies the file used to collect rows that have
formatting errors and cannot be converted to an OLE DB rowset. These rows are copied
into this error file from the data file "as is."
The error file is created when the command is executed. An error
occurs if the file already exists. Additionally, a control file that has the
extension .ERROR.txt is created. This references each row in the error file and
provides error diagnostics. As soon as the errors have been corrected, the data
can be loaded. Applies to: SQL Server 2017 (14.x) CTP 1.1. Beginning with SQL Server 2017 (14.x), the error_file_path can be in Azure Blob storage.
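A sketch of routing badly formatted rows to an error file, assuming hypothetical table and file names; note that the error file must not already exist:
-- Hypothetical names: dbo.Orders, D:\orders.txt, and D:\orders_errors.txt are placeholders.
-- Rows that cannot be converted are copied to the error file; a companion
-- control file with the .ERROR.txt extension is created alongside it.
BULK INSERT dbo.Orders
FROM 'D:\orders.txt'
WITH
(
    ERRORFILE = 'D:\orders_errors.txt',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);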
'errorfile_data_source_name' Applies to: SQL
Server 2017 (14.x) CTP 1.1. Is a named external data source pointing to the
Azure Blob storage location of the error file that will contain errors found
during the import. The external data source must be created using the
TYPE = BLOB_STORAGE
option added in SQL Server 2017 (14.x) CTP 1.1. For more
information, see CREATE EXTERNAL DATA SOURCE.
FIRSTROW = first_row Specifies the number of the first row to load. The default
is the first row in the specified data file. FIRSTROW is 1-based.
FIRE_TRIGGERS Specifies that any
insert triggers defined on the destination table execute during the bulk-import
operation. If triggers are defined for INSERT operations on the target table,
they are fired for every completed batch.
If FIRE_TRIGGERS is not specified, no insert triggers execute.
FORMATFILE_DATA_SOURCE = 'data_source_name' Applies to: SQL Server 2017 (14.x) CTP 1.1. Is a
named external data source pointing to the Azure Blob storage location of the
format file that will define the schema of imported data. The external data
source must be created using the
TYPE = BLOB_STORAGE
option added in SQL Server 2017 (14.x) CTP 1.1. For more
information, see CREATE EXTERNAL DATA SOURCE.
KEEPIDENTITY Specifies that identity value or values in the imported data
file are to be used for the identity column. If KEEPIDENTITY is not specified,
the identity values for this column are verified but not imported and SQL
Server automatically assigns unique values based on the seed and increment
values specified during table creation. If the data file does not contain
values for the identity column in the table or view, use a format file to
specify that the identity column in the table or view is to be skipped when
importing data; SQL Server automatically assigns unique values for the column.
For more information, see DBCC CHECKIDENT (Transact-SQL).
KEEPNULLS Specifies that empty columns should retain a null value during
the bulk-import operation, instead of having any default values for the columns
inserted.
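A minimal sketch combining KEEPIDENTITY and KEEPNULLS, with hypothetical names:
-- Hypothetical names: dbo.Customers and D:\customers.txt are placeholders.
-- KEEPIDENTITY preserves identity values from the file;
-- KEEPNULLS keeps empty columns as NULL instead of applying column defaults.
BULK INSERT dbo.Customers
FROM 'D:\customers.txt'
WITH
(
    KEEPIDENTITY,
    KEEPNULLS,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);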
KILOBYTES_PER_BATCH = kilobytes_per_batch Specifies the approximate number of kilobytes (KB) of data
per batch as kilobytes_per_batch.
By default, KILOBYTES_PER_BATCH is unknown. For information about performance
considerations, see "Remarks," later in this topic.
LASTROW = last_row Specifies the number of the last row to
load. The default is 0, which indicates the last row in the specified data
file.
MAXERRORS = max_errors Specifies the maximum number of syntax errors allowed in
the data before the bulk-import operation is canceled. Each row that cannot be
imported by the bulk-import operation is ignored and counted as one error. If max_errors is not specified, the default is 10.
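A sketch combining FIRSTROW, LASTROW, and MAXERRORS to skip a header row, stop after row 50000, and tolerate up to 25 bad rows (all names are hypothetical):
-- Hypothetical names: dbo.Sales and D:\sales.txt are placeholders.
BULK INSERT dbo.Sales
FROM 'D:\sales.txt'
WITH
(
    FIRSTROW = 2,
    LASTROW = 50000,
    MAXERRORS = 25,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);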
ORDER ( { column [
ASC | DESC ] } [ ,... n ] ) Specifies how the
data in the data file is sorted. Bulk import performance is improved if the
data being imported is sorted according to the clustered index on the table, if
any. If the data file is sorted in a different order (that is, other than the order
of the clustered index key), or if there is no clustered index on the table, the
ORDER clause is ignored. The column names supplied must be valid column names
in the destination table. By default, the bulk insert operation assumes the
data file is unordered. For optimized bulk import, SQL Server also validates
that the imported data is sorted.
n Is a
placeholder that indicates that multiple columns can be specified.
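A sketch of the ORDER hint, assuming a hypothetical table whose clustered index key is OrderID and a data file already sorted on that column:
-- Hypothetical names: dbo.Orders (clustered index on OrderID) and D:\orders.txt are placeholders.
BULK INSERT dbo.Orders
FROM 'D:\orders.txt'
WITH
(
    ORDER (OrderID ASC),
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);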
ROWS_PER_BATCH = rows_per_batch Indicates the approximate number of rows of data in the
data file.
By default, all the data in the data file is sent to the server
as a single transaction, and the number of rows in the batch is unknown to the
query optimizer. If you specify ROWS_PER_BATCH (with a value > 0), the server
uses this value to optimize the bulk-import operation. The value specified for
ROWS_PER_BATCH should be approximately the same as the actual number of rows. For
information about performance considerations, see "Remarks," later in
this topic.
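A sketch of the ROWS_PER_BATCH hint, assuming a hypothetical file of roughly one million rows:
-- Hypothetical names: dbo.BigImport and D:\big_load.txt are placeholders.
-- The row-count hint helps the optimizer plan the bulk-import operation.
BULK INSERT dbo.BigImport
FROM 'D:\big_load.txt'
WITH
(
    ROWS_PER_BATCH = 1000000,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);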
TABLOCK Specifies that a table-level lock is acquired for the duration
of the bulk-import operation. A table can be loaded concurrently by multiple
clients if the table has no indexes and TABLOCK is specified. By default,
locking behavior is determined by the table option table lock on bulk load. Holding a lock for the
duration of the bulk-import operation reduces lock contention on the table and,
in some cases, can significantly improve performance. For information about
performance considerations, see "Remarks," later in this topic.
For a columnstore index, the locking behavior is different because the index is
internally divided into multiple rowsets. Each thread loads data exclusively
into its own rowset by taking an X lock on that rowset, which allows parallel
data loads from concurrent sessions. Using the TABLOCK option causes the thread
to take an X lock on the table (unlike the BU lock taken for traditional
rowsets), which prevents other concurrent threads from loading data.
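A sketch of a TABLOCK load into a hypothetical staging table:
-- Hypothetical names: dbo.StagingOrders and D:\orders.txt are placeholders.
-- A single table-level bulk-update lock is held for the whole load.
BULK INSERT dbo.StagingOrders
FROM 'D:\orders.txt'
WITH
(
    TABLOCK,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);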
A. Using pipes to import data from a file
BULK INSERT AdventureWorks2012.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.tbl'
WITH
(
    FIELDTERMINATOR = ' |',
    ROWTERMINATOR = ' |\n'
);
B. Using the FIRE_TRIGGERS argument
BULK INSERT AdventureWorks2012.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.tbl'
WITH
(
    FIELDTERMINATOR = ' |',
    ROWTERMINATOR = ':\n',
    FIRE_TRIGGERS
);
C. Using line feed as a row terminator
DECLARE @bulk_cmd varchar(1000);
SET @bulk_cmd = 'BULK INSERT AdventureWorks2012.Sales.SalesOrderDetail
FROM ''<drive>:\<path>\<filename>''
WITH (ROWTERMINATOR = '''+CHAR(10)+''')';
EXEC(@bulk_cmd);
D. Specifying a code page
BULK INSERT MyTable
FROM 'D:\data.csv'
WITH
(
CODEPAGE = '65001'
, DATAFILETYPE = 'char'
, FIELDTERMINATOR = ','
);
E. Importing data from a CSV file
BULK INSERT Sales.Invoices
FROM '\\share\invoices\inv-2016-07-25.csv'
WITH (FORMAT = 'CSV'
, FIRSTROW=2
, FIELDQUOTE = '\'
, FIELDTERMINATOR = ';'
, ROWTERMINATOR = '0x0a');
F. Importing data from a file in Azure blob storage
-- Optional: a MASTER KEY is not required if a DATABASE SCOPED CREDENTIAL is not
-- required because the blob is configured for public (anonymous) access.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
GO
-- Optional: a DATABASE SCOPED CREDENTIAL is not required because the blob is
-- configured for public (anonymous) access.
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************';
-- NOTE: Make sure that you don't have a leading ? in the SAS token,
-- that you have at least read permission on the object that should be loaded (srt=o&sp=r),
-- and that the expiration period is valid (all dates are in UTC time).
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
       LOCATION = 'https://****************.blob.core.windows.net/invoices',
       CREDENTIAL = MyAzureBlobStorageCredential -- CREDENTIAL is not required if a blob is configured for public (anonymous) access.
);
BULK INSERT Sales.Invoices
FROM 'inv-2017-12-08.csv'
WITH (DATA_SOURCE = 'MyAzureBlobStorage');
-- Optional: a MASTER KEY is not required if a DATABASE SCOPED CREDENTIAL is not
-- required because the blob is configured for public (anonymous) access.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
GO
-- Change to using Managed Identity instead of a SAS key.
CREATE DATABASE SCOPED CREDENTIAL msi_cred
WITH IDENTITY = 'Managed Identity';
GO
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
       LOCATION = 'https://****************.blob.core.windows.net/curriculum',
       CREDENTIAL = msi_cred -- CREDENTIAL is not required if a blob is configured for public (anonymous) access.
);
BULK INSERT Sales.Invoices
FROM 'inv-2017-12-08.csv'
WITH (DATA_SOURCE = 'MyAzureBlobStorage');