How do I get a continuation token for a bulk INSERT on Azure Cosmos DB?

I want to upload a CSV file that represents 10k documents to be added to my Cosmos DB collection in a manner that's fast and atomic. I have a stored procedure like the following pseudo-code:

function createDocsFromCSV(csv_text) {
    function parse(txt) { /* ... parsing code here ... */ }

    var collection = getContext().getCollection();
    var response = getContext().getResponse();

    var docs_to_create = parse(csv_text);
    for(var ii=0; ii<docs_to_create.length; ii++) {
        var accepted = collection.createDocument(collection.getSelfLink(),
                                                    docs_to_create[ii],
                                                    function(err, doc_created) {
                                                        if(err) throw new Error('Error' + err.message);
                                                    });
        if(!accepted) {
            throw new Error('Timed out creating document ' + ii);
        }
    }
}

When I run it, the stored procedure creates about 1200 documents before timing out (and therefore rolling back and not creating any documents).

Previously I had success updating (instead of creating) thousands of documents in a stored procedure using continuation tokens and this answer as guidance: https://stackoverflow.com/a/34761098/277504. But after searching documentation (e.g. https://azure.github.io/azure-documentdb-js-server/Collection.html) I don't see a way to get continuation tokens from creating documents like I do for querying documents.

Is there a way to take advantage of stored procedures for bulk document creation?

It’s important to note that stored procedures have bounded execution: all operations must complete within the server-specified request timeout duration. If an operation does not complete within that time limit, the transaction is automatically rolled back.

To simplify development around these time limits, all CRUD (Create, Read, Update, and Delete) operations return a Boolean value that represents whether that operation will complete. This Boolean value can be used as a signal to wrap up execution and to implement a continuation-based model to resume execution (this is illustrated in our code samples below). For more details, please refer to the documentation.
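To make the Boolean signal concrete, here is a minimal sketch that runs locally against a stub. makeStubContext and bulkCreate are hypothetical names, and the stub only imitates the server by accepting a fixed number of writes before returning false; in a real stored procedure the context would come from the global getContext() and callbacks would fire asynchronously.

```javascript
// Stub standing in for the server-side context (an assumption for
// illustration): it accepts only `budget` writes, then returns false.
function makeStubContext(budget) {
    var body;
    var stored = [];
    var collection = {
        getSelfLink: function () { return 'stub-collection'; },
        createDocument: function (link, doc, callback) {
            if (stored.length >= budget) return false; // "out of time"
            stored.push(doc);
            callback(null, doc); // the real server calls this asynchronously
            return true;
        }
    };
    var response = { setBody: function (value) { body = value; } };
    return {
        getCollection: function () { return collection; },
        getResponse: function () { return response; },
        getBody: function () { return body; },
        getStored: function () { return stored; }
    };
}

// Create documents one at a time, starting at index `count`, and keep the
// response body equal to the number of documents created so far.
function bulkCreate(context, docs, count) {
    var collection = context.getCollection();
    var response = context.getResponse();
    count = count || 0;
    response.setBody(count);
    tryCreate();

    function tryCreate() {
        if (count >= docs.length) return; // everything has been created
        var accepted = collection.createDocument(
            collection.getSelfLink(),
            docs[count],
            function (err) {
                if (err) throw err;
                count++;
                response.setBody(count);
                tryCreate(); // queue the next document
            });
        // false means the request was not queued: stop here and let the
        // client resume from the count already stored in the body.
        if (!accepted) return;
    }
}
```

Calling bulkCreate with a stub whose budget is smaller than the input leaves the number of documents actually written in the response body, which is the value a client would pass back in as count on the next call.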

The bulk-insert stored procedure provided above implements the continuation model by returning the number of documents successfully created.

pseudo-code:

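function createDocsFromCSV(csv_text, count) {
    function parse(txt) { /* ... parsing code here ... */ }

    var collection = getContext().getCollection();
    var response = getContext().getResponse();

    var docs_to_create = parse(csv_text);
    // Resume where the previous call stopped (0 on the first call).
    count = count || 0;
    for(var ii=count; ii<docs_to_create.length; ii++) {
        var accepted = collection.createDocument(collection.getSelfLink(),
                                                    docs_to_create[ii],
                                                    function(err, doc_created) {
                                                        if(err) throw new Error('Error: ' + err.message);
                                                    });
        if(!accepted) {
            // Out of time: stop queuing instead of throwing, so the
            // documents accepted so far are not rolled back.
            break;
        }
        count++;
    }
    // Tell the client how many documents have been created in total.
    response.setBody(count);
}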

Then check the returned document count on the client side and re-run the stored procedure, passing that value as the count parameter, to create the remaining documents. Repeat until the returned count equals the number of documents parsed from csv_text.
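The client-side loop described above could be sketched roughly as follows. createAllDocs, runSproc, and makeFakeSproc are hypothetical names: runSproc stands for any async function that executes the stored procedure once with the given count and resolves to the count found in the response body. With the @azure/cosmos SDK it would presumably wrap something like container.scripts.storedProcedure(id).execute(partitionKey, [csvText, count]), but that wiring is an assumption and is left out so the sketch runs on its own. The client is also assumed to know the total number of rows in the CSV.

```javascript
// Re-run the stored procedure until it reports that all `total` documents
// have been created. `runSproc(count)` is a hypothetical async function that
// executes the procedure once and resolves to the count from its body.
async function createAllDocs(runSproc, total) {
    var count = 0;
    while (count < total) {
        var next = await runSproc(count);
        if (next <= count) {
            // A whole call made no progress: fail instead of looping forever.
            throw new Error('Stored procedure made no progress at ' + count);
        }
        count = next;
    }
    return count;
}

// Local stand-in for the server so the sketch can run without Cosmos DB:
// each call "creates" at most `perCall` documents.
function makeFakeSproc(total, perCall) {
    var calls = 0;
    var run = async function (count) {
        calls++;
        return Math.min(total, count + perCall);
    };
    run.getCalls = function () { return calls; };
    return run;
}
```

For example, createAllDocs(makeFakeSproc(10000, 1200), 10000) resolves to 10000 after nine calls to the fake procedure. Note that each call commits as its own transaction, so the import as a whole is no longer atomic.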

Hope it helps you.
