DiscoverText API - Import Unit Into Archive

Archives

POST

https://api.discovertext.com/api/v1/archive/{id}/units

Input: a JSON object of units to be imported and the {id} of the archive to import into

{
    items: [                                        // maximum of 100 items per call
        {
            id: "(unique_id)",                      // string, required, up to 120 characters
            text: "(item_text)",                    // string, required
            title: "(title_text)",                  // string, optional, up to 120 characters
            itemTimestamp: "(ISO 8601 timestamp)",  // string, optional
            fileType: "(file_type_designation)",    // string, optional, see below for valid file types
            metadata: [                             // optional array of metadata, up to 128 items
                {
                    key: "(metadata_key)",          // string, required, up to 120 characters
                    value: "(metadata_value)",      // string, required
                    valueType: "(value type)"       // string, optional, see below
                }, ...
            ]
        }, ...
    ]
}
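
As an illustration, the request body above could be assembled in Python roughly as follows. This is a minimal sketch: the helper name make_item and all sample values are hypothetical, and only the field names are taken from the schema above.

```python
import json

def make_item(unit_id, text, title=None, timestamp=None, file_type=None, metadata=None):
    """Build one import item, leaving out optional fields that were not supplied."""
    item = {"id": unit_id, "text": text}
    if title is not None:
        item["title"] = title
    if timestamp is not None:
        item["itemTimestamp"] = timestamp
    if file_type is not None:
        item["fileType"] = file_type
    if metadata:
        # The schema lists metadata values as strings, so convert them here.
        item["metadata"] = [{"key": k, "value": str(v)} for k, v in metadata.items()]
    return item

# Hypothetical sample data, for illustration only.
payload = {
    "items": [
        make_item("doc-001", "First document text.", title="First",
                  timestamp="2020-01-15T12:00:00Z",
                  metadata={"source": "survey", "score": 4}),
        make_item("doc-002", "Second document text."),
    ]
}
body = json.dumps(payload)  # JSON string to send as the POST body
```

Omitting unset optional fields, rather than sending them as null, keeps the payload small and avoids relying on how the API treats null values, which this page does not describe.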

Return Value: a count of the imported items and a collection pairing each imported unit's ID with the internal ID of the item created for it

{
    meta: {
        count: (item imported count)
    },
    items: [... array of return objects ...],
}

Each return object has the structure of:

{
    itemId: "(unique id of imported item)",     // string
    id: (internal_id of the created item)       // numeric
}
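
A caller will often want to map its own IDs to the internal IDs that come back. The sketch below assumes the response body is standard JSON with the structure shown above; the function name map_imported_ids and the sample response are hypothetical.

```python
import json

def map_imported_ids(response_text):
    """Return (imported count, {itemId: internal id}) from a response body."""
    data = json.loads(response_text)
    mapping = {entry["itemId"]: entry["id"] for entry in data["items"]}
    return data["meta"]["count"], mapping

# Made-up response that follows the documented structure.
sample = (
    '{"meta": {"count": 2}, "items": ['
    '{"itemId": "doc-001", "id": 501}, {"itemId": "doc-002", "id": 502}]}'
)
count, ids = map_imported_ids(sample)
```

Comparing count against the number of items sent is a cheap way to notice a partially imported batch, assuming the count reflects only the items that were actually created.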

Each POST call can import up to 100 items, with a maximum payload size of 128 MB. If you need to upload more than this, split the import into multiple batches or import the items manually into your account on DiscoverText. Any invalid parameters (e.g. more than 100 items) or invalid documents (e.g. missing required fields) will result in a 400 - Bad Request. Likewise, you must have permission to import data into the specified archive; otherwise a 403 - Forbidden error is returned.
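
The batching described above can be sketched as follows. The code only prepares requests and does not send them. Several details are assumptions not stated on this page: that "128 MB" means 128 * 1024 * 1024 bytes, that the body should be sent with a Content-Type of application/json, and that any authentication is passed through extra headers. The archive id 42 and the sample items are hypothetical.

```python
import json
import urllib.request

MAX_ITEMS_PER_CALL = 100                 # limit stated above
MAX_PAYLOAD_BYTES = 128 * 1024 * 1024    # assumption: "128 MB" read as 128 MiB

def batches(items, size=MAX_ITEMS_PER_CALL):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_request(archive_id, batch, headers=None):
    """Prepare (but do not send) one POST request for a batch of items."""
    body = json.dumps({"items": batch}).encode("utf-8")
    if len(body) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds the size limit; use smaller batches")
    url = f"https://api.discovertext.com/api/v1/archive/{archive_id}/units"
    all_headers = {"Content-Type": "application/json"}  # assumed content type
    all_headers.update(headers or {})    # e.g. authentication headers, if required
    return urllib.request.Request(url, data=body, headers=all_headers, method="POST")

# Hypothetical data: 250 items split into batches of 100, 100 and 50.
items = [{"id": f"doc-{n}", "text": f"Text of document {n}"} for n in range(250)]
requests_to_send = [build_request(42, batch) for batch in batches(items)]
```

Each prepared request could then be passed to urllib.request.urlopen, which raises urllib.error.HTTPError for 4xx responses such as the 400 and 403 cases described above.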

For each item, you must provide the text (as a UTF-8 string) and an id that is unique within your project. The id can be any arbitrary string value up to a maximum of 120 characters. The itemTimestamp is optional, but if included it must be in ISO 8601 format. The title field is optional; if omitted, up to the first 120 characters of the text field will be used. The text, title, and any metadata fields accept UTF-8 input and can handle multi-byte and multilingual character sets.
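
Because an invalid document results in a 400 response, it can help to check items locally before sending. The following is a minimal client-side sketch of the rules above, not the service's own validation. The function names are hypothetical, an empty id is treated as missing (an assumption), and Python's datetime.fromisoformat accepts only part of ISO 8601, so a valid timestamp in a less common form may be flagged.

```python
from datetime import datetime

MAX_ID_LENGTH = 120
MAX_TITLE_LENGTH = 120

def validate_item(item):
    """Return a list of problems found in one item (empty if none)."""
    problems = []
    unit_id = item.get("id")
    if not isinstance(unit_id, str) or not unit_id:
        problems.append("id is required and must be a string")
    elif len(unit_id) > MAX_ID_LENGTH:
        problems.append("id is longer than 120 characters")
    if not isinstance(item.get("text"), str):
        problems.append("text is required and must be a string")
    title = item.get("title")
    if title is not None and len(title) > MAX_TITLE_LENGTH:
        problems.append("title is longer than 120 characters")
    timestamp = item.get("itemTimestamp")
    if timestamp is not None:
        try:
            # A trailing "Z" is rewritten by hand for older Python versions.
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("itemTimestamp is not a recognized ISO 8601 value")
    return problems

def effective_title(item):
    """Mirror the documented fallback: the first 120 characters of the text."""
    return item.get("title") or item["text"][:MAX_TITLE_LENGTH]
```

For example, validate_item({"text": "hi"}) reports the missing id, while an item with only id and text passes with an empty list.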

Metadata:
The optional metadata fields for the document, if added, can be used for advanced filtering and provide context for the document. The key and value fields are required for each metadata item. The valueType field can be set to declare the intended type of the value. Valid value types are:

text: a text-only value
datetime: a date-time value
number: a numeric value

If the value type is not specified, the import will attempt to guess the type (datetime or number) and default to text if the value cannot be converted. Likewise, if datetime or number is specified and the value cannot be converted to that type, it will also default to text.
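
The fallback behavior described above can be approximated as follows. This is only a sketch for reasoning about what type a value is likely to receive; the service's actual parsing rules are not documented here. Trying number before datetime when guessing, treating an unrecognized valueType as unspecified, and using datetime.fromisoformat as the date test are all assumptions, and the function names are hypothetical.

```python
import math
from datetime import datetime

def _is_number(value):
    """True if the string parses as a finite number."""
    try:
        return math.isfinite(float(value))
    except ValueError:
        return False

def _is_datetime(value):
    """True if the string parses with datetime.fromisoformat."""
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False

def resolve_value_type(value, value_type=None):
    """Return the type a metadata value would end up with under these rules."""
    if value_type == "text":
        return "text"
    if value_type == "number":
        return "number" if _is_number(value) else "text"
    if value_type == "datetime":
        return "datetime" if _is_datetime(value) else "text"
    # Not specified (or not recognized): guess, number first by assumption.
    if _is_number(value):
        return "number"
    if _is_datetime(value):
        return "datetime"
    return "text"
```

Under these assumptions, a value such as "hello" with valueType "number" resolves to text, which matches the documented fallback.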

Valid File Type Values:
If you specify a file type, it is used for display and filtering purposes only. During import, the process will not perform any conversion or other processing on the input text; that is, the text is expected to be already extracted from the file(s) and in readable UTF-8 format. Only the file types in the following list are valid for import. If you do not specify a file type, or the one you specify is invalid, the item will be displayed as "plain text".

txt: plain text
word: MS Word document
doc: MS Word document
docx: MS Word document
csv: CSV spreadsheet data
xls: CSV spreadsheet data
xlsx: CSV spreadsheet data
pdf: PDF file
htm: HTML file
html: HTML file
ppt: MS PowerPoint file
rtf: Rich Text file
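
The list above, together with the "plain text" fallback, can be expressed as a simple lookup. This is a client-side sketch for previewing how an item would be labeled. The function name is hypothetical, and the exact-match, lowercase comparison is an assumption, since this page does not say whether file type values are case-sensitive.

```python
# Descriptions copied from the list of valid file type values above.
VALID_FILE_TYPES = {
    "txt": "plain text",
    "word": "MS Word document",
    "doc": "MS Word document",
    "docx": "MS Word document",
    "csv": "CSV spreadsheet data",
    "xls": "CSV spreadsheet data",
    "xlsx": "CSV spreadsheet data",
    "pdf": "PDF file",
    "htm": "HTML file",
    "html": "HTML file",
    "ppt": "MS PowerPoint file",
    "rtf": "Rich Text file",
}

def display_file_type(file_type=None):
    """Return the display label, falling back to "plain text" when missing or invalid."""
    return VALID_FILE_TYPES.get(file_type, "plain text")
```

For instance, display_file_type("pdf") gives "PDF file", while an omitted or unlisted value gives "plain text".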