Processing a PDF

Processing a normal PDF takes one call. A scanned PDF that needs OCR processing takes two.

1. Export content from the PDF

URL: /segments/remote_export2
Post Data: session_id, target_id, can_ocr

session_id

The unique key returned in the initial login call.

target_id

The value of target_lang_id obtained when the PDF was uploaded.

can_ocr

If True, the system will attempt to OCR the given document if needed (and if the user has enough credit). The default is can_ocr=False, which means scanned PDFs are treated as normal PDFs, so the user gets no text in the downloaded XLIFF.

Note

Call this immediately after you get a valid target_id returned from the upload PDF operation. The upload operation will also have returned information about any fees payable. Do not execute this call if the user has insufficient credit.

If the document undergoes OCR processing, its ID will change. In this case, call get_ocr_doc_id after processing to get the new ID.

Returns

On success, a JSON structure containing progress information.
On failure the HTTP status code will be set to something other than 200.

The JSON structure will contain, amongst other things, the following fields:

task_id

The unique ID of the task, which corresponds to a separate thread on the server. Use this ID to obtain status information about the task via the remote_task_status call.
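
As an illustration of the export call described above, here is a minimal Python sketch using only the standard library. Several details are assumptions rather than documented behaviour: the base URL is a placeholder, the post data is assumed to be form-encoded, and can_ocr is assumed to be sent as the literal string "True" or "False". The helper names build_export_request and start_export are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with your server's address.
BASE_URL = "https://example.com"


def build_export_request(session_id, target_id, can_ocr=False, base_url=BASE_URL):
    """Build (but do not send) the POST request for /segments/remote_export2."""
    fields = {
        "session_id": session_id,
        "target_id": target_id,
        # Assumption: the server accepts the literal strings "True" / "False".
        "can_ocr": "True" if can_ocr else "False",
    }
    # Assumption: post data is sent as a URL-encoded form body.
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        base_url + "/segments/remote_export2", data=data, method="POST"
    )


def start_export(session_id, target_id, can_ocr=False, base_url=BASE_URL):
    """Send the export request and return the task_id from the JSON response."""
    request = build_export_request(session_id, target_id, can_ocr, base_url)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        # The documentation treats any status other than 200 as a failure.
        if response.status != 200:
            raise RuntimeError(f"remote_export2 failed: HTTP {response.status}")
        payload = json.load(response)
    return payload["task_id"]
```

The returned task_id can then be passed to the remote_task_status call to follow progress. Building the request separately from sending it makes it easy to inspect the URL and body before anything goes over the network.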

2. Get the PDF's New ID

URL: /segments/get_ocr_doc_id
Post Data: session_id, target_id

session_id

The unique key returned in the initial login call.

target_id

The value of target_lang_id obtained when the PDF was uploaded.

Note

Call this URL after a scanned PDF has undergone OCR processing via the remote_export2 call. It does not need to be called for normal PDFs, so only call it if remote_upload returned doc_needs_ocr=True in its JSON response.

Returns

On success, a JSON structure containing the new ID of the processed PDF.
On failure the HTTP status code will be set to something other than 200.

target_lang_id

The new ID of the document now that OCR has been performed. The old ID will no longer be valid.
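
The decision of whether to call get_ocr_doc_id at all can be sketched as follows. This is a rough Python illustration under stated assumptions: the base URL is a placeholder, the post data is assumed to be form-encoded, and the remote_upload response is assumed to have already been parsed into a dict in which doc_needs_ocr is a boolean. The helper names fetch_ocr_doc_id and resolve_target_id are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with your server's address.
BASE_URL = "https://example.com"


def fetch_ocr_doc_id(session_id, target_id, base_url=BASE_URL):
    """POST to /segments/get_ocr_doc_id and return the new target_lang_id."""
    # Assumption: post data is sent as a URL-encoded form body.
    data = urllib.parse.urlencode(
        {"session_id": session_id, "target_id": target_id}
    ).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/segments/get_ocr_doc_id", data=data, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        if response.status != 200:
            raise RuntimeError(f"get_ocr_doc_id failed: HTTP {response.status}")
        return json.load(response)["target_lang_id"]


def resolve_target_id(upload_response, session_id, target_id, fetch=fetch_ocr_doc_id):
    """Return the ID to use once remote_export2 has finished processing."""
    # Normal PDFs keep their ID, so no extra call is needed.
    if not upload_response.get("doc_needs_ocr"):
        return target_id
    # Scanned PDFs get a new ID after OCR; the old one is no longer valid.
    return fetch(session_id, target_id)
```

Call resolve_target_id only after the export task has finished, and use the ID it returns for all later calls on that document.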