Automatic document field detection

Our parsers recognize and automatically detect the distinct fields in uploaded documents.

Document language detection

Detect the language in scanned or printed documents, images, and PDFs.

Optical Character Recognition (OCR)

Convert scanned or printed documents, including images and PDFs, into machine-readable text.

Integration and automation

Our document parsers can be integrated into existing software systems or workflows.

Document language detection API

Parse Documents provides a suite of APIs for document parsing. Our aim is to simplify document management tasks such as retrieval, parsing, and error handling. The APIs support pagination, a wide range of document types, and detailed error responses.

Versatility and Flexibility

Through our APIs, you can retrieve uploaded documents and queue documents for parsing, either via a direct upload or through an external link. The APIs are designed to accommodate varying business needs and configurations.

Swagger Configuration

The APIs follow the OpenAPI Specification (OAS), which keeps integration straightforward. We provide complete Swagger UI-based documentation that details the possible responses, status codes, and error codes.

Your Security, Our Priority

All API requests are authenticated with a JWT sent as a bearer token in the Authorization header. This helps keep your sensitive document data protected.
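
As a minimal sketch of what attaching the token can look like, the Python helper below builds the request headers. It assumes the "Authorization: Bearer <token>" scheme used by the examples on this page; the function name build_auth_headers is hypothetical and not part of the API.

```python
def build_auth_headers(token: str) -> dict:
    """Return request headers carrying a JWT as a bearer token.

    Hypothetical helper for illustration; not part of the Parse Documents API.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.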

Let's Get Started

We are thrilled to have you on board and can't wait to see how you integrate Parse Documents into your document management operations!

Please make sure to replace "{YOUR_API_KEY}" in the examples with your actual bearer token.

Identify Document Languages
POST /v1/documents/languages

Identifies the languages of the provided document text. The method takes the document text as input and returns the identified languages along with their probabilities.

Example Request
POST /v1/documents/languages
Request Body
{
    "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
}
Responses
  • 200 Success: Returns the identified languages along with their probabilities.
  • 400 Bad Request: The request was made incorrectly.
  • 404 Not Found: The requested document was not found.
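
The status codes above could be handled along the following lines. This is only a sketch: the mapping covers just the three codes documented here, and the names DOCUMENTED_STATUSES and interpret_status are hypothetical.

```python
from typing import Tuple

# Status codes documented for POST /v1/documents/languages.
DOCUMENTED_STATUSES = {
    200: "Success",
    400: "Bad Request",
    404: "Not Found",
}


def interpret_status(status_code: int) -> Tuple[bool, str]:
    """Return (ok, description) for a response status code.

    Codes not listed in the documentation are reported as undocumented
    rather than guessed at.
    """
    description = DOCUMENTED_STATUSES.get(status_code, "Undocumented status")
    return status_code == 200, description
```

A caller would typically parse the response body only when the first element of the returned pair is True.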
The examples below send this request in Python, Go, PHP, and C# (.NET).

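Python

import requests

# Replace %(baseUrl)s with your API host and {YOUR_API_KEY} with your bearer token.
url = "https://%(baseUrl)s/v1/documents/languages"
headers = {
    "Authorization": "Bearer {YOUR_API_KEY}"
}

payload = {
    "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
}

# json= serializes the payload and sets the Content-Type header.
response = requests.post(url, headers=headers, json=payload, timeout=30)
# Raise an exception for 4xx and 5xx responses.
response.raise_for_status()

# The response is read as a list of objects with "code" and "probability" keys.
identified_languages = response.json()

for lang in identified_languages:
    print(f"Language: {lang['code']} - Probability: {lang['probability']}")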
        
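Go

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

func main() {
    if err := identifyDocumentLanguages(); err != nil {
        log.Fatal(err)
    }
}

func identifyDocumentLanguages() error {
    // Replace %(baseUrl)s with your API host and {YOUR_API_KEY} with your bearer token.
    url := "https://%(baseUrl)s/v1/documents/languages"
    apiKey := "{YOUR_API_KEY}"

    payload := map[string]interface{}{
        "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
    }

    requestBody, err := json.Marshal(payload)
    if err != nil {
        return err
    }

    req, err := http.NewRequest("POST", url, bytes.NewBuffer(requestBody))
    if err != nil {
        return err
    }
    req.Header.Set("Authorization", "Bearer "+apiKey)
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    response, err := client.Do(req)
    if err != nil {
        return err
    }
    // Close the body so the underlying connection can be reused.
    defer response.Body.Close()

    if response.StatusCode != http.StatusOK {
        return fmt.Errorf("unexpected status: %s", response.Status)
    }

    identifiedLanguages := []map[string]interface{}{}
    if err := json.NewDecoder(response.Body).Decode(&identifiedLanguages); err != nil {
        return err
    }

    for _, lang := range identifiedLanguages {
        fmt.Printf("Language: %v - Probability: %v\n", lang["code"], lang["probability"])
    }
    return nil
}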
        
PHP

<?php

// Replace %(baseUrl)s with your API host and {YOUR_API_KEY} with your bearer token.
$curl = curl_init();

curl_setopt_array($curl, [
  CURLOPT_URL => "https://%(baseUrl)s/v1/documents/languages",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_POST => true,
  CURLOPT_POSTFIELDS => json_encode([
    "text" => "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
  ]),
  CURLOPT_HTTPHEADER => [
    "Authorization: Bearer {YOUR_API_KEY}",
    "Content-Type: application/json"
  ],
]);

$response = curl_exec($curl);
$error = curl_error($curl);

curl_close($curl);

if ($error) {
  echo "Error: " . $error;
} else {
  $identifiedLanguages = json_decode($response, true);

  foreach ($identifiedLanguages as $lang) {
    echo "Language: " . $lang['code'] . " - Probability: " . $lang['probability'] . "\n";
  }
}

C# (.NET)

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class Program
{
    private static readonly HttpClient client = new HttpClient();
    private static readonly string BASE_URL = "{YOUR_API_BASE_URL}";
    private static readonly string API_KEY = "{YOUR_API_KEY}";

    // An async Main avoids blocking on .Wait(), which wraps errors in AggregateException.
    static async Task Main(string[] args)
    {
        await IdentifyDocumentLanguages();
    }

    private static async Task IdentifyDocumentLanguages()
    {
        try
        {
            client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", API_KEY);

            var requestBody = new
            {
                text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
            };

            var requestContent = new StringContent(JsonSerializer.Serialize(requestBody), Encoding.UTF8, "application/json");

            var response = await client.PostAsync(BASE_URL + "/v1/documents/languages", requestContent);
            response.EnsureSuccessStatusCode();

            var responseBody = await response.Content.ReadAsStringAsync();
            var identifiedLanguages = JsonSerializer.Deserialize<IdentifyLanguage[]>(responseBody);

            foreach (var lang in identifiedLanguages)
            {
                Console.WriteLine($"Language: {lang.code} - Probability: {lang.probability}");
            }
        }
        catch (HttpRequestException e)
        {
            Console.WriteLine($"Error: {e.Message}");
        }
    }
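}

// Shape of each array element in the response, assumed from the fields read above.
class IdentifyLanguage
{
    public string code { get; set; }
    public double probability { get; set; }
}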

In the C# example, we define a simple program with a single method, `IdentifyDocumentLanguages`.

This method first sets up the authentication header by adding the bearer token to the HttpClient's default headers.

Then, it creates the request body containing the document text.

It sends a POST request to the specified endpoint with the request body as JSON.

If the request fails or the server returns a non-success status code, an HttpRequestException is thrown; the method catches it and prints the error message to the console.

If the request is successful, the method deserializes the response body into an array of `IdentifyLanguage` objects and prints the language code and probability for each identified language.
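
Often only the most likely language is needed. The Python sketch below picks it from a response body, assuming the response is a JSON array of objects with "code" and "probability" keys, which is how the examples on this page read it. The function name most_probable_language and the sample payload are illustrative only; real responses may differ.

```python
import json


def most_probable_language(response_text: str):
    """Return the (code, probability) pair with the highest probability.

    Returns None when the response contains no languages.
    """
    languages = json.loads(response_text)
    if not languages:
        return None
    best = max(languages, key=lambda lang: lang["probability"])
    return best["code"], best["probability"]


# Illustrative payload only, not captured from the real API.
sample = '[{"code": "la", "probability": 0.92}, {"code": "it", "probability": 0.05}]'
print(most_probable_language(sample))
```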

Request Body:

  • text: The document text whose languages should be identified.